MySQL HeatWave Lakehouse can load 400 TB of data from object storage 8X faster than Amazon Redshift and 2.7X faster than Snowflake.
Oracle today introduced MySQL HeatWave Lakehouse, which lets customers process and query hundreds of terabytes of data in object storage in a variety of file formats, including CSV and Parquet, as well as Aurora and Redshift backups. MySQL HeatWave Lakehouse is the newest product in the MySQL HeatWave line, the first cloud service to combine transaction processing, analytics, machine learning, and machine learning-based automation within a single MySQL database.
Powered by the massively parallel, scale-out MySQL HeatWave architecture, MySQL HeatWave Lakehouse outperforms competing cloud database services by a wide margin in both query execution and data loading, as industry-recognized benchmarks show. Customers can also use standard MySQL syntax to query transactional data in the MySQL database and combine it with data in the object store in a single query. In addition, Oracle announced new MySQL Autopilot capabilities that improve performance and simplify the operation of MySQL HeatWave Lakehouse. MySQL HeatWave Lakehouse is available in beta now and is expected to become generally available in the first half of calendar year 2023.
Customers moving to MySQL HeatWave from AWS, Google, and on-premises environments have been using it for a wide range of use cases. These include marketing analytics, notably real-time analysis of advertising campaign effectiveness, and customer data analytics for building successful campaigns. Customers migrating from AWS include leaders in the automotive, telecommunications, retail, high-tech, and healthcare industries. Oracle is also publishing new lakehouse benchmarks and announcing a number of new capabilities for MySQL HeatWave Lakehouse and MySQL Autopilot.
Benchmarks
Faster than Snowflake & Amazon Redshift in both query performance and data loading
As demonstrated by a fully transparent, publicly available 400 TB TPC-H* benchmark, the query performance of MySQL HeatWave Lakehouse is:
- 17X faster than Snowflake
- 6X faster than Amazon Redshift
Loading data from object store into MySQL HeatWave Lakehouse is also significantly faster. For a 400 TB TPC-H* workload, load performance of MySQL HeatWave Lakehouse is:
- 8X faster than Amazon Redshift
- 2.7X faster than Snowflake
Innovative new capabilities for MySQL HeatWave Lakehouse
Larger data size, standard MySQL syntax: MySQL HeatWave Lakehouse customers can query up to 400 TB of data, and a HeatWave cluster scales to 512 nodes. Customers query the data using standard MySQL syntax.
The 10 TB and 30 TB TPC-H benchmarks show that MySQL HeatWave delivers the same query performance whether the data resides in object storage or inside a MySQL database. The level of compression and the amount of data each node can handle are also equivalent in both cases.
Support for many file formats: With MySQL HeatWave Lakehouse, customers can load and analyse data stored in a variety of file formats, including CSV and Parquet, as well as Aurora and Redshift backups from AWS. This lets customers take advantage of MySQL HeatWave even when their data is not stored in a MySQL database. Query performance is the same regardless of the file format the data is stored in.
Ability to query data in MySQL and combine it with data in the object store: With MySQL HeatWave Lakehouse, users can query their OLTP data stored in the MySQL database and combine it with data stored in the object store. Any changes made to the OLTP data are reflected in the query results in real time.
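As a rough illustration of what a single query over transactional data and file data looks like, the sketch below joins an OLTP-style table with rows taken from a CSV file, then re-runs the query after a new transaction. It uses Python's built-in SQLite as a stand-in, not MySQL HeatWave Lakehouse, and it copies the CSV rows into a table, which is an assumption made only to keep the example runnable. The table names, column names, and data are hypothetical.

```python
import csv
import io
import sqlite3

# SQLite stands in for the database here; this is NOT HeatWave Lakehouse.
conn = sqlite3.connect(":memory:")

# Hypothetical OLTP table holding transactional data.
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 10, 20.0), (2, 11, 5.0)])

# Hypothetical CSV content standing in for a file kept in object storage.
clicks_csv = "customer_id,campaign\n10,spring_sale\n11,newsletter\n"
conn.execute("CREATE TABLE campaign_clicks (customer_id INTEGER, campaign TEXT)")
click_rows = [
    (int(row["customer_id"]), row["campaign"])
    for row in csv.DictReader(io.StringIO(clicks_csv))
]
conn.executemany("INSERT INTO campaign_clicks VALUES (?, ?)", click_rows)

# One query that combines the transactional table with the file-backed data.
QUERY = """
SELECT c.campaign, SUM(o.amount) AS revenue
FROM orders AS o
JOIN campaign_clicks AS c ON c.customer_id = o.customer_id
GROUP BY c.campaign
ORDER BY c.campaign
"""


def revenue_by_campaign():
    """Run the combined query and return (campaign, revenue) rows."""
    return conn.execute(QUERY).fetchall()


before = revenue_by_campaign()
conn.execute("INSERT INTO orders VALUES (3, 10, 30.0)")  # a new OLTP transaction
after = revenue_by_campaign()
print(before)
print(after)
```

The second result differs from the first only because a row was added to the transactional table, which mirrors the behaviour described above: the query itself does not change.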
New MySQL Autopilot capabilities for MySQL HeatWave Lakehouse
MySQL Autopilot provides machine learning-based automation for MySQL HeatWave. Existing MySQL Autopilot capabilities such as auto provisioning and auto query plan improvement have been enhanced for MySQL HeatWave Lakehouse, further lowering database administration costs and raising performance. MySQL HeatWave Lakehouse also supports a number of new MySQL Autopilot capabilities.
Auto schema inference: Autopilot automatically infers how file data maps to database datatypes. Customers save time and effort because they do not have to specify the mapping manually for each new file that MySQL HeatWave Lakehouse will query.
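The general idea of schema inference can be sketched in a few lines: read a sample of rows and pick the narrowest type that fits every value in each column. This is a simplified illustration, not Autopilot's actual algorithm; the function names, the candidate types, and the sample data are all assumptions.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Return the narrowest MySQL-style type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"  # nothing to learn from; fall back to a generic type

    def all_parse(parse):
        # True only if `parse` accepts every sampled value.
        for value in non_empty:
            try:
                parse(value)
            except ValueError:
                return False
        return True

    # Naive checks, ordered from the most to the least restrictive type.
    if all_parse(int):
        return "BIGINT"
    if all_parse(float):
        return "DOUBLE"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "DATE"
    return f"VARCHAR({max(len(v) for v in non_empty)})"


def infer_schema(csv_text, sample_rows=100):
    """Map each CSV column name to an inferred type, using the first rows only."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for _, row in zip(range(sample_rows), reader)]
    columns = list(zip(*rows)) if rows else [() for _ in header]
    return {name: infer_type(col) for name, col in zip(header, columns)}


# Hypothetical file content.
sample = (
    "order_id,amount,order_date,customer\n"
    "1,19.99,2023-01-05,Ada\n"
    "2,5,2023-01-06,Grace\n"
)
schema = infer_schema(sample)
print(schema)
```

A production system would also have to handle nulls, mixed formats, and values outside the sample, which is where statistics gathered from the files become important.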
Adaptive data sampling: Autopilot automatically samples files in object storage to gather accurate statistics with minimal data access. MySQL HeatWave uses these statistics to generate and improve query plans and to determine the optimal schema mapping, among other purposes.
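One common way to gather statistics while reading as little data as possible is to grow a random sample until the estimate is precise enough. The sketch below applies that idea to a column mean; it is a generic technique shown under stated assumptions, not a description of how Autopilot samples files. The function name, the stopping rule, and the data are hypothetical.

```python
import random
import statistics


def adaptive_sample_mean(values, rel_tol=0.01, start=100, seed=0):
    """Estimate the mean of a non-empty list from a growing random sample.

    The sample doubles until the standard error of the mean is at most
    rel_tol * |mean|, or until every value has been read.
    Returns (estimate, rows_read).
    """
    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)  # random order stands in for random reads from files

    n = min(max(start, 2), len(values))  # stdev needs at least two values
    while True:
        sample = [values[i] for i in order[:n]]
        mean = statistics.fmean(sample)
        if n == len(values):
            return mean, n  # everything was read; nothing left to sample
        std_err = statistics.stdev(sample) / n ** 0.5
        if std_err <= rel_tol * abs(mean):
            return mean, n  # precise enough; stop reading
        n = min(2 * n, len(values))


# Hypothetical column of 100,000 values between 100 and 106.
column = [100 + (i % 7) for i in range(100_000)]
estimate, rows_read = adaptive_sample_mean(column)
print(f"estimated mean {estimate:.2f} after reading {rows_read} of {len(column)} rows")
```

Because this column has little spread relative to its mean, the first batch of 100 rows already meets the tolerance; a noisier column would trigger further doublings before the loop stops.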
Auto load: Autopilot analyses the data to predict the load time into MySQL HeatWave, determines how the datatypes should be mapped, and generates the load scripts automatically. Users do not have to specify the mapping of files to database schemas and tables manually.
Adaptive data flow: MySQL HeatWave Lakehouse dynamically adapts to the performance of the underlying object store. As a result, MySQL HeatWave can utilise the full potential of the underlying cloud infrastructure, improving overall performance, price performance, and availability.