A materialized view (MV) is a type of database object that, unlike a regular view, stores the results of its query. By saving the results of certain time-consuming operations—such as aggregations and joins—the results can be directly reused during queries, avoiding the need to repeatedly execute these resource-intensive operations and ultimately accelerating query performance. This is especially beneficial in data warehouses and decision support systems, where it can significantly reduce computation time and improve query efficiency.
Features of materialized views
Refresh methods of materialized views
When creating a materialized view, users can choose different refresh methods to ensure that the data in the materialized view stays synchronized with the base tables. The choice of refresh method directly affects system performance and the real-time accuracy of query results.
Complete refresh and incremental refresh
Complete refresh: Each time a materialized view is refreshed, it re-executes the query statement and overwrites the original view result data with the calculation result. This method is suitable for scenarios with low latency requirements, infrequent base table data updates, complex query statements, or small data volumes.
Incremental refresh: Also known as fast refresh, it only refreshes incremental change data. This method is particularly suitable for large-scale datasets. Incremental refresh requires the use of a materialized view log (mlog). It has certain requirements for query statements, currently supporting single-table aggregation, multi-table join, and multi-table join aggregation query statements, with specific requirements for aggregate functions and join methods. For more information, see Refresh materialized views in MySQL compatible mode and Refresh materialized views in Oracle compatible mode. This method is suitable for business scenarios with high latency requirements, large data volumes, and frequent changes.
Automatic refresh and manual refresh
- Automatic refresh: When you create a materialized view, you can specify the refresh interval for the materialized view. The system automatically schedules refresh tasks for the materialized view based on the configured refresh schedule.
- Manual refresh: If a materialized view does not have automatic refresh enabled or the automatic refresh interval is large, you can manually execute the refresh command for the materialized view to synchronize the data of the materialized view with that of the base table.
Real-time materialized views
As the name suggests, real-time materialized views allow you to obtain real-time data by querying the materialized view, with query results identical to those obtained by querying the base tables directly, while also benefiting from the precomputed results of the materialized view to accelerate queries. Real-time materialized views use the materialized view log mechanism to capture and process changes in the underlying base tables, ensuring that the data in the materialized view promptly reflects the latest state. Because they rely on the materialized view log, the requirements for queries on real-time materialized views are the same as those for incrementally refreshed materialized views—that is, only materialized views that meet the requirements for incremental refresh can be defined as real-time materialized views.
For more information about how to create a real-time materialized view, see Create a materialized view in MySQL compatible mode and Create a materialized view in Oracle compatible mode.
Nested materialized views
A nested materialized view refers to a materialized view that is referenced or depended upon by another materialized view. This approach is particularly useful in ETL (Extract, Transform, Load) processes. During ETL workflows, nested materialized views can store aggregated or transformed results from different stages as separate views, avoiding repeated calculations and improving the overall efficiency of the ETL process. Nested materialized views do not support automatic cascading refresh. Before using nested materialized views, it is important to understand the refresh considerations to ensure that querying the upper-level materialized view returns the expected data results.
For more information about how to create nested materialized views, see Create a materialized view in MySQL compatible mode and Create a materialized view in Oracle compatible mode.
Query rewriting for materialized views
You can leverage the query rewriting capability of materialized views to accelerate queries without modifying the original query statements. The system can automatically match the query statements with the definitions of materialized views. If a matching materialized view is found, the system automatically rewrites the query to use the materialized view. This way, you can significantly improve query performance and efficiency without changing your business logic.
For more information about query rewriting for materialized views, see Query rewriting for materialized views in MySQL compatible mode and Query rewriting for materialized views in Oracle compatible mode.
Methods for accelerating queries using materialized views
Materialized views are an important tool for accelerating queries in databases. They precompute and store query results to reduce the performance overhead of real-time computations. To further optimize the query performance of materialized views, you can try the following methods:
Storage format of materialized view data
You can choose between row-based and columnar materialized views based on your actual application scenarios to speed up queries.
- Scenarios for row-based materialized views: If the data in the materialized view has been aggregated, your queries will typically access entire data rows.
- Scenarios for columnar materialized views: For wide tables with a large number of columns and rows, queries on these tables are more likely to involve data analysis and aggregation operations.
Use of primary key materialized views
Primary keys ensure the uniqueness of data in materialized views and enable more efficient data lookup and updates.
Create indexes on materialized views
Creating indexes on materialized views can significantly improve query performance. Indexes allow you to quickly locate the required data and reduce the need for full-table scans.
Materialized views are an important tool for improving query performance. However, when dealing with large amounts of data, query performance may be affected. To optimize query performance for materialized views, you can try the preceding methods to further accelerate queries.
Create a materialized view
Note
The following examples are run in MySQL compatible mode.
Create a source table
First, we need to create the source tables to store the raw data. In this example, we will create a sales table and an items table, which contain sales data and product information, respectively.
Create the sales data table
sales.CREATE TABLE sales ( order_id INT PRIMARY KEY, user_id INT, item_id INT, item_count INT, region VARCHAR(100) );Create the product information table
items.CREATE TABLE items ( order_id INT, product_id INT, quantity INT, price_per_item DECIMAL(10, 2) NOT NULL, pic_url VARCHAR(1000), PRIMARY KEY (order_id, product_id) );
Create a complete refresh materialized view
We will create a materialized view based on the sales table. This materialized view will summarize the sales by product and region to accelerate queries.
-- Materialized view that summarizes sales by product and region
CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id))
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
AS
SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify the following characteristics when we create the materialized view:
- PRIMARY KEY: the primary key of the materialized view.
REFRESH COMPLETE: complete refresh.START WITH sysdate() NEXT sysdate() + interval 1 hour: scheduled refresh on an hourly basis.
Create an incremental refresh materialized view
For scenarios with frequent data changes, incremental refresh can be used to improve refresh efficiency. Before creating an incremental refresh materialized view, you must create a materialized view log (mlog) based on the base table of the materialized view. Then, you can successfully create an incremental refresh materialized view.
For more information about materialized view logs, see Materialized view logs in MySQL compatible mode and Materialized view logs in Oracle compatible mode.
The following examples show three query scenarios that are supported for incremental refresh materialized views:
Aggregate data in a single table
Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;Create an incrementally refreshed materialized view
mv_sales_summary_fastfor single-table aggregation.CREATE MATERIALIZED VIEW mv_sales_summary_fast REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Join data in multiple tables
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If a materialized view log has not been created on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;Create an incrementally refreshed materialized view
mv_sales_items_joinfor multi-table joins.CREATE MATERIALIZED VIEW mv_sales_items_join PARTITION BY HASH(order_id) PARTITIONS 10 REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url AS pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
Aggregate data that is returned by joining data in multiple tables
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If a materialized view log has not been created on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If a materialized view log has not been created on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item) INCLUDING NEW VALUES;
Create an incrementally refreshed materialized view
mv_sales_item_join_groupfor multi-table join aggregation.CREATE MATERIALIZED VIEW mv_sales_item_join_group REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.item_id AS item_id, s.region AS region, SUM(s.item_count * i.price_per_item) AS sum_price, count(*) AS c, count(s.item_count * i.price_per_item) AS count FROM sales s JOIN items i ON s.order_id = i.order_id GROUP BY item_id, region;
Create a real-time materialized view
Real-time materialized views can ensure that the query results of the materialized view remain synchronized with the base table when data changes occur. Because real-time materialized views rely on materialized view logs, just like incrementally refreshed materialized views, you need to create the materialized view log before creating the materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If a materialized view log has not been created on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create the real-time materialized view
mv_sales_summary_com.CREATE MATERIALIZED VIEW mv_sales_summary_com REFRESH FORCE START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In this example, we enabled ENABLE ON QUERY COMPUTATION, which allows the materialized view to be updated in real time when queried, ensuring the latest data is retrieved. Additionally, the query statement for the materialized view meets the requirements for creating an incrementally refreshed materialized view.
Create a nested materialized view
During the ETL process, nested materialized views are used to combine multiple materialized views into a more complex data processing workflow. Here, we will create two materialized views: one to join sales information with item information, and another to perform further aggregation based on the first materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If a materialized view log has not been created on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If a materialized view log has not been created on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;
Create the materialized view
mv1_sales_items_jointo join sales information with item information.CREATE MATERIALIZED VIEW mv1_sales_items_join REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;Create a materialized view log on the materialized view
mv1_sales_items_join.CREATE MATERIALIZED VIEW LOG ON mv1_sales_items_join WITH PRIMARY KEY (region, quantity, price_per_item) INCLUDING NEW VALUES;Create the materialized view
mv2_join_sumbased on the materialized viewmv1_sales_items_join.CREATE MATERIALIZED VIEW mv2_join_sum REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT region, sum(quantity * price_per_item) AS sum_price, count(*) as c, count(quantity * price_per_item) as count FROM mv1_sales_items_join GROUP BY region;
Create a columnstore materialized view
Suppose we want to improve query efficiency in a large-scale data analysis scenario. We can create a columnar materialized view. By using columnar storage, only the required columns are read during queries, which significantly reduces disk I/O.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If a materialized view log has not been created on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If a materialized view log has not been created on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;
Create the columnar materialized view
wide_sales_column.CREATE MATERIALIZED VIEW wide_sales_column WITH COLUMN GROUP(each column) REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
In this example of creating a columnar materialized view, we specified WITH COLUMN GROUP(each column) so that the materialized view uses a columnar storage format. This is very useful in OLAP scenarios, especially for large-scale data and wide table queries.
Create a materialized view for query rewriting
In this example, specifying ENABLE QUERY REWRITE creates a fully refreshed materialized view that supports query rewrite.
CREATE MATERIALIZED VIEW mv_sales_summary_select
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
ENABLE QUERY REWRITE
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, specifying both ENABLE QUERY REWRITE and ENABLE ON QUERY COMPUTATION creates a real-time materialized view that supports query rewrite.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If a materialized view log has not been created on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create the real-time materialized view
mv_sales_summary_com_selectthat supports query rewrite.CREATE MATERIALIZED VIEW mv_sales_summary_com_select REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION ENABLE QUERY REWRITE AS SELECT item_id, region, SUM(item_count) AS total_sales, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In these examples, query rewrite is enabled so that the optimizer can automatically use the materialized view to answer queries, improving performance. The real-time materialized view can also compute results on the fly when necessary.
Create an index on a materialized view
To further optimize query performance, you can create indexes on materialized views. Indexes on materialized views help speed up queries against the view, especially when dealing with large amounts of data.
(Optional) Create the materialized view
mv_sales_summarythat summarizes sales by product and region.If you have already created the
mv_sales_summarymaterialized view following the example in Creating a Complete Refresh Materialized View, skip this step.CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id)) REFRESH COMPLETE START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count FROM sales GROUP BY item_id, region;Create an index
idx_mv_sales_summaryon theregioncolumn of the materialized viewmv_sales_summary.CREATE INDEX idx_mv_sales_summary ON mv_sales_summary (region);This index can speed up queries on the
mv_sales_summarymaterialized view, especially those based on theregioncolumn.
References
- For a detailed introduction and usage guide on materialized views, see Materialized views overview (MySQL compatible mode) and Materialized views overview (Oracle compatible mode).
- For a detailed introduction and usage guide on refreshing materialized views, see Refreshing materialized views (MySQL compatible mode) and Refreshing materialized views (Oracle compatible mode).
- For a detailed introduction and usage guide on deleting materialized view logs, see Materialized view logs (MySQL compatible mode) and Materialized view logs (Oracle compatible mode).
- For a detailed introduction and usage guide on deleting materialized views, see Deleting materialized views (MySQL compatible mode) and Deleting materialized views (Oracle compatible mode).