A materialized view (MV) is a database object that stores the query results of a view. Unlike a regular view, an MV stores the results of the view's query. By saving the results of time-consuming operations (such as aggregation and joins), the MV allows these results to be reused during queries, avoiding the need to re-execute these operations. This ultimately accelerates query performance. In data warehouses and decision support systems, this can significantly reduce computation time and improve query efficiency.
Features of materialized views
Materialized view refresh methods
When you create a materialized view, you can choose different refresh methods to ensure that the data in the materialized view stays synchronized with the base table. The choice of refresh method directly affects system performance and the real-time nature of query results.
Complete refresh and incremental refresh
Complete refresh: During a complete refresh, the materialized view re-executes the query statement and overwrites the existing view results with the new computed data. This method is suitable for scenarios with low latency requirements, infrequent base table updates, complex query statements, or small data volumes.
Incremental refresh: Also known as fast refresh, this method only refreshes the incremental changes in the data. It is particularly suitable for large-scale datasets. Incremental refresh relies on materialized view logs (Mlogs) and has specific requirements for query statements. Currently, it supports queries involving single-table aggregation, multi-table joins, and multi-table join aggregations. For more information, see Refresh a materialized view in MySQL-compatible mode and Refresh a materialized view in Oracle-compatible mode. This method is suitable for scenarios with high latency requirements, large data volumes, and frequent data changes.
Automatic refresh and manual refresh
- Automatic refresh: When you create a materialized view, you can specify the refresh interval. The system will automatically schedule the refresh task based on the configured refresh time rules.
- Manual refresh: If automatic refresh is not configured or the refresh interval is too long, you can manually execute the refresh command for the materialized view to keep its data synchronized with the base table data.
Real-time materialized views
A real-time materialized view (RTMV) allows you to query real-time data from the materialized view. The query results are the same as those obtained by directly querying the base table, and the precomputed results of the materialized view can accelerate queries. It captures and processes changes in the underlying base table using Mlog mechanisms to ensure that the data in the materialized view reflects the latest state in a timely manner. Since real-time materialized views rely on Mlogs, the query statements must meet the requirements of incremental refresh materialized views. That is, only queries that meet the requirements of incremental refresh materialized views can be defined as real-time materialized views.
For more information about creating real-time materialized views, see Create a materialized view in MySQL-compatible mode and Create a materialized view in Oracle-compatible mode.
Nested materialized views
A nested materialized view is a materialized view that references another materialized view. This approach is very useful in ETL (Extract, Transform, Load) processes. In ETL workflows, nested materialized views can aggregate and transform data from different stages and store the results as independent views, avoiding repeated calculations and improving the efficiency of the entire ETL process. Nested materialized views do not support automatic cascading refreshes. Before using nested materialized views, you need to understand their refresh considerations to ensure that the query results obtained from the upper-level materialized view meet your expectations.
For more information about creating nested materialized views, see Create a materialized view in MySQL-compatible mode and Create a materialized view in Oracle-compatible mode.
Materialized view query rewriting
If you want to use a materialized view to accelerate queries without modifying the original query statements, you can leverage the query rewriting capability of the materialized view. The system automatically matches the query statement with the definition of the materialized view. If a matching materialized view is found, the query is automatically rewritten to use the materialized view. This approach can significantly improve query performance and efficiency without changing the business logic.
For more information about materialized view query rewriting, see Materialized view query rewriting in MySQL-compatible mode and Materialized view query rewriting in Oracle-compatible mode.
Query acceleration using materialized views
Materialized views are a key method for improving query performance in databases. They precompute and store query results, reducing the performance overhead of real-time calculations. To further optimize query performance for materialized views, consider the following methods:
Storage format of materialized view data
Choose between row-based and column-based storage for materialized views based on your specific application scenarios to accelerate queries.
- Row-based materialized views: Suitable when the data in the materialized view has been aggregated, and queries primarily access entire data rows.
- Column-based materialized views: Ideal for wide tables with large amounts of data and many columns, where queries often involve data analysis and aggregation.
Use of primary key materialized views
Primary keys ensure the uniqueness of data in materialized views and allow for more efficient lookup and update operations.
Creating indexes on materialized views
Creating indexes on materialized views can significantly enhance query performance. Indexes allow for quick data location, reducing the need for full-table scans.
Materialized views are an important tool for improving query performance, but their performance may be affected when handling large-scale data. To optimize query performance for materialized views, you can adjust the materialized view definition, add indexes, or modify the refresh strategy to further accelerate queries.
Create a materialized view
Notice
The following examples are run in MySQL-compatible mode.
Create the source table
First, we need to create a source table to store the original data. In this example, we create a sales table and an items table, which store sales data and item information, respectively.
Create the
salestable to store sales data.CREATE TABLE sales ( order_id INT PRIMARY KEY, user_id INT, item_id INT, item_count INT, region VARCHAR(100) );Create the
itemstable to store item information.CREATE TABLE items ( order_id INT, product_id INT, quantity INT, price_per_item DECIMAL(10, 2) NOT NULL, pic_url VARCHAR(1000), PRIMARY KEY (order_id, product_id) );
Create a full refresh materialized view
We will create a materialized view based on the sales table. This materialized view will summarize sales by product and region to accelerate queries.
Create a materialized view mv_sales_summary that summarizes sales by product and region.
CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id))
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify the following characteristics:
- PRIMARY KEY: specifies the primary key for the materialized view.
REFRESH COMPLETE: indicates that the materialized view is refreshed in full.START WITH sysdate() NEXT sysdate() + interval 1 hour: specifies that the materialized view is refreshed periodically, once every hour.
Create an incremental refresh materialized view
For scenarios where data changes frequently, incremental refresh can improve refresh efficiency. Before creating an incremental refresh materialized view, we need to create a materialized view log (Mlog) based on the base table of the materialized view. Only then can we successfully create an incremental refresh materialized view.
Note
For OceanBase Database V4.3.5, the automatic management of materialized view logs is supported starting from V4.3.5 BP4. If automatic management of mlogs is enabled, you do not need to create an mlog for the base table before creating an incremental refresh materialized view. OceanBase Database will automatically create the corresponding mlog or update the existing mlog table definition to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL-compatible mode) and Automatic management of materialized view logs (Oracle-compatible mode).
For more information about materialized view logs, see Materialized view logs (MySQL-compatible mode) and Materialized view logs (Oracle-compatible mode).
The following examples show three query scenarios supported by incremental refresh materialized views:
Single-table aggregation
Create an mlog for the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;Create an incremental refresh materialized view
mv_sales_summary_fastfor single-table aggregation.CREATE MATERIALIZED VIEW mv_sales_summary_fast REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Multi-table join
Create an mlog for the
salestable.(Optional) Drop the mlog for the
salestable.If you have not created an mlog for the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create an mlog for the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create an mlog for the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;Create an incremental refresh materialized view
mv_sales_items_joinfor multi-table join.CREATE MATERIALIZED VIEW mv_sales_items_join PARTITION BY HASH(order_id) PARTITIONS 10 REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url AS pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
Multi-table join aggregation
Create an mlog for the
salestable.(Optional) Drop the mlog for the
salestable.If you have not created an mlog for the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create an mlog for the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create an mlog for the
itemstable.(Optional) Drop the mlog for the
itemstable.If you have not created an mlog for the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create an mlog for the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item) INCLUDING NEW VALUES;
Create an incremental refresh materialized view
mv_sales_item_join_groupfor multi-table join aggregation.CREATE MATERIALIZED VIEW mv_sales_item_join_group REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.item_id AS item_id, s.region AS region, SUM(s.item_count * i.price_per_item) AS sum_price, count(*) AS c, count(s.item_count * i.price_per_item) AS count FROM sales s JOIN items i ON s.order_id = i.order_id GROUP BY item_id, region;
Create a real-time materialized view
A real-time materialized view ensures that the query results of the materialized view are synchronized with those of the base table when data changes. Since a real-time materialized view relies on a materialized view log, you must create a materialized view log before you create a real-time materialized view.
Note
For OceanBase Database V4.3.5, the automatic management of materialized view logs is supported starting from V4.3.5 BP4. If automatic management of materialized view logs is enabled, you do not need to create a materialized view log for the base table before you create an incremental refresh materialized view. OceanBase Database automatically creates the corresponding materialized view log or updates the existing materialized view log table definition to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL-compatible mode) and Automatic management of materialized view logs (Oracle-compatible mode).
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view named
mv_sales_summary_com.CREATE MATERIALIZED VIEW mv_sales_summary_com REFRESH FORCE START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In this example, we enabled ENABLE ON QUERY COMPUTATION to ensure that the materialized view is updated in real time during queries, so that you can obtain the latest data. The query statement of the materialized view meets the requirements for creating an incrementally refreshed materialized view.
Create a nested materialized view
In an ETL process, a nested materialized view combines multiple materialized views into a more complex data processing workflow. In this example, we will create two materialized views: one that joins sales and item information, and another that performs additional aggregations based on the first materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a materialized view named
mv1_sales_items_jointhat joins sales and item information.CREATE MATERIALIZED VIEW mv1_sales_items_join REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;Create a materialized view log on the materialized view
mv1_sales_items_join.CREATE MATERIALIZED VIEW LOG ON mv1_sales_items_join WITH PRIMARY KEY (region,quantity,price_per_item) INCLUDING NEW VALUES;Create a materialized view named
mv2_join_sumbased on the materialized viewmv1_sales_items_join.CREATE MATERIALIZED VIEW mv2_join_sum REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT region, sum(quantity * price_per_item) AS sum_price, count(*) as c, count(quantity * price_per_item) as count FROM mv1_sales_items_join GROUP BY region;
Create a columnstore materialized view
To improve query efficiency in scenarios involving large amounts of data, you can create a columnstore materialized view. With columnar storage, queries only read the required columns, significantly reducing disk I/O.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a columnstore materialized view named
wide_sales_column.CREATE MATERIALIZED VIEW wide_sales_column WITH COLUMN GROUP(each column) REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
In this example, we specified WITH COLUMN GROUP(each column) to ensure that the materialized view uses columnar storage. This is particularly useful in OLAP scenarios, especially when dealing with large amounts of data and wide tables.
Create a materialized view that supports query rewriting
In this example, the ENABLE QUERY REWRITE clause is specified to create a fully refreshed materialized view that supports query rewriting.
CREATE MATERIALIZED VIEW mv_sales_summary_select
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
ENABLE QUERY REWRITE
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, the ENABLE QUERY REWRITE and ENABLE ON QUERY COMPUTATION clauses are specified to create a real-time materialized view that supports query rewriting.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view
mv_sales_summary_com_selectthat supports query rewriting.CREATE MATERIALIZED VIEW mv_sales_summary_com_select REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION ENABLE QUERY REWRITE AS SELECT item_id, region, SUM(item_count) AS total_sales, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Create a materialized view index
To further optimize query performance, you can create an index on a materialized view. A materialized view index can accelerate queries on the materialized view, especially when the queries involve large amounts of data.
(Optional) Create a materialized view
mv_sales_summarythat summarizes the sales volume by product and region.If you have already created the materialized view
mv_sales_summarybased on the example in Create a fully refreshed materialized view, skip this step.CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id)) REFRESH COMPLETE START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count FROM sales GROUP BY item_id, region;Create an index
idx_mv_sales_summaryon theregioncolumn of the materialized viewmv_sales_summary.CREATE INDEX idx_mv_sales_summary ON mv_sales_summary (region);This index can accelerate queries on the
mv_sales_summarymaterialized view, especially those based on theregioncolumn.
References
- For more information about materialized views, see Materialized views (MySQL-compatible mode) and Materialized views (Oracle-compatible mode).
- For more information about refreshing materialized views, see Refresh materialized views (MySQL-compatible mode) and Refresh materialized views (Oracle-compatible mode).
- For more information about dropping materialized view logs, see Materialized view logs (MySQL-compatible mode) and Materialized view logs (Oracle-compatible mode).
- For more information about dropping materialized views, see Drop materialized views (MySQL-compatible mode) and Drop materialized views (Oracle-compatible mode).