A materialized view (MV) is a database object that stores the results of a query. Unlike a regular view, an MV retains the query results, allowing you to reuse them without re-executing the query. This avoids redundant execution of time-consuming and resource-intensive operations such as aggregation and joins. By doing so, it significantly accelerates query performance. This is particularly beneficial in data warehouses and decision support systems, where it reduces computational time and enhances query efficiency.
Characteristics of materialized views
Materialized view refresh methods
When you create a materialized view, you can choose different refresh methods to ensure that the data in the materialized view stays synchronized with the base table. The choice of refresh method directly affects system performance and the real-time nature of query results.
Full refresh and incremental refresh
Full refresh (Complete Refresh): During a full refresh, the materialized view re-executes the query statement and overwrites the existing view results with the new computed data. This method is suitable for scenarios with low latency requirements, infrequent base table updates, complex query statements, or small data volumes.
Incremental refresh (Fast Refresh): Incremental refresh only updates the changed data, making it particularly suitable for large datasets. Incremental refresh relies on Materialized View Logs (Mlogs) and has specific requirements for query statements. Currently, it supports queries involving single-table aggregation, multi-table joins, and multi-table join aggregations. For more information, see Refresh a materialized view in MySQL mode and Refresh a materialized view in Oracle mode. It is suitable for scenarios with high latency requirements, large data volumes, and frequent changes.
Automatic refresh and manual refresh
- Automatic refresh: When you create a materialized view, you can specify the refresh interval. The system will automatically schedule the refresh task based on the configured refresh time rules.
- Manual refresh: If automatic refresh is not configured or the interval is too long, you can manually execute the refresh command to keep the data in the materialized view synchronized with the base table.
Real-time materialized views
Real-time materialized views (Real-Time Materialized Views) allow you to query real-time data from the materialized view. The query results are the same as those obtained by directly querying the base table, and they can leverage the precomputed results of the materialized view to accelerate queries. Real-time materialized views use Mlogs to capture and process changes in the underlying base tables, ensuring that the data in the materialized view reflects the latest state in a timely manner. Since real-time materialized views rely on Mlogs, their query statements must meet the requirements of incremental refresh, which means that only materialized views that satisfy the incremental refresh requirements can be defined as real-time materialized views.
For more information about creating real-time materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Nested materialized views
A nested materialized view is one that is referenced by another materialized view. This approach is particularly useful in ETL (Extract, Transform, Load) processes. In ETL workflows, nested materialized views can aggregate and transform data from different stages and store the results as independent views, avoiding repeated calculations and improving the efficiency of the entire ETL process. However, nested materialized views do not support automatic cascading refreshes. Before using nested materialized views, you need to understand the refresh considerations for nested materialized views to ensure that the data retrieved from the upper-level materialized view meets your expectations.
For more information about creating nested materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Materialized view query rewriting
If you want to use a materialized view to accelerate queries without modifying the original query statements, you can leverage the query rewriting capability of the materialized view. The system automatically matches the query statements with the definitions of the materialized views. If a matching materialized view is found, the query is automatically rewritten to use the materialized view. This approach significantly improves query performance and efficiency without changing the business logic.
For more information about materialized view query rewriting, see Materialized view query rewriting in MySQL mode and Materialized view query rewriting in Oracle mode.
Query acceleration methods for materialized views
Materialized views are an important means of improving query performance in databases. By precomputing and storing query results, they reduce the performance overhead of real-time calculations. To further optimize the query performance of materialized views, you can consider the following methods:
Storage format of materialized view data
Choose between row-based and column-based materialized views based on your specific application scenarios to accelerate queries.
- Row-based materialized views: Suitable when the data in the materialized view has been aggregated, and queries primarily access entire data rows.
- Column-based materialized views: Ideal for wide tables with large amounts of data and numerous columns, where queries involve more data analysis and aggregation operations.
Use primary key materialized views
Primary keys ensure the uniqueness of data in materialized views and allow for more efficient lookup and update operations.
Create indexes on materialized views
Creating indexes on materialized views can significantly enhance query performance. Indexes allow for quick data location, reducing the need for full-table scans.
Materialized views are an important tool for improving query performance, but their performance may be affected when handling large-scale data. To optimize the query performance of materialized views, you can adjust the materialized view definition, add indexes, or modify the refresh strategy to further accelerate queries.
Create a materialized view
Notice
The following examples are executed in MySQL mode.
Create the source table
First, we need to create a source table to store the original data. In this example, we create a sales table and an items table, which store the sales data and item information, respectively.
Create the
salestable.CREATE TABLE sales ( order_id INT PRIMARY KEY, user_id INT, item_id INT, item_count INT, region VARCHAR(100) );Create the
itemstable.CREATE TABLE items ( order_id INT, product_id INT, quantity INT, price_per_item DECIMAL(10, 2) NOT NULL, pic_url VARCHAR(1000), PRIMARY KEY (order_id, product_id) );
Create a materialized view for full refresh
We will create a materialized view based on the sales table. This materialized view will aggregate sales by product and region to speed up queries.
Create the mv_sales_summary materialized view to aggregate sales by product and region.
CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id))
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify the following properties for the materialized view:
- PRIMARY KEY: Specifies the primary key for the materialized view.
REFRESH COMPLETE: Indicates that a complete refresh is used.START WITH sysdate() NEXT sysdate() + interval 1 hour: Specifies that the materialized view is refreshed every hour.
Create a materialized view for incremental refresh
For scenarios where data changes frequently, incremental refresh can improve refresh efficiency. Before creating a materialized view for incremental refresh, we need to create a materialized view log (Mlog) based on the base table of the materialized view.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management of mlogs is enabled, you do not need to create an mlog for the base table before creating a materialized view for incremental refresh. OceanBase Database will automatically create the corresponding mlog or update the existing mlog table definition to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
For more information about materialized view logs, see Materialized view logs in MySQL mode and Materialized view logs in Oracle mode.
The following examples show three query scenarios supported by materialized views for incremental refresh:
Single-table aggregation
Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;Create the
mv_sales_summary_fastmaterialized view for incremental refresh to aggregate data from thesalestable.CREATE MATERIALIZED VIEW mv_sales_summary_fast REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Multi-table join
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;Create the
mv_sales_items_joinmaterialized view for incremental refresh to join data from thesalesanditemstables.CREATE MATERIALIZED VIEW mv_sales_items_join PARTITION BY HASH(order_id) PARTITIONS 10 REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url AS pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
Multi-table join aggregation
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item) INCLUDING NEW VALUES;
Create the
mv_sales_item_join_groupmaterialized view for incremental refresh to join and aggregate data from thesalesanditemstables.CREATE MATERIALIZED VIEW mv_sales_item_join_group REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.item_id AS item_id, s.region AS region, SUM(s.item_count * i.price_per_item) AS sum_price, count(*) AS c, count(s.item_count * i.price_per_item) AS count FROM sales s JOIN items i ON s.order_id = i.order_id GROUP BY item_id, region;
Create a real-time materialized view
A real-time materialized view ensures that the query results of the materialized view are synchronized with the base table when data changes. Since a real-time materialized view relies on a materialized view log, you need to create a materialized view log before creating a real-time materialized view.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management of materialized view logs is enabled, you do not need to create a materialized view log for the base table before creating a real-time materialized view. OceanBase Database will automatically create the corresponding materialized view log or update the existing materialized view log table definition to include the columns required by the newly created materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view named
mv_sales_summary_com.CREATE MATERIALIZED VIEW mv_sales_summary_com REFRESH FORCE START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In this example, we enable ENABLE ON QUERY COMPUTATION so that the materialized view is updated in real time during queries to ensure that the latest data is obtained. The query statement of the materialized view also meets the requirements for creating an incrementally refreshed materialized view.
Create a nested materialized view
In the ETL process, a nested materialized view is used to combine multiple materialized views into a more complex data processing workflow. In this example, we will create two materialized views: one that associates sales information with item information, and another that performs additional aggregations based on the first materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a materialized view named
mv1_sales_items_jointhat associates sales information with item information.CREATE MATERIALIZED VIEW mv1_sales_items_join REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;Create a materialized view log on the materialized view
mv1_sales_items_join.CREATE MATERIALIZED VIEW LOG ON mv1_sales_items_join WITH PRIMARY KEY (region,quantity,price_per_item) INCLUDING NEW VALUES;Create a materialized view named
mv2_join_sumbased on the materialized viewmv1_sales_items_join.CREATE MATERIALIZED VIEW mv2_join_sum REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT region, sum(quantity * price_per_item) AS sum_price, count(*) as c, count(quantity * price_per_item) as count FROM mv1_sales_items_join GROUP BY region;
Create a columnstore materialized view
If you want to improve query efficiency in a scenario with a large amount of data, you can create a columnstore materialized view. With columnar storage, only the required columns are read during queries, significantly reducing disk I/O.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a columnstore materialized view named
wide_sales_column.CREATE MATERIALIZED VIEW wide_sales_column WITH COLUMN GROUP(each column) REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
In this example, we specify WITH COLUMN GROUP(each column) to use columnar storage for the materialized view. This is particularly useful in OLAP scenarios, especially when dealing with large amounts of data and wide table queries.
Create a materialized view that supports query rewriting
In this example, we specify ENABLE QUERY REWRITE to create a fully refreshed materialized view that supports query rewriting.
CREATE MATERIALIZED VIEW mv_sales_summary_select
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
ENABLE QUERY REWRITE
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify ENABLE QUERY REWRITE and ENABLE ON QUERY COMPUTATION to create a real-time materialized view that supports query rewriting.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view named
mv_sales_summary_com_selectthat supports query rewriting.CREATE MATERIALIZED VIEW mv_sales_summary_com_select REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION ENABLE QUERY REWRITE AS SELECT item_id, region, SUM(item_count) AS total_sales, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Create a materialized view index
To further optimize query performance, you can create an index on a materialized view. A materialized view index can accelerate queries on the materialized view, especially when dealing with large amounts of data.
(Optional) Create a materialized view
mv_sales_summarythat summarizes sales by product and region.If you have already created the
mv_sales_summarymaterialized view based on the example in Create a fully refreshed materialized view, skip this step.CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id)) REFRESH COMPLETE START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count FROM sales GROUP BY item_id, region;Create an index
idx_mv_sales_summaryon theregioncolumn of themv_sales_summarymaterialized view.CREATE INDEX idx_mv_sales_summary ON mv_sales_summary (region);This index can accelerate queries on the
mv_sales_summarymaterialized view, especially those based on theregioncolumn.
References
- For more information about materialized views, see Materialized views (MySQL mode) and Materialized views (Oracle mode).
- For more information about refreshing materialized views, see Refresh materialized views (MySQL mode) and Refresh materialized views (Oracle mode).
- For more information about deleting materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
- For more information about deleting materialized views, see Delete materialized views (MySQL mode) and Delete materialized views (Oracle mode).
