A materialized view (MV) is a database object that stores the results of a query. Unlike regular views, which are virtual tables, materialized views physically store the data. By caching the results of expensive operations such as aggregation and joins, materialized views allow queries to reuse these results instead of recomputing them. This significantly accelerates query performance, especially in data warehouses and decision support systems, by reducing computation time and improving query efficiency.
Why use materialized views in AP / real-time data warehouse
- Reduce redundant computation: Multi-dimensional analysis, wide table joins, and periodic reports often involve the same JOIN / GROUP BY operations. Materialized views allow these operations to be unified and maintained through refresh tasks.
- Control freshness and cost: By using strategies such as full, incremental, and real-time (on-demand), you can balance latency and resource usage (see Classic scenarios of materialized views for more information).
- Combine with other AP capabilities: For example, materialized views can be in columnar format, query rewriting can hit materialized results, and they can be combined with external tables for lake-warehouse synergy (scenarios and step-by-step SQL are described in Classic scenarios of materialized views).
Scenarios
- Data aggregation: Aggregate transaction details, behavioral details, etc., by day, week, or month.
- Fixed reports / metric layer: Precompute metric tables for reports or API queries.
- Optimize repeated queries: Materialize resource-intensive SQL results to avoid repeated execution.
- Data redundancy and replica strategy (planned by business): Redundantly store aggregated results in compliance with regulations, using routing and replica strategies.
- Pre-aggregation for monitoring: Pre-aggregate monitoring details by time granularity.
Core capabilities
Query acceleration
Single-table aggregation: For scenarios where there is a large amount of existing data in a single table and frequent group aggregations are needed, you can create an incremental refresh (refresh fast) materialized view for single-table group aggregations. If you need to merge the latest changes during queries, you can use real-time materialized views (ENABLE ON QUERY COMPUTATION) when the conditions are met.
Multi-table joins: For scenarios where there is a large amount of existing data in multiple tables and frequent join queries are needed, you can create an incremental refresh (refresh fast) materialized view for multi-table joins. After the data is properly maintained, you can use query rewriting to automatically redirect queries on the base tables to the materialized view results.
Materialized views and real-time data warehouses
In the AP / Real-time Data Warehouse scenario of OceanBase Database, materialized views are commonly used to transform repeated calculations on detailed or wide tables into refreshable physical result sets. These result sets are organized based on time (batch or near real-time), space (dimensions and granularity), and themes (business domains) to support multi-dimensional aggregation and derived metrics. When combined with features like columnar storage, query rewriting, and incremental refresh, materialized views help reduce the scanning and computational pressure on the base tables during online analytical processing (OLAP).
The role of materialized views in real-time data warehouses
A typical real-time or near-real-time data warehouse link can be summarized as: Access (CDC / File / External Table) → DWD (Detailed Data Warehouse) → DWS / ADS (Aggregated and Thematic Data) → Query and Reporting. Materialized views are suitable for handling repetitive aggregation, join, and denormalization in this process:
- Multidimensional Aggregation: Perform
GROUP BYoperations on the fact table and join dimensions based on factors such as region, category, and time window. - Data Layering: Combine full, incremental, and real-time materialized views with scheduled or manual refreshes to balance latency, resource usage, and result consistency.
- Query Acceleration: Automatically rewrite queries to use materialized view results while retaining access to the detailed table SQL, reducing the need for business changes.
For more information, see Overview of materialized views (MySQL) and Overview of materialized views (Oracle).
Refresh capabilities: Align materialized views with base tables
Materialized views need to be aligned with base tables through refresh strategies.
| Capability | Description | Practical entry |
|---|---|---|
| Full refresh | Recalculate and overwrite results based on the definition each time. Suitable for scenarios with complex definitions, long windows, or the need for complete alignment. | Refresh a materialized view (MySQL mode)、Refresh a materialized view (Oracle mode) |
| Incremental refresh (Fast refresh) | Based on Mlog, suitable for large base tables with relatively localized changes; has limitations on SQL form. | Same as above; Mlog details see Materialized view log (MySQL mode)、Materialized view log (Oracle mode) |
| Automatic/Manual refresh | Scheduling can be configured at creation; manual refresh is also available as needed (e.g., DBMS_MVIEW.REFRESH). |
Create a materialized view (MySQL mode)、Create a materialized view (Oracle mode) |
| Automatic Mlog management | Reduces manual maintenance costs of Mlog on supported versions. | Automatic management of materialized view logs (MySQL mode)、Automatic management of materialized view logs (Oracle mode) |
| Refresh resource isolation | Large-scale refreshes and Mlog maintenance can be isolated from online loads. | Materialized view resource isolation (MySQL mode)、Materialized view resource isolation (Oracle mode) |
Parallelism can be optimized using variables like mview_refresh_dop.
Query rewriting: Hit the materialized result with minimal changes to your business SQL
Query rewriting refers to the optimizer matching a query against an existing materialized view by rewriting the query to access the materialized view instead of the base table (or equivalent semantics), and using the precomputed results to answer the query.
- When creating a materialized view, you can enable
ENABLE QUERY REWRITE(specific syntax and limitations are detailed in the documentation). - Whether a query is rewritten depends on session or global variables, such as query_rewrite_enabled and query_rewrite_integrity.
For more details, see: Query rewriting for materialized views (MySQL mode) and Query rewriting for materialized views (Oracle mode).
Materialized View Capabilities and Selection
To avoid repetition with the previous sections on "real-time data warehouse location, refresh capabilities, and query rewrite capabilities," this section retains only the key decision-making points. Examples and SQL are provided in the following sections.
Refresh strategies (complete, fast, automatic, manual)
- Complete refresh: During a complete refresh, the materialized view re-executes the query statement and overwrites the existing view result data with the new computed results. This strategy is suitable for scenarios with low latency requirements, infrequent updates to the base tables, complex query statements, or small data volumes.
- Fast refresh: Also known as fast refresh, this strategy relies on a materialized view log (Mlog) to capture and process changes in the underlying base tables. It supports specific query types, including single-table aggregations, multi-table joins, and multi-table join aggregations. Fast refresh is ideal for scenarios involving large data volumes and frequent changes.
- Automatic refresh: When creating a materialized view, you can specify the refresh interval. The system then automatically schedules refresh tasks based on the configured refresh time rules.
- Manual refresh: If automatic refresh is not configured or the interval is too long, you can manually execute a refresh command to synchronize the materialized view data with the base table data.
For more information on the syntax and limitations, see Refresh a materialized view (MySQL) and Refresh a materialized view (Oracle).
Real-time materialized views (ON QUERY COMPUTATION)
Real-time materialized views (Real-Time Materialized Views) allow you to obtain real-time data by querying the materialized view. The query results are the same as those obtained by directly querying the base table. Additionally, the precomputed results of the materialized view can accelerate queries. Real-time materialized views capture and process changes in the underlying base tables using the Mlog mechanism, ensuring that the data in the materialized view reflects the latest state. Since real-time materialized views rely on Mlogs, their query statements must meet the requirements of fast refresh. In other words, only materialized views that support fast refresh can be defined as real-time materialized views.
For more information on creating real-time materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Nested materialized views
A nested materialized view is one that is referenced by another materialized view. This approach is particularly useful in ETL (Extract, Transform, Load) processes. In an ETL workflow, nested materialized views can aggregate and transform data from different stages and store the results as independent views. This avoids redundant calculations and improves the efficiency of the entire ETL process. However, nested materialized views do not support automatic cascading refreshes. Before using nested materialized views, it is essential to understand the refresh considerations to ensure that the data retrieved from the upper-level materialized view meets the expected results.
For more information on creating nested materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Query rewriting (ENABLE QUERY REWRITE)
If you want to use a materialized view to accelerate queries without modifying the original query statements, you can leverage the query rewriting capability of the materialized view. The system automatically matches the query statement with the materialized view definition. If a matching materialized view is found, the system automatically rewrites the query to use the materialized view. This approach enhances query performance and efficiency without changing the business logic.
For more information on materialized view query rewriting, see Materialized view query rewriting in MySQL mode and Materialized view query rewriting in Oracle mode.
Query acceleration optimization methods
| Optimization dimension | Recommended practice | Applicable scenarios | Value |
|---|---|---|---|
| Selecting the storage format for materialized views based on query type | Use wide tables for columnar storage and row-based tables for row-level access. | Scenarios where a balance between scanning efficiency and row access efficiency is needed. | Reduces unnecessary column reads, lowers I/O, and improves query efficiency. |
| Storage format for materialized view data | Choose between row-based or columnar storage based on the actual application scenario. Row-based storage is suitable for scenarios where results are already aggregated and accessed as rows. Columnar storage is suitable for wide tables and aggregation analysis. | Reports, wide table analysis, and fixed metric queries. | Better aligns with access patterns, reducing scanning overhead. |
| Using primary key materialized views | Define a primary key for the materialized view to ensure result uniqueness. | Scenarios requiring stable positioning, updates, or deduplication of materialized results. | Improves lookup and update efficiency, reduces the risk of duplicate data. |
| Creating indexes on materialized views | Create indexes on filter columns, join columns, and sort columns. | High-frequency filter, join, and sort queries. | Reduces full table scans and accelerates hit paths. |
| Adjusting refresh strategies based on SLA | Balance between latency, freshness, and resources, and choose between full/incremental and automatic/manual refresh based on needs. | Different real-time level AP/reporting scenarios. | Avoids blindly pursuing real-time updates, controls resource costs, and maintains available performance. |
Materialized views are a key method for query acceleration. For large-scale data scenarios, the combination of "storage format + index + refresh strategy" can continuously optimize query performance.
Create a materialized view
Notice
The following examples are executed in MySQL mode.
Create a source table
First, we need to create a source table to store the original data. In this example, we create a sales table and an items table to store sales data and item information, respectively.
Create the
salestable.CREATE TABLE sales ( order_id INT PRIMARY KEY, user_id INT, item_id INT, item_count INT, region VARCHAR(100) );Create the
itemstable.CREATE TABLE items ( order_id INT, product_id INT, quantity INT, price_per_item DECIMAL(10, 2) NOT NULL, pic_url VARCHAR(1000), PRIMARY KEY (order_id, product_id) );
Create a complete refresh materialized view
We will create a materialized view based on the sales table. This materialized view will aggregate sales by product and region to accelerate queries.
Create a materialized view mv_sales_summary that aggregates sales by product and region.
CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id))
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify the following characteristics for the materialized view:
- PRIMARY KEY: Specifies the primary key for the materialized view.
REFRESH COMPLETE: Indicates that the materialized view will be refreshed completely.START WITH sysdate() NEXT sysdate() + interval 1 hour: Specifies that the materialized view will be refreshed every hour.
Create an incremental refresh materialized view
For scenarios with frequent data changes, incremental refreshes can improve refresh efficiency. Before creating an incremental refresh materialized view, we need to create a materialized view log (Mlog) based on the base table of the materialized view.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management of mlogs is enabled, you do not need to create an mlog for the base table before creating an incremental refresh materialized view. OceanBase Database will automatically create the corresponding mlog or update the existing mlog table definition to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
For more information about materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
Next, we will show three query scenarios supported by incremental refresh materialized views:
Single-table aggregation
Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;Create an incremental refresh materialized view
mv_sales_summary_fastfor single-table aggregation.CREATE MATERIALIZED VIEW mv_sales_summary_fast REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Multi-table join
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;Create an incremental refresh materialized view
mv_sales_items_joinfor multi-table join.CREATE MATERIALIZED VIEW mv_sales_items_join PARTITION BY HASH(order_id) PARTITIONS 10 REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url AS pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
Multi-table join aggregation
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item) INCLUDING NEW VALUES;
Create an incremental refresh materialized view
mv_sales_item_join_groupfor multi-table join aggregation.CREATE MATERIALIZED VIEW mv_sales_item_join_group REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.item_id AS item_id, s.region AS region, SUM(s.item_count * i.price_per_item) AS sum_price, count(*) AS c, count(s.item_count * i.price_per_item) AS count FROM sales s JOIN items i ON s.order_id = i.order_id GROUP BY item_id, region;
Create a real-time materialized view
A real-time materialized view ensures that the query results of the materialized view are synchronized with the base table when the data changes. Like incremental refresh materialized views, real-time materialized views require the creation of materialized view logs before they can be created.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management is enabled, when you create a real-time materialized view, OceanBase Database automatically creates the corresponding materialized view log or updates the existing materialized view log table definition to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view named
mv_sales_summary_com.CREATE MATERIALIZED VIEW mv_sales_summary_com REFRESH FORCE START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In this example, we enable ENABLE ON QUERY COMPUTATION, which allows the materialized view to be updated in real time during queries. This ensures that the latest data is obtained, and the query statement meets the requirements for creating an incremental refresh materialized view.
Create a nested materialized view
In the ETL process, nested materialized views are used to combine multiple materialized views into a more complex data processing workflow. Here, we will create two materialized views: one that links sales information with item information, and another that performs additional aggregations based on the first materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a materialized view named
mv1_sales_items_jointhat links sales information with item information.CREATE MATERIALIZED VIEW mv1_sales_items_join REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;Create a materialized view log on the materialized view
mv1_sales_items_join.CREATE MATERIALIZED VIEW LOG ON mv1_sales_items_join WITH PRIMARY KEY (region,quantity,price_per_item) INCLUDING NEW VALUES;Create a materialized view named
mv2_join_sumbased on the materialized viewmv1_sales_items_join.CREATE MATERIALIZED VIEW mv2_join_sum REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT region, sum(quantity * price_per_item) AS sum_price, count(*) as c, count(quantity * price_per_item) as count FROM mv1_sales_items_join GROUP BY region;
Create a columnstore materialized view
If you want to improve query efficiency in scenarios with large amounts of data, you can create a columnstore materialized view. With columnar storage, queries only read the required columns, significantly reducing disk I/O.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a columnstore materialized view named
wide_sales_column.CREATE MATERIALIZED VIEW wide_sales_column WITH COLUMN GROUP(each column) REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
In this example, we specify WITH COLUMN GROUP(each column) to use the columnar storage format for the materialized view. This is particularly useful in OLAP scenarios, especially when dealing with large datasets and wide tables.
Create a materialized view for query rewriting
In this example, the ENABLE QUERY REWRITE clause is used to create a fully refreshed materialized view that can be rewritten.
CREATE MATERIALIZED VIEW mv_sales_summary_select
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
ENABLE QUERY REWRITE
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, the ENABLE QUERY REWRITE and ENABLE ON QUERY COMPUTATION clauses are used to create a real-time materialized view that can be rewritten.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view
mv_sales_summary_com_selectthat can be rewritten.CREATE MATERIALIZED VIEW mv_sales_summary_com_select REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION ENABLE QUERY REWRITE AS SELECT item_id, region, SUM(item_count) AS total_sales, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Create a materialized view index
To further optimize query performance, you can create an index on the materialized view. A materialized view index helps speed up queries on the materialized view, especially when dealing with large amounts of data.
(Optional) Create a materialized view
mv_sales_summarythat summarizes sales by product and region.If you have already created the materialized view
mv_sales_summaryas shown in the Create a fully refreshed materialized view example, skip this step.CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id)) REFRESH COMPLETE START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count FROM sales GROUP BY item_id, region;Create an index
idx_mv_sales_summaryon theregioncolumn of the materialized viewmv_sales_summary.CREATE INDEX idx_mv_sales_summary ON mv_sales_summary (region);This index can speed up queries on the
mv_sales_summarymaterialized view, especially those based on theregioncolumn.
Dictionary and O&M views
The following system views are commonly used when you troubleshoot materialized views, Mlogs, and refresh tasks. The following example uses the MySQL mode. For more information about the views and their fields, see the reference documentation of the current version.
| DBA view | ALL view | USER view | Description |
|---|---|---|---|
DBA_MVIEW_LOGS |
ALL_MVIEW_LOGS |
USER_MVIEW_LOGS |
Describes information about materialized view logs. |
DBA_MVIEWS |
ALL_MVIEWS |
USER_MVIEWS |
Describes information about materialized views. |
DBA_MVREF_STATS_SYS_DEFAULTS |
USER_MVREF_STATS_SYS_DEFAULTS |
Describes the system-level default values of the basic timing statistics of the refresh operations of all materialized views. | |
DBA_MVREF_STATS_PARAMS |
USER_MVREF_STATS_PARAMS |
Describes the refresh statistics parameters of each materialized view. | |
DBA_MVREF_RUN_STATS |
USER_MVREF_RUN_STATS |
Describes the running information of each refresh operation of a materialized view. Each refresh operation is identified by the REFRESH_ID column. |
|
DBA_MVREF_STATS |
USER_MVREF_STATS |
Describes the basic timing statistics of the refresh operations of all materialized views. | |
DBA_MVREF_CHANGE_STATS |
USER_MVREF_CHANGE_STATS |
Describes the change data loaded to the base tables of the refresh operations of all materialized views. | |
DBA_MVREF_STMT_STATS |
USER_MVREF_STMT_STATS |
Describes the information about the refresh statements. |
References
- For more information about materialized views, see Overview of materialized views (MySQL mode) and Overview of materialized views (Oracle mode).
- For more information about refreshing materialized views, see Refresh a materialized view (MySQL mode) and Refresh a materialized view (Oracle mode).
- For more information about materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
- For more information about deleting materialized views, see Delete a materialized view (MySQL mode) and Delete a materialized view (Oracle mode).
