A materialized view (MV) is a database object that stores the results of a query. Unlike a regular view, an MV stores the query results, allowing you to avoid repeatedly executing resource-intensive operations such as aggregation and joins. This approach accelerates queries by reusing precomputed results. In data warehouses and decision support systems, this can significantly reduce computation time and improve query efficiency.
Why use materialized views in AP / real-time data warehouses
- Reduce redundant computation: Materialized views help reduce redundant JOIN / GROUP BY operations in multidimensional analysis, wide table joins, and periodic reports.
- Control freshness and cost: By using strategies such as full, incremental, or real-time (on-demand), you can balance latency and resource usage (see Classic scenarios of materialized views for more information).
- Combine with other AP capabilities: For example, materialized views can be used with columnar storage, query rewriting, and external tables for lakehouse collaboration (see Classic scenarios of materialized views for more information).
Use cases
- Data aggregation: Aggregate transaction and behavioral details by day, week, or month.
- Fixed reports / metric layer: Precompute metric tables for reports or API queries.
- Optimize repeated queries: Materialize resource-intensive SQL results to avoid repeated execution.
- Data locality and replication strategy (based on business planning): Redundantly store aggregated results in compliance with regulations, using routing and replication strategies.
- Pre-aggregation for monitoring: Pre-aggregate monitoring details by time granularity.
Core capabilities
Query acceleration
Single-table aggregation: For scenarios where a large amount of data exists in a single table and frequent grouping and aggregation queries are performed, you can create an incremental refresh (refresh fast) materialized view for single-table grouping and aggregation. If you need to merge the latest changes during queries, you can use real-time materialized views (ENABLE ON QUERY COMPUTATION).
Multi-table join: For scenarios where a large amount of data exists in multiple tables and frequent join queries are performed, you can create an incremental refresh (refresh fast) materialized view for multi-table joins. After the data is properly maintained, you can use query rewriting to automatically redirect queries to the materialized view results.
Materialized views and real-time data warehouses
In the AP/Real-time Data Warehouse scenario of OceanBase Database, materialized views are commonly used to convert repeated calculations on detail tables or wide tables into refreshable physical result sets. These result sets are organized by time (batch or near real-time), space (dimensions and granularity), and subject (business domain) to support multi-dimensional aggregation and derived metrics. When combined with features such as columnar storage, query rewriting, and incremental refresh, materialized views help reduce the scanning and computational pressure on the base tables during online analytical processing.
Position of materialized views in a real-time data warehouse
A typical real-time or near real-time data warehouse link can be summarized as follows: Access (CDC / Files / External Tables) → DWD (Detailed Data Warehouse) → DWS / ADS (Aggregated and Topic Layers) → Query and Report. Materialized views are suitable for handling repetitive aggregation, joins, and denormalization in this process:
- Multidimensional Aggregation: Perform
GROUP BYoperations on the fact table based on dimensions such as region, category, and time window, and then join the dimensions. - Data Layering: Combine full, incremental, and real-time materialized views with scheduled or manual refreshes to balance latency, resource usage, and result consistency.
- Query Acceleration: Rewrite queries to use materialized view results without changing the SQL queries on the detailed table, reducing the need for business modifications.
For more information, see Overview of materialized views (MySQL) and Overview of materialized views (Oracle).
Refreshing capabilities: Aligning materialized views with base tables
Materialized views need to be aligned with base table data using a refresh strategy.
| Capability | Description | Practical Entry |
|---|---|---|
| Full refresh | Recalculate and overwrite results based on the definition each time, suitable for scenarios with complex definitions, acceptable long windows, or the need for complete alignment. | Refresh a materialized view (MySQL mode)、Refresh a materialized view (Oracle mode) |
| Incremental refresh (Fast refresh) | Processes changes based on Mlog, suitable for large base tables with relatively localized changes; has limitations on SQL form. | Same as above; Mlog details see Materialized views log (MySQL mode)、Materialized views log (Oracle mode) |
| Automatic / Manual refresh | Scheduling can be configured at creation; manual refresh is also available as needed (e.g., DBMS_MVIEW.REFRESH). |
Create a materialized view (MySQL mode)、Create a materialized view (Oracle mode) |
| Automatic Mlog management | Reduces manual maintenance costs for Mlog on supported versions. | Automatic management of materialized views log (MySQL mode)、Automatic management of materialized views log (Oracle mode) |
| Refresh resource isolation | Large-scale refreshes and Mlog maintenance can be isolated from online loads. | Materialized views resource isolation (MySQL mode)、Materialized views resource isolation (Oracle mode) |
Concurrency can be optimized using variables like mview_refresh_dop.
Query rewriting
You can match queries to materialized views without modifying business SQL statements.
Query rewriting refers to the optimizer matching a query on the base table (or equivalent semantics) to an existing materialized view and using the precomputed results to answer the query.
- You can enable
ENABLE QUERY REWRITEwhen you create a materialized view (specific syntax and limitations are specified in the documentation). - Whether query rewriting is enabled depends on session or global variables, such as query_rewrite_enabled and query_rewrite_integrity.
For more information, see Query rewriting for materialized views (MySQL mode) and Query rewriting for materialized views (Oracle mode).
Materialized view capabilities and selection
To avoid repetition with the previous sections on "real-time data warehouse location, refresh capabilities, and query rewriting capabilities," this section retains only the key decision-making points. Examples and SQL are provided in the following sections.
Refresh strategies (complete, fast, automatic, and manual)
- Complete refresh: In a complete refresh, the system re-executes the query statement of the materialized view and overwrites the original view result data with the new query results. This strategy is suitable for scenarios with low latency requirements, low base table data update frequency, complex query statements, or small data volumes.
- Fast refresh: Also known as fast refresh, this strategy requires the use of materialized view logs (Mlogs) to capture and process changes in the underlying base tables. It supports single-table aggregation, multi-table joins, and multi-table join aggregations. However, it has specific requirements for aggregate functions and join types. This strategy is suitable for scenarios with large data volumes and frequent changes.
- Automatic refresh: When creating a materialized view, you can specify the refresh interval. The system automatically schedules the refresh task based on the configured refresh time rules.
- Manual refresh: If automatic refresh is not configured or the refresh interval is large, you can manually execute the refresh command to synchronize the materialized view data with the base table data.
For more information about the syntax and limitations, see Refresh a materialized view (MySQL mode) and Refresh a materialized view (Oracle mode).
Real-time materialized views (ON QUERY COMPUTATION)
Real-time materialized views (Real-Time Materialized Views) allow you to retrieve real-time data by querying the materialized view. The query results are the same as those obtained by directly querying the base table, and the precomputed results of the materialized view can accelerate the query. It uses the Mlog mechanism to capture and process changes in the underlying base tables, ensuring that the data in the materialized view reflects the latest state in a timely manner. Since real-time materialized views rely on Mlogs, the query statements for real-time materialized views must meet the requirements for fast refresh, i.e., only queries that meet the fast refresh requirements can be defined as real-time materialized views.
For more information about how to create real-time materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Nested materialized views
A nested materialized view is a materialized view that is referenced by another materialized view. This approach is particularly useful in ETL (Extract, Transform, Load) processes. During the ETL workflow, nested materialized views can aggregate and transform data from different stages and store the results as independent views. This avoids repeated calculations and improves the overall efficiency of the ETL process. However, nested materialized views do not support automatic cascading refreshes. Before using nested materialized views, you must understand the refresh considerations for nested materialized views to ensure that the query results obtained from the upper-level materialized view meet your expectations.
For more information about how to create nested materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Query rewrite (ENABLE QUERY REWRITE)
If you want to use a materialized view to accelerate queries without modifying the original query statements, you can leverage the query rewrite capability of the materialized view. The system automatically matches the query statement with the definition of the materialized view. If a matching materialized view is found, the system automatically rewrites the query to use the materialized view. This approach significantly improves query performance and efficiency without changing the business logic.
For more information about materialized view query rewrite, see Materialized view query rewrite in MySQL mode and Materialized view query rewrite in Oracle mode.
Query acceleration optimization methods
| Optimization dimension | Recommended practice | Applicable scenarios | Value |
|---|---|---|---|
| Selecting the storage format for materialized views based on query patterns | Use columnar storage for wide table analysis and row-based storage for row-level access. | Scenarios where you need to balance scanning efficiency with row access efficiency. | Reduces unnecessary column reads, lowers I/O, and improves query efficiency. |
| Storage format for materialized view data | Choose row-based or columnar storage based on your specific use case. Row-based storage is suitable for scenarios where results are aggregated and accessed as whole rows. Columnar storage is ideal for wide table analysis and aggregation. | Report aggregation, wide table analysis, and fixed metric queries. | Aligns with access patterns, reducing scan overhead. |
| Using primary key materialized views | Define a primary key for the materialized view to ensure result uniqueness. | Scenarios requiring stable location, updates, or deduplication of materialized results. | Enhances lookup and update efficiency, reduces duplicate data risk. |
| Creating indexes on materialized views | Create indexes on filter columns, join columns, and sort columns. | High-frequency filter, join, and sort queries. | Reduces full table scans and accelerates hit paths. |
| Adjusting refresh strategy based on SLA | Balance latency, freshness, and resource usage, and choose between full/incremental and automatic/manual refresh as needed. | Different real-time requirement levels for AP and reporting scenarios. | Avoids blind pursuit of real-time, controls resource costs, and maintains available performance. |
Materialized views are a key method for query acceleration. For large-scale data scenarios, you can continuously optimize query performance by combining "storage format + index + refresh strategy."
Create a materialized view
Notice
The following examples are run in MySQL mode.
Create a source table
First, we need to create a source table to store the original data. In this example, we create a sales table and an items table to store sales data and item information.
Create the
salestable.CREATE TABLE sales ( order_id INT PRIMARY KEY, user_id INT, item_id INT, item_count INT, region VARCHAR(100) );Create the
itemstable.CREATE TABLE items ( order_id INT, product_id INT, quantity INT, price_per_item DECIMAL(10, 2) NOT NULL, pic_url VARCHAR(1000), PRIMARY KEY (order_id, product_id) );
Create a materialized view for full refresh
We will create a materialized view based on the sales table. This materialized view will aggregate sales by product and region to accelerate queries.
Create a materialized view mv_sales_summary to aggregate sales by product and region.
CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id))
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify the following properties when creating the materialized view:
- PRIMARY KEY: Specifies the primary key for the materialized view.
REFRESH COMPLETE: Indicates that a complete refresh is used.START WITH sysdate() NEXT sysdate() + interval 1 hour: Specifies that the materialized view is refreshed periodically every hour.
Create a materialized view for incremental refresh
For scenarios where data changes frequently, incremental refresh can improve refresh efficiency. Before creating a materialized view for incremental refresh, we need to create a materialized view log (Mlog) based on the base table of the materialized view. Then, we can successfully create a materialized view for incremental refresh.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management of mlogs is enabled, when creating a materialized view for incremental refresh, you do not need to create an mlog for the base table. OceanBase Database automatically creates the corresponding mlog or updates the existing mlog table definition to include the columns required by the newly created materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
For more information about materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
The following examples show three query scenarios supported by materialized views for incremental refresh:
Single-table aggregation
Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;Create an incremental refresh materialized view
mv_sales_summary_fastfor single-table aggregation.CREATE MATERIALIZED VIEW mv_sales_summary_fast REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Multi-table join
Create a materialized view log on the
salestable.(Optional) Delete the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;Create an incremental refresh materialized view
mv_sales_items_joinfor multi-table join.CREATE MATERIALIZED VIEW mv_sales_items_join PARTITION BY HASH(order_id) PARTITIONS 10 REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url AS pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
Multi-table join aggregation
Create a materialized view log on the
salestable.(Optional) Delete the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Delete the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item) INCLUDING NEW VALUES;
Create an incremental refresh materialized view
mv_sales_item_join_groupfor multi-table join aggregation.CREATE MATERIALIZED VIEW mv_sales_item_join_group REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.item_id AS item_id, s.region AS region, SUM(s.item_count * i.price_per_item) AS sum_price, count(*) AS c, count(s.item_count * i.price_per_item) AS count FROM sales s JOIN items i ON s.order_id = i.order_id GROUP BY item_id, region;
Create a real-time materialized view
A real-time materialized view ensures that the query results of the materialized view are synchronized with the base table when the data changes. Since a real-time materialized view relies on a materialized view log, you must create a materialized view log before you create a materialized view that supports incremental refresh.
Note
OceanBase Database supports automatic management of materialized view logs. If you enable automatic management of materialized view logs, you do not need to create a materialized view log for the base table before you create a real-time materialized view. OceanBase Database automatically creates the corresponding materialized view log or updates the existing materialized view log table definition to include the columns required by the newly created materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view named
mv_sales_summary_com.CREATE MATERIALIZED VIEW mv_sales_summary_com REFRESH FORCE START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In this example, we enable ENABLE ON QUERY COMPUTATION, which allows the materialized view to be updated in real time during queries. This ensures that the latest data is retrieved, and the query statement meets the requirements for creating an incrementally refreshed materialized view.
Create a nested materialized view
In an ETL process, a nested materialized view is used to combine multiple materialized views into a more complex data processing workflow. Here, we will create two materialized views: one that links sales information with item information, and another that performs additional aggregation based on the first materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a materialized view named
mv1_sales_items_jointhat links sales information with item information.CREATE MATERIALIZED VIEW mv1_sales_items_join REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;Create a materialized view log on the materialized view
mv1_sales_items_join.CREATE MATERIALIZED VIEW LOG ON mv1_sales_items_join WITH PRIMARY KEY (region,quantity,price_per_item) INCLUDING NEW VALUES;Create a materialized view named
mv2_join_sumbased on the materialized viewmv1_sales_items_join.CREATE MATERIALIZED VIEW mv2_join_sum REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT region, sum(quantity * price_per_item) AS sum_price, count(*) as c, count(quantity * price_per_item) as count FROM mv1_sales_items_join GROUP BY region;
Create a columnstore materialized view
Assume that you want to improve query efficiency in scenarios involving large amounts of data. In this case, you can create a columnstore materialized view. With columnstore, queries only read the required columns, significantly reducing disk I/O.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a columnstore materialized view named
wide_sales_column.CREATE MATERIALIZED VIEW wide_sales_column WITH COLUMN GROUP(each column) REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
In this example, we specify WITH COLUMN GROUP(each column) to use the columnar storage format for the materialized view. This is particularly useful in OLAP scenarios, especially when dealing with large volumes of data and wide tables.
Create a materialized view for query rewriting
In this example, the ENABLE QUERY REWRITE clause is used to create a full refresh materialized view that supports query rewriting.
CREATE MATERIALIZED VIEW mv_sales_summary_select
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
ENABLE QUERY REWRITE
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, the ENABLE QUERY REWRITE and ENABLE ON QUERY COMPUTATION clauses are used to create a real-time materialized view that supports query rewriting.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view
mv_sales_summary_com_selectthat supports query rewriting.CREATE MATERIALIZED VIEW mv_sales_summary_com_select REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION ENABLE QUERY REWRITE AS SELECT item_id, region, SUM(item_count) AS total_sales, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Create a materialized view index
To further optimize query performance, you can create an index on the materialized view. A materialized view index can accelerate queries on the materialized view, especially when dealing with large amounts of data.
(Optional) Create a materialized view
mv_sales_summarythat aggregates sales by product and region.If you have already created the materialized view
mv_sales_summarybased on the example in Create a full refresh materialized view, skip this step.CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id)) REFRESH COMPLETE START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count FROM sales GROUP BY item_id, region;Create an index
idx_mv_sales_summaryon theregioncolumn of the materialized viewmv_sales_summary.CREATE INDEX idx_mv_sales_summary ON mv_sales_summary (region);This index can accelerate queries on the
mv_sales_summarymaterialized view, especially those based on theregioncolumn.
Dictionary and O&M views
When you troubleshoot materialized views, Mlogs, and refresh tasks, you can use the following system views (the example is based on MySQL mode; for more information about the views and their fields, see the reference documentation of the current version).
| DBA View | ALL View | USER View | Description |
|---|---|---|---|
DBA_MVIEW_LOGS |
ALL_MVIEW_LOGS |
USER_MVIEW_LOGS |
Describes the information about materialized view logs. |
DBA_MVIEWS |
ALL_MVIEWS |
USER_MVIEWS |
Describes the information about materialized views. |
DBA_MVREF_STATS_SYS_DEFAULTS |
USER_MVREF_STATS_SYS_DEFAULTS |
Describes the system scope default values of the basic statistics attributes of the refresh history of all materialized views. | |
DBA_MVREF_STATS_PARAMS |
USER_MVREF_STATS_PARAMS |
Describes the refresh statistics attributes associated with each materialized view. | |
DBA_MVREF_RUN_STATS |
USER_MVREF_RUN_STATS |
Describes the running information of each refresh of a materialized view. Each refresh is identified by the REFRESH_ID column. |
|
DBA_MVREF_STATS |
USER_MVREF_STATS |
Describes the basic timing statistics of the refresh of a materialized view. | |
DBA_MVREF_CHANGE_STATS |
USER_MVREF_CHANGE_STATS |
Describes the data load information of the base tables associated with the refresh runs of all materialized views. | |
DBA_MVREF_STMT_STATS |
USER_MVREF_STMT_STATS |
Describes the information about the refresh statements. |
References
- For more information about materialized views, see Materialized views overview (MySQL mode) and Materialized views overview (Oracle mode).
- For more information about refreshing materialized views, see Refresh materialized views (MySQL mode) and Refresh materialized views (Oracle mode).
- For more information about materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
- For more information about deleting materialized views, see Delete materialized views (MySQL mode) and Delete materialized views (Oracle mode).
