A materialized view (MV) is a database object that stores the results of a query. Unlike a regular view, which dynamically generates data when queried, an MV stores the results of the query, allowing for faster access to the data. By precomputing and storing the results of expensive operations such as aggregations and joins, MMs eliminate the need to repeatedly execute these operations, significantly accelerating query performance. This is especially beneficial in data warehouses and decision support systems, where it can drastically reduce computation time and improve query efficiency.
Why use materialized views in AP and real-time data warehouses
- Reduce redundant computations: Materialized views help reduce redundant JOIN and GROUP BY operations in multidimensional analysis, wide table joins, and periodic reports.
- Control freshness and cost: By using strategies such as full, incremental, or real-time (on-demand), you can balance the tradeoff between latency and resource usage (see Classic scenarios of materialized views for more information).
- Combine with other AP capabilities: For example, you can use materialized views in columnar storage, query rewriting, and external tables for lakehouse collaboration (see Classic scenarios of materialized views for more information).
Use cases
- Data aggregation: Aggregate transaction details, behavioral data, etc., by day, week, or month.
- Fixed reports / metrics layer: Precompute metrics tables for reports or API queries.
- Replay optimization: Materialize resource-intensive SQL results to avoid repeated execution.
- Data locality and replication strategy (planned by business): Redundantly store aggregated results in compliance with regulations, combined with routing and replication strategies.
- Pre-aggregation for monitoring: Pre-aggregate monitoring details by time granularity.
Core capabilities
Query acceleration
Single-table aggregation: For query scenarios involving a large amount of data in a single table and frequent group aggregations, you can create an incremental refresh (refresh fast) materialized view for single-table grouping aggregation. If you want to merge the latest changes during queries, you can use real-time materialized views (ENABLE ON QUERY COMPUTATION).
Multi-table joins: For query scenarios involving a large amount of data in multiple tables and frequent join queries, you can create an incremental refresh (refresh fast) materialized view for multi-table joins. After the data in the join materialized view is maintained, you can use query rewriting to automatically redirect queries to the materialized view results.
Materialized views and real-time data warehouses
In the AP / Real-time Data Warehouse scenario of OceanBase Database, materialized views are often used to convert repeated calculations on detailed or wide tables into refreshable physical result sets. These result sets are organized by time (batch or near-real time), space (dimensions and granularity), and theme (business domain) to support multi-dimensional aggregation and derived metrics. When combined with features like columnar storage, query rewriting, and incremental refresh, materialized views can reduce the scanning and computational load on the base tables during online analytical processing.
Location of materialized views in real-time data warehouses
A typical real-time or near-real-time data warehouse link can be summarized as follows: Access (CDC/Files/External Tables) → DWD (Detailed Data Warehouse) → DWS/ADS (Aggregated and Topic-based) → Query and Reporting. Materialized views are suitable for handling repetitive aggregation, joins, and denormalization in this process:
- Multidimensional Aggregation: Perform
GROUP BYoperations on fact tables based on dimensions such as region, category, and time windows, and join with dimension tables. - Data Layering: Combine full, incremental, or real-time materialized views with scheduled or manual refreshes to balance latency, resource usage, and result consistency.
- Query Acceleration: Rewrite queries to use materialized view results without changing the original SQL queries on the detailed tables, reducing the need for business modifications.
For more information, see Overview of materialized views (MySQL) and Overview of materialized views (Oracle).
Refresh capabilities: Align materialized results with base tables
Materialized views need to be aligned with base table data through refresh strategies.
| Capability | Description | Practical Entry |
|---|---|---|
| Full refresh | Recalculate and overwrite results each time based on the definition. Suitable for complex definitions, scenarios that can tolerate long windows, or scenarios requiring complete alignment. | Refresh a materialized view (MySQL mode)、Refresh a materialized view (Oracle mode) |
| Incremental refresh (fast refresh) | Process changes based on Mlog, suitable for large base tables with relatively localized changes; has limitations on SQL forms. | Same as above; Mlog details see Materialized view log (MySQL mode)、Materialized view log (Oracle mode) |
| Automatic / manual refresh | Scheduling can be configured at creation; manual refresh is also supported as needed (e.g., DBMS_MVIEW.REFRESH). |
Create a materialized view (MySQL mode)、Create a materialized view (Oracle mode) |
| Automatic Mlog management | Reduce manual maintenance costs of Mlog on supported versions. | Automatic management of materialized view logs (MySQL mode)、Automatic management of materialized view logs (Oracle mode) |
| Refresh resource isolation | Large-scale refreshes and Mlog maintenance can be isolated from online loads. | Resource isolation for materialized views (MySQL mode)、Resource isolation for materialized views (Oracle mode) |
Parallelism can be optimized using variables such as mview_refresh_dop.
Query Rewriting: Hit the materialized result with minimal changes to the business SQL
Query rewriting refers to the optimizer matching a query on the base table (or equivalent semantics) to an existing materialized view and using the precomputed results to answer the query.
- To enable
ENABLE QUERY REWRITEwhen creating a materialized view (specific syntax and limitations are subject to the manual). - Whether to rewrite is influenced by session/global variables, such as query_rewrite_enabled and query_rewrite_integrity.
For more information, see: Materialized view query rewriting (MySQL mode) and Materialized view query rewriting (Oracle mode).
Materialized View Capabilities and Selection
To avoid repetition with the previous sections on "real-time data warehouse location, refresh capabilities, and query rewriting capabilities," this section retains only the key decision-making points. Examples and SQL are provided in the following chapters.
Refresh strategy (complete / fast / automatic / manual)
- Complete refresh: A complete refresh re-executes the query statement and overwrites the original view result data with the computed results. This strategy is suitable for scenarios with low latency requirements, low base table update frequency, complex query statements, or small data volumes.
- Fast refresh: Also known as fast refresh, it relies on Materialized View Logs (Mlogs) and has specific requirements for query statements. Currently, it supports single-table aggregations, multi-table joins, and multi-table join aggregations. It is suitable for scenarios with large data volumes and frequent changes.
- Automatic refresh: When creating a materialized view, you can specify the refresh interval. The system will automatically schedule the refresh task based on the configured refresh time rules.
- Manual refresh: If automatic refresh is not configured or the interval is large, you can manually execute the refresh command to synchronize the materialized view data with the base table data.
For more information about the syntax and limitations, see Refresh a materialized view (MySQL mode) and Refresh a materialized view (Oracle mode).
Real-time materialized views (ON QUERY COMPUTATION)
Real-time materialized views allow you to obtain real-time data by querying the materialized view. The query results are the same as those obtained by directly querying the base table. Additionally, you can leverage the precomputed results of the materialized view to accelerate queries. It captures and processes changes in the underlying base tables using the Mlog mechanism, ensuring that the data in the materialized view reflects the latest state in a timely manner. Since it relies on Mlog, the query statements for real-time materialized views must meet the requirements for fast refresh materialized views, meaning only those that satisfy the fast refresh requirements can be defined as real-time materialized views.
For more information about creating real-time materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Nested materialized views
A nested materialized view is one that is referenced by another materialized view. This approach is particularly useful in ETL (Extract, Transform, Load) processes. In ETL workflows, nested materialized views can aggregate and store transformation results from different stages as independent views, avoiding repeated calculations and improving the overall efficiency of the ETL process. However, nested materialized views do not support automatic cascading refreshes. Before using nested materialized views, it is important to understand their refresh considerations to ensure that the data obtained from querying the upper-level materialized view meets expectations.
For more information about creating nested materialized views, see Create a materialized view in MySQL mode and Create a materialized view in Oracle mode.
Query rewrite (ENABLE QUERY REWRITE)
If you want to use a materialized view to accelerate queries without modifying the original query statement, you can leverage the query rewrite capability of the materialized view. The system automatically matches the query statement with the materialized view's definition. If a matching materialized view is found, it automatically rewrites the query to use the materialized view. This way, you can significantly improve query performance and efficiency without changing the business logic.
For more information about materialized view query rewriting, see Materialized view query rewrite in MySQL mode and Materialized view query rewrite in Oracle mode.
Query acceleration optimization methods
| Optimization dimension | Recommended practice | Applicable scenario | Value |
|---|---|---|---|
| Selecting the storage format for materialized views based on query type | Use columnar storage for wide table analysis and row-based storage for row-level access. | Scenarios where a balance between scan efficiency and row access efficiency is needed. | Reduces unnecessary column reads, lowers I/O, and improves query efficiency. |
| Storage format for materialized view data | Choose between row-based or columnar storage based on the actual application scenario. Row-based storage is suitable for scenarios where results are already aggregated and accessed as whole rows. Columnar storage is suitable for wide table analysis and aggregation. | Reporting, wide table analysis, and fixed metric queries. | Better aligns with access patterns, reducing scan overhead. |
| Using primary key materialized views | Define a primary key for the materialized view to ensure result uniqueness. | Scenarios requiring stable location, updates, or deduplication of materialized results. | Improves lookup and update efficiency, reducing the risk of duplicate data. |
| Creating indexes on materialized views | Create indexes based on filter columns, join columns, and sort columns. | High-frequency filter, join, and sort queries. | Reduces full table scans and accelerates hit paths. |
| Adjusting refresh strategy based on SLA | Balance between latency, freshness, and resources, and choose between full/incremental and automatic/manual refresh based on needs. | Different real-time scenarios for applications and reports. | Avoids blind pursuit of real-time, controls resource costs, and maintains available performance. |
Materialized views are an important method for query acceleration. For large-scale data scenarios, the combination of "storage format + index + refresh strategy" can continuously optimize query performance.
Create a materialized view
Notice
The following examples are executed in MySQL mode.
Create the source table
First, we need to create the source table to store the original data. In this example, we create a sales table and an items table, which store the sales data and item information, respectively.
Create the
salestable.CREATE TABLE sales ( order_id INT PRIMARY KEY, user_id INT, item_id INT, item_count INT, region VARCHAR(100) );Create the
itemstable.CREATE TABLE items ( order_id INT, product_id INT, quantity INT, price_per_item DECIMAL(10, 2) NOT NULL, pic_url VARCHAR(1000), PRIMARY KEY (order_id, product_id) );
Create a materialized view for full refresh
We will create a materialized view based on the sales table. This materialized view will summarize the sales volume by product and region to accelerate queries.
Create a materialized view mv_sales_summary to summarize the sales volume by product and region.
CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id))
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, we specify the following properties for the materialized view:
- PRIMARY KEY: specifies the primary key for the materialized view.
REFRESH COMPLETE: specifies the full refresh method.START WITH sysdate() NEXT sysdate() + interval 1 hour: specifies the refresh schedule, which is to refresh the materialized view every hour.
Create a materialized view for incremental refresh
For scenarios where data changes frequently, incremental refresh can improve the refresh efficiency. Before creating a materialized view for incremental refresh, we need to create a materialized view log (mlog) based on the base table of the materialized view. Then, we can successfully create a materialized view for incremental refresh.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management of mlogs is enabled, you do not need to create an mlog for the base table before creating a materialized view for incremental refresh. OceanBase Database automatically creates the corresponding mlog or updates the existing mlog table definition to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
For more information about materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
The following examples show three query scenarios supported by a materialized view for incremental refresh:
Single-table aggregation
Create an mlog on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;Create a materialized view
mv_sales_summary_fastfor incremental refresh to aggregate data from thesalestable.CREATE MATERIALIZED VIEW mv_sales_summary_fast REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Multi-table join
Create an mlog on the
salestable.(Optional) Drop the mlog on the
salestable.If you have not created an mlog on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create an mlog on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create an mlog on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item, pic_url) INCLUDING NEW VALUES;Create a materialized view
mv_sales_items_joinfor incremental refresh to join data from thesalesanditemstables.CREATE MATERIALIZED VIEW mv_sales_items_join PARTITION BY HASH(order_id) PARTITIONS 10 REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url AS pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
Multi-table join and aggregation
Create an mlog on the
salestable.(Optional) Drop the mlog on the
salestable.If you have not created an mlog on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create an mlog on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create an mlog on the
itemstable.(Optional) Drop the mlog on the
itemstable.If you have not created an mlog on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create an mlog on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item) INCLUDING NEW VALUES;
Create a materialized view
mv_sales_item_join_groupfor incremental refresh to join and aggregate data from thesalesanditemstables.CREATE MATERIALIZED VIEW mv_sales_item_join_group REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.item_id AS item_id, s.region AS region, SUM(s.item_count * i.price_per_item) AS sum_price, count(*) AS c, count(s.item_count * i.price_per_item) AS count FROM sales s JOIN items i ON s.order_id = i.order_id GROUP BY item_id, region;
Create a real-time materialized view
A real-time materialized view ensures that the query results of the materialized view are synchronized with the base table when the data changes. Since a real-time materialized view depends on a materialized view log, you need to create a materialized view log before you create a real-time materialized view.
Note
OceanBase Database supports automatic management of materialized view logs. If automatic management of materialized view logs is enabled, you do not need to create a materialized view log for the base table before you create a real-time materialized view. OceanBase Database automatically creates a materialized view log for the base table or updates the definition of the existing materialized view log to include the columns required by the new materialized view. For more information, see Automatic management of materialized view logs (MySQL mode) and Automatic management of materialized view logs (Oracle mode).
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view named
mv_sales_summary_com.CREATE MATERIALIZED VIEW mv_sales_summary_com REFRESH FORCE START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION AS SELECT item_id, region, SUM(item_count) AS total_count, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
In this example, we enable ENABLE ON QUERY COMPUTATION to ensure that the materialized view is updated in real time during queries. This guarantees that you can obtain the latest data, and the query statement of the materialized view meets the requirements for creating an incrementally refreshed materialized view.
Create a nested materialized view
In the ETL process, a nested materialized view is used to combine multiple materialized views into a more complex data processing workflow. In this example, we will create two materialized views: one that joins sales and item information, and another that performs additional aggregations based on the first materialized view.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a materialized view named
mv1_sales_items_jointhat joins sales and item information.CREATE MATERIALIZED VIEW mv1_sales_items_join REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;Create a materialized view log on the materialized view
mv1_sales_items_join.CREATE MATERIALIZED VIEW LOG ON mv1_sales_items_join WITH PRIMARY KEY (region,quantity,price_per_item) INCLUDING NEW VALUES;Create a materialized view named
mv2_join_sumbased on the materialized viewmv1_sales_items_join.CREATE MATERIALIZED VIEW mv2_join_sum REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT region, sum(quantity * price_per_item) AS sum_price, count(*) as c, count(quantity * price_per_item) as count FROM mv1_sales_items_join GROUP BY region;
Create a columnstore materialized view
Assume that you want to improve query efficiency in scenarios involving large amounts of data. In this case, you can create a columnstore materialized view. By using the columnar storage format, only the required columns are read during queries, significantly reducing disk I/O.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (user_id, item_id, item_count, region) INCLUDING NEW VALUES;
Create a materialized view log on the
itemstable.(Optional) Drop the materialized view log on the
itemstable.If you have not created a materialized view log on the
itemstable, skip this step.DROP MATERIALIZED VIEW LOG ON items;Create a materialized view log on the
itemstable.CREATE MATERIALIZED VIEW LOG ON items WITH PRIMARY KEY (price_per_item,pic_url) INCLUDING NEW VALUES;
Create a columnstore materialized view named
wide_sales_column.CREATE MATERIALIZED VIEW wide_sales_column WITH COLUMN GROUP(each column) REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT s.order_id AS order_id, s.user_id AS customer_id, s.item_id AS item_id, s.item_count AS quantity, s.region AS region, i.order_id AS i_id, i.product_id AS i_item_id, i.price_per_item AS price_per_item, i.pic_url FROM sales s JOIN items i ON s.order_id = i.order_id;
In this example, we specify WITH COLUMN GROUP(each column) to ensure that the materialized view uses the columnar storage format. This is particularly useful in OLAP scenarios, especially when dealing with large volumes of data and wide tables.
Create a materialized view with query rewrite enabled
In this example, the ENABLE QUERY REWRITE clause is specified to create a fully refreshed materialized view that supports query rewrite.
CREATE MATERIALIZED VIEW mv_sales_summary_select
REFRESH COMPLETE
START WITH sysdate()
NEXT sysdate() + interval 1 hour
ENABLE QUERY REWRITE
AS SELECT item_id, region, SUM(item_count) AS total_count
FROM sales
GROUP BY item_id, region;
In this example, the ENABLE QUERY REWRITE and ENABLE ON QUERY COMPUTATION clauses are specified to create a real-time materialized view that supports query rewrite.
Create a materialized view log on the
salestable.(Optional) Drop the materialized view log on the
salestable.If you have not created a materialized view log on the
salestable, skip this step.DROP MATERIALIZED VIEW LOG ON sales;Create a materialized view log on the
salestable.CREATE MATERIALIZED VIEW LOG ON sales WITH PRIMARY KEY (item_id, item_count, region) INCLUDING NEW VALUES;
Create a real-time materialized view
mv_sales_summary_com_selectthat supports query rewrite.CREATE MATERIALIZED VIEW mv_sales_summary_com_select REFRESH FAST START WITH sysdate() NEXT sysdate() + interval 1 hour ENABLE ON QUERY COMPUTATION ENABLE QUERY REWRITE AS SELECT item_id, region, SUM(item_count) AS total_sales, count(*) as c, count(item_count) as count FROM sales GROUP BY item_id, region;
Create a materialized view index
To further optimize query performance, you can create an index on a materialized view. A materialized view index can accelerate queries on the materialized view, especially when dealing with large amounts of data.
(Optional) Create a materialized view
mv_sales_summarythat aggregates sales by product and region.If you have already created the materialized view
mv_sales_summarybased on the example in Create a fully refreshed materialized view, skip this step.CREATE MATERIALIZED VIEW mv_sales_summary(PRIMARY KEY(item_id)) REFRESH COMPLETE START WITH sysdate() NEXT sysdate() + interval 1 hour AS SELECT item_id, region, SUM(item_count) AS total_count FROM sales GROUP BY item_id, region;Create an index
idx_mv_sales_summaryon theregioncolumn of the materialized viewmv_sales_summary.CREATE INDEX idx_mv_sales_summary ON mv_sales_summary (region);This index can accelerate queries on the
mv_sales_summarymaterialized view, especially those based on theregioncolumn.
Dictionary and O&M views
The following system views are commonly used for troubleshooting materialized views, Mlogs, and refresh tasks. The following examples are based on the MySQL mode. For the complete list of views and their fields, see the reference documentation of the current version.
| DBA View | ALL View | USER View | Description |
|---|---|---|---|
DBA_MVIEW_LOGS |
ALL_MVIEW_LOGS |
USER_MVIEW_LOGS |
Describes the information of materialized view logs. |
DBA_MVIEWS |
ALL_MVIEWS |
USER_MVIEWS |
Describes the information of materialized views. |
DBA_MVREF_STATS_SYS_DEFAULTS |
USER_MVREF_STATS_SYS_DEFAULTS |
Describes the system-level default values of the basic statistics attributes of the refresh history of all materialized views. | |
DBA_MVREF_STATS_PARAMS |
USER_MVREF_STATS_PARAMS |
Describes the refresh statistics attributes associated with each materialized view. | |
DBA_MVREF_RUN_STATS |
USER_MVREF_RUN_STATS |
Describes the information of each refresh run of a materialized view. Each refresh run is identified by the REFRESH_ID column. |
|
DBA_MVREF_STATS |
USER_MVREF_STATS |
Describes the basic timing statistics of materialized view refreshes. | |
DBA_MVREF_CHANGE_STATS |
USER_MVREF_CHANGE_STATS |
Describes the changes in the base tables associated with the refresh runs of all materialized views. | |
DBA_MVREF_STMT_STATS |
USER_MVREF_STMT_STATS |
Describes the information associated with refresh statements. |
References
- For more information about materialized views, see Materialized views (MySQL mode) and Materialized views (Oracle mode).
- For more information about refreshing materialized views, see Refresh materialized views (MySQL mode) and Refresh materialized views (Oracle mode).
- For more information about materialized view logs, see Materialized view logs (MySQL mode) and Materialized view logs (Oracle mode).
- For more information about deleting materialized views, see Delete materialized views (MySQL mode) and Delete materialized views (Oracle mode).
