This topic provides step-by-step SQL practice in MySQL mode (e.g., orders for beginners, tbl* for small snippets, and external tables for full MVs). You can follow the sections in order. Pair this with the Materialized View Query Acceleration example using sales and items. For a complete walkthrough with consistent business tables, prioritize reading this document.
Note
For detailed syntax and limitations, see: Create a materialized view, Refresh a materialized view, Query rewriting of materialized views, Materialized view logs, and Automatic management of materialized view logs.
Build a real-time data warehouse with materialized views: scenario selection
To support quick decision-making during the construction of a real-time data warehouse, the selection can be made in the order of "data freshness requirements -> computational complexity -> data source type."
| Use case | Recommended capabilities | Typical features | Corresponding section |
|---|---|---|---|
| Periodic dashboard + Minute-level updates | REFRESH FAST + Mlog + Scheduled triggers |
Low incremental maintenance cost, ideal for stable aggregation | Quick start; Scenario 2: Real-time indicator processing pipeline |
| Columnar Scan | Columnar Materialized Views + Query Rewriting (Optional) | Most effective when scanning fewer than all columns | Scenario 1: Operational Analysis and Report Drilling |
| Low-Latency Online Query | ENABLE ON QUERY COMPUTATION |
Trades computation at query time for simplified updates | Scenario 2: Real-time metric processing pipeline |
| SQL Transparent Acceleration (requires minimal changes to business SQL) | ENABLE QUERY REWRITE + index/definition alignment |
Relies on rewrite rules, ideal for fixed analysis templates | Scenario 2: Real-time Metric Processing Pipeline |
| File upload to lake and periodic loading | View + REFRESH COMPLETE |
Data source comes from files, and then query after offline synchronization | Scenario 3: Lake-CBD warehouse upload and periodic loading |
Here are recommended best practice approaches in the recommended order:
- Define the metrics and refresh strategy: Decide whether it is "incremental" or "real-time calculation upon query".
- Refresh Mechanism: First, check whether FAST (Mlog) meets the criteria. If not, use COMPLETE.
- Finally, optimize the query side: Combine column storage, indexes, and query rewrite as needed to avoid introducing too many variables at once.
Notice
A real-time data warehouse does not mean all queries refresh in real time. If minutes of delay is acceptable for your business, prioritize using FAST + scheduling; if your business has a low-latency requirement but acceptable read cost, use ENABLE ON QUERY COMPUTATION.
Getting started: Full and incremental refresh for the order table
You can use the orders table in a business database as an example to demonstrate the default full-maintenance method of a materialized view, the incremental maintenance method of Mlog + FAST, and the basic usage of manual refresh and scheduled refresh. The results are consistent with those in Refresh a materialized view.
Create tables, generate test data, and create full materialized views
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT,
item_id INT,
item_count INT,
item_price INT,
region VARCHAR(100)
);
INSERT INTO orders VALUES
(1, 10001, 1, 20, 100, 'HZ'),
(2, 10002, 1, 10, 150, 'BJ'),
(3, 10001, 2, 50, 50, 'SH');
CREATE MATERIALIZED VIEW mv1
AS
SELECT region, SUM(item_count * item_price) AS sum_price
FROM orders
GROUP BY region;
SELECT * FROM mv1;
When you create a materialized view, a snapshots is created for it. The materialized view will usually be manually refreshed in full with the base table unless a refresh strategy is explicitly specified.
Full refresh manually (DBMS_MVIEW.REFRESH)
After data is written to the base table, the materialized view will not automatically refresh. You must manually trigger a refresh:
INSERT INTO orders VALUES
(4, 10002, 2, 10, 100, 'SH'),
(5, 10003, 1, 2, 20, 'HZ');
SELECT * FROM mv1;
The results include:
+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ | 2000 |
| BJ | 1500 |
| SH | 2500 |
+--------+-----------+
3 rows in set
You can see that the materialized view results have not changed automatically. To refresh the materialized view, execute the following SQL statement:
CALL DBMS_MVIEW.REFRESH('mv1', 'c', refresh_parallel => 2);
SELECT * FROM mv1;
Results:
+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ | 2040 |
| BJ | 1500 |
| SH | 3500 |
+--------+-----------+
3 rows in set
In c indicates complete refresh; other parameters such as refresh_parallel are determined by DBMS_MVIEW.REFRESH.
You can query the refresh running status of a major compaction job. Example:
SELECT MVIEWS, METHOD, START_TIME, END_TIME
FROM oceanbase.DBA_MVREF_RUN_STATS;
Refreshing via incremental refresh using Mlog + FAST
DROP MATERIALIZED VIEW mv1;
CREATE MATERIALIZED VIEW LOG ON orders
WITH PRIMARY KEY, ROWID, SEQUENCE (item_count, item_price, region) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW mv1(region, c, sum_price, cnt_price)
REFRESH FAST
ON DEMAND START WITH sysdate() NEXT sysdate() + INTERVAL 5 MINUTE
AS
SELECT region,
COUNT(*) AS c,
SUM(item_count * item_price) AS sum_price,
COUNT(item_count * item_price) AS cnt_price
FROM orders
GROUP BY region;
After inserting a new order, observe the Mlog first and then manually incrementally refresh it.
INSERT INTO orders VALUES (6, 10001, 3, 30, 70, 'HZ');
SELECT * FROM mlog$_orders;
CALL DBMS_MVIEW.REFRESH('mv1', 'f');
SELECT * FROM mlog$_orders;
SELECT region, sum_price FROM mv1;
You can also view the refresh background tasks in the DBA_SCHEDULER_JOBS view if you have specified START WITH ... NEXT ... when you created the materialized view log.
Delete object order
When dropping a materialized view, its materialized view log, and the base table, ensure to drop the MV first, then the Mlog, and finally the base table (directly DROP TABLE might fail if the base table is still bound to an Mlog).
DROP MATERIALIZED VIEW mv1;
DROP MATERIALIZED VIEW LOG ON orders;
DROP TABLE orders;
Scenario 1: Operations analysis and report queries
In operational analytics and report query scenarios, common issues include the presence of many columns in detail tables, large scanning ranges, and varying query conditions.
This scenario is typically read-heavy, making it suitable for reducing the scanning of irrelevant columns through columnar materialized views. Combined with indexes and query rewriting, it can further optimize query performance when needed.
This section provides a simple example to demonstrate how to use columnstore materialized views to support wide table queries and multidimensional analysis.
CREATE TABLE IF NOT EXISTS tbl1 (col1 INT PRIMARY KEY, col2 VARCHAR(20), col3 INT);
CREATE MATERIALIZED VIEW mv_ec_tbl1
WITH COLUMN GROUP(each column)
AS SELECT *
FROM tbl1;
When you enable query rewriting, ensure that the rewritten queries can use the indexes in the materialized views (for more information, see Materialized view query rewriting).
CREATE INDEX idx1_mv_ec_tbl1 ON mv_ec_tbl1(col1);
Scenario 2: Real-time metric calculation and query
In real-time metric scenarios, common requirements include hierarchical aggregation, incremental maintenance, low-latency queries, and minimal changes to business SQL.
These scenarios often require a combination of capabilities rather than a single one. Depending on the business's timeliness requirements, you can use nested materialized views, materialized view logs, FAST refresh, real-time materialized views, and query rewriting.
Nested materialized views
When a query involves multiple layers of aggregation or multiple steps of processing, you can first store the intermediate results in a materialized view and then create another materialized view based on those results. This approach is suitable for hierarchical precomputation scenarios, allowing complex metrics to be maintained in multiple steps.
CREATE TABLE IF NOT EXISTS tbl3(id INT, name VARCHAR(30), PRIMARY KEY(id));
CREATE TABLE IF NOT EXISTS tbl4(id INT, age INT, PRIMARY KEY(id));
CREATE MATERIALIZED VIEW mv1_tbl3_tbl4 (PRIMARY KEY (id1, id2))
REFRESH COMPLETE
AS SELECT tbl3.id id1, tbl4.id id2, tbl3.name, tbl4.age
FROM tbl3, tbl4
WHERE tbl3.id = tbl4.id;
CREATE MATERIALIZED VIEW mv_mv1_tbl3_tbl4
REFRESH COMPLETE
AS SELECT SUM(age) age_sum
FROM mv1_tbl3_tbl4;
CREATE MATERIALIZED VIEW mv1_mv1_tbl3_tbl4
REFRESH COMPLETE INCONSISTENT
AS SELECT SUM(age) age_sum
FROM mv1_tbl3_tbl4;
If a layer undergoes a full refresh, the nested materialized views that depend on it typically need to be fully refreshed again before incremental refreshes can proceed (see the documentation on creation and refreshes).
Materialized view logs and FAST refresh
If your business prioritizes refresh efficiency over recalculating all data from scratch, consider using materialized view logs in conjunction with FAST refresh. This method records changes in the base table and processes only incremental data during refreshes.
When Mlog automatic management is not enabled, you must create the mlog before creating a REFRESH FAST materialized view.
CREATE TABLE IF NOT EXISTS tbl5 (col1 INT PRIMARY KEY, col2 INT, col3 INT);
CREATE MATERIALIZED VIEW LOG ON tbl5
WITH SEQUENCE (col2, col3) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW mv_tbl5
REFRESH FAST
AS SELECT col2, COUNT(*) cnt, COUNT(col3) cnt_col3, SUM(col3) sum_col3
FROM tbl5
GROUP BY col2;
CALL DBMS_MVIEW.REFRESH('mv_tbl5');
CALL DBMS_MVIEW.REFRESH('mv_tbl5', 'c');
Real-time materialized views
For scenarios where query timeliness is critical and near-real-time results are desired, but frequent manual refreshes are not feasible, you can evaluate the use of real-time materialized views. This approach calculates data at query time to achieve higher data freshness.
CREATE TABLE IF NOT EXISTS tbl2(col1 INT, col2 INT, col3 INT);
CREATE MATERIALIZED VIEW LOG ON tbl2
WITH PRIMARY KEY, ROWID, SEQUENCE (col1, col2, col3) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW mv_tbl2_rt
ENABLE ON QUERY COMPUTATION
AS SELECT col1, COUNT(*) AS cnt
FROM tbl2
GROUP BY col1;
Query rewriting (ENABLE QUERY REWRITE)
If you want to minimize changes to business SQL while allowing the optimizer to automatically use materialized view results, you can use the query rewriting feature. Query rewriting is suitable for fixed analysis scenarios, but whether it can be applied depends on whether the query conditions match the materialized view definitions.
CREATE TABLE IF NOT EXISTS test_tbl1 (col1 INT, col2 INT, col3 INT);
CREATE TABLE IF NOT EXISTS test_tbl2 (col1 INT, col2 INT, col3 INT);
CREATE MATERIALIZED VIEW mv_test_tbl1_tbl2
ENABLE QUERY REWRITE
AS SELECT t1.col1 col1, t1.col2 t1col2, t1.col3 t1col3, t2.col2 t2col2, t2.col3 t2col3
FROM test_tbl1 t1, test_tbl2 t2
WHERE t1.col1 = t2.col1;
SET query_rewrite_enabled = 'force';
SELECT COUNT(*), test_tbl1.col1 col1
FROM test_tbl1, test_tbl2
WHERE test_tbl1.col1 = test_tbl2.col1 AND test_tbl2.col2 > 10
GROUP BY col1;
Scenario 3: File data import and regular refreshes
When data comes from files instead of online business tables, you can first use an external table to read the file data and then create a full-refresh materialized view based on that external table. This approach is suitable for periodic updates, offline synchronization, and regular data ingestion in lakehouse scenarios.
Starting from V4.3.5 BP2, you can create a full-refresh materialized view with an external table as the base. You need to configure secure_file_priv, prepare the external file, and execute CREATE EXTERNAL TABLE (see Create a materialized view — Based on an external table for the complete steps).
CREATE EXTERNAL TABLE ext_tbl1 (
id INT,
name VARCHAR(50),
c_date DATE
)
LOCATION = '/home/admin'
FORMAT = (
TYPE = 'CSV'
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '\''
)
PATTERN = 'ext_tbl1.csv';
CREATE MATERIALIZED VIEW mv_ext_tbl1
REFRESH COMPLETE
AS SELECT * FROM ext_tbl1;
For more information about external tables, see: Import data using an external table and Overview of external tables.
