This topic provides step-by-step SQL practice in MySQL mode: the orders table for beginner exercises, tbl* tables for small self-contained snippets, and external tables for full-refresh materialized views. You can follow the sections in order. Use this topic together with the materialized view query acceleration topic, which contains the sales and items examples. If you need a complete walkthrough of the business tables, read this topic first.
Note
For more information about syntax, limitations, and other details, see: Create a materialized view, Refresh a materialized view, Materialized view query rewriting, Materialized view logs, and Automatic management of materialized view logs.
Build a real-time data warehouse with materialized views: scenarios and considerations
To make quick decisions when building a real-time data warehouse, select a solution based on your data freshness requirements, computational complexity, and data source type.
| Target use case | Recommended capability combination | Typical characteristics | Corresponding section |
|---|---|---|---|
| Dashboards with periodic summarization, updated at minute-level latency | REFRESH FAST + Mlog + scheduled task | Low incremental maintenance cost, suitable for stable aggregations | Beginner practice; Scenario 2: Real-time metric calculation and query |
| Detail scans and report drilling on wide tables | Columnar materialized views plus query rewriting (optional) | Benefits are more pronounced when scanning a few columns rather than all columns | Scenario 1: Operational analytics and report queries |
| Low-latency online queries (query-triggered computation) | ENABLE ON QUERY COMPUTATION | Trades query-time computation for higher data freshness | Scenario 2: Real-time metric calculation and query |
| Transparent SQL acceleration (minimal changes to business SQL) | ENABLE QUERY REWRITE + index/view alignment | Relies on rewrite matching rules, suitable for fixed analysis templates | Scenario 2: Real-time metric calculation and query |
| Lakehouse file loading | External table + REFRESH COMPLETE | Data comes from files; offline synchronization occurs before querying | Scenario 3: File data import and scheduled refresh |
The recommended order of practice is:
- Define the metrics and refresh strategy: specify whether the requirement is minute-level incremental refresh or real-time computation at query time.
- Choose the refresh mechanism: evaluate whether FAST refresh (with an Mlog) meets the requirements; if not, use COMPLETE refresh.
- Optimize the query side last: add columnar storage, indexes, and query rewriting only when needed, to avoid introducing too many variables at once.
Note
A real-time data warehouse does not mean that all queries refresh in real time. If the business can tolerate minutes of delay, prefer FAST + scheduled updates; only introduce ENABLE ON QUERY COMPUTATION if there are specific low-latency requirements and you can accept the query-time overhead.
Try it out: Full refresh and incremental refresh on the orders table
Let's take an orders table as an example. This example demonstrates how to create a materialized view with default full maintenance, an incremental materialized view using Mlog + FAST refresh, and basic refresh methods, including manual refresh and scheduled refresh (as described in Refresh a materialized view).
Create the tables, insert data, and create a full-mapping materialized view
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT,
item_id INT,
item_count INT,
item_price INT,
region VARCHAR(100)
);
INSERT INTO orders VALUES
(1, 10001, 1, 20, 100, 'HZ'),
(2, 10002, 1, 10, 150, 'BJ'),
(3, 10001, 2, 50, 50, 'SH');
CREATE MATERIALIZED VIEW mv1
AS
SELECT region, SUM(item_count * item_price) AS sum_price
FROM orders
GROUP BY region;
SELECT * FROM mv1;
When a materialized view is created, a major compaction is performed first. If no refresh strategy is explicitly specified, a complete refresh is usually required to keep the materialized view in sync with the base table.
Manually refresh (DBMS_MVIEW.REFRESH)
After you write to the base table, the materialized result does not change until you trigger a refresh.
INSERT INTO orders VALUES
(4, 10002, 2, 10, 100, 'SH'),
(5, 10003, 1, 2, 20, 'HZ');
SELECT * FROM mv1;
Return result:
+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ | 2000 |
| BJ | 1500 |
| SH | 2500 |
+--------+-----------+
3 rows in set
You can see that the materialized view results have not changed. Execute the following SQL to trigger a refresh:
CALL DBMS_MVIEW.REFRESH('mv1', 'c', refresh_parallel => 2);
SELECT * FROM mv1;
The result is as follows:
+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ | 2040 |
| BJ | 1500 |
| SH | 3500 |
+--------+-----------+
3 rows in set
'c' indicates a complete refresh; other parameters such as refresh_parallel follow the specification of DBMS_MVIEW.REFRESH.
Query the refresh run status (example):
SELECT MVIEWS, METHOD, START_TIME, END_TIME
FROM oceanbase.DBA_MVREF_RUN_STATS;
Incremental refresh: Mlog + FAST
DROP MATERIALIZED VIEW mv1;
CREATE MATERIALIZED VIEW LOG ON orders
WITH PRIMARY KEY, ROWID, SEQUENCE (item_count, item_price, region) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW mv1(region, c, sum_price, cnt_price)
REFRESH FAST
ON DEMAND START WITH sysdate() NEXT sysdate() + INTERVAL 5 MINUTE
AS
SELECT region,
COUNT(*) AS c,
SUM(item_count * item_price) AS sum_price,
COUNT(item_count * item_price) AS cnt_price
FROM orders
GROUP BY region;
After inserting a new order, observe the Mlog first, then perform a manual incremental refresh:
INSERT INTO orders VALUES (6, 10001, 3, 30, 70, 'HZ');
SELECT * FROM mlog$_orders;
CALL DBMS_MVIEW.REFRESH('mv1', 'f');
SELECT * FROM mlog$_orders;
SELECT region, sum_price FROM mv1;
You can query the DBA_SCHEDULER_JOBS view to find the background refresh jobs created by the START WITH ... NEXT ... clause (the actual columns depend on the database version).
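As a sketch (the column names here are assumptions and may differ across versions), you could list the refresh jobs like this:

```sql
-- List background scheduler jobs related to the materialized view.
-- Column names (JOB_NAME, STATE, NEXT_RUN_DATE) are assumptions; check your version.
SELECT JOB_NAME, STATE, LAST_START_DATE, NEXT_RUN_DATE
FROM oceanbase.DBA_SCHEDULER_JOBS
WHERE JOB_NAME LIKE '%mv1%';
```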
Object clean-up order
When you drop a materialized view, materialized view log, and base table, follow this order: first drop the materialized view, then the materialized view log, and finally the base table. Because the base table can still be linked to the materialized view log, dropping the base table first with the DROP TABLE statement might generate an error.
DROP MATERIALIZED VIEW mv1;
DROP MATERIALIZED VIEW LOG ON orders;
DROP TABLE orders;
Scenario 1: Operational analytics and report queries
In operational analytics and report querying scenarios, common issues include detail tables with many columns, large scan ranges, and frequently changing query conditions.
These scenarios are typically read-heavy, making them ideal for reducing unnecessary column scans using columnar materialized views and further optimizing query performance with indexes and query rewriting when needed.
This section provides an example of using a column store materialized view to support wide table queries and multidimensional analysis.
CREATE TABLE IF NOT EXISTS tbl1 (col1 INT PRIMARY KEY, col2 VARCHAR(20), col3 INT);
CREATE MATERIALIZED VIEW mv_ec_tbl1
WITH COLUMN GROUP(each column)
AS SELECT *
FROM tbl1;
For query rewriting, typically the rewritten condition must be able to utilize an index on the materialized view (for details, see Query Rewriting):
CREATE INDEX idx1_mv_ec_tbl1 ON mv_ec_tbl1(col1);
Scenario 2: Real-time metric calculation and query
In real-time metric scenarios, common requirements include hierarchical aggregation, incremental maintenance, low-latency queries, and minimal changes to business SQL.
These scenarios are typically not covered by a single capability but require the combination of nested materialized views, materialized view logs, FAST refresh, real-time materialized views, and query rewriting based on business timeliness requirements.
Nested materialized views
When a query process involves multiple layers of aggregation or multiple steps of processing, intermediate results can be stored in a materialized view, and then subsequent layers of materialized views can be created based on these results. This approach is suitable for hierarchical precomputation scenarios, allowing complex metrics to be broken down into multiple steps of maintenance.
CREATE TABLE IF NOT EXISTS tbl3(id INT, name VARCHAR(30), PRIMARY KEY(id));
CREATE TABLE IF NOT EXISTS tbl4(id INT, age INT, PRIMARY KEY(id));
CREATE MATERIALIZED VIEW mv1_tbl3_tbl4 (PRIMARY KEY (id1, id2))
REFRESH COMPLETE
AS SELECT tbl3.id id1, tbl4.id id2, tbl3.name, tbl4.age
FROM tbl3, tbl4
WHERE tbl3.id = tbl4.id;
CREATE MATERIALIZED VIEW mv_mv1_tbl3_tbl4
REFRESH COMPLETE
AS SELECT SUM(age) age_sum
FROM mv1_tbl3_tbl4;
CREATE MATERIALIZED VIEW mv1_mv1_tbl3_tbl4
REFRESH COMPLETE INCONSISTENT
AS SELECT SUM(age) age_sum
FROM mv1_tbl3_tbl4;
If a layer undergoes a complete refresh, any nested materialized views that depend on it typically need to be fully refreshed again before incremental refresh can resume (see the documentation on creation and refresh).
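When refreshing the nested views above manually, refresh the inner view before the outer one so the outer view reads up-to-date intermediate results. A minimal sketch, using the 'c' (complete refresh) parameter shown earlier:

```sql
-- Refresh the inner (base-level) materialized view first ...
CALL DBMS_MVIEW.REFRESH('mv1_tbl3_tbl4', 'c');
-- ... then the outer view that reads from it.
CALL DBMS_MVIEW.REFRESH('mv_mv1_tbl3_tbl4', 'c');
```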
Materialized view logs and FAST refresh
If your business prioritizes refresh efficiency over recalculating all data every time, consider using materialized view logs in conjunction with FAST refresh. This method records changes in the base table and processes only incremental data during refreshes.
When automatic Mlog management is not enabled, you must create the Mlog before creating a REFRESH FAST materialized view.
CREATE TABLE IF NOT EXISTS tbl5 (col1 INT PRIMARY KEY, col2 INT, col3 INT);
CREATE MATERIALIZED VIEW LOG ON tbl5
WITH SEQUENCE (col2, col3) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW mv_tbl5
REFRESH FAST
AS SELECT col2, COUNT(*) cnt, COUNT(col3) cnt_col3, SUM(col3) sum_col3
FROM tbl5
GROUP BY col2;
CALL DBMS_MVIEW.REFRESH('mv_tbl5');
CALL DBMS_MVIEW.REFRESH('mv_tbl5', 'c');
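To observe the incremental path, you can insert a few rows and then trigger a FAST refresh. This is a sketch reusing the 'f' parameter shown in the beginner practice; the inserted values are illustrative:

```sql
INSERT INTO tbl5 VALUES (1, 10, 100), (2, 10, 200), (3, 20, 300);
-- 'f' requests a fast (incremental) refresh driven by the Mlog on tbl5.
CALL DBMS_MVIEW.REFRESH('mv_tbl5', 'f');
SELECT * FROM mv_tbl5;
```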
Real-time materialized views
For scenarios that require near real-time query results but where frequent manual refreshes are impractical, assess whether to use real-time materialized views. This approach trades query-time computation for higher data freshness.
CREATE TABLE IF NOT EXISTS tbl2(col1 INT, col2 INT, col3 INT);
CREATE MATERIALIZED VIEW LOG ON tbl2
WITH PRIMARY KEY, ROWID, SEQUENCE (col1, col2, col3) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW mv_tbl2_rt
ENABLE ON QUERY COMPUTATION
AS SELECT col1, COUNT(*) AS cnt
FROM tbl2
GROUP BY col1;
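With ENABLE ON QUERY COMPUTATION, the view reflects base-table changes at query time without a manual refresh. A quick check (the inserted values are illustrative):

```sql
INSERT INTO tbl2 VALUES (1, 1, 1), (1, 2, 2), (2, 3, 3);
-- No refresh call is needed; changes recorded in the Mlog
-- are merged into the result at query time.
SELECT * FROM mv_tbl2_rt;
```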
Query rewriting (ENABLE QUERY REWRITE)
If you want to minimize changes to business SQL while allowing the optimizer to automatically use materialized view results, combine this with query rewriting. Query rewriting is suitable for fixed analysis scenarios, but whether it can be applied depends on whether the query conditions match the materialized view definitions.
CREATE TABLE IF NOT EXISTS test_tbl1 (col1 INT, col2 INT, col3 INT);
CREATE TABLE IF NOT EXISTS test_tbl2 (col1 INT, col2 INT, col3 INT);
CREATE MATERIALIZED VIEW mv_test_tbl1_tbl2
ENABLE QUERY REWRITE
AS SELECT t1.col1 col1, t1.col2 t1col2, t1.col3 t1col3, t2.col2 t2col2, t2.col3 t2col3
FROM test_tbl1 t1, test_tbl2 t2
WHERE t1.col1 = t2.col1;
SET query_rewrite_enabled = 'force';
SELECT COUNT(*), test_tbl1.col1 col1
FROM test_tbl1, test_tbl2
WHERE test_tbl1.col1 = test_tbl2.col1 AND test_tbl2.col2 > 10
GROUP BY col1;
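To verify whether the rewrite took effect, examine the execution plan and check whether it reads mv_test_tbl1_tbl2 instead of the base tables (the plan output format depends on the version):

```sql
EXPLAIN
SELECT COUNT(*), test_tbl1.col1 col1
FROM test_tbl1, test_tbl2
WHERE test_tbl1.col1 = test_tbl2.col1 AND test_tbl2.col2 > 10
GROUP BY col1;
```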
Scenario 3: File data import and scheduled refresh
When data comes from files rather than online business tables, you can first use an external table to read the file data, and then create a full-refresh materialized view based on the external table. This approach is suitable for periodic updates, offline synchronization followed by queries, and regular data ingestion in lakehouse scenarios.
Starting from V4.3.5 BP2, you can create full-refresh materialized views based on an external table. First configure secure_file_priv, prepare the external file, and execute CREATE EXTERNAL TABLE (for the complete steps, see Create a materialized view — Based on an external table).
CREATE EXTERNAL TABLE ext_tbl1 (
id INT,
name VARCHAR(50),
c_date DATE
)
LOCATION = '/home/admin'
FORMAT = (
TYPE = 'CSV'
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '\''
)
PATTERN = 'ext_tbl1.csv';
CREATE MATERIALIZED VIEW mv_ext_tbl1
REFRESH COMPLETE
AS SELECT * FROM ext_tbl1;
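After the underlying file changes, trigger a complete refresh so the materialized view picks up the new data. A sketch; a START WITH ... NEXT ... clause, as shown in the beginner practice, could automate this on a schedule:

```sql
-- 'c' requests a complete refresh, re-reading the external file.
CALL DBMS_MVIEW.REFRESH('mv_ext_tbl1', 'c');
SELECT * FROM mv_ext_tbl1;
```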
For more information about external tables, see: Import data using an external table and Overview of external tables.
