Practice scenarios of materialized views|V4.3.5|OceanBase Database| docs|Distributed Database

This topic provides step-by-step SQL practice in MySQL mode (e.g., orders beginner, tbl* snippets, and external tables for full MV). You can follow the sections in order. Pair this with the Materialized View Query Acceleration example using sales / items for a complete, end-to-end business table demonstration. Please read this document first if you need a full, end-to-end business table demonstration.

Note

For more information about syntax, limitations, and other details, see: Create a materialized view, Refresh a materialized view, Query rewrite of materialized views, Materialized view logs, and Automatic management of materialized view logs.

Use materialized views to build a real-time data warehouse: scenario selection

To make quick decisions when building a real-time data warehouse, you can select options based on the following order: data freshness requirements -> computational complexity -> data source type.

Target scenario	Recommended capability combination	Typical characteristics	Corresponding section
Periodic dashboards with minute-level updates	`REFRESH FAST` + Mlog + scheduled refresh	Low incremental maintenance cost, suitable for stable aggregation	Getting started; Scenario 2: Real-time metric processing pipeline
Scanning wide detail tables and report drilling	Columnar MV + query rewriting (optional)	More significant benefits when scanning fewer columns than all columns	Scenario 1: Operational analysis and report drilling
Low-latency online queries (query-triggered computation)	`ENABLE ON QUERY COMPUTATION`	Trade-off between read-time computation and simplified updates, suitable for hot query models	Scenario 2: Real-time metric processing pipeline
Transparent SQL acceleration (minimal changes to business SQL)	`ENABLE QUERY REWRITE` + indexes/alignment	Relies on rewriting matching rules, suitable for fixed analysis templates	Scenario 2: Real-time metric processing pipeline
File ingestion followed by periodic loading	External table + `REFRESH COMPLETE`	Data source is files, requiring offline synchronization before querying	Scenario 3: Lakehouse ingestion and periodic loading

The recommended practice application order is as follows:

Define the metrics and refresh strategy: Determine whether to use "minute-level incremental" or "query-time real-time computation".
Determine the refresh mechanism: Prioritize FAST (Mlog) if it meets the requirements; otherwise, use COMPLETE.
Optimize the query side: Add columnar storage, indexes, and query rewriting as needed to avoid introducing too many variables at once.

Notice

A real-time data warehouse does not mean that all queries are refreshed in real time. If the business can tolerate minute-level latency, prioritize FAST + scheduling. Only introduce ENABLE ON QUERY COMPUTATION if there is a clear low-latency requirement and the read-time overhead is acceptable.

Getting started: Full and incremental refresh of an order table

The following example demonstrates the use of an order table in a business database to show: default full refresh of a materialized view, Mlog + FAST for incremental refresh, and the basic use of manual refresh and scheduled refresh (consistent with Refresh a materialized view).

Create the table, generate sample data, and create a full-refresh materialized view

CREATE TABLE orders (
  order_id INT PRIMARY KEY,
  user_id INT,
  item_id INT,
  item_count INT,
  item_price INT,
  region VARCHAR(100)
);

INSERT INTO orders VALUES
  (1, 10001, 1, 20, 100, 'HZ'),
  (2, 10002, 1, 10, 150, 'BJ'),
  (3, 10001, 2, 50, 50, 'SH');

CREATE MATERIALIZED VIEW mv1
AS
SELECT region, SUM(item_count * item_price) AS sum_price
FROM orders
GROUP BY region;

SELECT * FROM mv1;

When creating a materialized view, a first-time materialization is performed. If no refresh strategy is explicitly declared, a manual full refresh is typically required to keep the materialized view aligned with the base table.

Manual full refresh (DBMS_MVIEW.REFRESH)

After writing to the base table, the materialized view will not automatically update. You need to trigger a refresh:

INSERT INTO orders VALUES
  (4, 10002, 2, 10, 100, 'SH'),
  (5, 10003, 1, 2, 20, 'HZ');

SELECT * FROM mv1;

The result is:

+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ     |      2000 |
| BJ     |      1500 |
| SH     |      2500 |
+--------+-----------+
3 rows in set

You can see that the materialized view result does not automatically change. You need to execute the following SQL to trigger a refresh:

CALL DBMS_MVIEW.REFRESH('mv1', 'c', refresh_parallel => 2);

SELECT * FROM mv1;

The result is:

+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ     |      2040 |
| BJ     |      1500 |
| SH     |      3500 |
+--------+-----------+
3 rows in set

'c' indicates a complete full refresh. Parameters such as refresh_parallel are specified in DBMS_MVIEW.REFRESH.

You can query the refresh status (example):

SELECT MVIEWS, METHOD, START_TIME, END_TIME
FROM oceanbase.DBA_MVREF_RUN_STATS;

Switch to incremental refresh: Mlog + FAST

DROP MATERIALIZED VIEW mv1;

CREATE MATERIALIZED VIEW LOG ON orders
  WITH PRIMARY KEY, ROWID, SEQUENCE (item_count, item_price, region) INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv1(region, c, sum_price, cnt_price)
  REFRESH FAST
  ON DEMAND START WITH sysdate() NEXT sysdate() + INTERVAL 5 MINUTE
AS
SELECT region,
       COUNT(*) AS c,
       SUM(item_count * item_price) AS sum_price,
       COUNT(item_count * item_price) AS cnt_price
FROM orders
GROUP BY region;

After inserting a new order, you can first observe the Mlog and then manually trigger an incremental refresh:

INSERT INTO orders VALUES (6, 10001, 3, 30, 70, 'HZ');

SELECT * FROM mlog$_orders;

CALL DBMS_MVIEW.REFRESH('mv1', 'f');

SELECT * FROM mlog$_orders;
SELECT region, sum_price FROM mv1;

If you configured START WITH ... NEXT ... when creating the materialized view, you can also check the background refresh tasks in DBA_SCHEDULER_JOBS (specific fields may vary by version).

Clean up objects in the correct order

When deleting a materialized view, materialized view log, and base table, ensure to delete the materialized view first, then the materialized view log, and finally the base table. If the base table is still bound to a materialized view log, directly dropping the table may result in an error.

DROP MATERIALIZED VIEW mv1;
DROP MATERIALIZED VIEW LOG ON orders;
DROP TABLE orders;

Scenario 1: Operational analysis and report querying

In operational analysis and report querying scenarios, common issues include wide detail tables with many columns, large scanning ranges, and frequent changes in query conditions.

These scenarios are typically read-heavy and benefit from using columnar materialized views to reduce the number of irrelevant columns scanned. Indexes and query rewriting can further optimize query performance when needed.

This section provides a simple example of how to use columnar materialized views to support wide table queries and multidimensional analysis.

CREATE TABLE IF NOT EXISTS tbl1 (col1 INT PRIMARY KEY, col2 VARCHAR(20), col3 INT);

CREATE MATERIALIZED VIEW mv_ec_tbl1
    WITH COLUMN GROUP(each column)
    AS SELECT *
       FROM tbl1;

If you also use query rewriting, ensure that the rewritten conditions can utilize indexes on the materialized view (rules are detailed in Materialized view query rewriting):

CREATE INDEX idx1_mv_ec_tbl1 ON mv_ec_tbl1(col1);

Scenario 2: Real-time metric calculation and query

In real-time metric scenarios, common requirements include hierarchical aggregation, incremental maintenance, low-latency queries, and minimal changes to business SQL.

These scenarios typically require a combination of capabilities rather than a single one. Depending on the business's timeliness requirements, nested materialized views, materialized view logs, FAST refresh, real-time materialized views, and query rewriting can be combined.

Nested materialized views

When a query process involves multiple layers of aggregation or multiple steps of processing, you can first store the intermediate results in one materialized view and then create another materialized view based on these results. This approach is suitable for hierarchical precomputation scenarios, allowing complex metrics to be maintained in multiple steps.

CREATE TABLE IF NOT EXISTS tbl3(id INT, name VARCHAR(30), PRIMARY KEY(id));
CREATE TABLE IF NOT EXISTS tbl4(id INT, age INT, PRIMARY KEY(id));

CREATE MATERIALIZED VIEW mv1_tbl3_tbl4 (PRIMARY KEY (id1, id2))
    REFRESH COMPLETE
    AS SELECT tbl3.id id1, tbl4.id id2, tbl3.name, tbl4.age
        FROM tbl3, tbl4
        WHERE tbl3.id = tbl4.id;

CREATE MATERIALIZED VIEW mv_mv1_tbl3_tbl4
    REFRESH COMPLETE
    AS SELECT SUM(age) age_sum
        FROM mv1_tbl3_tbl4;

CREATE MATERIALIZED VIEW mv1_mv1_tbl3_tbl4
    REFRESH COMPLETE INCONSISTENT
    AS SELECT SUM(age) age_sum
        FROM mv1_tbl3_tbl4;

If a layer undergoes a full refresh, the nested materialized views that depend on it typically need to be fully refreshed again before incremental refreshes can be performed (see the documentation on creation and refresh).

Materialized view logs and FAST refresh

If your business prioritizes refresh efficiency over recalculating all data every time, consider using materialized view logs with FAST refresh. This method records changes in the base table and processes only incremental data during refreshes.

When Mlog Auto-Management is not enabled, you must create the mlog before creating a REFRESH FAST materialized view.

CREATE TABLE IF NOT EXISTS tbl5 (col1 INT PRIMARY KEY, col2 INT, col3 INT);

CREATE MATERIALIZED VIEW LOG ON tbl5
    WITH SEQUENCE (col2, col3) INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv_tbl5
    REFRESH FAST
    AS SELECT col2, COUNT(*) cnt, COUNT(col3) cnt_col3, SUM(col3) sum_col3
       FROM tbl5
       GROUP BY col2;

CALL DBMS_MVIEW.REFRESH('mv_tbl5');
CALL DBMS_MVIEW.REFRESH('mv_tbl5', 'c');

Real-time materialized views

For scenarios requiring higher query timeliness and nearly real-time results, but not suitable for frequent manual refreshes, you can evaluate whether to use real-time materialized views. This approach trades off query computation for higher data freshness.

CREATE TABLE IF NOT EXISTS tbl2(col1 INT, col2 INT, col3 INT);

CREATE MATERIALIZED VIEW LOG ON tbl2
    WITH PRIMARY KEY, ROWID, SEQUENCE (col1, col2, col3) INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv_tbl2_rt
    ENABLE ON QUERY COMPUTATION
    AS SELECT col1, COUNT(*) AS cnt
       FROM tbl2
       GROUP BY col1;

Query rewriting (ENABLE QUERY REWRITE)

If you want to minimize changes to business SQL while allowing the optimizer to automatically use materialized view results, you can combine query rewriting. Query rewriting is suitable for fixed analysis scenarios, but whether it can be applied depends on whether the query conditions match the materialized view definitions.

CREATE TABLE IF NOT EXISTS test_tbl1 (col1 INT, col2 INT, col3 INT);
CREATE TABLE IF NOT EXISTS test_tbl2 (col1 INT, col2 INT, col3 INT);

CREATE MATERIALIZED VIEW mv_test_tbl1_tbl2
    ENABLE QUERY REWRITE
    AS SELECT t1.col1 col1, t1.col2 t1col2, t1.col3 t1col3, t2.col2 t2col2, t2.col3 t2col3
    FROM test_tbl1 t1, test_tbl2 t2
    WHERE t1.col1 = t2.col1;

SET query_rewrite_enabled = 'force';

SELECT COUNT(*), test_tbl1.col1 col1
FROM test_tbl1, test_tbl2
WHERE test_tbl1.col1 = test_tbl2.col1 AND test_tbl2.col2 > 10
GROUP BY col1;

Scenario 3: File data import and scheduled refresh

When data comes from files rather than online business tables, you can first read the file data through an external table and then create a full-refresh materialized view based on the external table. This approach is suitable for periodic updates, offline synchronization, and regular data ingestion in lakehouse scenarios.

Starting from V4.3.5 BP2, you can create a full-refresh materialized view with an external table as the base. First, configure secure_file_priv, prepare the external file, and execute CREATE EXTERNAL TABLE (for complete steps, see Create a materialized view — Based on an external table).

CREATE EXTERNAL TABLE ext_tbl1 (
    id INT,
    name VARCHAR(50),
    c_date DATE
)
LOCATION = '/home/admin'
FORMAT = (
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  FIELD_OPTIONALLY_ENCLOSED_BY = '\''
)
PATTERN = 'ext_tbl1.csv';

CREATE MATERIALIZED VIEW mv_ext_tbl1
    REFRESH COMPLETE
    AS SELECT * FROM ext_tbl1;

For more information about external tables, see: Import data from an external table and Overview of external tables.

References

Note

Use materialized views to build a real-time data warehouse: scenario selection

To make quick decisions when building a real-time data warehouse, you can select options based on the following order: data freshness requirements -> computational complexity -> data source type.

Target scenario	Recommended capability combination	Typical characteristics	Corresponding section
Periodic dashboards with minute-level updates	`REFRESH FAST` + Mlog + scheduled refresh	Low incremental maintenance cost, suitable for stable aggregation	Getting started; Scenario 2: Real-time metric processing pipeline
Scanning wide detail tables and report drilling	Columnar MV + query rewriting (optional)	More significant benefits when scanning fewer columns than all columns	Scenario 1: Operational analysis and report drilling
Low-latency online queries (query-triggered computation)	`ENABLE ON QUERY COMPUTATION`	Trade-off between read-time computation and simplified updates, suitable for hot query models	Scenario 2: Real-time metric processing pipeline
Transparent SQL acceleration (minimal changes to business SQL)	`ENABLE QUERY REWRITE` + indexes/alignment	Relies on rewriting matching rules, suitable for fixed analysis templates	Scenario 2: Real-time metric processing pipeline
File ingestion followed by periodic loading	External table + `REFRESH COMPLETE`	Data source is files, requiring offline synchronization before querying	Scenario 3: Lakehouse ingestion and periodic loading

The recommended practice application order is as follows:

Define the metrics and refresh strategy: Determine whether to use "minute-level incremental" or "query-time real-time computation".
Determine the refresh mechanism: Prioritize FAST (Mlog) if it meets the requirements; otherwise, use COMPLETE.
Optimize the query side: Add columnar storage, indexes, and query rewriting as needed to avoid introducing too many variables at once.

Notice

Getting started: Full and incremental refresh of an order table

Create the table, generate sample data, and create a full-refresh materialized view

CREATE TABLE orders (
  order_id INT PRIMARY KEY,
  user_id INT,
  item_id INT,
  item_count INT,
  item_price INT,
  region VARCHAR(100)
);

INSERT INTO orders VALUES
  (1, 10001, 1, 20, 100, 'HZ'),
  (2, 10002, 1, 10, 150, 'BJ'),
  (3, 10001, 2, 50, 50, 'SH');

CREATE MATERIALIZED VIEW mv1
AS
SELECT region, SUM(item_count * item_price) AS sum_price
FROM orders
GROUP BY region;

SELECT * FROM mv1;

Manual full refresh (DBMS_MVIEW.REFRESH)

After writing to the base table, the materialized view will not automatically update. You need to trigger a refresh:

INSERT INTO orders VALUES
  (4, 10002, 2, 10, 100, 'SH'),
  (5, 10003, 1, 2, 20, 'HZ');

SELECT * FROM mv1;

The result is:

+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ     |      2000 |
| BJ     |      1500 |
| SH     |      2500 |
+--------+-----------+
3 rows in set

You can see that the materialized view result does not automatically change. You need to execute the following SQL to trigger a refresh:

CALL DBMS_MVIEW.REFRESH('mv1', 'c', refresh_parallel => 2);

SELECT * FROM mv1;

The result is:

+--------+-----------+
| region | sum_price |
+--------+-----------+
| HZ     |      2040 |
| BJ     |      1500 |
| SH     |      3500 |
+--------+-----------+
3 rows in set

'c' indicates a complete full refresh. Parameters such as refresh_parallel are specified in DBMS_MVIEW.REFRESH.

You can query the refresh status (example):

SELECT MVIEWS, METHOD, START_TIME, END_TIME
FROM oceanbase.DBA_MVREF_RUN_STATS;

Switch to incremental refresh: Mlog + FAST

DROP MATERIALIZED VIEW mv1;

CREATE MATERIALIZED VIEW LOG ON orders
  WITH PRIMARY KEY, ROWID, SEQUENCE (item_count, item_price, region) INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv1(region, c, sum_price, cnt_price)
  REFRESH FAST
  ON DEMAND START WITH sysdate() NEXT sysdate() + INTERVAL 5 MINUTE
AS
SELECT region,
       COUNT(*) AS c,
       SUM(item_count * item_price) AS sum_price,
       COUNT(item_count * item_price) AS cnt_price
FROM orders
GROUP BY region;

After inserting a new order, you can first observe the Mlog and then manually trigger an incremental refresh:

INSERT INTO orders VALUES (6, 10001, 3, 30, 70, 'HZ');

SELECT * FROM mlog$_orders;

CALL DBMS_MVIEW.REFRESH('mv1', 'f');

SELECT * FROM mlog$_orders;
SELECT region, sum_price FROM mv1;

If you configured START WITH ... NEXT ... when creating the materialized view, you can also check the background refresh tasks in DBA_SCHEDULER_JOBS (specific fields may vary by version).

Clean up objects in the correct order

DROP MATERIALIZED VIEW mv1;
DROP MATERIALIZED VIEW LOG ON orders;
DROP TABLE orders;

Scenario 1: Operational analysis and report querying

In operational analysis and report querying scenarios, common issues include wide detail tables with many columns, large scanning ranges, and frequent changes in query conditions.

This section provides a simple example of how to use columnar materialized views to support wide table queries and multidimensional analysis.

CREATE TABLE IF NOT EXISTS tbl1 (col1 INT PRIMARY KEY, col2 VARCHAR(20), col3 INT);

CREATE MATERIALIZED VIEW mv_ec_tbl1
    WITH COLUMN GROUP(each column)
    AS SELECT *
       FROM tbl1;

If you also use query rewriting, ensure that the rewritten conditions can utilize indexes on the materialized view (rules are detailed in Materialized view query rewriting):

CREATE INDEX idx1_mv_ec_tbl1 ON mv_ec_tbl1(col1);

Scenario 2: Real-time metric calculation and query

In real-time metric scenarios, common requirements include hierarchical aggregation, incremental maintenance, low-latency queries, and minimal changes to business SQL.

Nested materialized views

CREATE TABLE IF NOT EXISTS tbl3(id INT, name VARCHAR(30), PRIMARY KEY(id));
CREATE TABLE IF NOT EXISTS tbl4(id INT, age INT, PRIMARY KEY(id));

CREATE MATERIALIZED VIEW mv1_tbl3_tbl4 (PRIMARY KEY (id1, id2))
    REFRESH COMPLETE
    AS SELECT tbl3.id id1, tbl4.id id2, tbl3.name, tbl4.age
        FROM tbl3, tbl4
        WHERE tbl3.id = tbl4.id;

CREATE MATERIALIZED VIEW mv_mv1_tbl3_tbl4
    REFRESH COMPLETE
    AS SELECT SUM(age) age_sum
        FROM mv1_tbl3_tbl4;

CREATE MATERIALIZED VIEW mv1_mv1_tbl3_tbl4
    REFRESH COMPLETE INCONSISTENT
    AS SELECT SUM(age) age_sum
        FROM mv1_tbl3_tbl4;

Materialized view logs and FAST refresh

When Mlog Auto-Management is not enabled, you must create the mlog before creating a REFRESH FAST materialized view.

CREATE TABLE IF NOT EXISTS tbl5 (col1 INT PRIMARY KEY, col2 INT, col3 INT);

CREATE MATERIALIZED VIEW LOG ON tbl5
    WITH SEQUENCE (col2, col3) INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv_tbl5
    REFRESH FAST
    AS SELECT col2, COUNT(*) cnt, COUNT(col3) cnt_col3, SUM(col3) sum_col3
       FROM tbl5
       GROUP BY col2;

CALL DBMS_MVIEW.REFRESH('mv_tbl5');
CALL DBMS_MVIEW.REFRESH('mv_tbl5', 'c');

Real-time materialized views

CREATE TABLE IF NOT EXISTS tbl2(col1 INT, col2 INT, col3 INT);

CREATE MATERIALIZED VIEW LOG ON tbl2
    WITH PRIMARY KEY, ROWID, SEQUENCE (col1, col2, col3) INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv_tbl2_rt
    ENABLE ON QUERY COMPUTATION
    AS SELECT col1, COUNT(*) AS cnt
       FROM tbl2
       GROUP BY col1;

Query rewriting (ENABLE QUERY REWRITE)

CREATE TABLE IF NOT EXISTS test_tbl1 (col1 INT, col2 INT, col3 INT);
CREATE TABLE IF NOT EXISTS test_tbl2 (col1 INT, col2 INT, col3 INT);

CREATE MATERIALIZED VIEW mv_test_tbl1_tbl2
    ENABLE QUERY REWRITE
    AS SELECT t1.col1 col1, t1.col2 t1col2, t1.col3 t1col3, t2.col2 t2col2, t2.col3 t2col3
    FROM test_tbl1 t1, test_tbl2 t2
    WHERE t1.col1 = t2.col1;

SET query_rewrite_enabled = 'force';

SELECT COUNT(*), test_tbl1.col1 col1
FROM test_tbl1, test_tbl2
WHERE test_tbl1.col1 = test_tbl2.col1 AND test_tbl2.col2 > 10
GROUP BY col1;

Scenario 3: File data import and scheduled refresh

CREATE EXTERNAL TABLE ext_tbl1 (
    id INT,
    name VARCHAR(50),
    c_date DATE
)
LOCATION = '/home/admin'
FORMAT = (
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  FIELD_OPTIONALLY_ENCLOSED_BY = '\''
)
PATTERN = 'ext_tbl1.csv';

CREATE MATERIALIZED VIEW mv_ext_tbl1
    REFRESH COMPLETE
    AS SELECT * FROM ext_tbl1;

For more information about external tables, see: Import data from an external table and Overview of external tables.

OceanBase

Customer Stories

Documentation

Practice scenarios of materialized views

Note

Use materialized views to build a real-time data warehouse: scenario selection

Notice

Getting started: Full and incremental refresh of an order table

Create the table, generate sample data, and create a full-refresh materialized view

Manual full refresh (DBMS_MVIEW.REFRESH)

Switch to incremental refresh: Mlog + FAST

Clean up objects in the correct order

Scenario 1: Operational analysis and report querying

Scenario 2: Real-time metric calculation and query

Nested materialized views

Materialized view logs and FAST refresh

Real-time materialized views

Query rewriting (ENABLE QUERY REWRITE)

Scenario 3: File data import and scheduled refresh

References

Practice scenarios of materialized views

Note

Use materialized views to build a real-time data warehouse: scenario selection

Notice

Getting started: Full and incremental refresh of an order table

Create the table, generate sample data, and create a full-refresh materialized view

Manual full refresh (DBMS_MVIEW.REFRESH)

Switch to incremental refresh: Mlog + FAST

Clean up objects in the correct order

Scenario 1: Operational analysis and report querying

Scenario 2: Real-time metric calculation and query

Nested materialized views

Materialized view logs and FAST refresh

Real-time materialized views

Query rewriting (ENABLE QUERY REWRITE)

Scenario 3: File data import and scheduled refresh

References