Online statistics collection refers to the process where, during INSERT operations, the database optimizer collects statistics without requiring manual system package calls. It can collect statistics in real-time during large-scale data imports without needing to initiate additional collection tasks, reducing maintenance operations and enhancing statistics collection efficiency.
OceanBase Database uses the GATHER_OPTIMIZER_STATISTICS/NO_GATHER_OPTIMIZER_STATISTICS hint and the system variable _optimizer_gather_stats_on_load (which is enabled by default) for online statistics collection. It can also utilize the APPEND hint from direct load functionality to achieve the same.
Note
The _optimizer_gather_stats_on_load is an internal system variable that is enabled by default.
The primary scenarios for online statistics collection include:
CREATE TABLE...AS SELECT(CTAS)When the system variable
_optimizer_gather_stats_on_loadis set to True, CTAS operations are enabled by default for online statistics collection. To disable this feature, use theNO_GATHER_OPTIMIZER_STATISTICShint. Example:CREATE TABLE table_name AS SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/* FROM...INSERT INTO SELECTWhen
_optimizer_gather_stats_on_loadis True, online statistics collection can be enabled using theGATHER_OPTIMIZER_STATISTICShint or theAPPENDhint. Example:INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name SELECT * FROM table_nameNotice
At present, OceanBase Database does not support collecting statistics online by executing the
INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name VALUES()...statement.PDML
Online statistics collection can be enabled by using the
ENABLE_PARALLEL_DMLhint or by executingALTER SESSION ENABLE PARALLEL DML;to set the system to parallel mode, allowing for parallel insert operations.Example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; INSERT INTO t1 VALUES (1),(2),(3),(4); obclient> EXPLAIN INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1; +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ================================================================================ | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | -------------------------------------------------------------------------------- | | |0 |OPTIMIZER STATS MERGE | |1 |16 | | | |1 | PX COORDINATOR | |1 |16 | | | |2 | EXCHANGE OUT DISTR |:EX10001 |1 |16 | | | |3 | INSERT | |1 |15 | | | |4 | EXCHANGE IN DISTR | |1 |2 | | | |5 | EXCHANGE OUT DISTR (PKEY RANDOM)|:EX10000 |1 |2 | | | |6 | OPTIMIZER STATS GATHER | |1 |2 | | | |7 | SUBPLAN SCAN |ANONYMOUS_VIEW1|1 |2 | | | |8 | PX BLOCK ITERATOR | |1 |2 | | | |9 | TABLE SCAN |T1 |1 |2 | | | ================================================================================ | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=256 | | 1 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | 2 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | dop=4 | | 3 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil) | | columns([{T2: ({T2: (T2.__pk_increment, T2.C1)})}]), partitions(p[0-3]), | | column_values([T_HIDDEN_PK], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]) | | 4 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | 5 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | (#keys=1, [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), dop=4 | | 6 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [calc_tablet_id(PARTITION_HASH(column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)))]), | | filter(nil), rowset=256 | | 7 - output([ANONYMOUS_VIEW1.C1]), filter(nil), rowset=256 | | access([ANONYMOUS_VIEW1.C1]) | | 8 - output([T1.C1]), filter(nil), rowset=256 | | 9 - output([T1.C1]), filter(nil), rowset=256 | | access([T1.C1]), partitions(p[0-3]) | | is_index_back=false, is_global_index=false, | | range_key([T1.__pk_increment]), range(MIN ; MAX)always true | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ 35 rows in setIn the preceding example, the
OPTIMIZER STATS MERGEoperator is generated, indicating that online statistics collection was executed.
The statistics collected online can be viewed in specific views. For more information about these views, see Overview of statistics collection methods.
Example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1;
-- Oracle mode
OceanBase(SYS@SYS)> SELECT table_name, partition_name, num_rows, avg_row_len FROM user_tab_statistics WHERE table_name='T2';
+------------+----------------+----------+-------------+
| TABLE_NAME | PARTITION_NAME | NUM_ROWS | AVG_ROW_LEN |
+------------+----------------+----------+-------------+
| T2 | NULL | 4 | 20 |
| T2 | P0 | 1 | 20 |
| T2 | P1 | 2 | 20 |
| T2 | P2 | 1 | 20 |
| T2 | P3 | 0 | 0 |
+------------+----------------+----------+-------------+
-- MySQL mode
OceanBase(root@test)> SELECT table_name, partition_name, num_rows, avg_row_len FROM OCEANBASE.DBA_TAB_STATISTICS WHERE table_name='T2';
+------------+----------------+----------+-------------+
| table_name | partition_name | num_rows | avg_row_len |
+------------+----------------+----------+-------------+
| t2 | NULL | 4 | 20 |
| t2 | p0 | 1 | 20 |
| t2 | p1 | 1 | 20 |
| t2 | p2 | 1 | 20 |
| t2 | p3 | 1 | 20 |
+------------+----------------+----------+-------------+