Online statistics collection refers to the process where, during an INSERT operation, the database optimizer collects statistics without the need for manual system package calls. During large-scale data imports, statistics can be collected in real-time without the need to initiate additional collection tasks, reducing maintenance operations and improving the efficiency of statistics collection.
OceanBase Database collects online statistics by using the GATHER_OPTIMIZER_STATISTICS or NO_GATHER_OPTIMIZER_STATISTICS hint and the system variable _optimizer_gather_stats_on_load (which is enabled by default), or by using the APPEND hint of the direct load feature.
Note
_optimizer_gather_stats_on_load is an internal system variable that is enabled by default.
Online statistics collection is triggered in the following scenarios:
CREATE TABLE AS SELECT(CTAS)When the system variable
_optimizer_gather_stats_on_loadis set to True, the online collection of statistics is enabled by default for the CTAS statement. To disable this feature, you can use theNO_GATHER_OPTIMIZER_STATISTICShint, as shown in the following example:CREATE TABLE table_name AS SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/* FROM...INSERT INTO SELECTWhen the system variable
_optimizer_gather_stats_on_loadis set to True, you can use theGATHER_OPTIMIZER_STATISTICShint or theAPPENDhint to enable online statistics collection. Here is an example:INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name SELECT * FROM table_nameNotice
OceanBase Database does not support online statistics collection by using the
INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name VALUES()...statement.PDML
You can insert data in parallel by using the
ENABLE_PARALLEL_DMLhint or by executing theALTER SESSION ENABLE PARALLEL DML;statement.Here is an example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; INSERT INTO t1 VALUES (1),(2),(3),(4); obclient> EXPLAIN INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1; +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ================================================================================ | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | -------------------------------------------------------------------------------- | | |0 |OPTIMIZER STATS MERGE | |1 |16 | | | |1 | PX COORDINATOR | |1 |16 | | | |2 | EXCHANGE OUT DISTR |:EX10001 |1 |16 | | | |3 | INSERT | |1 |15 | | | |4 | EXCHANGE IN DISTR | |1 |2 | | | |5 | EXCHANGE OUT DISTR (PKEY RANDOM)|:EX10000 |1 |2 | | | |6 | OPTIMIZER STATS GATHER | |1 |2 | | | |7 | SUBPLAN SCAN |ANONYMOUS_VIEW1|1 |2 | | | |8 | PX BLOCK ITERATOR | |1 |2 | | | |9 | TABLE SCAN |T1 |1 |2 | | | ================================================================================ | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=256 | | 1 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | 2 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | dop=4 | | 3 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil) | | columns([{T2: ({T2: (T2.__pk_increment, T2.C1)})}]), partitions(p[0-3]), | | column_values([T_HIDDEN_PK], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]) | | 4 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | 5 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | (#keys=1, [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), dop=4 | | 6 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [calc_tablet_id(PARTITION_HASH(column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)))]), | | filter(nil), rowset=256 | | 7 - output([ANONYMOUS_VIEW1.C1]), filter(nil), rowset=256 | | access([ANONYMOUS_VIEW1.C1]) | | 8 - output([T1.C1]), filter(nil), rowset=256 | | 9 - output([T1.C1]), filter(nil), rowset=256 | | access([T1.C1]), partitions(p[0-3]) | | is_index_back=false, is_global_index=false, | | range_key([T1.__pk_increment]), range(MIN ; MAX)always true | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ 35 rows in setThe example generates the
OPTIMIZER STATS MERGEoperator, indicating that online statistics collection is performed.
The statistics collected online by users can be viewed in the corresponding views. For more information, see Overview of statistics collection methods.
Here is an example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1;
-- Oracle mode
OceanBase(SYS@SYS)> SELECT table_name, partition_name, num_rows, avg_row_len FROM user_tab_statistics WHERE table_name='T2';
+------------+----------------+----------+-------------+
| TABLE_NAME | PARTITION_NAME | NUM_ROWS | AVG_ROW_LEN |
+------------+----------------+----------+-------------+
| T2 | NULL | 4 | 20 |
| T2 | P0 | 1 | 20 |
| T2 | P1 | 2 | 20 |
| T2 | P2 | 1 | 20 |
| T2 | P3 | 0 | 0 |
+------------+----------------+----------+-------------+
-- MySQL mode
OceanBase(root@test)> SELECT table_name, partition_name, num_rows, avg_row_len FROM OCEANBASE.DBA_TAB_STATISTICS WHERE table_name='T2';
+------------+----------------+----------+-------------+
| table_name | partition_name | num_rows | avg_row_len |
+------------+----------------+----------+-------------+
| t2 | NULL | 4 | 20 |
| t2 | p0 | 1 | 20 |
| t2 | p1 | 1 | 20 |
| t2 | p2 | 1 | 20 |
| t2 | p3 | 1 | 20 |
+------------+----------------+----------+-------------+