Online statistics collection refers to the process where the database optimizer can collect statistics during the execution of INSERT statements, without the need for manual system package calls. During batch data imports, statistics can be collected in real-time without the need to explicitly initiate a collection task, reducing O&M operations and improving statistics collection performance.
OceanBase Database uses the GATHER_OPTIMIZER_STATISTICS/NO_GATHER_OPTIMIZER_STATISTICS hints and the system variable _optimizer_gather_stats_on_load (which is enabled by default) for online statistics collection. It can also utilize the APPEND hint from the direct load feature to achieve the same.
Note
_optimizer_gather_stats_on_load is an internal system variable that is enabled by default.
The main scenarios for online statistics collection are as follows:
CREATE TABLE...AS SELECT(CTAS)When the system variable
_optimizer_gather_stats_on_loadis set to True, CTAS will by default enable online statistics collection. To disable this feature, use theNO_GATHER_OPTIMIZER_STATISTICShint. Example:CREATE TABLE table_name AS SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/* FROM...INSERT INTO SELECTWhen the system variable
_optimizer_gather_stats_on_loadis set to True, online statistics collection can be enabled using either theGATHER_OPTIMIZER_STATISTICShint or theAPPENDhint. Example:INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name SELECT * FROM table_nameNotice
At present, OceanBase Database does not support collecting statistics online by executing the
INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name VALUES()...statement.PDML
Online statistics collection can be enabled by using the
ENABLE_PARALLEL_DMLhint or by executing theALTER SESSION ENABLE PARALLEL DML;statement to set the system to parallel mode, allowing for parallel insert operations.Example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; INSERT INTO t1 VALUES (1),(2),(3),(4); obclient> EXPLAIN INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1; +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ================================================================================ | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | -------------------------------------------------------------------------------- | | |0 |OPTIMIZER STATS MERGE | |1 |16 | | | |1 | PX COORDINATOR | |1 |16 | | | |2 | EXCHANGE OUT DISTR |:EX10001 |1 |16 | | | |3 | INSERT | |1 |15 | | | |4 | EXCHANGE IN DISTR | |1 |2 | | | |5 | EXCHANGE OUT DISTR (PKEY RANDOM)|:EX10000 |1 |2 | | | |6 | OPTIMIZER STATS GATHER | |1 |2 | | | |7 | SUBPLAN SCAN |ANONYMOUS_VIEW1|1 |2 | | | |8 | PX BLOCK ITERATOR | |1 |2 | | | |9 | TABLE SCAN |T1 |1 |2 | | | ================================================================================ | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=256 | | 1 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | 2 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | dop=4 | | 3 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil) | | columns([{T2: ({T2: (T2.__pk_increment, T2.C1)})}]), partitions(p[0-3]), | | column_values([T_HIDDEN_PK], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]) | | 4 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | 5 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | (#keys=1, [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), dop=4 | | 6 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [calc_tablet_id(PARTITION_HASH(column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)))]), | | filter(nil), rowset=256 | | 7 - output([ANONYMOUS_VIEW1.C1]), filter(nil), rowset=256 | | access([ANONYMOUS_VIEW1.C1]) | | 8 - output([T1.C1]), filter(nil), rowset=256 | | 9 - output([T1.C1]), filter(nil), rowset=256 | | access([T1.C1]), partitions(p[0-3]) | | is_index_back=false, is_global_index=false, | | range_key([T1.__pk_increment]), range(MIN ; MAX)always true | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ 35 rows in setIn the preceding example, the
OPTIMIZER STATS MERGEoperator is generated, indicating that online statistics collection was executed.
The statistics collected online can be viewed in the corresponding views. For more information about these views, see Overview of statistics collection methods.
Example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1;
-- Oracle mode
OceanBase(SYS@SYS)> SELECT table_name, partition_name, num_rows, avg_row_len FROM user_tab_statistics WHERE table_name='T2';
+------------+----------------+----------+-------------+
| TABLE_NAME | PARTITION_NAME | NUM_ROWS | AVG_ROW_LEN |
+------------+----------------+----------+-------------+
| T2 | NULL | 4 | 20 |
| T2 | P0 | 1 | 20 |
| T2 | P1 | 2 | 20 |
| T2 | P2 | 1 | 20 |
| T2 | P3 | 0 | 0 |
+------------+----------------+----------+-------------+
-- MySQL mode
OceanBase(root@test)> SELECT table_name, partition_name, num_rows, avg_row_len FROM OCEANBASE.DBA_TAB_STATISTICS WHERE table_name='T2';
+------------+----------------+----------+-------------+
| table_name | partition_name | num_rows | avg_row_len |
+------------+----------------+----------+-------------+
| t2 | NULL | 4 | 20 |
| t2 | p0 | 1 | 20 |
| t2 | p1 | 1 | 20 |
| t2 | p2 | 1 | 20 |
| t2 | p3 | 1 | 20 |
+------------+----------------+----------+-------------+