Online statistics collection refers to the ability of the database optimizer to collect statistics during INSERT operations, without the need for manual system package calls. This allows for real-time statistics collection during large-scale data imports, without the need to initiate additional collection tasks, reducing operational overhead and improving statistics collection performance.
OceanBase Database supports online statistics collection using the GATHER_OPTIMIZER_STATISTICS/NO_GATHER_OPTIMIZER_STATISTICS hints and the system variable _optimizer_gather_stats_on_load (which is enabled by default). It also supports online statistics collection for direct load using the APPEND hint.
Note
_optimizer_gather_stats_on_load is an internal system variable that is enabled by default.
Online statistics collection can be triggered in the following scenarios:
CREATE TABLE...AS SELECT(CTAS)When
_optimizer_gather_stats_on_loadis set to True, online statistics collection is enabled by default for CTAS operations. To disable this feature, use theNO_GATHER_OPTIMIZER_STATISTICShint. Example:CREATE TABLE table_name AS SELECT /*+NO_GATHER_OPTIMIZER_STATISTICS*/* FROM...INSERT INTO SELECTWhen
_optimizer_gather_stats_on_loadis set to True, online statistics collection is enabled by default for operations using theGATHER_OPTIMIZER_STATISTICShint or theAPPENDhint. Example:INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name SELECT * FROM table_nameNotice
At present, OceanBase Database does not support online statistics collection for the
INSERT /*+GATHER_OPTIMIZER_STATISTICS*/ INTO table_name VALUES()...statement.PDML
Online statistics collection can be enabled using the
ENABLE_PARALLEL_DMLhint or by setting the system to parallel mode with theALTER SESSION ENABLE PARALLEL DML;command, allowing for parallel insert operations.Example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4; INSERT INTO t1 VALUES (1),(2),(3),(4); obclient> EXPLAIN INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1; +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ================================================================================ | | |ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)| | | -------------------------------------------------------------------------------- | | |0 |OPTIMIZER STATS MERGE | |1 |16 | | | |1 | PX COORDINATOR | |1 |16 | | | |2 | EXCHANGE OUT DISTR |:EX10001 |1 |16 | | | |3 | INSERT | |1 |15 | | | |4 | EXCHANGE IN DISTR | |1 |2 | | | |5 | EXCHANGE OUT DISTR (PKEY RANDOM)|:EX10000 |1 |2 | | | |6 | OPTIMIZER STATS GATHER | |1 |2 | | | |7 | SUBPLAN SCAN |ANONYMOUS_VIEW1|1 |2 | | | |8 | PX BLOCK ITERATOR | |1 |2 | | | |9 | TABLE SCAN |T1 |1 |2 | | | ================================================================================ | | Outputs & filters: | | ------------------------------------- | | 0 - output(nil), filter(nil), rowset=256 | | 1 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | 2 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil), rowset=256 | | dop=4 | | 3 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), filter(nil) | | columns([{T2: ({T2: (T2.__pk_increment, T2.C1)})}]), partitions(p[0-3]), | | column_values([T_HIDDEN_PK], [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]) | | 4 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | 5 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [PARTITION_ID], [T_HIDDEN_PK]), filter(nil), rowset=256 | | (#keys=1, [column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)]), dop=4 | | 6 - output([column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)], [calc_tablet_id(PARTITION_HASH(column_conv(NUMBER,PS:(-1,0),NULL,ANONYMOUS_VIEW1.C1)))]), | | filter(nil), rowset=256 | | 7 - output([ANONYMOUS_VIEW1.C1]), filter(nil), rowset=256 | | access([ANONYMOUS_VIEW1.C1]) | | 8 - output([T1.C1]), filter(nil), rowset=256 | | 9 - output([T1.C1]), filter(nil), rowset=256 | | access([T1.C1]), partitions(p[0-3]) | | is_index_back=false, is_global_index=false, | | range_key([T1.__pk_increment]), range(MIN ; MAX)always true | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ 35 rows in setIn the preceding example, the
OPTIMIZER STATS MERGEoperator is generated, indicating that online statistics collection was executed.
The statistics collected online can be viewed in the corresponding views. For more information, see Overview.
Example:
CREATE TABLE t1(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
CREATE TABLE t2(c1 INT) PARTITION BY HASH(c1) PARTITIONS 4;
INSERT INTO t1 VALUES (1),(2),(3),(4);
INSERT /*+ENABLE_PARALLEL_DML PARALLEL(4)*/ INTO t2 SELECT * FROM t1;
-- Oracle mode
OceanBase(SYS@SYS)> SELECT table_name, partition_name, num_rows, avg_row_len FROM user_tab_statistics WHERE table_name='T2';
+------------+----------------+----------+-------------+
| TABLE_NAME | PARTITION_NAME | NUM_ROWS | AVG_ROW_LEN |
+------------+----------------+----------+-------------+
| T2 | NULL | 4 | 20 |
| T2 | P0 | 1 | 20 |
| T2 | P1 | 2 | 20 |
| T2 | P2 | 1 | 20 |
| T2 | P3 | 0 | 0 |
+------------+----------------+----------+-------------+
-- MySQL mode
OceanBase(root@test)> SELECT table_name, partition_name, num_rows, avg_row_len FROM OCEANBASE.DBA_TAB_STATISTICS WHERE table_name='T2';
+------------+----------------+----------+-------------+
| table_name | partition_name | num_rows | avg_row_len |
+------------+----------------+----------+-------------+
| t2 | NULL | 4 | 20 |
| t2 | p0 | 1 | 20 |
| t2 | p1 | 1 | 20 |
| t2 | p2 | 1 | 20 |
| t2 | p3 | 1 | 20 |
+------------+----------------+----------+-------------+