In OceanBase Database, "statistics" refers to optimizer statistics: a collection of data that describes the tables and columns in the database.
The optimizer's cost model selects a plan based on the statistics of the objects involved in a query, such as tables, columns, and predicates. Statistics are therefore key to selecting the optimal execution plan.
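To make the role of statistics concrete, the following is a minimal sketch of a textbook cardinality estimate and a toy cost comparison. It is not OceanBase's actual cost model: the function names, the uniform-distribution assumption, and the per-row cost constants are all hypothetical and chosen only for illustration.

```python
def estimate_equality_rows(row_count: int, null_count: int, ndv: int) -> float:
    """Estimate the rows matching `col = constant`.

    Assumes non-NULL values are uniformly distributed over the distinct
    values (a textbook simplification, not OceanBase's formula).
    """
    if row_count <= 0 or ndv <= 0:
        return 0.0
    non_null_rows = max(row_count - null_count, 0)
    return non_null_rows / ndv


def cheaper_plan(row_count: int, matching_rows: float,
                 scan_cost_per_row: float = 1.0,
                 lookup_cost_per_row: float = 4.0) -> str:
    """Pick between two toy plans using hypothetical per-row costs."""
    full_scan_cost = row_count * scan_cost_per_row
    index_lookup_cost = matching_rows * lookup_cost_per_row
    return "index_lookup" if index_lookup_cost < full_scan_cost else "full_scan"


# Statistics say: 1000 rows, 100 NULLs, 9 distinct values.
rows = estimate_equality_rows(row_count=1000, null_count=100, ndv=9)
plan = cheaper_plan(row_count=1000, matching_rows=rows)
```

In this sketch the same query can flip between plans purely because the row count or NDV changes, which is why stale or missing statistics can lead to a poor plan choice.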
In OceanBase Database, statistics are stored as ordinary data in internal tables. The optimizer maintains a local cache of statistics to speed up access to them.
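The idea of a local cache in front of the internal tables can be sketched as a simple read-through cache. This is only an illustration under stated assumptions: the `StatsCache` class, its key layout, and the `loader` callback are hypothetical, and the source does not describe how OceanBase actually evicts or invalidates cached statistics.

```python
from typing import Callable, Dict, Tuple

# Hypothetical cache key: (tenant_id, table_id, partition_id).
StatsKey = Tuple[int, int, int]


class StatsCache:
    """Read-through cache: load from the backing store only on a miss."""

    def __init__(self, loader: Callable[[StatsKey], dict]) -> None:
        self._loader = loader              # stands in for an internal-table read
        self._entries: Dict[StatsKey, dict] = {}
        self.misses = 0

    def get(self, key: StatsKey) -> dict:
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._loader(key)
        return self._entries[key]

    def invalidate(self, key: StatsKey) -> None:
        """Drop an entry, for example after statistics are re-collected."""
        self._entries.pop(key, None)


def fake_internal_table_read(key: StatsKey) -> dict:
    # Placeholder for a query against an internal table.
    return {"row_count": 1000}


cache = StatsCache(fake_internal_table_read)
first = cache.get((1, 500001, 0))
second = cache.get((1, 500001, 0))   # served from the cache, no second load
```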
OceanBase Database supports table-level statistics and column-level statistics.
Table-level statistics
The statistics of a table include the following information:
Basic table information, including tenant_id, table_id, and partition_id
Table statistics type, including GLOBAL, PARTITION, and SUBPARTITION
Number of table rows
Number of macroblocks occupied by the table
Number of microblocks occupied by the table
Average length of rows in the table
Timestamp when the statistics were collected
Lock status of table-level statistics
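The items above can be pictured as one record per table or partition. The following sketch models that record as a Python dataclass. The field names and the two derived helpers are hypothetical and do not reflect the schema of OceanBase's internal tables.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class StatType(Enum):
    GLOBAL = "GLOBAL"
    PARTITION = "PARTITION"
    SUBPARTITION = "SUBPARTITION"


@dataclass(frozen=True)
class TableStats:
    # Basic table information
    tenant_id: int
    table_id: int
    partition_id: int
    stat_type: StatType
    # Size information
    row_count: int
    macro_block_count: int
    micro_block_count: int
    avg_row_len: float           # average row length, assumed to be in bytes
    # Bookkeeping
    last_analyzed: datetime      # when the statistics were collected
    locked: bool                 # lock status of the table-level statistics

    def estimated_data_bytes(self) -> float:
        """Rough data volume: number of rows times average row length."""
        return self.row_count * self.avg_row_len

    def avg_rows_per_micro_block(self) -> float:
        if self.micro_block_count == 0:
            return 0.0
        return self.row_count / self.micro_block_count


table_stats = TableStats(
    tenant_id=1, table_id=500001, partition_id=0, stat_type=StatType.GLOBAL,
    row_count=10_000, macro_block_count=2, micro_block_count=40,
    avg_row_len=64.0, last_analyzed=datetime(2024, 1, 1), locked=False,
)
```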
Column-level statistics
The statistics of a column include the following information:
Basic information of the column, including tenant_id, table_id, partition_id, and column_id
Column statistics type, including GLOBAL, PARTITION, and SUBPARTITION
Number of distinct values (NDV) in the column
Number of NULL values in the column
Maximum and minimum values in the column
Size of the sampled data in the column
Histogram density of the column
Number of histogram buckets for the column
Histogram type (frequency, TopK, or hybrid histogram)
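As an illustration of how the column-level items fit together, the sketch below models them as a dataclass and estimates the selectivity of an equality predicate from a simplified frequency histogram. Several things here are assumptions rather than OceanBase behavior: the field names, the histogram represented as a plain value-to-count mapping, the sample size counted in rows, and the fallback to density when no frequency histogram is present.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class HistogramType(Enum):
    FREQUENCY = "frequency"
    TOP_K = "topk"
    HYBRID = "hybrid"


@dataclass
class ColumnStats:
    tenant_id: int
    table_id: int
    partition_id: int
    column_id: int
    stat_type: str                 # "GLOBAL", "PARTITION", or "SUBPARTITION"
    ndv: int                       # number of distinct values
    null_count: int
    min_value: object
    max_value: object
    sample_size: int               # assumed here to be a row count
    density: float
    bucket_count: int
    histogram_type: Optional[HistogramType] = None
    # Simplified frequency histogram: value -> rows seen in the sample.
    frequency_buckets: Dict[object, int] = field(default_factory=dict)


def equality_selectivity(stats: ColumnStats, value: object) -> float:
    """Estimated fraction of rows where the column equals `value`."""
    if stats.sample_size <= 0:
        return 0.0
    if stats.histogram_type is HistogramType.FREQUENCY:
        # A frequency histogram records every sampled value exactly.
        return stats.frequency_buckets.get(value, 0) / stats.sample_size
    # Without an exact histogram, fall back to the stored density.
    return stats.density


skewed = ColumnStats(
    tenant_id=1, table_id=500001, partition_id=0, column_id=16,
    stat_type="GLOBAL", ndv=2, null_count=0, min_value="a", max_value="b",
    sample_size=1000, density=0.5, bucket_count=2,
    histogram_type=HistogramType.FREQUENCY,
    frequency_buckets={"a": 700, "b": 300},
)
```

With the histogram, the skew between "a" and "b" is visible to the estimate. Without it, both values would receive the same density-based selectivity.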