Column skip index attribute

2025-11-27 02:38:06  Updated

Data skipping is an optimization method that performs calculation at the storage layer to skip unnecessary I/O. A skip index is a sparse index structure that provides this capability by storing pre-aggregated data to improve query efficiency. It extends the metadata stored in the index tree with column-level fields that store, for the range of data covered by each index node, the maximum value, minimum value, number of null values, and sum of the specified column. When pushed-down expressions are evaluated, these aggregates are used to dynamically prune data, which reduces scanning overhead.

The skip index is a column attribute. You can use the DESC table_name or SHOW CREATE TABLE table_name statement to query the column attributes of a table.

Note

The essence of pre-aggregation is to move computation from the query execution phase to the data writing phase and to store the pre-calculated results to improve query efficiency. This requires extra computation in compaction tasks, and the pre-aggregated data consumes storage space. Skip indexes are stored in the baseline data, and an update to data within a pre-aggregation range can invalidate the pre-aggregated data for that range. Therefore, frequent random updates can invalidate skip indexes and undermine the optimization effect.
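
To make the stored aggregates concrete, the following sketch emulates them with ordinary SQL. The table skidx_concept and its block_id column are hypothetical: block_id only stands in for the row range covered by one index node, which the storage layer manages internally and does not expose as a column.

```sql
-- Hypothetical table: block_id stands in for the row range of one index node.
CREATE TABLE skidx_concept (block_id NUMBER, v NUMBER);

INSERT INTO skidx_concept VALUES (1, 10);
INSERT INTO skidx_concept VALUES (1, 40);
INSERT INTO skidx_concept VALUES (1, NULL);
INSERT INTO skidx_concept VALUES (2, 120);
INSERT INTO skidx_concept VALUES (2, 300);

-- The per-node aggregates that a skip index would keep for column v.
SELECT block_id,
       MIN(v)              AS min_v,
       MAX(v)              AS max_v,
       COUNT(*) - COUNT(v) AS null_cnt,
       SUM(v)              AS sum_v
FROM skidx_concept
GROUP BY block_id
ORDER BY block_id;
```

In this sketch, block 1 yields a minimum of 10, a maximum of 40, one null value, and a sum of 50, and block 2 yields a minimum of 120, a maximum of 300, no null values, and a sum of 420. For a filter such as v > 100, block 1 can be skipped based on its maximum alone, and SUM(v) over whole blocks can be answered from the stored sums without reading the rows.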

DDL behaviors of skip indexes

  • The maintenance of skip index data is completed on the baseline data during major compactions. All DDL operations for updating aggregated data depend on progressive major compactions. That is, a skip index can be partially effective. For example, when a skip index is created on a column, each time a major compaction is completed, the skip index takes effect on the newly written data. After a full major compaction is completed and all data is rewritten, the skip index takes effect on all data in this column.

  • Skip indexes are a column attribute that can be applied by online DDL operations.

  • The skip index attribute of a column is restricted by the data type and characteristics of the column. A column with a cascading relationship, such as an indexed column, can inherit the corresponding aggregation attribute.

  • When you add the skip index attribute to a column, if the skip index size of the table may exceed the maximum storage size, the system reports an error. Using skip indexes is an optimization strategy that trades storage space for query performance. Therefore, when you attempt to add the skip index attribute to a column, make sure that your operation can improve the query performance, so as not to waste storage resources.
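
As a rough sketch of adding the attribute to an existing column, the statements below restate the column definition together with SKIP_INDEX(...). The table skidx_alter_demo is hypothetical, and the ALTER TABLE ... MODIFY form, including the empty option list used to remove the attribute, is an assumption here. Check the Modify a table topic for the exact syntax supported by your version.

```sql
-- Hypothetical table used only for this sketch.
CREATE TABLE skidx_alter_demo (
  id     NUMBER,
  amount NUMBER,
  note   VARCHAR2(64)
);

-- Assumed syntax: restate the column type and add the skip index attribute.
ALTER TABLE skidx_alter_demo MODIFY amount NUMBER SKIP_INDEX(MIN_MAX, SUM);

-- Assumed syntax: an empty option list removes the attribute again.
ALTER TABLE skidx_alter_demo MODIFY amount NUMBER SKIP_INDEX();
```

As described above, the aggregates are maintained during major compactions, so a newly added attribute covers existing data only after that data has been rewritten.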

Limitations on skip indexes

  • You cannot create a skip index for a JSON column or a spatial column.

  • You cannot create a skip index of the SUM type for a non-numeric column. Numeric data types include integer types, fixed-point types, and floating-point types. The bit value type is not supported.

  • You cannot create a skip index for a generated column.
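
The SUM limitation can be illustrated with a pair of hypothetical tables. This is a sketch only: the table and column names are made up, and the exact error reported for the second statement is not shown because it depends on the version.

```sql
-- Accepted: SUM on a numeric column, MIN_MAX on a character column.
CREATE TABLE skidx_limit_ok (
  qty   NUMBER SKIP_INDEX(MIN_MAX, SUM),
  label VARCHAR2(32) SKIP_INDEX(MIN_MAX)
);

-- Expected to be rejected: SUM on a non-numeric column.
CREATE TABLE skidx_limit_bad (
  label VARCHAR2(32) SKIP_INDEX(SUM)
);
```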

Specify the skip index attribute

Note

  • By default, no skip index is created for a rowstore table, whereas skip indexes of the MIN_MAX type are created for a columnstore table.
  • When you use the DESC table_name or SHOW CREATE TABLE table_name statement to query the column attributes of a table, the skip index attribute created by default for the table is not displayed.
  • A skip index of the SUM type is not created by default for a columnstore table, because it degrades the performance of direct load and major compaction tasks. If a skip index of the SUM type improves the query performance, you can create one to accelerate queries. Otherwise, we recommend that you drop it.
  • In OceanBase Database V4.3.0 and V4.3.1, a skip index of the SUM type is created for a columnstore table by default. After an upgrade to V4.3.2, such an index may become invalid. If the query performance of a columnstore table with SUM aggregation deteriorates, you can create a skip index of the SUM type to accelerate queries.

You can use SKIP_INDEX(skip_index_option) to specify the skip index attribute for a column. Valid values are as follows:

  • MIN_MAX: a skip index type that stores the maximum value, minimum value, and number of null values of the indexed column at the index node granularity. This is the most common skip index type. This type of skip index can accelerate the pushdown of filters and MIN/MAX aggregation.

  • SUM: a skip index type that is used to accelerate the pushdown of SUM aggregation for numeric values.

  • MIN_MAX, SUM: a combination that maintains both the MIN_MAX and SUM aggregates for the column.

For information about how to modify the skip index attribute, see Modify a table.

Example

Create a table and specify the skip index attribute for a column.

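CREATE TABLE test_skidx(
  col1 NUMBER SKIP_INDEX(MIN_MAX, SUM),     -- numeric column: both aggregate types
  col2 FLOAT SKIP_INDEX(MIN_MAX),           -- MIN_MAX only
  col3 VARCHAR2(1024) SKIP_INDEX(MIN_MAX),  -- non-numeric column: SUM is not allowed
  col4 CHAR(10)                             -- no skip index attribute specified
);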
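
The following statements sketch how the test_skidx table might then be inspected and queried. The queries are illustrative only: whether a given query is actually served from the skip index depends on factors such as whether a major compaction has rewritten the data, and the literal 100 is an arbitrary value.

```sql
-- Inspect the column attributes, including explicitly specified skip indexes.
DESC test_skidx;
SHOW CREATE TABLE test_skidx;

-- Filter pushdown: MIN_MAX on col2 allows ranges whose maximum is <= 100 to be skipped.
SELECT col1, col3 FROM test_skidx WHERE col2 > 100;

-- MIN/MAX aggregation pushdown: can use the MIN_MAX aggregates on col3.
SELECT MIN(col3), MAX(col3) FROM test_skidx;

-- SUM aggregation pushdown: can use the SUM aggregates on col1.
SELECT SUM(col1) FROM test_skidx;
```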
