CREATE INDEX

2025-11-14 07:33:32  Updated

Purpose

You can use this statement to create an index. An index is a database structure created for a table to sort data in one or more columns of the table in a specific order. It improves the query speed and reduces the performance overhead of database systems.

In the current version of OceanBase Database, indexes are classified into unique indexes and non-unique indexes. When you create an index, you can use the STORING(column_name,...) clause to specify the redundant columns in the index table. The redundant columns of an index table include the columns that you specified and the primary key columns if the table has a primary key, or hidden primary key columns if the table does not have a primary key. The additional columns stored in indexes can provide more choices for the database optimizer.

Limitations and considerations

  • If all index columns in a table are dropped, the index becomes invalid.
  • Full-text indexes can be created only for columns of the CHAR, VARCHAR, and TEXT data types.
  • At present, only local full-text indexes are supported.
  • The UNIQUE keyword cannot be specified when you create a full-text index.
  • If you want to create a full-text index on multiple columns, the columns must have the same character set.
  • The WITH PARSER {SPACE | NGRAM | BENG | IK} clause can be specified only when you create a full-text index.

Required privileges

To create indexes on a table by using the CREATE INDEX statement, you must have the INDEX privilege on the table. For more information about privileges in OceanBase Database, see Privilege types in MySQL mode.

Syntax

CREATE [hint_options] [index_type] INDEX [IF NOT EXISTS] index_name
    [USING BTREE] ON table_name (sort_column_key [, sort_column_key... ])
    [WITH PARSER tokenizer_option]
    [PARSER_PROPERTIES[=](parser_properties_list)]
    [index_option...] [partition_option] [index_column_group_option]

index_type:
    SPATIAL
    | UNIQUE
    | FULLTEXT

sort_column_key:
      column_name [(integer)] [ASC] [ID id]
    | (index_expr) [ASC] [ID id]

tokenizer_option:
    SPACE
    | NGRAM
    | BENG
    | IK
    | NGRAM2

parser_properties_list:
    parser_properties, [parser_properties]

parser_properties:
    min_token_size = int_value
    | max_token_size = int_value
    | ngram_token_size = int_value
    | ik_mode = 'char_value'
    | min_ngram_size = int_value
    | max_ngram_size = int_value

index_option:  
      GLOBAL
    | LOCAL
    | BLOCK_SIZE integer
    | COMMENT STRING_VALUE
    | STORING (column_name [, column_name...])
    | WITH_ROWID
    | WITH PARSER STRING_VALUE
    | index_using_algorithm
    | visibility_option
    | DATA_TABLE_ID data_table_id
    | INDEX_TABLE_ID index_table_id
    | VIRTUAL_COLUMN_ID virtual_column_id
    | MAX_USED_PART_ID used_part_id

index_column_group_option:
    WITH COLUMN GROUP([all columns, ]each column)

Parameters

Parameter Description
hint_options Optional. The hint options. Only the parallel option is supported, for example, /*+ parallel(10) */.
index_type Specifies the type of the index. For more information about the index type, see index_type.
IF NOT EXISTS Specifies not to create an index if it already exists. If the index already exists and IF NOT EXISTS is not specified, an error is reported.
index_name The name of the index to be created.
USING BTREE Optional. Specifies to create a B-tree index.

Note

OceanBase Database supports only USING BTREE.

table_name The table on which the index is created. You can directly specify the table name or specify the table name and the name of the database to which the table belongs in the schema_name.table_name format.
sort_column_key The key of a sort column. You can specify multiple sort columns for an index and separate them by commas (,). For more information, see sort_column_key.
WITH PARSER tokenizer_option Optional. Specifies the tokenizer for the full-text index. For more information, see tokenizer_option below.
PARSER_PROPERTIES[=](parser_properties_list) Optional. specifies the tokenizer properties. For more information, see parser_properties below.

Note

For OceanBase Database V4.3.5, specifying tokenizer properties (PARSER_PROPERTIES) is supported starting from V4.3.5 BP1.

index_option The index options. You can specify multiple index options for an index and separate them by spaces. For more information, see index_option.
partition_option The index partitioning option. You can specify HASH partitioning, KEY partitioning, RANGE partitioning, LIST partitioning, and external table partitioning.
index_column_group_option An index option. For more information, see index_column_group_option.

index_type

  • SPATIAL: Specifies that a spatial index is to be created.
  • UNIQUE: Specifies that a unique index is to be created.
  • FULLTEXT: Specifies that a full-text index is to be created.

For more information about how to create different types of indexes, see Create an index.

sort_column_key

  • column_name [(integer)] [ASC] [ID id]: specifies a column as the sort key.

    • column_name: the name of the column to sort.

    • integer: the length limit of the sort key. This option is optional.

    • ASC: specifies to sort column values in ascending order. The descending order is not supported. This option is optional.

    • ID id: the ID of the sort key. This option is optional.

    The following sample statement creates an index named index3 on the t3 table and sorts the index by the c1 column in ascending order.

    CREATE INDEX index3 ON t3 (c1 ASC);
    
  • (index_expr) [ASC] [ID id]: specifies to use an index expression as the sort key. You can define an index expression by using expressions or functions. The index expression setting contains the following options:

    • (index_expr): the index expression, which can be a Boolean expression, such as c1=c1. You cannot create function-based indexes on generated columns in the current version of OceanBase Database. For more information about the expressions supported by function-based indexes, see System functions supported for function-based indexes.

    • ASC: specifies to sort column values in ascending order. The descending order is not supported. This option is optional.

    • ID id: the ID of the sort key. This option is optional.

    The following example creates an index named index4 on the t4 table, uses c1+c2 as the index expression, and sorts the index in ascending order.

    CREATE INDEX index4 ON t4 ((c1 + c2) ASC);
    

When you create an index, you can specify multiple columns as the sort key and separate them with commas (,). The following example creates an index named index5 on the t5 table and uses the c1 column and the c2+c3 expression as the index sort key.

CREATE INDEX index5 ON t5 (c1, (c2+c3));

tokenizer_option

  • SPACE: the default value. Specifies to tokenize the text by space. The following attributes can be specified:

    Attribute Value range
    min_token_size [1, 16]
    max_token_size [10, 84]
  • NGRAM: specifies to tokenize the text based on N-Gram (Chinese). The following attribute can be specified:

    Attribute Value range
    ngram_token_size [1, 10]
  • NGRAM2: specifies to split text into continuous characters within the range of min_ngram_size to max_ngram_size. The following attributes can be specified:

    Attribute Value Range
    min_ngram_size [1, 16]
    max_ngram_size [1, 16]

    Note

    For OceanBase Database V4.3.5, the NGRAM2 tokenizer is supported starting from V4.3.5 BP2.

  • BENG: specifies to tokenize the text based on Beng (basic English). The following attributes can be specified:

    Attribute Value range
    min_token_size [1, 16]
    max_token_size [10, 84]
  • IK: specifies to tokenize the text based on IK (Chinese). Currently, only the utf-8 character set is supported. The following attribute can be specified:

    Attribute Value range
    ik_mode
    • smart
    • max_word

    Note

    For OceanBase Database V4.3.5, the IK tokenizer is supported starting from V4.3.5 BP1.

You can use the TOKENIZE function to view the results of tokenizing the text based on the specified tokenizer and JSON parameters.

parser_properties

  • min_token_size: the minimum token size. The default value is 3. The value range is 1 to 16.

  • max_token_size: the maximum token size. The default value is 84. The value range is 10 to 84.

  • ngram_token_size: the token size for the NGRAM tokenizer. The value range is 1 to 10. This attribute only takes effect for the NGRAM tokenizer.

  • ik_mode: the tokenization mode for the IK tokenizer. Valid values include:

    • smart: the default value. In this mode, words in the dictionary are used to improve the accuracy of tokenization, and the boundaries of words in the dictionary are prioritized. This may reduce unnecessary expansion.
    • max_word: In this mode, words defined in the dictionary are identified, but this does not affect the maximized expansion of the text. Tokenization will still extend the text into more words even if they are not defined in the dictionary.

Here is an example:

  1. Create a table named tbl1.

    CREATE TABLE tbl1 (col1 VARCHAR(200), col2 TEXT);
    
  2. Create a full-text index on the col2 column of the tbl1 table, and specify to use the IK tokenizer. The PARSER_PROPERTIES parameter is also set to the specified values.

    CREATE FULLTEXT INDEX fidx_tbl1 ON tbl1(col2) WITH PARSER IK PARSER_PROPERTIES=(ik_mode='max_word');
    

index_option

  • GLOBAL: specifies to create a global index.

  • LOCAL: specifies to create a local index.

  • BLOCK_SIZE integer: the size of an index block, that is, the number of bytes in each index block.

  • COMMENT STRING_VALUE: adds a comment to the index.

  • STORING (column_name [, column_name...]): the columns to be stored in the index. Separate multiple columns with commas (,).

  • WITH_ROWID: creates an index that contains the row ID.

  • WITH PARSER STRING_VALUE: the parser required for the index.

  • index_using_algorithm: the algorithm used for the index. You can specify USING BTREE or USING HASH. You cannot use the USING HASH clause when you create multi-valued indexes.

  • visibility_option: the visibility of the index.

  • DATA_TABLE_ID data_table_id: the ID of the data table to which the index belongs.

  • INDEX_TABLE_ID index_table_id: the ID of the index table.

  • VIRTUAL_COLUMN_ID virtual_column_id: the ID of the virtual column.

  • MAX_USED_PART_ID used_part_id: the maximum partition ID allowed for the index.

index_column_group_option

  • WITH COLUMN GROUP(all columns, each column): specifies to create a hybrid rowstore-columnstore index.
  • WITH COLUMN GROUP(each column): specifies to create a columnstore index.

Examples

Create a columnstore index for a table.

  1. Create a table named test_tbl1.

    CREATE TABLE test_tbl1 (col1 INT, col2 VARCHAR(50));
    
  2. On the test_tbl1 table, create a columnstore index named idx1_test_tbl1 that references the col1 column.

    CREATE INDEX idx1_test_tbl1 ON test_tbl1 (col1) WITH COLUMN GROUP(each column);
    

References

Contact Us