Purpose
This statement is used to create an index. An index is a structure created on a table that sorts the values of one or more columns in the database table. Its main purpose is to improve query speed and reduce the performance overhead of the database system.
OceanBase Database currently supports unique and non-unique indexes. When creating an index, you can use the STORING(column_name,...) clause to specify redundant columns in the index table. In addition to the columns specified by the user, the index table will automatically include the primary key columns of the primary table (or the hidden primary key if the table has no explicit primary key). Storing additional columns in the index provides the database optimizer with more options for query optimization.
Limitations and considerations
- If all indexed columns are deleted from the table, the created indexes become invalid.
- Full-text indexes can only be applied to columns of type
CHAR,VARCHAR, andTEXT. - The current version only supports creating local (
LOCAL) full-text indexes. - The
UNIQUEkeyword cannot be specified when creating a full-text index. - If you want to create a full-text index involving multiple columns, ensure that these columns have the same character set.
- The
WITH PARSER {SPACE | NGRAM | BENG | IK}clause can only be used when creating a full-text index.
Privilege requirements
To execute the CREATE INDEX statement, the current user must have at least the INDEX privilege on the corresponding object. For more information about OceanBase Database privileges, see Privilege types in MySQL mode.
Syntax
CREATE [hint_options] [index_type] INDEX [IF NOT EXISTS] index_name
[USING BTREE] ON table_name (sort_column_key [, sort_column_key... ])
[WITH PARSER tokenizer_option]
[PARSER_PROPERTIES[=](parser_properties_list)]
[index_option...] [partition_option] [index_column_group_option]
index_type:
SPATIAL
| UNIQUE
| FULLTEXT
sort_column_key:
column_name [(integer)] [ASC] [ID id]
| (index_expr) [ASC] [ID id]
tokenizer_option:
SPACE
| NGRAM
| BENG
| IK
| NGRAM2
parser_properties_list:
parser_properties, [parser_properties]
parser_properties:
min_token_size = int_value
| max_token_size = int_value
| ngram_token_size = int_value
| ik_mode = 'char_value'
| min_ngram_size = int_value
| max_ngram_size = int_value
index_option:
GLOBAL
| LOCAL
| BLOCK_SIZE integer
| COMMENT STRING_VALUE
| STORING (column_name [, column_name...])
| WITH_ROWID
| WITH PARSER STRING_VALUE
| index_using_algorithm
| visibility_option
| DATA_TABLE_ID data_table_id
| INDEX_TABLE_ID index_table_id
| VIRTUAL_COLUMN_ID virtual_column_id
| MAX_USED_PART_ID used_part_id
index_column_group_option:
WITH COLUMN GROUP([all columns, ]each column)
Parameters
| Parameter | Description |
|---|---|
| hint_options | The hint options. This parameter is optional. Currently, only the parallel option is supported, for example, /*+ parallel(10) */. |
| index_type | The type of the index. This parameter is optional. For more information, see index_type. |
| IF NOT EXISTS | Indicates whether to skip the index creation if the index already exists. If the index exists and the IF NOT EXISTS option is not specified, an error will be returned. |
| index_name | The name of the index to be created. |
| USING BTREE | The type of the index. This parameter is optional.
NoteCurrently, only the |
| table_name | The name of the table on which to create the index. You can specify the table name directly or specify the database name and table name in the schema_name.table_name format. |
| sort_column_key | The key of the sorting column. You can specify multiple sorting columns when you create an index. Separate the sorting columns with commas (,). For more information, see sort_column_key. |
| WITH PARSER tokenizer_option | The tokenizer for the full-text index. This parameter is optional. For more information, see tokenizer_option. |
| PARSER_PROPERTIES[=](parser_properties_list) | The properties of the tokenizer. This parameter is optional. For more information, see parser_properties.
NoteStarting from OceanBase Database V4.3.5 BP1, you can specify the |
| index_option | The options of the index. You can specify multiple options. Separate the options with spaces. For more information, see index_option. |
| partition_option | The options for creating index partitions. You can specify hash partitions, key partitions, range partitions, list partitions, and external table partitions. |
| index_column_group_option | The options of the index. For more information, see index_column_group_option. |
index_type
SPATIAL: specifies that a spatial index is created.UNIQUE: specifies that a unique index is created.FULLTEXT: specifies that a full-text index is created.
For more information about how to create indexes of different types, see Create an index.
sort_column_key
column_name [(integer)] [ASC] [ID id]: specifies a column name as the sort key.column_name: the name of the column to be sorted.integer: an optional parameter that specifies the maximum length of the sort key.ASC: an optional parameter that specifies ascending order. Descending order (DESC) is not supported.ID id: an optional parameter that specifies the ID of the sort key.
For example, create an index named
index3on thec1column of thet3table and specify that the index is sorted in ascending order.CREATE INDEX index3 ON t3 (c1 ASC);(index_expr) [ASC] [ID id]: specifies an index expression as the sort key. An index expression can be defined by using an expression or a function. It can contain the following options:(index_expr): the index expression, which can be a boolean expression, such asc1=c1. OceanBase Database does not allow you to create a function index on a generated column. For more information about the expressions supported by a function index, see Supported system functions for function indexes.ASC: an optional parameter that specifies ascending order. Descending order (DESC) is not supported.ID id: an optional parameter that specifies the ID of the sort key.
For example, create an index named
index4on thet4table, specifyc1+c2as the index expression, and specify that the index is sorted in ascending order.CREATE INDEX index4 ON t4 ((c1 + c2) ASC);
You can specify multiple sort columns when you create an index. Separate the sort columns with commas (,). For example, create an index named index5 on the t5 table, specify c1 and c2+c3 as the sort keys, and specify that the index is sorted in ascending order.
CREATE INDEX index5 ON t5 (c1, (c2+c3));
tokenizer_option
SPACE: the default value. Specifies that the text is tokenized by using spaces. You can specify the following parameters:Parameter Value range min_token_size [1, 16] max_token_size [10, 84] NGRAM: specifies that the text is tokenized by using the N-Gram method. You can specify the following parameters:Parameter Value range ngram_token_size [1, 10] NGRAM2: specifies that the text is tokenized by using the N-Gram2 method. You can specify the following parameters:Parameter Value range min_ngram_size [1, 16] max_ngram_size [1, 16] Note
For OceanBase Database V4.3.5, the
NGRAM2tokenizer is supported starting from V4.3.5 BP2.BENG: specifies that the text is tokenized by using the Beng method. You can specify the following parameters:Parameter Value range min_token_size [1, 16] max_token_size [10, 84] IK: specifies that the text is tokenized by using the IK method. Only theutf-8character set is supported. You can specify the following parameter:Parameter Value range ik_mode smartmax_word
Note
For OceanBase Database V4.3.5, the
IKtokenizer is supported starting from V4.3.5 BP1.
You can use the TOKENIZE function to view the tokenization result of the text in the specified tokenizer and JSON format.
parser_properties
min_token_size: the minimum token length. Default value: 3. Value range: 1 to 16.max_token_size: the maximum token length. Default value: 84. Value range: 10 to 84.ngram_token_size: the token length for theNGRAMtokenizer. This parameter is valid only for theNGRAMtokenizer. Default value: 2. Value range: 1 to 10.ik_mode: the tokenization mode for theIKtokenizer. Valid values:smart: the default value. The dictionary is used to improve the accuracy of tokenization. The boundaries of the dictionary words are prioritized, which may reduce unnecessary expansions.max_word: the dictionary words are identified, but the maximum expansion of the text is not affected. Even if the dictionary contains the words, themax_wordmode still tries to split the text into more words.
Here is an example:
Create a table named
tbl1.CREATE TABLE tbl1 (col1 VARCHAR(200), col2 TEXT);Create a full-text index on the
col2column of thetbl1table. Specify theIKtokenizer and set the tokenizer parameters by using thePARSER_PROPERTIESoption.CREATE FULLTEXT INDEX fidx_tbl1 ON tbl1(col2) WITH PARSER IK PARSER_PROPERTIES=(ik_mode='max_word');
index_option
GLOBAL: creates a global index.LOCAL: creates a local index.BLOCK_SIZE integer: specifies the size of an index block, in bytes.COMMENT STRING_VALUE: adds a comment to the index.STORING (column_name [, column_name...]): specifies the columns to be stored in the index. Separate multiple columns with commas (,).WITH_ROWID: creates an index that contains row identifiers.WITH PARSER STRING_VALUE: specifies the parser required for the index.index_using_algorithm: specifies the algorithm used for the index. You can specify USING BTREE or USING HASH. However, you cannot create a multi-value index by using USING HASH.visibility_option: specifies the visibility of the index.DATA_TABLE_ID data_table_id: specifies the data table ID to which the index belongs.INDEX_TABLE_ID index_table_id: specifies the index table ID.VIRTUAL_COLUMN_ID virtual_column_id: specifies the virtual column ID.MAX_USED_PART_ID used_part_id: specifies the maximum partition ID used by the index.
index_column_group_option
WITH COLUMN GROUP(all columns, each column): creates a rowstore columnstore redundant index.WITH COLUMN GROUP(each column): creates a columnstore index.
Examples
Use the following SQL statement to create a columnstore index on a table.
Run the following SQL statement to create a table named
test_tbl1.CREATE TABLE test_tbl1 (col1 INT, col2 VARCHAR(50));Create a columnstore index named
idx1_test_tbl1on thetest_tbl1table and reference thecol1column.CREATE INDEX idx1_test_tbl1 ON test_tbl1 (col1) WITH COLUMN GROUP(each column);