Purpose
You can use this statement to create an index. An index is a database structure created for a table to sort data in one or more columns of the table in a specific order. It improves the query speed and reduces the performance overhead of database systems.
In the current version of OceanBase Database, indexes are classified into unique indexes and non-unique indexes. When you create an index, you can use the STORING(column_name,...) clause to specify the redundant columns in the index table. The redundant columns of an index table include the columns that you specified and the primary key columns if the table has a primary key, or hidden primary key columns if the table does not have a primary key. The additional columns stored in indexes can provide more choices for the database optimizer.
Limitations and considerations
- If all index columns in a table are dropped, the index becomes invalid.
- Full-text indexes can be created only for columns of the
CHAR,VARCHAR, andTEXTdata types. - At present, only local full-text indexes are supported.
- The
UNIQUEkeyword cannot be specified when you create a full-text index. - If you want to create a full-text index on multiple columns, the columns must have the same character set.
- The
WITH PARSER {SPACE | NGRAM | BENG | IK}clause can be specified only when you create a full-text index.
Required privileges
To create indexes on a table by using the CREATE INDEX statement, you must have the INDEX privilege on the table. For more information about privileges in OceanBase Database, see Privilege types in MySQL mode.
Syntax
CREATE [hint_options] [index_type] INDEX [IF NOT EXISTS] index_name
[USING BTREE] ON table_name (sort_column_key [, sort_column_key... ])
[WITH PARSER tokenizer_option]
[PARSER_PROPERTIES[=](parser_properties_list)]
[index_option...] [partition_option] [index_column_group_option]
index_type:
SPATIAL
| UNIQUE
| FULLTEXT
sort_column_key:
column_name [(integer)] [ASC] [ID id]
| (index_expr) [ASC] [ID id]
tokenizer_option:
SPACE
| NGRAM
| BENG
| IK
| NGRAM2
parser_properties_list:
parser_properties, [parser_properties]
parser_properties:
min_token_size = int_value
| max_token_size = int_value
| ngram_token_size = int_value
| ik_mode = 'char_value'
| min_ngram_size = int_value
| max_ngram_size = int_value
index_option:
GLOBAL
| LOCAL
| BLOCK_SIZE integer
| COMMENT STRING_VALUE
| STORING (column_name [, column_name...])
| WITH_ROWID
| WITH PARSER STRING_VALUE
| index_using_algorithm
| visibility_option
| DATA_TABLE_ID data_table_id
| INDEX_TABLE_ID index_table_id
| VIRTUAL_COLUMN_ID virtual_column_id
| MAX_USED_PART_ID used_part_id
index_column_group_option:
WITH COLUMN GROUP([all columns, ]each column)
Parameters
| Parameter | Description |
|---|---|
| hint_options | Optional. The hint options. Only the parallel option is supported, for example, /*+ parallel(10) */. |
| index_type | Specifies the type of the index. For more information about the index type, see index_type. |
| IF NOT EXISTS | Specifies not to create an index if it already exists. If the index already exists and IF NOT EXISTS is not specified, an error is reported. |
| index_name | The name of the index to be created. |
| USING BTREE | Optional. Specifies to create a B-tree index.
NoteOceanBase Database supports only |
| table_name | The table on which the index is created. You can directly specify the table name or specify the table name and the name of the database to which the table belongs in the schema_name.table_name format. |
| sort_column_key | The key of a sort column. You can specify multiple sort columns for an index and separate them by commas (,). For more information, see sort_column_key. |
| WITH PARSER tokenizer_option | Optional. Specifies the tokenizer for the full-text index. For more information, see tokenizer_option below. |
| PARSER_PROPERTIES[=](parser_properties_list) | Optional. specifies the tokenizer properties. For more information, see parser_properties below.
NoteFor OceanBase Database V4.3.5, specifying tokenizer properties ( |
| index_option | The index options. You can specify multiple index options for an index and separate them by spaces. For more information, see index_option. |
| partition_option | The index partitioning option. You can specify HASH partitioning, KEY partitioning, RANGE partitioning, LIST partitioning, and external table partitioning. |
| index_column_group_option | An index option. For more information, see index_column_group_option. |
index_type
SPATIAL: Specifies that a spatial index is to be created.UNIQUE: Specifies that a unique index is to be created.FULLTEXT: Specifies that a full-text index is to be created.
For more information about how to create different types of indexes, see Create an index.
sort_column_key
column_name [(integer)] [ASC] [ID id]: specifies a column as the sort key.column_name: the name of the column to sort.integer: the length limit of the sort key. This option is optional.ASC: specifies to sort column values in ascending order. The descending order is not supported. This option is optional.ID id: the ID of the sort key. This option is optional.
The following sample statement creates an index named
index3on thet3table and sorts the index by thec1column in ascending order.CREATE INDEX index3 ON t3 (c1 ASC);(index_expr) [ASC] [ID id]: specifies to use an index expression as the sort key. You can define an index expression by using expressions or functions. The index expression setting contains the following options:(index_expr): the index expression, which can be a Boolean expression, such asc1=c1. You cannot create function-based indexes on generated columns in the current version of OceanBase Database. For more information about the expressions supported by function-based indexes, see System functions supported for function-based indexes.ASC: specifies to sort column values in ascending order. The descending order is not supported. This option is optional.ID id: the ID of the sort key. This option is optional.
The following example creates an index named
index4on thet4table, usesc1+c2as the index expression, and sorts the index in ascending order.CREATE INDEX index4 ON t4 ((c1 + c2) ASC);
When you create an index, you can specify multiple columns as the sort key and separate them with commas (,). The following example creates an index named index5 on the t5 table and uses the c1 column and the c2+c3 expression as the index sort key.
CREATE INDEX index5 ON t5 (c1, (c2+c3));
tokenizer_option
SPACE: the default value. Specifies to tokenize the text by space. The following attributes can be specified:Attribute Value range min_token_size [1, 16] max_token_size [10, 84] NGRAM: specifies to tokenize the text based on N-Gram (Chinese). The following attribute can be specified:Attribute Value range ngram_token_size [1, 10] NGRAM2: specifies to split text into continuous characters within the range ofmin_ngram_sizetomax_ngram_size. The following attributes can be specified:Attribute Value Range min_ngram_size [1, 16] max_ngram_size [1, 16] Note
For OceanBase Database V4.3.5, the
NGRAM2tokenizer is supported starting from V4.3.5 BP2.BENG: specifies to tokenize the text based on Beng (basic English). The following attributes can be specified:Attribute Value range min_token_size [1, 16] max_token_size [10, 84] IK: specifies to tokenize the text based on IK (Chinese). Currently, only theutf-8character set is supported. The following attribute can be specified:Attribute Value range ik_mode smartmax_word
Note
For OceanBase Database V4.3.5, the
IKtokenizer is supported starting from V4.3.5 BP1.
You can use the TOKENIZE function to view the results of tokenizing the text based on the specified tokenizer and JSON parameters.
parser_properties
min_token_size: the minimum token size. The default value is 3. The value range is 1 to 16.max_token_size: the maximum token size. The default value is 84. The value range is 10 to 84.ngram_token_size: the token size for theNGRAMtokenizer. The value range is 1 to 10. This attribute only takes effect for theNGRAMtokenizer.ik_mode: the tokenization mode for theIKtokenizer. Valid values include:smart: the default value. In this mode, words in the dictionary are used to improve the accuracy of tokenization, and the boundaries of words in the dictionary are prioritized. This may reduce unnecessary expansion.max_word: In this mode, words defined in the dictionary are identified, but this does not affect the maximized expansion of the text. Tokenization will still extend the text into more words even if they are not defined in the dictionary.
Here is an example:
Create a table named
tbl1.CREATE TABLE tbl1 (col1 VARCHAR(200), col2 TEXT);Create a full-text index on the
col2column of thetbl1table, and specify to use theIKtokenizer. ThePARSER_PROPERTIESparameter is also set to the specified values.CREATE FULLTEXT INDEX fidx_tbl1 ON tbl1(col2) WITH PARSER IK PARSER_PROPERTIES=(ik_mode='max_word');
index_option
GLOBAL: specifies to create a global index.LOCAL: specifies to create a local index.BLOCK_SIZE integer: the size of an index block, that is, the number of bytes in each index block.COMMENT STRING_VALUE: adds a comment to the index.STORING (column_name [, column_name...]): the columns to be stored in the index. Separate multiple columns with commas (,).WITH_ROWID: creates an index that contains the row ID.WITH PARSER STRING_VALUE: the parser required for the index.index_using_algorithm: the algorithm used for the index. You can specifyUSING BTREEorUSING HASH. You cannot use theUSING HASHclause when you create multi-valued indexes.visibility_option: the visibility of the index.DATA_TABLE_ID data_table_id: the ID of the data table to which the index belongs.INDEX_TABLE_ID index_table_id: the ID of the index table.VIRTUAL_COLUMN_ID virtual_column_id: the ID of the virtual column.MAX_USED_PART_ID used_part_id: the maximum partition ID allowed for the index.
index_column_group_option
WITH COLUMN GROUP(all columns, each column): specifies to create a hybrid rowstore-columnstore index.WITH COLUMN GROUP(each column): specifies to create a columnstore index.
Examples
Create a columnstore index for a table.
Create a table named
test_tbl1.CREATE TABLE test_tbl1 (col1 INT, col2 VARCHAR(50));On the
test_tbl1table, create a columnstore index namedidx1_test_tbl1that references thecol1column.CREATE INDEX idx1_test_tbl1 ON test_tbl1 (col1) WITH COLUMN GROUP(each column);