This topic describes how to create an index by using SQL statements. It also covers the prerequisites, concepts, limitations, and best practices for creating an index, and provides some examples.
Note
This topic mainly describes how to create an index by using the CREATE INDEX statement, with the exception of multi-valued indexes. Multi-valued indexes can be created only when you create a table. For other methods of creating an index, see the CREATE TABLE or ALTER TABLE statement.
Index overview
An index, also known as a secondary index, is an optional table structure. OceanBase Database uses clustered index tables. If you specify a primary key, the system automatically generates a primary key index. Other indexes that you create are secondary indexes. You can create indexes on fields based on your business needs to speed up queries on these fields.
For more information about indexes in OceanBase Database, see Index overview.
Prerequisites
Before you create an index, make sure that the following conditions are met:
You have deployed an OceanBase cluster and created a MySQL-compatible tenant. For more information about how to deploy an OceanBase cluster, see Deployment overview.
You have connected to the MySQL-compatible tenant of OceanBase Database. For more information about how to connect to a database, see Overview of connection methods.
You have created a database. For more information about how to create a database, see Create a database.
You have created a table. For more information about how to create a table, see Create a table.
You have the
INDEXprivilege. To view the privileges of the current user, see View user privileges. If you do not have this privilege, contact your administrator to request the privilege. For more information about how to directly grant a privilege, see Directly grant a privilege.
Limitations
In OceanBase Database, an index name must be unique within a database.
The length of an index name cannot exceed 64 bytes.
Limitations on unique indexes:
You can create multiple unique indexes on one table, but the values of the columns in the unique indexes must be unique.
If you want the combination of other columns to meet the global uniqueness requirement in addition to the primary key, you must use a global unique index.
When you use a local unique index, the index must contain all the columns in the partitioning function of the table.
When you use a global index, the partitioning rules of the global index do not necessarily have to be the same as or consistent with those of the table.
Limitations on spatial indexes:
Spatial indexes support only local indexes and do not support global indexes.
The columns on which you want to create spatial indexes must be defined with the
SRIDattribute. Otherwise, the spatial index will not take effect during subsequent queries. For more information aboutSRID, see Spatial reference system (SRS).You can create a spatial index only on a column containing spatial data. For more information about the spatial data types supported by OceanBase Database, see Overview of spatial data types.
The column attribute of the column on which you want to create a spatial index must be
NOT NULL. If it is notNOT NULL, you can execute theALTER TABLEstatement to change the column attribute toNOT NULLbefore you create the spatial index. For more information about how to change the column attribute, see Define the constraint type for a column.OceanBase Database does not support changing the
SRIDattribute of a spatial column by using theALTER TABLEstatement. Therefore, you must define theSRIDattribute of a spatial column when you create a table so that the spatial index takes effect.
Limitations on multi-valued indexes
Each multi-valued index can have only one multi-valued key part. However, the
CAST(... AS ... ARRAY)expression can reference multiple arrays within a JSON document, as shown below:CAST(data->'$.arr[*][*]' AS UNSIGNED ARRAY)In this case, all values matching the JSON expression are stored in the index as a single flat array.
Multi-valued indexes cannot be covering indexes.
The only type of expression allowed in the multi-valued key part is a JSON expression. The expression does not have to reference an existing element in the JSON document inserted into the indexed column, but it must be syntactically valid.
Because index records for the same clustered index entry are distributed across the multi-valued index, multi-valued indexes do not support range scans or index-only scans.
Multi-valued indexes cannot be used in foreign key definitions.
Index prefixes cannot be defined for multi-valued indexes.
Multi-valued indexes cannot be defined on binary data types.
Limitations and considerations on full-text indexes
You can create a full-text index on a non-partitioned table or a partitioned table without a primary key. The limitations on creating a full-text index are as follows:
- Full-text indexes apply only to
CHAR,VARCHAR, andTEXTcolumns. - Only local (
LOCAL) full-text indexes can be created. - The
UNIQUEkeyword cannot be specified when you create a full-text index. - If you want to create a full-text index that involves multiple columns, make sure that these columns have the same character set.
- Full-text indexes apply only to
Recommendations for creating indexes
We recommend that you use a name that clearly describes the columns included in the index and its purpose, such as
idx_customer_name. For more information about naming conventions, see Overview of object naming conventions.If the partitioning rules and number of partitions of a global index are the same as those of the primary table, we recommend that you create a local index.
We recommend that you set the maximum number of concurrent index creation statements to no more than the number of CPU cores specified in the tenant unit specification. For example, if the tenant unit specification specifies 4 CPU cores (4C), we recommend that you create no more than four indexes in parallel.
Avoid creating too many indexes on frequently updated tables. Create indexes on frequently queried columns.
We recommend that you do not create indexes on tables with a small amount of data. This is because queries may run faster when scanning all the data rather than traversing the index.
We recommend that you do not create indexes when the modification performance is much higher than the retrieval performance.
To create an efficient index:
The index should contain all the columns required for queries. The more columns included, the fewer rows need to be fetched from the table.
The equality condition should always be placed at the beginning.
The filter and sort conditions that involve a large amount of data should be placed at the beginning.
Create an index by using the command line
You can use the CREATE INDEX statement to create an index.
Note
You can use the SHOW INDEX FROM table_name; statement to view the index information in a table. Replace table_name with the name of the table.
Examples
Example 1: Create a unique index
You can create a unique index on a column that contains no duplicate values.
Run the following SQL statements to create a table named tbl1 and create a unique index on the col2 column of the tbl1 table.
Create the
tbl1table.obclient [test]> CREATE TABLE tbl1(col1 INT, col2 INT, col3 VARCHAR(50), PRIMARY KEY (col1));Create a unique index named
idx_tbl1_col2on thecol2column of thetbl1table.obclient [test]> CREATE UNIQUE INDEX idx_tbl1_col2 ON tbl1(col2);View the index information of the
tbl1table.obclient [test]> SHOW INDEX FROM tbl1;The return result is as follows:
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl1 | 0 | PRIMARY | 1 | col1 | A | NULL | NULL | NULL | | BTREE | available | | YES | NULL | | tbl1 | 0 | idx_tbl1_col2 | 1 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 2: Create a non-unique index
Run the following SQL statements to create a table named tbl2 and create an index on the col2 column of the tbl2 table.
Create the
tbl2table.obclient [test]> CREATE TABLE tbl2(col1 INT, col2 INT, col3 VARCHAR(50), PRIMARY KEY (col1));Create an index named
idx_tbl2_col2on thecol2column of thetbl2table.obclient [test]> CREATE INDEX idx_tbl2_col2 ON tbl2(col2);View the index information of the
tbl2table.obclient [test]> SHOW INDEX FROM tbl2;The return result is as follows:
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl2 | 0 | PRIMARY | 1 | col1 | A | NULL | NULL | NULL | | BTREE | available | | YES | NULL | | tbl2 | 1 | idx_tbl2_col2 | 1 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 3: Create a local index
A local index, also known as a partitioned index, is created by using the LOCAL keyword. The partitioning key of a local index is the same as that of the table, and the number of partitions of a local index is the same as that of the table. Therefore, the partitioning mechanism of a local index is the same as that of the table. Local indexes and unique local indexes are supported. If you want to use a unique local index to constrain the uniqueness of data, the unique local index must contain the partitioning key of the table.
Run the following SQL statements to create a range-list subpartitioned table named tbl3_rl and create a unique local index on the col1 and col2 columns of the tbl3_rl table.
Create the range-list subpartitioned table
tbl3_rl.obclient [test]> CREATE TABLE tbl3_rl(col1 INT,col2 INT) PARTITION BY RANGE(col1) SUBPARTITION BY LIST(col2) (PARTITION p0 VALUES LESS THAN(100) (SUBPARTITION sp0 VALUES IN(1,3), SUBPARTITION sp1 VALUES IN(4,6), SUBPARTITION sp2 VALUES IN(7,9)), PARTITION p1 VALUES LESS THAN(200) (SUBPARTITION sp3 VALUES IN(1,3), SUBPARTITION sp4 VALUES IN(4,6), SUBPARTITION sp5 VALUES IN(7,9)) );Create a unique local index named
idx_tbl3_rl_col1_col2on thecol1andcol2columns of thetbl3_rltable.obclient [test]> CREATE UNIQUE INDEX idx_tbl3_rl_col1_col2 ON tbl3_rl(col1,col2) LOCAL;View the index information of the
tbl3_rltable.obclient [test]> SHOW INDEX FROM tbl3_rl;The return result is as follows:
+---------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +---------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl3_rl | 0 | idx_tbl3_rl_col1_col2 | 1 | col1 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | | tbl3_rl | 0 | idx_tbl3_rl_col1_col2 | 2 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +---------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 4: Create a global index
A global index is created by using the GLOBAL keyword.
Run the following SQL statements to create a hash-partitioned table named tbl4_h and create a global index on the col2 column of the tbl4_h table.
Create the hash-partitioned table
tbl4_h.obclient [test]> CREATE TABLE tbl4_h(col1 INT PRIMARY KEY,col2 INT) PARTITION BY HASH(col1) PARTITIONS 5;Create a global index named
idx_tbl4_h_col2on thecol2column of thetbl4_htable. The index is a range-partitioned index.obclient [test]> CREATE INDEX idx_tbl4_h_col2 ON tbl4_h(col2) GLOBAL PARTITION BY RANGE(col2) (PARTITION p0 VALUES LESS THAN(100), PARTITION p1 VALUES LESS THAN(200), PARTITION p2 VALUES LESS THAN(300) );View the index information of the
tbl4_htable.obclient [test]> SHOW INDEX FROM tbl4_h;The return result is as follows:
+--------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +--------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl4_h | 0 | PRIMARY | 1 | col1 | A | NULL | NULL | NULL | | BTREE | available | | YES | NULL | | tbl4_h | 1 | idx_tbl4_h_col2 | 1 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +--------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 5: Create a spatial index
A spatial index is a database index that is used to process and optimize spatial data. It is widely used in geographic information systems (GIS) and for storing and querying location data. OceanBase Database supports the syntax for creating a spatial index, but the SPATIAL keyword must be used when creating a spatial index.
Run the following SQL statements to create a table named tbl5 and create a spatial index on the g column of the tbl5 table.
Create the
tbl5table.obclient [test]> CREATE TABLE tbl5(id INT,name VARCHAR(20),g GEOMETRY NOT NULL SRID 0);Create a spatial index named
idx_tbl5_gon thegcolumn of thetbl5table.obclient [test]> CREATE INDEX idx_tbl5_g ON tbl5(g);View the index information of the
tbl5table.obclient [test]> SHOW INDEX FROM tbl5;The return result is as follows:
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl5 | 1 | idx_tbl5_g | 1 | g | A | NULL | NULL | NULL | | SPATIAL | available | | YES | NULL | +-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 1 row in set
Example 6: Create a function-based index
An index created based on the values of one or more columns in a table after they are calculated is called a function-based index. A function-based index is an optimization technique that allows you to quickly locate the matching function values during a query, thus avoiding repeated calculations and improving query efficiency.
In MySQL-compatible mode of OceanBase Database, expressions for function-based indexes are limited. Expressions of certain system functions are prohibited as function-based indexes. For more information, see List of system functions supported by function-based indexes and List of system functions not supported by function-based indexes.
Session variable persistence mechanism
The session variable persistence mechanism in function-based indexes ensures that the computation semantics of the index are exactly the same when the index is created and when it is used, avoiding "index hit but incorrect result" due to session environment changes.
How it works
When you create a function-based index, the system automatically analyzes the index expression to identify which session variables (such as time_zone, sql_mode, collation_connection, and so on) the expression depends on, and saves the current values of these variables in the index metadata.
When a query hits the function-based index:
- The system retrieves the session variable snapshot saved when the index was created based on the
local_session_var_idof the index expression. - During expression type derivation and calculation, the saved variable values are used instead of the current session variable values.
- This ensures that the calculation result is exactly the same as when the index was created.
Supported session variables
The session variables that can be persisted in function-based indexes include:
| Variable name | Description | Applicable mode |
|---|---|---|
time_zone |
Time zone setting, which affects the results of time-related functions (such as NOW(), CURTIME()) | MySQL/Oracle |
sql_mode |
SQL mode, which affects SQL syntax parsing and execution behavior | MySQL |
nls_date_format |
Date format in Oracle-compatible mode | Oracle |
nls_timestamp_format |
Timestamp format in Oracle-compatible mode | Oracle |
nls_timestamp_tz_format |
Timestamp format with time zone in Oracle-compatible mode | Oracle |
collation_connection |
Connection collation, which affects string comparison and function results | MySQL |
max_allowed_packet |
Maximum allowed packet size | MySQL |
ob_compatibility_version |
OceanBase compatibility version | MySQL/Oracle |
The system dynamically derives which variables need to be persisted based on the index expression. For example, if the index expression contains time functions such as NOW(), time_zone is persisted; if it contains string functions, collation_connection may be persisted.
Considerations
Avoid relying on volatile expressions
Although the system automatically persists session variables, we recommend that you minimize the use of session-affected functions in indexes and prefer deterministic columns:
-- ❌ Not recommended (depends on time_zone)
CREATE INDEX idx_now ON logs ((DATE(NOW())));
-- ✅ Recommended (uses deterministic column)
CREATE INDEX idx_log_date ON logs ((DATE(log_time)));
No need to manually manage session variables
The system automatically handles session variable consistency. You do not need to worry about changes in the current session variables affecting the correctness of function-based indexes. The variable snapshot saved when the index was created is automatically used during queries.
Run the following SQL statements to create a table named tbl6 and create a function-based index on the c_time column of the tbl6 table.
Create the
tbl6table.obclient [test]> CREATE TABLE tbl6(id INT, name VARCHAR(18), c_time DATE);Create an index named
idx_tbl6_c_timeon thetbl6table. This index is created based on the year part of thec_timecolumn.obclient [test]> CREATE INDEX idx_tbl6_c_time ON tbl6((YEAR(c_time)));View the created function-based index by using the following SQL statement.
SHOW INDEX FROM tbl6;The return result is as follows:
obclient [test]> SHOW INDEX FROM tbl6; +-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+----------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+----------------+ | tbl6 | 1 | idx_tbl6_c_time | 1 | SYS_NC19$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | year(`c_time`) | +-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+----------------+ 1 row in set
Example 7: Create a column store index
A column store index is an index structure designed for columnar storage databases. It is mainly used to optimize the query performance of columnar storage.
Run the following SQL statements to create a column store index.
After creating the tbl7 table, create the column store index idx1_tbl7_cg.
Create the
tbl7table.CREATE TABLE tbl7(id INT, name VARCHAR(20), age INT);Create the column store index
idx1_tbl7_cgon thetbl7table, and store redundant data of theagecolumn in the index table.CREATE INDEX idx1_tbl7_cg ON tbl7(id) STORING(age) WITH COLUMN GROUP(each column);View the created column store index by using the following SQL statement.
SHOW INDEX FROM tbl7;The return result is as follows:
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl7 | 1 | idx1_tbl7_cg | 1 | id | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+
Example 8: Create a multi-valued index
A multi-valued index is a special database index used to handle cases where multiple values can be stored in the same column. OceanBase Database supports creating a multi-valued index when you create a table (CREATE TABLE). Adding a multi-valued index after a table is created is not supported. However, you can use the ALTER TABLE statement to drop an existing multi-valued index. For primary tables that contain multi-valued indexes, complex DML operations are supported.
Example of creating a multi-valued index:
Create the
json_tbl1table and create the multi-valued indexidx1_json_tbl1at the same time.CREATE TABLE json_tbl1(col1 INT, col2 VARCHAR(18), col3 JSON, INDEX idx1_json_tbl1((CAST(col3->'$.ipcode' AS UNSIGNED ARRAY))) );Insert test data into the
json_tbl1table by using the following statement.INSERT INTO json_tbl1 VALUES(1, 'a1', '{"user":"Jk001","user_id":37,"ipcode":[94582,94536]}'), (2, 'a2', '{"user":"Jk002","user_id":22,"ipcode":[94568,94507,94582]}'), (3, 'a3', '{"user":"Jk003","user_id":31,"ipcode":[94477,94507]}'), (4, 'a4', '{"user":"Jk004","user_id":72,"ipcode":[94536]}'), (5, 'a5', '{"user":"Jk005","user_id":56,"ipcode":[94507,94582]}');
Example of creating a composite multi-valued index:
Create the json_tbl2 table and create the composite multi-valued index idx1_json_tbl2 at the same time.
CREATE TABLE json_tbl2(col1 INT, col2 VARCHAR(18), col3 JSON,
INDEX idx1_json_tbl2(col1, (CAST(col3->'$.ipcode' AS UNSIGNED ARRAY)))
);
Example 9: Create a full-text index
Example:
Create the tbl8 table and then create the full-text index ft_idx1_tbl8.
Create the
tbl8table.CREATE TABLE tbl8(col1 INT, col2 VARCHAR(4096));Create the full-text index
ft_idx1_tbl8on thetbl8table. Specify the tokenizer asIKfor the full-text index and set the tokenizer properties by usingPARSER_PROPERTIES.CREATE FULLTEXT INDEX ft_idx1_tbl8 ON tbl8(col2) WITH PARSER IK PARSER_PROPERTIES=(ik_mode='max_word');View the created full-text index by using the following SQL statement.
SHOW INDEX FROM tbl8;The return result is as follows:
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl8 | 1 | ft_idx1_tbl8 | 1 | col2 | A | NULL | NULL | NULL | YES | FULLTEXT | available | | YES | NULL | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+
Next steps
After you create an index, you may need to optimize query performance. For more information about SQL tuning, see Overview of SQL optimization.
Related topics
- For more information about how to view indexes, see View indexes.
- For more information about index management, see DROP INDEX and Delete an index.
- For more information about the system functions supported by function-based indexes, see List of system functions supported by function-based indexes.