This topic describes how to create an index by using SQL statements, and provides the prerequisites, overview, limitations, and recommendations for creating an index. It also provides examples.
Note
This topic describes how to create an index by using the CREATE INDEX statement. For information about how to create a multivalue index, see the CREATE TABLE or ALTER TABLE statement. A multivalue index can be created only when you create a table.
Overview
An index is also called a secondary index. It is an optional table structure. OceanBase Database uses a clustered index table model. For the primary key specified by users, the system automatically generates a primary key index. For other indexes created by users, they are secondary indexes. You can decide based on your business needs which fields to create indexes on to speed up queries on those fields.
For more information about indexes in OceanBase Database, see Overview.
Prerequisites
Before you create an index, make sure that the following conditions are met:
You have deployed an OceanBase cluster and created a MySQL tenant. For more information about how to deploy an OceanBase cluster, see Overview.
You have connected to a MySQL tenant of OceanBase Database. For more information about how to connect to OceanBase Database, see Overview.
You have created a database. For more information about how to create a database, see Create a database.
You have created a table. For more information about how to create a table, see Create a table.
You have the
INDEXprivilege. For information about how to check the privileges of the current user, see Check the privileges of the current user. If you do not have theINDEXprivilege, contact the administrator to grant the privilege to you. For information about how to directly grant a privilege to a user, see Directly grant a privilege to a user.
Limitations on index creation
In OceanBase Database, the name of an index must be unique within the database (DataBase).
The length of an index name cannot exceed 64 bytes.
Limitations on unique indexes:
You can create multiple unique indexes in a table, but the values of the columns corresponding to each unique index must be unique.
If you want the combination of other columns to be globally unique in addition to the primary key, you must create a global unique index.
When you create a local unique index, the index must contain all the columns in the partitioning function of the table.
When you create a global index, the partitioning rules of the global index do not have to be the same as or consistent with those of the table.
Limitations on spatial indexes:
Spatial indexes support only local indexes and not global indexes.
The column on which you create a spatial index must be defined with the
SRIDattribute. Otherwise, the spatial index created on the column will not take effect in subsequent queries. For more information about theSRIDattribute, see Spatial reference systems (SRS).You can create a spatial index only on a column of a spatial data type. For more information about the spatial data types supported by OceanBase Database, see Overview of spatial data types.
The column attribute of the column on which you create a spatial index must be
NOT NULL. If the column attribute is notNOT NULL, you can run theALTER TABLEstatement to change the column attribute toNOT NULLand then create a spatial index. For more information about how to define the column attribute, see Define the constraint type for a column.OceanBase Database does not support modifying the
SRIDattribute of a column by using theALTER TABLEstatement. Therefore, you must define theSRIDattribute of the spatial column when you create the table so that the spatial index takes effect.
Limitations on multivalued indexes
A multivalued index can contain only one multivalued key part. However, the
CAST(... AS ... ARRAY)expression can reference multiple arrays in a JSON document, as shown in the following example:CAST(data->'$.arr[*][*]' AS UNSIGNED ARRAY)In this case, all values that match the JSON expression are stored as a single flat array in the index.
A multivalued index cannot be a covering index.
Only JSON expressions are allowed as the unique type of expressions in a multivalued key part. The expressions do not have to reference existing elements in the JSON documents inserted into the index columns, but must be syntactically valid.
Since the index records of a clustered index are scattered in a multivalued index, a multivalued index does not support range scans or index-only scans.
A multivalued index cannot be used in a foreign key specification.
You cannot define an index prefix for a multivalued index.
You cannot create a multivalued index on a Binary column.
Limitations and considerations on full-text indexes
You can create a full-text index on a non-partitioned table or a partitioned table with or without a primary key. The following limitations apply:
- A full-text index can be created only on a
CHAR,VARCHAR, orTEXTcolumn. - Only local (
LOCAL) full-text indexes can be created in the current version. - You cannot specify the
UNIQUEkeyword when you create a full-text index. - If you want to create a full-text index that involves multiple columns, you must ensure that these columns have the same character set.
- A full-text index can be created only on a
Recommendations for creating an index
Use a name that can succinctly describe the columns and purpose of the index, such as
idx_customer_name. For more information about naming conventions, see Overview of object naming conventions.If the partitioning rules and the number of partitions of a global index are the same as those of the primary table, we recommend that you create a local index.
We recommend that you issue the CREATE INDEX statement in parallel, and the number of statements should not exceed the number of CPU cores specified in the tenant specification. For example, if the tenant specification specifies 4 CPU cores (4C), you can create at most 4 indexes in parallel.
Avoid creating too many indexes on a table that is frequently updated. Create indexes on fields that are frequently used for queries.
We recommend that you do not create indexes on tables with a small amount of data. If the amount of data is small, the time spent on querying all data may be shorter than the time spent on traversing the index. In this case, the index may not provide optimization.
If the modification performance is much higher than the query performance, we recommend that you do not create an index.
To create an efficient index:
Include as many columns as possible in the index. This way, you can reduce the number of rows to be retrieved from the table as much as possible.
Place the equality condition at the beginning of the index.
Place the conditions for filtering and sorting large amounts of data at the beginning of the index.
Create an index by using the command-line tool
Use the CREATE INDEX statement to create an index.
Note
You can run the SHOW INDEX FROM table_name; statement to view the index information in a table. In this statement, table_name specifies the name of the table.
Example
Example 1: Create a unique index
If you want to ensure that the values in the indexed column are unique, you can create a unique index.
Use the following SQL statements to create a table named tbl1 and create a unique index named idx_tbl1_col2 based on the col2 column of the tbl1 table.
Create the
tbl1table.obclient [test]> CREATE TABLE tbl1(col1 INT, col2 INT, col3 VARCHAR(50), PRIMARY KEY (col1));Create a unique index named
idx_tbl1_col2based on thecol2column of thetbl1table.obclient [test]> CREATE UNIQUE INDEX idx_tbl1_col2 ON tbl1(col2);View the index information of the
tbl1table.obclient [test]> SHOW INDEX FROM tbl1;The return result is as follows:
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl1 | 0 | PRIMARY | 1 | col1 | A | NULL | NULL | NULL | | BTREE | available | | YES | NULL | | tbl1 | 0 | idx_tbl1_col2 | 1 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 2: Create a non-unique index
Use the following SQL statements to create a table named tbl2 and create an index named idx_tbl2_col2 based on the col2 column of the tbl2 table.
Create the
tbl2table.obclient [test]> CREATE TABLE tbl2(col1 INT, col2 INT, col3 VARCHAR(50), PRIMARY KEY (col1));Create an index named
idx_tbl2_col2based on thecol2column of thetbl2table.obclient [test]> CREATE INDEX idx_tbl2_col2 ON tbl2(col2);View the index information of the
tbl2table.obclient [test]> SHOW INDEX FROM tbl2;The return result is as follows:
+-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl2 | 0 | PRIMARY | 1 | col1 | A | NULL | NULL | NULL | | BTREE | available | | YES | NULL | | tbl2 | 1 | idx_tbl2_col2 | 1 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 3: Create a local index
A local index is also known as a partitioned index. The keyword for creating a local index is LOCAL. The partitioning key of a local index is the same as that of the table, and the number of partitions of a local index is the same as that of the table. Therefore, the partitioning mechanism of a local index is the same as that of the table. You can create a local index or a local unique index. If you want to use a local unique index to enforce the uniqueness of data, the local unique index must contain the table partitioning key.
Use the following SQL statements to create a table named tbl3_rl with range + list subpartitioning and create a local unique index named idx_tbl3_rl_col1_col2 based on the col1 and col2 columns of the tbl3_rl table.
Create the
tbl3_rltable with range + list subpartitioning.obclient [test]> CREATE TABLE tbl3_rl(col1 INT,col2 INT) PARTITION BY RANGE(col1) SUBPARTITION BY LIST(col2) (PARTITION p0 VALUES LESS THAN(100) (SUBPARTITION sp0 VALUES IN(1,3), SUBPARTITION sp1 VALUES IN(4,6), SUBPARTITION sp2 VALUES IN(7,9)), PARTITION p1 VALUES LESS THAN(200) (SUBPARTITION sp3 VALUES IN(1,3), SUBPARTITION sp4 VALUES IN(4,6), SUBPARTITION sp5 VALUES IN(7,9)) );Create a local unique index named
idx_tbl3_rl_col1_col2based on thecol1andcol2columns of thetbl3_rltable.obclient [test]> CREATE UNIQUE INDEX idx_tbl3_rl_col1_col2 ON tbl3_rl(col1,col2) LOCAL;View the index information of the
tbl3_rltable.obclient [test]> SHOW INDEX FROM tbl3_rl;The return result is as follows:
+---------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +---------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl3_rl | 0 | idx_tbl3_rl_col1_col2 | 1 | col1 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | | tbl3_rl | 0 | idx_tbl3_rl_col1_col2 | 2 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +---------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 4: Create a global index
The keyword for creating a global index is GLOBAL.
Use the following SQL statements to create a table named tbl4_h with hash partitioning and create a global index named idx_tbl4_h_col2 based on the col2 column of the tbl4_h table.
Create the
tbl4_htable with hash partitioning.obclient [test]> CREATE TABLE tbl4_h(col1 INT PRIMARY KEY,col2 INT) PARTITION BY HASH(col1) PARTITIONS 5;Create a global index named
idx_tbl4_h_col2based on thecol2column of thetbl4_htable. The index is a range-partitioned index.obclient [test]> CREATE INDEX idx_tbl4_h_col2 ON tbl4_h(col2) GLOBAL PARTITION BY RANGE(col2) (PARTITION p0 VALUES LESS THAN(100), PARTITION p1 VALUES LESS THAN(200), PARTITION p2 VALUES LESS THAN(300) );View the index information of the
tbl4_htable.obclient [test]> SHOW INDEX FROM tbl4_h;The return result is as follows:
+--------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +--------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl4_h | 0 | PRIMARY | 1 | col1 | A | NULL | NULL | NULL | | BTREE | available | | YES | NULL | | tbl4_h | 1 | idx_tbl4_h_col2 | 1 | col2 | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +--------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 2 rows in set
Example 5: Create a spatial index
A spatial index is a database index used for processing and optimizing spatial data. It is widely used in geographic information systems (GIS) and for storing and querying location data. OceanBase Database supports creating spatial indexes using the syntax for creating regular indexes, but the SPATIAL keyword must be used.
Use the following SQL statements to create a table named tbl5 and create a spatial index named idx_tbl5_g based on the g column of the tbl5 table.
Create the
tbl5table.obclient [test]> CREATE TABLE tbl5(id INT,name VARCHAR(20),g GEOMETRY NOT NULL SRID 0);Create a spatial index named
idx_tbl5_gbased on thegcolumn of thetbl5table.obclient [test]> CREATE INDEX idx_tbl5_g ON tbl5(g);View the index information of the
tbl5table.obclient [test]> SHOW INDEX FROM tbl5;The return result is as follows:
+-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl5 | 1 | idx_tbl5_g | 1 | g | A | NULL | NULL | NULL | | SPATIAL | available | | YES | NULL | +-------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ 1 row in set
Example 6: Create a function index
An index built based on the values of one or more columns in a table is called a function index. A function index is an optimization technique that allows you to quickly locate matching function values during queries, avoiding redundant calculations and improving query efficiency.
In the MySQL mode of OceanBase Database, the expressions of some system functions are not supported as function index expressions. For more information about the system functions that are supported and not supported as function index expressions, see List of system functions supported as function index expressions and List of system functions not supported as function index expressions.
Session variable fixation mechanism
The session variable fixation mechanism in a function index ensures that the calculation semantics are consistent when the index is created and used. This prevents the issue of "index hit but incorrect results" caused by changes in the session environment.
Working principle
When you create a function index, the system automatically analyzes the index expression to identify which session variables (such as time_zone, sql_mode, and collation_connection) the expression depends on. The current values of these variables are then saved in the index metadata.
When a query hits the function index:
- The system retrieves the session variable snapshot saved during index creation based on the
local_session_var_idin the index expression. - During expression type inference and calculation, the saved variable values are used instead of the current session variable values.
- The calculation results are ensured to be consistent with those during index creation.
Supported session variables
The session variables that can be fixed in a function index include:
| Variable | Description | Applicable mode |
|---|---|---|
time_zone |
The time zone setting, which affects the results of time-related functions (such as NOW() and CURTIME()). | MySQL/Oracle |
sql_mode |
The SQL mode, which affects SQL syntax parsing and execution behavior. | MySQL |
nls_date_format |
The date format in Oracle mode. | Oracle |
nls_timestamp_format |
The timestamp format in Oracle mode. | Oracle |
nls_timestamp_tz_format |
The timestamp format with time zone in Oracle mode. | Oracle |
collation_connection |
The connection character set collation, which affects string comparisons and function results. | MySQL |
max_allowed_packet |
The maximum allowed data packet size. | MySQL |
ob_compatibility_version |
The OceanBase compatibility version. | MySQL/Oracle |
The system dynamically infers the variables that need to be fixed based on the index expression. For example, if the index expression contains a time function such as NOW(), the time_zone variable is fixed. If the expression contains string functions, the collation_connection variable may be fixed.
Considerations
Avoid using volatile expressions
Although the system automatically fixes session variables, it is recommended to minimize the use of functions affected by the session in the index. Instead, use deterministic columns:
-- ❌ Not recommended (depends on time_zone)
CREATE INDEX idx_now ON logs ((DATE(NOW())));
-- ✅ Recommended (uses a deterministic column)
CREATE INDEX idx_log_date ON logs ((DATE(log_time)));
No manual management of session variables is required
The system automatically handles session variable consistency, so you do not need to worry about changes in the current session variables affecting the correctness of existing function indexes. During queries, the system automatically uses the variable snapshots saved during index creation.
Use the following SQL statement to create a table named tbl6 and create a function index based on the c_time column of the table.
Create the
tbl6table.obclient [test]> CREATE TABLE tbl6(id INT, name VARCHAR(18), c_time DATE);Create an index named
idx_tbl6_c_timeon thec_timecolumn of thetbl6table.obclient [test]> CREATE INDEX idx_tbl6_c_time ON tbl6((YEAR(c_time)));Use the following SQL statement to view the created function index.
SHOW INDEX FROM tbl6;The result is as follows:
obclient [test]> SHOW INDEX FROM tbl6; +-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+----------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+----------------+ | tbl6 | 1 | idx_tbl6_c_time | 1 | SYS_NC19$ | A | NULL | NULL | NULL | YES | BTREE | available | | YES | year(`c_time`) | +-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+----------------+ 1 row in set
Example 7: Create a columnstore index
A columnstore index is a type of index designed for columnar databases to optimize query performance in columnar storage.
Use the following SQL statement to create a columnstore index.
After creating the tbl7 table, create a columnstore index named idx1_tbl7_cg.
Create the
tbl7table.CREATE TABLE tbl7(id INT, name VARCHAR(20), age INT);Create a columnstore index named
idx1_tbl7_cgon thetbl7table and store redundant data of theagecolumn in the index table.CREATE INDEX idx1_tbl7_cg ON tbl7(id) STORING(age) WITH COLUMN GROUP(each column);Use the following SQL statement to view the created function index.
SHOW INDEX FROM tbl7;The result is as follows:
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl7 | 1 | idx1_tbl7_cg | 1 | id | A | NULL | NULL | NULL | YES | BTREE | available | | YES | NULL | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+
Example 8: Create a multivalued index
A multivalued index is a special type of database index used to handle scenarios where multiple values can be stored in a single column. OceanBase Database allows you to create a multivalued index when you create a table using the CREATE TABLE statement. Currently, you cannot add a multivalued index to an existing table, but you can use the ALTER TABLE statement to drop an existing index. For tables that already contain multivalued indexes, complex DML operations are supported.
Here is an example of creating a multivalued index:
Create a table named
json_tbl1and create a multivalued index namedidx1_json_tbl1at the same time.CREATE TABLE json_tbl1(col1 INT, col2 VARCHAR(18), col3 JSON, INDEX idx1_json_tbl1((CAST(col3->'$.ipcode' AS UNSIGNED ARRAY))) );Use the following statement to insert test data into the
json_tbl1table.INSERT INTO json_tbl1 VALUES(1, 'a1', '{"user":"Jk001","user_id":37,"ipcode":[94582,94536]}'), (2, 'a2', '{"user":"Jk002","user_id":22,"ipcode":[94568,94507,94582]}'), (3, 'a3', '{"user":"Jk003","user_id":31,"ipcode":[94477,94507]}'), (4, 'a4', '{"user":"Jk004","user_id":72,"ipcode":[94536]}'), (5, 'a5', '{"user":"Jk005","user_id":56,"ipcode":[94507,94582]}');
Here is another example of creating a multivalued index:
Create a table named json_tbl2 and create a composite multivalued index named idx1_json_tbl2 at the same time.
CREATE TABLE json_tbl2(col1 INT, col2 VARCHAR(18), col3 JSON,
INDEX idx1_json_tbl2(col1, (CAST(col3->'$.ipcode' AS UNSIGNED ARRAY)))
);
Example 9: Create a full-text index
Here is an example:
Create the ft_idx1_tbl8 full-text index on the tbl8 table.
Create the
tbl8table.CREATE TABLE tbl8(col1 INT, col2 VARCHAR(4096));Create the
ft_idx1_tbl8full-text index on thetbl8table, specify theIKtokenizer, and set the tokenizer properties by using thePARSER_PROPERTIESparameter.CREATE FULLTEXT INDEX ft_idx1_tbl8 ON tbl8(col2) WITH PARSER IK PARSER_PROPERTIES=(ik_mode='max_word');Run the following SQL statement to query the full-text index.
SHOW INDEX FROM tbl8;The return result is as follows:
+-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+ | tbl8 | 1 | ft_idx1_tbl8 | 1 | col2 | A | NULL | NULL | NULL | YES | FULLTEXT | available | | YES | NULL | +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+-----------+---------------+---------+------------+
What to do next
After you create an index, you may need to optimize query performance. For more information about SQL tuning, see Overview of SQL tuning.
References
- For more information about viewing indexes, see View indexes.
- For more information about managing indexes, see DROP INDEX and Delete an index.
- For more information about the system functions supported by function indexes, see List of system functions supported by function indexes.
