An index, also known as a secondary index, is an optional structure that you can create on selected columns based on your business needs to speed up queries on those columns. This topic describes the advantages and disadvantages of indexes, their usability and visibility, and the relationship between indexes and keys.
OceanBase Database adopts a clustered index table model. A primary key index is automatically generated for the user-specified primary key, while indexes created by users on other columns are considered secondary indexes.
As shown in the following example, a table named EMPLOYEE is created and three records are inserted.
obclient> CREATE TABLE EMPLOYEE(id INT, name VARCHAR(20), PRIMARY KEY(id));
obclient> INSERT INTO EMPLOYEE VALUES(1,'John'),(2,'Alice'),(3,'Bob');
obclient> SELECT * FROM EMPLOYEE;
In the EMPLOYEE table, data is stored in order based on the user-specified id. When searching for data, you can quickly locate specific records using binary search on the id. If you need to quickly search by the name column, you can create a secondary index on the name column as shown below:
obclient> CREATE INDEX IDX_NAME ON EMPLOYEE(name);
The data in the index table is as follows:
name: Alice, id: 2
name: Bob, id: 3
name: John, id: 1
In the index table IDX_NAME, data is stored in order based on the name column. When you query by the name column, the system can quickly locate specific records using binary search on name.
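As a quick check, you can ask the optimizer how it plans to run a query that filters on name. The statement below is a minimal sketch that reuses the EMPLOYEE table and IDX_NAME index from this topic. The plan output depends on the database version and on table statistics, so it is not reproduced here.

```sql
-- Ask the optimizer for the execution plan of a lookup by name.
-- If IDX_NAME is chosen, the plan references the index instead of a scan of EMPLOYEE.
EXPLAIN SELECT id, name FROM EMPLOYEE WHERE name = 'Alice';
```

Because the index table already stores both name and id, this particular query may be answered from the index alone. With only three rows, the optimizer may still prefer the primary table.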
Advantages and disadvantages
The advantages of indexes are as follows:
You can accelerate queries without modifying SQL statements. Only the required data is scanned.
Indexes store fewer columns, which reduces query I/O.
The disadvantages of indexes are as follows:
You need a deep understanding of your business and data model to decide which columns to index.
When your business changes, you need to re-evaluate whether the existing indexes still meet your requirements.
Each write must also update the data in the index table, which adds overhead to DML operations.
Index tables consume memory, disk space, and other resources.
Usability and visibility
Usability of indexes
If you drop a partition without specifying that the index be rebuilt, the index is marked as UNUSABLE, which means that the index is unavailable. An unusable index is not maintained during DML operations, and the optimizer ignores it.
Visibility of indexes
Index visibility determines whether the optimizer can use the index. The optimizer ignores an invisible index, but the index is still maintained during DML operations. Before you drop an index, you can set it to invisible and observe the impact on your business. If no impact is observed, you can drop the index.
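The workflow above can be sketched as follows. This is an illustrative example that assumes a MySQL-compatible tenant and the `ALTER TABLE ... ALTER INDEX ... INVISIBLE` syntax. Check the SQL reference for your version before relying on it.

```sql
-- Step 1: hide the index from the optimizer. DML operations still maintain it.
ALTER TABLE EMPLOYEE ALTER INDEX IDX_NAME INVISIBLE;

-- Step 2, after observing the workload: if nothing regressed, drop the index.
DROP INDEX IDX_NAME ON EMPLOYEE;

-- Alternative to step 2: if queries slowed down, make the index visible again.
-- ALTER TABLE EMPLOYEE ALTER INDEX IDX_NAME VISIBLE;
```

Because an invisible index is still maintained, making it visible again does not require a rebuild, which is what makes this a low-risk way to test a removal.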
Relationship between indexes and keys
A key is a column or expression on which you can create an index. However, indexes and keys are different. An index is an object stored in the database, while a key is a logical concept.
Function-based indexes
A function-based index is an index created on the result of a function or expression computed from the values of one or more columns in a table. Because the computed values are stored in the index, queries that filter on the same expression can quickly locate matching values without recomputing the expression for each row, which improves query efficiency.
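As an illustration, the following sketch builds a function-based index on the uppercased name. It assumes a MySQL-compatible tenant, where an expression key part is wrapped in an extra pair of parentheses, and it assumes that UPPER is permitted in function-based indexes in your version. The index name IDX_UPPER_NAME is hypothetical.

```sql
-- Hypothetical index on UPPER(name). The inner parentheses mark an expression key part.
CREATE INDEX IDX_UPPER_NAME ON EMPLOYEE((UPPER(name)));

-- A query that filters on the same expression is a candidate for this index.
SELECT id, name FROM EMPLOYEE WHERE UPPER(name) = 'ALICE';
```

The query can benefit only when its predicate uses the same expression as the index definition. A filter on plain name would not match IDX_UPPER_NAME.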
Columnstore indexes
A columnstore index stores data in an index table in a columnar format. Starting from V4.3.0, OceanBase Database allows you to specify the storage format of a table as columnar when you create the table. Like a data table, an index table is also a table, and therefore supports storing data in a columnar format.
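A columnstore index might be declared as in the sketch below. The `WITH COLUMN GROUP(each column)` clause is assumed to be the V4.3.0 and later syntax for requesting a columnar format, and the index name IDX_NAME_CS is hypothetical. Confirm the exact clause in the CREATE INDEX reference for your version.

```sql
-- Hypothetical columnstore index on name.
-- WITH COLUMN GROUP(each column) asks for each indexed column to be stored in columnar format.
CREATE INDEX IDX_NAME_CS ON EMPLOYEE(name) WITH COLUMN GROUP(each column);
```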
For more information about columnar storage, see Columnar storage.