A secondary index is an optional structure that users can choose to create based on their business needs. It helps accelerate queries on specific fields by allowing users to create indexes on those fields. This topic describes the advantages and disadvantages of using indexes, their availability and visibility, and the relationship between indexes and keys.
In OceanBase Database, which uses clustered index tables, a primary key index is automatically created for the user-specified primary key. Other indexes created by the user are secondary indexes.
The following example shows how to create a table named employee and insert three rows of data.
obclient> CREATE TABLE employee(id INT, name VARCHAR(20), PRIMARY KEY(id));
Query OK, 0 rows affected
obclient> INSERT INTO employee VALUES(1,'John'),(2,'Alice'),(3,'Bob');
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM employee;
+----+-------+
| ID | NAME |
+----+-------+
| 1 | John |
| 2 | Alice |
| 3 | Bob |
+----+-------+
3 rows in set
In the employee table, data is stored in order of the user-specified id field. When searching for data, you can quickly locate the specific data based on the id field using binary search. If you need to quickly find data based on the name field, you can create a secondary index on the name field. The following example shows how to do this:
CREATE INDEX name_index ON employee(name);
The data in the index table is as follows:
name: Alice, id: 2
name: Bob, id :3
name: John, id: 1
In the name_index index table, data is stored in order of the name field. When you specify the name field for queries, you can quickly locate the specific data using binary search.
Advantages and disadvantages of indexes
The advantages of indexes are as follows:
You can accelerate queries without modifying SQL statements. Only the necessary data is scanned.
Indexes store fewer columns, which reduces query I/O.
The disadvantages of indexes are as follows:
You need to have a deep understanding of your business and data model to determine which fields to create indexes on.
When your business changes, you need to re-evaluate whether the existing indexes still meet your needs.
Writing data requires maintaining the data in the index table, which consumes a certain amount of performance.
Index tables consume memory, disk space, and other resources.
Availability and visibility of indexes
Index availability
In a DROP PARTITION scenario, if the rebuild index field is not specified, the index is marked as UNUSABLE, indicating that it is unavailable. At this point, the index does not need to be maintained during DML operations, and the optimizer will ignore it.
Index visibility
Index visibility refers to whether the optimizer ignores the index. If the index is invisible, the optimizer will ignore it, but the index still needs to be maintained during DML operations. Generally, before deleting an index, you can set it to invisible to observe its impact on the business. If there is no impact, you can then delete the index.
Relationship between indexes and keys
A key is a set of columns or expressions on which you can create an index. However, indexes and keys are different. An index is an object stored in the data, while a key is a logical concept.
Function-based indexes
A function-based index is an index created based on the values of one or more columns in a table. Function-based indexes are an optimization technique that allows you to quickly locate matching function values during queries, avoiding redundant calculations and improving query efficiency.
Columnar indexes
A columnar index stores data in the index table in a columnar format. Starting from V4.3.0, OceanBase Database supports specifying the columnar storage format for a table when creating it. Indexes and data tables are both stored as single tables, so they also support storing data in the index table in a columnar format.
For more information about columnar storage, see Columnar storage.