An index, also known as a secondary index, is an optional structure that users can choose to create based on their business needs. This helps accelerate queries on specific fields. This topic describes the advantages and disadvantages of using indexes, their availability and visibility, and the relationship between indexes and keys.
In OceanBase Database, a clustered index table model is used. This means that a primary key index is automatically created for the user-specified primary key, while other user-created indexes are secondary indexes.
The following example shows how to create a table named employee and insert three data records.
obclient> CREATE TABLE employee(id INT, name VARCHAR(20), PRIMARY KEY(id));
Query OK, 0 rows affected
obclient> INSERT INTO employee VALUES(1,'John'),(2,'Alice'),(3,'Bob');
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM employee;
+----+-------+
| ID | NAME |
+----+-------+
| 1 | John |
| 2 | Alice |
| 3 | Bob |
+----+-------+
3 rows in set
In the employee table, data is stored in an ordered manner based on the user-specified id. When searching for data, it can be quickly located based on the id using a binary search method. If you need to quickly search for data based on the name field, you can create a secondary index on the name field, as shown in the following example:
CREATE INDEX name_index ON employee(name);
The data in the index table is as follows:
name: Alice, id: 2
name: Bob, id :3
name: John, id: 1
In the name_index index table, data is stored in an ordered manner based on the name field. When you specify the name field for queries, data can be quickly located based on the name field using a binary search method.
Advantages and disadvantages of indexes
The advantages of indexes are as follows:
You can accelerate queries without modifying SQL statements. Only the required data is scanned.
Indexes store fewer columns, which reduces query I/O.
The disadvantages of indexes are as follows:
You need to have a deep understanding of your business and data model to decide which fields to create indexes on.
When your business changes, you need to re-evaluate whether the previously created indexes still meet your needs.
Writing data requires maintaining the data in the index table, which consumes a certain amount of performance.
Index tables consume memory, disk space, and other resources.
Availability and visibility of indexes
Index availability
In the Drop Partition scenario, if the rebuild index field is not specified, the index is marked as UNUSABLE, indicating that the index is unavailable. At this point, the index does not need to be maintained during DML operations, and the optimizer will ignore this index.
Index visibility
Index visibility refers to whether the optimizer ignores the index. If the index is invisible, the optimizer will ignore it, but the index still needs to be maintained during DML operations. Generally, before deleting an index, you can first set the index to invisible to observe its impact on the business. If no impact is observed, you can then proceed to delete the index.
Relationship between indexes and keys
A key is a set of columns or expressions on which you can create an index. However, indexes and keys are different. An index is an object stored in the data, while a key is a logical concept.
Function-based indexes
A function-based index is an index created based on the values of one or more columns in a table. This is an optimization technique that allows the optimizer to quickly locate matching function values during queries, thereby avoiding redundant calculations and improving query efficiency.
Columnar indexes
A columnar index stores data in a columnar format in the index table. Starting from V4.3.0, OceanBase Database allows you to specify the storage format of a table as columnar when you create the table. An index table is a table like a data table, so you can also store data in the index table in a columnar format.
For more information about columnar storage, see Columnar storage.