An index, also known as a secondary index, is an optional structure that users can choose to create on specific fields based on their business needs. This helps accelerate queries on those fields. This topic describes the advantages and disadvantages of using indexes, their availability and visibility, and their relationship with keys.
OceanBase Database uses the clustered index table model: a primary key index is automatically created on the user-specified primary key, and all other user-created indexes are secondary indexes.
For example, you can create a table named employee and insert three data records.
obclient> CREATE TABLE employee(id INT, name VARCHAR(20), PRIMARY KEY(id));
Query OK, 0 rows affected
obclient> INSERT INTO employee VALUES(1,'John'),(2,'Alice'),(3,'Bob');
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM employee;
+----+-------+
| ID | NAME  |
+----+-------+
|  1 | John  |
|  2 | Alice |
|  3 | Bob   |
+----+-------+
3 rows in set
The data in the employee table is stored in order of the user-specified primary key, id. To locate a specific row by id, a binary search on the id field is sufficient. To quickly locate data based on the name field instead, you can create a secondary index on the name field, as shown in the following example:
CREATE INDEX name_index ON employee(name);
The data in the index table is as follows:
name: Alice, id: 2
name: Bob, id: 3
name: John, id: 1
The data in the name_index index table is stored in order of the name field. To locate a specific row by name, a binary search on the name field of the index table is sufficient.
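The two lookup paths described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how OceanBase Database stores data: the lists `table` and `name_index` and the helper functions `find_by_id` and `find_by_name` are hypothetical names introduced for this sketch.

```python
from bisect import bisect_left

# Rows of the employee table, kept sorted by the primary key (id).
table = [(1, "John"), (2, "Alice"), (3, "Bob")]

# Hypothetical stand-in for name_index: (name, id) pairs sorted by name.
name_index = sorted((name, emp_id) for emp_id, name in table)


def find_by_id(emp_id):
    """Binary search on the primary key order of the table."""
    pos = bisect_left(table, (emp_id,))
    if pos < len(table) and table[pos][0] == emp_id:
        return table[pos]
    return None


def find_by_name(name):
    """Binary search the index by name, then fetch each row by its id."""
    pos = bisect_left(name_index, (name,))
    rows = []
    while pos < len(name_index) and name_index[pos][0] == name:
        rows.append(find_by_id(name_index[pos][1]))
        pos += 1
    return rows
```

In this model, `find_by_name("Bob")` first locates the entry for Bob in the name-ordered list and then uses the id stored in that entry to fetch the full row from the id-ordered list. Neither step scans the whole table.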
Advantages and disadvantages of indexes
The advantages of indexes are as follows:
You can accelerate queries without modifying SQL statements, because a query that uses an index scans only the data it requires.
Indexes typically contain fewer columns, which can reduce query I/O.
The disadvantages of indexes are as follows:
You need a deep understanding of your business and data model to decide which fields to create indexes on.
When your business changes, you need to re-evaluate whether the existing indexes still meet your needs.
Maintaining indexes during data writes adds overhead and reduces write performance.
Indexes occupy resources such as memory and disk space.
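The write overhead and the narrower index entries mentioned above can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not OceanBase Database internals: the `EmployeeTable` class, the `structure_writes` counter, and the extra `department` column with its sample values are all hypothetical.

```python
from bisect import insort


class EmployeeTable:
    """Toy model: rows sorted by id, plus an optional index on name."""

    def __init__(self, with_name_index):
        self.rows = []  # (id, name, department), sorted by id
        # (name, id) entries sorted by name, or None when no index exists.
        self.name_index = [] if with_name_index else None
        self.structure_writes = 0  # number of sorted structures updated

    def insert(self, emp_id, name, department):
        insort(self.rows, (emp_id, name, department))
        self.structure_writes += 1
        if self.name_index is not None:
            # Extra work on every write: keep the index in sync.
            insort(self.name_index, (name, emp_id))
            self.structure_writes += 1


plain = EmployeeTable(with_name_index=False)
indexed = EmployeeTable(with_name_index=True)
# Hypothetical sample rows; the department values are made up for illustration.
for row in [(1, "John", "Sales"), (2, "Alice", "R&D"), (3, "Bob", "Sales")]:
    plain.insert(*row)
    indexed.insert(*row)

# An index entry holds two of the three columns, so scanning it reads less data.
index_entry_width = len(indexed.name_index[0])
row_width = len(indexed.rows[0])
```

After the three inserts, the indexed table has updated twice as many structures as the plain one and also holds a second copy of the name and id values. That is the write cost and the storage cost listed above. In exchange, a query that needs only name and id can be answered from the narrower index entries.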
Availability and visibility of indexes
Index availability
When you drop a partition without specifying the rebuild index option, the index is marked as UNUSABLE, indicating that it is unavailable. An unusable index is not maintained during DML operations, and the optimizer ignores it.
Index visibility
Index visibility determines whether the optimizer ignores the index. The optimizer ignores an invisible index, but the index is still maintained during DML operations. Before dropping an index, you can typically set it to invisible and observe the impact on your business. If there is no impact, you can then drop the index.
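The difference between availability and visibility comes down to two independent properties of an index. The following sketch restates the rules from the two subsections above as code. `IndexState`, `maintained_by_dml`, and `considered_by_optimizer` are hypothetical names used only for illustration and do not correspond to any OceanBase Database API.

```python
from dataclasses import dataclass


@dataclass
class IndexState:
    usable: bool = True   # False models an index marked UNUSABLE
    visible: bool = True  # False models an invisible index


def maintained_by_dml(state):
    """An unusable index is not maintained; an invisible one still is."""
    return state.usable


def considered_by_optimizer(state):
    """The optimizer ignores both unusable and invisible indexes."""
    return state.usable and state.visible
```

In this model, the optimizer ignores an index in either state. The states differ in DML maintenance: an invisible index is still maintained, so it can be made visible again without a rebuild, whereas an unusable index is not maintained at all.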
Relationship between indexes and keys
A key is a set of columns or expressions on which you can create an index. However, indexes and keys are different: an index is an object stored in the database, while a key is a logical concept.