An index, also known as a secondary index, is an optional structure that users can create on some fields based on their business needs. Its purpose is to accelerate the query speed on those fields. This topic mainly introduces the advantages and disadvantages of using indexes, their availability and visibility, as well as the relationship between indexes and keys.
OceanBase Database adopts a clustered index table model. For a user-specified primary key, a primary key index is automatically generated. Other indexes created by users are considered secondary indexes.
In the following example, a table called employee is created, and three sets of data are inserted.
obclient> CREATE TABLE employee(id INT, name VARCHAR(20), PRIMARY KEY(id));
Query OK, 0 rows affected
obclient> INSERT INTO employee VALUES(1,'John'),(2,'Alice'),(3,'Bob');
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
obclient> SELECT * FROM employee;
+----+-------+
| ID | NAME |
+----+-------+
| 1 | John |
| 2 | Alice |
| 3 | Bob |
+----+-------+
3 rows in set
In the employee table, data is stored in order based on the user-specified id field. When searching for data, it is possible to quickly locate the specific data using binary search based on id. If there is a need to quickly search by the name field, a secondary index can be created on the name field, as shown in the example below:
CREATE INDEX name_index ON employee(name);
The data in the index table would be as follows:
name: Alice, id: 2
name: Bob, id :3
name: John, id: 1
In the name_index index table, the data is stored in order based on the name field. When a user specifies a query on the name field, specific data can be quickly located using binary search.
Advantages and disadvantages
Indexes have the following advantages:
Indexes enable faster SQL queries without the need to modify SQL statements, as only the necessary data is scanned.
Indexes typically store a minimal number of columns, reducing query I/O overhead.
Indexes have the following disadvantages:
Creating indexes requires a deep understanding of the business and its data model to choose the appropriate columns.
If any changes occur in the business, a reassessment of existing indexes is necessary to ensure they still meet the requirements.
Maintaining index table data during incremental data writes can impact database performance.
Index tables consume resources, including memory and disk resources.
Usability and visibility
Usability of indexes
In a DROP PARTITION operation, if the rebuild index field is not specified, the index is marked as UNUSABLE, which indicates that the index is unavailable. In this case, the index does not need to be maintained in Data Manipulation Language (DML) operations, and is ignored by the optimizer.
Visibility of indexes
The visibility of an index determines whether the index is ignored by the optimizer. If an index is invisible, it is ignored by the optimizer but needs to be maintained in DML operations. Before you drop an index, you can make it invisible to test the effect on your business. If your business is not affected, you can drop the index.
Relationships between indexes and keys
A key is one or more columns or expressions on which you can create indexes. A key is a logical concept, whereas an index is an object stored in a database.
Columnstore index
Columnstore index stores the data in the index table in a columnar format. Starting from V4.3.0, OceanBase Database supports specifying the storage format of a table as columnar when creating a table. Both the index and the data table are treated as a single table, so it also supports storing the data in the index table in a columnar format.
For more information about columnar storage, see Columnar storage.