An index is an optional structure that you can create on certain fields based on your business requirements to accelerate queries on these fields. This topic describes the advantages and disadvantages of using indexes and the availability and visibility of indexes.
Advantages and disadvantages of indexes
Advantages
You can accelerate queries without modifying SQL statements. You only need to scan the required data.
Indexes store fewer columns, which can reduce query I/O.
Disadvantages
You need to understand your business and data model to decide which fields to create indexes on.
When your business changes, you need to reevaluate whether the existing indexes still meet your needs.
Maintaining indexes during data writes consumes some performance.
Indexes consume memory and disk resources.
Availability and visibility of indexes
Availability of indexes
When you truncate or drop a partition, OceanBase Database rebuilds all global indexes of the corresponding table. During the rebuild, the indexes are temporarily unavailable.
Visibility of indexes
The visibility of an index refers to whether the optimizer ignores it. If an index is invisible, the optimizer ignores it, but the index is still maintained during DML operations. Generally, you can set an index to invisible before deleting it to observe its impact on your business. If no impact is observed, you can delete the index.
Relationship between indexes and keys
A key is a set of columns or expressions on which you can create an index. However, indexes and keys are different. An index is a database object, while a key is a logical concept.
Types of indexes
Global indexes and local indexes
Global indexes
A global index is created on the entire table and contains data from all partitions. Global indexes are suitable for queries that span multiple partitions.
Local indexes
A local index is created on each partition and contains data from only that partition. Local indexes are suitable for queries that are performed within a single partition.
Unique indexes and non-unique indexes
Unique indexes
A unique index requires that the values in the indexed columns are unique, with each index value corresponding to only one record. Unique indexes ensure the uniqueness of a specific field in a table.
Non-unique indexes
A non-unique index allows duplicate values in the indexed columns, meaning that multiple records can have the same index value. Non-unique indexes accelerate queries on indexed columns but do not guarantee the uniqueness of the indexed columns.
Spatial indexes
A spatial index is a special type of index that accelerates queries on spatial data. It is primarily used in geographic information systems (GIS) and spatial data processing. A spatial index supports efficient queries and spatial relationship analysis on spatial objects such as points, lines, and polygons.
Function indexes
A function index is created based on the results of calculations performed on one or more columns of a table. Function indexes are an optimization technique that allows you to quickly locate matching function values during queries, avoiding redundant calculations and improving query efficiency.
Columnstore indexes
A columnstore index stores data in the index table in a columnar format. OceanBase Database supports specifying the storage format of a table as columnstore when you create the table. Since an index is also a table, you can also store data in the index table in a columnar format.
For more information about columnstore, see Columnstore.
Multivalued indexes
OceanBase Database's MySQL mode supports multivalued indexes, which are particularly useful for JSON documents and other collection data types. This feature allows you to create indexes on arrays or collections, improving the efficiency of queries that search for elements within JSON arrays.
In the current version, you can create multivalued indexes using the CREATE TABLE statement. This involves using the CAST(... AS ... ARRAY) statement in the index definition to convert scalar values of the same type within a JSON array into SQL array types. This operation creates a virtual column that is automatically populated with SQL array values. A function index (also known as a virtual index) is then created on this virtual column. A multivalued index is built on this virtual column, which extracts values from the SQL array.
Description of multivalued indexes
DML operations that affect multivalued indexes are handled in the same way as those that affect regular indexes. The only difference is that a single cluster index record may be inserted or updated multiple times.
Nullability and multivalued indexes:
- If a multivalued key part contains an empty array, no entry is added to the index, and data records cannot be accessed through index scans.
- If a typed array column is set to NULL, the storage engine stores a single entry containing NULL and pointing to the data record.
Since multivalued indexes are virtual indexes on virtual generated columns, they must follow the same rules as auxiliary indexes on virtual generated columns.
Complex DML operations are supported for tables that contain multivalued indexes.
Multivalued indexes can be created after the main table, meaning you can create the main table first and then the multivalued index.
For more information about multivalued indexes, see Multivalued indexes.
Full-text indexes
A full-text index is a technology that accelerates the search of text data. It can comprehensively index and retrieve text content. A full-text index can index entire documents or large segments of text content. Its main function is to enable full-text search, which allows you to quickly find matching text in the database based on the keywords you enter. It can handle large volumes of text data, including articles, reports, web pages, and emails.
For more information about full-text indexes, see Full-text indexes.
Vector indexes
A vector index is a special type of index that accelerates vector nearest-neighbor searches. It provides efficient retrieval support based on approximate nearest neighbor (ANN) methods for vector data. OceanBase Database supports various types of vector indexes, such as HNSW, IVF, sparse indexes, and semantic indexes. All vector indexes support DML operations and real-time retrieval, making them suitable for scenarios such as search, recommendation, and retrieval-augmented generation (RAG).
For more information about vector indexes, see Vector indexes.
Search indexes
A search index is a special type of index that accelerates searches on structured and semi-structured data. It is implemented based on a generalized inverted index (GIN) and supports full-path inverted indexing for complex data types such as JSON and arrays. It also supports the creation of indexes on regular scalar columns. Search indexes are commonly used in hybrid search scenarios and in scenarios involving multi-column indexes and index merge operations.
For more information about search indexes, see Search indexes.
References
- For information about the support of multivalued and full-text indexes for offline DDL operations, see Offline DDL.
