This topic describes the vector index types supported by OceanBase Database.
Index types
OceanBase Database supports the following vector index types:
- Dense index
- Sparse index
- Semantic index
Dense index
OceanBase Database supports dense vector indexes of the HNSW and IVF families. For readability, this topic refers to them as dense indexes.
| Index type | Description |
|---|---|
| HNSW | The maximum dimension of the indexed column is 4096. HNSW is an in-memory index that needs to be fully loaded into memory. |
| HNSW_SQ | HNSW_SQ offers construction speed, search performance, and recall similar to those of HNSW, while reducing total memory usage to between 1/3 and 1/2 of that of HNSW. |
| HNSW_BQ | HNSW_BQ has slightly lower recall than HNSW but uses significantly less memory. The BQ compression algorithm (RaBitQ) can compress vectors to 1/32 of their original size. The higher the vector dimension, the more pronounced the memory savings of HNSW_BQ. |
| IVF | The IVF index is implemented based on a database table and does not occupy resident memory. |
| IVF_PQ | The IVF_PQ index is implemented based on a database table and does not occupy resident memory. It applies product quantization (PQ) to the IVF index, which gives slightly lower recall but higher performance than the IVF index. In most scenarios, PQ compresses vectors to between 1/32 and 1/16 of their original size. |
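To make the ratios in the table concrete, the following sketch estimates the size of the vector data for several dense index types. It is a back-of-the-envelope calculation, not an OceanBase sizing formula. It assumes that each dimension is stored as a 4-byte float, ignores graph and metadata overhead, and applies every ratio to the raw vector data only, although the HNSW_SQ figure in the table refers to total memory usage. For IVF_PQ the result is a storage estimate, because that index does not occupy resident memory. All names in the code are hypothetical.

```python
# Rough estimate of vector data size, using the ratios quoted in the table.
# Assumption: each dimension is a 4-byte float; index overhead is ignored.

BYTES_PER_FLOAT32 = 4

# (low, high) fraction of the uncompressed size, taken from the table above.
COMPRESSION_RATIOS = {
    "HNSW": (1.0, 1.0),
    "HNSW_SQ": (1 / 3, 1 / 2),
    "HNSW_BQ": (1 / 32, 1 / 32),
    "IVF_PQ": (1 / 32, 1 / 16),
}


def raw_vector_bytes(num_vectors: int, dim: int) -> int:
    """Size of the uncompressed vectors in bytes."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def estimated_bytes(index_type: str, num_vectors: int, dim: int) -> tuple[float, float]:
    """Return the (low, high) estimated size in bytes for one index type."""
    low, high = COMPRESSION_RATIOS[index_type]
    raw = raw_vector_bytes(num_vectors, dim)
    return raw * low, raw * high


if __name__ == "__main__":
    # Example workload: one million 1024-dimensional vectors.
    for index_type in COMPRESSION_RATIOS:
        low, high = estimated_bytes(index_type, 1_000_000, 1024)
        print(f"{index_type}: {low / 2**20:.0f} to {high / 2**20:.0f} MiB")
```

Real memory usage also depends on index parameters and internal structures, so treat these numbers as a lower bound for comparison rather than as a capacity plan.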
Sparse index
OceanBase Database supports sparse vector indexes that reside in memory. For readability, this topic refers to them as in-memory sparse indexes. They are an efficient index type for sparse vectors (vectors in which most elements are zero). In-memory sparse indexes must be fully loaded into memory, and they support DML and real-time search.
Note
In-memory sparse indexes are an experimental feature in the current version and are not recommended for use in production environments.
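As an illustration of why sparse vectors benefit from a dedicated representation, the sketch below stores only the non-zero elements and computes an inner product over them. It is a conceptual example and does not reflect how OceanBase Database stores or indexes sparse vectors. The type alias and function names are hypothetical.

```python
# A sparse vector stored as {dimension_index: value}; zeros are simply absent.
SparseVector = dict[int, float]


def sparse_inner_product(a: SparseVector, b: SparseVector) -> float:
    """Inner product that touches only the non-zero entries."""
    # Iterate over the vector with fewer entries and look up matches in the other.
    if len(a) > len(b):
        a, b = b, a
    return sum((value * b[index] for index, value in a.items() if index in b), 0.0)


def to_dense(vector: SparseVector, dim: int) -> list[float]:
    """Expand a sparse vector into a dense list of length dim."""
    return [vector.get(i, 0.0) for i in range(dim)]


if __name__ == "__main__":
    query = {1: 0.5, 7: 2.0}
    document = {7: 3.0, 9: 1.0}
    # Only dimension 7 is non-zero in both vectors.
    print(sparse_inner_product(query, document))
    print(to_dense(query, 10))
```

The cost of the inner product depends on the number of non-zero entries rather than on the nominal dimension, which is what makes this representation practical for very high-dimensional sparse data.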
Semantic index
OceanBase Database supports semantic indexes, which use the built-in embedding capabilities of OceanBase Database to simplify working with vector indexes. Vectors stay hidden from the user: you write the original data (such as text) directly, and OceanBase Database automatically converts it into vectors and builds the index internally. At search time, you provide only the original search content, and OceanBase Database automatically embeds it and searches the vector index. Currently, you can create multiple semantic indexes on a single text column, and they can differ in model, distance algorithm, or index parameters.
Note
Semantic indexes are an experimental feature in the current version and are not recommended for use in production environments.
Considerations and limitations
- Distance algorithms: Dense vector indexes support L2, inner product (IP), and cosine distance as index distance algorithms.
- Distance functions: Vector index search supports calling some distance functions. For more information, see Use SQL functions.
- Filter conditions: Vector search supports filter conditions, which can be scalar conditions or spatial relationships such as ST_Intersects. Multi-value indexes, full-text indexes, and global indexes cannot be used as pre-filters.
- Hybrid search: You can create vector indexes and full-text indexes on the same table. Vector indexes include dense and in-memory sparse indexes.
- Offline DDL: For information about offline DDL support for vector indexes, see Offline DDL.
- Columnstore indexes: Columnstore vector indexes are not supported in the current version.
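As a reference for the distance algorithms listed above, the following sketch gives the textbook definitions of L2 distance, inner product, and cosine distance. It is an assumption that these match the general shape of what the index computes. The SQL distance functions in OceanBase Database may use different conventions (for example, sign or scaling), so check Use SQL functions for the exact behavior. The function names here are hypothetical.

```python
import math
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: smaller means more similar."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product (IP): larger means more similar."""
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, defined here as 1 minus cosine similarity."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    u, v = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    print(l2_distance(u, v), inner_product(u, v), cosine_distance(u, v))
```

Cosine distance ignores vector length and compares direction only, whereas L2 and inner product are both sensitive to magnitude. For vectors normalized to unit length, the three measures produce the same ranking.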
