Choose an index type (OceanBase Database V4.6.0)

OceanBase Database provides vector indexes with different algorithms. You can choose an appropriate index type based on your use case.

HNSW or IVF?

OceanBase Database provides two types of dense indexes:

Graph-based HNSW series indexes: HNSW, HNSW_SQ, and HNSW_BQ.
Disk-based IVF series indexes: IVF and IVF_PQ.

These two types of indexes have different strengths. HNSW series indexes typically provide higher query performance but require more memory. IVF series indexes perform well when cache is sufficient and need only a small amount of resident memory. However, the choice between HNSW and IVF does not depend on memory alone. You must also evaluate other factors such as business scenarios, data size, performance requirements, and resource constraints. The following sections compare the core differences between these two types of indexes and provide recommendations.

Core differences

Dimension	HNSW series	IVF series
Storage method	Index based on a graph structure in memory	Index based on a disk
Memory usage	Requires complete loading into memory, with high memory usage	Does not need to be loaded into memory in full, with low resident memory usage
Query performance (QPS)	Extremely high, with sub-millisecond response	High, close to HNSW when cache is sufficient
Recall rate	High, up to 99%	High, slightly lower than HNSW, and can be optimized by parameters
Build speed	Slower, requires building a graph structure	Faster, based on clustering algorithms
Applicable data volume	Millions to billions	Millions to tens of billions
Cost	High memory cost	Low storage cost, suitable for large-scale data
Real-time capability	Supports real-time DML operations	Supports real-time DML operations

Decision flowchart

Note

Before selecting an index type, you need to estimate the memory usage based on the information in Manage memory for vector indexes.
The recommendations in the decision flowchart are based on 1024-dimensional vectors. If your vectors have a different dimension, scale the resource estimates approximately in proportion to the dimension.
The decision flowchart primarily considers memory costs to help you decide on the index type.
Even if the tenant has sufficient memory, you do not always need to choose the highest specification index such as HNSW. If you have high performance requirements, you can consider more cost-effective options such as HNSW_SQ.
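As a rough illustration of the proportional scaling mentioned in the note, the following Python sketch rescales a resource estimate made for 1024-dimensional vectors to another dimension. The helper name and the strictly linear scaling are assumptions made for illustration. Per-vector overhead that does not depend on the dimension is ignored, so treat the result as an approximation only.

```python
REFERENCE_DIM = 1024  # dimension assumed by the flowchart recommendations


def scale_estimate_gb(estimate_at_1024_gb: float, actual_dim: int) -> float:
    """Linearly rescale a 1024-dimension estimate (hypothetical helper)."""
    if actual_dim <= 0:
        raise ValueError("actual_dim must be positive")
    return estimate_at_1024_gb * actual_dim / REFERENCE_DIM


# Example: an estimate of 64 GB at 1024 dimensions, rescaled to 768 dimensions.
print(scale_estimate_gb(64.0, 768))  # 48.0
```

For the final sizing, the figures reported by the memory estimation tools described in Manage memory for vector indexes take precedence over this approximation.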

Notice

This section only lists some common scenarios. If you cannot determine the index type based on the above flowchart or have other requirements, contact OceanBase Technical Support.

Use partitioned tables?

The primary purpose of using partitioned tables is to handle large-scale data scenarios. Additionally, if the query conditions can be used as partition keys, partition pruning can enhance query performance. We recommend using partitioned tables in the following two scenarios:

The data volume reaches tens of millions or even hundreds of millions: When the data volume is very large, partitioned tables can distribute the data across multiple partitions. Each partition independently builds an index, thereby reducing the load of a single query and improving overall query performance.
The query conditions include a specific scalar column that can be used for partition pruning: For example, if the label field always appears in the WHERE condition, you can consider creating a partitioned table with label as the partition key. This way, partition pruning can reduce the number of partitions to be queried.

Specific recommendations are as follows:

Partitioning

When using vector indexes, avoid creating too many partitions. Unlike scalar indexes, the cost of a top-K search on a vector index (such as HNSW) grows only slightly with index size: under the same configuration, increasing the index from 1 million to 2 million vectors does not significantly increase the computational overhead. Therefore, if partition pruning cannot be used, every partition must be searched, and an excessive number of partitions may reduce performance. Conversely, a single partition that is too large increases the time required for index rebuilding and reduces the efficiency of queries that combine vector search with scalar conditions.

In conclusion, we recommend that the data volume in each partition be controlled to be less than 20 million and that the partition key be selected as a field that supports partition pruning.
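The sizing rule above can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that rows are distributed evenly across partitions, which real partition keys may not guarantee.

```python
MAX_ROWS_PER_PARTITION = 20_000_000  # upper bound recommended in the text


def min_partitions(total_rows: int, limit: int = MAX_ROWS_PER_PARTITION) -> int:
    """Smallest partition count that keeps the average partition below the limit."""
    if total_rows < 0 or limit <= 0:
        raise ValueError("total_rows must be >= 0 and limit must be > 0")
    # floor(total / limit) + 1 guarantees total / count < limit (strictly less).
    return total_rows // limit + 1


print(min_partitions(100_000_000))  # 6
print(min_partitions(10_000_000))   # 1
```

This gives only a lower bound. Using more partitions than the minimum (for example, 10 partitions for 100 million vectors) is still consistent with the recommendation, as long as the partition count does not become excessive.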

Algorithm selection

We recommend using the HNSW_BQ or IVF_PQ index for large-scale data. If you need to use other indexes, refer to the Memory usage section for estimation.

Memory usage

HNSW_BQ

For the HNSW_BQ index, we recommend that the tenant memory be greater than the total memory required for HNSW_BQ queries plus the memory required for the HNSW_SQ index in a single partition. The memory estimation process is as follows:

Use the INDEX_VECTOR_MEMORY_ADVISOR function to calculate the recommended memory values for the HNSW_BQ and HNSW_SQ indexes during construction and queries.
Based on the recommended memory values obtained from the above tool, calculate and determine the total memory required for the tenant.

For example, assume that you have 100 million 1024-dimensional vectors and use 10 partitions (approximately 10 million vectors per partition). The calculation is as follows:

-- Specify REFINE_TYPE=SQ8 to directly and accurately obtain the recommended memory values for the HNSW_BQ index during construction and queries.
-- You do not need to separately calculate the memory required for the HNSW_SQ index. This is because the HNSW_BQ index is constructed using the SQ8 quantization algorithm by default.
-- After specifying refine_type=sq8, the function automatically includes the memory required for the SQ8 quantization vectors in the calculation.
-- The recommended memory value for the HNSW_BQ index is 74.6 GB, and the memory consumption during queries is 57.4 GB. We use the recommended memory value.
SELECT DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000);
+------------------------------------------------------------------------------------------------------------------------------+
| DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000) |
+------------------------------------------------------------------------------------------------------------------------------+
| Suggested minimum vector memory is 74.6 GB, memory consumption when providing search service is 57.4 GB                      |
+------------------------------------------------------------------------------------------------------------------------------+
1 row in set

-- By default, vector memory can use up to 50% of the tenant memory, so the required tenant memory is at least 74.6 GB / 0.5.
SELECT 74.6/0.5;
+---------------------+
| 74.6/0.5 = 149.2 GB |
+---------------------+
1 row in set

-- To account for the possibility of new data not being compressed in actual environments, we recommend reserving some redundancy.
SELECT 149.2 * 1.2;
+-------------------------+
| 149.2 * 1.2 = 179.04 GB |
+-------------------------+
1 row in set

-- Therefore, we recommend that the tenant memory be configured to 179 GB.
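The arithmetic in the example above can be collected into a small Python sketch. The function and parameter names are hypothetical. The 0.5 vector memory fraction and the 1.2 redundancy factor are taken from the example, and the input is the suggested minimum vector memory reported by INDEX_VECTOR_MEMORY_ADVISOR.

```python
def tenant_memory_gb(
    suggested_vector_memory_gb: float,
    vector_memory_fraction: float = 0.5,  # share of tenant memory usable by vector indexes
    redundancy: float = 1.2,              # headroom for new, not yet compressed data
) -> float:
    """Tenant memory needed for a given advisor recommendation (hypothetical helper)."""
    if not 0 < vector_memory_fraction <= 1:
        raise ValueError("vector_memory_fraction must be in (0, 1]")
    return suggested_vector_memory_gb / vector_memory_fraction * redundancy


# 74.6 GB is the suggested minimum vector memory from the example above.
print(round(tenant_memory_gb(74.6), 2))  # 179.04
```

If the tenant uses a different vector memory share, pass it as vector_memory_fraction instead of relying on the default.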

IVF series

For the IVF and IVF_PQ indexes, we recommend that the tenant memory be greater than the memory required for constructing a single partition plus the total resident memory required for all partitions. For example, if you have 100 million data records and use 10 partitions, each with approximately 10 million vectors, the memory required for constructing a single partition is about 2.7 GB, and the total memory required for all partitions during queries is about 1.1 GB (110 MB per partition). Therefore, the minimum memory required is about 3.8 GB (2.7 GB + 1.1 GB). We recommend reserving some redundancy and configuring 6 GB of memory. You can use the same method to estimate memory requirements for other data volumes.
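A minimal Python sketch of this estimate is shown below, using the figures from the example. The function name is hypothetical, and the sketch assumes that only one partition is being built at a time while all partitions stay resident.

```python
def ivf_min_memory_gb(
    build_gb_per_partition: float,
    resident_gb_per_partition: float,
    partitions: int,
) -> float:
    """Build memory for one partition plus resident memory for all partitions."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    return build_gb_per_partition + resident_gb_per_partition * partitions


# 2.7 GB to build one partition, 110 MB (0.11 GB) resident per partition, 10 partitions.
print(round(ivf_min_memory_gb(2.7, 0.11, 10), 2))  # 3.8
```

With these inputs the sum comes to roughly 3.8 GB before any redundancy is added, which is why a larger configured value such as 6 GB leaves comfortable headroom.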

Construction and query parameters

We recommend setting the index construction and query parameters based on the maximum data volume in a single partition. For details, see the Index parameter recommendations section.

Performance and recall rate

If the query conditions can prune to a single partition: the performance and recall rate are the same as in the single-partition scenario. For details, see the section describing different data volume scenarios.
If the query conditions cannot prune to a single partition: QPS can be estimated proportionally based on the number of partitions on a single OBServer node (for example, if a single node contains 3 partitions, the QPS is approximately one-third of the performance in a single partition). Since cross-partition queries merge more candidate results, the actual recall rate is typically higher than in the single-partition scenario.
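The proportional QPS estimate for queries that cannot be pruned can be sketched in Python as follows. The function name is hypothetical and the single-partition QPS value is an illustrative input, not a measured result.

```python
def estimated_qps(single_partition_qps: float, partitions_per_node: int) -> float:
    """Approximate QPS when each query must scan every partition on a node."""
    if partitions_per_node < 1:
        raise ValueError("partitions_per_node must be at least 1")
    return single_partition_qps / partitions_per_node


# Illustrative input: 900 QPS on one partition, 3 partitions on the OBServer node.
print(estimated_qps(900.0, 3))  # 300.0
```

This is only a first-order estimate. Actual throughput also depends on parallelism, caching, and the scalar filters in the query.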

Index parameter recommendations

HNSW series

The recommended index construction and query parameters vary depending on the data volume within the same series. This section provides suggested configurations for HNSW, HNSW_SQ, and HNSW_BQ indexes for 768-dimensional vectors with 1 million and 10 million data points. For 100 million vectors, refer to the subsequent sections of this topic for IVF_PQ or HNSW_BQ indexes under partitioned tables.

Notice

For scenarios where the data volume is expected to grow, we recommend that you set the parameters based on the final data volume.

Notice

HNSW_BQ is a high-compression quantization algorithm. Its recall rate may be relatively low for low-dimensional vectors. To avoid performance loss, we recommend that you use HNSW_BQ for vectors with 512 dimensions or more.

Scenario	Index type	Parameter recommendation
Maximum recall (Maximum memory usage)	HNSW	For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, and other parameters are set to their default values.
Best performance (Minimum memory usage)	HNSW_SQ	For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, and other parameters are set to their default values. For 10 million data points: m = 32, ef_construction = 400, ef_search = 350, and other parameters are set to their default values.
Best cost-performance ratio (Low memory usage and high performance)	HNSW_BQ	For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, and other parameters are set to their default values. For 10 million data points: m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, and other parameters are set to their default values. For 100 million data points: Use a partitioned table. m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, and other parameters are set to their default values.

Details for 1 million data points

This section describes memory usage and recall rate optimization for 1 million 768-dimensional vectors (using the parameters in the preceding table).

Memory usage:

Index type	Recommended tenant memory	Description
HNSW	15 GB	The recommended size of the vector index memory is 7.3 GB. If the tenant memory is greater than 8 GB, the vector index can use up to 50% of the tenant memory by default. If the tenant memory is 8 GB or less, the vector index can use up to 40% of the tenant memory by default. Therefore, a tenant memory of 15 GB is recommended.
HNSW_SQ	6 GB	The recommended size of the vector index memory is 2.1 GB.
HNSW_BQ	6 GB	HNSW_BQ indexes require high-precision vectors during construction. Therefore, the memory usage during construction is the same as that of HNSW_SQ indexes, which is 6 GB. After the index is built, the memory usage of HNSW_BQ indexes is significantly reduced to approximately 405 MB. The preceding description applies to non-partitioned tables. If you use partitioned tables, OceanBase Database dynamically adjusts the number of partitions that can be built in parallel based on the tenant memory. When you configure the parameters, we recommend that you reserve some memory for the tenant to ensure that the tenant memory is sufficient for the memory usage of HNSW_BQ queries and the memory usage of HNSW_SQ during the construction of a partition. For more information, see the Memory usage section above.
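The tenant memory figures in the table follow from the default limits described in the HNSW row: vector indexes can use up to 50% of tenant memory when the tenant has more than 8 GB, and up to 40% otherwise. The Python sketch below searches for the smallest whole-gigabyte tenant size that satisfies this rule. The function name and the whole-gigabyte search are assumptions for illustration, and the result is a minimum without extra headroom.

```python
def min_tenant_memory_gb(vector_memory_gb: float, max_gb: int = 4096) -> int:
    """Smallest whole-GB tenant size whose vector memory limit covers the index."""
    if vector_memory_gb <= 0:
        raise ValueError("vector_memory_gb must be positive")
    for tenant_gb in range(1, max_gb + 1):
        # Default share of tenant memory available to vector indexes.
        fraction = 0.5 if tenant_gb > 8 else 0.4
        if tenant_gb * fraction >= vector_memory_gb:
            return tenant_gb
    raise ValueError("no tenant size up to max_gb is large enough")


print(min_tenant_memory_gb(7.3))  # 15 (HNSW, 1 million vectors)
print(min_tenant_memory_gb(2.1))  # 6  (HNSW_SQ, 1 million vectors)
```

Reserving additional memory on top of this minimum is reasonable when the data volume is expected to grow or when partitions are built in parallel.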

Recall rate optimization:

You can improve the recall rate by increasing the number of vector calculations. However, this will reduce query performance. You can set the parameters to the values in the following table for different TopN values. If you need to further improve the recall rate, you can increase the parameter values.

Notice

The recall rate is directly related to the data characteristics. The following table provides the parameter values for a 768-dimensional standard dataset to achieve a recall rate of approximately 0.95.

TopN	ef_search	refine_k (only for HNSW_BQ)
Top10	64	4
Top100	240	4

Maximum recall rate:

The maximum recall rate varies by index algorithm. The figures below were measured with the ef_search parameter set to 1000. At this setting, the recall rate may improve, but the QPS drops to about one-third of that at a recall rate of 0.95. Increasing the ef_search parameter does not improve the recall rate of every index type: only HNSW indexes can achieve a recall rate of more than 0.99. For HNSW_BQ indexes, you can increase the refine_k parameter to further improve the recall rate, at the cost of further reducing query performance.

Maximum recall rate (ef_search = 1000):

HNSW index: Recall rate of 0.991
HNSW_SQ index: Recall rate of 0.9786
HNSW_BQ index: Recall rate of 0.9897 (ef_search = 1000, refine_k = 10)

Details for 10 million data points

This section describes memory usage and recall rate optimization for 10 million 768-dimensional vectors (using the parameters in the preceding table).

Memory usage:

Index type	Recommended tenant memory	Description
HNSW	160 GB	The recommended size of the vector index memory is 76.3 GB.
HNSW_SQ	48 GB	The recommended size of the vector index memory is 22.6 GB.
HNSW_BQ	48 GB	HNSW_BQ indexes require high-precision vectors during construction. HNSW_BQ indexes use HNSW_SQ indexes as cache during construction. Therefore, for non-partitioned tables, the memory usage of HNSW_BQ indexes is the same as that of HNSW_SQ indexes. After the index is built, HNSW_BQ indexes only occupy approximately 5.4 GB of memory.

Recall rate optimization:

TopN	ef_search	refine_k (only for HNSW_BQ)
Top10	100	4
Top100 (HNSW/HNSW_SQ)	350	-
Top100 (HNSW_BQ)	1000	10

IVF series

Scenario	Index Type	Parameter Recommendations
Low-dimensional (384 dimensions or fewer)	IVF	For 10 million data records: use a partitioned table with nlist=3000. For 100 million data records: use a partitioned table with nlist=3000.
Low cost (minimal memory usage)	IVF_PQ	For 1 million data records: nlist=1000, m=vector dimension/2. For 10 million data records: nlist=3000, m=vector dimension/2. For 100 million data records: use a partitioned table with nlist=3000, m=vector dimension/2.

To balance the number of cluster centers and the amount of data per center, we recommend setting nlist to the square root of the data volume. For example, for 10 million data records, we recommend setting nlist to approximately 3000. When using IVF_PQ, we recommend setting the m parameter to half of the vector dimension (dim).
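The two rules of thumb above can be written as a short Python sketch. The function names are hypothetical, and the sketch assumes an even vector dimension so that half the dimension is a whole number.

```python
import math


def recommended_nlist(rows: int) -> int:
    """nlist as the integer square root of the data volume (at least 1)."""
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return max(1, math.isqrt(rows))


def recommended_pq_m(dim: int) -> int:
    """IVF_PQ m parameter as half of the vector dimension (assumes even dim)."""
    if dim < 2:
        raise ValueError("dim must be at least 2")
    return dim // 2


print(recommended_nlist(1_000_000))   # 1000
print(recommended_nlist(10_000_000))  # 3162
print(recommended_pq_m(768))          # 384
```

The exact square root of 10 million is about 3162, and the recommendation table rounds this to 3000. For partitioned tables, pass the data volume of a single partition rather than the total row count.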

Notice

IVF_PQ is a high-compression quantization algorithm. Its recall rate may be relatively low for low-dimensional vectors. To avoid performance loss, we recommend using IVF_PQ for vectors with 128 dimensions or more.

Notice

For scenarios where the data volume is expected to grow, we recommend setting the parameters based on the final data volume.

10 million data records (with partitioned tables)

For 10 million 768-dimensional vector data records (using the parameters in the table above), we provide additional details on memory usage and recall optimization:

Memory usage:

Index Type	Index Parameters	Memory Usage (Build Time/Resident Time)
IVF	distance=l2, nlist=3000	2.7 GB / 10.5 MB
IVF_PQ	distance=l2, nlist=3000, m=384	4.0 GB / 1.3 GB
IVF_PQ	distance=cosine, nlist=3000, m=384	2.7 GB / 11.4 MB

In the table, Build Time refers to the temporary memory used during index creation, which is released after the index is built. Resident Time refers to the memory continuously occupied by the IVF vector index after the index is built.

For IVF_PQ indexes, if you choose distance = l2, more resident memory will be occupied due to the need to store additional precomputed results. In contrast, using distance = inner_product or cosine consumes less resident memory. Therefore, in practical applications, we recommend prioritizing inner_product or cosine as the distance type to optimize memory resources.

Recall optimization:

Increasing the nprobes parameter increases the number of vector calculations, which improves recall but reduces query performance. For different TopN scenarios, you can start from the recommended values in the table below and increase nprobes if you need higher recall.

Notice

Recall is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where recall reaches approximately 0.95.

TopN	nprobes
Top10	1
Top100	20

100 million data records (with partitioned tables)

When the vector data volume reaches 100 million or more, we strongly recommend using partitioned tables with IVF-type indexes. As the data scale and nlist parameter increase, the query overhead of a single IVF index will significantly increase. By splitting the data into multiple partitions and building smaller-scale IVF indexes for each partition, you can effectively reduce the query load and further improve overall query performance and recall through parallel queries across partitions.

In a partitioned table, IVF indexes are local indexes, so each partition independently builds its own IVF index. Therefore, we recommend calculating the nlist value based on the average data volume per partition. For example, for 100 million 768-dimensional vectors split into 10 partitions, each partition contains approximately 10 million data records, and we recommend setting nlist to sqrt(10 million) ≈ 3162.

For 100 million 768-dimensional vector data records (using the parameters in the table above), we provide additional details on memory usage and recall optimization:

Memory usage:

In a multi-partitioned scenario, since each partition independently builds and maintains its own IVF index, the total resident memory usage must be summed across all partitions. For example, if a single IVF index has a resident memory usage of 10.5 MB and there are 10 partitions, the total resident memory usage would be approximately 10.5 × 10 = 105 MB.

Index Type	Index Parameters	Memory Usage (Build Time/Resident Time)
IVF	distance=l2, nlist=10000	2.7 GB/ 13.1 * 10 MB
IVF_PQ	distance=l2, nlist=10000, m=384	4.0 GB/ 1.6 * 10 GB
IVF_PQ	distance=cosine, nlist=10000, m=384	2.7 GB/ 14.2 * 10 MB

Recall optimization:

In a partitioned table scenario, since each partition independently executes an IVF index query, when a query involves multiple partitions, the system will retrieve TopN results from each partition and then aggregate and re-sort all the results. This not only improves the overall search accuracy but also typically results in a higher actual recall rate compared to a single-partition scenario. Therefore, in a multi-partitioned table, you can appropriately reduce the nprobes parameter to achieve a recall rate comparable to that of a single-partitioned table.

Notice

Recall is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where recall reaches approximately 0.95.

TopN	nprobes
Top10	1
Top100	10

References

For more information about parameters and tuning, see HNSW index and IVF index.
