OceanBase Database provides vector indexes with different algorithms. You can choose an appropriate index type based on your use case.
HNSW or IVF?
OceanBase Database provides two types of dense indexes:
- Graph-based HNSW series indexes: HNSW, HNSW_SQ, and HNSW_BQ.
- Disk-based IVF series indexes: IVF and IVF_PQ.
These two types of indexes have different strengths. HNSW series indexes typically provide higher query performance but require more memory. IVF series indexes do not need to be fully loaded into memory and perform well when cache is sufficient. However, memory is not the only factor in choosing between HNSW and IVF. You must also evaluate business scenarios, data size, performance requirements, and resource constraints. The following sections compare the core differences between these two types of indexes and provide recommendations.
Core differences
| Dimension | HNSW series | IVF series |
|---|---|---|
| Storage method | In-memory index based on a graph structure | Disk-based index |
| Memory usage | Requires complete loading into memory, with high memory usage | Does not need to be fully loaded into memory, with low memory usage |
| Query performance (QPS) | Extremely high, with sub-millisecond response | High, close to HNSW when cache is sufficient |
| Recall rate | High, up to 99% | High, slightly lower than HNSW, and can be optimized by parameters |
| Build speed | Slower, requires building a graph structure | Faster, based on clustering algorithms |
| Applicable data volume | Millions to billions | Millions to tens of billions |
| Cost | High memory cost | Low storage cost, suitable for large-scale data |
| Real-time capability | Supports real-time DML operations | Supports real-time DML operations |
Decision flowchart
Note
- Before selecting an index type, you need to estimate the memory usage based on the information in Manage memory for vector indexes.
- The recommendations in the decision flowchart are based on 1024-dimensional vectors. If the actual dimension is different, you can approximately calculate the required resources based on the proportion.
- The decision flowchart primarily considers memory costs to help you decide on the index type.
- Even if the tenant has sufficient memory, you do not always need to choose the highest specification index such as HNSW. If you have high performance requirements, you can consider more cost-effective options such as HNSW_SQ.
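The proportional scaling mentioned in the note can be sketched as follows. This is only a rough linear approximation under the assumption that resource needs grow in proportion to the vector dimension. The function name is illustrative, and the 74.6 GB input is borrowed from the 1024-dimensional HNSW_BQ example later in this topic.

```python
REFERENCE_DIM = 1024  # dimension that the flowchart recommendations assume


def scale_by_dimension(resource_at_reference: float, actual_dim: int) -> float:
    """Linearly scale a resource estimate from 1024 dimensions to actual_dim."""
    if actual_dim <= 0:
        raise ValueError("actual_dim must be positive")
    return resource_at_reference * actual_dim / REFERENCE_DIM


# Rough estimate for 768-dimensional vectors, starting from a 74.6 GB
# figure that was calculated for 1024-dimensional vectors.
print(round(scale_by_dimension(74.6, 768), 2))
```

For a precise figure, the INDEX_VECTOR_MEMORY_ADVISOR function accepts the actual dimension directly.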
Notice
This section only lists some common scenarios. If you cannot determine the index type based on the above flowchart or have other requirements, contact OceanBase Technical Support.
Use partitioned tables?
The primary purpose of using partitioned tables is to handle large-scale data scenarios. Additionally, if the query conditions can be used as partition keys, partition pruning can enhance query performance. We recommend using partitioned tables in the following two scenarios:
- The data volume reaches tens of millions or even hundreds of millions: When the data volume is very large, partitioned tables can distribute the data across multiple partitions. Each partition independently builds an index, thereby reducing the load of a single query and improving overall query performance.
- The query conditions include a specific scalar column that can be used for partition pruning: For example, if the label field always appears in the WHERE condition, you can consider creating a partitioned table with label as the partition key. This way, partition pruning can reduce the number of partitions to be queried.
Specific recommendations are as follows:
Partitioning
When using vector indexes, the number of partitions should not be excessive. Unlike scalar indexes, vector indexes (such as HNSW) do not significantly increase the computational overhead for querying the top K results when the index size increases from 1 million to 2 million vectors under the same configuration. Searching one larger index is therefore cheaper than searching several smaller ones, so if partition pruning cannot be used, an excessive number of partitions may reduce performance. On the other hand, a single partition that is too large not only increases the time required for index rebuilding but also affects the efficiency of joint queries with scalar conditions.
In conclusion, we recommend that the data volume in each partition be controlled to be less than 20 million and that the partition key be selected as a field that supports partition pruning.
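As a minimal sketch of this sizing rule, the following helper returns the smallest partition count that keeps the average partition strictly below the limit. It assumes rows are spread evenly across partitions, which real partition keys may not guarantee. The function name is illustrative.

```python
def min_partition_count(total_rows: int,
                        max_rows_per_partition: int = 20_000_000) -> int:
    """Smallest partition count whose average size is below the limit."""
    if total_rows < 0 or max_rows_per_partition <= 0:
        raise ValueError("row counts must be non-negative and the limit positive")
    # total_rows / n < limit  <=>  n > total_rows / limit
    return total_rows // max_rows_per_partition + 1


print(min_partition_count(100_000_000))  # 6
print(min_partition_count(10_000_000))   # 1
```

This is a lower bound only. The examples in this topic use 10 partitions for 100 million vectors, which leaves more room for growth.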
Algorithm selection
We recommend using the HNSW_BQ or IVF_PQ index for large-scale data. If you need to use other indexes, refer to the Memory usage section for estimation.
Memory usage
HNSW_BQ
For the HNSW_BQ index, we recommend that the tenant memory be greater than the total memory required for HNSW_BQ queries plus the memory required for the HNSW_SQ index in a single partition. The memory estimation process is as follows:
- Use the INDEX_VECTOR_MEMORY_ADVISOR function to calculate the recommended memory values for the HNSW_BQ and HNSW_SQ indexes during construction and queries.
- Based on the recommended memory values obtained from the function, calculate the total memory required for the tenant.
For example, assume that you have 100 million 1024-dimensional vector data and use 10 partitions (approximately 10 million vectors per partition). The specific calculation process is as follows:
-- Specify REFINE_TYPE=SQ8 to directly and accurately obtain the recommended memory values for the HNSW_BQ index during construction and queries.
-- You do not need to separately calculate the memory required for the HNSW_SQ index. This is because the HNSW_BQ index is constructed using the SQ8 quantization algorithm by default.
-- After specifying refine_type=sq8, the function automatically includes the memory required for the SQ8 quantization vectors in the calculation.
-- The recommended memory value for the HNSW_BQ index is 74.6 GB, and the memory consumption during queries is 57.4 GB. We use the recommended memory value.
SELECT DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000);
+------------------------------------------------------------------------------------------------------------------------------+
| DBMS_VECTOR.INDEX_VECTOR_MEMORY_ADVISOR('HNSW_BQ',100000000,1024,'FLOAT32','M=32,DISTANCE=COSINE,REFINE_TYPE=SQ8', 10000000) |
+------------------------------------------------------------------------------------------------------------------------------+
| Suggested minimum vector memory is 74.6 GB, memory consumption when providing search service is 57.4 GB |
+------------------------------------------------------------------------------------------------------------------------------+
1 row in set
-- By default, the vector memory occupies 50% of the tenant memory. Therefore, the total tenant memory is
SELECT 74.6/0.5;
+--------------------------------------------------------------------------------------------------------------+
| 74.6/0.5 = 149.2 GB |
+--------------------------------------------------------------------------------------------------------------+
1 row in set
-- To account for the possibility of new data not being compressed in actual environments, we recommend reserving some redundancy.
SELECT 149.2 * 1.2;
+--------------------------------------------------------------------------------------------------------------+
| 149.2 * 1.2 = 179.04 GB |
+--------------------------------------------------------------------------------------------------------------+
1 row in set
-- Therefore, we recommend that the tenant memory be configured to 179 GB.
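The arithmetic in the preceding example can be wrapped in a small helper for reuse with other advisor results. This is a sketch: the function and parameter names are illustrative, the 0.5 ratio is the default vector memory share stated above, and the 1.2 factor is the redundancy used in the example.

```python
def tenant_memory_gb(vector_memory_gb: float,
                     vector_memory_ratio: float = 0.5,
                     headroom: float = 1.2) -> float:
    """Tenant memory needed so that the vector share covers the advisor value."""
    if not 0 < vector_memory_ratio <= 1:
        raise ValueError("vector_memory_ratio must be in (0, 1]")
    return vector_memory_gb / vector_memory_ratio * headroom


# 74.6 GB is the suggested minimum vector memory from the example above.
print(round(tenant_memory_gb(74.6), 2))  # 179.04
```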
IVF series
For the IVF and IVF_PQ indexes, we recommend that the tenant memory be greater than the memory required to build the index for a single partition plus the total resident memory of all partitions during queries. For example, if you have 100 million data records and use 10 partitions, each with approximately 10 million vectors, building a single partition requires about 2.7 GB, and all partitions together require about 1.1 GB during queries (110 MB per partition). Therefore, the minimum memory required is about 3.8 GB. We recommend reserving some redundancy and configuring 6 GB of memory. You can use the same method to estimate memory requirements for other data volumes.
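The rule above (build memory for one partition plus resident memory for all partitions) can be sketched as follows. The function name is illustrative, and the inputs are the figures from the example, with 110 MB written as 0.11 GB.

```python
def ivf_min_memory_gb(build_gb_single_partition: float,
                      resident_gb_per_partition: float,
                      partitions: int) -> float:
    """Build memory for one partition plus resident memory of all partitions."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    return build_gb_single_partition + resident_gb_per_partition * partitions


# 2.7 GB to build one partition, 110 MB resident per partition, 10 partitions.
print(round(ivf_min_memory_gb(2.7, 0.11, 10), 1))  # 3.8
```

The result is the sum before any redundancy is added, which is why the recommended configuration in the example is larger.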
Construction and query parameters
We recommend setting the index construction and query parameters based on the maximum data volume in a single partition. For details, see the Index parameter recommendations section.
Performance and recall rate
- If the query conditions can prune to a single partition: the performance and recall rate are the same as in the single-partition scenario. For details, see the section describing different data volume scenarios.
- If the query conditions cannot prune to a single partition: QPS can be estimated proportionally based on the number of partitions on a single OBServer node (for example, if a single node contains 3 partitions, the QPS is approximately one-third of the performance in a single partition). Since cross-partition queries merge more candidate results, the actual recall rate is typically higher than in the single-partition scenario.
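The proportional QPS estimate for queries that cannot be pruned can be sketched as below. The function name and the 900 QPS baseline are hypothetical values chosen for illustration, not measured results.

```python
def estimated_qps(single_partition_qps: float, partitions_per_node: int) -> float:
    """Approximate QPS when every partition on a node must be searched."""
    if partitions_per_node < 1:
        raise ValueError("partitions_per_node must be at least 1")
    return single_partition_qps / partitions_per_node


# Hypothetical baseline of 900 QPS for one partition, 3 partitions per node.
print(estimated_qps(900, 3))  # 300.0
```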
Index parameter recommendations
HNSW series
The recommended index construction and query parameters vary depending on the data volume within the same series. This section provides suggested configurations for HNSW, HNSW_SQ, and HNSW_BQ indexes for 768-dimensional vectors with 1 million and 10 million data points. For 100 million vectors, refer to the subsequent sections of this topic for IVF_PQ or HNSW_BQ indexes under partitioned tables.
Notice
For scenarios where the data volume is expected to grow, we recommend that you set the parameters based on the final data volume.
Notice
HNSW_BQ is a high-compression quantization algorithm. Its recall rate may be relatively low for low-dimensional vectors. To avoid performance loss, we recommend that you use HNSW_BQ for vectors with 512 dimensions or more.
| Scenario | Index type | Parameter recommendation |
|---|---|---|
| Maximum recall (Maximum memory usage) | HNSW | For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, and other parameters are set to their default values. |
| Best performance (Minimum memory usage) | HNSW_SQ | For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, and other parameters are set to their default values. For 10 million data points: m = 32, ef_construction = 400, ef_search = 350, and other parameters are set to their default values. |
| Best cost-performance ratio (Low memory usage and high performance) | HNSW_BQ | For 1 million data points: m = 16, ef_construction = 200, ef_search = 100, and other parameters are set to their default values. For 10 million data points: m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, and other parameters are set to their default values. For 100 million data points: Use a partitioned table. m = 32, ef_construction = 400, ef_search = 1000, refine_k = 10, and other parameters are set to their default values. |
Details for 1 million data points
The following table describes the memory usage and recall rate optimization for 1 million 768-dimensional vectors (using the parameters in the preceding table).
Memory usage:
| Index type | Recommended tenant memory | Description |
|---|---|---|
| HNSW | 15 GB | The recommended size of the vector index memory is 7.3 GB. If the tenant memory is greater than 8 GB, the vector index can use up to 50% of the tenant memory by default. If the tenant memory is 8 GB or less, the vector index can use up to 40% of the tenant memory by default. Therefore, a tenant memory of 15 GB is recommended. |
| HNSW_SQ | 6 GB | The recommended size of the vector index memory is 2.1 GB. |
| HNSW_BQ | 6 GB | HNSW_BQ indexes require high-precision vectors during construction. Therefore, the memory usage during construction is the same as that of HNSW_SQ indexes, which is 6 GB. After the index is built, the memory usage of HNSW_BQ indexes is significantly reduced to approximately 405 MB. The preceding description applies to non-partitioned tables. If you use partitioned tables, OceanBase Database dynamically adjusts the number of partitions that can be built in parallel based on the tenant memory. When you configure the parameters, we recommend that you reserve some memory for the tenant to ensure that the tenant memory is sufficient for the memory usage of HNSW_BQ queries and the memory usage of HNSW_SQ during the construction of a partition. For more information, see the Memory usage section above. |
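The 50%/40% rule described in the HNSW row can be sketched as a search for the smallest whole-GB tenant size whose vector memory share covers the recommended vector index memory. The function names are illustrative, and the sketch assumes the default ratios are unchanged.

```python
def vector_memory_limit_gb(tenant_gb: int) -> float:
    """Default vector memory share: 50% above 8 GB of tenant memory, else 40%."""
    ratio = 0.5 if tenant_gb > 8 else 0.4
    return tenant_gb * ratio


def min_tenant_memory_gb(vector_index_gb: float) -> int:
    """Smallest whole-GB tenant size whose vector share covers the index."""
    tenant_gb = 1
    while vector_memory_limit_gb(tenant_gb) < vector_index_gb:
        tenant_gb += 1
    return tenant_gb


print(min_tenant_memory_gb(7.3))  # 15, matching the HNSW row
print(min_tenant_memory_gb(2.1))  # 6, matching the HNSW_SQ row
```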
Recall rate optimization:
You can improve the recall rate by increasing the number of vector calculations. However, this will reduce query performance. You can set the parameters to the values in the following table for different TopN values. If you need to further improve the recall rate, you can increase the parameter values.
Notice
The recall rate is directly related to the data characteristics. The following table provides the parameter values for a 768-dimensional standard dataset to achieve a recall rate of approximately 0.95.
| TopN | ef_search | refine_k (only for HNSW_BQ) |
|---|---|---|
| Top10 | 64 | 4 |
| Top100 | 240 | 4 |
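To check whether a parameter setting reaches the target on your own data, recall for a TopN query is commonly computed as the share of the exact nearest neighbors that the index returns. The sketch below assumes you already have the exact results (for example, from a query that does not use the vector index). The function name and sample IDs are illustrative.

```python
def recall_at_k(retrieved_ids, ground_truth_ids, k: int) -> float:
    """Fraction of the exact top-k neighbors that appear in the retrieved top-k."""
    truth = set(ground_truth_ids[:k])
    if not truth:
        raise ValueError("ground truth must contain at least one ID")
    found = set(retrieved_ids[:k]) & truth
    return len(found) / len(truth)


exact = list(range(1, 11))                     # exact Top10 row IDs
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11]       # Top10 returned by the index
print(recall_at_k(approx, exact, 10))          # 0.9
```

Averaging this value over a representative set of queries gives a figure comparable to the recall rates quoted in this topic.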
Maximum recall rate:
The maximum recall rate varies by index algorithm. The figures below were measured with ef_search set to 1000. At this setting, the recall rate may improve, but QPS drops to about one-third of the QPS at a recall rate of 0.95. Increasing ef_search does not improve the recall rate for every index type: only HNSW indexes can achieve a recall rate of more than 0.99. For HNSW_BQ indexes, you can increase refine_k to further improve the recall rate, but this further reduces query performance.
Maximum recall rate (ef_search = 1000):
- HNSW index: Recall rate of 0.991
- HNSW_SQ index: Recall rate of 0.9786
- HNSW_BQ index: Recall rate of 0.9897 (ef_search = 1000, refine_k = 10)
Details for 10 million data points
The following table describes the memory usage and recall rate optimization for 10 million 768-dimensional vectors (using the parameters in the preceding table).
Memory usage:
| Index type | Recommended tenant memory | Description |
|---|---|---|
| HNSW | 160 GB | The recommended size of the vector index memory is 76.3 GB. |
| HNSW_SQ | 48 GB | The recommended size of the vector index memory is 22.6 GB. |
| HNSW_BQ | 48 GB | HNSW_BQ indexes require high-precision vectors during construction. HNSW_BQ indexes use HNSW_SQ indexes as cache during construction. Therefore, for non-partitioned tables, the memory usage of HNSW_BQ indexes is the same as that of HNSW_SQ indexes. After the index is built, HNSW_BQ indexes only occupy approximately 5.4 GB of memory. |
Recall rate optimization:
You can improve the recall rate by increasing the number of vector calculations. However, this will reduce query performance. You can set the parameters to the values in the following table for different TopN values. If you need to further improve the recall rate, you can increase the parameter values.
Notice
The recall rate is directly related to the data characteristics. The following table provides the parameter values for a 768-dimensional standard dataset to achieve a recall rate of approximately 0.95.
| TopN | ef_search | refine_k (only for HNSW_BQ) |
|---|---|---|
| Top10 | 100 | 4 |
| Top100 (HNSW/HNSW_SQ) | 350 | - |
| Top100 (HNSW_BQ) | 1000 | 10 |
IVF series
| Scenario | Index Type | Parameter Recommendations |
|---|---|---|
| Low-dimensional (384 dimensions or fewer) | IVF | For 10 million data records: use a partitioned table with nlist=3000. For 100 million data records: use a partitioned table with nlist=3000. |
| Low cost (minimal memory usage) | IVF_PQ | For 1 million data records: nlist=1000, m=vector dimension/2. For 10 million data records: nlist=3000, m=vector dimension/2. For 100 million data records: use a partitioned table with nlist=3000, m=vector dimension/2. |
To balance the number of cluster centers and the amount of data per center, we recommend setting nlist to the square root of the data volume. For example, for 10 million data records, we recommend setting nlist to approximately 3000. When using IVF_PQ, we recommend setting the m parameter to half of the vector dimension (dim).
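The two rules of thumb above can be sketched as follows. The function names are illustrative, and rejecting odd dimensions is an assumption made here so that m is exactly half of the dimension.

```python
import math


def suggested_nlist(row_count: int) -> int:
    """Integer square root of the number of vectors in the index."""
    if row_count < 1:
        raise ValueError("row_count must be at least 1")
    return math.isqrt(row_count)


def suggested_pq_m(dim: int) -> int:
    """Half of the vector dimension, for IVF_PQ."""
    if dim < 2 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    return dim // 2


print(suggested_nlist(1_000_000))   # 1000
print(suggested_nlist(10_000_000))  # 3162, rounded to about 3000 in the table
print(suggested_pq_m(768))          # 384
```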
Notice
IVF_PQ is a high-compression quantization algorithm. Its recall rate may be relatively low for low-dimensional vectors. To avoid performance loss, we recommend using IVF_PQ for vectors with 128 dimensions or more.
Notice
For scenarios where the data volume is expected to grow, we recommend setting the parameters based on the final data volume.
10 million data records (with partitioned tables)
For 10 million 768-dimensional vector data records (using the parameters in the table above), we provide additional details on memory usage and recall optimization:
Memory usage:
| Index Type | Index Parameters | Memory Usage (Build Time/Resident Time) |
|---|---|---|
| IVF | distance=l2, nlist=3000 | 2.7 GB / 10.5 MB |
| IVF_PQ | distance=l2, nlist=3000, m=384 | 4.0 GB / 1.3 GB |
| IVF_PQ | distance=cosine, nlist=3000, m=384 | 2.7 GB / 11.4 MB |
In the table, Build Time refers to the temporary memory used during index creation, which is released after the index is built. Resident Time refers to the memory continuously occupied by the IVF vector index after the index is built.
For IVF_PQ indexes, if you choose distance = l2, more resident memory will be occupied due to the need to store additional precomputed results. In contrast, using distance = inner_product or cosine consumes less resident memory. Therefore, in practical applications, we recommend prioritizing inner_product or cosine as the distance type to optimize memory resources.
Recall optimization:
By adjusting the nprobes parameter, you can increase the number of vector calculations to improve recall, but this will reduce query performance. You can set the parameter to the recommended values in the table below for different TopN scenarios. If you need to further improve recall, you can set the parameter value to a larger value.
Notice
Recall is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where recall reaches approximately 0.95.
| TopN | nprobes |
|---|---|
| Top10 | 1 |
| Top100 | 20 |
100 million data records (with partitioned tables)
When the vector data volume reaches 100 million or more, we strongly recommend using partitioned tables with IVF-type indexes. As the data scale and nlist parameter increase, the query overhead of a single IVF index will significantly increase. By splitting the data into multiple partitions and building smaller-scale IVF indexes for each partition, you can effectively reduce the query load and further improve overall query performance and recall through parallel queries across partitions.
In a multi-partitioned table scenario, since IVF indexes are local indexes, each partition will independently build its own IVF index. Therefore, we recommend calculating the nlist value based on the average data volume per partition. For example, for 100 million 768-dimensional vectors split into 10 partitions, each partition contains approximately 10 million data records, and we recommend setting nlist to sqrt(10 million) = 3162.
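A minimal sketch of the per-partition calculation, assuming rows are distributed evenly across partitions. The function name is illustrative.

```python
import math


def nlist_per_partition(total_rows: int, partitions: int) -> int:
    """Square-root rule applied to the average number of rows per partition."""
    if total_rows < 1 or partitions < 1:
        raise ValueError("total_rows and partitions must be at least 1")
    rows_per_partition = -(-total_rows // partitions)  # ceiling division
    return math.isqrt(rows_per_partition)


# 100 million vectors in 10 partitions: about 10 million rows per partition.
print(nlist_per_partition(100_000_000, 10))  # 3162
```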
For 100 million 768-dimensional vector data records (using the parameters in the table above), we provide additional details on memory usage and recall optimization:
Memory usage:
In a multi-partitioned scenario, since each partition independently builds and maintains its own IVF index, the total resident memory usage must be summed across all partitions. For example, if a single IVF index has a resident memory usage of 10.5 MB and there are 10 partitions, the total resident memory usage would be approximately 10.5 × 10 = 105 MB.
| Index Type | Index Parameters | Memory Usage (Build Time/Resident Time) |
|---|---|---|
| IVF | distance=l2, nlist=10000 | 2.7 GB / 13.1 MB × 10 |
| IVF_PQ | distance=l2, nlist=10000, m=384 | 4.0 GB / 1.6 GB × 10 |
| IVF_PQ | distance=cosine, nlist=10000, m=384 | 2.7 GB / 14.2 MB × 10 |
In the table, Build Time refers to the temporary memory used during index creation, which is released after the index is built. Resident Time refers to the memory continuously occupied by the IVF vector index after the index is built.
Recall optimization:
By adjusting the nprobes parameter, you can increase the number of vector calculations to improve recall, but this will reduce query performance. You can set the parameter to the recommended values in the table below for different TopN scenarios. If you need to further improve recall, you can set the parameter value to a larger value.
In a partitioned table scenario, since each partition independently executes an IVF index query, when a query involves multiple partitions, the system will retrieve TopN results from each partition and then aggregate and re-sort all the results. This not only improves the overall search accuracy but also typically results in a higher actual recall rate compared to a single-partition scenario. Therefore, in a multi-partitioned table, you can appropriately reduce the nprobes parameter to achieve a recall rate comparable to that of a single-partitioned table.
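The aggregate-and-re-sort step can be illustrated with a small sketch. It is not the database's internal implementation. It only shows the idea of merging each partition's local TopN list into a global TopN by distance, and the data and function name are hypothetical.

```python
import heapq


def merge_top_n(per_partition_results, n: int):
    """Merge local TopN lists of (distance, row_id) into a global TopN."""
    all_candidates = (
        candidate
        for partition in per_partition_results
        for candidate in partition
    )
    # Smaller distance means a closer match.
    return heapq.nsmallest(n, all_candidates)


# Each inner list is one partition's local Top3 as (distance, row_id).
partition_a = [(0.10, 7), (0.30, 2), (0.55, 9)]
partition_b = [(0.05, 41), (0.20, 38), (0.90, 30)]
print(merge_top_n([partition_a, partition_b], 3))
# [(0.05, 41), (0.1, 7), (0.2, 38)]
```

Because each partition contributes its own TopN candidates, the merged pool is larger than N, which is consistent with the higher recall described above.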
Notice
Recall is directly related to data characteristics. The values in the table below are recommended for a standard dataset with 768 dimensions, where recall reaches approximately 0.95.
| TopN | nprobes |
|---|---|
| Top10 | 1 |
| Top100 | 10 |
References
- For more information about parameters and tuning, see HNSW index and IVF index.
