This topic describes the vector functions supported by OceanBase Database and the considerations for using them.
Considerations
- Vectors with different dimensions are not supported. An error
different vector dimensions %d and %dis returned when you perform an operation on vectors with different dimensions. - An error
value out of range: overflow / underflowis returned when the result overflows or underflows the floating-point number range. - Dense vector indexes support L2, inner product, and cosine distance as index distance algorithms. Sparse vector indexes support inner product and negative inner product as index distance algorithms. For more information, see Create a vector index. The negative inner product algorithm for sparse vector indexes is supported starting from V4.3.5 BP2.
- The
L2_distance,Cosine_distance,Inner_product, andNegative_inner_productfunctions can be used for vector index search.
Configure vector index memory
OceanBase vector search allows you to configure the vector index memory by setting the ob_vector_memory_limit_percentage parameter:
Before V4.3.5 BP3, you need to manually set
ob_vector_memory_limit_percentageto enable vector search. We recommend that you set it to30for optimal query performance. If you use the default value, no memory is allocated for vector indexes, and an error will be returned when you create an index. Here is an example:ALTER SYSTEM SET ob_vector_memory_limit_percentage = 30;Starting from V4.3.5 BP3, the vector search feature is enabled by default. The default value
0indicates adaptive mode, where the system automatically adjusts the memory usage ratio for vector index data within the tenant, so manual adjustment is not required.- When the tenant’s actual memory is 8 GB or less, the value is set to
40. - When the tenant’s actual memory is more than 8 GB, the value is set to
50.
- When the tenant’s actual memory is 8 GB or less, the value is set to
Distance functions
A distance function calculates the distance between two vectors based on the specific distance algorithm used.
L2_distance
Euclidean distance measures the distance between coordinates of the compared vectors -- essentially the straight-line distance between two vectors. It is calculated using the Pythagorean theorem applied to vector coordinates:

The function syntax is as follows:
l2_distance(vector v1, vector v2)
The parameters are described as follows:
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted. Examples of other types include single-level array types like
[1,2,3,..]and string types like'[1,2,3]'.The dimensions of the two parameters must be the same.
If a single-level array type is involved, the elements of this parameter cannot be
NULL.
The return values are described as follows:
The return value is a
distance(double)value.If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t1(c1 vector(3));
INSERT INTO t1 VALUES('[1,2,3]');
SELECT l2_distance(c1, [1,2,3]), l2_distance([1,2,3],[1,1,1]), l2_distance('[1,1,1]','[1,2,3]') FROM t1;
The return result is as follows:
+--------------------------+------------------------------+----------------------------------+
| l2_distance(c1, [1,2,3]) | l2_distance([1,2,3],[1,1,1]) | l2_distance('[1,1,1]','[1,2,3]') |
+--------------------------+------------------------------+----------------------------------+
| 0 | 2.23606797749979 | 2.23606797749979 |
+--------------------------+------------------------------+----------------------------------+
1 row in set
L1_distance
The Manhattan distance measures the sum of absolute distances between two points in a standard coordinate system. The calculation formula is as follows:

The function syntax is as follows:
l1_distance(vector v1, vector v2)
The parameters are described as follows:
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted. Examples of other types include single-level array types like
[1,2,3,..]and string types like'[1,2,3]'.The dimensions of the two parameters must be the same.
If a single-level array type is involved, the elements of this parameter cannot be
NULL.
The return values are described as follows:
The return value is a
distance(double)value.If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t2(c1 vector(3));
INSERT INTO t2 VALUES('[1,2,3]');
INSERT INTO t2 VALUES('[1,1,1]');
SELECT l1_distance(c1, [1,2,3]) FROM t2;
The return result is as follows:
+--------------------------+
| l1_distance(c1, [1,2,3]) |
+--------------------------+
| 0 |
| 3 |
+--------------------------+
2 rows in set
Cosine_distance
Cosine similarity measures the angular difference between two vectors and reflects their directional similarity, regardless of their lengths. The value range is [-1, 1], where 1 indicates the same direction, 0 indicates orthogonality, and -1 indicates opposite directions.
Cosine similarity is calculated by using the following formula:

Cosine distance, a measure of vector distance, is calculated by subtracting the cosine similarity from 1. This is because vectors with a cosine distance closer to 1 are more similar.

The value range of cosine distance is [0, 2], where 0 indicates the same direction (no distance) and 2 indicates opposite directions.
The function syntax is as follows:
cosine_distance(vector v1, vector v2)
The parameters are described as follows:
Apart from the vector type, other types that can be forcibly converted to the vector type are accepted. Examples of other types include single-level array types like
[1,2,3,..]and string types like'[1,2,3]'.The dimensions of the two parameters must be the same.
If a single-level array type is involved, the elements of this parameter cannot be
NULL.
The return values are described as follows:
The return value is a
distance(double)value.If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t3(c1 vector(3));
INSERT INTO t3 VALUES('[1,2,3]');
INSERT INTO t3 VALUES('[1,2,1]');
SELECT cosine_distance(c1, [1,2,3]) FROM t3;
+------------------------------+
| cosine_distance(c1, [1,2,3]) |
+------------------------------+
| 0 |
| 0.12712843905603044 |
+------------------------------+
2 rows in set
Inner_product
The inner product, also known as the dot product or scalar product, is a type of multiplication between two vectors. In geometry, the inner product indicates the direction and magnitude relationship between two vectors. The formula for calculating the inner product is as follows:

The syntax is as follows:
inner_product(vector v1, vector v2)
The parameters are described as follows:
Apart from vector types, other types that can be forcibly converted to vector types are supported. Examples of other types include single-level array types such as
[1,2,3,...]and string types such as'[1,2,3]'.The dimensions of the two parameters must be the same.
If a single-level array type is used as a parameter, the parameter elements cannot contain
NULL.For sparse vectors, one parameter can be a string in the sparse vector format, such as
c2,'{1:2.4}'. However, if one parameter is a string in the sparse vector format, the other parameter must not be a string.
The return values are described as follows:
The return value is a
distance(double)value.If any parameter is
NULL, the return value isNULL.
Here is an example that uses dense vectors:
CREATE TABLE t4(c1 vector(3));
INSERT INTO t4 VALUES('[1,2,3]');
INSERT INTO t4 VALUES('[1,2,1]');
SELECT inner_product(c1, [1,2,3]) FROM t4;
The return result is as follows:
+----------------------------+
| inner_product(c1, [1,2,3]) |
+----------------------------+
| 14 |
| 8 |
+----------------------------+
2 rows in set
Here is an example that uses sparse vectors:
CREATE TABLE t4(c1 INT, c2 SPARSEVECTOR, c3 SPARSEVECTOR);
INSERT INTO t4 VALUES(1, '{1:1.1, 2:2.2}', '{1:2.4}');
INSERT INTO t4 VALUES(2, '{1:1.5, 3:3.6}', '{4:4.5}');
SELECT inner_product(c2,c3) FROM t4;
The return result is as follows:
```shell
+----------------------+
| inner_product(c2,c3) |
+----------------------+
| 2.640000104904175 |
| 0 |
+----------------------+
2 rows in set
Negative_inner_product
The negative_inner_product function calculates the negative inner product of two vectors, by using the following formula:

The syntax is as follows:
negative_inner_product(vector v1, vector v2)
The parameters are described as follows:
Apart from vector types, other types that can be forcibly converted to vector types are supported. Examples of other types include single-level array types such as
[1,2,3,...]and string types such as'[1,2,3]'.The dimensions of the two parameters must be the same.
If a single-level array type is used as a parameter, the parameter elements cannot contain
NULL.For sparse vectors, one parameter can be a string in the sparse vector format, such as
c2,'{1:2.4}'. However, if one parameter is a string, the other parameter must not be a string.
The return values are described as follows:
The return value is a
distance(double)value.If any parameter is
NULL, the return value isNULL.
Here is an example that uses dense vectors:
CREATE TABLE t5(c1 vector(3));
INSERT INTO t5 VALUES('[1,2,3]');
INSERT INTO t5 VALUES('[1,2,1]');
SELECT negative_inner_product(c1, [1,2,3]) FROM t5;
The return result is as follows:
+-------------------------------------+
| negative_inner_product(c1, [1,2,3]) |
+-------------------------------------+
| -14 |
| -8 |
| -14 |
| -8 |
+-------------------------------------+
4 rows in set
Here is an example that uses sparse vectors:
CREATE TABLE t5(c1 INT, c2 SPARSEVECTOR, c3 SPARSEVECTOR);
INSERT INTO t5 VALUES(1, '{1:1.1, 2:2.2}', '{1:2.4}');
INSERT INTO t5 VALUES(2, '{1:1.5, 3:3.6}', '{4:4.5}');
SELECT negative_inner_product(c2,c3) FROM t5;
The return result is as follows:
```shell
+-------------------------------+
| negative_inner_product(c2,c3) |
+-------------------------------+
| -2.640000104904175 |
| 0 |
+-------------------------------+
2 rows in set
Vector_distance
The vector_distance function calculates the distance between two vectors. You can specify parameters to select different distance algorithms.
The syntax is as follows:
vector_distance(vector v1, vector v2 [, string metric])
The vector v1/v2 parameters are described as follows:
Apart from the vector type, other types of parameters can be forcibly converted to the vector type. For example, the parameters can be a one-level array such as
[1,2,3,...], or a string such as'[1,2,3]'.The dimensions of the two parameters must be the same.
If any of the parameters is of the one-level array type, the elements in the parameter cannot be
NULL.
The metric parameter specifies the distance algorithm. Valid values:
If the parameter is not specified, the default algorithm is
euclidean.If the parameter is specified, it can only be one of the following values:
euclidean: the Euclidean distance. This value is equivalent to L2_distance.manhattan: the Manhattan distance. This value is equivalent to L1_distance.cosine: the cosine distance. This value is equivalent to Cosine_distance.dot: the inner product. This value is equivalent to Inner_product.
The return values are described as follows:
The return value is
distance(double), which is a distance value.If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t5(c1 vector(3));
INSERT INTO t5 VALUES('[1,2,3]');
INSERT INTO t5 VALUES('[1,2,1]');
SELECT vector_distance(c1, [1,2,3], euclidean) FROM t5;
The return result is as follows:
+-----------------------------------------+
| vector_distance(c1, [1,2,3], euclidean) |
+-----------------------------------------+
| 0 |
| 2 |
+-----------------------------------------+
2 rows in set
Arithmetic functions
Arithmetic functions provide element-wise addition (+), subtraction (-), and multiplication () operations between vector types, between single-element array types, and between special string types. The operations also support element-wise addition (+), subtraction (-), and multiplication () operations between single-element array types and special string types, as well as between two special string types. The operations are performed element-wise.

The syntax is as follows:
v1 + v2
v1 - v2
v1 * v2
The parameters are described as follows:
Parameters other than vector types can be of any type that can be forcibly converted to a vector type, including single-element array types (such as
[1,2,3,..]) and string-like types (such as'[1,2,3]'). Note: Two string-like parameters are not allowed. At least one parameter must be of a vector or single-element array type.The dimensions of the two parameters must be the same.
If a single-element array type parameter exists, the parameter must not contain
NULLelements.
The return values are described as follows:
If at least one of the two parameters is of a vector type, the return value is of the same vector type as the vector parameter.
If both parameters are of single-element array types, the return value is of the
array(float)type.If any parameter is
NULL, the return value isNULL.
Here is an example:
CREATE TABLE t6(c1 vector(3));
INSERT INTO t6 VALUES('[1,2,3]');
SELECT [1,2,3] + '[1.12,1000.0001, -1.2222]', c1 - [1,2,3] FROM t6;
The return result is as follows:
+---------------------------------------+--------------+
| [1,2,3] + '[1.12,1000.0001, -1.2222]' | c1 - [1,2,3] |
+---------------------------------------+--------------+
| [2.12,1002,1.7778] | [0,0,0] |
+---------------------------------------+--------------+
1 row in set
Comparison functions
Comparison functions provide comparison calculations for vector types, single-level array types, and special string types. The supported comparison operators include =, !=, >, <, >=, and <=. The comparison is performed in dictionary order element by element.
The syntax is as follows:
v1 = v2
v1 != v2
v1 > v2
v1 < v2
v1 >= v2
v1 <= v2
The parameters are described as follows:
Parameters other than vector types can be of any type that can be forcibly converted to a vector type, including single-level array types (such as
[1,2,3,..]) and string literal types (such as'[1,2,3]').Notice
One of the two parameters must be a vector type.
The dimensions of the two parameters must be the same.
If a single-level array type parameter exists,
NULLvalues are not allowed in its elements.
The return values are described as follows:
The return value is of the bool type.
If any parameter is
NULL, the return value isNULL.
Example:
CREATE TABLE t7(c1 vector(3));
INSERT INTO t7 VALUES('[1,2,3]');
SELECT c1 = '[1,2,3]' FROM t7;
The return result is as follows:
+----------------+
| c1 = '[1,2,3]' |
+----------------+
| 1 |
+----------------+
1 row in set
Aggregate functions
Notice
Vector columns cannot be used as GROUP BY conditions, and the DISTINCT keyword is not supported.
Sum
The sum function calculates the sum of vectors in a vector column of a table. The sums of corresponding elements in the vectors are calculated to obtain the sum vector.
The following example shows the syntax:
sum(vector v1)
The following table describes the parameters.
- The vector type is supported.
The following table describes the return values.
- The return value is in the
sum (vector)format.
Example:
CREATE TABLE t8(c1 vector(3));
INSERT INTO t8 VALUES('[1,2,3]');
SELECT sum(c1) FROM t8;
The return result is as follows:
+---------+
| sum(c1) |
+---------+
| [1,2,3] |
+---------+
1 row in set
Avg
The avg function calculates the average of vectors in a vector column of a table.
The following example shows the syntax:
avg(vector v1)
The following table describes the parameters.
- The vector type is supported.
The following table describes the return values.
The return value is in the
avg (vector)format.NULL values in the vector column are not counted.
When no input parameters are provided, the output is NULL.
Example:
CREATE TABLE t9(c1 vector(3));
INSERT INTO t9 VALUES('[1,2,3]');
SELECT avg(c1) FROM t9;
The return result is as follows:
+---------+
| avg(c1) |
+---------+
| [1,2,3] |
+---------+
1 row in set
Other common vector functions
Vector_norm
The Vector_norm function calculates the Euclidean norm (or length) of a vector, which represents the Euclidean distance between the vector and the origin. The calculation formula is as follows:

The syntax is as follows:
vector_norm(vector v1)
The parameters are described as follows:
Apart from the vector type, other types that can be forcibly converted to the vector type are allowed. For example, single-level array types such as
[1,2,3,...]and string types such as'[1,2,3]'are allowed.When a single-level array type is used as a parameter, the elements of this parameter cannot be
NULL.
The return values are described as follows:
The return value is
norm(double), which indicates the modulus.If the parameter is
NULL, the return value is alsoNULL.
Here is an example:
CREATE TABLE t10(c1 vector(3));
INSERT INTO t10 VALUES('[1,2,3]');
SELECT vector_norm(c1),vector_norm([1,2,3]) FROM t10;
The return result is as follows:
+--------------------+----------------------+
| vector_norm(c1) | vector_norm([1,2,3]) |
+--------------------+----------------------+
| 3.7416573867739413 | 3.7416573867739413 |
+--------------------+----------------------+
1 row in set
Vector_dims
The Vector_dims function returns the dimension of a vector.
The syntax is as follows:
vector_dims(vector v1)
The parameters are described as follows:
- Apart from the vector type, other types that can be forcibly converted to the vector type are allowed. For example, single-level array types such as
[1,2,3,...]and string types such as'[1,2,3]'are allowed.
The return values are described as follows:
The return value is
dims(int64), which indicates the dimension.If the parameter is
NULL, an error is returned.
Here is an example:
CREATE TABLE t11(c1 vector(3));
INSERT INTO t11 VALUES('[1,2,3]');
INSERT INTO t11 VALUES('[1,1,1]');
SELECT vector_dims(c1), vector_dims('[1,2,3]') FROM t11;
The return result is as follows:
+-----------------+------------------------+
| vector_dims(c1) | vector_dims('[1,2,3]') |
+-----------------+------------------------+
| 3 | 3 |
| 3 | 3 |
+-----------------+------------------------+
2 rows in set