Use SQL functions|V4.3.3|OceanBase Database| docs|Distributed Database

Use SQL functions

Last Updated：2025-11-27 06:23:27 Updated

This topic describes the vector functions supported by OceanBase Database and the considerations for using them.

Considerations

Vectors in different dimensions are not supported. An error different vector dimensions %d and %d will be returned when you perform operations on vectors in different dimensions.
An error value out of range: overflow / underflow will be returned when the result is out of the floating-point number range.
Vector indexes support L2 distance and inner product distance calculations. For more information, see Create a vector index.

Preparations

Before you use vector indexes, you must enable vector-based indexes by setting ob_vector_memory_limit_percentage.

ALTER SYSTEM SET ob_vector_memory_limit_percentage = 30;

Distance function

A distance function calculates the distance between two vectors based on the specific distance algorithm used.

L2_distance

The Euclidean distance measures the distance between coordinates of two vectors -- essentially the straight-line distance between two vectors. It is calculated using the Pythagorean theorem applied to vector coordinates:

Pythagorean theorem

The syntax is as follows:

l2_distance(vector v1, vector v2)

The parameters are described as follows:

Parameters can be of any type that can be forcibly converted to a vector type, including single-level array types (such as [1,2,3,..]) and string types (such as '[1,2,3]'), except for the vector type.
The dimensions of the two parameters must be the same.
If a single-level array type is used as a parameter, the parameter elements must not contain NULL.

The return value is described as follows:

The return value is a distance(double) value.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t1(c1 vector(3));
INSERT INTO t1 VALUES('[1,2,3]');
SELECT l2_distance(c1, [1,2,3]), l2_distance([1,2,3],[1,1,1]), l2_distance('[1,1,1]','[1,2,3]') FROM t1;

The return result is as follows:

+--------------------------+------------------------------+----------------------------------+
| l2_distance(c1, [1,2,3]) | l2_distance([1,2,3],[1,1,1]) | l2_distance('[1,1,1]','[1,2,3]') |
+--------------------------+------------------------------+----------------------------------+
|                        0 |             2.23606797749979 |                 2.23606797749979 |
+--------------------------+------------------------------+----------------------------------+
1 row in set

L1_distance

The Manhattan distance measures the sum of absolute differences of coordinates between two points in a standard coordinate system. The formula is as follows:

Manhattan distance

The syntax is as follows:

l1_distance(vector v1, vector v2)

The parameters are described as follows:

Parameters can be of any type that can be forcibly converted to a vector type, including single-level array types (such as [1,2,3,..]) and string types (such as '[1,2,3]'), except for the vector type.
The dimensions of the two parameters must be the same.
If a single-level array type is used as a parameter, the parameter elements must not contain NULL.

The return value is described as follows:

The return value is a distance(double) value.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t2(c1 vector(3));
INSERT INTO t2 VALUES('[1,2,3]');
INSERT INTO t2 VALUES('[1,1,1]');
SELECT l1_distance(c1, [1,2,3]) FROM t2;

The return result is as follows:

+--------------------------+
| l1_distance(c1, [1,2,3]) |
+--------------------------+
|                        0 |
|                        3 |
+--------------------------+
2 rows in set

Cosine_distance

The cosine similarity measures the angle difference between two vectors. It measures the directional similarity between two vectors regardless of their length (magnitude). The cosine similarity ranges from [-1, 1]. A value of 1 indicates that the two vectors are in the same direction. A value of 0 indicates that the two vectors are orthogonal. A value of -1 indicates that the two vectors are in opposite directions.

The cosine similarity is calculated using the following formula:

Cosine similarity

The cosine distance, which measures the distance between two vectors, is calculated by subtracting the cosine similarity from 1. The formula is as follows:

Cosine distance

The cosine distance ranges from [0, 2]. A value of 0 indicates that the two vectors are in the same direction and there is no distance between them. A value of 2 indicates that the two vectors are in opposite directions.

The syntax is as follows:

cosine_distance(vector v1, vector v2)

The parameters are described as follows:

Parameters can be of any type that can be forcibly converted to a vector type, including single-level array types (such as [1,2,3,..]) and string types (such as '[1,2,3]'), except for the vector type.
The dimensions of the two parameters must be the same.
If a single-level array type is used as a parameter, the parameter elements must not contain NULL.

The return value is described as follows:

The return value is a distance(double) value.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t3(c1 vector(3));
INSERT INTO t3 VALUES('[1,2,3]');
INSERT INTO t3 VALUES('[1,2,1]');
SELECT cosine_distance(c1, [1,2,3]) FROM t3;

The return result is as follows:

+------------------------------+
| cosine_distance(c1, [1,2,3]) |
+------------------------------+
|                            0 |
|          0.12712843905603044 |
+------------------------------+
2 rows in set

Inner_product

The inner product, also known as the dot product or scalar product, is a type of multiplication between two vectors. In geometry, the inner product indicates the directional and magnitude relationships between two vectors. The formula for calculating the inner product is as follows:

Inner product

The syntax is as follows:

inner_product(vector v1, vector v2)

The parameters are described as follows:

Apart from vector types, other types that can be forcibly converted to vector types are supported. Examples of other types include single-level array types like [1,2,3,...] and string types like '[1,2,3]'.
The dimensions of the two parameters must be the same.
If the parameter contains single-level array types, NULL values are not allowed in the elements of this parameter.

The return value is described as follows:

The return value is a distance(double) value.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t4(c1 vector(3));
INSERT INTO t4 VALUES('[1,2,3]');
INSERT INTO t4 VALUES('[1,2,1]');
SELECT inner_product(c1, [1,2,3]) FROM t4;

The return result is as follows:

+----------------------------+
| inner_product(c1, [1,2,3]) |
+----------------------------+
|                         14 |
|                          8 |
+----------------------------+
2 rows in set

Negative_inner_product

The negative_inner_product function calculates the negative inner product between two vectors by using the following formula:

Negative inner product

The syntax is as follows:

negative_inner_product(vector v1, vector v2)

The parameters are described as follows:

Apart from vector types, other types that can be forcibly converted to vector types are supported. Examples of other types include single-level array types like [1,2,3,...] and string types like '[1,2,3]'.
The dimensions of the two parameters must be the same.
If the parameter contains single-level array types, NULL values are not allowed in the elements of this parameter.

The return value is described as follows:

The return value is a distance(double) value.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t4(c1 vector(3));
INSERT INTO t4 VALUES('[1,2,3]');
INSERT INTO t4 VALUES('[1,2,1]');
SELECT negative_inner_product(c1, [1,2,3]) FROM t4;

The return result is as follows:

+-------------------------------------+
| negative_inner_product(c1, [1,2,3]) |
+-------------------------------------+
|                                 -14 |
|                                  -8 |
|                                 -14 |
|                                  -8 |
+-------------------------------------+
4 rows in set

Vector_distance

The vector_distance function calculates the distance between two vectors by using different distance algorithms based on the specified parameters.

The syntax is as follows:

vector_distance(vector v1, vector v2 [, string metric])

The parameters are described as follows:

Apart from vector types, other types that can be forcibly converted to vector types are supported. Examples of other types include single-level array types like [1,2,3,...] and string types like '[1,2,3]'.
The dimensions of the two parameters must be the same.
If the parameter contains single-level array types, NULL values are not allowed in the elements of this parameter.

The metric parameter specifies the distance algorithm. Valid values:

If the parameter is not specified, the default algorithm is euclidean.
If the parameter is specified, the value must be one of the following:
- euclidean: the Euclidean distance, which is the same as L2_distance.
- manhattan: the Manhattan distance, which is the same as L1_distance.
- cosine: the cosine distance, which is the same as Cosine_distance.
- dot: the inner product, which is the same as Inner_product.

The return value is described as follows:

The return value is a distance(double) value.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t5(c1 vector(3));
INSERT INTO t5 VALUES('[1,2,3]');
INSERT INTO t5 VALUES('[1,2,1]');
SELECT vector_distance(c1, [1,2,3], euclidean) FROM t5;

The return result is as follows:

+-----------------------------------------+
| vector_distance(c1, [1,2,3], euclidean) |
+-----------------------------------------+
|                                       0 |
|                                       2 |
+-----------------------------------------+
2 rows in set

Arithmetic functions

Arithmetic functions provide element-wise addition (+), subtraction (-), and multiplication (*) operations between vector types, between single-level array types, and between special string types. The operation results are as follows:

Addition

The syntax is as follows:

v1 + v2
v1 - v2
v1 * v2

The parameters are described as follows:

Except for vector types, parameters can be of any type that can be forcibly converted to a vector type, including single-level array types (such as [1,2,3,..]) and string-like types (such as '[1,2,3]'). Note: the two parameters cannot both be string-like types; one of them must be a vector or single-level array type.
The dimensions of the two parameters must be the same.
If a single-level array type parameter exists, the elements of this parameter must not be NULL.

The return values are described as follows:

If at least one of the two parameters is a vector type, the return value is a vector of the same type as the vector parameter.
If the two parameters are both single-level array types, the return value is of the array(float) type.
If any parameter is NULL, the return value is NULL.

Here is an example:

CREATE TABLE t6(c1 vector(3));
INSERT INTO t6 VALUES('[1,2,3]');
SELECT [1,2,3] + '[1.12,1000.0001, -1.2222]', c1 - [1,2,3] FROM t6;

The return result is as follows:

+---------------------------------------+--------------+
| [1,2,3] + '[1.12,1000.0001, -1.2222]' | c1 - [1,2,3] |
+---------------------------------------+--------------+
| [2.12,1002,1.7778]                    | [0,0,0]      |
+---------------------------------------+--------------+
1 row in set

Comparison functions

Comparison functions provide comparison between vector types, single-level array types, and special string types. It supports comparison operators such as =, !=, >, <, >=, and <=. The comparison is performed by comparing the elements in the order of dictionaries.

The syntax is as follows:

v1 = v2
v1 != v2
v1 > v2
v1 < v2
v1 >= v2
v1 <= v2

The parameters are described as follows:

Apart from vector types, other types that can be forcibly converted to vector types are allowed. This includes single-level array types, such as [1,2,3,...], and string literal types, such as '[1,2,3]'.

Notice

One of the two parameters must be a vector type.
The dimensions of the two parameters must be the same.
If one of the two parameters is a single-level array type, NULL values are not allowed in the elements of this parameter.

The return value is described as follows:

The return value is of the bool type.
If any parameter is NULL, the return value is NULL.

Example:

CREATE TABLE t7(c1 vector(3));
INSERT INTO t7 VALUES('[1,2,3]');
SELECT c1 = '[1,2,3]' FROM t7;

The return result is as follows:

+----------------+
| c1 = '[1,2,3]' |
+----------------+
|              1 |
+----------------+
1 row in set

Aggregate functions

Notice

Vector columns cannot be used as GROUP BY conditions, and the DISTINCT keyword is not supported.

Sum

The sum function calculates the sum of vector columns in a table. It performs element-by-element addition to obtain the sum vector.

The syntax is as follows:

sum(vector v1)

The parameters are described as follows:

Only vectors are supported.

The return values are described as follows:

The return value is sum (vector).
NULL columns are not included in the summation result.

Here is an example:

CREATE TABLE t8(c1 vector(3));
INSERT INTO t8 VALUES('[1,2,3]');
SELECT sum(c1) FROM t8;

The return result is as follows:

+---------+
| sum(c1) |
+---------+
| [1,2,3] |
+---------+
1 row in set

Avg

The avg function calculates the average of vector columns in a table.

The syntax is as follows:

avg(vector v1)

The parameters are described as follows:

Only vectors are supported.

The return values are described as follows:

The return value is avg (vector).
NULL columns are not included in the sum or count processes.
If the input parameter is empty, the output is NULL.

Here is an example:

CREATE TABLE t9(c1 vector(3));
INSERT INTO t9 VALUES('[1,2,3]');
SELECT avg(c1) FROM t9;

The return result is as follows:

+---------+
| avg(c1) |
+---------+
| [1,2,3] |
+---------+
1 row in set

Other commonly used vector functions

Vector_norm

The Vector_norm function is used to calculate the Euclidean norm (magnitude) of a vector, which represents the Euclidean distance between the vector and the origin. The calculation method is as follows:

Euclidean norm

The syntax is as follows:

vector_norm(vector v1)

The parameters are described as follows:

Apart from the vector type, other types that can be forcibly converted to the vector type are acceptable. These include single-level array types such as [1,2,3,...] and string types such as '[1,2,3]'.
When a single-level array type parameter is involved, its elements must not be NULL.

The return values are described as follows:

Returns norm(double).
Returns NULL when the parameter is NULL.

Here is an example:

CREATE TABLE t10(c1 vector(3));
INSERT INTO t10 VALUES('[1,2,3]');
SELECT vector_norm(c1),vector_norm([1,2,3]) FROM t10;

The return result is as follows:

+--------------------+----------------------+
| vector_norm(c1)    | vector_norm([1,2,3]) |
+--------------------+----------------------+
| 3.7416573867739413 |   3.7416573867739413 |
+--------------------+----------------------+
1 row in set

Vector_dims

The Vector_dims function is used to return the dimension of a vector.

The syntax is as follows:

vector_dims(vector v1)

The parameters are described as follows:

Apart from the vector type, other types that can be forcibly converted to the vector type are acceptable. These include single-level array types such as [1,2,3,...] and string types such as '[1,2,3]'.

The return values are described as follows:

Returns dims(int64).
Returns an error when the parameter is NULL.

Here is an example:

CREATE TABLE t11(c1 vector(3));
INSERT INTO t11 VALUES('[1,2,3]');
INSERT INTO t11 VALUES('[1,1,1]');
SELECT vector_dims(c1), vector_dims('[1,2,3]') FROM t11;

The return result is as follows:

+-----------------+------------------------+
| vector_dims(c1) | vector_dims('[1,2,3]') |
+-----------------+------------------------+
|               3 |                      3 |
|               3 |                      3 |
+-----------------+------------------------+
2 rows in set

OceanBase

Customer Stories

Documentation

Use SQL functions

Considerations

Preparations

Distance function

L2_distance

L1_distance

Cosine_distance

Inner_product

Negative_inner_product

Vector_distance

Arithmetic functions

Comparison functions

Notice

Aggregate functions

Notice

Sum

Avg

Other commonly used vector functions

Vector_norm

Vector_dims