This topic describes vector embedding in vector search.
What is vector embedding?
Vector embedding is a technique that converts unstructured data into numerical vectors. These vectors capture the semantic information of the unstructured data, allowing computers to "understand" and process the meaning of the unstructured data. Specifically:
- Vector embedding maps unstructured data such as text, images, or audio/video to points in a high-dimensional vector space.
- In this vector space, semantically similar unstructured data are mapped to nearby positions.
- Vectors are typically composed of hundreds of numbers (e.g., 512 dimensions, 1024 dimensions, etc.).
- The similarity between vectors can be calculated using mathematical methods such as cosine similarity (a short sketch follows this list).
- Common vector embedding models include Word2Vec, BERT, and BGE. For example, when developing a RAG application, you usually need to embed the text data into vectors and store them in a vector database, while other structured data is stored in a relational database.
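As an illustration of the cosine similarity mentioned in the list above, the following is a minimal sketch using NumPy. The vectors here are made-up examples rather than the output of any real embedding model:
import numpy as np
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product divided by the product of the vector norms."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
# Made-up 4-dimensional vectors for illustration only
v1 = np.array([0.12, -0.05, 0.33, 0.48])
v2 = np.array([0.10, -0.02, 0.30, 0.50])
print(cosine_similarity(v1, v2))
# A value close to 1.0 indicates high semantic similarity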
Starting from OceanBase Database V4.3.3, you can store vector data as a data type in a relational table. This allows vectors and traditional scalar data to be stored in OceanBase Database in an organized and efficient manner.
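As a hedged illustration of how vector data and scalar data can live in the same relational table, the following minimal sketch assumes a MySQL-protocol connection through the pymysql package and the VECTOR column type; the connection parameters, table name, column names, and vector dimension are hypothetical, and the exact syntax may vary between OceanBase Database versions:
import pymysql
# Hypothetical connection parameters; adjust them to your OceanBase Database deployment
conn = pymysql.connect(host="127.0.0.1", port=2881, user="root", password="", database="test")
with conn.cursor() as cur:
    # A table that mixes scalar columns with a vector column (assumed to be 3-dimensional here)
    cur.execute("CREATE TABLE IF NOT EXISTS docs (id INT PRIMARY KEY, content VARCHAR(1024), embedding VECTOR(3))")
    # Vector values are written as bracketed string literals
    cur.execute("INSERT INTO docs VALUES (1, 'hello world', '[0.12, -0.05, 0.33]')")
conn.commit()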
Generate vector embeddings in OceanBase Database by using AI Function Service
OceanBase Database supports generating vector embeddings by using AI Function Service. You do not need to install any dependencies. You only need to register the model information. For more information, see AI Function Service syntax and examples.
Common text embedding methods
This section describes text embedding methods.
Prerequisites
Make sure pip is installed in your environment so that you can install the Python packages used in the following examples.
Use an offline, local pre-trained embedding model
Using a pre-trained model for local text embedding is the most flexible approach, but it requires significant computational resources. Commonly used options include the following.
Use Sentence Transformers
Sentence Transformers are models designed for natural language processing (NLP) tasks that convert sentences or paragraphs into vector embeddings. They are based on deep learning techniques, particularly the Transformer architecture, which effectively captures the semantic information of text. If the sentence-transformers package is not installed in your Python environment, install it with pip install sentence-transformers. Because directly accessing the Hugging Face domain from China may time out, set the Hugging Face mirror address before proceeding: export HF_ENDPOINT=https://hf-mirror.com. After setting this, execute the following code:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("BAAI/bge-m3")
sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)
print(embeddings)
# [[-0.01178016 0.00884024 -0.05844684 ... 0.00750248 -0.04790139
# 0.00330675]
# [-0.03470375 -0.00886354 -0.05242309 ... 0.00899352 -0.02396279
# 0.02985837]
# [-0.01356584 0.01900942 -0.05800966 ... 0.00523864 -0.05689549
# 0.00077098]
# [-0.02149693 0.02998871 -0.05638731 ... 0.01443702 -0.02131325
# -0.00112451]]
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])
Use Hugging Face Transformers
Hugging Face Transformers is an open-source library that provides a wide range of pre-trained deep learning models, especially for natural language processing (NLP) tasks. If needed, install it together with PyTorch by running pip install transformers torch. Because directly accessing the Hugging Face domain may time out in some regions, set the Hugging Face mirror address before proceeding: export HF_ENDPOINT=https://hf-mirror.com. After setting this, execute the following code:
from transformers import AutoTokenizer, AutoModel
import torch
# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-m3")
model = AutoModel.from_pretrained("BAAI/bge-m3")
# Prepare input
texts = ["This is an example text"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
# Generate embeddings
with torch.no_grad():
    outputs = model(**inputs)
embeddings = outputs.last_hidden_state[:, 0]  # Use the output of the [CLS] token
print(embeddings)
# tensor([[-1.4136, 0.7477, -0.9914, ..., 0.0937, -0.0362, -0.1650]])
print(embeddings.shape)
# torch.Size([1, 1024])
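For similarity search, these [CLS] embeddings are usually L2-normalized so that the dot product of two vectors equals their cosine similarity. A minimal sketch that continues from the code above:
import torch.nn.functional as F
# L2-normalize each embedding along the feature dimension
normalized_embeddings = F.normalize(embeddings, p=2, dim=1)
print(normalized_embeddings.shape)
# torch.Size([1, 1024])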
Ollama
Ollama is an open-source tool that allows users to easily run, manage, and use various large language models locally. In addition to supporting open-source language models like Llama 3 and Mistral, it also supports embedding models like bge-m3.
Deploy Ollama
On macOS and Windows, you can download the installation package from Ollama's official website and install it by following the instructions there. After installation, Ollama runs as a service in the background.
On Linux, install Ollama:
curl -fsSL https://ollama.ai/install.sh | sh
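After the service starts, you can optionally verify that it is reachable. The following minimal sketch assumes the default local endpoint http://localhost:11434 and queries the /api/tags endpoint, which lists the models available locally:
import requests
# List the locally available models to confirm that the Ollama service is running
response = requests.get('http://localhost:11434/api/tags')
print(response.json())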
Pull the embedding model
Ollama supports using the bge-m3 model for text embedding:
ollama pull bge-m3
Use Ollama for text embedding
You can use Ollama's embedding capabilities through its HTTP API or Python SDK:
HTTP API
import requests

def get_embedding(text: str) -> list:
    """Get text embeddings using Ollama's HTTP API"""
    response = requests.post(
        'http://localhost:11434/api/embeddings',
        json={
            'model': 'bge-m3',
            'prompt': text
        }
    )
    return response.json()['embedding']

# Example usage
text = "This is an example text"
embedding = get_embedding(text)
print(embedding)
# [-1.4269912242889404, 0.9092104434967041, ...]
Python SDK
First, install the Ollama Python SDK:
pip install ollama
Then, you can use it as follows:
import ollama

# Example usage
texts = ["First sentence", "Second sentence"]
embeddings = ollama.embed(model="bge-m3", input=texts)['embeddings']
print(embeddings)
# [[0.03486196, 0.0625187, ...], [...]]
Advantages and Limitations of Ollama
Advantages:
- Fully local deployment, no need for internet connection
- Open-source and free, no API key required
- Supports multiple models, easy to switch and compare
- Relatively low resource consumption
Limitations:
- Limited selection of embedding models
- Performance may not match commercial services
- Requires self-maintenance and updates
- Lacks enterprise-level support
When deciding whether to use Ollama, consider these factors. If your application requires high privacy or complete offline operation, Ollama is a good choice. However, if you need more stable service quality and better performance, commercial services may be more suitable.
Use online remote embedding services
Running an offline, local embedding model typically requires a higher-spec deployment machine and more effort to manage model loading and unloading, so many users prefer online embedding services. As a result, many AI inference service providers now offer text embedding services. For example, to use the Qwen text embedding service, you can register an account on Alibaba Cloud Model Studio (Bailian), obtain an API key, and then call its public API to get the text embedding results.
HTTP call
After obtaining the API Key, you can use the following code to perform text embedding. If the requests package is not installed in your Python environment, you need to install it using pip install requests to send network requests.
import requests
from typing import List
class RemoteEmbedding():
    """
    OpenAI compatible embedding API. Tongyi, Baichuan, Doubao, etc.
    """

    def __init__(
        self,
        base_url: str,
        api_key: str,
        model: str,
        dimensions: int = 1024,
        **kwargs,
    ):
        self._base_url = base_url
        self._api_key = api_key
        self._model = model
        self._dimensions = dimensions

    def embed_documents(
        self,
        texts: List[str],
    ) -> List[List[float]]:
        """Embed search docs.
        Args:
            texts: List of text to embed.
        Returns:
            List of embeddings.
        """
        res = requests.post(
            f"{self._base_url}",
            headers={"Authorization": f"Bearer {self._api_key}"},
            json={
                "input": texts,
                "model": self._model,
                "encoding_format": "float",
                "dimensions": self._dimensions,
            },
        )
        data = res.json()
        embeddings = []
        try:
            for d in data["data"]:
                embeddings.append(d["embedding"][: self._dimensions])
            return embeddings
        except Exception as e:
            print(data)
            print("Error", e)
            raise e

    def embed_query(self, text: str, **kwargs) -> List[float]:
        """Embed query text.
        Args:
            text: Text to embed.
        Returns:
            Embedding.
        """
        return self.embed_documents([text])[0]

embedding = RemoteEmbedding(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings",  # For more information, see https://bailian.console.aliyun.com/#/model-market/detail/text-embedding-v3?tabKey=sdk
    api_key="your-api-key",  # Fill in your API Key
    model="text-embedding-v3",
)
print("Embedding result:", embedding.embed_query("Today's weather is nice"), "\n")
# Embedding result: [-0.03573227673768997, 0.0645645260810852, ...]
print("Embedding results:", embedding.embed_documents(["Today's weather is nice", "What about tomorrow?"]), "\n")
# Embedding results: [[-0.03573227673768997, 0.0645645260810852, ...], [-0.05443647876381874, 0.07368793338537216, ...]]
Use the Qwen SDK
Qwen provides an SDK called dashscope for quick model calls. After installing it with pip install dashscope, you can obtain text embeddings as follows.
import dashscope
from dashscope import TextEmbedding
# Set the API Key
dashscope.api_key = "your-api-key"
# Prepare the input text
texts = ["This is the first sentence", "This is the second sentence"]
# Call the embedding service
response = TextEmbedding.call(
    model="text-embedding-v3",
    input=texts
)
# Get the embedding results
if response.status_code == 200:
    print(response.output['embeddings'])
    # [{"embedding": [-0.03193652629852295, 0.08152323216199875, ...]}, {"embedding": [...]}]
Common image embedding methods
This section introduces image embedding methods.
Using an offline, locally pre-trained embedding model
Using CLIP
CLIP (Contrastive Language-Image Pretraining) is a model proposed by OpenAI that performs multimodal learning by jointly training on images and text. CLIP can understand and process the relationships between images and text, allowing it to excel in tasks such as zero-shot image classification, image search, and image-text matching.
from PIL import Image
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
# Prepare the input image
image = Image.open("path_to_your_image.jpg")
texts = ["This is the first sentence", "This is the second sentence"]
# Run the model to obtain the image and text embeddings
inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
# Obtain the embedding results
image_embeds = outputs.image_embeds  # Image embeddings projected into the shared space
text_embeds = outputs.text_embeds    # Text embeddings projected into the shared space
print(image_embeds.shape, text_embeds.shape)
# torch.Size([1, 512]) torch.Size([2, 512])
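If you only need image embeddings, for example to store them in a vector database, you can call the image encoder directly. A minimal sketch based on the same model and processor as above:
# Encode the image alone with CLIP's image encoder
image_inputs = processor(images=image, return_tensors="pt")
image_features = model.get_image_features(**image_inputs)
print(image_features.shape)
# torch.Size([1, 512])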
