Overview
DB-GPT is an open-source AI database assistant that supports using OceanBase as a vector database for knowledge base question-answering.
This topic describes how DB-GPT uses OceanBase vector search to store and retrieve document vectors, and integrates LLMs such as Qwen to deliver accurate question-answering services.
Version compatibility
- OceanBase Database: ≥ V4.3.3
- DB-GPT: v0.7.x and later
Prerequisites
Before integrating DB-GPT with OceanBase Database, ensure that:
- You have deployed OceanBase Database and created a MySQL user tenant. For more information, see Create a tenant.
- You have set the ob_vector_memory_limit_percentage parameter to enable vector search. We recommend setting this parameter to 30 in OceanBase Database versions earlier than V4.3.5 BP3, and keeping the default value of 0 in V4.3.5 BP3 and later. For more information on how to set this parameter, see ob_vector_memory_limit_percentage.
- Docker is deployed and running, and the current user has permission to execute Docker commands (verify with docker info).
- You have obtained an API key for Qwen. To obtain one, go to the Qwen API Key page.
- You have installed Git for the operating system you're using.
Step 1: Obtain the database connection string
You can use the following command to quickly deploy an independent OceanBase Database for testing:
docker run --name=ob433 -e MODE=slim -p 2881:2881 -e OB_TENANT_PASSWORD=****** -d quay.io/oceanbase/oceanbase-ce:latest
Wait for the container initialization to complete (about 2-3 minutes). You can check the status by running the following command. If the output is boot success!, the startup is successful.
docker logs ob433 | tail -1
The database connection string is as follows:
mysql -h 127.0.0.1 -uroot@test -P2881 -p********* -Dtest
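Before moving on, you can confirm that the container's SQL port (2881, as mapped above) is actually reachable. A minimal standard-library sketch; the helper name is mine, not part of OceanBase or DB-GPT:

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # 127.0.0.1:2881 is the port mapping used by the docker run command above.
    print("OceanBase reachable" if port_open("127.0.0.1", 2881) else "not reachable yet")
```

A failed check usually just means the container is still initializing; re-run `docker logs ob433 | tail -1` and wait for `boot success!`.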
Step 2: Install DB-GPT
This example uses source code deployment. For more deployment methods, see the DB-GPT official website.
- Download the source code
git clone https://github.com/eosphoros-ai/DB-GPT.git
cd DB-GPT
- Install the uv package manager
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# Verify the installation by running the following command, which should output something like: uv 0.10.2
uv --version
- Install the basic dependencies
For example, to use Qwen as the model provider, run:
uv sync --all-packages \
--extra "base" \
--extra "proxy_tongyi" \
--extra "rag" \
--extra "storage_chromadb" \
--extra "dbgpts"
- Install the OceanBase partner package
uv pip install pyobvector
- Connect to OceanBase Database and set the memory usage ratio for vector data (optional; only required in versions earlier than V4.3.5 BP3)
source .venv/bin/activate
python3
from pyobvector import ObVecClient

client = ObVecClient(uri="127.0.0.1:2881", user="root@test", password="", db_name="test")
client.perform_raw_text_sql(
    "ALTER SYSTEM SET ob_vector_memory_limit_percentage = 30"
)
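To confirm the setting took effect, you can query the parameter back. The sketch below only builds the two SQL strings (the helper name and the value check are mine); when run against a live instance, they would be executed through the client shown above:

```python
def vector_memory_sql(percentage: int) -> str:
    """Build the ALTER SYSTEM statement for ob_vector_memory_limit_percentage.

    Valid values are 0-100; 0 disables the dedicated vector memory pool.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return f"ALTER SYSTEM SET ob_vector_memory_limit_percentage = {percentage}"

# Reads the current value back from the server.
CHECK_SQL = "SHOW PARAMETERS LIKE 'ob_vector_memory_limit_percentage'"

if __name__ == "__main__":
    print(vector_memory_sql(30))
    # With a live instance (see the client above):
    # client.perform_raw_text_sql(vector_memory_sql(30))
    # rows = client.perform_raw_text_sql(CHECK_SQL)
```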
Step 3: Configure DB-GPT to use OceanBase Database
After the basic dependencies are installed, modify the TOML configuration file to switch to OceanBase Database.
Configure TOML file
In DB-GPT v0.7.x and later, we recommend using the TOML configuration file. Edit configs/dbgpt-proxy-tongyi.toml as follows:
[system]
# Load language from environment variable (It is set by the hook)
language = "${env:DBGPT_LANG:-en}"
api_keys = []
encrypt_key = "your_secret_key"
# Server Configurations
[service.web]
host = "0.0.0.0"
port = 5670
[service.web.database]
type = "sqlite"
path = "pilot/meta_data/dbgpt.db"
# Vector Store Configuration - Use OceanBase
[rag.storage]
[rag.storage.vector]
type = "OceanBase"
ob_host = "127.0.0.1"
ob_port = 2881
ob_user = "root@test"
ob_database = "test"
ob_password = ""
# ob_enable_normalize_vector = true
# Model Configurations
[models]
# Qwen LLM model
[[models.llms]]
name = "qwen-plus"
provider = "${env:LLM_MODEL_PROVIDER:-proxy/tongyi}"
api_base = "https://dashscope.aliyuncs.com/compatible-mode/v1"
api_key = "${env:DASHSCOPE_API_KEY}"
# Qwen Embedding model
[[models.embeddings]]
name = "text-embedding-v3"
provider = "${env:EMBEDDING_MODEL_PROVIDER:-proxy/tongyi}"
api_url = "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings"
api_key = "${env:DASHSCOPE_API_KEY}"
Parameter description:
- ob_host: the IP address of the OceanBase Database server.
- ob_port: the SQL port of OceanBase Database (2881 by default).
- ob_user: the username used to connect, in user@tenant format.
- ob_password: the password of the OceanBase Database user.
- ob_database: the name of the database that stores the vector data.
Step 4: Start DB-GPT
First, set the DashScope API Key environment variable, then start DB-GPT:
# Set the Qwen API Key
export DASHSCOPE_API_KEY="your-dashscope-api-key"
# Start DB-GPT
uv run dbgpt start webserver --config configs/dbgpt-proxy-tongyi.toml
Once started successfully, visit http://localhost:5670 to access the DB-GPT web interface.
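The web server can take a little while to come up after the command returns. A hedged readiness-poll sketch using only the standard library (the URL and port come from this tutorial's defaults; the helper name is illustrative):

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, timeout: float = 5.0, interval: float = 1.0) -> bool:
    """Poll url until it answers with any HTTP status, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                return resp.status < 500
        except urllib.error.HTTPError:
            return True  # the server responded, even if with an error page
        except (urllib.error.URLError, OSError):
            time.sleep(interval)
    return False

if __name__ == "__main__":
    ok = wait_for_server("http://localhost:5670")
    print("DB-GPT web UI is up" if ok else "timed out waiting for DB-GPT")
```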
Step 5: Use the knowledge base for Q&A
- Open your browser and go to http://localhost:5670.
- Go to the Knowledge page and click Create Knowledge to create a knowledge base.
- Fill in the Space Config and click Next.
- Select the Datasource type and click Next.
- Go to the Upload page, upload the file, and click Next.
- Go to the Segmentation page and select the document slicing method.
- The documents are then sliced, vectorized, and stored in the OceanBase vector database.
- In the Explore section, select the knowledge space to perform Q&A.
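Under the hood, each document chunk is stored as a row with a VECTOR column in OceanBase. A simplified sketch of that layout using raw SQL through the client from Step 2; the table name, column names, and helper function here are illustrative, not the schema DB-GPT actually creates (Qwen's text-embedding-v3 produces 1024-dimensional vectors by default):

```python
def vector_literal(values: list[float]) -> str:
    """Format a Python list as an OceanBase vector literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(v) for v in values) + "]"

# Illustrative DDL only: DB-GPT manages its own table names and dimensions.
DDL = """
CREATE TABLE IF NOT EXISTS kb_chunks (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    content TEXT,
    embedding VECTOR(1024)
)
"""

if __name__ == "__main__":
    print(vector_literal([0.1, 0.2, 0.3]))
    # With a live client (see Step 2):
    # client.perform_raw_text_sql(DDL)
    # client.perform_raw_text_sql(
    #     f"INSERT INTO kb_chunks(content, embedding) "
    #     f"VALUES ('some chunk text', '{vector_literal([0.1, 0.2, 0.3])}')"
    # )
```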