OceanBase Database supports vector type storage, vector indexes, and embedding vector search starting from V4.3.3. You can store vectorized data in OceanBase Database for subsequent retrieval.
Dify is an open-source platform for building large language model (LLM) applications. It combines the concepts of Backend as a Service (BaaS) and LLMOps, enabling developers to quickly build production-grade generative AI applications. Even non-technical users can participate in defining AI applications and managing data operations.
Dify comes with a built-in technology stack essential for developing LLM applications, including support for hundreds of models, an intuitive prompt orchestration interface, a high-quality RAG engine, a robust agent framework, flexible workflow orchestration, and easy-to-use interfaces and APIs. This saves developers from reinventing the wheel and allows them to focus on innovation and business requirements.
This topic describes how to integrate the vector search capabilities of OceanBase Database with Dify.
Prerequisites
Before you deploy Dify, make sure that your server meets the following minimum system requirements:
- CPU: 2 cores
- Memory: 4 GB
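The minimums above can be checked with a short script before deployment. This is a minimal sketch: `meets_minimum` is a hypothetical helper name, not part of Dify or OceanBase Database, and the commented Linux example assumes `nproc` and `/proc/meminfo` are available.

```shell
#!/bin/sh
# Sketch: compare a host's resources against the minimums listed above.
# meets_minimum is a hypothetical helper, not a Dify or OceanBase tool.
MIN_CORES=2
MIN_MEM_KB=$((4 * 1024 * 1024))   # 4 GB expressed in kB

# Usage: meets_minimum <cpu_cores> <memory_in_kB>
# Returns 0 when both values reach the minimums, non-zero otherwise.
meets_minimum() {
    cores=$1
    mem_kb=$2
    [ "$cores" -ge "$MIN_CORES" ] && [ "$mem_kb" -ge "$MIN_MEM_KB" ]
}

# Example on Linux (assumes nproc and /proc/meminfo exist):
#   cores=$(nproc)
#   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   meets_minimum "$cores" "$mem_kb" && echo "ok" || echo "below minimum"
```

The kernel usually reports slightly less memory than is physically installed, so a host with exactly 4 GB may fail this strict comparison. Treat the result as a hint rather than a verdict.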
This integration tutorial is performed in a Docker container. Make sure that you have set up a Docker container platform.
You have deployed OceanBase Database V4.3.3 or later and created a MySQL-compatible tenant. For more information about how to create a user tenant, see Create a tenant.
- You have set the `ob_vector_memory_limit_percentage` parameter in the tenant to enable vector search. We recommend that you set the value to `30` for OceanBase Database versions earlier than V4.3.5 BP3, and to `0` for V4.3.5 BP3 and later. For more information about this parameter, see ob_vector_memory_limit_percentage.
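As a rough illustration of setting this parameter, the sketch below prints the statement for a chosen percentage. `vector_memory_sql` is a hypothetical helper, and the `ALTER SYSTEM SET` form is an assumption based on how tenant-level parameters are usually changed in OceanBase Database. Check the parameter reference for your version before running it.

```shell
#!/bin/sh
# Sketch: build the statement that sets ob_vector_memory_limit_percentage.
# vector_memory_sql is a hypothetical helper name used for illustration only.
# Usage: vector_memory_sql <percentage>
vector_memory_sql() {
    printf 'ALTER SYSTEM SET ob_vector_memory_limit_percentage = %s;\n' "$1"
}

# Example (assumed invocation; run as a tenant administrator and replace
# the placeholders with your own connection details):
#   obclient -h"$host" -P"$port" -u"$user_name" -p"$password" \
#       -e "$(vector_memory_sql 30)"
```

Pass `30` for versions earlier than V4.3.5 BP3, and `0` for V4.3.5 BP3 and later, following the recommendation above.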
Step 1: Obtain the database connection information
Obtain the database connection string from the deployment engineer or administrator of OceanBase Database. For example:
obclient -h$host -P$port -u$user_name -p$password -D$database_name
Parameter description:
- `$host`: the IP address for connecting to OceanBase Database. For connection through OceanBase Database Proxy (ODP), use the IP address of an ODP. For direct connection, use the IP address of an OBServer node.
- `$port`: the port for connecting to OceanBase Database. For connection through ODP, the default value is `2883`, which can be customized when ODP is deployed. For direct connection, the default value is `2881`, which can be customized when OceanBase Database is deployed.
- `$database_name`: the name of the database to be accessed.

  Notice

  The user for connecting to the tenant must have the `CREATE`, `INSERT`, `DROP`, and `SELECT` privileges on the database. For more information about user privileges, see Privilege types in MySQL-compatible mode.

- `$user_name`: the tenant account. For connection through ODP, the format is `username@tenant name#cluster name` or `cluster name:tenant name:username`. For direct connection, the format is `username@tenant name`.
- `$password`: the password of the account.
For more information about the connection string, see Connect to an OceanBase tenant by using OBClient.
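The two account formats described above are easy to mix up, so the sketch below assembles the `-u` value for each connection mode. `ob_user` is a hypothetical helper, and the tenant and cluster names in the example are placeholders rather than values from your deployment.

```shell
#!/bin/sh
# Sketch: build the account string for the obclient -u option.
# ob_user is a hypothetical helper name used for illustration only.
# Usage: ob_user direct <username> <tenant>
#        ob_user odp    <username> <tenant> <cluster>
ob_user() {
    mode=$1
    user=$2
    tenant=$3
    cluster=${4:-}
    case "$mode" in
        direct) printf '%s@%s\n' "$user" "$tenant" ;;
        odp)    printf '%s@%s#%s\n' "$user" "$tenant" "$cluster" ;;
        *)      echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Example with placeholder names (direct connection on the default port):
#   obclient -h"$host" -P2881 -u"$(ob_user direct root mytenant)" \
#       -p"$password" -D"$database_name"
```

Quoting the result matters because `#` starts a comment when it appears unquoted at the beginning of a shell word.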
Step 2: Deploy Dify
Method 1
For more information about deploying Dify, see Deploy with Docker Compose.
When you follow that guide, make the following changes:
- Set the value of the `VECTOR_STORE` variable in `.env` to `oceanbase`.
- Run `docker compose --profile oceanbase up -d` to start the service.
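The two changes for Method 1 can be scripted roughly as follows. This is a sketch under assumptions: `set_vector_store` is a hypothetical helper, and it assumes `.env` uses plain `KEY=value` lines. It rewrites an existing `VECTOR_STORE` line, or appends one if none is present.

```shell
#!/bin/sh
# Sketch: point Dify's .env at OceanBase as the vector store.
# set_vector_store is a hypothetical helper name used for illustration only.
# Usage: set_vector_store <path-to-env-file>
set_vector_store() {
    env_file=$1
    if grep -q '^VECTOR_STORE=' "$env_file"; then
        # Rewrite through a temporary file to avoid non-portable sed -i.
        sed 's/^VECTOR_STORE=.*/VECTOR_STORE=oceanbase/' "$env_file" > "$env_file.tmp" &&
            mv "$env_file.tmp" "$env_file"
    else
        echo 'VECTOR_STORE=oceanbase' >> "$env_file"
    fi
}

# Example, run from Dify's docker directory:
#   set_vector_store .env
#   docker compose --profile oceanbase up -d
```

Other settings in `.env` are left untouched. Review the file afterwards for any OceanBase connection variables that your Dify version expects.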
Method 2
Alternatively, you can deploy Dify by referring to Dify for MySQL.
To deploy Dify this way, run the following commands:
cd docker
bash setup-mysql-env.sh
docker compose up -d
Step 3: Use Dify
To integrate Dify with a large language model, see Integrate with a large language model.