OceanBase Database

SQL - V4.4.2

    Build an intelligent Q&A assistant with OceanBase Database

    Last Updated: 2026-04-02 06:23:56
    What is on this page
    Background information
    Architecture
    Prerequisites
    Step 1: Register for an LLM platform account
    Step 2: Build your AI assistant
    Clone the code repository
    Install the dependencies
    Set environment variables
    Connect to the database
    Prepare document corpus
    Start the UI chat interface
    Application demo


    Background information

    Users increasingly need to find specific information quickly within very large data sets, such as online literature databases, e-commerce product catalogs, and fast-growing multimedia libraries. As data volumes grow, traditional keyword-based search struggles to deliver both accuracy and speed. Vector search addresses this by encoding data of different types, such as text, images, and audio, as mathematical vectors and searching in the vector space. Because nearby vectors represent semantically similar content, the system can match on meaning rather than on exact keywords and return more relevant results.
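As a rough illustration of searching in a vector space, the sketch below ranks a few entries by cosine similarity to a query vector. The three-dimensional vectors are hand-made stand-ins for real embeddings, and the function names are hypothetical; OceanBase performs the equivalent ranking inside the database.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, corpus, k=2):
    """Rank corpus entries by similarity to the query vector, best first."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in corpus.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hand-made 3-dimensional vectors standing in for real embeddings.
corpus = {
    "vector index tuning": [0.9, 0.1, 0.0],
    "backup and restore": [0.0, 0.2, 0.9],
    "approximate nearest neighbor search": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # pretend embedding of a question about vector search
print(top_k(query, corpus))
```

The two entries whose vectors point in nearly the same direction as the query rank first, even though no keywords are compared.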

    This topic shows you how to build an intelligent document Q&A assistant by using the vector search capability of OceanBase Database.

    Architecture

    The intelligent Q&A assistant stores documents as vectors in an OceanBase database. When a user asks a question through the user interface (UI), the application converts the question into a vector by using the BGE-M3 embedding model and retrieves the most similar vectors from the database. The application then sends the documents that correspond to those vectors, together with the user's question, to the large language model (LLM). Because the LLM answers with the retrieved documents as context, it can generate a more accurate answer.
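The flow described above can be sketched in a few lines. This is a toy sketch under stated assumptions: `embed` counts vocabulary words instead of calling BGE-M3, `retrieve` ranks an in-memory list instead of querying OceanBase, and `build_prompt` only assembles the text that would be sent to the LLM. All function names and the prompt wording are hypothetical.

```python
def embed(text, vocabulary):
    """Toy embedding: count how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]

def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(question, documents, vocabulary, k=1):
    """Return the k documents whose toy embeddings best match the question."""
    q_vec = embed(question, vocabulary)
    ranked = sorted(documents, key=lambda d: dot(q_vec, embed(d, vocabulary)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_docs):
    """Combine retrieved documents and the question into one LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"

vocabulary = ["vector", "index", "backup", "tenant"]
documents = [
    "a vector index speeds up vector search",
    "a backup protects tenant data",
]
question = "how does a vector index work"
prompt = build_prompt(question, retrieve(question, documents, vocabulary))
print(prompt)
```

Only the document about vector indexes reaches the prompt, which is the point of the retrieval step: the LLM sees relevant context rather than the whole corpus.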


    Prerequisites

    • You have deployed OceanBase Database V4.4.0 or later and created a MySQL-compatible tenant. For more information about how to deploy an OceanBase cluster, see Deployment overview.

    • The user that you use to connect to the MySQL-compatible tenant has the INSERT and SELECT privileges. For more information about how to configure privileges, see Grant direct privileges.

    • You have created a database. For more information about how to create a database, see Create a database.

    • The vector search feature is enabled for the database. For more information about the vector search feature, see Perform fast vector search by using SQL.

      obclient> ALTER SYSTEM SET ob_vector_memory_limit_percentage = 30;
      
    • You have installed Python 3.11 or later.

    • You have installed Poetry.

      python3 -m ensurepip
      python3 -m pip install poetry
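The local tooling prerequisites can be checked with a short script before you continue. The sketch below is an optional helper that is not part of the ChatBot repository; the function name `check_prerequisites` is hypothetical. It verifies only the Python version and whether a `poetry` executable is on `PATH`, and does not check the database prerequisites.

```python
import shutil
import sys

def check_prerequisites(min_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(str(n) for n in sys.version_info[:3])
        problems.append(
            f"Python {min_python[0]}.{min_python[1]} or later is required, found {found}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("poetry") is None:
        problems.append("The poetry executable was not found on PATH")
    return problems

if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```

If Poetry was installed with pip but is reported as missing, its install location may simply not be on `PATH`.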
      

    Step 1: Register for an LLM platform account

    Register for an account with Alibaba Cloud Model Studio, activate the model service, and obtain an API key.

    Notice

    Activating Alibaba Cloud Model Studio services requires you to complete the process on a third-party platform. This operation will follow the third-party platform's billing rules and may incur charges. Before proceeding, please visit its official website or refer to the relevant documentation to confirm and accept its pricing. If you do not agree, do not proceed.

    Notice

    This topic uses Qwen LLM as an example to demonstrate how to build a Q&A chatbot. You can also choose to use other LLMs. If you use a different LLM, remember to update the API_KEY, LLM_BASE_URL, and LLM_MODEL fields in the .env file accordingly.

    Screenshots: Click to activate the model service; Confirm to activate the model service; Alibaba Cloud Model Studio.

    Step 2: Build your AI assistant

    Clone the code repository

    git clone https://github.com/ob-labs/ChatBot.git
    cd ChatBot
    

    Install the dependencies

    poetry install
    

    Set environment variables

    cp .env.example .env
    # If you use Tongyi Qwen as the LLM, set API_KEY and OPENAI_EMBEDDING_API_KEY to the API key that you obtained from the Alibaba Cloud Model Studio console.
    # Also set the variables that start with DB_ to your database connection information, and then save the file.
    vi .env
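Before moving on, it can help to confirm that the edited file has no empty settings. The sketch below parses `.env`-style text and reports missing values. It assumes the four variable names mentioned in this topic are all required and that at least one `DB_` variable must be set; the repository's actual requirements may differ, and the sample values are placeholders.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

# Assumed to be required; these names come from this topic, not from the repository.
REQUIRED = ["API_KEY", "OPENAI_EMBEDDING_API_KEY", "LLM_BASE_URL", "LLM_MODEL"]

def missing_settings(values, required=REQUIRED):
    """Return required keys that are absent or empty, plus 'DB_*' if no DB_ key is set."""
    missing = [key for key in required if not values.get(key)]
    if not any(k.startswith("DB_") and v for k, v in values.items()):
        missing.append("DB_*")
    return missing

sample = """
# placeholder values for illustration only
API_KEY=sk-example
OPENAI_EMBEDDING_API_KEY=
LLM_MODEL=example-model
"""
print(missing_settings(parse_env(sample)))
```

To check a real file, pass its contents instead of `sample`, for example `parse_env(open(".env").read())`.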
    

    Connect to the database

    You can use the script we have prepared to test the database connection and ensure that the related environment variables are set correctly:

    bash utils/connect_db.sh
    # If the script opens a MySQL session, the environment variables are set correctly.
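If the connection test fails, it can be useful to see which client command your settings translate to. The sketch below builds a `mysql` client command line from `DB_` settings. The variable names `DB_HOST`, `DB_PORT`, `DB_USER`, `DB_NAME`, and `DB_PASSWORD` are assumptions for illustration, as are the example values; check `.env.example` for the names the repository actually uses. This is not the content of `utils/connect_db.sh`.

```python
def build_mysql_command(env):
    """Build an argv list for the mysql client from DB_ settings (names are assumed)."""
    required = ["DB_HOST", "DB_PORT", "DB_USER", "DB_NAME"]
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    command = [
        "mysql",
        "-h", env["DB_HOST"],
        "-P", str(env["DB_PORT"]),
        "-u", env["DB_USER"],
        "-D", env["DB_NAME"],
    ]
    # Pass the password only if one is configured; "-p<password>" takes no space.
    if env.get("DB_PASSWORD"):
        command.append("-p" + env["DB_PASSWORD"])
    return command

# Placeholder connection values for illustration only.
example = {"DB_HOST": "127.0.0.1", "DB_PORT": "2881", "DB_USER": "root@test", "DB_NAME": "test"}
print(" ".join(build_mysql_command(example)))
```

Returning a list rather than a single string keeps each argument intact if you later pass it to `subprocess.run`.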
    

    Prepare document corpus

    This step involves cloning OceanBase's open-source documentation repository, processing the documentation, and converting the documents into vector data, which is then stored in an OceanBase database.

    1. Clone and process the documentation repository.

      Notice

      This step involves downloading and processing a large number of OceanBase documents, which will take some time.

      git clone --single-branch --branch V4.3.3 https://github.com/oceanbase/oceanbase-doc.git doc_repos/oceanbase-doc
      # If access to GitHub is slow, clone the Gitee mirror instead. Run only one of the two clone commands, because both write to the same directory.
      git clone --single-branch --branch V4.3.4 https://gitee.com/oceanbase-devhub/oceanbase-doc.git doc_repos/oceanbase-doc
      
    2. Standardize the document formatting.

      Since some files in OceanBase's documentation use ==== and ---- to indicate first-level and second-level headings, in this step we will convert them to the standard # and ## notation.

      # Convert headings to standard Markdown format.
      poetry run python convert_headings.py \
        doc_repos/oceanbase-doc/en-US
      
    3. Convert the documents to vectors and insert them into the OceanBase database.

      We provide the embed_docs.py script, which, after you specify the document directory and corresponding component, will scan all Markdown files in that directory. The script splits long documents into smaller chunks, converts them into vectors using an embedding model, and then inserts the chunk content, embedded vectors, and chunk metadata (in JSON format, including the document title, relative path, component name, chunk title, and hierarchical headings) into a single table in OceanBase as reference data.

      To save time, we process only the documents related to vector search rather than the full OceanBase documentation set. As a result, after you start the chat interface, questions about OceanBase's vector search capabilities receive more accurate answers.

      # Generate document vectors and metadata.
      poetry run python embed_docs.py --doc_base doc_repos/oceanbase-doc/en-US/640.ob-vector-search
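Steps 2 and 3 above can be sketched together: convert underlined headings to `#` notation, then split each document into chunks that carry their heading hierarchy as metadata. This is a simplified sketch, not the code of `convert_headings.py` or `embed_docs.py`. The function names and metadata keys are hypothetical, the chunking rule (one chunk per heading) is an assumption, fenced code blocks are not treated specially, and the embedding and database insert are omitted.

```python
import re

def setext_to_atx(text):
    """Rewrite '====' / '----' underlined headings as '#' / '##' headings."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        prev = out[-1].strip() if out else ""
        # An underline only counts if the line above is non-empty, non-heading text.
        if prev and not prev.startswith("#") and re.fullmatch(r"=+|-+", stripped):
            marker = "#" if stripped[0] == "=" else "##"
            out[-1] = f"{marker} {prev}"
        else:
            out.append(line)
    return "\n".join(out)

def chunk_by_heading(text, doc_path):
    """Split '#'-style Markdown into one chunk per heading, with heading metadata."""
    chunks, trail, body = [], [], []

    def flush():
        content = "\n".join(body).strip()
        if content:
            chunks.append({
                "content": content,
                "metadata": {"path": doc_path, "headings": list(trail)},
            })
        body.clear()

    for line in text.splitlines():
        match = re.match(r"(#+)\s+(.*)", line)
        if match:
            flush()
            level = len(match.group(1))
            # Keep only the ancestors above this level, then add the new heading.
            del trail[level - 1:]
            trail.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks

raw = "Vector search\n=============\nIntro text.\n\nCreate an index\n---------------\nIndex details."
for chunk in chunk_by_heading(setext_to_atx(raw), "example.md"):
    print(chunk["metadata"]["headings"], "->", chunk["content"])
```

Normalizing the headings first means the splitter needs to recognize only one heading style, which is why the conversion step comes before embedding.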
      

    Start the UI chat interface

    Run the following command to start the chat interface:

    poetry run streamlit run --server.runOnSave false chat_ui.py
    

    Access the URL provided in the terminal to open the chatbot application.

      You can now view your Streamlit app in your browser.
    
      Local URL: http://localhost:8501
      Network URL: http://172.xxx.xxx.xxx:8501
      External URL: http://xxx.xxx.xxx.xxx:8501 # This is the URL you can access from your browser
    

    Application demo

    Notice

    Since this application is built using OceanBase documentation, please ask questions related to OceanBase.

