Analyze OceanBase data by using dbt

2025-06-24 11:54:39  Updated

dbt (data build tool) is an open-source tool for data transformation. It uses SQL to convert data and turns commands into tables or views. This topic describes how to analyze data in OceanBase Database by using dbt-oceanbase.

Prerequisites

Before you use dbt, make sure that you have installed dbt-oceanbase.

Install dbt-oceanbase

dbt-oceanbase is not published to PyPI, so you need to install it from the repository. Once released, you can install dbt-oceanbase via pip. Use this command to install dbt-oceanbase:

git clone https://github.com/oceanbase/dbt-oceanbase.git --branch dev/1.0.x
cd dbt-oceanbase
pip3 install .

Notice

When you install dbt-oceanbase, dbt-core will be installed automatically.

Use this command to list your installed dbt versions:

pip list|grep dbt

dbt-adapters 1.3.1
dbt-common 1.4.0
dbt-core 1.8.3
dbt-extractor 0.5.1
dbt-oceanbase 1.0.0
dbt-semantic-interfaces 0.5.1

Expected return:

dbt-adapters                1.3.1
dbt-common                  1.4.0
dbt-core                    1.8.3
dbt-extractor               0.5.1
dbt-oceanbase               1.0.0
dbt-semantic-interfaces     0.5.1

Configure project

In your current directory, use this command to initialize a project:

dbt init dbtexample

Then, follow the prompts to fill in OceanBase Database connection details.

Expected return:

06:00:31  Running with dbt=1.8.3
06:00:31  Your new dbt project "dbtexample" was created!

For more information on how to configure the profiles.yml file,
please consult the dbt documentation here:

https://docs.getdbt.com/docs/configure-your-profile

One more thing:

Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:

https://community.getdbt.com/

Happy modeling!

06:00:31  Setting up your profile.
Which database would you like to use?
[1] oceanbase_mysql

(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)

Enter a number: 1
host (hostname for the instance): xx.xxx.xxx.xx
port: 2503
user (username@tenant#cluster): username@tenant#cluster
pass (password):
database (default database that dbt will build objects in): test
threads (1 or more) [1]:
06:02:20  Profile dbtexample written to /Users/username/.dbt/profiles.yml using target's profile_template.yml and your supplied values. Run 'dbt debug' to validate the connection.

After you complete the details, use this command to verify the dbt connection:

dbt debug

Expected return:

06:02:35 Running with dbt=1.8.3
06:02:35 dbt version: 1.8.3
06:02:35 python version: 3.11.5
06:02:35 python path: /Users/username/miniconda3/bin/python
06:02:35 os info: macOS-14.2.1-arm64-arm-64bit
06:02:35 Using profiles dir at /Users/username/.dbt
06:02:35 Using profiles.yml file at /Users/username/.dbt/profiles.yml
06:02:35 Using dbt_project.yml file at /Users/username/projects/dbtexample/dbt_project.yml
06:02:35 adapter type: oceanbase_mysql
06:02:35 adapter version: 1.0.0
06:02:35 Configuration:
06:02:35   profiles.yml file [OK found and valid]
06:02:35   dbt_project.yml file [OK found and valid]
06:02:35 Required dependencies:
06:02:35  - git [OK found]

06:02:35 Connection:
06:02:35   host: xx.xxx.xxx.xx
06:02:35   port: 2503
06:02:35   user: username
06:02:35   retries: 1
06:02:35   schema: test
06:02:35   database: test
06:02:35 Registered adapter: oceanbase_mysql=1.0.0
06:02:36  Connection test: [OK connection ok]

06:02:36 All checks passed!

Prepare test data

Use this command to download example data from dbt official site:

wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/jaffle_shop_customers.csv
wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/jaffle_shop_orders.csv
wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/stripe_payments.csv

Put data to dbtexample/seeds/ directory.

Use this command to import data:

dbt seed

(Optional) Use this command to remove unused models/example:

# optional 
rm -rf  models/example/ 

Create customers.sql file in directory, insert this content:

{{
  config(
    materialized='view'
  )
}}

with customers as (

    select
        id as customer_id,
        first_name,
        last_name

    from jaffle_shop_customers

),

orders as (

    select
        id as order_id,
        user_id as customer_id,
        order_date,
        status

    from jaffle_shop_orders

),

customer_orders as (

    select
        customer_id,

        min(order_date) as first_order_date,
        max(order_date) as most_recent_order_date,
        count(order_id) as number_of_orders

    from orders

    group by 1

),

final as (

    select
        customers.customer_id,
        customers.first_name,
        customers.last_name,
        customer_orders.first_order_date,
        customer_orders.most_recent_order_date,
        coalesce(customer_orders.number_of_orders, 0) as number_of_orders

    from customers

    left join customer_orders using (customer_id)

)

select * from final 

USe this command to detect and execute the files under models:

dbt run

Expected return:

06:37:08  Running with dbt=1.8.3
06:37:08  Registered adapter: oceanbase_mysql=1.0.0
06:37:08  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.dbtexample.example
06:37:08  Found 1 model, 393 macros
06:37:08
06:37:10  Concurrency: 1 threads (target='dev')
06:37:10
06:37:10  1 of 1 START sql view model customers ........................................ [RUN]
06:37:11  1 of 1 OK created sql view model customers .............................. [SUCCESS-1 in 1.25s]
06:37:12
06:37:12  Finished running 1 view model in 0 hours 0 minutes and 3.79 seconds (3.79s).
06:37:12
06:37:12  Completed successfully
06:37:12
06:37:12  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

You can check your view in your OceanBase Database. The command is executed successfully, and a View is created. You can verify this View in your OceanBase database.

Contact Us