dbt (data build tool) is an open-source tool for data transformation. It uses SQL to convert data and turns commands into tables or views. This topic describes how to analyze data in OceanBase Database by using dbt-oceanbase.
Prerequisites
Before you use dbt, make sure that you have installed dbt-oceanbase.
Install dbt-oceanbase
dbt-oceanbase is not published to PyPI, so you need to install it from the repository. Once released, you can install dbt-oceanbase via pip. Use this command to install dbt-oceanbase:
git clone https://github.com/oceanbase/dbt-oceanbase.git --branch dev/1.0.x
cd dbt-oceanbase
pip3 install .
Notice
When you install dbt-oceanbase, dbt-core will be installed automatically.
Use this command to list your installed dbt versions:
pip list|grep dbt
dbt-adapters 1.3.1
dbt-common 1.4.0
dbt-core 1.8.3
dbt-extractor 0.5.1
dbt-oceanbase 1.0.0
dbt-semantic-interfaces 0.5.1
Expected return:
dbt-adapters 1.3.1
dbt-common 1.4.0
dbt-core 1.8.3
dbt-extractor 0.5.1
dbt-oceanbase 1.0.0
dbt-semantic-interfaces 0.5.1
Configure project
In your current directory, use this command to initialize a project:
dbt init dbtexample
Then, follow the prompts to fill in OceanBase Database connection details.
Expected return:
06:00:31 Running with dbt=1.8.3
06:00:31 Your new dbt project "dbtexample" was created!
For more information on how to configure the profiles.yml file,
please consult the dbt documentation here:
https://docs.getdbt.com/docs/configure-your-profile
One more thing:
Need help? Don't hesitate to reach out to us via GitHub issues or on Slack:
https://community.getdbt.com/
Happy modeling!
06:00:31 Setting up your profile.
Which database would you like to use?
[1] oceanbase_mysql
(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)
Enter a number: 1
host (hostname for the instance): xx.xxx.xxx.xx
port: 2503
user (username@tenant#cluster): username@tenant#cluster
pass (password):
database (default database that dbt will build objects in): test
threads (1 or more) [1]:
06:02:20 Profile dbtexample written to /Users/username/.dbt/profiles.yml using target's profile_template.yml and your supplied values. Run 'dbt debug' to validate the connection.
After you complete the details, use this command to verify the dbt connection:
dbt debug
Expected return:
06:02:35 Running with dbt=1.8.3
06:02:35 dbt version: 1.8.3
06:02:35 python version: 3.11.5
06:02:35 python path: /Users/username/miniconda3/bin/python
06:02:35 os info: macOS-14.2.1-arm64-arm-64bit
06:02:35 Using profiles dir at /Users/username/.dbt
06:02:35 Using profiles.yml file at /Users/username/.dbt/profiles.yml
06:02:35 Using dbt_project.yml file at /Users/username/projects/dbtexample/dbt_project.yml
06:02:35 adapter type: oceanbase_mysql
06:02:35 adapter version: 1.0.0
06:02:35 Configuration:
06:02:35 profiles.yml file [OK found and valid]
06:02:35 dbt_project.yml file [OK found and valid]
06:02:35 Required dependencies:
06:02:35 - git [OK found]
06:02:35 Connection:
06:02:35 host: xx.xxx.xxx.xx
06:02:35 port: 2503
06:02:35 user: username
06:02:35 retries: 1
06:02:35 schema: test
06:02:35 database: test
06:02:35 Registered adapter: oceanbase_mysql=1.0.0
06:02:36 Connection test: [OK connection ok]
06:02:36 All checks passed!
Prepare test data
Use this command to download example data from dbt official site:
wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/jaffle_shop_customers.csv
wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/jaffle_shop_orders.csv
wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/stripe_payments.csv
Put data to dbtexample/seeds/ directory.
Use this command to import data:
dbt seed
(Optional) Use this command to remove unused models/example:
# optional
rm -rf models/example/
Create customers.sql file in directory, insert this content:
{{
config(
materialized='view'
)
}}
with customers as (
select
id as customer_id,
first_name,
last_name
from jaffle_shop_customers
),
orders as (
select
id as order_id,
user_id as customer_id,
order_date,
status
from jaffle_shop_orders
),
customer_orders as (
select
customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
select
customers.customer_id,
customers.first_name,
customers.last_name,
customer_orders.first_order_date,
customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from final
USe this command to detect and execute the files under models:
dbt run
Expected return:
06:37:08 Running with dbt=1.8.3
06:37:08 Registered adapter: oceanbase_mysql=1.0.0
06:37:08 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.dbtexample.example
06:37:08 Found 1 model, 393 macros
06:37:08
06:37:10 Concurrency: 1 threads (target='dev')
06:37:10
06:37:10 1 of 1 START sql view model customers ........................................ [RUN]
06:37:11 1 of 1 OK created sql view model customers .............................. [SUCCESS-1 in 1.25s]
06:37:12
06:37:12 Finished running 1 view model in 0 hours 0 minutes and 3.79 seconds (3.79s).
06:37:12
06:37:12 Completed successfully
06:37:12
06:37:12 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
You can check your view in your OceanBase Database. The command is executed successfully, and a View is created. You can verify this View in your OceanBase database.