data build tool (dbt) is an open-source data conversion tool that can convert commands into tables or views by using SQL. This topic describes how to use dbt-oceanbase to analyze data in OceanBase Database.
Prerequisites
You have installed dbt-oceanbase by referring to Step 2: Install dbt-oceanbase.
You have installed OceanBase Database and created a MySQL tenant.
Procedure
Step 1: Obtain the OceanBase Database connection string
Contact the deployment personnel or administrator of OceanBase Database to obtain the connection string. The following is a connection string example:
obclient -h$host -P$port -u$user_name -p$password -D$database_name
The parameters are described as follows:
$host: the IP address for connecting to OceanBase Database. It is the IP address of OceanBase Database Proxy (ODP) for connection through ODP, or the IP address of an OBServer node for direct connection.$port: the port for connecting to OceanBase Database. For connection through ODP, the default value is2883, which can be customized when ODP is deployed. For direct connection, the default value is2881, which can be customized when OceanBase Database is deployed.$database_name: the name of the database to be accessed.Notice
The user for connecting to a tenant must have the
CREATE,INSERT,DROP, andSELECTprivileges on the database. For more information about user privileges, see Privilege types in MySQL mode.$user_name: the tenant account. For connection through ODP, two account formats are supported:username@tenant name#cluster nameandcluster name:tenant name:username. For direct connection, theusername@tenant nameformat is supported.$password: the password of the account.
For more information about the connection string, see Connect to an OceanBase Database tenant by using OBClient.
Here is an example:
obclient -hxxx.xxx.xxx.xxx -P2881 -utest_user001@mysql001 -p****** -Dtest
Step 2: Install dbt-oceanbase
dbt-oceanbase has not been published to PyPI. Therefore, you need to install it from the GitHub repository. After it is published, you can install it through pip. Run the following command to install dbt-oceanbase:
git clone https://github.com/oceanbase/dbt-oceanbase.git --branch dev/1.0.x
cd dbt-oceanbase
pip3 install .
Note
dbt-core is automatically installed when you install dbt-oceanbase.
Run the following command to view the installed dbt version:
pip list | grep dbt
The expected return result is as follows:
dbt 1.0.0.38.15
dbt-adapters 1.7.0
dbt-common 1.10.0
dbt-core 1.9.0b2
dbt-extractor 0.5.1
dbt-oceanbase 1.0.0
dbt-semantic-interfaces 0.7.3
Configure a project
Run the following command in the current directory to initialize the new project:
dbt init dbtexampleSpecify the connection information of the OceanBase database as prompted.
The expected return result of Step 1 is as follows:
06:00:31 Running with dbt=1.8.3 06:00:31 Your new dbt project "dbtexample" was created! For more information on how to configure the profiles.yml file, please consult the dbt documentation here: https://docs.getdbt.com/docs/configure-your-profile One more thing: Need help? Don't hesitate to reach out to us via GitHub issues or on Slack: https://community.getdbt.com/ Happy modeling! 06:00:31 Setting up your profile. Which database would you like to use? [1] oceanbase_mysql (Don't see the one you want? https://docs.getdbt.com/docs/available-adapters) Enter a number: 1 host (hostname for the instance): xx.xxx.xxx.xx port: 2503 user (username@tenant#cluster): username@tenant#cluster pass (password): database (default database that dbt will build objects in): test threads (1 or more) [1]: 06:02:20 Profile dbtexample written to /Users/username/.dbt/profiles.yml using target's profile_template.yml and your supplied values. Run 'dbt debug' to validate the connection.Run the following command to verify the dbt connection:
dbt debugThe expected return result is as follows:
06:02:35 dbt version: 1.8.3 06:02:35 python version: 3.11.5 06:02:35 python path: /Users/username/miniconda3/bin/python 06:02:35 os info: macOS-14.2.1-arm64-arm-64bit 06:02:35 Using profiles dir at /Users/username/.dbt 06:02:35 Using profiles.yml file at /Users/username/.dbt/profiles.yml 06:02:35 Using dbt_project.yml file at /Users/username/projects/dbtexample/dbt_project.yml 06:02:35 adapter type: oceanbase_mysql 06:02:35 adapter version: 1.0.0 06:02:35 Configuration: 06:02:35 profiles.yml file [OK found and valid] 06:02:35 dbt_project.yml file [OK found and valid] 06:02:35 Required dependencies: 06:02:35 - git [OK found] 06:02:35 Connection: 06:02:35 host: xx.xxx.xxx.xx 06:02:35 port: 2503 06:02:35 user: username 06:02:35 retries: 1 06:02:35 schema: test 06:02:35 database: test 06:02:35 Registered adapter: oceanbase_mysql=1.0.0 06:02:36 Connection test: [OK connection ok] 06:02:36 All checks passed!
Prepare test data
Run the following command to download sample data from the official website of dbt:
wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/jaffle_shop_customers.csv wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/jaffle_shop_orders.csv wget https://dbt-tutorial-public.s3-us-west-2.amazonaws.com/stripe_payments.csvSave the data in the
dbtexample/seeds/directory. Run the following command to import data:dbt seed(Optional) Run the following command to remove
models/example, which is not required:# optional rm -rf models/example/Create a file named
customers.sqlin themodelsdirectory and write the following content to the file:{{ config( materialized='view' ) }} with customers as ( select id as customer_id, first_name, last_name from jaffle_shop_customers ), orders as ( select id as order_id, user_id as customer_id, order_date, status from jaffle_shop_orders ), customer_orders as ( select customer_id, min(order_date) as first_order_date, max(order_date) as most_recent_order_date, count(order_id) as number_of_orders from orders group by 1 ), final as ( select customers.customer_id, customers.first_name, customers.last_name, customer_orders.first_order_date, customer_orders.most_recent_order_date, coalesce(customer_orders.number_of_orders, 0) as number_of_orders from customers left join customer_orders using (customer_id) ) select * from finalRun the following command to detect and execute the file in the
modelsdirectory:dbt runThe expected return result is as follows:
06:37:08 Running with dbt=1.8.3 06:37:08 Registered adapter: oceanbase_mysql=1.0.0 06:37:08 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources. There are 1 unused configuration paths: - models.dbtexample.example 06:37:08 Found 1 model, 393 macros 06:37:08 06:37:10 Concurrency: 1 threads (target='dev') 06:37:10 06:37:10 1 of 1 START sql view model customers ........................................ [RUN] 06:37:11 1 of 1 OK created sql view model customers .............................. [SUCCESS-1 in 1.25s] 06:37:12 06:37:12 Finished running 1 view model in 0 hours 0 minutes and 3.79 seconds (3.79s). 06:37:12 06:37:12 Completed successfully 06:37:12 06:37:12 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
The return result shows that the preceding command is successfully executed and a view is created. You can query the view in OceanBase Database.