OceanBase logo

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Resources

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS

OceanBase Cloud

OceanBase Database

Tools

Connectors and Middleware

QUICK START

OceanBase Cloud

OceanBase Database

BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Company

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

International - English
中国站 - 简体中文
日本 - 日本語
Sign In
Start on Cloud

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS
OceanBase CloudOceanBase Database
ToolsConnectors and Middleware
QUICK START
OceanBase CloudOceanBase Database
BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

Start on Cloud
编组
All Products
    • Databases
    • iconOceanBase Database
    • iconOceanBase Cloud
    • iconOceanBase Tugraph
    • iconInteractive Tutorials
    • iconOceanBase Best Practices
    • Tools
    • iconOceanBase Cloud Platform
    • iconOceanBase Migration Service
    • iconOceanBase Developer Center
    • iconOceanBase Migration Assessment
    • iconOceanBase Admin Tool
    • iconOceanBase Loader and Dumper
    • iconOceanBase Deployer
    • iconKubernetes operator for OceanBase
    • iconOceanBase Diagnostic Tool
    • iconOceanBase Binlog Service
    • Connectors and Middleware
    • iconOceanBase Database Proxy
    • iconEmbedded SQL in C for OceanBase
    • iconOceanBase Call Interface
    • iconOceanBase Connector/C
    • iconOceanBase Connector/J
    • iconOceanBase Connector/ODBC
    • iconOceanBase Connector/NET
icon

OceanBase Database

SQL - V4.4.2

    Download PDF

    OceanBase logo

    The Unified Distributed Database for the AI Era.

    Follow Us
    Products
    OceanBase CloudOceanBase EnterpriseOceanBase Community EditionOceanBase seekdb
    Resources
    DocsBlogWhite PaperLive DemosTraining & CertificationTicket
    Company
    About OceanBaseTrust CenterLegalPartnerContact Us
    Follow Us

    © OceanBase 2026. All rights reserved

    Cloud Service AgreementPrivacy PolicySecurity
    Contact Us
    Document Feedback
    1. Documentation Center
    2. OceanBase Database
    3. SQL
    4. V4.4.2
    iconOceanBase Database
    SQL - V 4.4.2
    Databases
    • OceanBase Database
    • OceanBase Cloud
    • OceanBase Tugraph
    • Interactive Tutorials
    • OceanBase Best Practices
    Tools
    • OceanBase Cloud Platform
    • OceanBase Migration Service
    • OceanBase Developer Center
    • OceanBase Migration Assessment
    • OceanBase Admin Tool
    • OceanBase Loader and Dumper
    • OceanBase Deployer
    • Kubernetes operator for OceanBase
    • OceanBase Diagnostic Tool
    • OceanBase Binlog Service
    Connectors and Middleware
    • OceanBase Database Proxy
    • Embedded SQL in C for OceanBase
    • OceanBase Call Interface
    • OceanBase Connector/C
    • OceanBase Connector/J
    • OceanBase Connector/ODBC
    • OceanBase Connector/NET
    SQL
    KV
    • V 4.6.0
    • V 4.4.2
    • V 4.3.5
    • V 4.3.3
    • V 4.3.1
    • V 4.3.0
    • V 4.2.5
    • V 4.2.2
    • V 4.2.1
    • V 4.2.0
    • V 4.1.0
    • V 4.0.0
    • V 3.1.4 and earlier

    OceanBase Database TPC-H testing

    Last Updated:2026-06-23 02:59:11  Updated
    Share
    What is on this page
    What is TPC-H
    Prepare the environment
    Software requirements
    Tenant specification configuration
    Test methods
    Run a one-click TPC-H test with OBD
    Run the TPC-H test manually with the TPC-H tool
    FAQ

    folded

    Share

    This topic describes the software requirements, tenant specifications, and detailed test methods for running TPC-H tests on OceanBase Database.

    What is TPC-H

    TPC-H (Transaction Processing Performance Council's Decision Support Benchmark) is a benchmark from the Transaction Processing Performance Council (TPC) that simulates decision-support workloads. Academia and industry widely use it to evaluate decision-support performance. It measures a system's overall analytic computing capability, places high demands on vendors, and has significant commercial value. Common use cases include credit and card analysis in banking, telecom operations analysis, tax analysis, and decision support in the tobacco industry.

    TPC-H evolved from TPC-D, a decision-support benchmark TPC introduced in 1994. TPC-H models a data warehouse in third normal form (3NF) with eight base tables. The primary metric is query response time—the interval from query submission to result return. Results are reported as Queries per Hour at a given scale (QphH@size), where H is the average number of complex queries executed per hour and size is the database scale. Because TPC-H reflects real production environments, it can evaluate performance characteristics that other benchmarks miss. In short, TPC-H meets data-warehouse benchmarking needs and drives vendors and researchers to push the technology forward.

    Note

    Starting with OceanBase Database V4.0.0, extensive optimizations improve usability and help developers get better out-of-the-box performance. This testing guide covers basic parameter tuning to help you achieve strong TPC-H results.

    Prepare the environment

    Before you start the test, prepare the test environment as required:

    Note

    This example is based on a MySQL tenant.

    Software requirements

    • JDK: We recommend that you use JDK 1.8u131 or later.

    • make: Run the yum install make command to install make.

    • GCC: Run the yum install gcc command to install GCC.

    • mysql-devel: Run the yum install mysql-devel command to install mysql-devel.

    • Python database driver: Run the sudo yum install MySQL-python command to install the Python database driver.

    • prettytable: Run the pip install prettytable command to install prettytable.

    • JDBC: We recommend that you use the mysql-connector-java-5.1.47 version.

    • TPC-H Tool: Download it from the TPC-H Tool download page. You can skip this tool if you use the OBD one-click test.

    • OBClient: See OBClient documentation.

    • OceanBase Database: See Quickly experience OceanBase Database.

    • IOPS: Use disks with IOPS above 10,000.

    Tenant specification configuration

    The tenant specification follows the hardware configuration in the OceanBase Database TPC-H Benchmark Report. Adjust it to match your hardware.

    • Cluster deployment

      1. This test requires four machines. Deploy TPC-H and OBD on one machine as the client load generator. Use OBD to deploy a three-node OceanBase cluster (1:1:1 scale).

        Note

        • For the TPC-H test, the machine that hosts TPC-H and OBD requires only four CPU cores and 16 GB of memory.
        • When deploying a cluster, avoid the obd cluster autodeploy command. It limits resource usage (for example, memory) for stability. Instead, tune the configuration file to make full use of available resources.
      2. After deployment, create a tenant and user for the TPC-H test. Do not use the sys tenant—it is the internal management tenant. Set the tenant's primary_zone to RANDOM so that leaders of newly created table partitions are distributed randomly across the three nodes.

    • Tenant creation

      You can run the OBD CLUSTER TENANT CREATE command to create a test tenant. The command syntax is as follows:

      obd cluster tenant create <DEPLOY_NAME> -n <TENANT_NAME> --max-cpu=28 --memory-size=180G -–zone-list=zone1,zone2,zone3 -–primary-zone=RANDOM  --locality=F@zone1,F@zone2,F@zone3 --charset=utf8 -s 'ob_tcp_invited_nodes="%"' --optimize=<optimize>
      

      The parameters are described as follows:

      • DEPLOY_NAME: the name of the cluster.

      • TENANT_NAME: the name of the tenant.

      • --zone-list: the list of zones for the tenant.

      • --primary-zone: the primary zone for the tenant.

      • --locality: the distribution of replicas across zones.

      • --charset: the character set of the tenant.

      • -s: the system variable value of the tenant.

      • OPTIMIZE: the load type of the tenant, which can be express_oltp, complex_oltp, olap, htap, or kv. The default load type is htap, which is suitable for hybrid OLAP and OLTP workloads. For more information about OBD deployment, see obd cluster tenant create.

        Notice

        For V4.3.x and later versions, when you use OBD for deployment, you can set the scenario parameter to specify the cluster load type. If you do not set the scenario parameter, the default value is htap. For more information, see Deploy OceanBase Database with OBD.

      For example, create a tenant named tpch_tenant that uses the obperf cluster with 28 CPU cores and 180 GB of memory, and set the default tenant load type to match the cluster scenario.

      obd cluster tenant create obperf -n tpch_tenant --max-cpu=28 --memory-size=180G -–zone-list=zone1,zone2,zone3 -–primary-zone=RANDOM  --locality=F@zone1,F@zone2,F@zone3 --charset=utf8 -s 'ob_tcp_invited_nodes="%"' --optimize=htap
      

      Note

      In this example, --optimize=htap is the default load type. In a production environment, select the appropriate load type based on the cluster type.

    Test methods

    After the test environment is ready, you can perform the TPC-H performance test by using either of the following two methods:

    • Run the TPC-H test with OBD (one-click)

    • Run the TPC-H test manually with the TPC-H tool

    Run a one-click TPC-H test with OBD

    Run a TPC-H test with one command using OBD:

    sudo yum install -y yum-utils
    sudo yum-config-manager --add-repo https://mirrors.aliyun.com/oceanbase/OceanBase.repo
    sudo yum install obtpch
    sudo ln -s /usr/tpc-h-tools/tpc-h-tools/ /usr/local/
    obd test tpch obperf  --tenant=tpch_mysql -s 100 --remote-tbl-dir=/tmp/tpch100
    

    Before running the script, note the following:

    • For TPC-H test parameters, see obd test tpch.

    • This example uses mostly default values. Adjust parameters for your environment as needed. Here, the cluster name is obperf and the tenant name is tpch_mysql.

    • For a one-click OBD test, the cluster must have been installed and deployed by OBD. Otherwise, OBD cannot read cluster information or tune performance based on the cluster configuration.

    • If you changed the sys tenant password from the default empty value in the client, reset it to empty in the client first. Then use obd cluster edit-config to set the password in the configuration file (# root_password: # root user password). After editing the config, run obd cluster reload to apply the change.

    • After you run obd test tpch, the system prints each step and its output in detail. Larger data volumes take longer.

    • The remote-tbl-dir directory must have enough space for TPC-H data. We recommend using a separate disk for load test data.

    • The obd test tpch command runs the full workflow without further manual steps: data generation and transfer, OceanBase parameter tuning, data loading, and testing. If a step fails, retry with obd test tpch—for example, skip data generation and transfer and run loading and testing only.

    Run the TPC-H test manually with the TPC-H tool

    After you choose a cluster load type and tenant tuning scenario, you can run TPC-H manually to better understand OceanBase Database, especially parameter tuning.

    Step 1: Create a test tenant

    Note

    If the test tenant has been created during the environment preparation, you can skip this step.

    Run the following commands in the system tenant (sys tenant) to create a test tenant:

    Note

    The OceanBase cluster environment for this test is deployed in the 1:1:1 mode.

    1. Create a resource unit named mysql_box.

      CREATE RESOURCE UNIT mysql_box
         MAX_CPU 28,
         MEMORY_SIZE '200G',
         MIN_IOPS 200000,
         MAX_IOPS 12800000,
         LOG_DISK_SIZE '300G';
      
    2. Create a resource pool named mysql_pool.

      CREATE RESOURCE POOL mysql_pool
         UNIT = 'mysql_box',
         UNIT_NUM = 1,
         ZONE_LIST = ('z1','z2','z3');
      
    3. Create a MySQL tenant named mysql_tenant.

      CREATE TENANT mysql_tenant
         RESOURCE_POOL_LIST = ('mysql_pool'),
         PRIMARY_ZONE = RANDOM,
         LOCALITY = 'F@z1,F@z2,F@z3'
         SET VARIABLES ob_compatibility_mode='mysql', ob_tcp_invited_nodes='%', secure_file_priv = "/";
      

    Step 2: Optimize the environment

    1. Optimize OceanBase Database.

      Run the following commands in the system tenant (sys tenant) to configure the related parameters.

      ALTER SYSTEM flush plan cache GLOBAL;
      
      ALTER system SET enable_sql_audit=false;
      
      select sleep(5);
      
      ALTER system SET enable_perf_event=false;
      
      ALTER system SET syslog_level='PERF';
      
      alter system set enable_record_trace_log=false;
      
      alter system set data_storage_warning_tolerance_time = '300s';
      
      alter system set _data_storage_io_timeout = '600s';
      
      alter system set trace_log_slow_query_watermark = '7d';
      
      alter system set large_query_threshold='0ms';
      
      alter system set enable_syslog_recycle= 1;
      
      alter system set _max_px_workers_per_cpu = 4 tenant= 'mysql_tenant';
      
      alter system set default_table_organization = 'HEAP' tenant = 'mysql_tenant';
      
    2. Optimize the tenant.

      Run the following commands in the test tenant (user tenant) to configure the related parameters.

      SET global NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS';
      SET global NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI:SS.FF';
      SET global NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI:SS.FF TZR TZD';
      
      set global ob_query_timeout=10800000000;
      set global ob_trx_timeout=10000000000;
      
      set global ob_sql_work_area_percentage=80;
      
      alter system set ob_enable_batched_multi_statement='true';
      alter system set default_table_store_format = 'column' ;
      alter system set _io_read_batch_size = '2M';
      alter system set _io_read_redundant_limit_percentage = 50;
      
      set global parallel_servers_target=10000;
      
      set global collation_connection = utf8mb4_bin;
      set global collation_database = utf8mb4_bin;
      set global collation_server = utf8mb4_bin;
      
      set global autocommit=1;
      
      alter system set _nested_loop_join_enabled = false;
      
      alter system set ob_enable_batched_multi_statement='true';
      

    Step 3: Install TPC-H Tool

    1. Download TPC-H Tool. For more information, see TPC-H Tool Download Page.

    2. After the file is downloaded, decompress it and go to the TPC-H directory.

      [wieck@localhost ~] $ unzip 7e965ead-8844-4efa-a275-34e35f8ab89b-tpc-h-tool.zip
      [wieck@localhost ~] $ cd TPC-H_Tools_v3.0.0
      
    3. Copy the Makefile.suite file.

      [wieck@localhost TPC-H_Tools_v3.0.0] $ cd dbgen/
      [wieck@localhost dbgen] $ cp Makefile.suite Makefile
      
    4. Modify the CC, DATABASE, MACHINE, and WORKLOAD parameters in the Makefile file.

      [wieck@localhost dbgen] $ vim Makefile
      
      CC      = gcc
      # Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
      #                                  SQLSERVER, SYBASE, ORACLE, VECTORWISE
      # Current values for MACHINE are:  ATT, DOS, HP, IBM, ICL, MVS,
      #                                  SGI, SUN, U2200, VMS, LINUX, WIN32
      # Current values for WORKLOAD are:  TPCH
      DATABASE= MYSQL
      MACHINE = LINUX
      WORKLOAD = TPCH
      
    5. Modify the tpcd.h file and add the new macro definitions.

      [wieck@localhost dbgen] $ vim tpcd.h
      
      #ifdef MYSQL
      #define GEN_QUERY_PLAN ""
      #define START_TRAN "START TRANSACTION"
      #define END_TRAN "COMMIT"
      #define SET_OUTPUT ""
      #define SET_ROWCOUNT "limit %d;\n"
      #define SET_DBASE "use %s;\n"
      #endif
      
    6. Compile the file.

      make
      

      The returned result is as follows:

      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o build.o build.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o driver.o driver.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o bm_utils.o bm_utils.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o rnd.o rnd.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o print.o print.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o load_stub.o load_stub.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o bcd2.o bcd2.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o speed_seed.o speed_seed.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o text.o text.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o permute.o permute.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o rng64.o rng64.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64  -O -o dbgen build.o driver.o bm_utils.o rnd.o print.o load_stub.o bcd2.o speed_seed.o text.o permute.o rng64.o -lm
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o qgen.o qgen.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64    -c -o varsub.o varsub.c
      gcc  -g -DDBNAME=\"dss\" -DLINUX -DMYSQL -DTPCH -DRNG_TEST -D_FILE_OFFSET_BITS=64  -O -o qgen build.o bm_utils.o qgen.o rnd.o varsub.o text.o bcd2.o permute.o speed_seed.o rng64.o -lm
      

      This step produces dbgen for data generation and qgen and dists.dss for SQL generation.

    Step 4: Generate data

    Generate 10 GB, 100 GB, or 1 TB of TPC-H data depending on your environment. This example uses 100 GB.

    ./dbgen -s 100
    mkdir tpch100
    mv *.tbl tpch100
    

    Generate 1 TB of data using multiple threads. OceanBase Database supports direct load, which allows you to import data from multiple files into a table at the same time:

    #!/bin/bash  
      
    SCALE_FACTOR=1000  
    CHUNK_COUNT=20  
    for ((i=1; i<=CHUNK_COUNT; i++))  
    do   
       CMD="./dbgen -s ${SCALE_FACTOR} -C ${CHUNK_COUNT} -S ${i} -vf"   
       $CMD &  
    done  
    wait  
    echo "All data generation tasks completed."
    

    Step 5: Generate query SQL statements

    Note

    Follow the steps below to generate query SQL, then adjust as needed. Alternatively, use the query SQL on GitHub. If you use the GitHub queries, replace cpu_num with your actual concurrency level.

    Use the TPC-H tool to generate query SQL as follows:

    1. Copy the dbgen/qgen and dbgen/dists.dss files to the mysql_sql directory.

    2. Create a gen.sh script in the mysql_sql directory to generate query SQL statements.

      vim gen.sh
      
      #!/usr/bin/bash
      for i in {1..22}
      do  
      ./qgen -d $i -s 100 > db"$i".sql
      done
      
    3. Modify the query SQL statements based on the actual concurrency number.

      You can use the following command in the sys tenant to view the total number of available CPU cores for the tenant.

      select sum(max_cpu) from DBA_OB_UNITS;
      

      Here is an example of the modified SQL statement for Q1:

      SELECT /*+    parallel(96) */   ---Add parallel execution
      l_returnflag,
      l_linestatus,
      sum(l_quantity) as sum_qty,
      sum(l_extendedprice) as sum_base_price,
      sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
      sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
      avg(l_quantity) as avg_qty,
      avg(l_extendedprice) as avg_price,
      avg(l_discount) as avg_disc,
      count(*) as count_order
      FROM
      lineitem
      WHERE
      l_shipdate <= date '1998-12-01' - interval '90' day 
      GROUP BY
      l_returnflag,
      l_linestatus
      ORDER BY
      l_returnflag,
      l_linestatus;
      

    Step 6: Create tables

    • For 100 GB of data, create a table structure file named create_tpch_mysql_table_part.ddl.

      drop tablegroup IF EXISTS tpch_tg_SF_TPC_USER_lineitem_order_group;
      drop tablegroup IF EXISTS  tpch_tg_SF_TPC_USER_partsupp_part;
      create tablegroup tpch_tg_SF_TPC_USER_lineitem_order_group binding true partition by key 1 partitions 256;
      create tablegroup tpch_tg_SF_TPC_USER_partsupp_part binding true partition by key 1 partitions 256;
      
      
      DROP TABLE IF EXISTS LINEITEM;
      CREATE TABLE lineitem (
         l_orderkey bigint NOT NULL,
         l_partkey int(32) NOT NULL,
         l_suppkey int(32) NOT NULL,
         l_linenumber int(32) NOT NULL,
         l_quantity decimal(32,2) NOT NULL,
         l_extendedprice decimal(32,2) NOT NULL,
         l_discount decimal(15,2) NOT NULL,
         l_tax decimal(15,2) NOT NULL,
         l_returnflag char(1) DEFAULT NULL,
         l_linestatus char(1) DEFAULT NULL,
         l_shipdate date NOT NULL,
         l_commitdate date DEFAULT NULL,
         l_receiptdate date DEFAULT NULL,
         l_shipinstruct varchar(64) DEFAULT NULL,
         l_shipmode varchar(64) DEFAULT NULL,
         l_comment varchar(64) DEFAULT NULL,
      primary key(l_shipdate, l_orderkey, l_linenumber)
      )row_format = condensed
      partition by key (l_orderkey) partitions 96 with column group(each column);
      alter table lineitem CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS ORDERS;
      CREATE TABLE orders (
         o_orderkey bigint NOT NULL,
         o_custkey int(32) NOT NULL,
         o_orderstatus varchar(64) DEFAULT NULL,
         o_totalprice decimal(15,2) DEFAULT NULL,
         o_orderdate date NOT NULL,
         o_orderpriority varchar(15) DEFAULT NULL,
         o_clerk varchar(15) DEFAULT NULL,
         o_shippriority int(32) DEFAULT NULL,
         o_comment varchar(128) DEFAULT NULL,
      PRIMARY KEY (o_orderkey, o_orderdate)
      ) row_format = condensed
      partition by key(o_orderkey) partitions 96 with column group(each column);
      alter table orders CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS PARTSUPP;
      CREATE TABLE partsupp (
         ps_partkey int(11) NOT NULL,
         ps_suppkey int(11) NOT NULL,
         ps_availqty int(11) DEFAULT NULL,
         ps_supplycost decimal(15,2) DEFAULT NULL,
         ps_comment varchar(199) DEFAULT NULL,
         PRIMARY KEY (ps_partkey, ps_suppkey)) row_format = condensed
      partition by key(ps_partkey) partitions 96 with column group(each column);
      alter table partsupp CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS PART;
      CREATE TABLE part (
      p_partkey int(11) NOT NULL,
      p_name varchar(55) DEFAULT NULL,
      p_mfgr varchar(25) DEFAULT NULL,
      p_brand varchar(10) DEFAULT NULL,
      p_type varchar(25) DEFAULT NULL,
      p_size int(11) DEFAULT NULL,
      p_container varchar(10) DEFAULT NULL,
      p_retailprice decimal(12,2) DEFAULT NULL,
      p_comment varchar(23) DEFAULT NULL,
      PRIMARY KEY (p_partkey)) row_format = condensed
      partition by key(p_partkey) partitions 96 with column group(each column);
      alter table part CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS CUSTOMER;
      CREATE TABLE customer (
      c_custkey int(11) NOT NULL,
      c_name varchar(25) DEFAULT NULL,
      c_address varchar(40) DEFAULT NULL,
      c_nationkey int(11) DEFAULT NULL,
      c_phone varchar(15) DEFAULT NULL,
      c_acctbal decimal(15,2) DEFAULT NULL,
      c_mktsegment char(10) DEFAULT NULL,
      c_comment varchar(117) DEFAULT NULL,
      PRIMARY KEY (c_custkey)) row_format = condensed
      partition by key(c_custkey) partitions 96 with column group(each column);
      alter table customer CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS SUPPLIER;
      CREATE TABLE supplier (
      s_suppkey int(11) NOT NULL,
      s_name varchar(25) DEFAULT NULL,
      s_address varchar(40) DEFAULT NULL,
      s_nationkey int(11) DEFAULT NULL,
      s_phone varchar(15) DEFAULT NULL,
      s_acctbal decimal(15,2) DEFAULT NULL,
      s_comment varchar(101) DEFAULT NULL,
      PRIMARY KEY (s_suppkey)
      ) row_format = condensed partition by key(s_suppkey) with column group(each column);
      alter table supplier CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS NATION;
      CREATE TABLE nation (
      n_nationkey int(11) NOT NULL,
      n_name varchar(25) DEFAULT NULL,
      n_regionkey int(11) DEFAULT NULL,
      n_comment varchar(152) DEFAULT NULL,
      PRIMARY KEY (n_nationkey)
      ) row_format = condensed with column group(each column);
      alter table nation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS REGION;
      CREATE TABLE region (
      r_regionkey int(11) NOT NULL,
      r_name varchar(25) DEFAULT NULL,
      r_comment varchar(152) DEFAULT NULL,
      PRIMARY KEY (r_regionkey)
      ) row_format = condensed with column group(each column);
      alter table region CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      CREATE VIEW revenue0 AS 
      SELECT l_suppkey as supplier_no,
               SUM(l_extendedprice * ( 1 - l_discount )) as total_revenue
                  FROM   lineitem
                  WHERE  l_shipdate >= DATE '1996-01-01'
                           AND l_shipdate < DATE '1996-04-01'
                  GROUP  BY l_suppkey;
      
    • For 1 TB of data, create a table structure file named create_tpch_mysql_table_part_1000G.ddl.

      drop tablegroup IF EXISTS tpch_tg_SF_TPC_USER_lineitem_order_group_1000;
      drop tablegroup IF EXISTS  tpch_tg_SF_TPC_USER_partsupp_part_1000;
      create tablegroup tpch_tg_SF_TPC_USER_lineitem_order_group_1000 binding true partition by key 1 partitions 256;
      create tablegroup tpch_tg_SF_TPC_USER_partsupp_part_1000 binding true partition by key 1 partitions 256;
      
      
      DROP TABLE IF EXISTS LINEITEM;
      CREATE TABLE lineitem (
         l_orderkey bigint NOT NULL,
         l_partkey int(32) NOT NULL,
         l_suppkey int(32) NOT NULL,
         l_linenumber int(32) NOT NULL,
         l_quantity decimal(32,2) NOT NULL,
         l_extendedprice decimal(32,2) NOT NULL,
         l_discount decimal(15,2) NOT NULL,
         l_tax decimal(15,2) NOT NULL,
         l_returnflag char(1) DEFAULT NULL,
         l_linestatus char(1) DEFAULT NULL,
         l_shipdate date NOT NULL,
         l_commitdate date DEFAULT NULL,
         l_receiptdate date DEFAULT NULL,
         l_shipinstruct varchar(64) DEFAULT NULL,
         l_shipmode varchar(64) DEFAULT NULL,
         l_comment varchar(64) DEFAULT NULL,
      primary key(l_shipdate, l_orderkey, l_linenumber)
      )row_format = condensed
      partition by key (l_orderkey) partitions 96 with column group(each column);
      alter table lineitem CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS ORDERS;
      CREATE TABLE orders (
         o_orderkey bigint NOT NULL,
         o_custkey int(32) NOT NULL,
         o_orderstatus varchar(64) DEFAULT NULL,
         o_totalprice decimal(15,2) DEFAULT NULL,
         o_orderdate date NOT NULL,
         o_orderpriority varchar(15) DEFAULT NULL,
         o_clerk varchar(15) DEFAULT NULL,
         o_shippriority int(32) DEFAULT NULL,
         o_comment varchar(128) DEFAULT NULL,
      PRIMARY KEY (o_orderkey, o_orderdate)
      ) row_format = condensed
      partition by key(o_orderkey) partitions 96 with column group(each column);
      alter table orders CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS PARTSUPP;
      CREATE TABLE partsupp (
         ps_partkey int(11) NOT NULL,
         ps_suppkey int(11) NOT NULL,
         ps_availqty int(11) DEFAULT NULL,
         ps_supplycost decimal(15,2) DEFAULT NULL,
         ps_comment varchar(199) DEFAULT NULL,
         PRIMARY KEY (ps_partkey, ps_suppkey)) row_format = condensed
      partition by key(ps_partkey) partitions 96 with column group(each column);
      alter table partsupp CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS PART;
      CREATE TABLE part (
      p_partkey int(11) NOT NULL,
      p_name varchar(55) DEFAULT NULL,
      p_mfgr varchar(25) DEFAULT NULL,
      p_brand varchar(10) DEFAULT NULL,
      p_type varchar(25) DEFAULT NULL,
      p_size int(11) DEFAULT NULL,
      p_container varchar(10) DEFAULT NULL,
      p_retailprice decimal(12,2) DEFAULT NULL,
      p_comment varchar(23) DEFAULT NULL,
      PRIMARY KEY (p_partkey)) row_format = condensed
      partition by key(p_partkey) partitions 96 with column group(each column);
      alter table part CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS CUSTOMER;
      CREATE TABLE customer (
      c_custkey int(11) NOT NULL,
      c_name varchar(25) DEFAULT NULL,
      c_address varchar(40) DEFAULT NULL,
      c_nationkey int(11) DEFAULT NULL,
      c_phone varchar(15) DEFAULT NULL,
      c_acctbal decimal(15,2) DEFAULT NULL,
      c_mktsegment char(10) DEFAULT NULL,
      c_comment varchar(117) DEFAULT NULL,
      PRIMARY KEY (c_custkey)) row_format = condensed
      partition by key(c_custkey) partitions 96 with column group(each column);
      alter table customer CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS SUPPLIER;
      CREATE TABLE supplier (
      s_suppkey int(11) NOT NULL,
      s_name varchar(25) DEFAULT NULL,
      s_address varchar(40) DEFAULT NULL,
      s_nationkey int(11) DEFAULT NULL,
      s_phone varchar(15) DEFAULT NULL,
      s_acctbal decimal(15,2) DEFAULT NULL,
      s_comment varchar(101) DEFAULT NULL,
      PRIMARY KEY (s_suppkey)
      ) row_format = condensed partition by key(s_suppkey) with column group(each column);
      alter table supplier CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS NATION;
      CREATE TABLE nation (
      n_nationkey int(11) NOT NULL,
      n_name varchar(25) DEFAULT NULL,
      n_regionkey int(11) DEFAULT NULL,
      n_comment varchar(152) DEFAULT NULL,
      PRIMARY KEY (n_nationkey)
      ) row_format = condensed with column group(each column);
      alter table nation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      DROP TABLE IF EXISTS REGION;
      CREATE TABLE region (
      r_regionkey int(11) NOT NULL,
      r_name varchar(25) DEFAULT NULL,
      r_comment varchar(152) DEFAULT NULL,
      PRIMARY KEY (r_regionkey)
      ) row_format = condensed with column group(each column);
      alter table region CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
      
      CREATE VIEW revenue0 AS 
      SELECT l_suppkey as supplier_no,
               SUM(l_extendedprice * ( 1 - l_discount )) as total_revenue
                  FROM   lineitem
                  WHERE  l_shipdate >= DATE '1996-01-01'
                           AND l_shipdate < DATE '1996-04-01'
                  GROUP  BY l_suppkey;
      

    Step 7: Load data

    Use the data and SQL from the previous steps to build a load script. The example below shows one approach:

    1. Create a load script named load_data.sh.

      #!/bin/bash
      host='$host_ip'   # Note: Please fill in the IP address of an observer, such as the server where observer A is located. It is recommended to store the data files on the same server.
      port='$host_port' # The port number of observer A.
      user='$user'      # The username.
      tenant='$tenant_name'  # The tenant name.
      password='$password'   # The password.
      database='$db_name'    # The database name.
      data_path='$data_file' # Note: Please fill in the path of a data file generated in the data generation step, such as the .tbl file generated in observer A.
      
      function load_data
      {
         remote_user="$user"         # The username of the observer node where the data is stored.
         table_name=${1}
         if [[ ${password} == "" ]];then
            obclient_conn="obclient -h${host} -P${port} -u${user} -D${database} -A -c"
         else
            obclient_conn="obclient -h${host} -P${port} -u${user} -D${database} -p${password} -A -c"
         fi
         table_list=$(ssh "${remote_user}@${host}" "ls ${data_path}/${table_name}.tbl* 2>/dev/null")  
         echo "$table_list"
      
         IFS=$'\n' read -d '' -r -a table_files <<< "$table_list" 
         table_files_comma_separated=$(IFS=,; echo "${table_files[*]}")
         echo "${table_files_comma_separated}"
         echo `date "+[%Y-%m-%d %H:%M:%S]"` "----------------------Importing data files of the ${table_name} table----------------------"
      
         # Import data by using direct load. You can modify the following statement to use other data import methods.
         # Note: The data files must be stored on the test server of the OBServer.
         echo "load data /*+ parallel(80) direct(true,0) */ infile '${table_files_comma_separated}' into table ${table_name} fields terminated by '|';" | ${obclient_conn}
      
      }
      
      starttime=`date +%s%N`
      for table in "nation" "region" "customer" "lineitem" "orders" "partsupp" "part" "supplier"
      do
         load_data "${table}"
      done
      end_time=`date +%s%N`
      totaltime=`echo ${end_time} ${starttime} | awk '{printf "%0.2f\n", ($1 - $2) / 1000000000}'`
      echo `date "+[%Y-%m-%d %H:%M:%S]"` "load data cost ${totaltime}s"
      

      After the data is loaded, you must perform a major compaction and collect statistics.

    2. Perform a major compaction.

      Execute the following statement in the test tenant to trigger a major compaction:

      ALTER SYSTEM MAJOR FREEZE;
      
    3. Check whether the major compaction is completed.

      You can check whether the major compaction is completed in the sys tenant.

      SELECT dt.TENANT_NAME, cc.FROZEN_SCN, cc.LAST_SCN
      FROM oceanbase.DBA_OB_TENANTS dt, oceanbase.CDB_OB_MAJOR_COMPACTION cc
      WHERE dt.TENANT_ID = cc.TENANT_ID
      AND dt.TENANT_NAME = 'mysql_tenant';
      

      Note

      If all FROZEN_SCN and LAST_SCN values are equal, the major compaction is completed.

    4. Collect statistics.

      Create a file named analyze_table.sql:

      call dbms_stats.gather_table_stats(NULL, 'part', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
      call dbms_stats.gather_table_stats(NULL, 'lineitem', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
      call dbms_stats.gather_table_stats(NULL, 'customer', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
      call dbms_stats.gather_table_stats(NULL, 'orders', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
      call dbms_stats.gather_table_stats(NULL, 'partsupp', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
      call dbms_stats.gather_table_stats(NULL, 'supplier', degree=>128, granularity=>'AUTO', method_opt=>'FOR ALL COLUMNS SIZE 128');
      

      Log in to the test tenant and run:

      source analyze_table.sql 
      

    Step 8: Run the test

    Build a test script from the data and SQL created in the previous steps. The example below shows one approach:

    1. Create a test script named tpch.sh.

      #!/bin/bash
      host='$host_ip'   # Note: Please fill in the IP address of an observer, such as observer A.
      port='$host_port' # The port number of observer A.
      user='$user'      # The username.
      tenant='$tenant_name'  # The tenant name.
      password='$password'   # The password.
      database='$db_name'    # The database name.
      if [[ ${password} == "" ]];then
      TPCH_TEST="obclient -h${host} -P${port} -u${user}@{$tenant} -D${database} -A -c"
      else
      TPCH_TEST="obclient -h${host} -P${port} -p${password} -u${user}@{$tenant} -D${database} -A -c"
      fi
      
      
      function clear_kvcache
      {
         if [[ ${password_sys} == "" ]];then
            obclient_sys="obclient -h${host} -P${port} -uroot@sys -Doceanbase -A -c"
         else
            obclient_sys="obclient -h${host} -P${port} -uroot@sys -Doceanbase -p${password_sys} -A -c"
         fi
         tenant_name=${user#*@}
         echo "alter system flush kvcache ;" | ${obclient_sys}
         echo "alter system flush kvcache tenant '${tenant_name}' cache 'user_row_cache';" | ${obclient_sys}
         sleep 3s
      }
      
      function do_explain
      {
      # Execution plan
      echo `date  '+[%Y-%m-%d %H:%M:%S]'` "BEGIN EXPLAIN ALL TPCH PLAN"
      for i in {1..22}
      do
         sql_explain="source explain_mysql/${i}.sql"
         echo `date  '+[%Y-%m-%d %H:%M:%S]'` "BEGIN EXPLAIN Q${i}:"
         echo ${sql_explain} | ${TPCH_TEST} | sed 's/\\n/\n/g' |tee explain_log/${i}.exp
         echo `date  '+[%Y-%m-%d %H:%M:%S]'` "Q${i} END"
      done
      }
      
      function do_warmup
      {
      # Warm-up
      totaltime=0
      for i in {1..22}
      do
            starttime=`date +%s%N`
            echo `date  '+[%Y-%m-%d %H:%M:%S]'` "BEGIN prewarm Q${i}"
            sql1="source mysql_sql/${i}.sql"
            echo ${sql1}| ${TPCH_TEST} > mysql_log/${i}_prewarm.log  || ret=1
            stoptime=`date +%s%N`
            costtime=`echo ${stoptime} ${starttime} | awk '{printf "%0.2f\n", ($1 - $2) / 1000000000}'`
            first_array[$i]=$(echo "scale=2; ${first_array[$i]} + $costtime" | bc)  
            echo `date  '+[%Y-%m-%d %H:%M:%S]'` "END,COST ${costtime}s"
            totaltime=`echo ${totaltime} ${costtime} | awk '{printf "%0.2f\n", ($1 + $2)}'`
      done
      echo "total cost:${totaltime}s"
      }
      
      function hot_run
      {
      # Actual execution
      for j in {1..10}
      do
      totaltime=0
      for i in {1..22}
      do
            starttime=`date +%s%N`
            echo `date  '+[%Y-%m-%d %H:%M:%S]'` "BEGIN BEST Q${i} (hot run)"
            sql1="source mysql_sql/${i}.sql"
            echo ${sql1}| ${TPCH_TEST} > mysql_log/${i}.log  || ret=1
            stoptime=`date +%s%N`
            costtime=`echo ${stoptime} ${starttime} | awk '{printf "%0.2f\n", ($1 - $2) / 1000000000}'`
            hot_array[$i]=$(echo "scale=2; ${hot_array[$i]} + $costtime" | bc)  
            echo `date  '+[%Y-%m-%d %H:%M:%S]'` "END,COST ${costtime}s"
            totaltime=`echo ${totaltime} ${costtime} | awk '{printf "%0.2f\n", ($1 + $2)}'`
      done
      echo "total cost:${totaltime}s"
      done
      }
      
      function cold_run
      {
      # Actual execution
      for j in {1..3}
      do
      totaltime=0
      for i in {1..22}
      do
            clear_kvcache
            starttime=`date +%s%N`
            echo `date  '+[%Y-%m-%d %H:%M:%S]'` "BEGIN BEST Q${i} (cold run)"
            sql1="source mysql_sql/${i}.sql"
            echo $sql1| $TPCH_TEST > mysql_log/${i}_cold.log  || ret=1
            stoptime=`date +%s%N`
            costtime=`echo $stoptime $starttime | awk '{printf "%0.2f\n", ($1 - $2) / 1000000000}'`
            cold_array[$i]=$(echo "scale=2; ${cold_array[$i]} + $costtime" | bc)
            echo `date  '+[%Y-%m-%d %H:%M:%S]'` "END,COST ${costtime}s"
            totaltime=`echo ${totaltime} ${costtime} | awk '{printf "%0.2f\n", ($1 + $2)}'`
      done
      echo "total cost:${totaltime}s"
      done
      }
      
      do_explain
      
      do_warmup
      
      hot_run
      
      cold_run
      
    2. Execute the test script.

      sh tpch.sh
      

    FAQ

    • Data import fails with:

      ERROR 1017 (HY000) at line 1: File not exist
      

      Place the .tbl files on the same server as the OceanBase instance you connect to. Data import must run locally on that server.

    • Query fails with:

      ERROR 4624 (HY000): No memory or reach tenant memory limit
      

      The tenant is out of memory. Increase the tenant memory limit.

    • Data import fails with:

      ERROR 1227 (42501) at line 1: Access denied
      

      Grant the required privileges:

      grant file on *.* to tpch_100g_part;
      

    Previous topic

    OceanBase Sysbench performance issue analysis
    Last

    Next topic

    OceanBase Database TPC-H benchmark report
    Next
    What is on this page
    What is TPC-H
    Prepare the environment
    Software requirements
    Tenant specification configuration
    Test methods
    Run a one-click TPC-H test with OBD
    Run the TPC-H test manually with the TPC-H tool
    FAQ