Perform single-table and multi-table queries on imported TPC-H sample data, comparing performance with and without parallel execution. You can also explore other TPC-H tests.
Concepts
Parallel Execution
OceanBase Database’s parallel execution refers to concurrent processing in a distributed architecture. OceanBase utilizes distributed architecture, storing data shards across different nodes, achieving parallel processing of both data and compute through data and task partitioning. In OceanBase, each node can independently execute tasks and communicate with each other via high-speed networks to facilitate data interaction and synchronization.
Prerequisites
You have completed the tasks in Get Started with OceanBase Cloudor have existing tenants in the MySQL mode and the corresponding database and account in your environment.
Scenario One: Single Table Query
Select your MySQL tenant.
Click to navigate to Tenant Workspace.
Click to navigate to SQL Console, log in to the SQL console using the database account “tutorial”, and select database “tutorialdb”.
Enter the following SQL statements into the SQL window.
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '90' day
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
- Click to execute the SQL , and query data in table
lineitemwhile recording execution time.
SQL Execution Analysis
The execution time is approximately 315.73ms, which is 3~5 times faster than a MySQL instance of similar specifications. OceanBase, based on its distributed architecture, not only ensures high-performance transaction processing but also supports real-time analysis and batch processing scenarios, minimizing data redundancy and significantly reducing overall costs for enterprises.
Notice
The above data is for reference only. Please refer to your actual experimental results for accurate values.
Scenario Two: Multi-Table Query
Aggregate query on tables part, supplier, partsupp, nation, region.
select
s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
from
part,
supplier,
partsupp,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 15
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
and ps_supplycost = (
select
min(ps_supplycost)
from
partsupp,
supplier,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
)
order by
s_acctbal desc,
n_name,
s_name,
p_partkey;
Execute the following command to enable parallel execution with 8 parallel threads to speed up computation and record computation time.
select /*+ parallel(8) */
s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
from
part,
supplier,
partsupp,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 15
and p_type like '%BRASS'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
and ps_supplycost = (
select
min(ps_supplycost)
from
partsupp,
supplier,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
)
order by
s_acctbal desc,
n_name,
s_name,
p_partkey;
Click to execute SQL, and query data in table lineitem while recording execution time.
SQL Execution Analysis
Without parallel execution, the execution time is approximately 1.44s. With 8 parallel executions enabled, the execution time is reduced to 397.7ms, demonstrating significantly faster performance. OceanBase, leveraging its distributed architecture, not only ensures high-performance transaction processing but also supports real-time analysis and batch processing scenarios, minimizing data redundancy and significantly reducing overall costs for enterprises.
Notice
The above data is for reference only. Please refer to your actual experimental results for accurate values.