Explore Complex SQL Performance with OceanBase Cloud

2024-06-21 03:03:07  Updated

Perform single-table and multi-table queries on imported TPC-H sample data, comparing performance with and without parallel execution. You can also explore other TPC-H tests.

Concepts

Parallel Execution

OceanBase Database’s parallel execution refers to concurrent processing in a distributed architecture. OceanBase utilizes distributed architecture, storing data shards across different nodes, achieving parallel processing of both data and compute through data and task partitioning. In OceanBase, each node can independently execute tasks and communicate with each other via high-speed networks to facilitate data interaction and synchronization.

Prerequisites

You have completed the tasks in Get Started with OceanBase Cloudor have existing tenants in the MySQL mode and the corresponding database and account in your environment.

Scenario One: Single Table Query

  1. Select your MySQL tenant.

  2. Click to navigate to Tenant Workspace.

  3. Click to navigate to SQL Console, log in to the SQL console using the database account “tutorial”, and select database “tutorialdb”.

  4. Enter the following SQL statements into the SQL window.

    select 
        l_returnflag,
        l_linestatus,
        sum(l_quantity) as sum_qty,
        sum(l_extendedprice) as sum_base_price,
        sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
        sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
        avg(l_quantity) as avg_qty,
        avg(l_extendedprice) as avg_price,
        avg(l_discount) as avg_disc,
        count(*) as count_order
    from
        lineitem
    where
        l_shipdate <= date '1998-12-01' - interval '90' day 
    group by
        l_returnflag,
        l_linestatus
    order by
        l_returnflag,
        l_linestatus;
  1. Click to execute the SQL , and query data in table lineitem while recording execution time.

SQL Execution Analysis

The execution time is approximately 315.73ms, which is 3~5 times faster than a MySQL instance of similar specifications. OceanBase, based on its distributed architecture, not only ensures high-performance transaction processing but also supports real-time analysis and batch processing scenarios, minimizing data redundancy and significantly reducing overall costs for enterprises.

Notice

The above data is for reference only. Please refer to your actual experimental results for accurate values.

Scenario Two: Multi-Table Query

Aggregate query on tables part, supplier, partsupp, nation, region.

select 
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 15
    and p_type like '%BRASS'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'EUROPE'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            partsupp,
            supplier,
            nation,
            region
        where
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'EUROPE'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey;

Execute the following command to enable parallel execution with 8 parallel threads to speed up computation and record computation time.

select /*+ parallel(8) */
    s_acctbal,
    s_name,
    n_name,
    p_partkey,
    p_mfgr,
    s_address,
    s_phone,
    s_comment
from
    part,
    supplier,
    partsupp,
    nation,
    region
where
    p_partkey = ps_partkey
    and s_suppkey = ps_suppkey
    and p_size = 15
    and p_type like '%BRASS'
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'EUROPE'
    and ps_supplycost = (
        select
            min(ps_supplycost)
        from
            partsupp,
            supplier,
            nation,
            region
        where
            p_partkey = ps_partkey
            and s_suppkey = ps_suppkey
            and s_nationkey = n_nationkey
            and n_regionkey = r_regionkey
            and r_name = 'EUROPE'
    )
order by
    s_acctbal desc,
    n_name,
    s_name,
    p_partkey;

Click to execute SQL, and query data in table lineitem while recording execution time.

SQL Execution Analysis

Without parallel execution, the execution time is approximately 1.44s. With 8 parallel executions enabled, the execution time is reduced to 397.7ms, demonstrating significantly faster performance. OceanBase, leveraging its distributed architecture, not only ensures high-performance transaction processing but also supports real-time analysis and batch processing scenarios, minimizing data redundancy and significantly reducing overall costs for enterprises.

Notice

The above data is for reference only. Please refer to your actual experimental results for accurate values.

Contact Us