OceanBase logo

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Resources

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS

OceanBase Cloud

OceanBase Database

Tools

Connectors and Middleware

QUICK START

OceanBase Cloud

OceanBase Database

BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Company

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

International - English
中国站 - 简体中文
日本 - 日本語
Sign In
Start on Cloud

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS
OceanBase CloudOceanBase Database
ToolsConnectors and Middleware
QUICK START
OceanBase CloudOceanBase Database
BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

Start on Cloud
编组
All Products
    • Databases
    • iconOceanBase Database
    • iconOceanBase Cloud
    • iconOceanBase Tugraph
    • iconInteractive Tutorials
    • iconOceanBase Best Practices
    • Tools
    • iconOceanBase Cloud Platform
    • iconOceanBase Migration Service
    • iconOceanBase Developer Center
    • iconOceanBase Migration Assessment
    • iconOceanBase Admin Tool
    • iconOceanBase Loader and Dumper
    • iconOceanBase Deployer
    • iconKubernetes operator for OceanBase
    • iconOceanBase Diagnostic Tool
    • iconOceanBase Binlog Service
    • Connectors and Middleware
    • iconOceanBase Database Proxy
    • iconEmbedded SQL in C for OceanBase
    • iconOceanBase Call Interface
    • iconOceanBase Connector/C
    • iconOceanBase Connector/J
    • iconOceanBase Connector/ODBC
    • iconOceanBase Connector/NET
icon

OceanBase Database

SQL - V4.2.2

    Download PDF

    OceanBase logo

    The Unified Distributed Database for the AI Era.

    Follow Us
    Products
    OceanBase CloudOceanBase EnterpriseOceanBase Community EditionOceanBase seekdb
    Resources
    DocsBlogLive DemosTraining & CertificationTicket
    Company
    About OceanBaseTrust CenterLegalPartnerContact Us
    Follow Us

    © OceanBase 2026. All rights reserved

    Cloud Service AgreementPrivacy PolicySecurity
    Contact Us
    Document Feedback
    1. Documentation Center
    2. OceanBase Database
    3. SQL
    4. V4.2.2
    iconOceanBase Database
    SQL - V 4.2.2
    Databases
    • OceanBase Database
    • OceanBase Cloud
    • OceanBase Tugraph
    • Interactive Tutorials
    • OceanBase Best Practices
    Tools
    • OceanBase Cloud Platform
    • OceanBase Migration Service
    • OceanBase Developer Center
    • OceanBase Migration Assessment
    • OceanBase Admin Tool
    • OceanBase Loader and Dumper
    • OceanBase Deployer
    • Kubernetes operator for OceanBase
    • OceanBase Diagnostic Tool
    • OceanBase Binlog Service
    Connectors and Middleware
    • OceanBase Database Proxy
    • Embedded SQL in C for OceanBase
    • OceanBase Call Interface
    • OceanBase Connector/C
    • OceanBase Connector/J
    • OceanBase Connector/ODBC
    • OceanBase Connector/NET
    SQL
    KV
    • V 4.6.0
    • V 4.4.2
    • V 4.3.5
    • V 4.3.3
    • V 4.3.1
    • V 4.3.0
    • V 4.2.5
    • V 4.2.2
    • V 4.2.1
    • V 4.2.0
    • V 4.1.0
    • V 4.0.0
    • V 3.1.4 and earlier

    Dynamic sampling

    Last Updated:2026-04-15 08:27:14  Updated
    Share
    What is on this page
    Principle
    Application scenarios of dynamic sampling
    Methods for controlling dynamic sampling
    Control dynamic sampling by using a system variable
    Control dynamic sampling by using a hint
    Control dynamic sampling by using a system parameter
    Considerations

    folded

    Share

    Starting from OceanBase Database V4.2.0, the optimizer can perform dynamic sampling. This feature allows the optimizer to collect required statistics at SQL runtime to generate better execution plans, thus optimizing query performance. The dynamic sampling feature is automatically applied to SQL query statements in OceanBase Database, making execution plans more accurate and efficient.

    Principle

    By performing dynamic sampling, the optimizer can not only obtain sufficient statistics, but also sample the target database objects in advance to estimate the number of rows during plan generation. The estimation result is then applied in the cost model to generate a better execution plan.

    The dynamic sampling feature of OceanBase Database provides the following benefits:

    • Dynamic sampling can obtain more accurate statistics based on stale or even insufficient statistics.
    • Dynamic sampling can provide more accurate statistics in queries that contain complex predicates, associative predicates, and so on.
    • Dynamic sampling can reduce the time and cost consumed for statistics collection. For example, for wide tables, regular statistics collection can be time-consuming and resource-intensive.
    • Dynamic sampling can improve the query efficiency by dynamically adjusting the execution plan during a query to adapt to data changes.

    Before an SQL query is executed, the OceanBase Database optimizer collects table and index statistics to select the best execution plan. If the statistics are inaccurate or incomplete, the optimizer may select an execution plan other than the optimal one, resulting in poor query performance. Basic statistics are usually collected automatically or manually. However, if the data distribution changes or statistics are not collected, or in some complex SQL query scenarios, the statistics may no longer be accurate.

    In the following example, both tables t1 and t2 have 1,000 rows of data, and statistics are collected on neither table.

    CREATE TABLE t1(c1 INT, c2 INT, c3 INT);
    CREATE TABLE t2(c1 INT, c2 INT, c3 INT);
    CREATE INDEX  idx_c1 ON t2(c1);
    INSERT INTO t1 SELECT level,level,level FROM DUAL CONNECT BY level<=1000;
    INSERT INTO t2 SELECT level,level,level FROM DUAL CONNECT BY level<=1000;
    

    The execution plan of the query SELECT * FROM t1, t2 WHERE t1.c1 = t2.c1 AND t1.c2 = 1000 is as follows:

    +---------------------------------------------------------------------------------------------+
    | Query Plan                                                                                  |
    +---------------------------------------------------------------------------------------------+
    | ====================================================================                        |
    | |ID|OPERATOR                      |NAME      |EST.ROWS|EST.TIME(us)|                        |
    | --------------------------------------------------------------------                        |
    | |0 |HASH JOIN                     |          |98      |199         |                        |
    | |1 |├─TABLE SCAN                  |T1        |10      |44          |                        |
    | |2 |└─TABLE SCAN                  |T2        |1000    |61          |                        |
    | ====================================================================                        |
    

    You can see that a hash join is used to join the two tables. However, only one row of data meets the t1.c2 = 1000 condition. Therefore, a nested loop join can be used to push the join condition t1.c1 = t2.c1 down to the base table t2, so that the t2 table can also use the idx_c1 index. In this case, the execution performance of the plan is improved. The following example shows an execution plan with better performance.

    +---------------------------------------------------------------------------------------------+
    | Query Plan                                                                                  |
    +---------------------------------------------------------------------------------------------+
    | ====================================================================                        |
    | |ID|OPERATOR                      |NAME      |EST.ROWS|EST.TIME(us)|                        |
    | --------------------------------------------------------------------                        |
    | |0 |NESTED-LOOP JOIN              |          |1       |21          |                        |
    | |1 |├─TABLE FULL SCAN             |T1        |1       |2           |                        |
    | |2 |└─DISTRIBUTED TABLE RANGE SCAN|T2(IDX_C1)|1       |18          |                        |
    | ====================================================================
    

    Dynamic sampling, therefore, can provide more accurate statistics and help the optimizer select a better execution plan.

    You can better understand the principle of dynamic sampling in OceanBase Database from the following example.

    CREATE TABLE t1(c1 INT, c2 INT, c3 INT, c4 INT);
    CREATE INDEX idx_c1 ON t1(c1);
    CREATE INDEX idx_c2 ON t1(c2);
    
    Q1: SELECT c4 FROM t1 WHERE c1 > 1 AND c2 > 1 AND c3 > 1 GROUP BY c4;
    Q2: SELECT /*+PARALLEL(2)*/c4 FROM t1 WHERE c1 > 1 AND c2 > 1 AND c3 > 1 GROUP BY c4;
    

    For the queries in the preceding example, the following three base table paths are generated:

    • Primary table path
    • Index table path idx_c1
    • index table path idx_c2

    Dynamic sampling constructs the following sampling SQL statements based on the three paths:

    • Estimate the number of rows in the base table: count(*).
    • Obtain the basic statistics on the c4 column: approx_count_distinct(c1) and sum(case when c1 is null then 1 else 0 end).
    • Estimate the number of rows that meet all predicates: sum(case when c1 > 1 and c2 > 1 and c3 > 1 then 1 else 0 end).
    • Estimate the number of rows in the query range of the index table path idx_c1: sum(case when c1 > 1 then 1 else 0 end).
    • Estimate the number of rows in the query range of the index table path idx_c2: sum(case when c2 > 1 then 1 else 0 end).

    Calculate the sampling rate based on the number of microblocks. If the sampling rate of the Q1 query is ratio, the sampling rate of the Q2 query is 2 × ratio. Dynamic sampling constructs the following two sampling SQL statements:

    DYNAMIC SAMPLING Q1:
    SELECT
    /*+ NO_REWRITE
        NO_PARALLEL
        DYNAMIC_SAMPLING(0)
        QUERY_TIMEOUT(1000000)
    */
    count(*),
    approx_count_distinct("C4"),
    Sum(CASE WHEN "C4" IS NULL THEN 1 ELSE 0 END),
    Sum(CASE WHEN ( "C1" > 1 ) AND ( "C2" > 1 ) AND ( "C3" > 1 ) THEN 1 ELSE 0 END),
    Sum(CASE WHEN ( "C1" > 1 ) THEN 1 ELSE 0 END),
    Sum(CASE WHEN ( "C2" > 1 ) THEN 1 ELSE 0 END)
    FROM "TEST"."T1" SAMPLE BLOCK(ratio) SEED(seed);
    
    DYNAMIC SAMPLING Q2:
    SELECT
    /*+ NO_REWRITE
        PARALLEL(64)
        DYNAMIC_SAMPLING(0)
        QUERY_TIMEOUT(1000000)
    */
    count(*),
    approx_count_distinct("C4"),
    Sum(CASE WHEN "C4" IS NULL THEN 1 ELSE 0 END),
    Sum(CASE WHEN ( "C1" > 1 ) AND ( "C2" > 1 ) AND ( "C3" > 1 ) THEN 1 ELSE 0 END),
    Sum(CASE WHEN ( "C1" > 1 ) THEN 1 ELSE 0 END),
    Sum(CASE WHEN ( "C2" > 1 ) THEN 1 ELSE 0 END)
    FROM "TEST"."T1" SAMPLE BLOCK(2*ratio);
    

    Keywords in the preceding example are described as follows:

    • NO_REWRITE: specifies not to perform rewriting, but to scan the base table without sampling.
    • NO_PARALLEL: specifies not to enable parallel sampling. Parallel sampling is not performed by default because the original SQL statement does not explicitly specify the DOP.
    • PARALLEL(x): specifies to use parallel sampling. If the original SQL statement specifies the DOP, parallel sampling is enabled for the sampling SQL statement.
    • DYNAMIC_SAMPLING(0): specifies whether to enable dynamic sampling. The value 0 indicates that dynamic sampling is disabled for SQL queries.
    • QUERY_TIMEOUT: the maximum execution time for the sampling SQL statement.

    Application scenarios of dynamic sampling

    By default, the dynamic sampling feature takes effect on the SQL statements of the current user. Currently, dynamic sampling is supported only for base tables. When the dynamic sampling feature is not disabled, the optimizer attempts to use dynamic sampling during the plan generation phase in the following scenarios:

    • Scenarios without statistics.
    • Scenarios where complex predicates exist in query conditions, such as c1 like '%test%'. In such scenarios, the selectivity formula cannot be used to estimate the number of rows.
    • Scenarios where the user specifies to use dynamic sampling.

    Methods for controlling dynamic sampling

    Currently, OceanBase Database allows you to control dynamic sampling by using a system variable, a query hint, or a system parameter. The size of the sampling set for dynamic sampling is restricted by the DOP.

    Control dynamic sampling by using a system variable

    You can use the system variable optimizer_dynamic_sampling to enable or disable the dynamic sampling feature. The syntaxes are as follows:

    /* Enable global dynamic sampling. */
    SET GLOBAL optimizer_dynamic_sampling = 1;
    
    /* Enable dynamic sampling at the session level. */
    SET SESSION optimizer_dynamic_sampling = 1;
    SET optimizer_dynamic_sampling = 1;
    
    /* Disable global dynamic sampling. */
    SET GLOBAL optimizer_dynamic_sampling = 0;
    
    /* Disable dynamic sampling at the session level. */
    SET SESSION optimizer_dynamic_sampling = 0;
    SET optimizer_dynamic_sampling = 0;
    

    Control dynamic sampling by using a hint

    You can use the DYNAMIC_SAMPLING hint to control whether to enable dynamic sampling in a query. The syntax is as follows:

    /*+DYNAMIC_SAMPLING( [[ qb_name_option ] table_name ] INTNUM1 [, INTNUM2 ] ) */
    

    The parameters in the DYNAMIC_SAMPLING hint are described as follows:

    • qb_name_option: the name of the query block. This parameter is optional.
    • table_name: the name of the table for dynamic sampling. This parameter is optional. If you do not specify this parameter, dynamic sampling is used for the entire query.
    • INTNUM1: the level of sampling. Valid values are 0 and 1. The value 1 specifies to enable dynamic sampling, and the value 0 specifies to disable dynamic sampling.
    • INTNUM2: the number of microblocks to be sampled. We recommend that you specify more than 32 microblocks.

    The following example enables dynamic sampling for table t1 in a query.

    SELECT /*+DYNAMIC_SAMPLING(t1 1)*/ c4 FROM t1 WHERE c1 > 1 AND c2 > 1 AND c3 > 1 GROUP BY c4;
    

    Control dynamic sampling by using a system parameter

    The maximum query time for dynamic sampling is 1/10 of the maximum query time. For example, if the timeout of a query is 10s, the maximum dynamic sampling time is 1s. To prevent dynamic sampling from taking too long, OceanBase Database allows you to specify the system parameter _optimizer_ads_time_limit to control the maximum time allowed for dynamic sampling. The default maximum time is 10s and the value range is [0s, 300s]. If the system parameter is set to 0, dynamic sampling is disabled.

    The following example sets the maximum dynamic sampling time to 100s for the current tenant.

    obclient [oceanbase]> ALTER SYSTEM SET _optimizer_ads_time_limit = 100;
    Query OK, 0 rows affected
    

    Considerations

    Dynamic sampling is designed to provide sufficient statistics for the optimizer. It samples database objects in advance to estimate the number of rows during plan generation. The estimation result is then applied in the cost model to generate a better execution plan.

    Dynamic sampling provides a better solution for the optimizer to obtain statistics when no statistics are available or accurate row count estimates cannot be made. When you use the dynamic sampling feature, note that:

    • Dynamic sampling uses block sampling by default. Therefore, for a better sampling effect, we recommend that you perform minor and major compactions after data is imported.
    • Dynamic sampling inevitably causes additional overhead in hard parsing. If the overhead is not tolerable for business in some transaction processing (TP) scenarios, you can choose to disable dynamic sampling.
    • Dynamic sampling can be used only as a supplement to statistics collection methods. Do not rely on this feature in business scenarios. Basic statistics collection still needs to be done.

    Previous topic

    Online statistics collection
    Last

    Next topic

    Overview
    Next
    What is on this page
    Principle
    Application scenarios of dynamic sampling
    Methods for controlling dynamic sampling
    Control dynamic sampling by using a system variable
    Control dynamic sampling by using a hint
    Control dynamic sampling by using a system parameter
    Considerations