OceanBase logo

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Resources

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS

OceanBase Cloud

OceanBase Database

Tools

Connectors and Middleware

QUICK START

OceanBase Cloud

OceanBase Database

BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Company

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

International - English
中国站 - 简体中文
日本 - 日本語
Sign In
Start on Cloud

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS
OceanBase CloudOceanBase Database
ToolsConnectors and Middleware
QUICK START
OceanBase CloudOceanBase Database
BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

Start on Cloud
编组
All Products
    • Databases
    • iconOceanBase Database
    • iconOceanBase Cloud
    • iconOceanBase Tugraph
    • iconInteractive Tutorials
    • iconOceanBase Best Practices
    • Tools
    • iconOceanBase Cloud Platform
    • iconOceanBase Migration Service
    • iconOceanBase Developer Center
    • iconOceanBase Migration Assessment
    • iconOceanBase Admin Tool
    • iconOceanBase Loader and Dumper
    • iconOceanBase Deployer
    • iconKubernetes operator for OceanBase
    • iconOceanBase Diagnostic Tool
    • iconOceanBase Binlog Service
    • Connectors and Middleware
    • iconOceanBase Database Proxy
    • iconEmbedded SQL in C for OceanBase
    • iconOceanBase Call Interface
    • iconOceanBase Connector/C
    • iconOceanBase Connector/J
    • iconOceanBase Connector/ODBC
    • iconOceanBase Connector/NET
icon

OceanBase Database

SQL - V4.2.0

    Download PDF

    OceanBase logo

    The Unified Distributed Database for the AI Era.

    Follow Us
    Products
    OceanBase CloudOceanBase EnterpriseOceanBase Community EditionOceanBase seekdb
    Resources
    DocsBlogLive DemosTraining & CertificationTicket
    Company
    About OceanBaseTrust CenterLegalPartnerContact Us
    Follow Us

    © OceanBase 2026. All rights reserved

    Cloud Service AgreementPrivacy PolicySecurity
    Contact Us
    Document Feedback
    1. Documentation Center
    2. OceanBase Database
    3. SQL
    4. V4.2.0
    iconOceanBase Database
    SQL - V 4.2.0
    Databases
    • OceanBase Database
    • OceanBase Cloud
    • OceanBase Tugraph
    • Interactive Tutorials
    • OceanBase Best Practices
    Tools
    • OceanBase Cloud Platform
    • OceanBase Migration Service
    • OceanBase Developer Center
    • OceanBase Migration Assessment
    • OceanBase Admin Tool
    • OceanBase Loader and Dumper
    • OceanBase Deployer
    • Kubernetes operator for OceanBase
    • OceanBase Diagnostic Tool
    • OceanBase Binlog Service
    Connectors and Middleware
    • OceanBase Database Proxy
    • Embedded SQL in C for OceanBase
    • OceanBase Call Interface
    • OceanBase Connector/C
    • OceanBase Connector/J
    • OceanBase Connector/ODBC
    • OceanBase Connector/NET
    SQL
    KV
    • V 4.6.0
    • V 4.4.2
    • V 4.3.5
    • V 4.3.3
    • V 4.3.1
    • V 4.3.0
    • V 4.2.5
    • V 4.2.2
    • V 4.2.1
    • V 4.2.0
    • V 4.1.0
    • V 4.0.0
    • V 3.1.4 and earlier

    Manual statistics collection

    Last Updated:2023-10-31 11:17:12  Updated
    Share
    What is on this page
    Collect statistics by using the DBMS_STATS package
    method_opt
    Examples
    GRANULARITY association between table-level and index statistics collection strategies
    Use the ANALYZE statement to collect statistics
    Syntax of the ANALYZE statement in Oracle mode
    Syntax of the ANALYZE statement in MySQL mode
    Practice scenarios for manual statistics collection
    Statistics collection on non-partitioned tables
    Statistics collection on partitioned tables
    Statistics collection at the schema (database) level

    folded

    Share

    The OceanBase Database optimizer collects statistics by using the DBMS_STATS package or the ANALYZE statement.

    Collect statistics by using the DBMS_STATS package

    The OceanBase Database optimizer can use the DBMS_STATS package to collect table-level statistics, schema-level statistics, and index statistics by calling stored procedures gather_table_stats, gather_schema_stats, and gather_index_stats respectively.

    PROCEDURE gather_table_stats (
      ownname            VARCHAR2,
      tabname            VARCHAR2,
      partname           VARCHAR2 DEFAULT NULL,
      estimate_percent   NUMBER DEFAULT AUTO_SAMPLE_SIZE,
      block_sample       BOOLEAN DEFAULT FALSE,
      method_opt         VARCHAR2 DEFAULT DEFAULT_METHOD_OPT,
      degree             NUMBER DEFAULT NULL,
      granularity        VARCHAR2 DEFAULT DEFAULT_GRANULARITY,
      cascade            BOOLEAN DEFAULT NULL,
      stattab            VARCHAR2 DEFAULT NULL,
      statid             VARCHAR2 DEFAULT NULL,
      statown            VARCHAR2 DEFAULT NULL,
      no_invalidate      BOOLEAN DEFAULT FALSE,
      stattype           VARCHAR2 DEFAULT 'DATA',
      force              BOOLEAN DEFAULT FALSE
    );
    
    PROCEDURE gather_schema_stats (
      ownname            VARCHAR2,
      estimate_percent   NUMBER DEFAULT AUTO_SAMPLE_SIZE,
      block_sample       BOOLEAN DEFAULT FALSE,
      method_opt         VARCHAR2 DEFAULT DEFAULT_METHOD_OPT,
      degree             NUMBER DEFAULT NULL,
      granularity        VARCHAR2 DEFAULT DEFAULT_GRANULARITY,
      cascade            BOOLEAN DEFAULT NULL,
      stattab            VARCHAR2 DEFAULT NULL,
      statid             VARCHAR2 DEFAULT NULL,
      statown            VARCHAR2 DEFAULT NULL,
      no_invalidate      BOOLEAN DEFAULT FALSE,
      stattype           VARCHAR2 DEFAULT 'DATA',
      force              BOOLEAN DEFAULT FALSE
    );
    
    PROCEDURE gather_index_stats (
      ownname            VARCHAR2,
      indname            VARCHAR2,
      partname           VARCHAR2 DEFAULT NULL,
      estimate_percent   NUMBER DEFAULT AUTO_SAMPLE_SIZE,
      stattab            VARCHAR2 DEFAULT NULL,
      statid             VARCHAR2 DEFAULT NULL,
      statown            VARCHAR2 DEFAULT NULL,
      degree             NUMBER DEFAULT NULL,
      granularity        VARCHAR2 DEFAULT DEFAULT_GRANULARITY,
      no_invalidate      BOOLEAN DEFAULT FALSE,
      force              BOOLEAN DEFAULT FALSE,
      tabname            VARCHAR2  DEFAULT NULL
    );
    

    The following table describes the parameters.

    Parameter
    Description
    ownname
    • Oracle mode: the username. If the parameter is set to NULL, the current logon username is used by default.
    • MySQL mode: the name of the database to which the table belongs.
    tabname The name of the table.
    indname The name of the index.
    partname The name of the partition. Default value: NULL.
    estimate_percent The percentage of data to be used to compute distribution features. The value range is [0.000001,100]. If you specify this parameter to NULL, all data is used. The default value is AUTO_SAMPLE_SIZE, which indicates that the optimizer determines the percentage of data used for computation.
    block_sample Specifies whether to use block sampling instead of row sampling. Default value: FALSE.
    method_opt The statistics collection method at the column level. For more information, see method_opt.
    degree The degree of parallelism in statistics collection. Default value: NULL.
    granularity The granularity in statistics collection. Valid values:
    • 'GLOBAL': collects global statistics.
    • 'PARTITION': collects partition-level statistics.
    • 'SUBPARTITION': collects subpartition-level statistics.
    • 'ALL': collects statistics at all levels, including GLOBAL, PARTITION, and SUBPARTITION.
    • 'AUTO': uses the default method to collect statistics at all levels, including GLOBAL, PARTITION, and SUBPARTITION. If granularity is not specified, the default value 'AUTO' is used.
    • 'DEFAULT': collects global and partition-level statistics.
    • 'GLOBAL AND PARTITION': collects global and partition-level statistics.
    • 'APPROX_GLOBAL AND PARTITION': collects partition-level statistics and deduces global statistics based on the partition-level statistics.
    cascade Specifies whether to collect index statistics on the table at the same time. Default value: TRUE.
    stattab This parameter is not available.
    statid This parameter is not available.
    statown This parameter is not available.
    no_invalidate This parameter is not available.
    stattype This parameter is not available.
    force Specifies whether to collect statistics by force and ignore the lock status. Default value: FALSE. This parameter can be executed only in the BEGIN ... END clause of a PL statement.

    method_opt

    method_opt can be set in the following syntax:

    method_opt:
      FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
    | FOR COLUMNS [size clause] column [size_clause] [,column [size_clause]...]
    
    size_clause:
      SIZE integer
    | SIZE REPEAT
    | SIZE AUTO
    | SIZE SKEWONLY
    
    column:
       column_name
    | (column_name [, column_name])
    

    The following table describes the parameters.

    Parameter
    Description
    SIZE integer The number of histogram buckets for the column. Value range: [1, 2048].
    REPEAT Histograms are collected only for columns whose histograms have been collected. Use the previous number of histogram buckets set for collection.
    AUTO The OceanBase Database optimizer determines whether to collect column histograms based on column usage. The default number of histogram buckets is 254.
    SKEWONLY Histograms are collected only for columns with uneven data distribution. The default number of histogram buckets is 254.

    In addition, the DBMS_STATS package also provides GATHER_DATABASE_STATS_JOB_PROC to collect statistics on all tables in the entire database. The default preference settings are used for all collection strategies. For more information about how to set preferences, see Statistics preference management.

    PROCEDURE GATHER_DATABASE_STATS_JOB_PROC();
    

    Examples

    The following examples use the DBMS_STATS package to collect statistics:

    • Collect global statistics on table tbl1 under user user.

      CALL dbms_stats.gather_table_stats('user', 'tbl1', granularity=>'GLOBAL', method_opt=>'FOR ALL COLUMNS SIZE AUTO');
      
    • Collect partition-level statistics on table t_part1 under user user, with a DOP of 64. Only histograms of columns with uneven data distribution are collected.

      CALL dbms_stats.gather_table_stats('user', 't_part1', degree=>'64', granularity=>'PARTITION', method_opt=>'FOR ALL COLUMNS SIZE SKEWONLY');
      
    • Collect all statistics on table t_subpart1 under user user, with a DOP of 128. Only 50% of data is collected, with histograms of columns determined by the optimizer.

      CALL dbms_stats.gather_table_stats('user', 't_subpart1', degree=>'128', estimate_percent=> '50', granularity=>'ALL', method_opt=>'FOR ALL COLUMNS SIZE AUTO');
      

    GRANULARITY association between table-level and index statistics collection strategies

    By default, all index statistics on a table are collected along with the table statistics. Some statistics, including the number of rows and the average row length (which is calculated based on the corresponding columns), on an index table can be obtained from the statistics on the base table. This accelerates statistics collection and also avoids repeated whole table data scanning. The following table lists the differences in GRANULARITY between local indexes and global indexes.

    Specified partition
    GRANULARITY
    Global index
    Local index
    No ALL GLOBAL ALL
    No AUTO GLOBAL AUTO
    No DEFAULT GLOBAL DEFAULT
    No GLOBAL AND PARTITION GLOBAL GLOBAL AND PARTITION
    No APPROX_GLOBAL AND PARTITION GLOBAL APPROX_GLOBAL AND PARTITION
    No GLOBAL GLOBAL GLOBAL
    No PARTITION GLOBAL PARTITION
    No SUBPARTITION Not collected SUBPARTITION
    Yes ALL GLOBAL ALL
    Yes AUTO GLOBAL AUTO
    Yes DEFAULT GLOBAL DEFAULT
    Yes GLOBAL AND PARTITION GLOBAL GLOBAL AND PARTITION
    Yes APPROX_GLOBAL AND PARTITION Not collected APPROX_GLOBAL AND PARTITION
    Yes GLOBAL GLOBAL GLOBAL
    Yes PARTITION Not collected PARTITION
    Yes SUBPARTITION Not collected SUBPARTITION

    Use the ANALYZE statement to collect statistics

    OceanBase Database allows you to use the ANALYZE statement to collect statistics in both Oracle and MySQL modes.

    Syntax of the ANALYZE statement in Oracle mode

    The syntax of the ANALYZE statement in Oracle mode is as follows:

    analyze_stmt:
    ANALYZE TABLE table_name [use_partition] analyze_statistics_clause
    
    use_partition:
      PARTITION (partition_name [,partition_name...])
    | SUBPARTITION(subpartition_name [,subpartition_name...])
    
    analyze_statistics_clause:
      COMPUTE STATISTICS [analyze_for_clause]
    | ESTIMATE STATISTICS [analyze_for_clause] [SAMPLE INTNUM {ROWS | PERCENTAGE}]
    
    analyze_for_clause:
      FOR TABLE
    | FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
    | FOR COLUMNS [size_clause] column [size_clause] [,column [size_clause]...]
    
    size_clause:
      SIZE integer
    | SIZE REPEAT
    | SIZE AUTO
    | SIZE SKEWONLY
    
    column:
      column_name
    | (column_name [,column_name])
    

    The column histogram collection method in the preceding syntax is the same as the syntax of method_opt in the DBMS_STATS package. For more information, see method_opt.

    The ANALYZE statement does not provide as many strategy settings as the DBMS_STATS package does. To implement statistics collection at the GLOBAL level only in an ANALYZE statement, you can set partition_name in the use_partition syntax to the table name.

    The following examples collect statistics using the ANALYZE statement in Oracle mode:

    • Collect statistics on table tbl1 under user user.

      ANALYZE TABLE tbl1 COMPUTE STATISTICS FOR ALL COLUMNS SIZE AUTO;
      
    • Collect global statistics on table t_part1 under user user. Only histograms of columns with uneven data distribution are collected.

      ANALYZE TABLE t_part1 PARTITION (t_part1) COMPUTE STATISTICS FOR ALL COLUMNS SIZE SKEWONLY;
      
    • Collect statistics on partitions p0sp0 and p1sp2 in table t_subpart1 under user user, with histograms of columns determined by the optimizer.

      ANALYZE TABLE t_subpart1 SUBPARTITION(p0sp0,p1sp2) COMPUTE STATISTICS FOR ALL COLUMNS SIZE AUTO;
      

    Syntax of the ANALYZE statement in MySQL mode

    The syntax of the ANALYZE statement in MySQL mode is as follows:

    analyze_stmt:
    ANALYZE TABLE table_name UPDATE HISTOGRAM ON column_name_list WITH INTNUM BUCKETS
    

    When the session variable enable_sql_extension is TRUE, you can use the syntax in Oracle mode, which is the extended syntax for OceanBase Database in MySQL mode.

    analyze_stmt:
    ANALYZE TABLE table_name [use_partition] analyze_statistics_clause
    

    The following examples collect statistics using the ANALYZE statement in MySQL mode:

    • Collect statistics on table tbl1, with the number of buckets being 30 for columns.

      ANALYZE TABLE tbl1 UPDATE HISTOGRAM ON a, b, c, d WITH 30 BUCKETS;
      
    • In MySQL mode, collect statistics on table tbl1 under user test by using the syntax in Oracle mode.

      ALTER SYSTEM SET ENABLE_SQL_EXTENSION = TRUE;
      ANALYZE TABLE tbl1 COMPUTE STATISTICS FOR ALL COLUMNS SIZE AUTO;
      

    Practice scenarios for manual statistics collection

    Statistics collection on non-partitioned tables

    When the product of the data volume and the number of columns in a non-partitioned table is within 10 million, we recommend that you run the following command to collect statistics on the table. In the following example, the test.t1 table has 10 columns and a data volume of 1 million rows.

    CREATE TABLE test.t1(c1 INT, c2 INT, c3 INT, c4 INT, c5 INT, c6 INT, c7 INT, c8 INT, c9 INT, c10 INT);
    CREATE TABLE t2 (c1 INT PRIMARY KEY);
    INSERT INTO t2 VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
    INSERT /*+APPEND*/ INTO t1 SELECT * FROM (SELECT s0.c1 c1, s1.c1 c2, s2.c1 c3, s3.c1 c4, s4.c1 c5, s5.c1 c6, s0.c1 * 10 c7, s1.c1 * 10 c8, s2.c1 * 10 c9, s3.c1 * 10 c10 FROM t2 s0, t2 s1, t2 s2, t2 s3, t2 s4, t2 s5);
    
    # Scenarios where histograms are not collected
    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 't1', method_opt=>'for all columns size 1');
    
    # Scenarios where histograms are collected and the default strategy is used
    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 't1');
    

    When the product of the data volume and the number of columns in a non-partitioned table exceeds 10 million, we recommend that you set an appropriate degree of parallelism (DOP) based on the business requirements and system resources to speed up statistics collection. In the following example, the data volume in the test.t1 table is increased to 10 million rows. The DOP is set to 8.

    # Scenarios where histograms are not collected
    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 't1', degree=>8, method_opt=>'for all columns size 1');
    
    # Scenarios where histograms are collected and the default strategy is used
    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 't1', degree=>8);
    

    Statistics collection on partitioned tables

    Different from statistics collection strategies for non-partitioned tables, statistics collection strategies for partitioned tables must cover the collection of partition statistics. When system resources are sufficient, we recommend that you double the DOP for statistics collection on partitioned tables.

    In the following example, the test.t_part table has 10 columns and a data volume of 1 million rows. Because partition statistics need to be collected additionally, the DOP is increased to 2.

    CREATE TABLE t_part(c1 INT, c2 INT, c3 INT, c4 INT, c5 INT, c6 INT, c7 INT, c8 INT, c9 INT, c10 INT) PARTITION BY HASH(c1) PARTITIONS 128;
    CREATE TABLE t2 (c1 INT PRIMARY KEY);
    INSERT INTO t2 VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
    INSERT /*+APPEND*/ INTO t1 SELECT * FROM (SELECT s0.c1 c1, s1.c1 c2, s2.c1 c3, s3.c1 c4, s4.c1 c5, s5.c1 c6, s0.c1 * 10 c7, s1.c1 * 10 c8, s2.c1 * 10 c9, s3.c1 * 10 c10 FROM t2 s0, t2 s1, t2 s2, t2 s3, t2 s4, t2 s5);
    

    For partitioned tables, you also deduce global statistics from partition-level statistics to speed up statistics collection. For example, in the same scenario, you can adjust the granularity in statistics collection rather than increasing the DOP. Here are some examples:

    # Scenarios where histograms are not collected
    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 't_part', granularity=>'APPROX_GLOBAL AND PARTITION', method_opt=>'for all columns size 1');
    
    # Scenarios where histograms are collected and the default strategy is used
    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 't_part', granularity=>'APPROX_GLOBAL AND PARTITION');
    

    Statistics collection at the schema (database) level

    In addition to manual statistics collection on a single table, OceanBase Database provides the DBMS_STATS system package for you to collect statistics on all tables under a user.

    Note

    This feature is supported only in OceanBase Database V4.x in Oracle and MySQL modes and OceanBase Database V3.2.x in Oracle mode.

    Statistics collection on all tables under a user takes time. Therefore, we recommend that you use this feature during off-peak hours.

    If the tables under the user are all small tables with no more than 1 million rows, you can use the command in the following example to collect statistics.

    # Scenarios where histograms are not collected
    CALL DBMS_STATS.GATHER_SCHEMA_STATS('test', method_opt=>'for all columns size 1');
    
    # Scenarios where histograms are collected and the default strategy is used
    CALL DBMS_STATS.GATHER_SCHEMA_STATS('test');
    

    If the user has large tables with tens of millions of rows, you can increase the DOP and collect statistics during off-peak hours.

    # Scenarios where histograms are not collected
    CALL DBMS_STATS.GATHER_SCHEMA_STATS('test', degree=>'16', method_opt=>'for all columns size 1');
    
    
    # Scenarios where histograms are collected and the default strategy is used
    CALL DBMS_STATS.GATHER_SCHEMA_STATS('test', degree=>'16');
    

    If the user has massive tables with more than 100 million rows, you can collect statistics on the massive tables separately at a high DOP. Then, you can lock the statistics on the massive tables and use the preceding command to collect statistics on all tables of the user. After the statistics are collected, you can unlock the statistics on the massive tables and then collect subsequent statistics in incremental mode.

    CALL DBMS_STATS.GATHER_TABLE_STATS('test', 'big_table', degree=>128, method_opt=>'for all columns size 1');
    
    CALL DBMS_STATS.LOCK_TABLE_STATS('test','big_table');
    
    CALL DBMS_STATS.GATHER_SCHEMA_STATS('TEST', degree=>'16', method_opt=>'for all columns size 1');
    
    CALL DBMS_STATS.UNLOCK_TABLE_STATS('test','big_table');
    

    Previous topic

    Histograms
    Last

    Next topic

    Automatic statistics collection
    Next
    What is on this page
    Collect statistics by using the DBMS_STATS package
    method_opt
    Examples
    GRANULARITY association between table-level and index statistics collection strategies
    Use the ANALYZE statement to collect statistics
    Syntax of the ANALYZE statement in Oracle mode
    Syntax of the ANALYZE statement in MySQL mode
    Practice scenarios for manual statistics collection
    Statistics collection on non-partitioned tables
    Statistics collection on partitioned tables
    Statistics collection at the schema (database) level