OceanBase Diagnostic Tool

V4.3.0

Detailed explanation of cluster inspection indicators

Last Updated: 2026-06-30 15:09:40

This article explains the sources of some obdiag inspection indicators.

Note

Issue numbers of the form #xx in this topic refer to issues in the obdiag project on GitHub.

OBProxy check items

Check item name
Check item description
version.bad_version Check if the OBProxy version is a deprecated version. Some versions of OBProxy are buggy and their use is not recommended.
version.old_version Check if the OBProxy version is an old version. Some older versions of OBProxy are no longer supported and are not recommended for use. The source can be found on GitHub issue #1103.
parameter.request_buffer_length Check whether the OBProxy parameter request_buffer_length is the default value.
parameter.work_thread_num Check the value of the OBProxy parameter work_thread_num to prevent thread exhaustion problems. The source can be found on GitHub issue #1019.
parameter.enable_ob_protocol_v2_with_client Check the OBProxy parameter enable_ob_protocol_v2_with_client, and alert if it is enabled.

OceanBase database check items

Log stream check items

Check item name
Check item description
clog.clog_hang Check for disk failure issues that may cause the log stream to hang. The source can be found on GitHub issue #963.
clog.clog_disk_full Check whether there is a log stream disk full problem.
ls.paxo_members Check whether the log stream's member list is consistent with its Paxos member list (paxo-members). If they are inconsistent, a server deletion operation cannot complete successfully.

Version check items

Check item name
Check item description
version.bad_version Check whether the OceanBase database is a deprecated version. Some versions of the OceanBase database have bugs and are not recommended for use.
version.old_version Check whether the OceanBase database is an old version. Some older versions of OceanBase databases are no longer supported and are not recommended for use.

Index check items

Check item name
Check item description
index.global_index_unpartitioned Check for unpartitioned global indexes that may cause hotspot issues during batch operations. The source can be found on GitHub issue #957.

Tenant check items

Check item name
Check item description
tenant.ddl_operation_table_size Check the size of the tenant internal table __all_ddl_operation. When the table exceeds 10 million records, a warning prompts the user to take notice. The source can be found on GitHub issue #1061.
tenant.tenant_threshold Check tenant thread utilization and alert when it exceeds the 95% threshold. The source can be found on GitHub issue #963.
tenant.tenant_locality_consistency_check Check the tenant's locality consistency and the number of log stream members to ensure tenant availability. The source can be found on GitHub issue #1048.
tenant.max_stale_time_for_weak_consistency Check whether the max_stale_time_for_weak_consistency parameter is the default value. The source can be found on GitHub issue #850.
tenant.tenant_min_resource Check the tenant resource pool configuration and report if it provides less than 2 CPU cores or 4 GB of memory (2C4G).
tenant.parameters_default Check whether all parameters are set to their default values. The source can be found on GitHub issue #850.
tenant.macroblock_utilization_rate_tenant Check whether the ratio of actual data volume to actual disk usage for all tenants in the OceanBase cluster is within a reasonable range. The OceanBase database stores data in macroblocks and, for efficiency, a macroblock may not be fully utilized. If the ratio of actual data volume to actual disk usage is too low, perform a full merge to improve disk utilization. The source can be found on GitHub issue #847.
tenant.writing_throttling_trigger_percentage Check whether writing_throttling_trigger_percentage is set to 100. A value of 100 turns off write throttling, which can cause MemStore memory to be exhausted. The source can be found on GitHub issue #758.
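
The macroblock utilization check compares actual data volume with actual disk usage. A minimal Python sketch of that ratio follows. The function names and the 0.5 threshold are illustrative assumptions, not obdiag's implementation or its actual limit.

```python
def utilization_ratio(data_bytes: int, disk_bytes: int) -> float:
    """Return actual data volume divided by actual disk usage."""
    if disk_bytes <= 0:
        raise ValueError("disk_bytes must be positive")
    return data_bytes / disk_bytes


def needs_full_merge(data_bytes: int, disk_bytes: int, min_ratio: float = 0.5) -> bool:
    """Flag a tenant whose ratio is below min_ratio (a hypothetical threshold)."""
    return utilization_ratio(data_bytes, disk_bytes) < min_ratio


if __name__ == "__main__":
    gib = 1024 ** 3
    # 30 GiB of data occupying 100 GiB on disk falls below the assumed threshold.
    print(needs_full_merge(30 * gib, 100 * gib))
```

A low ratio only suggests that a full merge could reclaim space; the range that counts as reasonable depends on the check's own configuration.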

Sysbench check items

Check item name
Check item description
sysbench.sysbench_free_test_memory_limit Check cluster memory limit information when sysbench is idle.
sysbench.sysbench_free_test_network_speed Check cluster network speed information when sysbench is idle.
sysbench.sysbench_test_cluster_parameters Check cluster parameters when running sysbench.
sysbench.sysbench_test_cluster_datafile_size Check cluster data file size and log disk size information when sysbench is idle.
sysbench.sysbench_test_cluster_log_disk_size Check the cluster log disk size parameter.
sysbench.sysbench_test_log_level Check the cluster system log level information when running sysbench.
sysbench.sysbench_test_tenant_primary_zone Check the tenant primary zone information of the cluster when running sysbench.
sysbench.sysbench_run_test_tenant_memory_used Check cluster memory usage information when sysbench is idle.
sysbench.sysbench_test_cpu_quota_concurrency Check cluster CPU quota concurrency information when running sysbench.
sysbench.sysbench_free_test_cpu_count Check cluster CPU count information when sysbench is idle.
sysbench.sysbench_test_tenant_log_disk_size Check the tenant log disk size parameter.
sysbench.sysbench_run_test_tenant_cpu_used Check cluster CPU information while sysbench is running.
sysbench.sysbench_test_sql_net_thread_count Check cluster SQL network thread count information when running sysbench.
sysbench.sysbench_test_tenant_cpu_parameters Check tenant CPU parameters.

Log check items

Check item name
Check item description
log.log_size_with_ocp Check whether the free space of the log directory exceeds the size of 100 files.
log.log_size Check the cluster max_syslog_file_count parameter value, and alarm when it is not set to 0 or the setting exceeds 100. The source can be found on GitHub issue #963.

Archive check items

Check item name
Check item description
archive.archive_continuous_error Check the OceanBase database log for the message "pay ATTENTION!! archive continuous encounter error more than 15". This message means that archiving encountered an error more than 15 times in a row. The source can be found on GitHub issue #991.
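
As an illustration of this kind of log check, the sketch below scans log lines for the message text quoted in the description. The function name and the sample log lines are hypothetical; how obdiag actually locates and reads the log files is not described here.

```python
from typing import Iterable, List

# Message text quoted from the check description.
ARCHIVE_ERROR_MARKER = "archive continuous encounter error more than 15"


def find_archive_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain the archive error marker."""
    return [line.rstrip("\n") for line in lines if ARCHIVE_ERROR_MARKER in line]


if __name__ == "__main__":
    # Hypothetical log content, for demonstration only.
    sample = [
        "INFO  archive task started\n",
        "ERROR pay ATTENTION!! archive continuous encounter error more than 15\n",
    ]
    for hit in find_archive_errors(sample):
        print(hit)
```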

Error code check items

Check item name
Check item description
err_code.find_err_4016 Check whether error 4016 is reported when enable_sql_audit is set to True.
err_code.find_err_4012 Check whether error 4012 is reported when enable_sql_audit is set to True.
err_code.find_err_4001 Check whether error 4001 is reported when enable_sql_audit is set to True.
err_code.find_err_4377 Check whether error 4377 is reported when enable_sql_audit is set to True.
err_code.find_err_4108 Check whether error 4108 is reported when enable_sql_audit is set to True.
err_code.find_err_4013 Check whether error 4013 is reported when enable_sql_audit is set to True.
err_code.find_err_4015 Check whether error 4015 is reported when enable_sql_audit is set to True.
err_code.find_err_4000 Check whether error 4000 is reported when enable_sql_audit is set to True.
err_code.find_err_4105 Check whether error 4105 is reported when enable_sql_audit is set to True.
err_code.find_err_4103 Check whether error 4103 is reported when enable_sql_audit is set to True.

Network check items

Check item name
Check item description
network.network_offset Check cluster network clock offset information.
network.network_drop Check cluster network packet loss information.
network.network_speed_diff Check whether all OBServer nodes have the same network card speed. The network card name is looked up dynamically. The source can be found on GitHub issue #763.
network.network_write_cond_wakeup Check the OceanBase cluster log for network write condition wakeup issues.
network.local_ip_check Verify that local_ip in observer.config.bin matches the actual network card IP on the configured network interface. The source can be found on GitHub issue #878.
network.log_easy_slow Check for network latency issues by searching for EASY SLOW in the OceanBase cluster logs.
network.TCP_retransmission Check for TCP retransmissions. The source can be found on GitHub issue #348.
network.network_speed Check cluster network speed information.

System check items

Check item name
Check item description
system.parameter_tcp_wmem Check the tcp_wmem kernel parameter.
system.core_pattern Check the kernel core_pattern.
system.cgroup_version Check the cgroup version. OceanBase database currently uses cgroup v1. If the customer's operating system is cgroup v2, resource isolation will not take effect. The source can be found on GitHub issue #1101.
system.getenforce Check SELinux via getenforce.
system.check_command Confirm whether dependent components exist.
system.dependent_software Check for dependent software. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.instruction_set_avx Check whether the CPU supports the AVX instruction set to be compatible with the OceanBase database. The source can be found on GitHub issue #1024.
system.dependent_software_swapon Check for dependent software. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.kernel_bad_version Check whether the operating system kernel version is 3.10. Deploying the OceanBase database with cgroup on kernel version 3.10 carries a risk of system downtime. The source can be found on GitHub issue #910.
system.clock_source Check whether the clock source type is tsc.
system.aio Check aio. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.parameter_ip_local_port_range Check the ip_local_port_range kernel parameter. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.ulimit_parameter Check the ulimit parameters. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.tcp_tw_reuse Check whether sockets in the TIME-WAIT state can be reused for new TCP connections. The parameter should be set to 1 to ensure system performance. The source can be found on GitHub issue #737.
system.mount_options When NFS is mounted for backup, check that the mount options include nfsvers=4.1, sync, lookupcache=positive, and hard. The sources can be found on GitHub issue #611 and issue #852.
system.parameter_tcp_rmem Check the tcp_rmem kernel parameter. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.arm_smmu If the node uses the ARM architecture, check whether SMMU needs to be turned off. The source can be found on GitHub issue #784.
system.python_version Check whether the Python version installed on the host is 2.7.x, so that the relevant OceanBase database scripts can run normally. The source can be found on GitHub issue #869.
system.clock_source_check Check whether the server IP in the clock source configuration file is consistent across OBServer nodes. The sources can be found on GitHub issue #781 and issue #873.
system.check_system_language Check whether $LANG is en_US.UTF-8.
system.parameter Check kernel parameters. For details, see Host Standardization Check Items in the OceanBase Cloud Platform documentation on the official website.
system.dmesg_log Check whether Hardware Error appears in dmesg. The source can be found on GitHub issue #885.
system.docker0_interface_check Check whether docker0 or docker0-like network interface exists in the deployment environment. When deploying OBProxy through OCP, if the docker0 interface exists, the displayed IP may correspond to the docker0 address rather than the actual physical host address. It is recommended to remove the docker0 interface after confirming that it is not in use. The source can be found on GitHub issue #1198.
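
The system.mount_options item lists four options that an NFS backup mount should carry. A minimal sketch of such a comparison is shown below. The function name is hypothetical, and the sketch assumes the options arrive as one comma-separated string; real systems may report equivalent spellings (for example vers=4.1), which this sketch does not normalize.

```python
from typing import List

# Options named in the system.mount_options description.
REQUIRED_NFS_OPTIONS = ("nfsvers=4.1", "sync", "lookupcache=positive", "hard")


def missing_nfs_options(mount_options: str) -> List[str]:
    """Return the required options that are absent from a comma-separated option string."""
    present = {opt.strip() for opt in mount_options.split(",") if opt.strip()}
    return [opt for opt in REQUIRED_NFS_OPTIONS if opt not in present]


if __name__ == "__main__":
    # A soft mount without sync or lookupcache=positive would be reported.
    print(missing_nfs_options("rw,nfsvers=4.1,soft"))
```

An empty result means all four options are present; any returned entry is an option to add to the mount configuration.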

Table check items

Check item name
Check item description
table.macroblock_utilization_rate_table Check whether the ratio of actual data volume to actual disk usage for all tables in the OceanBase cluster is within a reasonable range. The OceanBase database stores data in macroblocks and, for efficiency, a macroblock may not be fully utilized. If the ratio of actual data volume to actual disk usage is too low, perform a full merge to improve disk utilization. This task includes query timeout protection to prevent hangs. The sources can be found on GitHub issue #848 and issue #1067.
table.information_schema_tables_two_data Check whether there is a table with two records in information_schema.tables. The source can be found on GitHub issue #390.
table.auto_split_error_table Check tables with auto-split enabled for shards that should trigger auto-split but do not. The check identifies tables in which some shards have reached the auto_part_size threshold without triggering auto-split, which may indicate a problem with the auto-split mechanism.

CPU check items

Check item name
Check item description
cpu.oversold Check whether CPU is oversold on any OBServer node.

Defect check items

Check item name
Check item description
bugs.bug_469 Check the glibc version of the OBServer node (obtained through ldd). The glibc version must be earlier than 2.34; otherwise, the OBServer node may crash. The source can be found on GitHub issue #469.
bugs.bug_182 Check for an OceanBase database bug: after the OceanBase database is upgraded to version 4.2.1, executing DDL on some partitioned tables returns error code -4109 with the message "Server state or role not the same as expected". The source can be found on GitHub issue #182.
bugs.cgroup_kernel_bad_version Check whether the operating system kernel version is 3.10. Deploying the OceanBase database with cgroup on kernel version 3.10 may cause system downtime. The source can be found on GitHub issue #910.
bugs.bug_385 Check for an OceanBase database bug: when the OceanBase database version is in the range [4.2.1.0, 4.2.1.3], a tenant may have multiple root users. If this bug occurs, consider upgrading the OceanBase database or deleting the redundant users. The source can be found on GitHub issue #385.
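
Several defect checks reduce to comparing dotted version strings against a bound or a closed range. The sketch below shows one way to do that with tuple comparison. The helper names are hypothetical, and the sketch assumes purely numeric version strings with no suffixes.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a dotted numeric version such as '4.2.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def glibc_is_supported(version: str) -> bool:
    """bugs.bug_469: the glibc version must be earlier than 2.34."""
    return parse_version(version) < (2, 34)


def in_bug_385_range(version: str) -> bool:
    """bugs.bug_385: affected versions lie in the closed range [4.2.1.0, 4.2.1.3]."""
    return (4, 2, 1, 0) <= parse_version(version) <= (4, 2, 1, 3)


if __name__ == "__main__":
    print(glibc_is_supported("2.17"), glibc_is_supported("2.34"))
    print(in_bug_385_range("4.2.1.2"), in_bug_385_range("4.2.1.4"))
```

Comparing tuples of integers avoids the string-comparison trap in which "2.9" sorts after "2.34". The range helper expects a full four-part version; a shorter string such as "4.2.1" compares as lower than (4, 2, 1, 0) and is treated as outside the range.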

Disk check items

Check item name
Check item description
disk.clog_abnormal_file Check the clog folder for files that do not belong to the OceanBase database.
disk.data_disk_full Check the data disk usage and alert when the usage exceeds the 85% threshold. The source can be found on GitHub issue #963.
disk.disk_hole Check whether there is a disk hole problem.
disk.mount_disk_full Check the disk usage of each mount point on the host. The source can be found on GitHub issue #611.
disk.xfs_repair Check the xfs_repair log in dmesg. The source can be found on GitHub issue #451.
disk.sstable_abnormal_file Check the data folder for files that do not belong to the OceanBase database.
disk.disk_full Check whether the disk usage reaches the threshold.
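
The disk.data_disk_full item alerts when usage exceeds 85%. A minimal sketch of that threshold test follows, using Python's standard shutil.disk_usage. The function names are hypothetical, and the path passed in is whatever directory the caller considers the data disk; this is not how obdiag itself gathers the figures.

```python
import shutil

# Threshold stated in the disk.data_disk_full description.
DATA_DISK_THRESHOLD_PERCENT = 85.0


def usage_percent(used: int, total: int) -> float:
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used * 100.0 / total


def data_disk_alert(path: str, threshold: float = DATA_DISK_THRESHOLD_PERCENT) -> bool:
    """Return True when the filesystem containing path is above the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage_percent(usage.used, usage.total) > threshold


if __name__ == "__main__":
    print(data_disk_alert("."))
```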
disk.disk_iops Check disk IOPS.

Column storage check items

Check item name
Check item description
column_storage.tenant_parameters Check the tenant parameters for the column storage proof-of-concept (POC) scenario.

Cluster check items

Check item name
Check item description
cluster.mod_too_large Check if any module is using more than 10GB of memory.
cluster.core_file_find Check if the core file exists.
cluster.optimizer_better_inlist_costing_parmmeter Check if the tag parameter is enabled for a specific version.
cluster.memory_limit_vs_phy_mem Check whether memory_limit is larger than the physical memory size. A memory_limit larger than the physical memory causes serious problems. The source can be found on GitHub issue #1066.
cluster.memory_chunk_cache_size Check the memory block capacity cached by the memory allocator. It is recommended to set it to 0. The source can be found on GitHub issue #843.
cluster.memstore_limit_percentage Check the percentage of the tenant's total available memory that MemStore can use. It is recommended to keep the default value of 50. The source can be found on GitHub issue #871.
cluster.trace_log_slow_query_watermark Check the query execution time threshold. The recommended value is between 1s and 2s. A query whose execution time exceeds this threshold is considered a slow query, and its trace log is printed to the system log. The source can be found on GitHub issue #842.
cluster.memstore_usage Check the MemStore usage and alert when the utilization exceeds 50%. The source can be found on GitHub issue #963.
cluster.cpu_quota_concurrency Check the maximum concurrency allowed per CPU quota of the tenant. The recommended value is in the range [2,4]. The source can be found on GitHub issue #738.
cluster.observer_not_active Check if any OBServer node is not in ACTIVE state.
cluster.clog_sync_time_warn_threshold Check the clog synchronization time warning threshold. It is recommended to set it to 100ms. If the synchronization time exceeds the threshold, a WARN log is generated. The source can be found on GitHub issue #793.
cluster.autoinc_cache_refresh_interval Check the refresh interval of the auto-increment column cache. It is recommended to set it to more than 1 hour. Frequent refreshes can affect system performance. The source can be found on GitHub issue #817.
cluster.syslog_io_bandwidth_limit Check the disk IO bandwidth limit that the system log can occupy. It is recommended not to exceed 30M. System logs that exceed the bandwidth limit will be discarded. The source can be found on GitHub issue #841.
cluster.ls_number Check whether any log stream is in the not_enough_replica state.
cluster.tenant_number Check the number of tenants.
cluster.part_trans_action_max Check if there are more than 200 transaction participants.
cluster.major_suspended Check if there is a manually suspended major compaction in the OceanBase cluster. The source can be found on GitHub issue #1015.
cluster.resource_limit_max_session_num Check whether the hidden parameter _resource_limit_max_session_num has been changed. Parameter modification may cause Too many connections errors.
cluster.task_opt_stat_gather_fail Check whether the statistics collection task history contains failed executions.
cluster.logons_check Check whether the cumulative user login value is close to the 2147483647 threshold. This check applies only to OceanBase database versions earlier than V4.2.1.4. The source can be found on GitHub issue #972.
cluster.zone_not_active Check if any availability zone is not in ACTIVE state.
cluster.deadlocks Check for deadlocks.
cluster.ob_query_timeout Check the ob_query_timeout global variable for thread hang issues. The source can be found on GitHub issue #978.
cluster.datafile_next Check the node parameters datafile_maxsize and datafile_next. When datafile_maxsize is set and is greater than datafile_size, check whether datafile_next is 0. If datafile_next is 0, the data file will not grow. The source can be found on GitHub issue #573.
cluster.ob_enable_prepared_statement Check whether prepared statements are enabled. It is recommended to enable them, especially if the front end is a Java application. The source can be found on GitHub issue #844.
cluster.data_path_settings Check if data_dir and log_dir_disk are on the same disk.
cluster.sys_obcon_health Check whether the cluster is reachable by connecting to the sys tenant. The source can be found on GitHub issue #872.
cluster.tenant_memory_tablet_count Check whether the tenant memory specifications and the number of tablets per OBServer node exceed the 90% health check threshold. The source can be found on GitHub issue #1104.
cluster.tenant_500_memory_analysis Analyze the memory usage of tenant 500 (internal system tenant) and identify memory anomalies. Check the total memory, top consuming modules, known problem modules and memory ratio. The source can be found on GitHub issue #99.
cluster.major Check if there are any suspended major compaction processes.
cluster.tenant_locks Check the number of tenant lock waits and alert when it exceeds the threshold of 5000. The source can be found on GitHub issue #963.
cluster.server_permanent_offline_time Check the server_permanent_offline_time parameter and alert when it is not set to 3600s. The source can be found on GitHub issue #816.
cluster.no_leader Check the cluster tenant log stream leader.
cluster.cgroup Check whether tenant isolation is enabled when the OceanBase database version is 4.x or later. It should be enabled by default to ensure performance. The source can be found on GitHub issue #849.
cluster.freeze_trigger_percentage Check the freeze_trigger_percentage parameter. It is recommended to keep the default value of 20. The source can be found on GitHub issue #795.
cluster.observer_port Check whether the necessary ports between OceanBase cluster nodes are connected. The source can be found on GitHub issue #845.
cluster.memory_limit_percentage Check the total available memory size in the system as a percentage of the total memory size. It is recommended to keep the default value of 80. The source can be found on GitHub issue #750.
cluster.upper_trans_version Check OceanBase database version. When the OceanBase database version is V4.0.0.0 or above, if executing the relevant SQL query in the sys tenant returns a non-empty result (that is, upper_trans_version cannot be calculated for a long time), the user is prompted to upgrade to OceanBase database V4.2.5.3 or above to fix this problem. The source can be found on GitHub issue #838.
cluster.task_opt_stat Check the history of statistics collection tasks.
cluster.ob_enable_plan_cache_bad_version Check the ob_enable_plan_cache variable. When the OceanBase database version is V4.1.0 or V4.1.0 BP1, it is recommended to turn off ob_enable_plan_cache.
cluster.session_limit Check the number of tenant sessions and alert when it exceeds the 5000 threshold. The source can be found on GitHub issue #963.
cluster.table_history_too_many Check the table history of tenants in the cluster. If a tenant has too many table histories, schema refresh keeps reporting error -4013 after a machine restarts, and that machine cannot refresh the schema of the corresponding tenant.
cluster.auto_increment_cache_size Check the globally available cache of auto-increment columns for all tenants in the cluster. The source can be found on GitHub issue #870.
cluster.sys_log_level Check the sys_log_level parameter.
cluster.global_indexes_too_much Check if there are tables with more than 20 global indexes.
cluster.large_query_threshold Check the query execution time threshold. It is recommended to set it to 5s. A request that exceeds this threshold may be suspended; after suspension, it is automatically treated as a large query and handled by the large query scheduling policy. The source can be found on GitHub issue #859.
cluster.enable_lock_priority Check whether the enable_lock_priority parameter is enabled. Enabling enable_lock_priority affects the performance of DDL/DML in daily use. It is not recommended to turn it on unless lock-free structural changes are required. The source can be found on GitHub issue #890.
cluster.upgrade_finished Check whether the OceanBase cluster upgrade is completed and verify version consistency. The source can be found on GitHub issue #759.
cluster.default_compress_func Check the default compression algorithm for table data. It is recommended to use the default value that matches ob_version to improve the compression ratio and reduce storage costs. For scenarios with stricter query response time (RT) requirements, consider using lz4_1.0 or turning off compression. The source can be found on GitHub issue #792.
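
Several cluster check items compare a parameter against a recommended value or range. The sketch below shows one way to express such comparisons; it is illustrative only and does not reflect how the tool is implemented. The recommended values come from the descriptions above, the parameter names are inferred from the check item names, and the current values are passed in as plain numbers (units stripped, watermark in seconds) instead of being read from a live cluster.

```python
from typing import Callable, Dict, List

# Recommended values taken from the check item descriptions above.
# Each entry maps a parameter name to a predicate that returns True
# when the value matches the recommendation.
RECOMMENDED: Dict[str, Callable[[float], bool]] = {
    "cpu_quota_concurrency": lambda v: 2 <= v <= 4,
    "memstore_limit_percentage": lambda v: v == 50,
    "freeze_trigger_percentage": lambda v: v == 20,
    "memory_limit_percentage": lambda v: v == 80,
    "memory_chunk_cache_size": lambda v: v == 0,
    # Value in seconds; the real parameter carries a unit suffix.
    "trace_log_slow_query_watermark": lambda v: 1 <= v <= 2,
}


def find_deviations(current: Dict[str, float]) -> List[str]:
    """Return the sorted names of parameters that deviate from the recommendation.

    Parameters without a known recommendation are ignored.
    """
    return sorted(
        name
        for name, value in current.items()
        if name in RECOMMENDED and not RECOMMENDED[name](value)
    )


if __name__ == "__main__":
    # Hypothetical values, hard-coded for illustration.
    sample = {
        "cpu_quota_concurrency": 8,
        "memstore_limit_percentage": 50,
        "trace_log_slow_query_watermark": 0.1,
    }
    for name in find_deviations(sample):
        print(f"{name} deviates from the recommended value")
```

Keeping each recommendation as a small predicate lets exact-value rules and range rules share one loop.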
