Troubleshoot I/O bottlenecks

Last Updated: 2026-07-14 08:02:21

I/O problems are common during database operation. This topic summarizes a practical I/O troubleshooting process that helps you locate the root cause quickly, resolve the problem efficiently, and minimize the impact on business operations.

I/O issues fall into the following two scenarios:

- SQL-level issues: Disk reads account for most of the overhead of a SQL statement. You need to identify the database object or SQL operator responsible for the disk reads.
- Cluster-level issues: OCP monitoring shows high disk read/write pressure. You need to identify the source of that pressure.

Process introduction

Confirm whether there are I/O-related issues.

You can use the following methods to confirm:
- Log in to the OCP console and check the Physical I/O Count and Physical I/O Throughput performance metrics. View the I/O bandwidth at the node (OBServer) level and at the tenant level. High I/O bandwidth on a node or tenant suggests that there may be an I/O-related issue.
- Check the Top Foreground DB Time and Top Background DB Time sections in the ASH report. If I/O-related wait events such as db file data read, db file compact read, db file compact write, and row store disk write (Wait Class values USER_IO and SYSTEM_IO) account for a high proportion, there may be an I/O-related issue.
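The second check can be illustrated with a minimal Python sketch that computes the share of I/O-related samples. It assumes the ASH samples have been transcribed by hand into a list of (event, wait class) tuples. The function name `io_wait_share`, the sample data, and the event-to-class pairings are hypothetical and are not part of any OceanBase or OCP API.

```python
from collections import Counter

# Wait classes that the ASH report uses for I/O-related wait events.
IO_WAIT_CLASSES = {"USER_IO", "SYSTEM_IO"}


def io_wait_share(samples):
    """Return the fraction of samples whose wait class is I/O-related.

    `samples` is a hypothetical list of (event_name, wait_class) tuples,
    one per ASH sample, transcribed from the report.
    """
    if not samples:
        return 0.0
    by_class = Counter(wait_class for _, wait_class in samples)
    io_samples = sum(by_class[c] for c in IO_WAIT_CLASSES)
    return io_samples / len(samples)


# Illustrative data only: the pairings and counts are assumptions,
# not real measurements.
samples = [
    ("db file data read", "USER_IO"),
    ("db file data read", "USER_IO"),
    ("db file compact write", "SYSTEM_IO"),
    ("ON CPU", "OTHER"),
]
print(f"I/O wait share: {io_wait_share(samples):.0%}")  # prints "I/O wait share: 75%"
```

What counts as a "high" share is a judgment call for your workload; this topic does not define a numeric threshold.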
Troubleshoot cluster-level issues.
1. Obtain the ASH report for the corresponding node and tenant during the issue period.
2. View the Top Foreground DB Time and Top Background DB Time sections to identify the source of I/O.
  
  If I/O-related wait events account for a high proportion of foreground activity, the I/O load comes mainly from foreground sessions. In this case, go to "Troubleshoot SQL-level issues" below.
  
  Otherwise, continue with the next step to narrow down the source of I/O.
3. Further identify the source of I/O.
  1. View the Top IO Bandwidth section. A larger IO Size(MB) value means that the corresponding module issued a larger total volume of I/O during the issue period.
  2. Check the Program Module Action/SQL ID field of the rows with a large IO Size(MB) value. If the information relates to PX/DAS execution, go to "Troubleshoot SQL-level issues" below. Otherwise, refer to the relevant documentation or contact technical support for assistance.
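The cluster-level decision flow above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `IoBandwidthRow` type, the `next_step` function, the 0.5 threshold, the module names, and the substring test for PX/DAS are all hypothetical, and the rows are assumed to be copied by hand from the Top IO Bandwidth section.

```python
from dataclasses import dataclass


@dataclass
class IoBandwidthRow:
    """One hypothetical row transcribed from the Top IO Bandwidth section."""
    module: str        # value of the Program Module Action/SQL ID field
    io_size_mb: float  # value of the IO Size(MB) column


def next_step(foreground_io_share, rows, threshold=0.5):
    """Suggest the next troubleshooting step.

    `threshold` is an assumed cutoff for "high"; this topic gives no number.
    """
    if foreground_io_share >= threshold:
        return "troubleshoot SQL-level issues"
    if not rows:
        return "collect the Top IO Bandwidth section"
    # The module with the largest IO Size(MB) is the main I/O source.
    top = max(rows, key=lambda r: r.io_size_mb)
    # Assumption: PX/DAS execution can be recognized by a substring match.
    name = top.module.upper()
    if "PX" in name or "DAS" in name:
        return "troubleshoot SQL-level issues"
    return f"consult documentation or support about: {top.module}"


# Illustrative rows only; the module names are made up.
rows = [
    IoBandwidthRow("compaction_task (hypothetical)", 900.0),
    IoBandwidthRow("px_worker (hypothetical)", 120.0),
]
print(next_step(0.1, rows))
```

In practice, read the module information itself rather than relying on a string match; the sketch only shows the order of the checks.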
Troubleshoot SQL-level issues.

If the ASH report for the affected node and tenant shows a high proportion of I/O-related wait events in the foreground during the issue period, troubleshoot further as follows:
1. View the Top SQL with Top Event and Top SQL with Top Operator sections.
  - Top SQL with Top Event: Lists the SQL statements with the highest execution overhead in the ASH report. The Top Event column shows the wait event with the highest proportion for each SQL statement.
  - Top SQL with Top Operator: Lists the SQL statements with the highest execution overhead in the ASH report. The Top Operator and Top Event columns show the operator ID and event_id with the highest overhead for each SQL statement.
    
    Together, these two tables give you the SQL statements and operator IDs associated with I/O-related wait events.
2. To find out which table partition caused the I/O, view the Top DB Object section, which lists the tablet_id values accessed most frequently by user SQL statements.
3. Based on the obtained information, refer to the relevant documentation or contact technical support for assistance.
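As a rough illustration of step 1, the following Python sketch filters transcribed Top SQL rows down to the statements whose top wait event is I/O-related. The row layout (dicts with `sql_id`, `top_operator`, and `top_event` keys), the function name, and the sample values are assumptions made for this example. In particular, the sketch assumes that events are recorded by name, whereas the report may show an event_id instead.

```python
# I/O-related wait events named in this topic.
IO_EVENTS = {
    "db file data read",
    "db file compact read",
    "db file compact write",
    "row store disk write",
}


def io_bound_statements(top_sql_rows):
    """Return (sql_id, operator_id) pairs whose top event is I/O-related.

    `top_sql_rows` is a hypothetical list of dicts with the keys
    "sql_id", "top_operator", and "top_event" (event name, by assumption).
    """
    return [
        (row["sql_id"], row["top_operator"])
        for row in top_sql_rows
        if row["top_event"] in IO_EVENTS
    ]


# Illustrative rows only; the SQL IDs and operator IDs are placeholders.
top_sql_rows = [
    {"sql_id": "SQL_A", "top_operator": 3, "top_event": "db file data read"},
    {"sql_id": "SQL_B", "top_operator": 1, "top_event": "ON CPU"},
]
print(io_bound_statements(top_sql_rows))  # prints [('SQL_A', 3)]
```

The resulting SQL IDs and operator IDs are the details to look up in the documentation or to pass to technical support in step 3.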
