There are many reasons for high I/O pressure on a node, with traffic surge being the most common one. Other reasons include data migration and major compaction. To reduce the I/O pressure, you can downgrade some I/O-intensive tasks. This topic explains how to handle the issue in various scenarios.
Emergency handling procedure
If the disk I/O on a node is high, you can follow the emergency handling procedure below to reduce the I/O pressure.
Pause the ongoing major compaction.
If a major compaction is in progress on a node with high I/O pressure, you can pause the compaction. The command is as follows:
ALTER SYSTEM SUSPEND MERGE [ZONE [=] 'zone'];
After the I/O pressure is reduced, you can resume the major compaction if needed. The syntax is as follows:
ALTER SYSTEM RESUME MERGE [ZONE [=] 'zone'];
Pause the ongoing backup task.
Use OCP to check whether a backup task is running on the node. If yes, you can pause the backup task to reduce the I/O pressure.
Pause the ongoing data transmission or import/export task.
You can use the TOPSQL feature or session management function in OCP to identify SQL statements that are writing data in batches. If you cannot quickly identify the batch processing task that is causing the high I/O pressure, you can use OMS to check whether a data transmission task is running on the node. If yes, you can pause the task based on your business needs to reduce the I/O pressure and resume it later.
For more information, see Manage migration tasks.
You can also use ODC to check whether a dump task is running on the node. Go to the Ticket tab in ODC and click Import to view the task list. When a dump task is pending execution, you can Abort it based on your business needs. Additionally, scheduled tasks in third-party big data platforms or the DataX component can also be paused on a task-by-task basis as needed.
Reduce the number of threads for minor compaction.
If the number of threads for minor compaction is high, the disk I/O will also increase. The parameter compaction_high_thread_score specifies the threshold for the number of threads for minor compaction. You can reduce the parameter value to lower the disk I/O. The default value is 0, which indicates adaptive determination. For 64-core systems, the value is typically 10. You can reduce the value based on your business needs. The modification of this parameter takes effect immediately without the need to restart the system. The syntax is as follows:
ALTER SYSTEM SET compaction_high_thread_score = 5;
After modifying the parameter, you can use the SHOW PARAMETERS statement to query whether the modification is successful.
SHOW PARAMETERS LIKE 'compaction_high_thread_score';
Reduce the network bandwidth for background tasks.
You can execute the following command to reduce the network bandwidth for background tasks:
ALTER SYSTEM SET sys_bkgd_net_percentage = 30; -- default value: 60
Implement SQL throttling or create indexes for high-load SQL statements.
When the disk I/O is high and you have identified the SQL statement responsible, you can add the max_concurrent hint to the execution plan of that statement to limit its concurrency, thereby throttling it.
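As a minimal sketch of this kind of throttling, the hint can be bound to a statement through an outline. The source does not say which mechanism to use, so treat the outline approach as an assumption; the table t1, the column c2, and the outline name otl_throttle_t1 are hypothetical placeholders for the statement you identified.

```sql
-- Hypothetical table, column, and outline names; replace them with the
-- statement located through TOPSQL or session management.
-- Limit the parameterized statement to at most 1 concurrent execution.
CREATE OUTLINE otl_throttle_t1
ON SELECT /*+ MAX_CONCURRENT(1) */ * FROM t1 WHERE c2 = ?;

-- Remove the limit once the I/O pressure has eased.
DROP OUTLINE otl_throttle_t1;
```

If you run this from a MySQL-compatible command-line client, the client may strip comments, including hints, unless it is started with the option that preserves them (-c), so verify that the hint actually reaches the server.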
You can create indexes on tables to reduce full table scans caused by missing indexes, thereby lowering the I/O pressure.
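The following sketch illustrates the index approach under assumed names: the table orders, the column customer_id, and the index name idx_orders_customer_id are hypothetical and do not come from this topic. It checks the plan, adds an index on the filter column, and checks the plan again.

```sql
-- Hypothetical table, column, and index names.
-- Inspect the plan first: a full table scan on the filtered table
-- suggests that an index on the filter column is missing.
EXPLAIN SELECT * FROM orders WHERE customer_id = 1001;

-- Add an index on the filter column.
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

-- Inspect the plan again to confirm that the index is now used.
EXPLAIN SELECT * FROM orders WHERE customer_id = 1001;
```

Building an index reads the table and therefore generates I/O of its own, so on a node that is already under pressure it may be safer to create the index after the other measures have taken effect.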
Cancel the ongoing index creation task.
If a large index creation task is running in the cluster, you can cancel it and run it again after the cluster has recovered.