High disk I/O on an OBServer node can have many causes. Apart from an increase in traffic, it usually results from background activities such as migration, replication, and major compaction. The usual remedy is to pause or throttle some of the tasks that generate a high I/O load. This topic describes how to do so in an emergency.
Emergency handling procedure
Depending on which tasks are generating the I/O load on the node, take one or more of the following measures:
Pause running major compaction tasks.
If a major compaction is being performed on the node with high disk I/O, pause the major compaction to reduce the I/O load.
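Before suspending, it can help to confirm that a major compaction is actually in progress. The following is a minimal sketch, assuming OceanBase Database V4.x and a connection to the sys tenant. The view and column names are assumptions based on that version and are not part of the procedure described here, so verify them against your deployment.

```sql
-- Assumption: OceanBase Database V4.x, connected to the sys tenant.
-- Lists the major compaction state of each tenant.
-- A STATUS other than IDLE suggests that a compaction is in progress;
-- IS_SUSPENDED shows whether the compaction has already been paused.
SELECT TENANT_ID, STATUS, IS_SUSPENDED, START_TIME, LAST_FINISH_TIME
FROM oceanbase.CDB_OB_MAJOR_COMPACTION;
```

If a compaction is running, suspend it with the following statement: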
ALTER SYSTEM SUSPEND MERGE [ZONE [=] 'zone'];
Once the I/O pressure is relieved, you can resume the major compaction.
ALTER SYSTEM RESUME MERGE [ZONE [=] 'zone'];
Pause running backup tasks.
You can use OCP to check if the current node is performing backups. If so, pausing the backups can alleviate the I/O pressure.
Pause running data transmission, import, and export tasks.
You can use the TopSQL and Session Management features in OCP to identify SQL statements that are performing batch writes. If you cannot quickly identify which system a batch task comes from, you can use OMS to check whether data transmission tasks are running against the current node. If so, pause them to relieve the I/O pressure, and resume them after the pressure subsides.
For more information, see View details of a data migration project.
You can also use ODC to check whether data is being imported to the node. On the Ticket tab of ODC, click Import to view the task list. If there are pending import tasks, you can click Abort to terminate them based on your business needs. Scheduled tasks on third-party big data platforms or in the DataX component can also be stopped manually at any time.
Reduce the number of threads for minor compaction.
A high degree of parallelism (DOP) during minor compaction can increase disk I/O. The compaction_high_thread_score parameter specifies the number of threads for minor compaction. You can decrease the value of this parameter to reduce disk I/O usage. The default value is 0, which indicates that the system adaptively adjusts the number of threads. For 64 CPU cores, it is generally set to 10. You can adjust it according to your specific situation. The modification takes effect immediately without the need for a restart.
ALTER SYSTEM SET compaction_high_thread_score = 5;
After the modification, you can execute the SHOW PARAMETERS statement to check whether the modification is successful.
SHOW PARAMETERS LIKE 'compaction_high_thread_score';
Reduce the network bandwidth of background tasks.
You can use the following command to reduce the network bandwidth for background tasks:
ALTER SYSTEM SET sys_bkgd_net_percentage = 30; -- Default value: 60.
Apply throttling and add indexes to high-load SQL queries.
If a particular high-load SQL query is identified, you can limit its concurrency by adding the max_concurrent hint to its execution plan, which throttles the query.
If a full table scan causes high I/O due to missing indexes, you can add indexes to the relevant tables as needed.
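The two measures can be sketched as follows. This is an illustration only: the table orders, the column customer_id, and the outline and index names are hypothetical, and binding the hint through CREATE OUTLINE is an assumption about how the hint is attached in your environment. Connect with the client's -c option so that the hint comment is not stripped before it reaches the server.

```sql
-- Hypothetical names: orders, customer_id, otl_throttle_orders, idx_orders_customer_id.
-- Throttle the statement: allow at most 1 concurrent execution of queries
-- that match this pattern ("?" marks the parameterized value).
CREATE OUTLINE otl_throttle_orders
  ON SELECT /*+ max_concurrent(1) */ * FROM orders WHERE customer_id = ?;

-- Add an index so that the query no longer requires a full table scan.
CREATE INDEX idx_orders_customer_id ON orders (customer_id);

-- Remove the throttle after the I/O pressure is relieved.
DROP OUTLINE otl_throttle_orders;
```

Choose the max_concurrent value based on how much concurrency the business can give up. Creating an index on a large table generates I/O of its own, so on a node that is already overloaded it may be safer to throttle first and build the index later.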
Pause running index creation tasks.
If index creation for a large table is currently in progress in the cluster, you can pause it if necessary and resume the creation after the cluster has been recovered.
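After the I/O pressure is relieved, the temporary measures should be reverted. The following sketch reverses the statements used in this topic, assuming that the parameters held their default values (0 and 60, as stated above) before the incident. If your cluster used different values, restore those instead.

```sql
-- Resume the major compaction that was suspended earlier.
ALTER SYSTEM RESUME MERGE;

-- Restore the minor compaction thread count (0 = adaptive, the default).
ALTER SYSTEM SET compaction_high_thread_score = 0;

-- Restore the network bandwidth share of background tasks (default: 60).
ALTER SYSTEM SET sys_bkgd_net_percentage = 60;

-- Verify that both parameters have been restored.
SHOW PARAMETERS LIKE 'compaction_high_thread_score';
SHOW PARAMETERS LIKE 'sys_bkgd_net_percentage';
```

Backup, data transmission, import, and index creation tasks that were paused through OCP, OMS, or ODC need to be resumed in those tools separately.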