High disk I/O on an OBServer node can have many causes. Apart from an increase in traffic, it usually results from migration, replication, or major compaction. The usual remedy is to downgrade tasks that generate a heavy I/O load. This topic describes how to handle the issue in an emergency.
Emergency handling procedure
To relieve high disk I/O on a node, downgrade the tasks that generate a heavy I/O load. The available measures are as follows:
Pause running major compaction tasks.
If a major compaction is being performed on the node with high disk I/O, pause the major compaction to reduce the I/O load.
ALTER SYSTEM SUSPEND MERGE [ZONE [=] 'zone'];
Once the I/O pressure is relieved, you can resume the major compaction as needed. The syntax is as follows:
ALTER SYSTEM RESUME MERGE [ZONE [=] 'zone'];
Pause running backup tasks.
You can use OCP to check if the current node is performing backups. If so, pausing the backups can alleviate the I/O pressure.
Pause running data transmission, import, and export tasks.
You can use the TopSQL and Session Management features in OCP to identify SQL statements that are performing batch writes. If you cannot quickly identify which system a batch task comes from, you can use OMS to check whether the current node is performing data transmission tasks. If so, pause them to relieve the I/O pressure, and resume them once the pressure has subsided.
For more information, see View details of a data migration project.
You can also use ODC to check whether data is being imported to the node. On the Ticket tab of ODC, click Import to view the task list. If import tasks are pending, you can click Abort to terminate them based on your business needs. Scheduled tasks in third-party big data platforms or in the DataX component can also be stopped manually at any time.
Reduce the number of minor compaction threads.
A high degree of parallelism during minor compaction can increase disk I/O. The compaction_high_thread_score parameter controls the number of parallel minor compaction threads. You can decrease the value of this parameter to reduce disk I/O usage. The default value is 0, which indicates adaptive mode. For a node with 64 CPU cores, the value is generally 10. You can reduce the value as needed. The modification takes effect immediately without a restart.
ALTER SYSTEM SET compaction_high_thread_score = 5;
After the modification, you can execute the SHOW PARAMETERS statement to check whether the modification is successful.
SHOW PARAMETERS LIKE 'compaction_high_thread_score';
Reduce the network bandwidth of background tasks.
You can use the following command to reduce the network bandwidth for background tasks:
ALTER SYSTEM SET sys_bkgd_net_percentage = 30; -- Default value: 60.
Apply throttling and add indexes to high-load SQL queries.
If a particular high-load SQL query is identified, you can limit its concurrency by adding the max_concurrent hint to its execution plan, which throttles the query.
If a full table scan causes high I/O due to missing indexes, you can add indexes to the relevant tables as needed.
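As one way to apply the throttling described above, the hint can be bound to a statement through an outline. The following is a minimal sketch that assumes the outline mechanism (CREATE OUTLINE) is what attaches the hint, which this topic does not spell out. The outline name ol_throttle_t1, the table t1, the column c1, and the limit of 10 are all hypothetical. Connect with a client that preserves comments (for example, the -c option of the mysql client), because the hint may otherwise be stripped before it reaches the server.

```sql
-- Assumption: the hint is bound through an outline; all names are hypothetical.
-- Limit the statement to at most 10 concurrent executions.
CREATE OUTLINE ol_throttle_t1 ON
  SELECT /*+ max_concurrent(10) */ * FROM t1 WHERE c1 = 1;

-- Remove the throttle once the I/O pressure is relieved.
DROP OUTLINE ol_throttle_t1;
```

Choose the limit based on the concurrency the statement normally needs, and drop the outline after the cluster has recovered so that the restriction does not linger.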
Cancel running index creation.
If index creation for a large table is currently in progress in the cluster, you can cancel it as appropriate and resume the creation after the cluster has recovered.
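Once the cluster has recovered, the emergency settings above can be rolled back. The following sketch assumes that both parameters were at their default values before the incident (0 for compaction_high_thread_score and 60 for sys_bkgd_net_percentage, as stated above) and that the major compaction was suspended for a zone named zone1, which is a hypothetical name. If your cluster used custom values, restore those instead.

```sql
-- Resume the major compaction that was suspended earlier.
-- 'zone1' is a hypothetical zone name; use the zone you suspended.
ALTER SYSTEM RESUME MERGE ZONE = 'zone1';

-- Return the minor compaction thread setting to its default (adaptive mode).
ALTER SYSTEM SET compaction_high_thread_score = 0;

-- Return the background network bandwidth share to its default.
ALTER SYSTEM SET sys_bkgd_net_percentage = 60;

-- Confirm that both parameters show the restored values.
SHOW PARAMETERS LIKE 'compaction_high_thread_score';
SHOW PARAMETERS LIKE 'sys_bkgd_net_percentage';
```

Restoring the settings one at a time, while watching disk I/O between steps, makes it easier to tell which task brings the load back.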