High disk I/O on an OBServer node can have many causes, such as a traffic increase, data migration and replication, or a major compaction. The common solution is to downgrade, that is, pause or throttle, the tasks that generate a high I/O load. This topic describes the emergency procedures in detail.
Emergency procedure
To relieve high disk I/O on an OBServer node, downgrade the tasks that have a high I/O load. Perform the following operations:
Pause the major compaction in progress.
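Before pausing anything, it can help to confirm which zones are actually merging. The query below is a minimal sketch that assumes an OceanBase 3.x cluster and a connection to the sys tenant, where the internal table oceanbase.__all_zone is expected to hold a merge_status row per zone. The table and column names are an assumption about your version and are not taken from this topic; later versions expose compaction status through different views, so check the system view reference for your release.

```sql
-- Assumption: OceanBase 3.x, connected to the sys tenant.
-- Each zone is expected to have a row named 'merge_status'; the info column
-- typically reads IDLE when no major compaction is running and MERGING otherwise.
SELECT zone, name, info
FROM oceanbase.__all_zone
WHERE name = 'merge_status';
```

A zone name returned by this query can be used in the ZONE clause of the SUSPEND MERGE command.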
If a major compaction is in progress on the node with high disk I/O, pause the major compaction to reduce the I/O load. Syntax of the command:
ALTER SYSTEM SUSPEND MERGE [ZONE [=] 'zone'];
After the disk I/O is reduced, you can resume the major compaction. Syntax of the command:
ALTER SYSTEM RESUME MERGE [ZONE [=] 'zone'];
Pause a running backup task.
You can check whether a backup task is running on the current node in the OceanBase Cloud Platform (OCP) console. If so, pause the backup task to reduce the I/O load. Perform the following steps:
In the left-side navigation pane, click Clusters. The cluster overview page appears.
Find the cluster in the Clusters list, and click its name. In the left-side navigation pane of the page that appears, click Backup and Restore.
On the backup and restore page of the cluster, find the task that is under backup scheduling. You can pause the task and resume it later:
To pause the backup task, find the task and choose Actions > Pause.
To resume the backup task, choose Actions > Restart to restart the task. The system periodically initiates the task at the specified time.
Pause the data transmission, import, or export task.
You can use the TOP SQL and session management features in the OCP console to identify SQL statements that write data in batches. If you cannot identify the client that issues the batch task, check whether a data transmission task is running on the node in the OceanBase Migration Service (OMS) console. If so, pause the task to reduce the I/O load. Perform the following steps:
In the left-side navigation pane in the OMS console, choose OMS > Data Transmission Projects. On the page that appears, find the project in the project list and click Enter in the Actions column to go to the details page of the project. The details page displays a migration task list. Find the migration task and click Enter in the Actions column to go to the details page of the migration task.
a. Click Stop to stop the running subtasks in the orchestration task list and stop scheduling all subsequent atomic tasks.
b. Click Reset to set the status of all atomic tasks to TODO. This way, the tasks can be re-executed.
You can also check whether a data import task is running on the current node in the OceanBase Developer Center (ODC) console. After the destination database is connected, click Task in the top navigation bar. In the Task Center panel, click the Import tab. Find the task in the task list and click Abort or Retry as needed in the Actions column.
Reduce the number of minor compaction threads.
A high concurrency of minor compaction threads can increase the disk I/O load. To reduce the disk I/O load, you can decrease the value of the minor_merge_concurrency parameter, which determines the number of concurrent minor compaction threads. The default value of this parameter is 0, which indicates that the system adaptively adjusts the number of threads. If you configure 64 CPU cores, we recommend that you set the value to 10. You can modify the value as needed. The modification immediately takes effect without a restart. Syntax:
ALTER SYSTEM SET minor_merge_concurrency = 5;
After the parameters are modified, you can execute the SHOW PARAMETERS statement to check whether the modification is successful.
SHOW PARAMETERS LIKE 'minor_merge_concurrency';
You can also reduce the number of threads for minor compactions by modifying the minor compaction strategy in the OCP console.
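As a worked illustration of the steps above, the following sequence checks the current setting, applies the value recommended for a node with 64 CPU cores, verifies the change, and later restores the default. The value 10 comes from the recommendation in this section; whether it suits your node is an assumption to validate against your own hardware and workload.

```sql
-- 1. Inspect the current value (0 means the thread count is adaptive).
SHOW PARAMETERS LIKE 'minor_merge_concurrency';

-- 2. Set the number of concurrent minor compaction threads.
--    The change takes effect without a restart.
ALTER SYSTEM SET minor_merge_concurrency = 10;

-- 3. Confirm that the new value is in place.
SHOW PARAMETERS LIKE 'minor_merge_concurrency';

-- 4. Once disk I/O is back to normal, restore the adaptive default.
ALTER SYSTEM SET minor_merge_concurrency = 0;
```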
Lower the concurrency of migration tasks.
If the high load of the NIC is accompanied by unit migration or load balancing tasks in the OceanBase cluster, you can throttle the I/O by limiting the concurrency of migration tasks.
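Some of the migration parameters only need to be changed when they exceed their default values, so it can be useful to look at the current values first. A minimal sketch, reusing the SHOW PARAMETERS statement that this topic uses for minor_merge_concurrency:

```sql
-- Current values of the migration concurrency parameters.
SHOW PARAMETERS LIKE 'migrate_concurrency';
SHOW PARAMETERS LIKE 'data_copy_concurrency';

-- The % wildcard makes this pattern match both
-- server_data_copy_in_concurrency and server_data_copy_out_concurrency.
SHOW PARAMETERS LIKE 'server_data_copy%concurrency';
```

The statements for lowering these values are as follows: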
ALTER SYSTEM SET migrate_concurrency=5; -- Default value: 10.
ALTER SYSTEM SET data_copy_concurrency=5; -- Default value: 10.
ALTER SYSTEM SET server_data_copy_in_concurrency=2; -- Default value: 2. If the value is greater than 2, you can reset it to 2.
ALTER SYSTEM SET server_data_copy_out_concurrency=2; -- Default value: 2. If the value is greater than 2, you can reset it to 2.
Lower the network bandwidth of background tasks.
Run the following command:
ALTER SYSTEM SET sys_bkgd_net_percentage=30; -- Default value: 60.
Throttle highly concurrent SQL executions and add table indexes.
If the disk I/O is high because of a specific SQL statement, add the max_concurrent hint to the execution plan bound to that statement to limit its execution concurrency. For more information, see "SQL statements" in SQL Reference (MySQL Mode) and "SQL statements" in SQL Reference (Oracle Mode).
If the high I/O load is caused by the full scan of tables that lack indexes, add indexes to the tables as needed.
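The two remedies above might look like the following sketch. The table t1, its column c1, the outline name, the index name, and the concurrency limit of 10 are all hypothetical and chosen only for illustration. The CREATE OUTLINE form shown is assumed to be the MySQL-mode syntax; verify it against the SQL reference for your version and mode before use.

```sql
-- Hypothetical statement that drives the disk I/O up:
--   SELECT * FROM t1 WHERE c1 = 1;

-- Bind an outline that carries the max_concurrent hint so that at most
-- 10 executions of this statement run at the same time.
CREATE OUTLINE otl_limit_t1 ON
  SELECT /*+ max_concurrent(10) */ * FROM t1 WHERE c1 = 1;

-- If the statement scans the whole table because c1 has no index,
-- add one so that the filter can use the index instead.
CREATE INDEX idx_t1_c1 ON t1 (c1);

-- Remove the concurrency limit when it is no longer needed.
DROP OUTLINE otl_limit_t1;
```

Building an index on a large table generates I/O of its own, so on a node that is already overloaded it may be safer to apply the concurrency limit first and create the index after the load has dropped.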
Cancel the index creation in progress.
If an index is being created for a large table in the cluster, cancel the creation as needed.