Optimize backup and restore performance|V4.2.2| docs|Distributed Database

Optimize backup and restore performance

Last Updated: 2025-12-02 02:50:55

Performance tuning methods

Ideally, the performance of backup and restore should be limited only by the data distribution (number of partitions and size of partitions) and hardware (CPU, disk, and network). However, by default, backup and restore cannot fully utilize the hardware performance. To address this, OceanBase Database offers resource isolation strategies and the following parameters for performance tuning:

Network configuration: The sys_bkgd_net_percentage parameter specifies the maximum percentage of total network bandwidth available to background system tasks (including backup and restore tasks). Setting this parameter to an appropriate value helps the backup and restore tasks make full use of the network bandwidth without affecting the foreground business.
CPU and I/O configurations: OceanBase Database also offers the Resource Manager feature for isolating the CPU and I/O resources of different types of tasks. If resource isolation is configured for backup and restore tasks, set the upper limits based on actual resource and business requirements to avoid bottlenecks of CPU and I/O resources for backup and restore tasks.
Other configurations: If sufficient CPU, I/O, and network resources are available, you can increase the concurrency of backup and restore tasks by setting relevant parameters (ha_low_thread_score, log_archive_concurrency, log_restore_concurrency, and ha_high_thread_score) to improve the performance.

Cluster-level parameter `sys_bkgd_net_percentage`

The sys_bkgd_net_percentage parameter specifies the maximum percentage of network bandwidth available to background system tasks. The default value is 60, which means 60% of the network card rate of the server. When the network bandwidth is saturated, you can decrease the value of the sys_bkgd_net_percentage parameter to reserve sufficient network bandwidth for business requests.
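
As a minimal sketch of how the parameter might be inspected and changed, the helper below only prints the SQL statements and does not run them. The function name bkgd_net_sql and the value 50 are illustrative assumptions, and the statements are assumed to be issued from the sys tenant through a SQL client of your choice.

```shell
# Print (not execute) the statements for inspecting and changing the parameter.
# Usage: bkgd_net_sql NEW_PERCENTAGE
bkgd_net_sql() {
  cat <<SQL
SHOW PARAMETERS LIKE 'sys_bkgd_net_percentage';
ALTER SYSTEM SET sys_bkgd_net_percentage = $1;
SQL
}

# 50 is an arbitrary illustrative value, not a recommendation.
bkgd_net_sql 50
```

Printing the statements first makes it easy to review the change before piping it to a client session.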

After the parameter is set, perform the following steps to view the log:

Log in to the server where the OBServer node resides as the admin user.
Navigate to the installation directory of OceanBase Database.

For example, if OceanBase Database is installed in /home/admin/oceanbase/, run the following command. If your installation path is different, adjust the path accordingly.
```
[admin@xxx /]$ cd /home/admin/oceanbase
```
Run the following command to view the network card rate.
```
[admin@xxx oceanbase]$ grep -E 'print band limit|succeed to init_bandwidth_throttle' log/observer.log*
```
Here, observer.log is the observer log file generated when the cluster starts.

For example, the query result is as follows:
```
log/observer.log.20210811100806:[2021-08-11 10:06:32.934433] INFO  [SERVER] ob_server.cpp:1783 [76957] [0] [Y0-0000000000000000] [lt=4] [dc=0] succeed to init_bandwidth_throttle(sys_bkgd_net_percentage_=60,ethernet_speed_=1310720000,rate=786432000)
log/observer.log.20210811100806:[2021-08-11 10:07:42.351813] INFO  [COMMON] utility.cpp:1487 [77169][418] [Y9FA64586E9E-0005C93F15DAE715] [lt=11] [dc=0] print band limit(comment= in , copy_KB=0, sleep_ms_sum=0, speed_KB_per_s=0, total_sleep_ms=0,total__bytes=531, rate_KB/s=786432,print_interval_ms=69417)
```
In the first query result, sys_bkgd_net_percentage_=60 indicates that background system tasks can use up to 60% of the network card rate of the server; ethernet_speed_=1310720000 indicates that the maximum network card rate identified by OceanBase Database is 1310720000 B/s; and rate=786432000 indicates that the maximum network bandwidth after speed limiting is 786432000 B/s, where rate = ethernet_speed_ * sys_bkgd_net_percentage_ / 100.

In the second query result, rate_KB/s=786432 indicates that the identified maximum speed limit (rate) is 786432 KB/s.
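
To cross-check the logged limit, the following sketch extracts the percentage and the identified network card rate from the first sample log entry and recomputes the rate. The helper name field and the sed pattern are illustrative assumptions; on a real node you would substitute the matching line from log/observer.log*.

```shell
# Sample entry copied from the log output above (leading log fields omitted).
line='succeed to init_bandwidth_throttle(sys_bkgd_net_percentage_=60,ethernet_speed_=1310720000,rate=786432000)'

# Print the numeric value that follows "<key>=" in the sample line.
field() {
  printf '%s\n' "$line" | sed -n "s/.*[(,]$1=\([0-9][0-9]*\).*/\1/p"
}

pct=$(field sys_bkgd_net_percentage_)
speed=$(field ethernet_speed_)
logged=$(field rate)

# rate = identified network card rate * percentage / 100, in B/s.
expected=$(( speed * pct / 100 ))
echo "computed=${expected} logged=${logged}"
```

For the sample entry, both values are 786432000 B/s, so the logged limit matches the configured percentage.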

If the log shows that the network card rate identified by OceanBase Database is inaccurate, modify the network card rate based on the checklist for network card rate, and then view the log again to confirm the change.

Resource isolation in the Resource Manager

The Resource Manager is the resource isolation mechanism in OceanBase Database. In function-level resource isolation, you can configure the upper limits of resources, such as CPU, IOPS, and network bandwidth, for different background tasks. For more information, see Overview of resource isolation.

In function-level resource isolation, background tasks related to backup and restore are classified into the following categories based on priority and reliability:

ha_high: tasks of high priority and reliability, such as replication, rebuild, and restore.
ha_mid: tasks of medium priority and reliability, such as migration.
ha_low: tasks of low priority and reliability, such as backup and backup cleanup.

You can query the DBA_OB_RSRC_IO_DIRECTIVES view for the resource isolation plan configured for the current tenant. If the query result set is empty, resource isolation is not configured for the tenant. If the query result set contains records of background tasks, check whether CPU, I/O, network bandwidth, or other resources have become a bottleneck and whether that bottleneck is caused by the limits set by resource isolation. If so, raise the limits in the resource isolation plan to a level that does not affect the foreground business. For more information, see Modify a resource management plan (MySQL mode) and Modify a resource management plan (Oracle mode).
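
A minimal sketch of the query is shown below. The function name directives_query is hypothetical, and the schema prefixes (oceanbase for MySQL mode, SYS for Oracle mode) are assumptions about where the view is exposed; verify them against your deployment. The function only prints the statement.

```shell
# directives_query MODE -> print the SELECT for the given tenant mode.
# The schema prefixes below are assumptions, not taken from this document.
directives_query() {
  case "$1" in
    mysql)  echo "SELECT * FROM oceanbase.DBA_OB_RSRC_IO_DIRECTIVES;" ;;
    oracle) echo "SELECT * FROM SYS.DBA_OB_RSRC_IO_DIRECTIVES;" ;;
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

directives_query mysql
```

An empty result set from the printed statement corresponds to the "resource isolation is not configured" case described above.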

Do not configure resource isolation plans during performance tests of backup and restore.

Data backup-related

| Parameter | Description | Default value | Note |
| --- | --- | --- | --- |
| ha_low_thread_score | The maximum number of threads for concurrent data backup. This is a tenant-level parameter. | 0, which indicates that the default value, 2, is used. | We recommend that you keep the default value for small tenants (CPU cores ≤ 4) and start with a value of 10 for large tenants. If the backup speed is too slow, you can double the value. During performance tests of backup and restore, we recommend that you set this parameter to the maximum value, 100. |
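
The tuning guidance for ha_low_thread_score can be sketched as two small helpers. The function names are hypothetical, and treating a current value of 0 as the effective default of 2 before doubling is an assumption made for this illustration.

```shell
# Starting value suggested by the note: keep the default (0) for tenants
# with at most 4 CPU cores, otherwise start with 10.
initial_ha_low_thread_score() {
  if [ "$1" -le 4 ]; then echo 0; else echo 10; fi
}

# Double the current value when backup is too slow, capped at the maximum, 100.
# Assumption: a current value of 0 is treated as the effective default, 2.
next_ha_low_thread_score() {
  cur=$1
  if [ "$cur" -eq 0 ]; then cur=2; fi
  nxt=$(( cur * 2 ))
  if [ "$nxt" -gt 100 ]; then nxt=100; fi
  echo "$nxt"
}

# Example: a 16-core tenant that stays too slow through four adjustments.
score=$(initial_ha_low_thread_score 16)
for step in 1 2 3 4; do
  score=$(next_ha_low_thread_score "$score")
  echo "step ${step}: ha_low_thread_score=${score}"
done
```

Starting from 10, the sequence is 20, 40, 80, and then 100, where the cap takes effect.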

Log archive-related

| Parameter | Description | Default value | Note |
| --- | --- | --- | --- |
| log_archive_concurrency | The maximum number of concurrent threads for log archiving. This is a tenant-level parameter. | 0. In this case, the system calculates the number of archive threads based on the `MAX_CPU` of the tenant by using the following adaptive rule: If `tenant's MAX_CPU <= 8`, then `archive threads = MAX_CPU`. If `8 < tenant's MAX_CPU < 32`, then `archive threads = tenant's MAX_CPU / 2`, with a minimum value of 8. If `tenant's MAX_CPU >= 32`, then `archive threads = tenant's MAX_CPU / 4`, with a minimum value of 16. | We recommend that you keep the default value for both large and small tenants so that the system can adaptively calculate the number of worker threads. |
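
The adaptive rule for the default value can be sketched as a shell function. The function name archive_threads is hypothetical, and the sketch assumes that MAX_CPU is a whole number and that the divisions truncate; the document does not state how fractional results are rounded.

```shell
# archive_threads MAX_CPU -> number of archive threads under the adaptive rule.
# Assumptions: MAX_CPU is an integer and division truncates.
archive_threads() {
  cpu=$1
  if [ "$cpu" -le 8 ]; then
    n=$cpu
  elif [ "$cpu" -lt 32 ]; then
    n=$(( cpu / 2 ))
    if [ "$n" -lt 8 ]; then n=8; fi
  else
    n=$(( cpu / 4 ))
    if [ "$n" -lt 16 ]; then n=16; fi
  fi
  echo "$n"
}

for c in 4 8 10 20 32 128; do
  echo "MAX_CPU=${c} archive_threads=$(archive_threads "$c")"
done
```

Under these assumptions, the sample inputs yield 4, 8, 8, 10, 16, and 32 threads respectively, which shows where the minimum values of 8 and 16 take effect.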

Restoration-related

| Parameter | Description | Default value | Note |
| --- | --- | --- | --- |
| log_restore_concurrency | The maximum number of concurrent threads for log restore. This is a tenant-level parameter. | 0, which means that the number of concurrent threads equals the `MAX_CPU` of the tenant. | Increasing this parameter increases the number of worker threads and the memory overhead. We recommend that you keep the default value, 0, and increase it only if the restore speed is too slow. |
| ha_high_thread_score | The maximum number of threads for concurrent data restore. This is a tenant-level parameter. | 0, which indicates that the default value, 8, is used. | We recommend that you use the default value outside performance tests and set it to the maximum value, 100, during performance tests. |
| _restore_idle_time | A hidden cluster-level parameter that controls the interval at which RS schedules restore tasks. | 1m, which means 1 minute. | Setting this parameter to 10s shortens data restore by tens of seconds up to two minutes. We recommend adjusting it only for small tenants (tenant CPU ≤ 4 cores) with high performance requirements. Otherwise, the effect is negligible. |
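
As an illustration of applying the values in a performance test, the sketch below prints the corresponding statements without executing them. The function name tenant_param_sql and the tenant name my_tenant are hypothetical, and the statements are assumed to be issued from the sys tenant, where the TENANT clause selects the target tenant.

```shell
# tenant_param_sql NAME VALUE TENANT -> print the statement for a tenant-level parameter.
# Assumption: the statement is run from the sys tenant.
tenant_param_sql() {
  printf 'ALTER SYSTEM SET %s = %s TENANT = %s;\n' "$1" "$2" "$3"
}

# my_tenant is a placeholder tenant name.
tenant_param_sql ha_high_thread_score 100 my_tenant
tenant_param_sql log_restore_concurrency 0 my_tenant

# Cluster-level hidden parameter, so no TENANT clause is used here.
echo "ALTER SYSTEM SET _restore_idle_time = '10s';"
```

Restoring the previous values after the test keeps the settings in line with the recommendations for non-test scenarios.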

References

Cluster-level parameter `sys_bkgd_net_percentage`