Load balancing is an important part of performance tuning and covers two areas: load balancing within the cluster and load balancing of business traffic. Good load balancing makes full use of the software and hardware environment and helps achieve the best performance. During stress testing, pay attention to the resource usage of each OBServer node in the cluster, such as CPU utilization, I/O, and load. This article briefly discusses cluster deployment and resource allocation.
Cluster deployment
To evaluate a cluster deployment, you need information about its location, latency, and bandwidth.
Location
Location information is crucial. Details such as the IDC and the deployment method affect SQL routing and forwarding, the transaction model, and performance. Location information covers the following areas:
Deployment method: three IDCs in the same city, three IDCs across two regions, five IDCs across three regions, or another deployment method.
Location of OBProxy and other middleware: whether OBProxy is deployed on the client side or co-located with the OBServer node. Different deployment modes affect performance differently.
The locations of the application servers and other middleware.
The following SQL statement queries the IDC of each zone in the cluster and the region where the IDC is located:
obclient [oceanbase]> SELECT * FROM oceanbase.DBA_OB_ZONES;
The query result is as follows:
+-------+----------------------------+----------------------------+--------+------+----------------+-----------+
| ZONE  | CREATE_TIME                | MODIFY_TIME                | STATUS | IDC  | REGION         | TYPE      |
+-------+----------------------------+----------------------------+--------+------+----------------+-----------+
| zone1 | 2024-07-10 14:19:05.573991 | 2024-07-10 14:19:05.573991 | ACTIVE |      | default_region | ReadWrite |
+-------+----------------------------+----------------------------+--------+------+----------------+-----------+
1 row in set
Latency
You can assess the response time (RT) of an SQL statement based on latency information. The relevant latency information for the cluster is as follows:
Latency between IDCs.
Latency between zones.
Latency from OBProxy to OBServer nodes.
Latency from the client to OBProxy.
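As a rough illustration of how these latencies add up, the following Python sketch estimates the network floor of an SQL statement's RT. It assumes a simple additive model in which each hop contributes one round trip. The class name, function name, and all millisecond values are hypothetical and are not taken from OceanBase.

```python
from dataclasses import dataclass


@dataclass
class LatencyProfile:
    """Measured round-trip times in milliseconds (sample values are hypothetical)."""
    client_to_obproxy_ms: float
    obproxy_to_observer_ms: float
    inter_idc_ms: float


def network_rt_floor_ms(profile: LatencyProfile, cross_idc_forwards: int = 0) -> float:
    """Return the part of the RT that is spent on the network alone.

    cross_idc_forwards is the number of times the request is forwarded to an
    OBServer node in another IDC, for example when the leader is remote.
    """
    if cross_idc_forwards < 0:
        raise ValueError("cross_idc_forwards must be non-negative")
    return (
        profile.client_to_obproxy_ms
        + profile.obproxy_to_observer_ms
        + cross_idc_forwards * profile.inter_idc_ms
    )


# Hypothetical measurements, for example collected with ping.
profile = LatencyProfile(
    client_to_obproxy_ms=0.2,
    obproxy_to_observer_ms=0.3,
    inter_idc_ms=2.0,
)
local_rt = network_rt_floor_ms(profile)
remote_rt = network_rt_floor_ms(profile, cross_idc_forwards=1)
print(f"local leader: {local_rt:.1f} ms, remote leader: {remote_rt:.1f} ms")
```

If the observed RT of a simple statement is close to this floor, the network rather than SQL execution is the likely limit, and the deployment location is the first thing to review.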
Bandwidth
Confirm the bandwidth available to each of the following components:
NIC bandwidth of the machine on which OBProxy is installed.
NIC bandwidth of the application servers.
NIC bandwidth and disk I/O bandwidth of OBServer nodes.
You can obtain the preceding information by using the ping, tsar, ethtool xxx, and ifconfig commands. The following examples assume an FFF deployment, that is, three full-featured replicas across three IDCs.
Resource allocation
Before starting the performance analysis, it helps to understand how tenant resources are allocated.
Basic tenant information
Basic tenant information includes the primary zone and locality. The following SQL statement queries this information:
obclient(root@sys)[oceanbase]> SELECT * FROM oceanbase.DBA_OB_TENANTS LIMIT 1\G
The query result is as follows:
*************************** 1. row ***************************
TENANT_ID: 1
TENANT_NAME: sys
TENANT_TYPE: SYS
CREATE_TIME: 2026-02-10 10:34:01.635622
MODIFY_TIME: 2026-02-10 10:34:01.635622
PRIMARY_ZONE: RANDOM
LOCALITY: FULL{1}@zone1
PREVIOUS_LOCALITY: NULL
COMPATIBILITY_MODE: MYSQL
STATUS: NORMAL
IN_RECYCLEBIN: NO
LOCKED: NO
TENANT_ROLE: PRIMARY
SWITCHOVER_STATUS: NORMAL
SWITCHOVER_EPOCH: 0
SYNC_SCN: NULL
REPLAYABLE_SCN: NULL
READABLE_SCN: NULL
RECOVERY_UNTIL_SCN: NULL
LOG_MODE: NOARCHIVELOG
ARBITRATION_SERVICE_STATUS: DISABLED
UNIT_NUM: 1
ZONE_UNIT_NUM_LIST: zone1:1
COMPATIBLE: 4.6.0.0
MAX_LS_ID: 1
RESTORE_DATA_MODE: NORMAL
FLASHBACK_LOG_SCN: NULL
COMMENT: system tenant
1 row in set (0.037 sec)
Resource allocation information
The following sample SQL statement queries the resource units allocated to tenants:
obclient(root@sys)[oceanbase]> SELECT * FROM oceanbase.gv$ob_units LIMIT 1\G
The query results are as follows:
*************************** 1. row ***************************
SVR_IP: 172.xx.xx.xx
SVR_PORT: 2882
UNIT_ID: 1
TENANT_ID: 1
ZONE: zone1
ZONE_TYPE: ReadWrite
REGION: default_region
MAX_CPU: 4
MIN_CPU: 4
MEMORY_SIZE: 5368709120
MAX_IOPS: 9223372036854775807
MIN_IOPS: 9223372036854775807
IOPS_WEIGHT: 4
LOG_DISK_SIZE: 17448304640
LOG_DISK_IN_USE: 2311121885
DATA_DISK_IN_USE: 88608768
STATUS: NORMAL
CREATE_TIME: 2025-10-15 11:13:02.088335
REPLICA_TYPE: FULL
1 row in set
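Partition total and leader distribution of a single tenant
The following statement counts the log stream replicas that each OBServer node hosts for tenant 1002: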
obclient [oceanbase]> SELECT svr_ip,count(1) FROM oceanbase.__all_virtual_ls_meta_table WHERE tenant_id=1002 GROUP BY svr_ip;
+---------------+----------+
| svr_ip        | count(1) |
+---------------+----------+
| 10.10.10.1    |        1 |
| 10.10.10.2    |        1 |
| 10.10.10.3    |        1 |
+---------------+----------+
3 rows in set
The following statement counts the log stream leaders (role=1) on each OBServer node for tenant 1001:
obclient [oceanbase]> SELECT svr_ip,count(1) FROM oceanbase.__all_virtual_ls_meta_table WHERE tenant_id=1001 and role=1 GROUP BY svr_ip;
+---------------+----------+
| svr_ip        | count(1) |
+---------------+----------+
| 10.10.10.1    |        5 |
+---------------+----------+
1 row in set
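To make a result like this easier to interpret, the following Python sketch converts per-server leader counts into shares and lists the servers that hold no leader. It is only an illustration: the function name is hypothetical, and the inputs reuse the IP addresses and leader count from the sample output above. Whether concentrating leaders on one server is a problem depends on the tenant's PRIMARY_ZONE setting.

```python
from typing import Dict, List


def leader_share(leader_counts: Dict[str, int], servers: List[str]) -> Dict[str, float]:
    """Return the fraction of leaders on each server, including servers with none."""
    total = sum(leader_counts.get(server, 0) for server in servers)
    if total == 0:
        return {server: 0.0 for server in servers}
    return {server: leader_counts.get(server, 0) / total for server in servers}


# Servers that host replicas, and leader counts per server (illustrative values).
servers = ["10.10.10.1", "10.10.10.2", "10.10.10.3"]
leaders = {"10.10.10.1": 5}

shares = leader_share(leaders, servers)
servers_without_leader = [server for server, share in shares.items() if share == 0.0]

for server, share in shares.items():
    print(f"{server}: {share:.0%} of leaders")
print("servers without a leader:", servers_without_leader)
```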
Others
Requests sent from the application server to OBServer nodes pass through several components, and the performance of any one of them can significantly affect overall performance. We recommend that you monitor the following items:
Physical resources: whether the resources of intermediate components are a bottleneck, for example, JVM memory, the CPU usage of the application server and OBProxy, and soft interrupts.
Routing: whether requests are routed correctly or are being forwarded to the wrong node.
Connection pooling: the number of persistent and short-lived connections, and the SocketTimeout setting.
Traffic distribution: whether the number of SQL requests handled by each OBServer node is significantly imbalanced.
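One simple way to quantify the traffic distribution check is the ratio of the busiest node's request count to the average across nodes. The Python sketch below assumes you have already collected per-node SQL request counts by some means. The function name, the sample counts, and the 1.5 threshold are all hypothetical and should be tuned for your workload.

```python
from statistics import mean
from typing import Dict


def imbalance_ratio(requests_per_server: Dict[str, int]) -> float:
    """Return max/mean of the request counts; 1.0 means perfectly even traffic."""
    counts = list(requests_per_server.values())
    if not counts:
        raise ValueError("at least one server is required")
    average = mean(counts)
    if average == 0:
        return 1.0  # no traffic at all, so nothing is skewed
    return max(counts) / average


# Hypothetical SQL request counts observed on each OBServer node.
sql_requests = {"10.10.10.1": 9000, "10.10.10.2": 1500, "10.10.10.3": 1500}

SKEW_THRESHOLD = 1.5  # hypothetical alert threshold
ratio = imbalance_ratio(sql_requests)
is_skewed = ratio > SKEW_THRESHOLD
print(f"imbalance ratio: {ratio:.2f}, skewed: {is_skewed}")
```

A ratio well above 1.0 suggests checking routing and leader placement before tuning individual SQL statements.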
