Load balancing is an important part of performance tuning. It encompasses load balancing among physical servers in a cluster and load balancing of business traffic in a cluster. A sound load balancing status ensures optimal performance by fully utilizing software and hardware resources. During a stress test, pay attention to the resource usage of each OBServer node in the cluster, including the CPU, I/O, and load. This topic describes load balancing in terms of cluster deployment and resource distribution.
Cluster deployment
Collect locations, latency, and bandwidth information for cluster deployment.
Location
Location information, such as the IDC and deployment methods, is essential to SQL route forwarding, transaction models, and performance. Location information includes the following parts:
Deployment method: the way in which an OceanBase cluster is deployed, such as three IDCs in the same region, three IDCs across two regions, and five IDCs across three regions.
Location of OceanBase Database Proxy (ODP) and other middleware: on the client or on the same server as an OBServer node. Different deployment methods have some effect on performance.
Location of application servers and other middleware.
Execute the following SQL statement to check the IDC and region where each zone in the cluster is located:
obclient [oceanbase]> SELECT * FROM oceanbase.DBA_OB_ZONES;
+-------+----------------------------+----------------------------+--------+------+----------------+-----------+
| ZONE | CREATE_TIME | MODIFY_TIME | STATUS | IDC | REGION | TYPE |
+-------+----------------------------+----------------------------+--------+------+----------------+-----------+
| zone1 | 2024-07-10 14:19:05.573991 | 2024-07-10 14:19:05.573991 | ACTIVE | | default_region | ReadWrite |
+-------+----------------------------+----------------------------+--------+------+----------------+-----------+
1 row in set
Latency
You can use the latency information to evaluate whether the response time of a single SQL statement meets your expectations. The latency information of a cluster includes:
The latency between IDCs
The latency between zones
The latency between ODP and an OBServer node
The latency between a client and ODP
Bandwidth
Check the bandwidths of the following components:
NIC bandwidth on the server where ODP is deployed
NIC bandwidth on the application server
NIC bandwidth and disk I/O bandwidth on an OBServer node
You can use commands such as ping, tsar, ethtool xxx, and ifconfig to obtain the information. The following section uses deployment with three full replicas in three IDCs in the same region as an example.
Resource distribution
Obtain information about tenant resource distribution for subsequent performance diagnostics.
Basic information about a tenant
The information includes the primary zone and locality. You can execute the following SQL statement to obtain the information:
obclient [oceanbase]> SELECT * FROM oceanbase.DBA_OB_TENANTS LIMIT 1\G
*************************** 1. row ***************************
TENANT_ID: 1
TENANT_NAME: sys
TENANT_TYPE: SYS
CREATE_TIME: 2024-07-10 14:19:05.397680
MODIFY_TIME: 2024-07-10 14:19:05.397680
PRIMARY_ZONE: RANDOM
LOCALITY: FULL{1}@zone1
PREVIOUS_LOCALITY: NULL
COMPATIBILITY_MODE: MYSQL
STATUS: NORMAL
IN_RECYCLEBIN: NO
LOCKED: NO
TENANT_ROLE: PRIMARY
SWITCHOVER_STATUS: NORMAL
SWITCHOVER_EPOCH: 0
SYNC_SCN: NULL
REPLAYABLE_SCN: NULL
READABLE_SCN: NULL
RECOVERY_UNTIL_SCN: NULL
LOG_MODE: NOARCHIVELOG
ARBITRATION_SERVICE_STATUS: DISABLED
UNIT_NUM: 1
COMPATIBLE: 4.3.2.0
MAX_LS_ID: 1
RESTORE_DATA_MODE: NULL
1 row in set
Resource allocation information
Here is a sample SQL statement:
obclient [oceanbase]> SELECT * FROM oceanbase.gv$ob_units LIMIT 1\G
*************************** 1. row ***************************
SVR_IP: 172.xx.xx.xx
SVR_PORT: 2882
UNIT_ID: 1
TENANT_ID: 1
ZONE: zone1
ZONE_TYPE: ReadWrite
REGION: default_region
MAX_CPU: 4
MIN_CPU: 4
MEMORY_SIZE: 5368709120
MAX_IOPS: 9223372036854775807
MIN_IOPS: 9223372036854775807
IOPS_WEIGHT: 4
LOG_DISK_SIZE: 17448304640
LOG_DISK_IN_USE: 2311121885
DATA_DISK_IN_USE: 88608768
STATUS: NORMAL
CREATE_TIME: 2024-07-10 14:18:25.157169
1 row in set
Total number of user partitions on a server and leader distribution
obclient [oceanbase]> SELECT svr_ip,count(1) FROM oceanbase. __all_virtual_ls_meta_table WHERE tenant_id=1002 GROUP BY svr_ip;
+---------------+----------+
| svr_ip | count(1) |
+---------------+----------+
| 10.10.10.1 | 1 |
| 10.10.10.2 | 1 |
| 10.10.10.3 | 1 |
+---------------+----------+
3 rows in set
obclient [oceanbase]> SELECT svr_ip,count(1) FROM oceanbase. __all_virtual_ls_meta_table WHERE tenant_id=1001 and role=1 GROUP BY svr_ip;
+---------------+----------+
| svr_ip | count(1) |
+---------------+----------+
| 10.10.10.1 | 5 |
+---------------+----------+
1 row in set
Others
Throughout the process from the application server sending a request to an OBServer node receiving it, any involved components deserve our attention, as any bottlenecks on these components can impact performance. Therefore, pay attention to:
Physical resources. Check for resource bottlenecks on each component in the intermediate link, such as the JVM memory, CPU utilization of the application server and ODP, and software interrupts.
Request routing. Check whether ODP can correctly route SQL requests and whether unauthorized forwarding occurs.
Connection pool. Check the numbers of long and short connections and the
SocketTimeoutparameter.Traffic balancing. Check whether the number of SQL requests received and processed by each OBServer node is seriously unbalanced.