Load balancing is an important part of performance tuning. It encompasses load balancing among physical servers in a cluster and load balancing of business traffic in a cluster. A sound load balancing status ensures optimal performance by fully utilizing software and hardware resources. During a stress test, pay attention to the resource usage of each OBServer node in the cluster, including the CPU, I/O, and load. This topic describes load balancing in terms of cluster deployment and resource distribution.
Cluster deployment
Collect locations, latency, and bandwidth information for cluster deployment.
Location
Location information, such as the IDC and deployment methods, is essential to SQL route forwarding, transaction models, and performance. Location information includes the following parts:
Deployment method: the way in which the OceanBase clusters are deployed, such as three IDCs in the same city, three IDCs across two regions, and five IDCs across three regions.
Location of OceanBase Database Proxy (ODP) and other middleware: on the client or on the same server as the OBServer node. Different deployment methods have some effect on performance.
Location of application servers and other middleware.
Run the following SQL statement to check the IDC and region where each zone in the cluster is located:
MySQL [oceanbase]> select zone, name, info from __all_zone where name in ('region', 'idc') ;
+------+--------+------+
| zone | name | info |
+------+--------+------+
| z1 | idc | |
| z1 | region | SZ |
| z2 | idc | |
| z2 | region | SZ |
| z3 | idc | |
| z3 | region | SZ |
+------+--------+------+
6 rows in set (0.00 sec)
Latency
You can use the latency information to evaluate whether the response time of a single SQL statement meets your expectations. The latency information of a cluster includes:
The latency between IDCs
The latency between zones
Latency between the ODP and an OBServer node
The latency between a client and ODP
Bandwidth
Check the bandwidths of the following components:
NIC bandwidth on the server where ODP is deployed
NIC bandwidth on the application server
NIC bandwidth and disk I/O bandwidth on the OBServer node
You can use commands such as ping, tsar, ethtool xxx, and ifconfig to obtain the information. The following section uses FFF deployment with three replicas in three IDCs in the same city as an example.
Resource distribution
Obtain information about tenant resource distribution for subsequent performance diagnostics.
Basic information about a tenant
The information includes the primary zone and locality. You can run the following SQL statement to obtain the information:
MySQL [oceanbase]> select * from __all_tenant limit 1\G;
*************************** 1. row ***************************
gmt_create: 2021-08-25 20:38:18.699528
gmt_modified: 2021-08-25 20:38:18.699528
tenant_id: 1
tenant_name: sys
zone_list: z1;z2;z3
primary_zone: z1;z2,z3
locked: 0
collation_type: 0
info: system tenant
locality: FULL{1}@z1, FULL{1}@z2, FULL{1}@z3
previous_locality:
default_tablegroup_id: -1
compatibility_mode: 0
drop_tenant_time: -1
status: TENANT_STATUS_NORMAL
in_recyclebin: 0
1 row in set
Resource allocation information
Here is a sample SQL statement:
OceanBase(root@oceanbase)> select * from gv$ob_units limit 1\G;
*************************** 1. row ***************************
SVR_IP: xx.xx.xx.xx
SVR_PORT: 59802
UNIT_ID: 1
TENANT_ID: 1
ZONE: zone1
MAX_CPU: 1
MIN_CPU: 1
MEMORY_SIZE: 17179869184
MAX_IOPS: 10000
MIN_IOPS: 10000
IOPS_WEIGHT: 1
LOG_DISK_SIZE: 17179869184
LOG_DISK_IN_USE: 81677640
DATA_DISK_IN_USE: 50331648
STATUS: NORMAL
CREATE_TIME: 2022-10-24 16:37:01.245584
1 row in set
Total number of user partitions on a server and leader distribution
MySQL [oceanbase]> select svr_ip,count(1) from __all_virtual_ls_meta_table where tenant_id=1002 group by svr_ip;
+---------------+----------+
| svr_ip | count(1) |
+---------------+----------+
| 10.10.10.1 | 1 |
| 10.10.10.2 | 1 |
| 10.10.10.3 | 1 |
+---------------+----------+
3 rows in set
MySQL [oceanbase]> select svr_ip,count(1) from __all_virtual_ls_meta_table where tenant_id=1001 and role=1 group by svr_ip;
+---------------+----------+
| svr_ip | count(1) |
+---------------+----------+
| 10.10.10.1 | 5 |
+---------------+----------+
1 row in set
Other variables
Throughout the process starting from the application server that initiates a request to the OBServer application server, any involved component deserves our attention. Any bottleneck on a component can affect the performance. Therefore, pay attention to:
Physical resources. Check for resource bottlenecks on each component in the intermediate link, such as the JVM memory, CPU utilization of the application server and ODP, and software interrupts.
Request routing. Check whether ODP can correctly route SQL requests and whether unauthorized forwarding occurs.
Connection pool. Check the numbers of long and short connections and the
SocketTimeoutparameter.Traffic balancing. Check whether the number of SQL statements received and processed by each OBServer node is seriously unbalanced.