OceanBase Database uses RootService to manage load balancing among the resource units of a tenant. Different types of replicas require different amounts of resources. RootService considers the CPU, disk usage, memory utilization, and IOPS of each resource unit when managing partitions. To make full use of the resources available on each node, load balancing evens out the utilization of all resource types across all nodes.
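As a rough illustration of how several resource dimensions can be folded into one comparable load figure per resource unit, the sketch below computes a weighted utilization score. The dimension names, weights, and the `unit_load` function are assumptions for illustration only, not the formula RootService actually uses.

```python
# A rough sketch, not RootService's actual formula: fold several resource
# dimensions of one resource unit into a single load score.
# The dimension names and weights are illustrative assumptions.
RESOURCE_WEIGHTS = {"cpu": 0.3, "disk": 0.3, "memory": 0.3, "iops": 0.1}

def unit_load(usage: dict, capacity: dict) -> float:
    """Combine per-dimension utilization (usage / capacity) into one score."""
    return sum(
        weight * usage[dim] / capacity[dim]
        for dim, weight in RESOURCE_WEIGHTS.items()
    )

# Example: a unit using half of its CPU and most of its disk.
print(unit_load(
    usage={"cpu": 2.0, "disk": 800, "memory": 16, "iops": 500},
    capacity={"cpu": 4.0, "disk": 1000, "memory": 32, "iops": 5000},
))
```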
The load balancing algorithm operates on units called load balancing groups, whose members are evenly distributed among the servers in a cluster. OceanBase Database has two major types of load balancing groups: all non-partitioned tables together form a default load balancing group, and a partitioned table contributes one or more load balancing groups depending on its partitioning strategy.
In tables that use a single-layer hash, range, or list partitioning strategy, all partitions of a table belong to a single load balancing group. In subpartitioned tables, if the layer-1 partitions are created through hash partitioning, they form one load balancing group. If hash partitioning is not used at layer 1 but is used at layer 2, the layer-2 partitions under each layer-1 partition form a separate load balancing group. For all other partitioning strategies, all partitions of a table form one load balancing group.
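These grouping rules can be restated in a short sketch. The `Table` structure, strategy strings, and group-key names below are hypothetical and serve only to summarize the rules; they are not OceanBase's internal representation.

```python
from dataclasses import dataclass

@dataclass
class Table:
    name: str
    layer1_strategy: str | None = None   # "hash", "range", "list", or None
    layer2_strategy: str | None = None   # set only for subpartitioned tables
    layer1_parts: int = 0
    layer2_parts: int = 0

def balancing_groups(table: Table) -> dict[str, list[str]]:
    """Return {group key: [partition names]} for one table, per the rules above."""
    if table.layer1_strategy is None:
        # Non-partitioned tables all share the default load balancing group.
        return {"default_group": [table.name]}
    if table.layer2_strategy is None or table.layer1_strategy == "hash":
        # Single-layer partitioning, or hash at layer 1:
        # the layer-1 partitions of the table form one group.
        parts = [f"{table.name}_p{i}" for i in range(table.layer1_parts)]
        return {f"{table.name}_group": parts}
    if table.layer2_strategy == "hash":
        # Non-hash layer 1 with hash layer 2: the layer-2 partitions under
        # each layer-1 partition form their own group.
        return {
            f"{table.name}_p{i}_group": [
                f"{table.name}_p{i}_sp{j}" for j in range(table.layer2_parts)
            ]
            for i in range(table.layer1_parts)
        }
    # Any other combination: all partitions of the table form one group.
    parts = [
        f"{table.name}_p{i}_sp{j}"
        for i in range(table.layer1_parts)
        for j in range(table.layer2_parts)
    ]
    return {f"{table.name}_group": parts}

# Example: range-partitioned at layer 1, hash-subpartitioned at layer 2.
print(balancing_groups(Table("orders", "range", "hash", layer1_parts=2, layer2_parts=4)))
```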
Within each load balancing group, load balancing is achieved by first distributing the partitions so that every resource unit holds the same number of partitions. The load on each unit is then calculated, and partitions are swapped between the most loaded and the least loaded units. This keeps both the partition counts and the load balanced. As partitions grow, the load on each resource unit changes dynamically, which triggers further migrations to keep disk usage balanced.
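A greatly simplified sketch of this two-phase idea follows, using a single disk-usage figure as the load of each partition. The `Unit` class and the two functions are illustrative assumptions, not the actual RootService algorithm.

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    name: str
    partitions: dict[str, int] = field(default_factory=dict)  # partition -> disk usage

    @property
    def load(self) -> int:
        return sum(self.partitions.values())

def equalize_counts(units: list[Unit]) -> None:
    """Phase 1: move partitions so every unit holds roughly the same number."""
    while True:
        units.sort(key=lambda u: len(u.partitions))
        small, big = units[0], units[-1]
        if len(big.partitions) - len(small.partitions) <= 1:
            break
        # Move the smallest partition from the fullest unit to the emptiest one.
        name, _ = min(big.partitions.items(), key=lambda kv: kv[1])
        small.partitions[name] = big.partitions.pop(name)

def swap_for_load(units: list[Unit]) -> None:
    """Phase 2: swap one partition pair between the most and least loaded
    units whenever the swap narrows the load gap, keeping counts unchanged."""
    while True:
        units.sort(key=lambda u: u.load)
        low, high = units[0], units[-1]
        gap = high.load - low.load
        best = None
        for hp, hs in high.partitions.items():
            for lp, ls in low.partitions.items():
                new_gap = abs(gap - 2 * (hs - ls))  # gap after swapping hp and lp
                if new_gap < gap and (best is None or new_gap < best[0]):
                    best = (new_gap, hp, lp)
        if best is None:
            break
        _, hp, lp = best
        high.partitions[lp], low.partitions[hp] = low.partitions.pop(lp), high.partitions.pop(hp)

# Example: three units with uneven initial placement.
units = [
    Unit("u1", {"p1": 60, "p2": 50, "p3": 10}),
    Unit("u2", {"p4": 20}),
    Unit("u3", {"p5": 30, "p6": 30}),
]
equalize_counts(units)
swap_for_load(units)
for u in units:
    print(u.name, len(u.partitions), u.load)
```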
OceanBase Database also introduces table groups to minimize distributed transactions. Tables that are frequently accessed together are assigned to the same table group. For example, suppose a general user information table "user" and a user item table "user_item" are both hash partitioned by user ID. By assigning them to the same table group, the system automatically places the "user" partition and the "user_item" partition containing the same user ID on the same server. In this way, operating on multiple tables for the same user does not produce cross-server transactions. This design combines the scalability of a distributed system with the ease of use and flexibility of a relational database, and it matches the working habits of database administrators.
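The colocation effect can be illustrated with a toy sketch (in OceanBase itself, table groups are defined through DDL rather than application code). The partition count, server list, and helper functions below are assumptions made purely for illustration.

```python
# Toy illustration, not OceanBase syntax: two tables hash partitioned on the
# same key with the same partition count map a given user ID to the same
# partition index, and the table group pins equal indexes to the same server.
PARTITION_COUNT = 8
SERVERS = ["server_a", "server_b", "server_c", "server_d"]

def partition_index(user_id: int) -> int:
    return hash(user_id) % PARTITION_COUNT   # same hash rule for both tables

def server_for(partition_idx: int) -> str:
    # The table group keeps "user" partition i and "user_item" partition i
    # on the same server, modeled here as a fixed index-to-server mapping.
    return SERVERS[partition_idx % len(SERVERS)]

user_id = 42
p = partition_index(user_id)
print("user rows for id 42 live in partition", p, "on", server_for(p))
print("user_item rows for id 42 live in partition", p, "on", server_for(p))
```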
For leaders, the overall principle of the balancing strategy is to elect leaders in real time within each load balancing group based on the distribution of replicas, so that all resource units in the primary_zone host the same number of leaders.
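A minimal sketch of this goal, assuming leaders of one load balancing group can simply be assigned round-robin over the primary_zone resource units (the real election also has to respect the current replica distribution, as stated above):

```python
from collections import Counter
from itertools import cycle

def assign_leaders(partitions: list[str], primary_zone_units: list[str]) -> dict[str, str]:
    """Round-robin the leaders of one load balancing group over the resource
    units in the primary_zone, so leader counts differ by at most one."""
    unit_cycle = cycle(primary_zone_units)
    return {part: next(unit_cycle) for part in partitions}

# Example: 8 partitions of the hypothetical "user" table, 3 units in the primary zone.
leaders = assign_leaders(
    partitions=[f"user_p{i}" for i in range(8)],
    primary_zone_units=["z1_unit1", "z1_unit2", "z1_unit3"],
)
print(Counter(leaders.values()))   # each unit leads either 2 or 3 partitions
```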