OceanBase Database provides horizontal scaling and dynamic data balancing capabilities for load balancing.
Horizontal scaling refers to the ability to adjust the number of service nodes to scale the service capacity. For example, expanding from a single service node to two service nodes increases the service capacity. Horizontal scaling also requires data redistribution: for example, distributing data evenly across two service nodes when expanding from one to two, or consolidating data onto a single service node when scaling in from two to one.
Dynamic data balancing refers to the ability to adjust data distribution without changing the number of service nodes, so that load stays balanced across service nodes. For example, as tables and partitions are created and dropped, the number of partitions on different service nodes can diverge significantly, causing load imbalance. Dynamic partition balancing redistributes partitions evenly across service nodes to restore balance.
Horizontal scaling
The storage capacity and read/write service capability of a tenant in OceanBase Database are mainly affected by the following two factors:
The first is the unit number, which refers to the number of units providing services in each zone.
You can scale the number of service nodes by increasing or decreasing the unit number, thereby achieving horizontal scaling of the read/write service and storage capacity.
For more information about how to scale a tenant by adjusting the unit number, see Scale a tenant by adjusting the unit number.
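As an illustration, adjusting the unit number is a single statement executed from the sys tenant. This is a sketch only: the tenant name tenant1 is a placeholder, and the exact syntax may vary by version.

```sql
-- Executed from the sys tenant; tenant1 is a hypothetical tenant name.
-- Scale out: grow each zone from 1 unit to 2 units.
ALTER RESOURCE TENANT tenant1 UNIT_NUM = 2;

-- Scale in: shrink back to 1 unit per zone; the load balancer
-- migrates data off the removed units automatically.
ALTER RESOURCE TENANT tenant1 UNIT_NUM = 1;
```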
The second is the primary zone, which refers to the list of zones providing read/write services.
You can scale the number of zones providing read/write services by increasing or decreasing the number of primary zones, thereby achieving horizontal scaling of the read/write service across zones.
For more information about how to scale a tenant by adjusting the primary zone, see Scale a tenant by adjusting the primary zone.
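Similarly, the primary zone can be adjusted from the sys tenant. A sketch, with tenant1 and the zone names as placeholders:

```sql
-- Executed from the sys tenant; tenant1 and zone names are hypothetical.
-- Expand the read/write service from one zone to two zones of equal
-- priority (a comma separates zones with the same priority).
ALTER TENANT tenant1 PRIMARY_ZONE = 'zone1,zone2';

-- Shrink back to a single primary zone.
ALTER TENANT tenant1 PRIMARY_ZONE = 'zone1';
```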
By dynamically adjusting the unit number and primary zone, you can achieve horizontal scaling of the read/write service capability of a tenant within and across zones. The load balancing feature will automatically adjust the log streams and partition distribution based on the configured service capabilities.
Partition balancing
Partition balancing refers to dynamically adjusting the distribution of table partitions, especially as tables and partitions change over time, to balance the number of partitions, partition weights, table group weights, and storage space across service nodes.
OceanBase Database supports various table types, including non-partitioned tables, partitioned tables, and subpartitioned tables. Each type of table has its own balancing strategy. To describe the balancing effect more conveniently, OceanBase Database divides different table partitions into balancing groups. Within each balancing group, the number of partitions and storage space should be balanced. Balancing groups are independent of each other, and the system automatically adjusts the distribution between balancing groups. By default, OceanBase Database uses the following partition balancing strategy:
- For partitioned tables: each partitioned table is an independent balancing group, and all partitions of the table are distributed evenly across service nodes.
- For subpartitioned tables: all subpartitions under each partition form an independent balancing group, and all subpartitions under that partition are distributed evenly across service nodes.
- For non-partitioned tables: all non-partitioned tables together form a single balancing group, and the non-partitioned tables are distributed evenly across service nodes.
To more flexibly describe the clustering and distribution relationships between different table data, OceanBase Database introduces the concept of table groups.
A table group is a logical concept that represents a collection of tables. Tables within a table group are physically stored close to each other. Tables with associative relationships often have the same partitioning rules. By grouping tables with the same partitioning rules together, you can achieve partition-wise joins, significantly optimizing read and write performance.
Starting from OceanBase Database V4.2.0, the SHARDING attribute of a table group is introduced to control the clustering and distribution relationships of data within the table group. The SHARDING attribute of a table group can be set to NONE, PARTITION, or ADAPTIVE.
You can choose the appropriate SHARDING attribute value based on your specific use case.
Scenario 1: All tables in the table group are clustered together
If you want to cluster tables of any type on a single machine to meet single-machine access requirements, you can use a table group with SHARDING = NONE.
The meaning of a table group with SHARDING = NONE is as follows:
- Supports adding tables of any type, including non-partitioned tables, partitioned tables, and subpartitioned tables.
- Clusters all partitions of all tables in the table group on a single machine.
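A sketch in MySQL-compatible mode; the table and table group names are hypothetical:

```sql
-- Create a table group whose tables are all clustered on one node.
CREATE TABLEGROUP tg_cluster SHARDING = 'NONE';

-- Any table type may join: a non-partitioned table ...
CREATE TABLE orders (order_id INT PRIMARY KEY, user_id INT)
    TABLEGROUP = tg_cluster;

-- ... and a partitioned table; all of its partitions are still
-- placed on the same node as the rest of the group.
CREATE TABLE order_items (item_id INT, order_id INT)
    TABLEGROUP = tg_cluster
    PARTITION BY HASH(item_id) PARTITIONS 4;
```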
Scenario 2: Data of tables in the table group is horizontally distributed
When a single machine cannot handle the data of a single business, you can distribute the data across multiple machines to achieve horizontal scaling. You can use a table group with SHARDING = PARTITION or SHARDING = ADAPTIVE to meet this requirement.
The meaning of a table group with SHARDING = PARTITION is as follows:
- Supports adding partitioned tables and subpartitioned tables.
- Partitioning requirements: the partitioning method of the primary partitions must be the same. For subpartitioned tables, the system checks only the partitioning method of the primary partitions, so partitioned tables and subpartitioned tables can coexist as long as their primary partitions use the same partitioning method.
- Partition alignment rules: partitions with the same primary partition value are clustered together, including the primary partitions of partitioned tables and the subpartitions of subpartitioned tables under the same primary partition.
- Distributes all tables in the table group by primary partition. For subpartitioned tables, all subpartitions under each primary partition are clustered together.
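A sketch of a table group with SHARDING = PARTITION; the names are hypothetical:

```sql
-- Create a table group distributed by primary partition.
CREATE TABLEGROUP tg_users SHARDING = 'PARTITION';

-- Both tables use the same primary partitioning method (HASH with
-- 8 partitions), so partitions with the same partition value are
-- clustered together while the group spreads across service nodes.
CREATE TABLE user_profile (user_id INT PRIMARY KEY, nickname VARCHAR(64))
    TABLEGROUP = tg_users
    PARTITION BY HASH(user_id) PARTITIONS 8;

CREATE TABLE user_login_log (user_id INT, login_time DATETIME)
    TABLEGROUP = tg_users
    PARTITION BY HASH(user_id) PARTITIONS 8;
```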
The meaning of a table group with SHARDING = ADAPTIVE is as follows:
- Supports adding partitioned tables or subpartitioned tables.
- Partitioning requirements: all tables must be partitioned tables, or all tables must be subpartitioned tables. For partitioned tables, the partitioning method of the primary partitions must be the same. For subpartitioned tables, the partitioning methods of both the primary partitions and the subpartitions must be the same.
- Partition alignment rules:
  - For partitioned tables: partitions with the same primary partition value are clustered together.
  - For subpartitioned tables: partitions with the same primary partition value and the same subpartition value are clustered together.
- Uses an adaptive distribution method: if all tables in the table group are partitioned tables, they are distributed by primary partition; if all tables are subpartitioned tables, they are distributed by subpartition under each primary partition.
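A sketch of a table group with SHARDING = ADAPTIVE; the names are hypothetical:

```sql
-- Create an adaptively distributed table group.
CREATE TABLEGROUP tg_adaptive SHARDING = 'ADAPTIVE';

-- All member tables here are partitioned tables with the same
-- partitioning method, so the group is distributed by primary
-- partition; partitions with the same user_id value stay together,
-- enabling partition-wise joins on user_id.
CREATE TABLE accounts (user_id INT PRIMARY KEY, balance DECIMAL(16,2))
    TABLEGROUP = tg_adaptive
    PARTITION BY HASH(user_id) PARTITIONS 16;

CREATE TABLE transactions (txn_id INT, user_id INT)
    TABLEGROUP = tg_adaptive
    PARTITION BY HASH(user_id) PARTITIONS 16;
```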
For more information about table groups and how to manage them, see Create and manage table groups (MySQL-compatible mode) and Create and manage table groups (Oracle-compatible mode).
Table group weight-based balancing
In MySQL-compatible mode of OceanBase Database, user tables can be aggregated by binding a database to a table group with SHARDING = NONE. This feature supports automatic aggregation of user tables within the same database and across multiple databases.
By setting weights for table groups with SHARDING = NONE, you can achieve table group weight balancing, which distributes the user tables of automatically aggregated databases according to their weights. Table group weight balancing runs as part of the partition balance (PARTITION_BALANCE) task. You can trigger it manually or wait for the scheduled partition balance task to trigger it.
Partition weight-based balancing
To further address partition hotspots within and across tables, OceanBase Database provides a partition weight-based balancing strategy. Partition weight refers to the relative proportion of resources such as CPU, memory, and disk occupied by different partitions. Partition weight balancing is a process within the partition balance (PARTITION_BALANCE) task. You can manually trigger it or wait for the scheduled partition balance task to trigger it.
The partition weight balancing algorithm uses partition movement and partition exchange to minimize the variance of the sum of partition weights across different user log streams.
Only partitions with weights set will participate in partition weight balancing. Unweighted partitions will still be balanced based on their count.
Parameters related to data load balancing
enable_rebalance

The tenant-level parameter enable_rebalance is used to control whether to perform load balancing between tenants when set in the sys tenant, and whether to perform load balancing within a tenant when set in a user tenant. The default value is true.

enable_transfer

The tenant-level parameter enable_transfer is used to control whether to perform transfer within a tenant. The default value is true. Specifically:

- If enable_rebalance is false, automatic load balancing is not performed, regardless of the value of enable_transfer.
- If enable_rebalance is true and enable_transfer is true, the system automatically adjusts the number of log streams within the tenant during tenant scaling. Through operations such as log stream splitting, merging, and transfer, it balances the leaders and partitions within the tenant.
- If enable_rebalance is true and enable_transfer is false, the system does not perform transfer and does not change the number of log streams during tenant scaling. It only balances the existing log streams as much as possible.
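For illustration, these parameters can be inspected and adjusted from a user tenant session. This is a sketch only; behavior may vary by version.

```sql
-- Executed from the user tenant. Inspect the current values.
SHOW PARAMETERS LIKE 'enable_rebalance';
SHOW PARAMETERS LIKE 'enable_transfer';

-- Keep balancing on, but forbid transfer: the system balances the
-- existing log streams without moving partitions between them.
ALTER SYSTEM SET enable_rebalance = true;
ALTER SYSTEM SET enable_transfer = false;
```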
partition_balance_schedule_interval

The tenant-level parameter partition_balance_schedule_interval is used to control the interval at which partition balancing tasks are generated. When enable_rebalance is set to true, the system automatically triggers partition balancing tasks at the interval specified by partition_balance_schedule_interval. The default value is 2h, and the valid range is [0s, +∞). A value of 0s disables partition balancing.

Notice: For V4.4.x versions, the default value of this parameter was changed to 0s starting from V4.4.1. For V4.4.1 and later versions, it is not recommended to use this parameter to control partition balancing. Instead, users can call the
DBMS_BALANCE.TRIGGER_PARTITION_BALANCE procedure to trigger partition balancing manually or on a schedule.

enable_database_sharding_none

The tenant-level parameter enable_database_sharding_none is used to control whether to enable automatic aggregation of user tables when creating a database. The default value is False, indicating that automatic aggregation of user tables is disabled by default when a database is created.