This topic describes how to migrate a replica. Replica migration involves only the location change of a replica within a zone. It does not change the number or type of replicas. Replica migration is applicable in scenarios where a node fails or the load is unbalanced.
Unit migration
Unit migration refers to the migration of units between nodes within the same zone. By migrating units under a tenant, you can migrate replicas of log streams under the tenant.
(Optional) Adjust the maximum wait time for unit garbage collection
After a unit migration is completed in OceanBase Database, the system automatically deletes the tenant on the source server and kills all sessions on the tenant. This may affect the execution of ongoing queries. To address this issue, OceanBase Database implements a smooth garbage collection (GC) mechanism for units. Once a unit migration is completed, the system starts a countdown for unit GC. During this countdown, the system does not accept new connection requests. If there are ongoing requests, the system waits for them to complete before performing GC.
However, if the GC process is interrupted by a long-running task, the GC may be delayed for an extended period. To ensure smooth GC, OceanBase Database uses the cluster-level configuration item unit_gc_wait_time to control the maximum wait time for unit GC. If the maximum wait time is exceeded, the system automatically performs GC. The default value of the configuration item unit_gc_wait_time is 1 minute. You can adjust this value based on your business needs.
To adjust the maximum wait time for unit GC, perform the following steps:
Log in to the sys tenant of the cluster as the root user.
Here is an example:
obclient -h172.30.xxx.xxx -P2883 -uroot@sys#obdemo -pxxxx -AView the current value of the
unit_gc_wait_timeconfiguration item.obclient> SHOW PARAMETERS LIKE '%unit_gc_wait_time%';Adjust the value of the
unit_gc_wait_timeconfiguration item.Here is an example:
obclient> ALTER SYSTEM SET unit_gc_wait_time = '1m';
Perform unit migration
Example: Migrate the unit with UNIT_ID = 1006 in the mq_t1 tenant to another node in the same zone.
Log in to the sys tenant of the cluster as the root user.
obclient -h172.30.xxx.xxx -P2883 -uroot@sys#obdemo -pxxxx -AEnter the
oceanbasedatabase.obclient(root@sys)[(none)]> USE oceanbase;Query the
tenant_idof themq_t1tenant.In this example, the
tenant_idof themq_t1tenant is 1004.obclient(root@sys)[oceanbase]> SELECT TENANT_ID, TENANT_NAME, TENANT_TYPE, PRIMARY_ZONE, LOCALITY, STATUS, TENANT_ROLE FROM oceanbase.DBA_OB_TENANTS WHERE TENANT_NAME = 'mq_t1';The query result is as follows:
+-----------+-------------+-------------+----------------------------+--------------------------------------------------------------------+--------+-------------+ | TENANT_ID | TENANT_NAME | TENANT_TYPE | PRIMARY_ZONE | LOCALITY | STATUS | TENANT_ROLE | +-----------+-------------+-------------+----------------------------+--------------------------------------------------------------------+--------+-------------+ | 1004 | mq_t1 | USER | sa128_obv4_1;sa128_obv4_2 | FULL{1}@sa128_obv4_1, FULL{1}@sa128_obv4_2, FULL{1}@@sa128_obv4_3 | NORMAL | PRIMARY | +-----------+-------------+-------------+----------------------------+--------------------------------------------------------------------+--------+-------------+ 1 row in setQuery the unit information of the
mq_t1tenant.In this example, the unit with
UNIT_ID = 1006is located on thexx.xx.xx.19node in thesa128_obv4_3zone.obclient(root@sys)[oceanbase]> SELECT UNIT_ID, TENANT_ID, STATUS, RESOURCE_POOL_ID, UNIT_GROUP_ID, ZONE, SVR_IP, SVR_PORT FROM oceanbase.DBA_OB_UNITS WHERE TENANT_ID = 1004;The query result is as follows:
+---------+-----------+--------+------------------+---------------+--------------+---------------+----------+ | UNIT_ID | TENANT_ID | STATUS | RESOURCE_POOL_ID | UNIT_GROUP_ID | ZONE | SVR_IP | SVR_PORT | +---------+-----------+--------+------------------+---------------+--------------+---------------+----------+ | 1004 | 1004 | ACTIVE | 1005 | 1002 | sa128_obv4_1 | xxx.xx.xxx.47 | 2882 | | 1005 | 1004 | ACTIVE | 1005 | 1002 | sa128_obv4_2 | xxx.xx.xxx.81 | 2882 | | 1006 | 1004 | ACTIVE | 1005 | 1002 | sa128_obv4_3 | xxx.xx.xxx.19 | 2882 | +---------+-----------+--------+------------------+---------------+--------------+---------------+----------+ 3 rows in setQuery the IP address of the OBServer node in the zone where the unit to be migrated is located.
In this example, the zone
sa128_obv4_3contains two OBServer nodes:xx.xx.xx.19andxx.xx.xx.158.obclient(root@sys)[oceanbase]> SELECT SVR_IP, SVR_PORT, ID, ZONE, SQL_PORT, STATUS FROM oceanbase.DBA_OB_SERVERS;The query result is as follows:
+--------------+----------+----+--------------+----------+--------+ | SVR_IP | SVR_PORT | ID | ZONE | SQL_PORT | STATUS | +--------------+----------+----+--------------+----------+--------+ | xx.xx.xx.81 | 2882 | 6 | sa128_obv4_2 | 2881 | ACTIVE | | xx.xx.xx.19 | 2882 | 4 | sa128_obv4_3 | 2881 | ACTIVE | | xx.xx.xx.158 | 2882 | 6 | sa128_obv4_3 | 2881 | ACTIVE | | xx.xx.xx.43 | 2882 | 6 | sa128_obv4_1 | 2881 | ACTIVE | | xx.xx.xx.106 | 2882 | 6 | sa128_obv4_2 | 2881 | ACTIVE | | xx.xx.xx.47 | 2882 | 6 | sa128_obv4_1 | 2881 | ACTIVE | +--------------+----------+----+--------------+----------+--------+ 6 rows in setMigrate the unit with
UNIT_ID = 1006in themq_t1tenant to another node in the same zone.obclient(root@sys)[oceanbase]> ALTER SYSTEM migrate unit = 1006 destination = 'xx.xx.xx.158:2882'; Query OK, 0 rows affectedQuery the migration status of the unit.
obclient(root@sys)[oceanbase]> SELECT * FROM oceanbase.DBA_OB_UNIT_JOBS WHERE JOB_TYPE = 'MIGRATE_UNIT';The query result is as follows:
+--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+--------------+-------------+ | JOB_ID | JOB_TYPE | JOB_STATUS | RESULT_CODE | PROGRESS | START_TIME | MODIFY_TIME | TENANT_ID | UNIT_ID | SQL_TEXT | EXTRA_INFO | RS_SVR_IP | RS_SVR_PORT | +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+--------------+-------------+ | 4 | MIGRATE_UNIT | INPROGRESS | NULL | 0 | 2023-01-04 17:22:02.208219 | 2023-01-04 17:22:02.208219 | 1004 | 1006 | NULL | NULL | xx.xx.xx.106 | 2882 | +--------+--------------+------------+-------------+----------+----------------------------+----------------------------+-----------+---------+----------+------------+--------------+-------------+ 1 row in setIf the value of
JOB_STATUSisINPROGRESS, the migration is in progress. If the value isSUCCESS, the migration is successful.Query the information of the unit after migration.
obclient(root@sys)[oceanbase]> SELECT UNIT_ID, TENANT_ID, STATUS, RESOURCE_POOL_ID, UNIT_GROUP_ID, ZONE, SVR_IP, SVR_PORT FROM oceanbase.DBA_OB_UNITS WHERE UNIT_ID = 1006;The query result is as follows:
+---------+-----------+--------+------------------+---------------+--------------+----------------+----------+ | UNIT_ID | TENANT_ID | STATUS | RESOURCE_POOL_ID | UNIT_GROUP_ID | ZONE | SVR_IP | SVR_PORT | +---------+-----------+--------+------------------+---------------+--------------+----------------+----------+ | 1006 | 1004 | ACTIVE | 1005 | 1002 | sa128_obv4_3 | xxx.xx.xxx.158 | 2882 | +---------+-----------+--------+------------------+---------------+--------------+----------------+----------+ 1 row in setThe query result shows that the IP address of the unit with
UNIT_ID = 1006has changed fromxx.xx.xx.19toxx.xx.xx.158, indicating that the unit migration was successful.
Manually migrate a replica
When a replica is not located in the same unit as its unit, you can use the ALTER SYSTEM MIGRATE REPLICA command to migrate the replica to the specified location.
Limitations
The sys tenant can perform replica migration operations on the log streams of all tenants. User tenants can only perform replica migration operations on their own log streams.
For the same log stream of a tenant, only one disaster recovery task can be executed at a time, except for replica migration tasks, which can be executed in parallel.
The sys tenant can check the
CDB_OB_LS_REPLICA_TASKSview to confirm if any disaster recovery tasks are running. User tenants can check theDBA_OB_LS_REPLICA_TASKSview to confirm if any disaster recovery tasks are running.Replica migration can only be performed between replicas in the same zone.
Prerequisites
Before performing a replica migration operation, ensure that the current user has the
ALTER SYSTEMprivilege. Otherwise, theALTER SYSTEM MIGRATE REPLICAcommand cannot be executed.Before querying the views, ensure that the current user has the
SELECTprivilege on the following views. Otherwise, the relevant information cannot be queried.DBA_OB_TENANTSDBA_OB_LS/CDB_OB_LSGV$OB_UNITSDBA_OB_LS_LOCATIONS/CDB_OB_LS_LOCATIONS
Before performing a replica migration operation, ensure that the target server has available resources (units) for the user and that the unit does not have a replica of the log stream.
If you need to execute multiple replica migration tasks simultaneously, set the value of the
replica_parallel_migration_modeconfiguration item toonfor the primary tenant. For the standby tenant, set the value of thereplica_parallel_migration_modeconfiguration item toonorauto. For more information about the tenant-level configuration itemreplica_parallel_migration_mode, see replica_parallel_migration_mode.
Procedure
Assume that you have a tenant named tenant1. You executed the ALTER SYSTEM MIGRATE UNIT= 1003 DESTINATION = '100.xx.xx.002:5072'; command to migrate the unit with the unit_id of 1003 to the destination server 100.xx.xxx.002:5072. However, due to an exception, the system failed to complete the migration of the replicas. You need to manually migrate the replicas to the destination server 100.xx.xxx.002:5072.
Log in to the cluster as the tenant.
Here is an example of the connection command:
obclient -h172.30.xxx.xxx -P2883 -uroot@tenant1#obdemo -pxxxx -AFor more information about how to connect to the database, see Overview of database connection (MySQL mode) and Overview of database connection (Oracle mode).
Obtain the tenant ID of the tenant to be operated on.
System tenant
obclient(root@sys)[oceanbase]> SELECT TENANT_NAME,TENANT_ID FROM oceanbase.DBA_OB_TENANTS WHERE TENANT_NAME='tenant1';User tenant
MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient(root@tenant1)[oceanbase]> SELECT TENANT_NAME,TENANT_ID FROM oceanbase.DBA_OB_TENANTS;Execute the following statement in Oracle mode:
obclient(SYS@tenant1)[SYS]> SELECT TENANT_NAME,TENANT_ID FROM SYS.DBA_OB_TENANTS;Here is an example of the query result:
+-------------+-----------+ | TENANT_NAME | TENANT_ID | +-------------+-----------+ | tenant1 | 1002 | +-------------+-----------+ 1 row in setThe tenant ID of the tenant is
1002based on the query result.For more information about the fields in the
DBA_OB_TENANTSview, see DBA_OB_TENANTS.
Query the log stream information of the tenant to be operated on.
System tenant
obclient(root@sys)[oceanbase]> SELECT * FROM oceanbase.CDB_OB_LS WHERE TENANT_ID=1002;For more information about the fields in the
CDB_OB_LSview, see CDB_OB_LS.User tenant
MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient(root@tenant1)[oceanbase]> SELECT * FROM oceanbase.DBA_OB_LS;Execute the following statement in Oracle mode:
obclient(SYS@tenant1)[SYS]> SELECT * FROM SYS.DBA_OB_LS;Here is an example of the query result:
+-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+----------------+ | LS_ID | STATUS | PRIMARY_ZONE | UNIT_GROUP_ID | LS_GROUP_ID | CREATE_SCN | DROP_SCN | SYNC_SCN | READABLE_SCN | FLAG | UNIT_LIST | +-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+----------------+ | 1 | NORMAL | z1;z2 | 0 | 0 | NULL | NULL | 1712455113896017202 | 1712455113896017202 | | | | 1001 | NORMAL | z1;z2 | 0 | 1001 | 1712125812893680165 | NULL | 1712455113896017202 | 1712455113896017202 | | 1002,1004,1006 | | 1002 | NORMAL | z1;z2 | 0 | 1002 | 1712125812908098857 | NULL | 1712455114041323052 | 1712455113540551113 | | 1001,1003,1005 | | 1003 | NORMAL | z1;z2 | 0 | 0 | 1712125828880850585 | NULL | 1712455113896017202 | 1712455113896017202 | DUPLICATE | | +-------+--------+--------------+---------------+-------------+---------------------+----------+---------------------+---------------------+-----------+----------------+ 4 rows in setBased on the query result, the tenant1 has one log stream of ID 1, one log stream of ID 1001, one log stream of ID 1002, and one broadcast log stream of ID 1003.
For more information about the fields in the
DBA_OB_LSview, see DBA_OB_LS.
Query the unit resources of the tenant to be operated on.
System tenant
obclient(root@sys)[oceanbase]> SELECT SVR_IP, SVR_PORT, UNIT_ID, TENANT_ID, ZONE, ZONE_TYPE, STATUS FROM oceanbase.GV$OB_UNITS WHERE TENANT_ID=1002;User tenant
MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient(root@tenant1)[oceanbase]> SELECT SVR_IP, SVR_PORT, UNIT_ID, TENANT_ID, ZONE, ZONE_TYPE, STATUS FROM oceanbase.GV$OB_UNITS;Execute the following statement in Oracle mode:
obclient(SYS@tenant1)[SYS]> SELECT SVR_IP, SVR_PORT, UNIT_ID, TENANT_ID, ZONE, ZONE_TYPE, STATUS FROM SYS.GV$OB_UNITS;Here is an example of the query result:
+----------------+----------+---------+-----------+-------+-----------+--------+ | SVR_IP | SVR_PORT | UNIT_ID | TENANT_ID | ZONE | ZONE_TYPE | STATUS | +----------------+----------+---------+-----------+-------+-----------+--------+ | 100.xx.xxx.012 | 5070 | 1002 | 1002 | z1 | ReadWrite | NORMAL | | 100.xx.xxx.002 | 5072 | 1004 | 1002 | z2 | ReadWrite | NORMAL | | 100.xx.xxx.003 | 5073 | 1003 | 1002 | z2 | ReadWrite | NORMAL | | 100.xx.xxx.001 | 5071 | 1001 | 1002 | z1 | ReadWrite | NORMAL | | 100.xx.xxx.004 | 5074 | 1006 | 1002 | z3 | ReadWrite | NORMAL | | 100.xx.xxx.005 | 5075 | 1005 | 1002 | z3 | ReadWrite | NORMAL | +----------------+----------+---------+-----------+-------+-----------+--------+ 6 rows in setBased on the query result, the tenant has available unit resources on the servers
100.xx.xxx.012:5070,100.xx.xxx.002:5072,100.xx.xxx.003:5073,100.xx.xxx.001:5071,100.xx.xxx.004:5074, and100.xx.xxx.005:5075. The unit with theunit_idof1003is located on the server100.xx.xxx.003:5073.For more information about the fields in the
GV$OB_UNITSview, see GV$OB_UNITS.
Based on the log stream information obtained in the preceding step, query the replica distribution of the log stream with the ID 1002.
System tenant
obclient(root@sys)[oceanbase]> SELECT * FROM oceanbase.CDB_OB_LS_LOCATIONS WHERE LS_ID=1002 AND TENANT_ID=1002;For more information about the fields in the
CDB_OB_LS_LOCATIONSview, see CDB_OB_LS_LOCATIONS.User tenant
MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient(root@tenant1)[oceanbase]> SELECT * FROM oceanbase.DBA_OB_LS_LOCATIONS WHERE LS_ID=1002;Execute the following statement in Oracle mode:
obclient(SYS@tenant1)[SYS]> SELECT * FROM SYS.DBA_OB_LS_LOCATIONS WHERE LS_ID=1002;Here is an example of the query result:
+----------------------------+----------------------------+-------+----------------+----------+----------+------+----------+-------------------------------------------------------------------------------------------------+----------------------+--------------+--------------+---------+ | CREATE_TIME | MODIFY_TIME | LS_ID | SVR_IP | SVR_PORT | SQL_PORT | ZONE | ROLE | MEMBER_LIST | PAXOS_REPLICA_NUMBER | REPLICA_TYPE | LEARNER_LIST | REBUILD | +----------------------------+----------------------------+-------+----------------+----------+----------+------+----------+-------------------------------------------------------------------------------------------------+----------------------+--------------+--------------+---------+ | 2023-11-05 23:52:15.716363 | 2023-11-05 23:52:24.212243 | 1002 | 100.xx.xxx.003 | 5073 | 5107 | z2 | LEADER | 100.xx.xxx.003:5073:1699199535653021,100.xx.xxx.012:5070:1,100.xx.xxx.004:5074:1699194477097063 | 3 | FULL | | FALSE | | 2023-11-05 22:05:20.420160 | 2023-11-05 23:52:24.807711 | 1002 | 100.xx.xxx.012 | 5070 | 5105 | z1 | FOLLOWER | NULL | NULL | FULL | | FALSE | | 2023-11-05 22:27:57.157395 | 2023-11-05 23:52:24.361105 | 1002 | 100.xx.xxx.004 | 5074 | 5109 | z3 | FOLLOWER | NULL | NULL | FULL | | FALSE | +----------------------------+----------------------------+-------+----------------+----------+----------+------+----------+-------------------------------------------------------------------------------------------------+----------------------+--------------+--------------+---------+ 3 rows in setBased on the query results, you need to migrate the replica on
100.xx.xxx.003:5073to another server. Since the replica is onz2, you need to find an available unit in the same zone based on the unit information obtained in the previous query. You can migrate the replica to100.xx.xxx.002:5072.
Execute the replica migration command.
The statement is as follows:
ALTER SYSTEM MIGRATE REPLICA LS [=] ls_id SOURCE [=] 'svr_ip:svr_port' DESTINATION [=] 'destination_ip:destination_port' [DATA_SOURCE [=] 'data_source'] [TENANT [=] 'tenant_name']The statement is used as follows:
ls_id: specifies the log stream ID of the replica to be migrated.svr_ip:svr_port: specifies the IP address and port number of the server where the replica to be migrated is located. Example:100.xx.xxx.003:5073.destination_ip:destination_port: specifies the IP address and port number of the target server for the replica to be migrated. Example:100.xx.xxx.002:5072.data_source: specifies the data source address of the replica. You can select any other replica in the log stream as the data source address. Choose the data source address based on your business requirements. Example:100.xx.xxx.004:5074.If the specified data source is unavailable, an error is returned. If you do not explicitly specify the data source, the system automatically selects an available data source.
tenant_name: specifies the tenant to be operated on. The system tenant can specify other tenants, and the user tenant can only specify itself. If you do not explicitly specify the tenant, the default tenant name is the current tenant. This statement does not support usingall,all_user, orall_metato specify all tenants, user tenants, or all Meta tenants.This statement can migrate only one replica of a log stream at a time. If you need to migrate multiple replicas, you must execute this statement multiple times.
Here is an example:
obclient> ALTER SYSTEM MIGRATE REPLICA LS = 1002 SOURCE='100.xx.xxx.003:5073' DESTINATION = '100.xx.xxx.002:5072'After the execution is successful, query the distribution of the log stream replicas again.
System tenant
obclient(root@sys)[oceanbase]> SELECT * FROM oceanbase.CDB_OB_LS_LOCATIONS WHERE LS_ID=1002 AND TENANT_ID=1002;User tenant
MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient(root@tenant1)[oceanbase]> SELECT * FROM oceanbase.DBA_OB_LS_LOCATIONS WHERE LS_ID=1002;Execute the following statement in Oracle mode:
obclient(SYS@tenant1)[SYS]> SELECT * FROM SYS.DBA_OB_LS_LOCATIONS WHERE LS_ID=1002;Here is an example of the query result:
+----------------------------+----------------------------+-------+----------------+----------+----------+------+----------+-------------------------------------------------------------------------------------------------+----------------------+--------------+--------------+---------+ | CREATE_TIME | MODIFY_TIME | LS_ID | SVR_IP | SVR_PORT | SQL_PORT | ZONE | ROLE | MEMBER_LIST | PAXOS_REPLICA_NUMBER | REPLICA_TYPE | LEARNER_LIST | REBUILD | +----------------------------+----------------------------+-------+----------------+----------+----------+------+----------+-------------------------------------------------------------------------------------------------+----------------------+--------------+--------------+---------+ | 2023-11-06 00:05:39.520972 | 2023-11-06 00:05:49.101503 | 1002 | 100.xx.xxx.002 | 5072 | 5106 | z2 | LEADER | 100.xx.xxx.002:5072:1699200339454143,100.xx.xxx.012:5070:1,100.xx.xxx.004:5074:1699194477097063 | 3 | FULL | | FALSE | | 2023-11-05 22:05:20.420160 | 2023-11-06 00:05:50.037494 | 1002 | 100.xx.xxx.012 | 5070 | 5105 | z1 | FOLLOWER | NULL | NULL | FULL | | FALSE | | 2023-11-05 22:27:57.157395 | 2023-11-06 00:05:49.484669 | 1002 | 100.xx.xxx.004 | 5074 | 5109 | z3 | FOLLOWER | NULL | NULL | FULL | | FALSE | +----------------------------+----------------------------+-------+----------------+----------+----------+------+----------+-------------------------------------------------------------------------------------------------+----------------------+--------------+--------------+---------+ 3 rows in setThe query result indicates that the replica of log stream 1002 on
100.xx.xxx.003:5073has been successfully migrated to100.xx.xxx.002:5072.
