There is a restriction on the execution of disaster recovery tasks: only one replica task can be executed at a time for a log stream of a tenant, excluding replica migration tasks. As a result, some emergency tasks may need to wait for the completion of the ongoing task before they can be executed. If you need to prioritize the execution of an emergency task, you can manually cancel the ongoing replica task by using the ALTER SYSTEM CANCEL REPLICA TASK command.
Limitations
The current version supports canceling only two types of replica tasks: adding replicas and migrating replicas.
Note
In the current version, for a replica migration task caused by Unit migration, if the destination of the migration task no longer has a Unit, the system automatically cancels the corresponding replica migration task. You do not need to manually cancel it.
In the sys tenant, you can cancel replica tasks of any tenant. In a user tenant, you can cancel only replica tasks of the current tenant.
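Given that only add-replica and migrate-replica tasks can be canceled, it can help to list just those tasks before deciding what to cancel. The following is a minimal sketch for a user tenant in MySQL-compatible mode. The `MIGRATE REPLICA` value appears in the sample output in this topic; the `ADD REPLICA` value is an assumption about how the view labels add-replica tasks and should be checked against the view reference for your version.

```sql
-- List only the replica tasks that this topic describes as cancelable.
-- 'ADD REPLICA' is an assumed TASK_TYPE label; verify it for your version.
SELECT LS_ID, TASK_TYPE, TASK_ID, TASK_STATUS, PRIORITY
FROM oceanbase.DBA_OB_LS_REPLICA_TASKS
WHERE TASK_TYPE IN ('ADD REPLICA', 'MIGRATE REPLICA');
```

In the sys tenant, the same filter would presumably be applied to `oceanbase.CDB_OB_LS_REPLICA_TASKS` instead.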
Prerequisites
Before executing the cancel replica task operation, ensure that the current user has the ALTER SYSTEM privilege; otherwise, the ALTER SYSTEM CANCEL REPLICA TASK statement cannot be executed.
Before querying the views, ensure that the current user has the SELECT privilege on the following views; otherwise, the relevant information cannot be queried.
DBA_OB_LS_REPLICA_TASKS/CDB_OB_LS_REPLICA_TASKS
DBA_OB_LS_REPLICA_TASK_HISTORY/CDB_OB_LS_REPLICA_TASK_HISTORY
Procedure
Assume that a read-only replica of log stream 1001 in a tenant named tenant1 is being migrated. The cluster encounters an exception, and a full-featured replica urgently needs to be added on an available OBServer node to ensure high availability. However, the read-only replica being migrated involves a large amount of data, so the migration task cannot complete soon, and the emergency operation to add a full-featured replica cannot be performed in time. In this case, you can manually cancel the disaster recovery task of the migrating read-only replica, as described in the following steps, and prioritize the emergency operation.
Log in to the cluster as the corresponding tenant.
Here is an example:
obclient -h172.30.xxx.xxx -P2883 -uroot@tenant1#obdemo -pxxxx -A
For more information about how to connect to a database, see Overview (MySQL-compatible mode) and Overview (Oracle-compatible mode).
View the ongoing disaster recovery tasks.
sys tenant
obclient [oceanbase]> SELECT * FROM oceanbase.CDB_OB_LS_REPLICA_TASKS\G
For more information about the fields in the CDB_OB_LS_REPLICA_TASKS view, see CDB_OB_LS_REPLICA_TASKS.
User tenant
Execute the following statement in MySQL-compatible mode:
obclient [oceanbase]> SELECT * FROM oceanbase.DBA_OB_LS_REPLICA_TASKS\G
Execute the following statement in Oracle-compatible mode:
obclient [SYS]> SELECT * FROM SYS.DBA_OB_LS_REPLICA_TASKS\G
A sample query result is as follows:
*************************** 1. row ***************************
LS_ID: 1001
TASK_TYPE: MIGRATE REPLICA
TASK_ID: Y13CE64586BD4-000610C5F3EDBBCB-0-0
TASK_STATUS: INPROGRESS
PRIORITY: LOW
TARGET_REPLICA_SVR_IP: 100.xx.xxx.002
TARGET_REPLICA_SVR_PORT: 5072
TARGET_PAXOS_REPLICA_NUMBER: 2
TARGET_REPLICA_TYPE: FULL
SOURCE_REPLICA_SVR_IP: 100.xx.xxx.003
SOURCE_REPLICA_SVR_PORT: 5073
SOURCE_PAXOS_REPLICA_NUMBER: 2
SOURCE_REPLICA_TYPE: FULL
DATA_SOURCE_SVR_IP: 100.xx.xxx.003
DATA_SOURCE_SVR_PORT: 5073
IS_MANUAL: FALSE
TASK_EXEC_SVR_IP: 100.xx.xxx.002
TASK_EXEC_SVR_PORT: 5072
CREATE_TIME: 2024-02-07 15:23:04
START_TIME: 2024-02-07 15:23:04
MODIFY_TIME: 2024-02-07 15:23:04
COMMENT: migrate replica due to unit group not match
CONFIG_VERSION:
1 row in set
The query result shows that a disaster recovery task with TASK_TYPE of MIGRATE REPLICA is in progress. Record its TASK_ID.
Execute the cancel replica task command.
The syntax is as follows:
ALTER SYSTEM CANCEL REPLICA TASK TASK_ID [=] 'task_id' [TENANT [=] 'tenant_name'];
Description of statement parameters:
task_id: the TASK_ID of the replica task to be canceled.
tenant_name: the name of the target tenant. You can specify another tenant in the sys tenant and specify only the current tenant in a user tenant. If this parameter is not explicitly specified, the current tenant is used. You cannot use all, all_user, or all_meta in this statement to specify all tenants, all user tenants, or all meta tenants.
This statement cancels only one disaster recovery task of the tenant at a time.
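To illustrate the optional TENANT clause, the following sketch shows how the statement might be issued from the sys tenant to cancel a task that belongs to another tenant. It assumes that the session is connected to the sys tenant and that the task ID from the sample output in this topic belongs to tenant1.

```sql
-- Run in the sys tenant: cancel one replica task that belongs to tenant1.
-- Without the TENANT clause, the statement targets the current tenant.
ALTER SYSTEM CANCEL REPLICA TASK
  TASK_ID = 'Y13CE64586BD4-000610C5F3EDBBCB-0-0'
  TENANT = 'tenant1';
```

Because each statement cancels a single task, canceling several tasks means issuing one statement per TASK_ID.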
Example:
obclient> ALTER SYSTEM CANCEL REPLICA TASK TASK_ID = 'Y13CE64586BD4-000610C5F3EDBBCB-0-0';
View the result of canceling the replica task.
sys tenant
obclient [oceanbase]> SELECT * FROM oceanbase.CDB_OB_LS_REPLICA_TASK_HISTORY WHERE TASK_ID = 'Y13CE64586BD4-000610C5F3EDBBCB-0-0'\G
For more information about the fields in the CDB_OB_LS_REPLICA_TASK_HISTORY view, see CDB_OB_LS_REPLICA_TASK_HISTORY.
User tenant
Execute the following statement in MySQL-compatible mode:
obclient [oceanbase]> SELECT * FROM oceanbase.DBA_OB_LS_REPLICA_TASK_HISTORY WHERE TASK_ID = 'Y13CE64586BD4-000610C5F3EDBBCB-0-0'\G
Execute the following statement in Oracle-compatible mode:
obclient [SYS]> SELECT * FROM SYS.DBA_OB_LS_REPLICA_TASK_HISTORY WHERE TASK_ID = 'Y13CE64586BD4-000610C5F3EDBBCB-0-0'\G
A sample query result is as follows:
*************************** 1. row ***************************
LS_ID: 1001
TASK_TYPE: MIGRATE REPLICA
TASK_ID: Y13CE64586BD4-000610C5F3EDBBCB-0-0
TASK_STATUS: CANCELED
PRIORITY: LOW
TARGET_REPLICA_SVR_IP: 100.xx.xxx.002
TARGET_REPLICA_SVR_PORT: 5072
TARGET_PAXOS_REPLICA_NUMBER: 2
TARGET_REPLICA_TYPE: FULL
SOURCE_REPLICA_SVR_IP: 100.xx.xxx.003
SOURCE_REPLICA_SVR_PORT: 5073
SOURCE_PAXOS_REPLICA_NUMBER: 2
SOURCE_REPLICA_TYPE: FULL
DATA_SOURCE_SVR_IP: 100.xx.xxx.003
DATA_SOURCE_SVR_PORT: 5073
IS_MANUAL: FALSE
TASK_EXEC_SVR_IP: 100.xx.xxx.002
TASK_EXEC_SVR_PORT: 5072
CREATE_TIME: 2024-02-07 15:54:35
START_TIME: 2024-02-07 15:54:31
MODIFY_TIME: 2024-02-07 15:54:35
FINISH_TIME: 2024-02-07 15:54:34
EXECUTE_RESULT: ret:-4072, OB_CANCELED; elapsed:4109659; comment:[storage] receive task reply from storage rpc;
COMMENT: migrate replica due to unit not match
CONFIG_VERSION:
1 row in set
In the query result, TASK_STATUS is CANCELED, which indicates that the task has been canceled.
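When only the outcome matters, the history query can be narrowed to the columns that describe it. The following is a minimal sketch for a user tenant in MySQL-compatible mode; it uses only column names that appear in the sample output in this topic.

```sql
-- Show just the status and execution result of the canceled task.
SELECT LS_ID, TASK_TYPE, TASK_STATUS, FINISH_TIME, EXECUTE_RESULT
FROM oceanbase.DBA_OB_LS_REPLICA_TASK_HISTORY
WHERE TASK_ID = 'Y13CE64586BD4-000610C5F3EDBBCB-0-0';
```

A row with TASK_STATUS of CANCELED and an EXECUTE_RESULT containing OB_CANCELED would match the sample result shown in this topic.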