When the primary tenant becomes unavailable, you can switch a standby tenant to the primary tenant role to continue providing services.
Limitations
When you perform failover, all log stream replicas of the standby tenant must be online. Otherwise, you must wait for the corresponding replicas to go offline permanently.
You can log in to the standby tenant or the
systenant of the cluster where the standby tenant resides and query theDBA_OB_LSorCDB_OB_LSview to check whether all log stream replicas are online. For more information about log stream replicas, see Replica introduction.Failover does not check the status of the corresponding primary tenant. Therefore, in addition to disaster recovery when the primary tenant fails, failover can also be used for scenarios such as creating a snapshot of an independent primary tenant at a specific point in time for subsequent business verification.
You cannot execute the failover command in a session established by using a service name. For more information about services, see Create a service.
Notice
When you perform failover, if the tenant on which the failover command is to be executed has a service name, the system automatically deletes the service name of that tenant.
Background information
The failover operation modifies only log files, not data files.
Because each OceanBase Database tenant has multiple log streams and the failover operation must achieve data consistency after completion, the system selects the smallest SCN among the synchronization positions of all log streams as the failover execution position. After the failover operation, all log streams of the tenant are rolled back to that position.
Procedure
Log in to the standby tenant or to the
systenant of the cluster where the standby tenant resides as the administrator.Execute the
ACTIVATE STANDBY VERIFYcommand to verify whether theACTIVATE STANDBYcommand can be executed successfully.The
systenant of the cluster where the standby tenant resides executes theACTIVATE STANDBY VERIFYcommandALTER SYSTEM ACTIVATE STANDBY TENANT [=] tenant_name VERIFY;Example:
ALTER SYSTEM ACTIVATE STANDBY TENANT = mysql VERIFY;The standby tenant executes the
ACTIVATE STANDBY VERIFYcommandALTER SYSTEM ACTIVATE STANDBY VERIFY;If the command returns
OK, the verification passes and you can proceed to the next step.If an error occurs, refer to Switchover or failover issues for handling based on the error message, and then retry the command.
After the verification passes, execute the
ACTIVATE STANDBYcommand to switch the standby tenant to the primary tenant role.The
systenant of the cluster where the standby tenant resides switches the standby tenant to the primary tenantALTER SYSTEM ACTIVATE STANDBY TENANT = tenant_name;The standby tenant switches itself to the primary tenant
ALTER SYSTEM ACTIVATE STANDBY;
Query the
DBA_OB_TENANTSview to confirm whether the standby tenant has been switched to the primary tenant role.The
systenant of the cluster where the standby tenant resides queries the viewobclient [oceanbase]> SELECT TENANT_NAME, TENANT_TYPE, TENANT_ROLE, SWITCHOVER_STATUS FROM oceanbase.DBA_OB_TENANTS;The standby tenant queries the view
MySQL-compatible mode:
obclient [oceanbase]> SELECT TENANT_NAME, TENANT_TYPE, TENANT_ROLE, SWITCHOVER_STATUS FROM oceanbase.DBA_OB_TENANTS;Oracle-compatible mode:
obclient [SYS]> SELECT TENANT_NAME, TENANT_TYPE, TENANT_ROLE, SWITCHOVER_STATUS FROM SYS.DBA_OB_TENANTS;
A sample query result is as follows:
+-----------------+-------------+-------------+-------------------+ | TENANT_NAME | TENANT_TYPE | TENANT_ROLE | SWITCHOVER_STATUS | +-----------------+-------------+-------------+-------------------+ | standby_tenant | USER | PRIMARY | NORMAL | +-----------------+-------------+-------------+-------------------+ 1 row in setIn the query result, if the standby tenant's
TENANT_ROLEisPRIMARYandSWITCHOVER_STATUSisNORMAL, the standby-to-primary switchover is successful.
Considerations after failover
When a tenant is switched to a primary tenant by executing the failover command, if that tenant has a service name, the system automatically deletes the service name of that tenant. However, the system does not delete the service name of the original primary tenant.
For example, under service name
service1, there is a primary tenanttenantAand its corresponding standby tenanttenantB. When the failover command is executed ontenantBto switch it to a primary tenant, because it is no longer part of the original primary/standby relationship, the system deletestenantB's service nameservice1, whiletenantA's service nameservice1is retained.Because the failover operation restores all synchronized data of all log streams to a consistency position, ensuring that data of all log streams before that position is complete, after the failover operation is performed:
- The original primary tenant cannot be demoted to a standby tenant and then connected as a standby tenant of the new primary tenant.
What to do next: Connect the original primary tenant's standby tenants to the new primary tenant
OceanBase supports connecting the original primary tenant's standby tenants to the new primary tenant after failover through the standby tenant log flashback maintenance command:
- The standby tenant to be connected can be one that existed before the original primary tenant failed.
- It can also be one restored from the original primary tenant's physical backup after the original primary tenant failed.
Syntax
The standby tenant log flashback maintenance command supports flashing back the standby tenant's logs to a specified position. This specified position must be greater than or equal to the standby tenant's synchronization position.
ALTER SYSTEM FLASHBACK STANDBY LOG TO SCN = $flashback_log_scn [TENANT = 'tenant_name'];
The parameters are described as follows:
flashback_log_scn: the specified position to which the standby tenant's logs are flashed back.tenant_name: the tenant name. If you log in by directly connecting to the tenant, this parameter can be omitted.
Considerations
Before performing the following operations, ensure that the following conditions are met:
- Before using this command, ensure that the standby tenant has stopped synchronization and the restore source is empty.
- No tenant is allowed to have valid replicas on offline machines.
- Only standby tenants can perform log flashback. The tenant's
STATUSandSWITCHOVER_STATUSmust both beNORMAL.SWITCHOVER_STATUScan also beFLASHBACK_AND_STAY_STANDBY_STATUS. - The log flashback position to be set must be greater than or equal to the tenant's synchronization position.
- This command automatically clears the tenant's service name (if any).
- When the tenant's
SWITCHOVER_STATUSisFLASHBACK_AND_STAY_STANDBY_STATUS, the tenant is not allowed to continue synchronization or modify the tenant's restore source.
Procedure
Stop the standby tenant's synchronization, then confirm that the tenant's
SYNC_SCNequalsRECOVERY_UNTIL_SCN.ALTER SYSTEM RECOVER STANDBY [TENANT [=] tenant_name] CANCEL; SELECT TENANT_ID, SYNC_SCN, RECOVERY_UNTIL_SCN FROM DBA_OB_TENANTS WHERE TENANT_ID=xxxx;Set the standby tenant's restore source to empty, and query the
CDB_OB_LOG_RESTORE_SOURCEview to confirm the modification succeeded.ALTER SYSTEM SET LOG_RESTORE_SOURCE = '' [TENANT [=] tenant_name]; SELECT * FROM CDB_OB_LOG_RESTORE_SOURCE WHERE TENANT_ID = xxxx; -- Or query in a user tenant SELECT * FROM DBA_OB_LOG_RESTORE_SOURCE;Confirm the log flashback position. The maintenance command has only one requirement for the log flashback position: it must be greater than or equal to the tenant's synchronization position. For how to set a meaningful value, see the "Scenario best practices" section below. The procedure here assumes the log flashback position is any value greater than or equal to the tenant's synchronization position.
Query the
DBA_OB_TENANTSview- Confirm that the log flashback position to be set is greater than or equal to the standby tenant's
SYNC_SCN - Confirm that the standby tenant's
STATUSisNORMALandSWITCHOVER_STATUSisNORMALorFLASHBACK_AND_STAY_STANDBY_STATUS.
SELECT TENANT_ID, SYNC_SCN, STATUS, SWITCHOVER_STATUS FROM DBA_OB_TENANTS WHERE TENANT_ID=xxxx;- Confirm that the log flashback position to be set is greater than or equal to the standby tenant's
Ensure that no tenant has valid replicas on offline machines.
SELECT COUNT(*) -> FROM CDB_OB_LS_LOCATIONS -> WHERE (SVR_IP, SVR_PORT) IN ( -> SELECT SVR_IP, SVR_PORT -> FROM DBA_OB_SERVERS -> WHERE STATUS = 'INACTIVE' -> ) -> AND TENANT_ID = xxxx; /* A return value of 0 indicates no valid replicas on offline machines */Execute the standby tenant log flashback maintenance command.
ALTER SYSTEM FLASHBACK STANDBY LOG TO SCN = $flashback_log_scn [TENANT = 'tenant_name'];
Scenario best practices
This section describes some best practices for failover scenarios.
Notice
We recommend using OCP to implement the following best practices.
Select the new primary tenant and take over other standby tenants
In a one-primary-multiple-standby scenario, identify the standby tenant with the largest synchronization position, perform failover on it to make it the primary tenant, and then connect the remaining standby tenants to the new primary tenant. This avoids the situation where the other standby tenants of the original primary tenant become unusable after a new primary tenant exists. You need to complete the following two operations:
Select the new primary tenant
- Send stop synchronization commands to all standby tenants under the original primary tenant.
- Set
LOG_RESTORE_SOURCEto empty for all standby tenants under the primary tenant. - Read the synchronization positions of all standby tenants, find the standby tenant with the largest synchronization position, and perform failover on it to make it the primary tenant. If you need to specify a particular standby tenant as the new primary tenant, standby tenants with a synchronization position greater than the new primary tenant cannot be connected to the new primary tenant.
Connect standby tenants to the new primary tenant via FLASHBACK
- Confirm the new primary tenant's failover position by querying the
DBA_OB_TENANTSview for the new primary tenant'sFLASHBACK_LOG_SCN. - Send
FLASHBACKcommands to the remaining standby tenants, setting theirFLASHBACK_SCNto the new primary tenant's failover position. - Set the remaining standby tenants'
LOG_RESTORE_SOURCEto the new primary tenant. - Set the standby tenants' restore endpoint to
UNLIMITED. - Set the same service name for the standby tenants as the new primary tenant (if any).
- Confirm the new primary tenant's failover position by querying the
Create a standby tenant for the new primary tenant from the original primary tenant's physical backup
Use the original primary tenant's physical backup to create a standby tenant for the new primary tenant. This reuses the original primary tenant's physical backup effectively, avoiding the need for the new primary tenant to wait for physical backup to complete before creating a standby tenant, and avoiding the situation where the original primary tenant's physical backup has no consumption value. When building a standby tenant for the primary tenant, you have an additional option: restore a standby tenant from the original primary tenant's backup. This scenario is the same as the one above, except that the standby tenant is created at a different time. You need to complete the following two operations:
Create a standby tenant based on the original primary tenant's backup
- Confirm the new primary tenant's failover position by querying the
DBA_OB_TENANTSview for the tenant'sFLASHBACK_LOG_SCN. - Restore a standby tenant from the original primary tenant's backup and archive logs, specifying the restore position as the new primary tenant's failover position.
- Confirm the new primary tenant's failover position by querying the
Execute step 2 of the previous scenario: Connect standby tenants to the new primary tenant via FLASHBACK.