When the primary tenant is unavailable, you can fail over to the standby tenant to continue providing services.
Limitations
During a failover, all replicas of the log streams in the standby tenant must be online. Otherwise, you must wait for the corresponding replicas to go offline permanently.
The
systenant of the standby tenant or the standby tenant cluster can query theDBA_OB_LSorCDB_OB_LSview to check whether all replicas of the log streams are online. For more information about log stream replicas, see Replica introduction.During a failover, the status of the primary tenant is not checked. Therefore, in addition to disaster recovery scenarios where the primary tenant is faulty, a failover operation can also be used to create a snapshot of an independent primary tenant at a specific point in time for subsequent business verification.
You cannot execute a failover command in a session established by using a service name. For more information about services, see Create a service.
Notice
During a failover, if the tenant to which the failover command is to be executed has a service name, the system automatically deletes the service name of the tenant.
Background information
The failover operation only modifies the log files and does not modify the data files.
Since each tenant in OceanBase Database has multiple log streams, and the failover operation requires the system to reach a consistent state after the operation is completed, the system selects the smallest SCN among all the synchronized positions of the log streams as the execution point for the failover. After the failover operation is executed, all log streams under the tenant will be rolled back to this point.
Procedure
Log in to the
systenant of the standby cluster or the standby tenant as the administrator.Execute the
ACTIVATE STANDBY VERIFYstatement to verify whether theACTIVATE STANDBYstatement can be executed.Execute the
ACTIVATE STANDBY VERIFYstatement in thesystenant of the standby cluster.ALTER SYSTEM ACTIVATE STANDBY TENANT [=] tenant_name VERIFY;Example:
obclient(root@sys)[(none)]> ALTER SYSTEM ACTIVATE STANDBY TENANT = mysql VERIFY;Execute the
ACTIVATE STANDBY VERIFYstatement in the standby tenant.obclient> ALTER SYSTEM ACTIVATE STANDBY VERIFY;If the statement returns
OK, the verification is successful and you can proceed to the next step.If an error is returned, handle the error based on the error message and then re-execute the statement.
After the verification is successful, execute the
ACTIVATE STANDBYstatement to switch the standby tenant to the primary tenant.Switch the standby tenant to the primary tenant in the
systenant of the standby cluster.ALTER SYSTEM ACTIVATE STANDBY TENANT = tenant_name;Switch the standby tenant to the primary tenant in the standby tenant.
ALTER SYSTEM ACTIVATE STANDBY;
Query the
DBA_OB_TENANTSview to confirm whether the standby tenant has been switched to the primary tenant.Query the view in the
systenant of the standby cluster.obclient(root@sys)[(none)]> SELECT TENANT_NAME, TENANT_TYPE, TENANT_ROLE, SWITCHOVER_STATUS FROM oceanbase.DBA_OB_TENANTS;Query the view in the standby tenant.
In MySQL mode:
obclient [oceanbase]> SELECT TENANT_NAME, TENANT_TYPE, TENANT_ROLE, SWITCHOVER_STATUS FROM oceanbase.DBA_OB_TENANTS;In Oracle mode:
obclient [SYS]> SELECT TENANT_NAME, TENANT_TYPE, TENANT_ROLE, SWITCHOVER_STATUS FROM SYS.DBA_OB_TENANTS;
The query result is as follows:
+-----------------+-------------+-------------+-------------------+ | TENANT_NAME | TENANT_TYPE | TENANT_ROLE | SWITCHOVER_STATUS | +-----------------+-------------+-------------+-------------------+ | standby_tenant | USER | PRIMARY | NORMAL | +-----------------+-------------+-------------+-------------------+ 1 row in setIf the
TENANT_ROLEof the original standby tenant isPRIMARYand theSWITCHOVER_STATUSisNORMAL, the standby tenant has been successfully switched to the primary tenant.
Considerations after failover
During the failover process, if the tenant has a service name, the system will automatically delete the service name of the tenant. However, the service name of the original primary tenant will not be deleted.
For example, if service1 has a primary tenant tenantA and its corresponding standby tenant tenantB, when you execute the failover command on tenantB to switch it to the primary tenant, the system will delete the service name service1 of tenantB because it is no longer the primary tenant. However, the service name service1 of tenantA will be retained.
After a failover operation, all logs that have been synchronized are restored to a consistent point in time, ensuring that all data before this point is complete. Therefore, after a failover operation:
- The original primary tenant cannot be demoted to a standby tenant of the new primary tenant.
Next steps: Connect a standby tenant of the original primary tenant to the new primary tenant
After a failover, OceanBase Database allows you to connect a standby tenant of the original primary tenant to the new primary tenant by using the maintenance command to truncate the logs of the standby tenant to a specified position. The standby tenant to be connected can be one of the following two types:
A standby tenant that existed before the original primary tenant failed
A standby tenant restored from the physical backup of the original primary tenant after the original primary tenant failed
Syntax
The flashback log command allows you to truncate the logs of a standby tenant to a specified point in time, which must be greater than or equal to the synchronization point of the standby tenant.
ALTER SYSTEM FLASHBACK STANDBY LOG TO SCN = $flashback_log_scn [TENANT = 'tenant_name'];
The parameters are described as follows:
flashback_log_scn: the point in time to which the logs of the standby tenant are truncated.tenant_name: the name of the standby tenant. This parameter is optional if you are logged in as the tenant itself.
Considerations
Before performing subsequent operations, ensure the following conditions are met:
- The standby tenant must be stopped from synchronizing, and its recovery source must be empty.
- The standby tenant must not have any valid replicas on the offline server.
- Only the standby tenant can perform log truncation. The
STATUSandSWITCHOVER_STATUSof the tenant must both beNORMAL. Additionally, theSWITCHOVER_STATUScan also beFLASHBACK_AND_STAY_STANDBY_STATUS. - The log truncation point must be greater than or equal to the synchronization point of the tenant.
- This command will automatically clear the service name of the tenant (if any).
- If the
SWITCHOVER_STATUSof the tenant isFLASHBACK_AND_STAY_STANDBY_STATUS, the tenant cannot continue to synchronize or modify its recovery source.
Procedure
Stop the standby tenant from synchronizing and confirm that the
SYNC_SCNof the standby tenant is equal to theRECOVERY_UNTIL_SCN.Execute the following command to stop the log synchronization of the standby tenant.
ALTER SYSTEM RECOVER STANDBY [TENANT [=] tenant_name] CANCEL;Here is an example:
Stop the log synchronization of the standby tenant in the
systenant of the cluster where the standby tenant is located.obclient(root@sys)[(none)]> ALTER SYSTEM RECOVER STANDBY TENANT = standby3 CANCEL;Stop the log synchronization of the standby tenant.
obclient> ALTER SYSTEM RECOVER STANDBY CANCEL;
Execute the following command to confirm whether the
SYNC_SCNof the standby tenant is equal to theRECOVERY_UNTIL_SCN.Confirm in the
systenant of the cluster where the standby tenant is located.obclient(root@sys)[(none)]> SELECT TENANT_ID, SYNC_SCN, RECOVERY_UNTIL_SCN FROM oceanbase.DBA_OB_TENANTS WHERE TENANT_ID=xxxx;Confirm in the standby tenant.
MySQL mode:
obclient(root@standby3)[(none)]> SELECT TENANT_ID, SYNC_SCN, RECOVERY_UNTIL_SCN FROM oceanbase.DBA_OB_TENANTS;Oracle mode:
obclient(sys@standby3)[SYS]> SELECT TENANT_ID, SYNC_SCN, RECOVERY_UNTIL_SCN FROM SYS.DBA_OB_TENANTS;
Set the recovery source of the standby tenant to empty and query the
CDB_OB_LOG_RESTORE_SOURCEview to confirm whether the modification is successful.Execute the following command to set the recovery source of the standby tenant to empty.
The statement is as follows:
ALTER SYSTEM SET LOG_RESTORE_SOURCE = '' [TENANT [=] tenant_name];Here is an example:
Set the recovery source of the standby tenant to empty in the
systenant of the cluster where the standby tenant is located.obclient(root@sys)[(none)]> ALTER SYSTEM SET LOG_RESTORE_SOURCE = '' TENANT = standby3;Set the recovery source of the standby tenant to empty.
obclient> ALTER SYSTEM SET LOG_RESTORE_SOURCE = '';
Query the
CDB_OB_LOG_RESTORE_SOURCEview to confirm whether the modification is successful.Confirm in the
systenant of the cluster where the standby tenant is located.obclient(root@sys)[(none)]> SELECT * FROM oceanbase.CDB_OB_LOG_RESTORE_SOURCE WHERE TENANT_ID = xxxx;Confirm in the standby tenant.
MySQL mode:
obclient(root@standby3)[(none)]> SELECT * FROM oceanbase.DBA_OB_LOG_RESTORE_SOURCE;Oracle mode:
obclient(sys@standby3)[SYS]> SELECT * FROM SYS.DBA_OB_LOG_RESTORE_SOURCE;
Confirm the log truncation point. The only requirement for the log truncation point is that it must be greater than or equal to the synchronization point of the tenant. For more information on how to set a meaningful value, see the Best Practices section below. This step assumes that the log truncation point is any value greater than or equal to the synchronization point of the tenant.
Query the
DBA_OB_TENANTSview to confirm the relevant information.Mainly confirm the following information:
- The log truncation point to be set must be greater than or equal to the
SYNC_SCNof the standby tenant. - The
STATUSof the standby tenant must beNORMAL, and theSWITCHOVER_STATUSmust beNORMALorFLASHBACK_AND_STAY_STANDBY_STATUS.
The query statement is as follows:
Confirm in the
systenant of the cluster where the standby tenant is located.obclient(root@sys)[(none)]> SELECT TENANT_ID, SYNC_SCN, STATUS, SWITCHOVER_STATUS FROM oceanbase.DBA_OB_TENANTS WHERE TENANT_ID=xxxx;Confirm in the standby tenant.
MySQL mode:
obclient(root@standby3)[(none)]> SELECT TENANT_ID, SYNC_SCN, STATUS, SWITCHOVER_STATUS FROM oceanbase.DBA_OB_TENANTS;MySQL mode:
obclient(sys@standby3)[SYS]> SELECT TENANT_ID, SYNC_SCN, STATUS, SWITCHOVER_STATUS FROM SYS.DBA_OB_TENANTS;
- The log truncation point to be set must be greater than or equal to the
Query the
DBA_OB_SERVERSview to ensure that the tenant does not have any valid replicas on the offline server.Log in as the
systenant of the cluster where the standby tenant is located and execute the following statement to query.obclient(root@sys)[(none)]> SELECT COUNT(*) -> FROM CDB_OB_LS_LOCATIONS -> WHERE (SVR_IP, SVR_PORT) IN ( -> SELECT SVR_IP, SVR_PORT -> FROM oceanbase.DBA_OB_SERVERS -> WHERE STATUS = 'INACTIVE' -> ) -> AND TENANT_ID = xxxx;If the query result is empty, it indicates that there are no valid replicas on the offline server.
Execute the standby tenant log truncation command.
ALTER SYSTEM FLASHBACK STANDBY LOG TO SCN = $flashback_log_scn [TENANT = 'tenant_name'];
Best practices
This section describes some best practices for failover.
Notice
We recommend that you use OCP to implement the following best practices.
Choose a new primary tenant and take over other standby tenants
In a one-primary-tenant-and-multiple-standby-tenants scenario, identify the standby tenant with the largest synchronization point, fail it over to the primary tenant, and then connect the remaining standby tenants to the new primary tenant. This way, the other standby tenants of the original primary tenant are not rendered obsolete after the new primary tenant is selected. To achieve this, you need to complete the following two steps:
Select a new primary tenant.
You need to perform the following operations:
- Send the stop synchronization command to all standby tenants of the original primary tenant.
- Set the
LOG_RESTORE_SOURCEparameter of all standby tenants of the original primary tenant to an empty string. - Read the synchronization points of all standby tenants, identify the standby tenant with the largest synchronization point, and fail it over to the primary tenant. If you want to select a specific standby tenant as the new primary tenant, the standby tenant with a synchronization point larger than that of the new primary tenant cannot be connected to the new primary tenant.
Flashback the remaining standby tenants to the new primary tenant.
You need to perform the following operations:
- Obtain the failover point of the new primary tenant from the
FLASHBACK_LOG_SCNcolumn of theDBA_OB_TENANTSview. - Send the
FLASHBACKcommand to the remaining standby tenants and set theirFLASHBACK_SCNparameter to the failover point of the new primary tenant. - Set the
LOG_RESTORE_SOURCEparameter of the remaining standby tenants to the new primary tenant. - Set the restore endpoint of the remaining standby tenants to
UNLIMITED. - Set the service name of the remaining standby tenants to the same value as that of the new primary tenant, if any.
- Obtain the failover point of the new primary tenant from the
Create a standby tenant for the new primary tenant by using the physical backup of the original primary tenant
Create a standby tenant for the new primary tenant by using the physical backup of the original primary tenant. This way, the physical backup of the original primary tenant can be effectively reused. The new primary tenant does not need to wait for the physical backup to be completed before a standby tenant is created. In addition, the physical backup of the original primary tenant is still valuable. When you create a standby tenant for the original primary tenant, you can select the option to restore a standby tenant from the backup of the original primary tenant. This scenario is the same as the previous one, except that the standby tenant is created at a different time. To achieve this, you need to complete the following two steps:
Create a standby tenant based on the backup of the original primary tenant.
You need to perform the following operations:
- Obtain the failover point of the new primary tenant from the
FLASHBACK_LOG_SCNcolumn of theDBA_OB_TENANTSview. - Restore a standby tenant from the backup and archive logs of the original primary tenant and specify the failover point of the new primary tenant as the restore point.
- Obtain the failover point of the new primary tenant from the
Flashback the standby tenant to the new primary tenant. This is the second step in the previous scenario.
