The concept of zones is introduced to implement advanced disaster recovery for OceanBase Database. A zone is a logical concept and is a group of nodes with similar disaster recovery attributes. A zone is usually an independent physical deployment unit, which can be an Internet Data Center (IDC), an availability zone in a cloud, or an independent rack. You can deploy your OceanBase cluster across multiple zones (generally at least 3) to implement fault isolation and quick recovery when a single zone fails.
When you need to perform disaster recovery or O&M changes at the zone level, you can isolate a zone. After a zone is isolated, new read and write requests are no longer routed to nodes in this zone. This isolates the fault or lets you perform lossless O&M changes.
Fault isolation: Assume that in a multi-IDC deployment architecture, a switch in an IDC malfunctions, and packet losses, retransmission, and even elections without a leader occur on some nodes. You can isolate the zone corresponding to the IDC for quick recovery.
O&M changes: During a rolling upgrade of an OceanBase cluster, you can first isolate one zone, switch the user traffic from this zone to another zone, and then perform the upgrade in this zone. After the upgrade in this zone is completed, you can start this zone again and then upgrade the other zones one by one in the same way. As a result, the upgrade is transparent to users.
Isolating a zone only diverts the business traffic away from the nodes in that zone. It does not change the number of voting members in the Paxos group. The isolation commands provide different levels of security guarantees, so the actions that you can safely perform afterward depend on which command you ran. Isolation commands are suitable for short-term isolation under continuous monitoring, and you should prepare contingency plans for secondary failures in advance. If the zone cannot be recovered within a short period, you can completely remove the faulty node by replacing it.
Three isolation commands with different levels of security guarantees are provided:
STOP ZONE
FORCE STOP ZONE
ISOLATE ZONE
After you isolate a zone by using any of the preceding commands, the STATUS value of this zone changes to INACTIVE, indicating that this zone is in the isolated state.
STOP ZONE
The command is as follows:
obclient [(none)]> ALTER SYSTEM STOP ZONE 'zone_name';
STOP ZONE is the most secure isolation command. The STOP ZONE command switches the business traffic from the target zone to isolate the nodes in this zone from the business traffic. It can also ensure that the nodes in other zones are still the majority in the Paxos group. In this way, you can securely perform all actions on the nodes in the isolated zone. For example, you can run the pstack command, adjust the log level, or stop the observer process. When you need to isolate a fault or stop a zone for maintenance, you can use the STOP ZONE command, which is the best choice in this case.
The precheck for the STOP ZONE command is strict to ensure that any actions can be securely performed on the nodes in the isolated zone. Various internal status checks are performed to ensure that after the observer process on the target node is stopped, the replicas on the remaining available nodes in the cluster are still the majority in the Paxos group, thereby ensuring the continuity of database services.
Limitations of the STOP ZONE command are as follows:
You cannot stop multiple zones or nodes in multiple zones at the same time. If another zone or the nodes in another zone are already stopped, the error "-4660, cannot stop server or stop zone in multiple zones" is returned when you run the STOP ZONE command. In this case, you can query the oceanbase.DBA_OB_SERVERS and oceanbase.DBA_OB_ZONES views to check whether a zone or the nodes in a zone are stopped.
The STOP ZONE command will check the Paxos member lists of all log streams in all tenants to verify whether the replicas on the remaining nodes, except for nodes in the target zone and other unavailable nodes, are still the majority in the Paxos group. If not, you cannot run the STOP ZONE command.
Nodes in the INACTIVE, stopped (in the ACTIVE state and the value of the stop_time field is greater than 0), or DELETING state are unavailable nodes.
To check the Paxos member list of log streams, make sure that all log streams have a leader. If any log stream does not have a leader, the error code -4179 is returned and the information about this log stream is displayed.
For example, assume that log stream 1 in the 1001 tenant does not have a leader. The error message "Tenant(1001) LS(1) has no leader, stop zone not allowed" is returned. In this case, you can query the GV$OB_LOG_STAT view to check information about the leader of the log stream and troubleshoot the issue.
If the locality of the tenant is being modified when you run the STOP ZONE command, a misjudgment may be made.
For example, assume that the locality of the 1001 tenant is being changed to remove a replica and the member list of the log stream is changing from A, B, and C to A and B. If you run the STOP ZONE command to stop the zone where node B is located, the system may misjudge that this zone can be stopped, which conflicts with the replica removal operation. To avoid misjudgments, the STOP ZONE command returns the error code -4179 and displays the modification information about the tenant: "Tenant(1001) locality is changing, stop zone not allowed". In this case, you can query the DBA_OB_TENANTS view to check whether the locality of the tenant is being modified.
After the nodes in the target zone and other unavailable nodes are excluded, if the replicas on the remaining nodes are not the majority, the -4179 error code is returned and the information about the abnormal log stream is displayed.
For example, if the replicas of log stream 1 of the 1001 tenant on the remaining nodes would not be the majority after the zone is stopped, the error message "Tenant(1001) LS(1) has no enough valid paxos member after stop zone, stop zone not allowed" is returned. In this case, you can query the GV$OB_LOG_STAT view to check the member list and replica information of the log stream to verify whether the replicas on the remaining nodes are the majority.
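The majority check described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this topic, not the actual precheck code in OceanBase Database. The function names `is_unavailable` and `majority_survives` are hypothetical, and node identifiers are assumed to be plain strings taken from the Paxos member list of one log stream.

```python
from typing import Iterable


def is_unavailable(status: str, stop_time: int) -> bool:
    """Apply the definition of an unavailable node given in this topic."""
    if status in ("INACTIVE", "DELETING"):
        return True
    # A "stopped" node is still ACTIVE but has a stop_time greater than 0.
    return status == "ACTIVE" and stop_time > 0


def majority_survives(member_list: Iterable[str],
                      target_zone_nodes: Iterable[str],
                      unavailable_nodes: Iterable[str]) -> bool:
    """Return True if the members that remain after excluding the target
    zone and other unavailable nodes are a strict majority of the list."""
    members = set(member_list)
    excluded = set(target_zone_nodes) | set(unavailable_nodes)
    remaining = members - excluded
    return len(remaining) > len(members) // 2
```

For a member list of A, B, and C, stopping the zone of node B leaves two of three members, which is a majority. If the member list has already shrunk to A and B, stopping the zone of node B leaves one of two members, so the check fails.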
After you run the STOP ZONE command, you can perform all invasive actions on nodes in the isolated zone. Make sure that the replicas on the remaining available nodes are the majority and the logs of these replicas are synchronized. In this way, after the observer processes on nodes in the isolated zone are stopped, the log replicas on the remaining nodes are still the majority.
To determine whether the logs are synchronized, view the latest log checkpoints of the followers and leader in the log stream. If the difference between the checkpoints is within 5s, the logs are synchronized. You can query the END_SCN field in the GV$OB_LOG_STAT view for the latest log checkpoint of each replica in the specified log stream.
If the difference exceeds 5s, the error code -4179 is returned and the information about the abnormal log stream is displayed.
For example, assume that logs of replicas in log stream 1 of the 1001 tenant are out of synchronization. The error message "Tenant(1001) LS(1) log not sync, stop server not allowed" is returned. In this case, you can query the END_SCN field in the GV$OB_LOG_STAT view to check the synchronization of logs among replicas in the log stream.
# Obtain the latest log checkpoints of replicas in a specified log stream and convert the checkpoints into timestamps.
obclient [(none)]> SELECT SCN_TO_TIMESTAMP(END_SCN) FROM GV$OB_LOG_STAT WHERE TENANT_ID = 1001 AND LS_ID = 1;
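Once the checkpoints have been converted into timestamps, the 5s rule can be applied on the client side. The following Python sketch assumes that you have already fetched one timestamp for the leader and one per follower. The function name `lagging_replicas` and the node labels are hypothetical, and the sketch is not how the server itself performs the check.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Threshold stated in this topic: a difference within 5s counts as synchronized.
SYNC_THRESHOLD = timedelta(seconds=5)


def lagging_replicas(leader_ts: datetime,
                     follower_ts: Dict[str, datetime],
                     threshold: timedelta = SYNC_THRESHOLD) -> List[str]:
    """Return the followers whose latest checkpoint differs from the
    leader's checkpoint by more than the threshold."""
    return sorted(
        node for node, ts in follower_ts.items()
        if abs(leader_ts - ts) > threshold
    )
```

An empty result means that all followers of the log stream are within the threshold. A non-empty result names the replicas to investigate before you retry the STOP ZONE command.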
STOP ZONE is the most secure isolation command and has the strictest prerequisites. In actual application scenarios, especially in emergency response to faults, these prerequisites may not be fully met, and the STOP ZONE command will then fail. Therefore, OceanBase Database provides two more zone isolation commands: FORCE STOP ZONE and ISOLATE ZONE. The two commands have less strict prerequisites and limits and apply to more scenarios. However, after a zone is isolated by using one of the two commands, the actions that can be performed are limited. Use the STOP ZONE command whenever its prerequisites can be met, and choose one of the other two isolation commands only if STOP ZONE is inapplicable.
FORCE STOP ZONE
Unlike STOP ZONE, the FORCE STOP ZONE command skips the log synchronization check. After you isolate a zone by using the FORCE STOP ZONE command, the business traffic is switched away from the isolated zone, but it is not guaranteed that you can securely stop the observer process on a node in this zone. If you forcibly stop the observer process on a node, the remaining nodes in the Paxos group may not be the majority, thereby affecting the continuity of database services.
Here is a typical application scenario: assume that a server or network device in an IDC encounters a hardware exception and must be stopped for maintenance. To ensure that the observer process on a node in the isolated zone can be securely stopped, the STOP ZONE command is the best choice to isolate the zone corresponding to this IDC. However, due to the long latency between the replica logs on the nodes in other zones, the STOP ZONE command cannot be successfully run. In this case, you can only use the FORCE STOP ZONE command to isolate the zone. This prevents the hardware exception from affecting the business traffic. Identify the cause of the latency between replica logs. After the logs are synchronized between the nodes, try to run the STOP ZONE command. If the command is successfully executed, you can securely stop the server or network device for maintenance.
obclient [(none)]> ALTER SYSTEM FORCE STOP ZONE zone_name;
Considerations:
After you run the FORCE STOP ZONE command, if nodes in the target zone still participate in Paxos log synchronization and will not be removed, you can securely use the FORCE STOP ZONE command to isolate the fault and switch the business traffic away. The service continuity is not affected even if some replica logs are out of synchronization.
To stop a target node after you run the FORCE STOP ZONE command, you must pay attention to the log stream where the logs of a replica are out of synchronization and verify whether the log stream can still properly provide services after the target node is removed. You can also query the GV$OB_LOG_STAT view to assess the impact duration of services.
Warning
The FORCE STOP ZONE command applies only when the log replicas on nodes outside the zone to be isolated are lagging. In this case, you can isolate the zone only for fault isolation. You must not perform any invasive action that will cause the observer process to stop. Otherwise, the replicas on the remaining nodes may not be the majority in the Paxos group. As a result, no leader exists in the tenant and the continuity of services is affected.
ISOLATE ZONE
The precheck for the ISOLATE ZONE command is looser than those for STOP ZONE and FORCE STOP ZONE, contributing to the highest response speed. However, the ISOLATE ZONE command does not provide any security guarantee. That is, it does not guarantee that after a node in the target zone is removed, the replicas on the remaining nodes in the cluster are still the majority in the Paxos group to continue to provide services.
obclient [(none)]> ALTER SYSTEM ISOLATE ZONE zone_name;
The ISOLATE ZONE command loosens the precheck based on the following logic:
The ISOLATE ZONE command has no restrictions on the isolation of nodes in multiple zones. If a node in another zone is already isolated by using any isolation command, you can still run the ISOLATE ZONE command.
Here is a typical scenario: a cluster has three zones Z1, Z2, and Z3. Each of Z1 and Z3 has a faulty node to be isolated. After the faulty node in Z1 is isolated, if you use STOP ZONE or FORCE STOP ZONE to isolate the faulty node in Z3, an error will be returned, indicating that you cannot isolate nodes in multiple zones at the same time. In this case, you can only use the ISOLATE ZONE command to switch the business traffic away from the faulty node. After the faulty node is isolated, you must not perform any invasive action that will cause the observer process to stop.
The ISOLATE ZONE command does not check whether the replicas on the remaining nodes in the log stream are still the majority, or whether the logs of these replicas are synchronized. Even if the required conditions are not met, the ISOLATE ZONE command can still be executed.
Here is a typical scenario: the initial Paxos member list of a log stream contains three nodes A, B, and C. The server of node C fails and becomes permanently offline. Then, the member list of the log stream contains only nodes A and B and lacks replicas. Assume that node B encounters a hardware fault at this time and the corresponding zone must be isolated. If you run STOP ZONE or FORCE STOP ZONE to isolate the zone, the command will fail and an error message will be displayed, indicating that the remaining replicas are not the majority. In this case, you can only use the ISOLATE ZONE command to switch the business traffic away from the faulty node. After the faulty node is isolated, you must not perform any invasive action that will cause the observer process to stop.
Limitations of ISOLATE ZONE are as follows:
When nodes in multiple zones are isolated, the ISOLATE ZONE command checks the primary zone of each tenant and fails if it would isolate all zones in the region where the primary zone of a tenant is located. This prevents the read and write services of the tenant from being switched over to another region, and avoids a sharp increase in the response time when business requests are routed across regions.
You can query the DBA_OB_TENANTS view for the primary zone of the tenant, and verify whether the preceding restrictions are violated based on the information about isolated zones and nodes queried in the DBA_OB_ZONES and DBA_OB_SERVERS views.
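The region restriction can be checked by hand with logic along the following lines. This Python sketch rests on several assumptions: the zone-to-region mapping, the set of already isolated zones, and the primary zones of the tenant have been collected from the views mentioned above and parsed into plain Python values. The function name `isolate_zone_allowed` is hypothetical, and the sketch only mirrors the rule as described in this topic.

```python
from typing import Dict, Iterable


def isolate_zone_allowed(target_zone: str,
                         isolated_zones: Iterable[str],
                         zone_region: Dict[str, str],
                         primary_zones: Iterable[str]) -> bool:
    """Return False if isolating target_zone would leave no active zone in
    a region that hosts a primary zone of the tenant."""
    would_be_isolated = set(isolated_zones) | {target_zone}
    primary_regions = {zone_region[zone] for zone in primary_zones}
    for region in primary_regions:
        zones_in_region = {z for z, r in zone_region.items() if r == region}
        # Every zone of this region would be isolated: the rule is violated.
        if zones_in_region <= would_be_isolated:
            return False
    return True
```

Run the check once per tenant. If it returns False for any tenant, expect the ISOLATE ZONE command to fail for the target zone.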
Warning
The ISOLATE ZONE command applies only when a node outside the zone to be isolated is also abnormal. In this case, you can isolate the zone only for fault isolation. You must not perform any invasive action that will cause the observer process to stop. Otherwise, the replicas on the remaining nodes may not be the majority in the Paxos group. As a result, no leader exists in the tenant and the continuity of services is affected.
Summary
The STOP ZONE command is highly recommended in most scenarios. After you run this command, the fault is isolated and you can securely perform all emergency actions and O&M actions on nodes in the isolated zone. If the STOP ZONE command is inapplicable, you can choose one of the other two isolation commands. However, after either of these two commands is executed, you cannot securely perform invasive actions.
The FORCE STOP ZONE command applies when the log replicas on nodes outside the zone to be isolated are lagging. The ISOLATE ZONE command applies when a node outside the zone to be isolated is also abnormal. In both cases, you can isolate the zone only for fault isolation. You must not perform any invasive action that will cause the observer process to stop. Otherwise, the replicas on the remaining nodes may not be the majority in the Paxos group and the tenant has no leader, which will affect the continuity of database services.
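The selection logic in this summary can be condensed into a small decision helper. The Python sketch below is illustrative only. The function names and the three boolean inputs are hypothetical, and you would set the inputs yourself after checking the views described in this topic.

```python
def choose_isolation_command(no_other_zone_stopped: bool,
                             majority_ok: bool,
                             logs_synced: bool) -> str:
    """Pick the strictest isolation command whose prerequisites are met."""
    if no_other_zone_stopped and majority_ok and logs_synced:
        return "STOP ZONE"
    # FORCE STOP ZONE skips only the log synchronization check.
    if no_other_zone_stopped and majority_ok:
        return "FORCE STOP ZONE"
    # ISOLATE ZONE skips the multi-zone and majority checks as well.
    return "ISOLATE ZONE"


def invasive_actions_safe(command: str) -> bool:
    """Only STOP ZONE guarantees that the observer process can be stopped."""
    return command == "STOP ZONE"
```

For example, when the majority check passes but follower logs lag behind, the helper returns FORCE STOP ZONE, and `invasive_actions_safe` reports that stopping the observer process is not safe.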
References
For more zone-related O&M operations, see the following topics: