You can isolate an OBServer node when it fails. This ensures that new read and write requests are not routed to the failed node.
Isolating a failed node switches the leaders on that node to other nodes, which restores the write service for users and log synchronization. After the cluster recovers, run the STOP SERVER command to stop the failed node and then restart it.
Sample statement for isolating a failed node:
ALTER SYSTEM ISOLATE SERVER 'ip:port' [,'ip:port'...] [ZONE [=] 'zone']
This statement can be executed only in the sys tenant.
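The syntax above accepts a comma-separated list of servers, so several failed nodes in the same zone can be isolated in one statement. A minimal sketch, assuming two hypothetical node addresses (10.10.10.10:2882 and 10.10.10.11:2882) that both belong to a zone named zone1; substitute the addresses and zone of your own cluster:

```sql
-- Hypothetical addresses and zone name, for illustration only.
-- Isolates two failed OBServer nodes in zone1 with a single statement.
ALTER SYSTEM ISOLATE SERVER '10.10.10.10:2882', '10.10.10.11:2882' ZONE='zone1';
```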
Procedure
Log on to the sys tenant of the cluster as the root user.
Execute the following statement to isolate the failed OBServer node.
Sample code:
obclient> ALTER SYSTEM ISOLATE SERVER '10.10.10.10:2882' ZONE='zone1';
If the statement succeeds, the failed node is isolated. You can also find that the status field in the oceanbase.DBA_OB_SERVERS view remains active, but the stop_time field of the node is no longer NULL, indicating that the node is in the stopped state. The value of the stop_time field becomes the timestamp when the node was isolated.
Sample statement for querying the oceanbase.DBA_OB_SERVERS view:
obclient> SELECT * FROM oceanbase.DBA_OB_SERVERS\G
*************************** 1. row ***************************
               SVR_IP: xx.xx.xx.xx
             SVR_PORT: 2882
                   ID: 1
                 ZONE: zone1
             SQL_PORT: 2881
      WITH_ROOTSERVER: YES
               STATUS: ACTIVE
   START_SERVICE_TIME: 2022-06-17 11:30:04.589074
            STOP_TIME: NULL
BLOCK_MIGRATE_IN_TIME: NULL
    LAST_OFFLINE_TIME: NULL
        BUILD_VERSION: 4.0.0.0_20220525115829-1873fc2598d56060fe307ce3b7b88647686e0b09(May 25 2022 12:12:10)
1 row in set
For more information about the fields in the oceanbase.DBA_OB_SERVERS view, see DBA_OB_SERVERS.
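To check only the nodes that are currently isolated, the query on the view can be narrowed to rows whose stop_time field is set. A minimal sketch, assuming the column names shown in the sample output and that STOP_TIME is NULL for every node that has not been isolated or stopped:

```sql
-- Returns one row per node whose STOP_TIME is set,
-- that is, nodes that have been isolated or stopped.
SELECT SVR_IP, SVR_PORT, ZONE, STATUS, STOP_TIME
FROM oceanbase.DBA_OB_SERVERS
WHERE STOP_TIME IS NOT NULL;
```

An empty result would suggest that no node in the cluster is currently isolated.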
You can execute the ALTER SYSTEM START SERVER '10.10.10.10:2882' statement to cancel the isolation of a node.