OceanBase Database allows you to manage the status of OBServers from a command-line interface (CLI).
When you need to replace, repair, or diagnose a node during O&M operations, you can use the STOP SERVER command to switch the partition leader on this node to another node. Then, the server enters the stopped state and does not provide services externally.
When the OBServer enters the stopped state, the cluster considers the node stopped, but the corresponding observer process may still be running. The same applies to the started state of an OBServer.
Start Server
The Start Server action is the inverse of the Stop Server action. By default, an OBServer enters the started state when it is started in the cluster. After you run the Stop Server command for an OBServer, you must run the Start Server command to restore its status to started. The Start Server command is as follows:
ALTER SYSTEM START SERVER 'ip:port' [,'ip:port'...] [ZONE='zone']
Sample statement:
obclient> ALTER SYSTEM START SERVER "10.10.10.1:2882"
Stop Server
The purpose of the Stop Server action is to switch the partition leaders on the current OBServer to other OBServers. Once the current OBServer has no partition leader, the system internally marks it as stopped, so that client requests are no longer sent to it and it no longer provides services externally.
The Stop Server action is usually performed in special O&M scenarios such as server diagnostics or the repair, replacement, or upgrade of hardware. The Stop Server command is as follows:
ALTER SYSTEM STOP SERVER 'ip:port' [,'ip:port'...] [ZONE='zone']
Sample statement:
obclient> ALTER SYSTEM STOP SERVER "10.10.10.1:2882" zone='z1'
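A maintenance cycle typically pairs the Stop Server and Start Server commands described above. The following is a minimal sketch that reuses the address 10.10.10.1:2882 and the zone z1 from the sample statements; both are placeholders, and the maintenance work itself happens outside SQL.

```sql
-- Step 1: switch partition leaders away from the node and mark it as stopped.
ALTER SYSTEM STOP SERVER '10.10.10.1:2882' ZONE='z1';

-- Step 2: diagnose, repair, or replace hardware on the node (performed outside SQL).

-- Step 3: restore the node to the started state so that it serves requests again.
ALTER SYSTEM START SERVER '10.10.10.1:2882' ZONE='z1';
```

Because the stopped state is only a cluster-side marker, the observer process may keep running throughout the cycle.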
Add Server
The Add Server action adds nodes to the cluster for cluster scale-out. The new node must be empty. That is, the CLog and ILog directories are empty. The Add Server command is as follows:
ALTER SYSTEM ADD SERVER 'ip:port' [,'ip:port'...] [ZONE [=] 'zone']
Sample statement:
obclient> ALTER SYSTEM ADD SERVER "10.10.10.1:2882" zone='z1'
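The syntax above accepts a comma-separated list of servers and treats the equals sign in the ZONE clause as optional. A short sketch of both variations follows; the addresses 10.10.10.2 and 10.10.10.3 are hypothetical and stand for empty nodes that belong to zone z1.

```sql
-- Add two empty nodes to zone z1 in a single statement.
ALTER SYSTEM ADD SERVER '10.10.10.2:2882', '10.10.10.3:2882' ZONE='z1';

-- The same ZONE clause written without the optional equals sign,
-- shown here with a third hypothetical node.
ALTER SYSTEM ADD SERVER '10.10.10.4:2882' ZONE 'z1';
```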
Delete Server
The Delete Server action deletes nodes from a cluster. The Delete Server command is as follows:
ALTER SYSTEM DELETE SERVER 'ip:port' [,'ip:port'...] [ZONE [=] 'zone']
Sample statement:
obclient> ALTER SYSTEM DELETE SERVER "xx.xx.xx.xx:2882" zone='z1'
Cancel Delete Server
The Delete Server action triggers load balancing: resource units on the OBServer being deleted are migrated to other OBServers in the same zone. This migration is part of the automatic resource unit balancing controlled by RootService. If the remaining servers in the zone do not have enough resources to accommodate all the resource units of the OBServer being deleted, the migration fails, and an error message with the error code 4624 is recorded in /home/admin/oceanbase/log/rootservice.log. To resolve this problem, you can perform the Cancel Delete Server action to restore the server. The Cancel Delete Server command is as follows:
ALTER SYSTEM CANCEL DELETE SERVER 'ip:port' [,'ip:port'...] [ZONE [=] 'zone']
Sample statement:
obclient> ALTER SYSTEM CANCEL DELETE SERVER "10.10.10.1:2882" zone='z1'
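The delete-then-cancel sequence described above can be sketched as follows. The example assumes that server status is visible in the internal table oceanbase.__all_server with the columns svr_ip, svr_port, zone, and status; the text does not name this table, so treat the table and column names as assumptions that may differ between versions.

```sql
-- Start deleting the node; its resource units begin migrating within zone z1.
ALTER SYSTEM DELETE SERVER '10.10.10.1:2882' ZONE='z1';

-- Assumed table and columns: check how the cluster currently reports the node.
SELECT svr_ip, svr_port, zone, status
FROM oceanbase.__all_server
WHERE svr_ip = '10.10.10.1' AND svr_port = 2882;

-- If rootservice.log reports error 4624 (unit migration failed),
-- cancel the deletion to restore the server.
ALTER SYSTEM CANCEL DELETE SERVER '10.10.10.1:2882' ZONE='z1';
```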
Operation constraints
Observe the following constraints when you perform operations on a node:
Do not perform the Stop Server action on OBServers in different zones at the same time. You can, however, perform the Stop Server action on multiple OBServers in the same zone.
Do not initiate another Stop Server action if the current Stop Server action has not completed.
The enable_auto_leader_switch parameter must be set to Enabled.
Partitioned replicas are in the majority.
If the OBServer subject to the Stop Server action hosts a large number of partitions or partition leaders, the ALTER SYSTEM STOP SERVER statement takes a long time. If the statement times out, increase the SQL timeout duration.
If the command fails immediately, the logs may not be synchronized.
View __all_rootservice_event_history to check whether Stop Server actions have been performed.
After you perform the Stop Server action, the OBServer remains in the Active state, but the value of stop_service_time changes from 0 to the time when the Stop Server action was performed.
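The checks in the constraints above can be strung together in one session. The following is a sketch under stated assumptions, not a definitive procedure: the oceanbase schema prefix, the gmt_create ordering column, and the use of __all_server to inspect the server row are assumptions not given in the text, and the timeout value is illustrative.

```sql
-- 1. Confirm that automatic leader switching is enabled.
SHOW PARAMETERS LIKE 'enable_auto_leader_switch';

-- 2. Raise the SQL timeout for this session before stopping a node with many
--    partition leaders. ob_query_timeout is in microseconds; 600000000 is 10 minutes.
SET ob_query_timeout = 600000000;

-- 3. Stop the node (placeholder address and zone from the sample statements).
ALTER SYSTEM STOP SERVER '10.10.10.1:2882' ZONE='z1';

-- 4. Review recent RootService events to see whether the Stop Server action was recorded.
--    Assumed: the table lives in the oceanbase schema and has a gmt_create column.
SELECT *
FROM oceanbase.__all_rootservice_event_history
ORDER BY gmt_create DESC
LIMIT 20;

-- 5. Inspect the server row: the status should remain active while the stop time is no longer 0.
--    Assumed table; SELECT * avoids depending on exact column names.
SELECT * FROM oceanbase.__all_server;
```

Setting the timeout at session level keeps the change local to the maintenance session rather than altering it for all clients.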