You can stop or start an OBServer node based on the operating status of the database.
Stop an OBServer node by using an SQL statement
To stop an OBServer node, perform the following actions:
Stop the node service by performing the STOP SERVER operation
Stop the observer process
Stop the node service
Note
Stopping an OBServer node will switch all its traffic and the leader role to another OBServer node. Proceed with caution.
You can perform the STOP SERVER operation to stop the node service on an OBServer node. The STOP SERVER operation is usually performed in special O&M scenarios such as server diagnostics or the repair, replacement, or upgrade of hardware.
The purpose of the STOP SERVER operation is to switch the partition leader on the current OBServer node to another OBServer node. If the current OBServer node has no partition leader, the system will internally mark the OBServer node as stopped, so that client requests will no longer be sent to the OBServer node and the OBServer node will not provide services externally.
For more information about the OBServer node status, see View OBServer node status.
Sample statement of the STOP SERVER operation:
obclient> ALTER SYSTEM STOP SERVER 'ip:port' [,'ip:port'...] [ZONE='zone'];
This statement can be executed only in the sys tenant.
Before you perform the STOP SERVER operation, make sure that:
The enable_auto_leader_switch parameter is set to True.
For more information about the enable_auto_leader_switch parameter, see enable_auto_leader_switch.
Partitioned replicas are in the majority.
When you stop an OBServer node, note that:
Do not perform the STOP SERVER operation on OBServer nodes in more than one zone at a time. You can perform the STOP SERVER operation for multiple OBServer nodes in the same zone.
Do not initiate another STOP SERVER operation if the current STOP SERVER operation has not completed.
If a large number of partitions or partition leaders exist on the OBServer node on which the STOP SERVER operation is performed, the STOP SERVER operation takes a long time. If the operation times out, you can change the SQL timeout period to a larger value.
The SQL timeout duration can be specified by the ob_query_timeout variable in microseconds. The default value is 10000000. For more information about how to set the variable, see Set variables.
For more information about the ob_query_timeout variable, see ob_query_timeout.
If the statement execution fails immediately, the logs may not be synchronized. In this case, check for STOP SERVER records in the __all_rootservice_event_history table. If no STOP SERVER record is found, the failure has occurred before the RootService performs the STOP SERVER operation. If a STOP SERVER record is found, the STOP SERVER operation has failed because the logs are not synchronized.
Here is an example:
obclient> SELECT rs_svr_ip FROM oceanbase.__all_rootservice_event_history WHERE event='stop_server';
+----------------+
| rs_svr_ip      |
+----------------+
| xxx.xx.xxx.xx5 |
| xxx.xx.xxx.xx5 |
+----------------+
2 rows in set
After the STOP SERVER operation is performed, the value of the status field for the OBServer node in the __all_server table is still Active, but the value of the stop_time field changes from 0 to the point in time when the STOP SERVER operation was performed, and the OBServer node is in the stopped state.
For more information about the OBServer node status, see View OBServer node status.
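If STOP SERVER times out as described above, the timeout can be raised before retrying. The following is a minimal sketch rather than a product tool: the helper name query_timeout_stmt is hypothetical, and it only prints a SET statement with the seconds converted to the microseconds that ob_query_timeout expects.

```shell
# Hypothetical helper: print a SET statement for ob_query_timeout.
# Argument: timeout in whole seconds; the variable itself is in microseconds.
query_timeout_stmt() {
    printf 'SET ob_query_timeout = %s;\n' "$(( $1 * 1000000 ))"
}

# Example: build the statement for a 10-minute timeout.
query_timeout_stmt 600
```

The printed statement is meant to be run in the same obclient session as the ALTER SYSTEM STOP SERVER statement, on the assumption that the variable is being changed at the session level.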
Here is an example:
Log on to the sys tenant as the root user.
Execute the following statement to stop the node service:
obclient> ALTER SYSTEM STOP SERVER '10.10.10.1:2882' ZONE='z1';
If a leader is deployed on the node, the system automatically switches it to a follower after the node service is stopped. Followers on the node can still participate in voting but will not be elected as a leader. Stopping the node service is different from an OBServer failure. The service stop time of the node can exceed the permanently offline time specified by server_permanent_offline_time without causing the node to go permanently offline.
Note
server_permanent_offline_time specifies how long heartbeats can be missing before a server is considered permanently offline. The system automatically supplements the data replicas that were hosted on a permanently offline server.
Stop the observer process
Notice
If you stop the observer process, the OBServer node becomes unavailable and cannot provide services. Proceed with caution.
Log on to the OBServer node.
Access the /home/admin/oceanbase/bin directory from the command-line interface (CLI).
Run the following command to stop the observer process:
kill `pidof observer`
Run the following command to check whether the process is stopped:
ps -ef | grep observer | grep -v grep
If the command returns no output, the process is stopped.
Stop an OBServer node in the OCP console
You can also stop an OBServer node in the OceanBase Cloud Platform (OCP) console by stopping the node service and the observer process separately.
Prerequisites
The target cluster can be managed in the current OCP cluster.
If the cluster has not been added to OCP, request the administrator to add the cluster. For more information, see the Take over a cluster topic in the OCP User Guide of the corresponding version.
You have the permissions to manage the cluster.
If you do not have the permissions to manage the cluster, request the administrator to assign the required role. For more information, see the Edit a user topic in the OCP User Guide of the corresponding version.
Stop the node service
Log on to the OCP console.
The Clusters page automatically appears.
In the Clusters list, find the cluster that includes the OBServer node you want to stop and click its name to go to the Overview page of the cluster.
In the OBServers list, click Stop Service in the Actions column of the target OBServer node.
In the dialog box that appears, click Stop Service to stop the node service.
In the dialog box, you can click View Task to view the progress.
You can also choose System Management > Tasks to view the progress of the task.
When the task status is Completed and the status of the OBServer node is Service Stopped in the OBServers list on the Overview page, the OBServer node is stopped.
Stop the observer process
Log on to the OCP console.
The Clusters page automatically appears.
In the Clusters list, find the cluster that includes the OBServer node you want to stop and click its name to go to the Overview page of the cluster.
In the OBServers list, click the More icon in the Actions column and choose Stop Process.
In the dialog box that appears, click Stop Process to stop the observer process.
In the dialog box, you can click View Task to view the progress.
You can also choose System Management > Tasks to view the progress of the task.
When the task status is Completed and the status of the OBServer node is Process Stopped in the OBServers list on the Overview page, the observer process is stopped.
Start an OBServer node by using an SQL statement
To start an OBServer node, perform the following actions:
Start the observer process
Start the node service by performing the START SERVER operation
Start the observer process
In special cases, you can start the observer process by running a command.
Log on to the OBServer node.
Access the /home/admin/oceanbase/bin directory from the command-line interface (CLI).
Run the following command to start the observer process:
cd /home/admin/oceanbase/
./bin/observer [Startup parameters]
Generally, you need to add the startup parameters only for the first startup. In other startups, you can directly run the ./bin/observer command. You can also run ./bin/observer --help to view the details of the observer startup parameters.
Here is the sample command for starting the observer process for the first time:
cd /home/admin/oceanbase/bin
./observer -p 2881 -P 2882 -z 'zone_1' -d '/data/1/prod_data/' -r '10.10.10.1:2882:2881;10.10.10.2:2882:2881;10.10.10.3:2882:2881' -l WARN -o 'memory_limit=100GB,datafile_disk_percentage=85'
The parameters are described as follows:
-p specifies the port number for direct connection. The value is 2881 in this example.
-P specifies the RPC port number. The value is 2882 in this example.
-z specifies the zone where the OBServer node to be started is located. The value is zone_1 in this example.
-d specifies the storage directory of data. The value is /data/1/prod_data in this example.
-r specifies the RootService list as semicolon-separated entries in the IP address:RPC port:SQL port format. Three entries are listed in this example.
-l specifies the level of logs to be printed. The value is WARN in this example, indicating that logs of the WARN level are to be printed.
-o specifies the startup parameters. When the -o parameter is used, note that:
The parameter names are not case-sensitive. However, we recommend that you set them based on the names in observer.config.bin.
A parameter name cannot contain the following special characters: spaces, \r, \n, and \t.
An equal sign (=) is required between the name and the value of a parameter.
Separate multiple parameters with commas (,).
In this example, datafile_disk_percentage=85 specifies that data files can occupy 85% of the data disk, and memory_limit=100GB specifies that the observer process can use at most 100 GB of memory.
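The formatting rules for the -o value can be checked before the process is started. The following is a minimal sketch, not something shipped with observer: the function name validate_o_options is hypothetical, and it is deliberately strict in that whitespace next to a name counts as part of the name. It checks the syntax only, not whether a parameter name actually exists.

```shell
# Hypothetical helper: validate_o_options 'name=value,name=value'
# Returns 0 if every comma-separated item has the form name=value with a
# non-empty name that contains no whitespace; returns 1 otherwise.
validate_o_options() {
    rest=$1
    [ -n "$rest" ] || return 1
    case $rest in
        *,) return 1 ;;               # reject a trailing comma
    esac
    while [ -n "$rest" ]; do
        item=${rest%%,*}              # text before the first comma
        case $rest in
            *,*) rest=${rest#*,} ;;   # drop the item and its comma
            *)   rest='' ;;
        esac
        case $item in
            *=*) name=${item%%=*} ;;  # an equal sign is required
            *)   return 1 ;;
        esac
        [ -n "$name" ] || return 1
        case $name in
            # [[:space:]] covers space, \t, \r, \n (and other whitespace)
            *[[:space:]]*) return 1 ;;
        esac
    done
    return 0
}

# Typical use (not executed here):
#   validate_o_options 'memory_limit=100GB,datafile_disk_percentage=85' \
#       || echo "malformed -o value"
```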
After you start the process, wait for 5 to 10 seconds and check whether the process is started.
Run the following command to check whether the process is running.
For example,
[root@xx oceanbase]# ps -ef | grep observer | grep -v grep
root 6136 0 99 11:23 ? 00:00:19 ./bin/observer
If the command returns output as in the example, the process is started. Otherwise, the process is not started.
Run the following command to check whether port listening is enabled.
For example,
[root@xxx oceanbase]# netstat -ntlp | grep `pidof observer`
tcp 0 0 0.0.0.0:2881 0.0.0.0:* LISTEN 6136/./bin/observer
tcp 0 0 0.0.0.0:2882 0.0.0.0:* LISTEN 6136/./bin/observer
In the example, 6136 in 6136/./bin/observer indicates the ID of the observer process. The execution results show that port listening is enabled.
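The port check can also be scripted. The following is a minimal sketch that reads output in the `netstat -ntlp` layout shown above from standard input and reports whether a given port is in the LISTEN state; the function name port_is_listening is hypothetical, and the column positions are assumed to match the example output.

```shell
# Hypothetical helper: port_is_listening PORT
# Reads `netstat -ntlp` style lines on stdin. Returns 0 if some line is in
# the LISTEN state with a local address ending in :PORT, 1 otherwise.
port_is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")   # local address, e.g. 0.0.0.0:2881
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Typical use (not executed here):
#   netstat -ntlp | grep "$(pidof observer)" | port_is_listening 2881 \
#       || echo "port 2881 is not listening"
```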
Start the node service
Generally, you can perform the START SERVER operation to start the node service. The START SERVER operation is the inverse of the STOP SERVER operation. By default, when an OBServer node in the cluster is started, it enters the started state. After you perform the STOP SERVER operation on an OBServer node, you must perform the START SERVER operation to set the OBServer node status to started.
Here is the sample statement for starting the node service:
obclient> ALTER SYSTEM START SERVER 'ip:port' [,'ip:port'...] [ZONE='zone'];
This statement can be executed only in the sys tenant.
Here is an example:
Log on to the sys tenant as the root user.
Execute the following statement to start the node service:
obclient> ALTER SYSTEM START SERVER '10.10.10.1:2882';
Execute the following statement to check whether the node service is operating properly.
For example,
obclient> SELECT a.zone, concat(a.svr_ip,':', a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free,
    round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb,
    usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time,
    b.status, usec_to_time(b.stop_time) stop_time, b.build_version
    FROM oceanbase.__all_virtual_server_stat a join oceanbase.__all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port)
    ORDER BY a.zone, a.svr_ip\G
*************************** 1. row ***************************
              zone: zone1
          observer: 10.10.10.2:2882
         cpu_total: 62
          cpu_free: 55
      mem_total_gb: 50
       mem_free_gb: 20
 last_offline_time: 1970-01-01 08:00:00.000000
start_service_time: 2021-12-03 09:54:53.237400
            status: active
         stop_time: 1970-01-01 08:00:00.000000
     build_version: 3.2.1_20211031212624-2c7eade2fd94a4ae32bec1683d1118da9d30cf8b(Oct 31 2021 22:03:03)
*************************** 2. row ***************************
              zone: zone2
          observer: 10.10.10.1:2882
         cpu_total: 62
          cpu_free: 55
      mem_total_gb: 50
       mem_free_gb: 20
 last_offline_time: 1970-01-01 08:00:00.000000
start_service_time: 2021-12-08 11:24:05.281388
            status: active
         stop_time: 1970-01-01 08:00:00.000000
     build_version: 3.2.1_20211031212624-2c7eade2fd94a4ae32bec1683d1118da9d30cf8b(Oct 31 2021 22:03:03)
2 rows in set
The parameters are described as follows:
status indicates the status of the node service. Valid values:
active: indicates that the OBServer node is operating properly.
inactive: indicates that the OBServer node is offline. During an upgrade of the cluster, the value of this parameter is inactive.
deleting: indicates that the OBServer node is being deleted.
start_service_time specifies the time when the node service is started. The value of this parameter should not be the default value 1970-01-01 08:00:00.000000. If the value of this parameter is the default value, the OBServer node has not been recovered.
stop_time specifies the time when the node service was stopped. The value of this parameter should be the default value 1970-01-01 08:00:00.000000. If not, the STOP SERVER operation has been performed for the OBServer node, and you need to perform the START SERVER operation for it.
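The three fields can be combined into a single verdict per node. The following is a minimal sketch under stated assumptions: the function name node_service_state and the labels it prints (serving, stopped, not_recovered) are hypothetical, and the two time arguments are the raw microsecond values from __all_server, where 0 corresponds to the default 1970-01-01 08:00:00.000000 shown above, rather than the usec_to_time output.

```shell
# Hypothetical helper: node_service_state STATUS START_SERVICE_TIME STOP_TIME
# Times are raw microsecond values from __all_server (0 means unset).
node_service_state() {
    status=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    start_us=$2
    stop_us=$3
    if [ "$status" != "active" ]; then
        echo "$status"          # inactive or deleting: report as-is
    elif [ "$stop_us" != "0" ]; then
        echo "stopped"          # STOP SERVER was run; START SERVER is needed
    elif [ "$start_us" = "0" ]; then
        echo "not_recovered"    # the node service has not started yet
    else
        echo "serving"
    fi
}

# Example: an active node with a start time and no stop time.
node_service_state active 1638933845281388 0
```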
Start an OBServer node in the OCP console
Starting an OBServer node in the OCP console is equivalent to starting the observer process and then executing ALTER SYSTEM START SERVER.
Prerequisites
The target cluster can be managed in the current OCP cluster.
If the cluster has not been added to OCP, request the administrator to add the cluster. For more information, see the Take over a cluster topic in the OCP User Guide of the corresponding version.
You have the permissions to manage the cluster.
If you do not have the permissions to manage the cluster, request the administrator to assign the required role. For more information, see the Edit a user topic in the OCP User Guide of the corresponding version.
Procedure
Log on to the OCP console.
The Clusters page automatically appears.
In the Clusters list, find the cluster that includes the OBServer node you want to start and click its name to go to the Overview page of the cluster.
In the OBServers list, click Start in the Actions column of the OBServer node that you want to start.
In the dialog box that appears, click Start.
In the dialog box, you can click View Task to view the progress.
You can also choose System Management > Tasks to view the progress of the task.
When the task status is Completed, and the status of the OBServer node is Running in the OBServers list on the Overview page, the OBServer node is started.