Stop or start an OBServer node|V3.2.4|OceanBase Database| docs|Distributed Database

Stop or start an OBServer node

Last Updated: 2023-10-27 09:57:43

You can stop or start an OBServer node based on the operating status of the database.

Stop an OBServer node by using an SQL statement

To stop an OBServer node, perform the following actions:

Stop the node service by performing the STOP SERVER operation
Stop the observer process

Stop the node service

Note

Stopping an OBServer node switches all its traffic and partition leaders to other OBServer nodes. Proceed with caution.

You can perform the STOP SERVER operation to stop the node service on an OBServer node. The STOP SERVER operation is usually performed under special O&M scenarios such as server diagnostics or the repair, replacement, or upgrade of hardware.

The purpose of the STOP SERVER operation is to switch the partition leader on the current OBServer node to another OBServer node. If the current OBServer node has no partition leader, the system will internally mark the OBServer node as stopped, so that client requests will no longer be sent to the OBServer node and the OBServer node will not provide services externally.

For more information about the OBServer node status, see View OBServer node status.

Sample statement of the STOP SERVER operation:

obclient> ALTER SYSTEM STOP SERVER 'ip:port' [,'ip:port'...] [ZONE='zone'];

This statement can be executed only in the sys tenant.
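When the statement has to be assembled for several nodes, a small shell helper can reduce quoting mistakes. The following is a minimal sketch and not part of OceanBase: the function name build_stop_server and the addresses are hypothetical, and the helper only prints the statement so that you can review it before you run it in the sys tenant.

```shell
# Hypothetical helper: print an ALTER SYSTEM STOP SERVER statement.
# Usage: build_stop_server ZONE 'ip:port' ['ip:port' ...]
# Pass an empty string as ZONE to omit the optional ZONE clause.
build_stop_server() {
  local zone="$1"
  shift
  local list=""
  local addr
  for addr in "$@"; do
    # Quote each address and separate entries with ", ".
    list="${list}${list:+, }'${addr}'"
  done
  local stmt="ALTER SYSTEM STOP SERVER ${list}"
  if [ -n "$zone" ]; then
    stmt="${stmt} ZONE='${zone}'"
  fi
  printf '%s;\n' "$stmt"
}

# Example with two hypothetical nodes in the same zone.
build_stop_server z1 10.10.10.1:2882 10.10.10.2:2882
```

The example call prints ALTER SYSTEM STOP SERVER '10.10.10.1:2882', '10.10.10.2:2882' ZONE='z1';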

Before you perform the STOP SERVER operation, make sure that:

The enable_auto_leader_switch parameter is set to True.

For more information about the enable_auto_leader_switch parameter, see enable_auto_leader_switch.
For each partition, the replicas on the remaining OBServer nodes still form a majority.

When you stop an OBServer node, note that:

Do not perform the STOP SERVER operation on OBServer nodes in different zones at the same time. You can perform the STOP SERVER operation on multiple OBServer nodes in the same zone.
Do not initiate another STOP SERVER operation if the current STOP SERVER operation has not completed.
If a large number of partitions or partition leaders exist on the OBServer node on which the STOP SERVER operation is performed, the STOP SERVER operation takes a long time. If the operation times out, you can change the SQL timeout period to a larger value.

The SQL timeout period is specified by the ob_query_timeout variable, in microseconds. The default value is 10000000 (10 seconds). For more information about how to set the variable, see Set variables.

For more information about the ob_query_timeout variable, see ob_query_timeout.
If the statement execution fails immediately, the logs may not be synchronized. In this case, check for STOP SERVER records in the __all_rootservice_event_history table. If no STOP SERVER record is found, the failure has occurred before the RootService performs the STOP SERVER operation. If a STOP SERVER record is found, the STOP SERVER operation has failed because the logs are not synchronized.

Here is an example:
```
obclient> SELECT rs_svr_ip FROM oceanbase.__all_rootservice_event_history WHERE event='stop_server';
+----------------+
| rs_svr_ip      |
+----------------+
| xxx.xx.xxx.xx5 |
| xxx.xx.xxx.xx5 |
+----------------+
2 rows in set
```
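Because ob_query_timeout is expressed in microseconds, it is easy to be off by a factor of 1,000 when you raise the timeout as described in the notes above. The following sketch converts seconds to microseconds and prints a session-level SET statement for review. The function names timeout_usec and print_set_timeout are hypothetical, and the 100-second value is only an illustration.

```shell
# Hypothetical helper: convert seconds to microseconds,
# the unit used by the ob_query_timeout variable.
timeout_usec() {
  local seconds="$1"
  echo $((seconds * 1000000))
}

# Print (do not execute) a SET statement for the given timeout in seconds.
print_set_timeout() {
  printf 'SET ob_query_timeout = %s;\n' "$(timeout_usec "$1")"
}

# Example: a 100-second timeout.
print_set_timeout 100
```

The example call prints SET ob_query_timeout = 100000000;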
After the STOP SERVER operation is performed, the status field for the OBServer node in the __all_server table remains active, but the stop_time field changes from 0 to the time when the STOP SERVER operation was performed. A nonzero stop_time indicates that the OBServer node is in the stopped state.

For more information about the OBServer node status, see View OBServer node status.

Here is an example:

Log on to the sys tenant as the root user.
Execute the following statement to stop the node service:
```
obclient> ALTER SYSTEM STOP SERVER "10.10.10.1:2882" zone='z1';
```
If a leader is deployed on the node, the system automatically switches it to a follower after the node service is stopped. Followers on the node can still participate in voting but will not be elected as a leader. Stopping the node service is different from OBServer failure. The service stop time of the node can exceed the permanently offline time specified by server_permanent_offline_time, without causing the node to be actually offline.

Note

server_permanent_offline_time specifies how long heartbeats from a server can be missing before the server is considered permanently offline. The system automatically replenishes the data replicas of a permanently offline server.

Stop the observer process

Notice

If you stop the observer process, the OBServer node becomes unavailable and cannot provide services. Proceed with caution.

Log on to the OBServer node.
Access the /home/admin/oceanbase/bin directory from the command-line interface (CLI).
Run the following command to stop the observer process:
```
kill `pidof observer`
```
Run the following command to check whether the process is stopped:
```
ps -ef | grep observer | grep -v grep
```
If the command returns no output, the process is stopped.
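By default, kill sends SIGTERM, so the observer process may need some time to exit. Instead of rerunning the check by hand, you can poll until the process is gone. The following is a minimal sketch that assumes pidof is installed. The function name wait_for_exit and the 60-second limit in the usage comment are hypothetical.

```shell
# Hypothetical helper: wait until no process named NAME is running.
# Returns 0 once the process is gone, or 1 if it is still running
# after MAX_WAIT seconds.
wait_for_exit() {
  local name="$1"
  local max_wait="$2"
  local waited=0
  # pidof exits with a nonzero status when no matching process exists.
  while pidof "$name" >/dev/null 2>&1; do
    if [ "$waited" -ge "$max_wait" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Example (not run here):
#   kill `pidof observer` && wait_for_exit observer 60
```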

Stop an OBServer node in the OCP console

You can also stop an OBServer node in the OceanBase Cloud Platform (OCP) console by stopping the node service and then the observer process.

Prerequisites

The target cluster can be managed in the current OCP cluster.

If the cluster has not been added to OCP, request the administrator to add the cluster. For more information, see the Take over a cluster topic in the OCP User Guide of the corresponding version.
You have the permissions to manage the cluster.

If you do not have the permissions to manage the cluster, request the administrator to assign the required role. For more information, see the Edit a user topic in the OCP User Guide of the corresponding version.

Stop the node service

Log on to the OCP console.

The Clusters page automatically appears.
In the Clusters list, find the cluster that includes the OBServer node you want to stop and click its name to go to the Overview page of the cluster.
In the OBServers list, click Stop Service in the Actions column of the target OBServer node.
In the dialog box that appears, click Stop Service to stop the node service.

In the dialog box, you can click View Task to view the progress.

You can also choose System Management > Tasks to view the progress of the task.

When the task status is Completed and the status of the OBServer node is Service Stopped in the OBServers list on the Overview page, the OBServer node is stopped.

Stop the observer process

Log on to the OCP console.

The Clusters page automatically appears.
In the Clusters list, find the cluster that includes the OBServer node you want to stop and click its name to go to the Overview page of the cluster.
In the OBServers list, click the More icon in the Actions column and choose Stop Process.
In the dialog box that appears, click Stop Process to stop the observer process.

In the dialog box, you can click View Task to view the progress.

You can also choose System Management > Tasks to view the progress of the task.

When the task status is Completed and the status of the OBServer node is Process Stopped in the OBServers list on the Overview page, the observer process is stopped.

Start an OBServer node by using an SQL statement

To start an OBServer node, perform the following actions:

Start the observer process
Start the node service by performing the START SERVER operation

Start the observer process

In special cases, you can start the observer process by running a command.

Log on to the OBServer node.
Access the /home/admin/oceanbase directory from the command-line interface (CLI).
Run the following command to start the observer process:
```
cd /home/admin/oceanbase/

./bin/observer [Startup parameters]
```
Generally, you need to specify the startup parameters only for the first startup. For subsequent startups, you can run ./bin/observer directly. You can also run ./bin/observer --help to view the details of the observer startup parameters.

Here is the sample command for starting the observer process for the first time:
```
cd /home/admin/oceanbase/bin

./observer -p 2881 -P 2882 -z 'zone_1' -d '/data/1/prod_data/' -r '10.10.10.1:2882:2881;10.10.10.2:2882:2881;10.10.10.3:2882:2881' -l WARN -o 'memory_limit=100GB,datafile_disk_percentage=85'
```
The parameters are described as follows:
- -p specifies the port number for direct connection. The value is 2881 in this example.
- -P specifies the RPC port number. The value is 2882 in this example.
- -z specifies the zone where the OBServer node to be started is located. The value is zone_1 in this example.
- -d specifies the storage directory of data. The value is /data/1/prod_data in this example.
- -r specifies the RootService list of the cluster, with each entry in the ip:rpc_port:sql_port format and entries separated by semicolons (;). Three nodes are listed in this example.
- -l specifies the log level. The value is WARN in this example, which means that logs at the WARN level and higher are printed.
- -o specifies the startup parameters. When the -o parameter is used, note that:
  - Parameter names are case-insensitive. However, we recommend that you set them based on the names in observer.config.bin.
  - A parameter name cannot contain the following special characters: spaces, \r, \n, and \t.
  - An equal sign (=) is required between the name and the value of a parameter.
  - Separate multiple parameters with commas (,).
- In this example, datafile_disk_percentage=85 specifies that data files can occupy up to 85% of the data disk.
- memory_limit=100GB specifies that the observer process can use up to 100 GB of memory.
After you start the process, wait for 5 to 10 seconds and check whether the process is started.
1. Run the following command to check whether the process is running.
  
  For example,
```
[root@xx oceanbase]#ps -ef | grep observer | grep -v grep
root       6136      0 99 11:23 ?        00:00:19 ./bin/observer
```
  If the command returns output as in the example, the process is started. If it returns nothing, the process is not started.
2. Run the following command to check whether port listening is enabled.
  
  For example,
```
[root@xxx oceanbase]#netstat -ntlp | grep `pidof observer`
tcp        0      0 0.0.0.0:2881            0.0.0.0:*               LISTEN      6136/./bin/observer
tcp        0      0 0.0.0.0:2882            0.0.0.0:*               LISTEN      6136/./bin/observer
```
  In the example, 6136 in 6136/./bin/observer indicates the ID of the observer process. The execution results show that port listening is enabled.
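The port check can also be scripted. The following sketch reads netstat -ntlp output from standard input and succeeds only if every port passed as an argument appears in the LISTEN state. It assumes the column layout shown in the example above (local address in column 4, state in column 6). The function name ports_listening is hypothetical.

```shell
# Hypothetical helper: read `netstat -ntlp` output on stdin and return 0
# only if each port given as an argument has a socket in LISTEN state.
ports_listening() {
  local input
  local port
  input="$(cat)"
  for port in "$@"; do
    printf '%s\n' "$input" |
      awk -v suffix=":${port}" '
        # Match lines whose local address (column 4) ends with ":PORT".
        $6 == "LISTEN" && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit(found ? 0 : 1) }
      ' || return 1
  done
  return 0
}

# Example (not run here):
#   netstat -ntlp | ports_listening 2881 2882 && echo "both ports are listening"
```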

Start the node service

Generally, you can perform the START SERVER operation to start the node service. The START SERVER operation is inverse to the STOP SERVER operation. By default, when an OBServer node in the cluster is started, it enters the started state. After you perform the STOP SERVER operation on an OBServer node, you must perform the START SERVER operation to set the OBServer node status to started.

Here is the sample statement for starting the node service:

obclient> ALTER SYSTEM START SERVER 'ip:port' [,'ip:port'...] [ZONE='zone'];

This statement can be executed only in the sys tenant.

Here is an example:

Log on to the sys tenant as the root user.
Execute the following statement to start the node service:
```
obclient> ALTER SYSTEM START SERVER "10.10.10.1:2882";
```

Execute the following statement to check whether the node service is operating properly.

For example,

obclient> SELECT a.zone, concat(a.svr_ip,':', a.svr_port) observer, cpu_total, (cpu_total-cpu_assigned) cpu_free, round(mem_total/1024/1024/1024) mem_total_gb, round((mem_total-mem_assigned)/1024/1024/1024) mem_free_gb, usec_to_time(b.last_offline_time) last_offline_time, usec_to_time(b.start_service_time) start_service_time, b.status, usec_to_time(b.stop_time) stop_time, b.build_version
FROM oceanbase.__all_virtual_server_stat a join oceanbase.__all_server b on (a.svr_ip=b.svr_ip and a.svr_port=b.svr_port)
ORDER BY a.zone, a.svr_ip\G
*************************** 1. row ***************************
              zone: zone1
          observer: 10.10.10.2:2882
         cpu_total: 62
          cpu_free: 55
      mem_total_gb: 50
       mem_free_gb: 20
 last_offline_time: 1970-01-01 08:00:00.000000
start_service_time: 2021-12-03 09:54:53.237400
            status: active
         stop_time: 1970-01-01 08:00:00.000000
     build_version: 3.2.1_20211031212624-2c7eade2fd94a4ae32bec1683d1118da9d30cf8b(Oct 31 2021 22:03:03)
*************************** 2. row ***************************
              zone: zone2
          observer: 10.10.10.1:2882
         cpu_total: 62
          cpu_free: 55
      mem_total_gb: 50
       mem_free_gb: 20
 last_offline_time: 1970-01-01 08:00:00.000000
start_service_time: 2021-12-08 11:24:05.281388
            status: active
         stop_time: 1970-01-01 08:00:00.000000
     build_version: 3.2.1_20211031212624-2c7eade2fd94a4ae32bec1683d1118da9d30cf8b(Oct 31 2021 22:03:03)
2 rows in set

The fields are described as follows:

status indicates the status of the node service. Valid values:
- active: indicates that the OBServer node is operating properly.
- inactive: indicates that the OBServer node is offline. During an upgrade of the cluster, the value of this field is inactive.
- deleting: indicates that the OBServer node is being deleted.
start_service_time indicates the time when the node service was started. For a node that is providing services, the value is not the default value 1970-01-01 08:00:00.000000. If the value is the default value, the OBServer node has not been recovered.
stop_time indicates the time when the node service was stopped. The value should be the default value 1970-01-01 08:00:00.000000. If not, the STOP SERVER operation has been performed for the OBServer node, and you need to perform the START SERVER operation for it.
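The three fields can be combined into a single check. The following sketch applies the rules described above to the values of one row. The function name node_service_state and the labels it prints (stopped, not recovered, serving) are hypothetical, and the default timestamp is assumed to be displayed exactly as in the sample output.

```shell
# Default timestamp as displayed in the sample output (an unset time).
UNSET_TIME='1970-01-01 08:00:00.000000'

# Hypothetical helper: classify a node from the status,
# start_service_time, and stop_time values of one row.
node_service_state() {
  local status="$1"
  local start_service_time="$2"
  local stop_time="$3"
  if [ "$status" != "active" ]; then
    # inactive or deleting: report the status as is.
    echo "$status"
  elif [ "$stop_time" != "$UNSET_TIME" ]; then
    # STOP SERVER has been performed; START SERVER is required.
    echo "stopped"
  elif [ "$start_service_time" = "$UNSET_TIME" ]; then
    echo "not recovered"
  else
    echo "serving"
  fi
}

# Example: the values of row 2 in the sample output.
node_service_state active '2021-12-08 11:24:05.281388' '1970-01-01 08:00:00.000000'
```

The example call prints serving.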

Start an OBServer node in the OCP console

Starting an OBServer node in the OCP console is equivalent to starting the observer process and then executing ALTER SYSTEM START SERVER.

Prerequisites

The target cluster can be managed in the current OCP cluster.

If the cluster has not been added to OCP, request the administrator to add the cluster. For more information, see the Take over a cluster topic in the OCP User Guide of the corresponding version.
You have the permissions to manage the cluster.

If you do not have the permissions to manage the cluster, request the administrator to assign the required role. For more information, see the Edit a user topic in the OCP User Guide of the corresponding version.

Procedure

Log on to the OCP console.

The Clusters page automatically appears.
In the Clusters list, find the cluster that includes the OBServer node you want to start and click its name to go to the Overview page of the cluster.
In the OBServers list, click Start in the Actions column of the OBServer node that you want to start.
In the dialog box that appears, click Start.

In the dialog box, you can click View Task to view the progress.

You can also choose System Management > Tasks to view the progress of the task.

When the task status is Completed, and the status of the OBServer node is Running in the OBServers list on the Overview page, the OBServer node is started.