Arbitration server process startup failed and NIC missing error returned
Symptom
The following error message is returned upon an attempt to start the arbitration server process on the command-line interface (CLI):
[2023-11-08 11:01:54.555369] ERROR issue_dba_error (ob_log.cpp:1841) [2649528][observer][T0][Y0-0000000000000000-0-0] [lt=3][errcode=-4388] Unexpected internal error happen, please checkout the internal errcode(errcode=-4182, file="ob_net_util.cpp", line_no=212, info="can not find ifname by local ip")
[2023-11-08 11:01:54.555407] EDIAG [LIB] get_ifname_by_addr (ob_net_util.cpp:212) [2649528][observer][T0][Y0-0000000000000000-0-0] [lt=37][errcode=-4182] can not find ifname by local ip(local_ip=0.0.0.0) BACKTRACE:0x10a24a3c 0x5e92550 0xb0a0c38 0x5e9228c 0x5e8b32c 0x10f7503c 0x10f74ac8 0x8e3a8bc 0x8e31a30 0x5e8c4c4 0xffffaad3485c 0x453d958
[2023-11-08 11:01:54.555846] ERROR init_config (ob_server.cpp:1840) [2649528][observer][T0][Y0-0000000000000000-0-0] [lt=434][errcode=-4393] observer start process failure(local_ip is not a valid IP for this machine, local_ip="0.0.0.0")
Possible causes
In the error message, local_ip is 0.0.0.0. A possible cause is as follows:
The first time the arbitration server process was started on the CLI, neither the network interface card (NIC) name nor the IP address was specified by using the -i eth0 or -I 10.xx.xx.xx parameter. The arbitration server tried to guess devname, failed, and fell back to the default, bond0. Because the -I parameter was not specified either, local_ip took its default value, an empty string that is treated as 0.0.0.0, and this address was stored in the configuration file.
On later startup attempts, the error is still returned because of the incorrect IP address stored earlier, even if you specify the NIC name. The arbitration server always verifies the validity of local_ip: the verification succeeds only if the server can obtain the NIC name from this address. No NIC name can be obtained from local_ip=0.0.0.0, so the verification fails and the arbitration server process does not start.
Solution
Solution 1: Delete the persistent file in the /etc directory and specify -i eth0 on the CLI to restart the arbitration server process.
Solution 2: Specify -I 10.xx.xx.xx on the CLI to restart the arbitration server process without deleting the persistent file.
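Before you pass an address with -I, it can help to confirm that the address is actually bound to a NIC on the host, because that is the check the startup verification performs. The following is a minimal sketch, assuming a Linux host with the iproute2 ip command. ifname_for_ip is a hypothetical helper name used only for illustration and is not part of the arbitration server.

```shell
# ifname_for_ip is a hypothetical helper, not part of the arbitration server.
# It reads `ip -o -4 addr show` style lines on stdin and prints the name of
# the interface that owns the given IPv4 address. It returns non-zero when
# no interface matches, which is the situation for local_ip=0.0.0.0.
ifname_for_ip() {
  awk -v ip="$1" '
    { split($4, addr, "/") }                      # field 4 looks like 10.0.0.5/24
    addr[1] == ip { print $2; found = 1; exit }   # field 2 is the interface name
    END { exit !found }
  '
}

# Typical use on the host that runs the arbitration server:
#   ip -o -4 addr show | ifname_for_ip 10.xx.xx.xx
```

If the helper prints a NIC name, that name and address are reasonable candidates for -i and -I. If it prints nothing, the address is not configured on any local interface.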
Arbitration server process startup failed and killed error returned
Symptom and possible causes
If the arbitration server process fails to start and the system returns the killed error each time you run the startup command, the value of the vm.min_free_kbytes parameter may be invalid. The kernel parameter vm.min_free_kbytes specifies the minimum amount of free memory that the system keeps in reserve to avoid memory fragmentation. If starting a process would cause the available memory on the OBServer node to drop below this minimum, the system kills the process.
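To judge whether the reserve is large relative to the node's physical memory, you can compare the two values. The following is a rough sketch, assuming a Linux host that exposes /proc/meminfo. mem_total_kb and reserved_pct are hypothetical helper names, and the sketch does not define what percentage counts as too high.

```shell
# Both helpers are hypothetical names used only for this sketch.

# Extract MemTotal (in KB) from /proc/meminfo-style text on stdin.
mem_total_kb() {
  awk '/^MemTotal:/ { print $2 }'
}

# Print the reserve ($1, in KB) as a whole-number percentage of
# total memory ($2, in KB). Integer division truncates the result.
reserved_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Typical use on the node:
#   total=$(mem_total_kb < /proc/meminfo)
#   reserve=$(cat /proc/sys/vm/min_free_kbytes)
#   echo "reserved: $(reserved_pct "$reserve" "$total")% of physical memory"
```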
Solution
Decrease the value of the vm.min_free_kbytes parameter. When you deploy the arbitration service on an OBServer node with no more than 8 GB of physical memory, we recommend that you retain the default value of this parameter.
To modify this parameter, perform the following steps:
1. Log in to the OBServer node where the arbitration service resides as the root user.
2. Query the value of the min_free_kbytes parameter. The value is in KB.
   [root@xxx admin]# cat /proc/sys/vm/min_free_kbytes
3. Change the value based on your business needs.
   For example, you can change the value to 262144, which indicates 256 MB.
   [root@xxx admin]# echo 262144 > /proc/sys/vm/min_free_kbytes
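Because the parameter is expressed in KB, a target given in MB must be multiplied by 1024 before it is written. The following is a small sketch of that conversion, using the 256 MB figure from the example above. mb_to_kb is a hypothetical helper name, and the sysctl line is shown only as a comment because it requires root privileges.

```shell
# mb_to_kb is a hypothetical helper: convert a size in MB to the KB
# value that min_free_kbytes expects.
mb_to_kb() {
  echo $(( $1 * 1024 ))
}

# Illustrative target of 256 MB, matching the example above.
target_kb=$(mb_to_kb 256)
echo "vm.min_free_kbytes=${target_kb}"

# To apply the value as root, as an alternative to echoing into /proc:
#   sysctl -w vm.min_free_kbytes="${target_kb}"
```

A value written this way takes effect immediately but is generally not retained after a reboot. On most Linux distributions, adding the printed line to /etc/sysctl.conf keeps the setting across restarts.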
References
For more information about O&M operations related to the arbitration service, see Overview of arbitration services.