This article applies to standalone deployments of obdiag. The obdiag check run command inspects the status of an OceanBase database cluster. It currently analyzes the cluster based on system kernel parameters, internal tables, and similar sources, identifies the causes of existing or potential problems, and provides operation and maintenance suggestions.
check command group
Overview
```shell
# List all inspection packages
obdiag check list
# Full cluster inspection (most common)
obdiag check run
# Column-store POC checks
obdiag check run --cases=column_storage_poc
# OBProxy version checks
obdiag check run --obproxy_cases=proxy
# Deployment env checks (includes docker0 since V4.2.0)
obdiag check run --cases=build_before
# Inspection tasks while running sysbench
obdiag check run --cases=sysbench_run
# Inspection tasks before sysbench
obdiag check run --cases=sysbench_free
# Run a single task
obdiag check run --observer_tasks="ls.paxo_members"
```
Syntax
obdiag check run [options]
Option descriptions:
| Option name | Required | Data type | Default value | Description |
|---|---|---|---|---|
| --cases | No | string | Empty | Name of the OceanBase inspection package to run. If not specified, all tasks in the default package are executed. |
| --obproxy_cases | No | string | Empty | Name of the OBProxy inspection package to run. If not specified, all tasks in the default package are executed. |
| --observer_tasks | No | string | Empty | OceanBase database inspection tasks to run. Separate multiple tasks with semicolons (;). When specified, the OceanBase database part of --cases is ignored. |
| --obproxy_tasks | No | string | Empty | OBProxy inspection tasks to run. Separate multiple tasks with semicolons (;). When specified, the --obproxy_cases option is ignored. |
| --store_dir | No | string | ./check_report/ | Output directory for the inspection report. |
| --report_type | No | string | table | Output format of the inspection report. Supported values: table, json, xml, yaml, html. |
| --env | No | string | Empty | Scenario environment variable, in the format --env key=value. Can be specified multiple times. Used by some inspection scenarios. |
| -c | No | string | ~/.obdiag/config.yml | Configuration file path. |
| --inner_config | No | string | Empty | Configuration of obdiag itself, in the format --inner_config key=value. |
| --config | No | string | Empty | Configuration of the cluster to be diagnosed by obdiag, in the format --config key1=value1 --config key2=value2. The parameters that can be set through this option are listed in obdiag configuration. |
| --config_password | No | string | Empty | Password for the configuration file. Required when obdiag uses an encrypted configuration file. For details, see Configuration File Encryption. |
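The semicolon separator used by --observer_tasks and --obproxy_tasks is also the shell's command separator, so the value has to be quoted. The following is a minimal sketch of building such a value; join_tasks is a hypothetical helper, and "test.test" is a made-up second task name used only for illustration.

```shell
# Join task names with ";" as --observer_tasks and --obproxy_tasks expect.
# The subshell body keeps the IFS change from leaking into the caller.
join_tasks() (
  IFS=';'
  printf '%s\n' "$*"
)

# "ls.paxo_members" appears in the overview; "test.test" is hypothetical.
tasks=$(join_tasks "ls.paxo_members" "test.test")

# Preview the command line; remove "echo" to actually run it.
echo obdiag check run --observer_tasks="$tasks" --report_type=json --store_dir=./check_report/
```

Quoting "$tasks" keeps the semicolon inside the option value. Without the quotes, the shell would treat everything after the first semicolon as a separate command.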
Description
Since V3.3.0, obdiag check supports inspection tasks written as Python scripts, which makes it possible to implement inspection scenarios with complex logic. Inspection tasks and packages are stored in ~/.obdiag/check/tasks/ (this path can be changed through the system configuration check.tasks_base_path).
Since V4.2.0, the check framework includes SSH connection management. You can configure max_connections_per_node (maximum number of connections per node) and idle_timeout (idle timeout, in seconds) under check.ssh_manager in inner_config.yml so that multiple tasks share the SSH connections of each node, which improves efficiency.
Usage example
Method 1: Use without configuration file (out of the box)
- Full inspection (most commonly used)
```shell
obdiag check run \
--config db_host=xx.xx.xx.xx \
--config db_port=xxxx \
--config tenant_sys.user=root@sys \
--config tenant_sys.password=*** \
--config obcluster.servers.global.ssh_username=test \
--config obcluster.servers.global.ssh_password=****** \
--config obcluster.servers.global.home_path=/home/admin/oceanbase \
--config obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.global.ssh_username=test \
--config obproxy.servers.global.ssh_password=****** \
--config obproxy.servers.global.home_path=/home/admin/obproxy
```
- Column-store POC check
```shell
obdiag check run --cases=column_storage_poc \
--config db_host=xx.xx.xx.xx \
--config db_port=xxxx \
--config tenant_sys.user=root@sys \
--config tenant_sys.password=*** \
--config obcluster.servers.global.ssh_username=test \
--config obcluster.servers.global.ssh_password=****** \
--config obcluster.servers.global.home_path=/home/admin/oceanbase \
--config obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.global.ssh_username=test \
--config obproxy.servers.global.ssh_password=****** \
--config obproxy.servers.global.home_path=/home/admin/obproxy
```
- OBProxy version check
```shell
obdiag check run --obproxy_cases=proxy \
--config db_host=xx.xx.xx.xx \
--config db_port=xxxx \
--config tenant_sys.user=root@sys \
--config tenant_sys.password=*** \
--config obcluster.servers.global.ssh_username=test \
--config obcluster.servers.global.ssh_password=****** \
--config obcluster.servers.global.home_path=/home/admin/oceanbase \
--config obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.global.ssh_username=test \
--config obproxy.servers.global.ssh_password=****** \
--config obproxy.servers.global.home_path=/home/admin/obproxy
```
- Deployment environment check
```shell
obdiag check run --cases=build_before \
--config db_host=xx.xx.xx.xx \
--config db_port=xxxx \
--config tenant_sys.user=root@sys \
--config tenant_sys.password=*** \
--config obcluster.servers.global.ssh_username=test \
--config obcluster.servers.global.ssh_password=****** \
--config obcluster.servers.global.home_path=/home/admin/oceanbase \
--config obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.global.ssh_username=test \
--config obproxy.servers.global.ssh_password=****** \
--config obproxy.servers.global.home_path=/home/admin/obproxy
```
- Inspection tasks to run while sysbench is executing
```shell
obdiag check run --cases=sysbench_run \
--config db_host=xx.xx.xx.xx \
--config db_port=xxxx \
--config tenant_sys.user=root@sys \
--config tenant_sys.password=*** \
--config obcluster.servers.global.ssh_username=test \
--config obcluster.servers.global.ssh_password=****** \
--config obcluster.servers.global.home_path=/home/admin/oceanbase \
--config obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.global.ssh_username=test \
--config obproxy.servers.global.ssh_password=****** \
--config obproxy.servers.global.home_path=/home/admin/obproxy
```
- Inspection tasks to run before executing sysbench
```shell
obdiag check run --cases=sysbench_free \
--config db_host=xx.xx.xx.xx \
--config db_port=xxxx \
--config tenant_sys.user=root@sys \
--config tenant_sys.password=*** \
--config obcluster.servers.global.ssh_username=test \
--config obcluster.servers.global.ssh_password=****** \
--config obcluster.servers.global.home_path=/home/admin/oceanbase \
--config obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store \
--config obcluster.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.nodes[0].ip=xx.xx.xx.1 \
--config obproxy.servers.global.ssh_username=test \
--config obproxy.servers.global.ssh_password=****** \
--config obproxy.servers.global.home_path=/home/admin/obproxy
```
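The examples in Method 1 differ only in their first line, so the shared --config flags can be kept in one place. The following is a minimal sketch of that idea; run_check and the OBDIAG variable are hypothetical names introduced here and are not part of obdiag, and the placeholder values are the same ones used in the examples.

```shell
# Set OBDIAG=echo to preview the command line instead of running obdiag.
OBDIAG="${OBDIAG:-obdiag}"

# run_check is a hypothetical helper, not part of obdiag.
# Its arguments (for example --cases=...) are placed before the shared flags.
run_check() {
  "$OBDIAG" check run "$@" \
    --config 'db_host=xx.xx.xx.xx' \
    --config 'db_port=xxxx' \
    --config 'tenant_sys.user=root@sys' \
    --config 'tenant_sys.password=***' \
    --config 'obcluster.servers.global.ssh_username=test' \
    --config 'obcluster.servers.global.ssh_password=******' \
    --config 'obcluster.servers.global.home_path=/home/admin/oceanbase' \
    --config 'obcluster.servers.nodes[0].data_dir=/home/admin/oceanbase/store' \
    --config 'obcluster.servers.nodes[0].redo_dir=/home/admin/oceanbase/store' \
    --config 'obcluster.servers.nodes[0].ip=xx.xx.xx.1' \
    --config 'obproxy.servers.nodes[0].ip=xx.xx.xx.1' \
    --config 'obproxy.servers.global.ssh_username=test' \
    --config 'obproxy.servers.global.ssh_password=******' \
    --config 'obproxy.servers.global.home_path=/home/admin/obproxy'
}
```

With this in place, run_check --cases=build_before or run_check --obproxy_cases=proxy corresponds to the matching example. The values are single-quoted because [0] and * are glob characters: some shells, such as zsh with default settings, report an error when an unquoted pattern matches no file.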
Method 2: Use with configuration file
Description
Make sure that the login information of the nodes to be inspected is configured in the obdiag configuration file config.yml. For details about the configuration, see obdiag configuration.
Example:
```shell
obdiag check run
'cat ./check_report/check_report_2023-10-30-16:15:52.table', export type is table
For more details, please run cmd 'cat ./check_report/check_report_2023-10-30-16:15:52.table'
If you want to view detailed obdiag logs, please run:'obdiag display-trace --trace_id a7674ecb-0d99-36fe-b584-3b707b4647bc'
```
To specify a configuration file:
```shell
obdiag check run -c /path/xxx_config.yml
```
Task writing tutorial
A task is an independent inspection scenario. It can be understood as a dedicated script file, written in YAML, that obdiag recognizes.
A task contains a set of declarations that allow obdiag to perform more specialized inspections of OceanBase.
Before starting to write
Before writing the test.yaml file, you need to confirm the storage location of the test.yaml file.
The test.yaml file must be stored in the directory specified by CHECK.tasks_base_path in the config.yml file. In this directory, check whether your inspection scenario belongs to an existing category. If it does not, create a folder for the new category.
Example:
```shell
# cd ${CHECK.tasks_base_path}/observer, mkdir test, create test.yaml
cd ~/.obdiag/check/tasks/observer
mkdir test
cd test
touch test.yaml
```
Write task script
A task declares the steps to execute during an inspection. Its basic structure is a list, so that one task can provide different steps for different versions, because the steps may differ between versions or a check may not be usable on some of them.
An example of a task file is as follows:
```yaml
info: testinfo
task:
  - version: "[3.1.0,3.2.4]"
    steps:
      {steps_object}
  - version: "[4.2.0.0,4.3.0.0]"
    steps:
      {steps_object}
```
Parameter description:
| Parameter name | Required | Description |
|---|---|---|
| info | Yes | Describes the usage scenario of this YAML file, to ease maintenance. |
| version | No | The applicable version range, written as a string with complete numeric version numbers. OceanBase database V3.x versions have three parts, for example [3.1.1,3.2.0]. OceanBase database V4.x versions have four parts, for example [4.1.0.0,4.2.0.0]. |
| steps | Yes | The steps to execute, as a list. |
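To make the version range notation concrete, the following sketch checks whether a version falls inside a range such as [4.2.0.0,4.3.0.0]. It is only an illustration and not how obdiag itself matches versions: version_in_range is a hypothetical helper, both bounds are assumed to be inclusive, and the comparison relies on sort -V (version sort), which is available in GNU coreutils but not in every sort implementation.

```shell
# version_in_range VERSION RANGE
# Returns 0 when VERSION lies within RANGE, for example "[4.2.0.0,4.3.0.0]".
# Assumption: both bounds are inclusive.
version_in_range() {
  ver=$1
  range=${2#\[}        # strip the leading "["
  range=${range%\]}    # strip the trailing "]"
  low=${range%%,*}
  high=${range##*,}
  # After a version sort, the order must be: low, ver, high.
  sorted=$(printf '%s\n' "$low" "$ver" "$high" | sort -V | tr '\n' ' ')
  [ "$sorted" = "$low $ver $high " ]
}

if version_in_range "4.2.1.0" "[4.2.0.0,4.3.0.0]"; then
  echo "steps for this range apply"
fi
```

A plain string comparison would not be enough here, because 4.10.0.0 sorts before 4.2.0.0 as text but after it as a version.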
Introduction to steps
steps is also a list. Each element describes one execution step.
Each element of steps has the following structure:
| Parameter name | Required | Description |
|---|---|---|
| type | Yes | The execution type. Currently supported: get_system_parameter, ssh, sql. |
| {parameter_name/ssh/sql} | Yes | The parameter for the selected type. Its exact meaning depends on how the code implements that execution type. |
| result | No | A separate object that describes what to do after this step, such as the logic for verifying the result and the message to report when verification fails. For details, see the result & verify function section. |
The following examples cover each type. In these examples, steps: is only a marker and has no practical effect.
- get_system_parameter
```yaml
steps:
  - type: get_system_parameter
    parameter: parameter
    result:
      set_value: servervm.max_map_count
```
- ssh
Execute instructions remotely and obtain the corresponding return value.
```yaml
steps:
  - type: ssh
    ssh: wc -l /proc/${task_OBServer_pid}/maps | awk '{print $1}'
    result:
      set_value: observerMaps
```
- sql
Execute SQL and get the corresponding value.
```yaml
steps:
  - type: sql
    sql: select tenant_name from oceanbase.table_name where tenant_id=${taskTenantId};
    result:
      set_value: tenant_name
```
result & verify function
The result field is also the field that the verify function mainly depends on. It is used to verify the results obtained by the task.
Its format is as follows:
```yaml
result:
  set_value: {set_value}
  verify_type: {verify_type}
  report_type: {report_type}
  verify: {verify}
  err_msg: {err_msg}
```
**Parameter description:**
| Parameter name | Is it required | Description |
|-------------|------|--------------------------------------------------------------------------------------------------|
| set_value | No | Stores the value obtained by the step in a variable that is available throughout the task, such as `set_value: max_map_count`. |
| verify_type | No | Sets the verification method. The default is base, which evaluates the expression in `verify` and yields `true` or `false`. It is generally used together with `verify`. The common comparison types listed below are provided to reduce the amount of writing. |
| verify | No | Used together with verify_type to check whether the execution result meets expectations. If it does not, the message in `err_msg` is output. |
| report_type | No | Sets the alarm level reported when `verify` is `false` in this step. The default alarm level is `critical`. |
| err_msg | No | The message output when `verify` is `false` or when execution is abnormal. It can reference global variables. It is recommended that whenever `verify` is configured, `err_msg` is configured as well. |
**`verify_type` supported types**
Except for base, the types currently supported by verify_type apply only to integer values.
* `between`: Determine whether the value of `set_value` is within the range provided by `verify`.
* `max`: Determine whether the value of `set_value` is less than the value provided by `verify`.
* `min`: Determine whether the value of `set_value` is greater than the value provided by `verify`.
* `equal`: Determine whether the value of `set_value` is equal to the value provided by `verify`.
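The four comparison types can be read as the following shell tests. This is a sketch of the semantics described in the list, not obdiag's implementation: verify_int is a hypothetical helper, the bounds of between are assumed to be inclusive, and the bounds are passed as two separate arguments because the exact format of the `verify` value for between is not specified here.

```shell
# verify_int TYPE VALUE BOUND [UPPER_BOUND]
# Returns 0 when the integer VALUE passes the check, 1 when it fails,
# and 2 for an unknown TYPE.
verify_int() {
  kind=$1
  value=$2
  case "$kind" in
    between) [ "$value" -ge "$3" ] && [ "$value" -le "$4" ] ;;  # inclusive bounds assumed
    max)     [ "$value" -lt "$3" ] ;;   # value must be less than the limit
    min)     [ "$value" -gt "$3" ] ;;   # value must be greater than the limit
    equal)   [ "$value" -eq "$3" ] ;;
    *)       return 2 ;;
  esac
}

# Example with an illustrative value for max_map_count.
max_map_count=655360
if verify_int between "$max_map_count" 65530 655360; then
  echo "true"
else
  echo "false"
fi
```

Note that max and min are strict comparisons as described in the list, so a value equal to the limit fails the check.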
**Notes on base**
The `verify` expression replaces `new_expr` in the following shell template, which is then executed to perform the verification. While writing a `verify` expression, you can therefore test its logic manually on your own machine.
```shell
if ${new_expr}; then
echo "true"
else
echo "false"
fi
```
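A local dry run of a base expression could look like the sketch below. It assumes that the expression is substituted textually into the template, which eval imitates here; dry_run_verify is a hypothetical helper, and max_map_count stands in for a value that a step would have stored through set_value.

```shell
# dry_run_verify EXPR
# Evaluates EXPR the way the template does and prints "true" or "false".
# Only use it on expressions you wrote yourself: eval runs arbitrary code.
dry_run_verify() {
  if eval "$1"; then
    echo "true"
  else
    echo "false"
  fi
}

max_map_count=655360   # stand-in for a value captured by set_value

dry_run_verify '[ "${max_map_count}" -ge 65530 ]'   # prints "true"
dry_run_verify '[ "${max_map_count}" -lt 65530 ]'   # prints "false"
```

Passing the expression in single quotes keeps it unexpanded until eval runs, which mirrors how the text would sit in the YAML file.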
Write package script
There are currently two package scripts, observer_check_package.yaml and obproxy_check_package.yaml, which correspond to the inspection packages of observer and proxy respectively.
A package file is one large dict with the following structure:
```yaml
{case_name}:
  info_en: "" # English description of this package
  info_cn: "" # Chinese description of this package
  tasks:
    {tasks_names} # List of one or more task names
```
After the package script is written, you can view the currently available case list through the obdiag check list command.
Manually updating tasks
To provide more effective inspections, the number and quality of tasks are updated from time to time. You can obtain the latest inspection tasks and packages from the plugins/check/ directory of the obdiag code repository and copy them to the local ~/.obdiag/check/tasks directory (or the directory specified by the system configuration check.tasks_base_path). To update the inspection packages, update the *check_package.yaml files at the same time (for example ~/.obdiag/check/*check_package.yaml). Hot updates can also be performed with obdiag update.
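The manual copy can be scripted along the following lines. This is a sketch under assumptions: sync_check_tasks is a hypothetical helper, and it assumes that plugins/check/ in a local checkout contains a tasks/ directory and the *check_package.yaml files side by side. Adjust the paths if your checkout or your check.tasks_base_path differs.

```shell
# sync_check_tasks SRC [DST]
# SRC: path to plugins/check in a local checkout of the obdiag repository.
# DST: local check directory, defaults to ~/.obdiag/check.
sync_check_tasks() {
  src=$1
  dst=${2:-"$HOME/.obdiag/check"}
  mkdir -p "$dst/tasks" || return 1
  # Copy the contents of tasks/, including subdirectories.
  cp -R "$src/tasks/." "$dst/tasks/" || return 1
  # Keep the package definitions in step with the tasks.
  cp "$src"/*check_package.yaml "$dst/" || return 1
}
```

For example, sync_check_tasks /path/to/obdiag/plugins/check updates the default location. Existing files with the same names are overwritten, so back up any locally modified tasks first.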
