OceanBase Database is a native distributed database system, so root cause analysis for faults is complex: many factors may be involved, such as the server environment, parameters, and running load. Troubleshooting requires collecting and analyzing a large amount of information. OceanBase Diagnostic Tool (obdiag) is designed to help you efficiently collect the information scattered across the nodes. The code of obdiag is fully open source and available on GitHub.
What is obdiag?
obdiag is a command-line (CLI) diagnostic tool designed for OceanBase Database. It scans, collects, and analyzes information such as logs, SQL audit records, and process stack information of OceanBase Database. Whether you deployed your OceanBase cluster by using OceanBase Cloud Platform (OCP) or OceanBase Deployer (obd), or manually based on the OceanBase documentation, you can use obdiag to collect diagnostic information with a few commands.
Features
obdiag has the following features:
- Easy deployment: You can deploy obdiag from the RPM package or by using obd with a few commands. You can deploy it on an OBServer node or on any server that can connect to the nodes in the OceanBase cluster.
- Centralized collection: You need to deploy obdiag on only one server rather than on all servers. You then run collection and analysis commands from the server where obdiag is deployed.
- Easy to use: Installation, cluster inspection, information collection, diagnostics, and root cause analysis are all performed by using commands.
- Open source: obdiag is developed in Python, and its source code is fully open source. For more information, see the GitHub code repository.
- High scalability: The inspection, scenario-based information collection, root cause analysis, and information display features of obdiag are all available as add-ons. You can add custom diagnostic scenarios at low cost.
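Because every obdiag operation is a command run from the single server where the tool is deployed, the workflow is easy to script. The following is a minimal Python sketch and is not part of obdiag itself: the helper names `build_obdiag_command` and `run_if_available` are hypothetical, and the `gather log` and `analyze log` subcommands are assumed from the collection and analysis features described above. Confirm the exact subcommands and options with `obdiag --help` for your version.

```python
import shutil
import subprocess
from typing import List, Optional


def build_obdiag_command(action: str, target: str,
                         extra_args: Optional[List[str]] = None) -> List[str]:
    """Build the argument list for `obdiag <action> <target> [extra args]`.

    Nothing is executed here. The subcommand names passed in are
    assumptions that should be checked against `obdiag --help`.
    """
    return ["obdiag", action, target, *(extra_args or [])]


def run_if_available(argv: List[str]) -> Optional[int]:
    """Run the command only if its executable is found on PATH.

    Returns the process exit code, or None when the executable is missing,
    so the script degrades gracefully on hosts without obdiag.
    """
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, check=False).returncode


# Assumed workflow: collect logs first, then analyze them.
WORKFLOW = [
    build_obdiag_command("gather", "log"),
    build_obdiag_command("analyze", "log"),
]
```

For example, `build_obdiag_command("gather", "log")` returns `["obdiag", "gather", "log"]`, and `run_if_available` returns `None` instead of raising an error on a machine where the executable is not installed. Keeping command construction separate from execution makes it possible to review or log the exact command before it runs against a cluster.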
obdiag V3.0.0 allows you to perform the following operations with a few commands:
- Collect logs of OceanBase Database.
- Collect Automatic Workload Repository (AWR) reports (OCP Enterprise Edition required).
- Collect host information.
- Collect the stack information of OceanBase Database.
- Collect the parsed commit logs (clogs) and SSTable logs (slogs).
- Collect the performance information of OceanBase Database.
- Collect the execution details of parallel SQL statements.
- Collect Active Session History (ASH) reports.
- Collect logs of OceanBase Database Proxy (ODP).
- Collect table information.
- Collect cluster parameters and variables.
- Analyze logs of OceanBase Database to identify errors that have occurred.
- Inspect OceanBase clusters for possible or existing exceptions, analyze their causes, and provide O&M suggestions.
- Compare parameters and variables.
- Analyze the space sizes of indexes, including pre-created indexes.
- Collect, inspect, and analyze the diagnostic information and logs of an OceanBase cluster deployed by using Docker.
- Inspect an OceanBase cluster during a stress test using Sysbench, analyze the causes of existing or possible cluster exceptions, and provide O&M suggestions.
- Perform end-to-end diagnostics based on the trace.log file.
- Collect information based on fault scenarios.
- Analyze the root cause based on fault scenarios.
- Upgrade inspection files and collection scenario files through hot updates.
- Display cluster information.
- Analyze logs of an OceanBase cluster and generate a memory analysis report based on the memory usage information in the logs.