obdiag is a CLI diagnostic tool designed for OceanBase Database. It performs comprehensive scans and collects crucial data, such as logs, SQL audit records, and process stack information of OceanBase. You may deploy your OceanBase cluster by using OceanBase Control Platform (OCP) or OceanBase Deployer (OBD), or manually deploy it based on the OceanBase documentation. Regardless of the deployment mode, you can use obdiag to gather diagnostic information with a few clicks. This powerful tool has now been officially open-sourced, further enhancing its accessibility and usability for developers and database administrators.
The obdiag team has compiled relevant experiences regarding OceanBase diagnosis and tuning and has commenced releasing a series of tutorial articles. This article will give detailed guidance on how to troubleshoot compaction issues in OceanBase.
The storage engine of OceanBase is based on the LSM-tree architecture. Data is divided into static baseline data (stored in an SSTable) and dynamic incremental data (stored in a MemTable). An SSTable is read-only and stored on the disk. After an SSTable is generated, it is not modified. A MemTable can be read and written and is stored in the memory.
Data related to DML operations, such as inserting, updating, and deleting, is first written into the MemTable. After the size of the MemTable reaches the specified threshold, its data is compacted with the baseline data and stored in the SSTable on the disk. This process is called a minor compaction. After OceanBase receives a user query, it queries both the SSTable and the MemTable, merges the query results, and then returns the results to the SQL layer. OceanBase implements both block cache and row cache in the memory to prevent random read of the baseline data.
When the incremental data in a MemTable reaches a specified size, it is compacted with the baseline data and then persisted to the disk. This process is called a major compaction. The system performs a daily major compaction during idle hours every night.
View Name | Description |
---|---|
GV$OB_SSTABLES | Displays the SSTables of tenants. |
CDB_OB_MAJOR_COMPACTION | Displays the global major compaction information of all tenants in the sys tenant. |
GV$OB_COMPACTION_PROGRESS | Displays the progress of OBServer node-level compactions in the current tenant. |
GV$OB_TABLET_COMPACTION_PROGRESS | Displays the progress of a tablet-level compaction. |
GV$OB_TABLET_COMPACTION_HISTORY | Displays the tablet-level compaction history. |
GV$OB_COMPACTION_DIAGNOSE_INFO | Displays the compaction diagnostics. |
GV$OB_COMPACTION_SUGGESTIONS | Displays the suggestions on compactions. |
1. The CDB_OB_MAJOR_COMPACTION view provides the global major compaction information about all tenants in the sys tenant. Use this view to check the compaction status of the current cluster. If the STATUS is COMPACTING, it means that a compaction is currently in progress.
select * from CDB_OB_MAJOR_COMPACTION;
2. The GV$OB_COMPACTION_PROGRESS view displays the OBServer node-level compaction progress of a tenant. Use this view to query the node-level compaction progress, and you can see if there are any ongoing compaction tasks (STATUS="NODE_RUNNING"), as well as the number of unfinished tablets (UNFINISHED_TABLET_COUNT) and other information.
select * from GV$OB_COMPACTION_PROGRESS where STATUS="NODE_RUNNING";
# More specifically:
select * from GV$OB_COMPACTION_PROGRESS where tenant_id = xx and compaction_scn = xxx and STATUS != "FINISH";
3. The GV$OB_TABLET_COMPACTION_PROGRESS view displays the progress of a tablet-level compaction. Use this view to query the tablet-level compaction, and you can see information such as the size of unfinished data (UNFINISHED_DATA_SIZE) and the estimated finish time (ESTIMATED_FINISH_TIME).
select * from GV$OB_TABLET_COMPACTION_PROGRESS;
4. The GV$OB_COMPACTION_DIAGNOSE_INFO view provides the compaction diagnostic information. For tablets that do not appear in the GV$OB_TABLET_COMPACTION_PROGRESS view or haven't been finished for a long time, you can use the GV$OB_COMPACTION_DIAGNOSE_INFO view to see if there are any abnormal situations.
select * from GV$OB_COMPACTION_DIAGNOSE_INFO;
Note:
There is no strict benchmark to determine if a compaction is stuck, but typically you can check if the tenant's STATUS in the CDB_OB_MAJOR_COMPACTION view has been in the COMPACTING state for a long time (the definition of "long time" here depends on the amount of data and operation experience; as a rule of thumb, it could be 36 hours).
Another way to determine this is to check the compaction tasks in GV$OB_COMPACTION_PROGRESS with STATUS="NODE_RUNNING" to see if UNFINISHED_TABLET_COUNT has not been updated for a long time.
Troubleshooting Steps:
First, refer to the GV$OB_COMPACTION_DIAGNOSE_INFO view to collect compaction diagnostic information, then use obdiag to conduct root cause analysis, as explained in the subsequent section of this article.
1. Check for inactive MemTables by using GV$OB_SSTABLES.
select * from GV$OB_SSTABLES where table_type = "MEMTABLE" and is_active = "NO";
2. Use the GV$OB_TABLET_COMPACTION_PROGRESS view to query the tablet-level compaction, and you can see information such as the size of unfinished data (UNFINISHED_DATA_SIZE) and the estimated finish time (ESTIMATED_FINISH_TIME).
select * from GV$OB_TABLET_COMPACTION_PROGRESS;
3. For tablets that do not appear in the GV$OB_TABLET_COMPACTION_PROGRESS view or haven't been finished for a long time, you can use the GV$OB_COMPACTION_DIAGNOSE_INFO view to see if there are any abnormal situations.
select * from GV$OB_COMPACTION_DIAGNOSE_INFO;
In case of abnormal compaction situations, OBServer collects relevant information for diagnostic purposes.
select * from GV$OB_COMPACTION_DIAGNOSE_INFO;
Filter information by severity level using STATUS, from low to high:
● SPECIAL: Display the number of tablets with similar issues.
● RS_UNCOMPACTED: Not necessarily indicating an anomaly. It might mean that there are tablet versions that have not been updated to the current compaction version. Check GV$OB_COMPACTION_PROGRESS to see if it is in a compaction state. If there are still RUNNING compactions, it is likely an issue with the compaction tasks.
● NOT_SCHEDULE: Indicates that compaction has not been scheduled for a long time. It commonly occurs in the follower node due to the synchronization lag of medium info, leading to unscheduled compactions.
● FAILED: Indicates obvious anomalies.
Specific issues are described through the DIAGNOSE_INFO field.
The obdiag tool provides diagnostic information analysis related to OceanBase. Use the obdiag rca command to analyze abnormal scenarios in OceanBase and identify potential root causes of issues.
obdiag rca list # view supported scenarios
obdiag rca run --scene=<scene_name> # analyze the diagnostic information in a specified failure scenario to find the possible cause of the issue
The scene_name parameter specifies the name of the failure scenario:
● disconnection: One-click diagnosis for disconnection.
● major_hold: One-click diagnosis for stuck compactions.
● lock_conflict: One-click diagnosis for lock conflicts.
Example: diagnosis for stuck compactions
obdiag rca run --scene=major_hold
Find the latest version of obdiag for free from the OceanBase Software Center.
Explore comprehensive usage guides and configuration details in the obdiag Documentation.
Review the source code, report issues, and contribute to the project on GitHub.