Troubleshooting procedure
When an exception occurs in the binlog service, you can perform troubleshooting as follows:
Check whether the versions of OceanBase Database and OceanBase Database Proxy (ODP) match that of obbinlog. For the version mapping information of obbinlog Community Edition, see Release notes.
Check the running status of instances in the binlog task. For more information about binlog statements, see Overview.
-- View the running status of instances in the binlog task of the specified tenant.
SHOW BINLOG INSTANCES FOR cluster.tenant;
-- View the binlog files and resource metrics of the primary binlog instance of the specified tenant.
SHOW BINLOG STATUS FOR TENANT cluster.tenant;
- If a binlog instance fails to be created, check whether files in the log directory of the main process of obbinlog contain ERROR or EDIAG information.
- If a binlog instance is created successfully but its running status is abnormal, for example, its convert_running value is No, check whether files in the run/${instance_name}/log directory of the binlog instance contain ERROR or EDIAG information.
If exception information is found in a log file, resolve the issue based on the "FAQ" section of this topic.
If no exception information is found or you cannot resolve the issue based on the "FAQ" section, check for similar questions in the Q&A section of OceanBase Community or submit a question if no similar question is found. When you submit a question, you must provide the following information to facilitate troubleshooting by OceanBase Technical Support:
- Version numbers of OceanBase Database, ODP, and obbinlog
- Status information that you collected during your own troubleshooting
- Original log files containing exception information
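The log checks in the procedure above could be scripted roughly as follows. This is a minimal sketch, not part of obbinlog: the function name scan_binlog_logs is hypothetical, the directory paths in the usage comments are assumptions about your deployment, and the -r option assumes GNU or BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: list every line tagged ERROR or EDIAG under a log directory.
# Prints "file:line:text" for each hit and returns non-zero when nothing matches.
scan_binlog_logs() {
    log_dir="$1"
    # -r walks the directory, -n adds line numbers, -E enables the ERROR|EDIAG alternation.
    grep -rnE 'ERROR|EDIAG' "$log_dir"
}

# Usage (paths are assumptions; adjust them to your installation):
#   scan_binlog_logs ./log                          # main process of obbinlog
#   scan_binlog_logs "./run/${instance_name}/log"   # a single binlog instance
```

The match is case-sensitive, as the procedure names the tags in uppercase. Adding -i would also catch lowercase tags such as the [error] prefix shown in the "FAQ" section.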
FAQ
The "Node resources are insufficient" error is returned when CREATE BINLOG is executed
View the logproxy.log file in the log/ directory and check the resource usage of the binlog cluster. Here is an example:
[error] selection_strategy.cpp(519): [ResourcesFilter] The resource threshold of node [89842bbc393b3d1afea947f3267895d(10.10.10.1:2983)] does not meet requirements, used cpu: 6.7903543 >= 0.8 || used memory: 0.6238438 >= 0.85 || used disk: 0.12204241 >= 0.7000000000000001
In the preceding example, the CPU check used cpu: 6.7903543 >= 0.8 evaluates to true: the system considers CPU utilization to have exceeded the 0.8 threshold, so the binlog task could not be created.
In this case, the binlog server is deployed in a container, where the CPU information it obtains is inaccurate. As a result, the reported CPU utilization is higher than the actual value.
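To see at a glance which of the three checks in such a log line evaluated to true, the line can be piped through a small filter. The following is a sketch only: failed_checks is a hypothetical name, the pattern assumes the "used <name>: <value> >= <limit>" layout shown in the example above, and grep -o assumes GNU or BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: read ResourcesFilter log lines on stdin and print the
# name of every resource whose "used <name>: <value> >= <limit>" check is true.
failed_checks() {
    # grep -o emits each matching fragment on its own line, for example:
    #   used cpu: 6.7903543 >= 0.8
    grep -oE 'used [a-z]+: [0-9.]+ >= [0-9.]+' |
        awk '{
            name = $2
            sub(/:$/, "", name)               # strip the trailing colon from "cpu:"
            if ($3 + 0 >= $5 + 0) print name  # compare numerically, not as strings
        }'
}

# Usage (assumes logproxy.log is in the log/ directory):
#   grep 'ResourcesFilter' log/logproxy.log | failed_checks
```

For the example line above, only the CPU check is true, so the filter prints cpu.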
If you encounter a similar situation, you can modify the threshold in the metadata table config_template as needed or directly disable resource usage check. For more information about metadata tables, see Parameters.
-- Example: Change the CPU utilization threshold to 90% (`0.9`). The value is specified as a percentage.
UPDATE config_template SET value='90' WHERE key_name='node_cpu_limit_threshold_percent';
-- Example: Disable resource usage check.
UPDATE config_template SET value='false' WHERE key_name='enable_resource_check';
After the modification, you need to restart the binlog server and then execute the CREATE BINLOG statement again.
# Restart without using supervisord
./run.sh stop
./run.sh start
# Restart by using supervisord
supervisorctl restart binlog
The "start lsn from all server fail" error is returned in libobcdc and the binlog instance generates no new binlogs
This error usually occurs because the data dictionary has been reclaimed. In this case, you can shorten the interval at which the data dictionary is generated, for example, to 1 hour as in the following statements.
Adjust the data dictionary generation interval for the current tenant (non-sys tenant):
ALTER SYSTEM SET dump_data_dictionary_to_log_interval = '1h';
Adjust the data dictionary generation interval for all tenants by connecting to the sys tenant:
ALTER SYSTEM SET dump_data_dictionary_to_log_interval = '1h' TENANT all_user;
The "Failed to match gtid mapping for last complete transaction with gtid" error is returned after the binlog instance is recovered or restarted
This error usually occurs when the binlog files are inconsistent with the binlog index file. You can perform the following steps to resolve the issue:
Check the integrity of the mysql-bin.index file, namely, whether the number of binlog files and the offset are correct. If not, overwrite this file with the mysql-bin.index.tmp file.
Rename the mysql-bin.index.tmp file.
Obtain the binlog offset from the last row of data in the mysql-bin.index file. The value of the last column is the binlog offset, such as 414233659 in the following example.
/home/ds/oblogproxy/run/ob4g1x9cpve0qo/t4g20eb6dh8xs/data/mysql-bin.001352 1352 {hash:11302655175424709514, inc:938515546, addr:"10.10.10.10:2882", t:1690537006395697}=68058433 {hash:11268779827260349032, inc:938515541, addr:"10.10.10.10:2882", t:1690537006386235}=68058432 1690537005389969 414233659
Back up the latest binlog file and run the split command to split it.
cp mysql-bin.001352 mysql-bin.001352-bak
split -b 414233659 mysql-bin.001352
# By default, the generated files are named `xaa` and `xab`.
mv xaa mysql-bin.001352
Try to recover or restart the binlog instance again.
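The offset lookup and the truncation in the steps above can also be sketched as two small shell functions. This is an illustration under stated assumptions, not an obbinlog tool: the function names are hypothetical, the offset is assumed to be the last whitespace-separated field of the last line of the index file (as in the example row), and head -c is used in place of split to keep only the first offset bytes, which yields the same content as the xaa file.

```shell
#!/bin/sh
# Hypothetical helper: print the binlog offset recorded in the last row of an
# index file, that is, the last whitespace-separated field of its last line.
last_index_offset() {
    tail -n 1 "$1" | awk '{ print $NF }'
}

# Hypothetical helper: back up a binlog file as <file>-bak, then rewrite the
# original so that it contains only its first <offset> bytes.
truncate_binlog() {
    file="$1"
    offset="$2"
    cp "$file" "$file-bak" || return 1
    # Read from the backup so that the source is never truncated mid-copy.
    head -c "$offset" "$file-bak" > "$file"
}

# Usage (file names taken from the example above; run in the data directory):
#   offset=$(last_index_offset mysql-bin.index)
#   truncate_binlog mysql-bin.001352 "$offset"
```

Stop the binlog instance before you modify its files, and keep the -bak copy until the instance has recovered.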