SQL execution errors are common in database O&M and can directly impact business operations. There are many reasons for SQL execution errors, such as failed database connection, insufficient user privileges, syntax errors, or data not meeting query conditions.
To help you quickly identify the root cause and efficiently resolve the issue, the following sections provide a clear and practical SQL error troubleshooting process. This process offers explicit steps to enhance issue resolution efficiency and minimize business impact, thereby supporting your daily O&M work.
Procedure
You can troubleshoot the issue when an SQL execution error occurs, based on the following procedure.
After an SQL execution error occurs, view the SQL error message. If the error message contains an error code, troubleshoot the issue based on the error code. If the error message does not contain an error code, determine the error type. The error may be an application execution error or an SQL execution error:
If the error is an application execution error, search for information in the Knowledge Base to troubleshoot the issue.
If the error is an SQL execution error, check whether the error can be manually reproduced.
If the error cannot be reproduced, search for information in the Knowledge Base based on the SQL statement to troubleshoot the issue.
If the error can be reproduced, reproduce the error scenario. Connect to the OceanBase cluster through port 2881 or 2883 based on the original scenario and execute the original SQL statement to reproduce the error scenario.
After reproducing the SQL execution error, perform the following steps to collect information for troubleshooting.
Execute the following statement to obtain
trace_id:Notice
You must execute the following statement immediately after the SQL error occurs; otherwise,
trace_idof the error SQL cannot be obtained.MySQL modeOracle modeThe following statement is executed to obtain
trace_idin MySQL mode:obclient> SELECT last_trace_id();The following statement is executed to obtain
trace_idin Oracle mode:obclient> SELECT last_trace_id() FROM DUAL;Obtain the host information based on the obtained
trace_id.Generally, an OceanBase cluster is deployed across multiple nodes. You can execute the following statement to obtain the node that actually executed the SQL statement, and then filter logs based on the node.
MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient> SELECT * FROM oceanbase.GV$OB_SQL_AUDIT WHERE trace_id=last_trace_id;In this statement, replace
last_trace_idwithtrace_idobtained in the previous step.Execute the following statement in Oracle mode:
obclient> SELECT * FROM SYS.GV$OB_SQL_AUDIT WHERE trace_id=last_trace_id;In this statement, replace
last_trace_idwithtrace_idobtained in the previous step.Based on the query result of the
GV$OB_SQL_AUDITview, the host corresponding tosvr_ipis the host that actually executed the SQL statement.Use the
sshcommand to log in to the host based on the obtained host information.Go to the log directory.
Assume that the installation directory of OceanBase Database is
/home/admin/oceanbase. The specific log path depends on your actual setup.cd /home/admin/oceanbase/logExecute the following command to filter out related information from logs.
grep "${trace_id}" observer.loggrep "${trace_id}" observer.log.xxxIn these commands, replace
${trace_id}withtrace_idobtained in earlier steps, and replaceobserver.log.xxxwith a log file that contains a timestamp based on the actual situation.Based on the information provided in the logs, analyze the issue based on the error code or error message.
For more information about logs and error codes, see Overview of logging and Overview of error information.
If the information in the logs is unclear, contact technical engineers for assistance.
Case studies
The following case studies describe some typical scenarios where SQL execution errors occur and how to troubleshoot them.
After the SQL error is reproduced, the error code is returned in the result.
- When a
SELECTstatement contains manyORconditions, or manyAND-connectedINconditions, or manyAND NOTconditions, the database returns the error-4013, No memory or reach tenant memory limitwhile executing the statement. For more information, see Troubleshoot SQL errors in the parsing phase, error code -4013, tenant memory full.
- When a
The error code is included in the log.
When you execute an SQL statement to process a
longtextfield, the errorErrorCode=5098is returned. For more information, see SQL error: Varchar value is too long for the column.The SQL execution error is
error 4119 (RPC packet to send too long). At the same time, you can queryobserver.logby using thetrace_idand find the informationobrpc packet payload execced its limit. For more information, see Troubleshoot SQL error -4119, RPC packet to send too long.
Other error messages are included in the log.
- If the filter conditions in an SQL statement involve more than 64 fields, the error
-4002 Invalid argumentis returned. For more information, see Error -4002 when there are more than 64 filter conditions in an SQL statement.
- If the filter conditions in an SQL statement involve more than 64 fields, the error
After the SQL error is reproduced, the error code is returned in the result.
- A query statement that contains columns
c1,c2, andc3is executed. Columnsc1andc2hit indexes, and multipleINexpressions are on columnc1orc2.c1,c2, andc3form a vector expression. The statement returns aninternal errorand the error codeORA-00600. For more information, see SQL error 4016 when multiple IN expressions and vector expressions are included in an SQL statement.
- A query statement that contains columns
The error code is returned after SQL execution, and the error code is combined with the error code information in the log.
- When you execute an SQL statement, it times out and the error code is
ORA-00600. For more information, see Troubleshoot SQL execution timeout error.
- When you execute an SQL statement, it times out and the error code is