SQL execution errors are common in database O&M and can directly impact business operations. There are many reasons for SQL execution errors, such as failed database connection, insufficient user privileges, syntax errors, or data not meeting query conditions.
To help you quickly identify the root cause and efficiently resolve issues, the following sections describe a clear and practical SQL error troubleshooting process. This process provides specific steps to improve issue resolution efficiency and minimize business impact, providing strong support for daily O&M work.
The SQL execution error troubleshooting process is shown in the following figure.
Procedure
When an SQL execution error occurs, you can troubleshoot the issue based on the following procedure.
After an SQL execution error occurs, view the SQL error message. If the error message contains an error code, troubleshoot the issue based on the error code. If the error message does not contain an error code, determine the error type. The error may be an application execution error or an SQL execution error.
If the error is an application execution error, troubleshoot the issue based on Application exceptions--error message does not contain OceanBase error code or Application exceptions--error message contains OceanBase error code.
If the error is an SQL execution error, determine whether the error can be manually reproduced.
If the error cannot be reproduced, view the Knowledge Base for troubleshooting.
If the error can be reproduced, reproduce the error scenario. Connect to the OceanBase cluster through port 2881 or 2883 based on the original scenario and execute the original SQL statement to reproduce the error scenario.
After reproducing the SQL execution error, perform the following steps to collect information for troubleshooting.
Execute the following statement to obtain
trace_id:Notice
You must execute the following statement immediately after the SQL statement that causes the error is executed; otherwise, the
trace_idobtained by executing the following statement does not belong to the SQL statement that causes the error.MySQL modeOracle modeThe statement to obtain
trace_idin MySQL mode is as follows:obclient> SELECT last_trace_id();The statement to obtain
trace_idin Oracle mode is as follows:obclient> SELECT last_trace_id() FROM DUAL;Obtain the host information that executed the SQL statement based on the obtained
trace_id.Generally, an OceanBase cluster is deployed with multiple nodes. You can obtain the node that executed the SQL statement based on
trace_idand then filter logs.MySQL modeOracle modeExecute the following statement in MySQL mode:
obclient> SELECT * FROM oceanbase.GV$OB_SQL_AUDIT WHERE trace_id=last_trace_id;Replace
last_trace_idwithtrace_idobtained in the previous step.Execute the following statement in Oracle mode:
obclient> SELECT * FROM SYS.GV$OB_SQL_AUDIT WHERE trace_id=last_trace_id;Replace
last_trace_idwithtrace_idobtained in the previous step.View the
svr_ipcolumn in theGV$OB_SQL_AUDITview. The value in thesvr_ipcolumn is the host that executed the SQL statement.Use the
sshcommand to log in to the host based on the obtained host information.Go to the log directory.
For example, if the installation directory of OceanBase Database is
/home/admin/oceanbase, the specific log path depends on your actual environment.cd /home/admin/oceanbase/logExecute the following command to filter out the required information from logs.
grep "${trace_id}" observer.loggrep "${trace_id}" observer.log.xxxReplace
${trace_id}withtrace_idobtained in the previous steps. Replaceobserver.log.xxxwith a log file that contains timestamps, such asobserver.log.20210520. xxx must be replaced with the actual timestamp.Based on the information provided in the logs, analyze the error based on the error code or error message.
For more information about logs and error codes, see Overview of logs and Overview of error information.
If the information in the logs is unclear, contact technical support for assistance.
Case studies
The following case studies describe some typical scenarios where SQL execution errors occur.
After the SQL error is reproduced, the error code is returned in the result.
- When a
SELECTstatement contains manyORconditions, or manyAND-connectedINconditions, or manyAND NOTconditions, an error is returned when the statement is executed:-4013, No memory or reach tenant memory limit.
- When a
The error code is contained in the log.
- An error is returned when an SQL statement is executed to process a
longtextfield:ErrorCode=5098.
- An error is returned when SQL is executed:
error 4119 (RPC packet to send too long). You can queryobserver.logby using thetrace_idand find the information:obrpc packet payload execced its limit.
- An error is returned when an SQL statement is executed to process a
Other error messages are contained in the log.
- An error is returned when the filter conditions in an SQL statement contain more than 64 conditions on different fields:
-4002 Invalid argument.
- An error is returned when the filter conditions in an SQL statement contain more than 64 conditions on different fields:
After the SQL error is reproduced, the error code is returned in the result.
- An error is returned when an SQL statement that contains columns
c1,c2, andc3is executed:internal error, error codeORA-00600. The columnsc1andc2hit indexes, andc1orc2contains multipleINexpressions. In addition,c1,c2, andc3form a vector expression.
- An error is returned when an SQL statement that contains columns
The error code is contained in the log.
- An error is returned when an SQL statement is executed:
Timeout, error codeORA-00600.
- An error is returned when an SQL statement is executed: