We recommend that you use Tars, an external diagnostic tool, to analyze the problem, or use the (g)v$sql_audit view for troubleshooting.
Troubleshooting process when using (g)v$sql_audit:
In case of online response time (RT) jittering, where the RT is not constantly high, you may consider immediately disabling
sql_audit(settingalter system set ob_enable_sql_audit = 0) once jittering occurs to ensure that the SQL request causing the jitter exists insql_audit.Run an SQL Audit query to check the TOP N requests with the longest RT around the point in time when the jitter occurred. Then, analyze if any abnormal SQL statements exist.
Find the requests with abnormal RT and analyze their record in
sql_auditfor troubleshooting:- Check the
RETRY_CNTfield for the times of retry. A large number indicates possible lock conflicts or follower-to-leader switchover. - Check the
QUEUE_TIMEfield for excessively long queue time. - Check the
GET_PLAN_TIMEfield for the time spent in getting the execution plan. A long period of time often leads toIS_HIT_PLAN = 0, meaning a failure to hit the plan cache. - Check the value of
EXECUTE_TIME. If the value is excessively large, perform the following steps:- Check for time-consuming pending events.
- Analyze and check for an unusually large number of logical reads, which is possible in the case of sudden accesses by a great number of accounts.
- Check the
Logical reads = 2 × ROW_CACHE_HIT
+ 2 × BLOOM_FILTER_CACHE_HIT
+ BLOCK_INDEX_CACHE_HIT
+ BLOCK_CACHE_HIT + DISK_READS
If the request data corresponding to the RT jitter has been evicted from SQL Audit, you need to check the OBServer node for the trace logs of slow queries at the time of jitter and analyze these trace logs (if any).