SQL Audit|V4.1.0| docs|Distributed Database

SQL Audit

Last Updated：2023-07-28 02:55:42 Updated

The GV$OB_SQL_AUDIT view is the most frequently used SQL monitoring view. It records the source, execution status, resource consumption, and wait events of each SQL request, as well as key information such as the SQL text and execution plan. Therefore, it is a great tool for SQL diagnostics.

This view provides information in many fields. The main fields are described as follows:

TENANT_ID: the ID of the tenant that sent the request.
SVR_IP: the IP address of the server that received the request.
CLIENT_IP: the IP address of the client that sent the request.
REQUEST_TIME: the time when the database received the request.
REQUEST_ID: the ID of the request, which identifies one execution of the request and increases over time. External monitoring systems use this field as a cursor to pull the audit information.
IS_INNER_SQL: indicates whether the request is an internal SQL request.
IS_EXECUTOR_RPC: indicates whether the current request is an RPC request.
SQL_ID: the identifier of a specific SQL statement. Unlike the REQUEST_ID field, the SQL_ID field of an SQL statement remains unchanged regardless of multiple executions.
QUERY_SQL: the full-text content of the SQL statement. OceanBase Database allows you to bind an execution plan by using the SQL_ID or SQL_TEXT field.
SID: the identifier of a session. You can associate the SID field with all SQL statements and wait events of the connection. This field corresponds to the ID field in the GV$OB_PROCESSLIST view.
TX_ID: the identifier of a transaction. You can associate the TX_ID field with all SQL statements of the transaction. A large difference between the start time and the end time of two consecutively executed SQL statements in a transaction indicates that the execution link consumes a large amount of time.
TRACE_ID: the trace ID of the SQL request. You can associate the trace ID with other metrics or logs in a query.
IS_HIT_PLAN: indicates whether the plan cache is hit. SQL optimization is time-consuming. To avoid repeated SQL optimization, plans generated are stored in the plan cache and retrieved from the plan cache when the related SQL statement is executed again. If the plan cache is not hit, a hard parsing of the SQL statement is performed. Otherwise, soft parsing is performed. Hard parsing affects the SQL performance and increases time consumption. If the hit rate of the plan cache of a tenant is too low, check whether the plan cache space is too small. A small plan cache causes frequent eviction of plans.
PLAN_ID: the ID of the execution plan. You can query the details and statistics of the execution plan by using this field.
PLAN_HASH: the hash value of the execution plan.
PLAN_TYPE: the type of the execution plan. Valid values: 0, 1, 2, and 3. Values 1, 2, and 3 correspond to local plans, remote plans, and distributed plans. The value 0 indicates no execution plan is available. For example, a COMMIT statement does not have an execution plan.
AFFECTED_ROWS: the number of affected rows.
RETURN_ROWS: the number of returned rows.
RET_CODE: the return code of the execution.
EVENT: the name of the wait event with the longest wait time.
P1TEXT to P3TEXT: the names of the three parameters of the wait event.
P1 to P3: the values of the three parameters of the wait event.
LEVEL: the level of the wait event.
WAIT_CLASS_ID: the ID of the class to which the wait event belongs.
WAIT_CLASS: the name of the class to which the wait event belongs.
STATE: the state of the wait event.
WAIT_TIME_MICRO: the wait time of the wait event in microseconds.
TOTAL_WAIT_TIME_MICRO: the total wait time in microseconds during execution.
TOTAL_WAITS: the total number of waits during the execution.
ELAPSED_TIME: the total time of this execution, from the time when the request was received to the end of the execution, which consists of the following sub-stages:
- NET_TIME: the amount of time elapsed from when the RPC was sent to when the request was received.
- NET_WAIT_TIME: the amount of time elapsed from when the request was received to when it entered the queue.
- QUEUE_TIME: the time that the request waits in the queue, which reflects the request accumulation in the current tenant.
- DECODE_TIME: the time spent decoding the request after it left the queue.
- GET_PLAN_TIME: the time when the execution plan was generated, which reflects the health of the plan cache of the current tenant.
- EXECUTE_TIME: the execution time of the plan.
EXECUTE_TIME: the actual execution time, which is the sum of the CPU time and the value of the TOTAL_WAIT_TIME_MICRO field. The value of the TOTAL_WAIT_TIME_MICRO field is the sum of the values of the following fields: APPLICATION_WAIT_TIME, CONCURRENCY_WAIT_TIME, USER_IO_WAIT_TIME, and SCHEDULE_TIME. The difference between the values of the EXECUTE_TIME and TOTAL_WAIT_TIME_MICRO fields is the value of the CPU_TIME field.
- APPLICATION_WAIT_TIME: the total amount of time spent waiting for events of the application class.
- CONCURRENCY_WAIT_TIME: the total amount of time spent waiting for events of the concurrency class.
- USER_IO_WAIT_TIME: the total amount of time spent waiting for the events of the user_io class.
- SCHEDULE_TIME: the total amount of time spent on events of the schedule class.
Logical reads: During the execution of a request, the database first reads data from caches at all levels. The corresponding fields include ROW_CACHE_HIT, BLOOM_FILTER_CACHE_HIT, and BLOCK_CACHE_HIT. If no cache is hit, a disk read is performed. The field corresponding to a disk read is DISK_READS. You can determine whether to optimize an SQL request based on the number of rows scanned during the execution of the request. The number of rows scanned is calculated based on the number of cache and disk reads. Note that the number of rows scanned does not equal the number of physical reads, because the database first scans caches at all levels.
- ROW_CACHE_HIT: the number of row cache hits.
- BLOOM_FILTER_CACHE_HIT: the number of Bloom filter cache hits.
- BLOCK_CACHE_HIT: the number of block cache hits.
- BLOCK_INDEX_CACHE_HIT: the number of block index cache hits.
- DISK_READS: the number of physical reads.
RETRY_CNT: the number of retries.
TABLE_SCAN: indicates whether the request contains a full table scan.
CONSISTENCY_LEVEL: indicates the consistency level.
MEMSTORE_READ_ROW_COUNT: the number of rows read from MemStore.
SSSTORE_READ_ROW_COUNT: the number of rows read from SSStore.
REQUEST_MEMORY_USED: the memory consumed by the request.

The data of the GV$OB_SQL_AUDIT view is stored in a configurable memory space. On each node, a separate cache is allocated for each tenant. When the memory usage or the number of records reaches the specified threshold, automatic eviction is triggered. The oldest data is evicted first. An experienced database administrator (DBA) often starts troubleshooting SQL response time (RT) jitters by disabling the SQL Audit feature to retain the monitoring records. This prevents the monitoring data from being evicted.

You can use the following parameters to control the behavior of the SQL Audit feature:

enable_sql_audit: specifies whether to enable the SQL Audit feature for all tenants. This cluster parameter takes effect immediately after it is modified.
ob_enable_sql_audit: specifies whether to enable the SQL Audit feature for the current tenant. This tenant-level system variable takes effect immediately after it is modified.
ob_sql_audit_percentage: specifies the percentage of tenant memory that can be used by the SQL Audit feature for the current tenant. This tenant-level system variable takes effect immediately after it is modified. To prevent excessive memory usage, the maximum memory space for the SQL Audit feature is set to 1 GB.

The GV$OB_SQL_AUDIT view allows you to query the SQL execution information at multiple dimensions.

The following sample statement shows how to query SQL statements whose execution time exceeds 100 ms.

obclient> select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql
from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10;
+------------+----------------------------+--------------+------------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| request_id | usec_to_time(request_time) | ELAPSED_TIME | QUEUE_TIME | EXECUTE_TIME | query_sql                                                                                                                                            |
+------------+----------------------------+--------------+------------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1538599798 | 2023-03-08 11:00:46.089711 |       335152 |        462 |       329196 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538601580 | 2023-03-08 11:00:47.411316 |       276913 |       1420 |       275345 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538603976 | 2023-03-08 11:00:49.258464 |       154873 |        461 |       154236 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538613501 | 2023-03-08 11:00:56.123111 |       188973 |        688 |       188144 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538712684 | 2023-03-08 11:02:07.504777 |       288516 |       1137 |       287180 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538743161 | 2023-03-08 11:02:29.135127 |       289585 |         26 |       289380 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538749786 | 2023-03-08 11:02:33.890317 |       294356 |         45 |       294180 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538792259 | 2023-03-08 11:03:04.626596 |       192843 |        128 |       192569 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538799117 | 2023-03-08 11:03:09.567622 |       201594 |         55 |       201388 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
| 1538804299 | 2023-03-08 11:03:13.274090 |       235720 |        241 |       235302 | select request_id,usec_to_time(request_time),ELAPSED_TIME,QUEUE_TIME,EXECUTE_TIME,query_sql from v$OB_SQL_AUDIT where ELAPSED_TIME > 100000 limit 10 |
+------------+----------------------------+--------------+------------+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.28 sec)

The following sample statement shows how to query the average queuing time of the last 1,000 SQL statements.

obclient> select /*+ query_timeout(30000000) */ avg(queue_time) from v$OB_SQL_AUDIT
where request_id > (select max(request_id) from v$OB_SQL_AUDIT) - 1000 ;
+-----------------+
| avg(queue_time) |
+-----------------+
|        350.8740 |
+-----------------+
1 row in set (0.26 sec)

The following sample statement shows how to query the SQL statements that occupy the most resources of a tenant. The SQL statements are sorted in descending order by the execution time × the number of executions. If the CPU resource of the tenant is fully used, you can use this statement to check whether the issue is caused by SQL statements and, if yes, the suspicious SQL statements.

obclient>
select SQL_ID,
avg(ELAPSED_TIME),
avg(QUEUE_TIME),
avg(ROW_CACHE_HIT + BLOOM_FILTER_CACHE_HIT + BLOCK_CACHE_HIT + DISK_READS) avg_logical_read,
avg(execute_time) avg_exec_time,
count(*) cnt,
avg(execute_time - TOTAL_WAIT_TIME_MICRO ) avg_cpu_time,
avg( TOTAL_WAIT_TIME_MICRO ) avg_wait_time,
WAIT_CLASS,
avg(retry_cnt)
from v$OB_SQL_AUDIT
group by 1
order by avg_exec_time * cnt desc limit 10;
+----------------------------------+-------------------+-----------------+------------------+---------------+--------+--------------+---------------+------------+----------------+
| SQL_ID                           | avg(ELAPSED_TIME) | avg(QUEUE_TIME) | avg_logical_read | avg_exec_time | cnt    | avg_cpu_time | avg_wait_time | WAIT_CLASS | avg(retry_cnt) |
+----------------------------------+-------------------+-----------------+------------------+---------------+--------+--------------+---------------+------------+----------------+
| 2705182A6EAB699CEC8E59DA80710B64 |        54976.9269 |         43.8605 |       17664.2727 |    54821.5828 |  11759 |   54821.5828 |        0.0000 | OTHER      |         0.0000 |
| 32AB97A0126F566064F84DDDF4936F82 |         1520.9832 |        380.7903 |          63.7847 |      789.6781 |  63632 |     789.6781 |        0.0000 | OTHER      |         0.0000 |
| A5F514E873BE9D1F9A339D0DA7481D69 |        44032.5553 |         44.5149 |        8943.7834 |    43878.1405 |   1039 |   43878.1405 |        0.0000 | OTHER      |         0.0000 |
| 31FD78420DB07C11C8E3154F1658D237 |      7769857.0000 |         35.7500 |      399020.7500 |  7769682.7500 |      4 | 7769682.7500 |        0.0000 | NETWORK    |         1.0000 |
| C48AEE941D985D8DEB66892228D5E845 |         8528.6227 |          0.0000 |           0.0000 |     8450.4047 |   1601 |    8450.4047 |        0.0000 | OTHER      |         0.0000 |
| 101B7B79DFA9AE801BEE4F1A234AD294 |          158.2296 |         41.7211 |           0.0000 |       46.0345 | 286758 |      46.0345 |        0.0000 | OTHER      |         0.0000 |
| 1D0BA376E273B9D622641124D8C59264 |         1774.5924 |          0.0049 |           0.0000 |     1737.4885 |   5081 |    1737.4885 |        0.0000 | OTHER      |         0.0000 |
| 64CF75576816DB5614F3D5B1F35B1472 |         1801.8767 |        747.0343 |           0.0000 |      827.1674 |  10340 |     827.1674 |        0.0000 | OTHER      |         0.0000 |
| 23D1C653347BA469396896AD9B20DCA1 |         5564.9419 |          0.0000 |           0.0000 |     5478.2228 |   1257 |    5478.2228 |        0.0000 | OTHER      |         0.0000 |
| FA4F493FA5CE2DCC64F51CF3754F96C6 |         2478.3956 |        378.7557 |           3.1040 |     1731.1802 |   3357 |    1731.1802 |        0.0000 | OTHER      |         0.0000 |
+----------------------------------+-------------------+-----------------+------------------+---------------+--------+--------------+---------------+------------+----------------+
10 rows in set (1.34 sec)

Note

When an SQL response time (RT) jitter occurs in a tenant, the CPU resource of the tenant is fully used and the RT of all SQL statements soars. In this case, you must first determine whether the issue is caused by the SQL statements or other problems.
The SQL statement described in the preceding example is quite useful. It aggregates executed SQL statements based on the SQL_ID and sorts them in descending order by the amount of occupied CPU resource, which is the product of avg_exec_time multiplied by cnt. This way, you can check the top SQL statements for exceptions.