Blog

Diagnosis and Tuning with OceanBase: sql_audit Performance View Analysis

ENGINEERING

Diagnosis and Tuning with OceanBase: sql_audit Performance View Analysis

OceanBase

2024-07-04

Engineering

Tutorial|Query Optimization|Sql Optimization

obdiag is a CLI diagnostic tool designed for OceanBase Database. It scans and collects information such as logs, SQL audit records, and process stack information of OceanBase. You may deploy your OceanBase cluster by using OceanBase Cloud Platform (OCP) or OceanBase Deployer (OBD), or manually deploy it based on the OceanBase documentation. Regardless of the deployment mode, you can use obdiag to collect diagnostic information with a few clicks. obdiag has now been officially open-sourced.

The obdiag team has compiled relevant experiences regarding OceanBase diagnosis and tuning and has commenced releasing a series of tutorial articles. This article will dive into the performance view analysis, and give a detailed introduction on the functions and usage instructions of sql_audit.

The SQL audit view: sql_audit

The SQL audit view (gv$sql_audit) is a very useful feature in OceanBase for SQL performance diagnosis. In OceanBase 4.0.0.0 and later versions, it is also known as gv$ob_sql_audit. It can help developers and operators to troubleshoot any SQL statement that has been run in OceanBase more conveniently, regardless of whether the SQL statement was executed successfully. It records detailed information such as client and server IP ports, SQL statements, execution time, execution node, execution plan ID, session ID, execution time, wait time, total time, queue time, related block read information, and execution error messages.

View	Description
(g)v$ob_sql_audit	For OceanBase 4.0.0.0 and later, gv$xx retrieves the records on every server of the tenant, while v$xxx retrieves the records connecting to this particular server.
(g)v$sql_audit	For versions earlier than OceanBase 4.0.0.0, gv$xx retrieves the records on every server of the tenant, while v$xxx retrieves the records connecting to this particular server.

The sql_audit is a view based on the virtual table _all_virtual_sql_audit. Corresponding data is stored in a configurable memory space, capable of recording and displaying the source, execution status, and statistical information of each SQL request. As the memory space is limited, when it reaches a certain limit, it triggers an elimination mechanism.

The sql_audit checks background tasks every 1 second and decides whether to trigger the elimination based on the following criteria:

● The maximum memory usage limit for sql_audit is avail_mem_limit = min (available memory on OBServer * 10%, sql_audit_memory_limit).

● When the avail_mem_limit is within the range of [64M, 100M], elimination is triggered when the memory usage reaches avail_mem_limit - 20M.

● When the avail_mem_limit is within the range of [100M, 5G], elimination is triggered when the memory usage reaches avail_mem_limit * 0.8.

● When the avail_mem_limit is within the range of [5G, +∞), elimination is triggered when the memory usage reaches avail_mem_limit - 1G.

● Elimination is also triggered when the number of sql_audit records exceeds 9 million.

If the elimination is triggered due to reaching the maximum record limit, the elimination will be stopped when the number of records is reduced to 8 million. If the elimination is triggered due to reaching the memory limit, sql_audit will decide whether to stop the elimination based on the following criteria:

● When the avail_mem_limit is within the range of [64M, 100M], elimination will be stopped when the memory usage is reduced to avail_mem_limit - 40M.

● When the avail_mem_limit is within the range of [100M, 5G], elimination will be stopped when the memory usage is reduced to avail_mem_limit * 0.6.

● When the avail_mem_limit is within the range of [5G, +∞], elimination will be stopped when the memory usage is reduced to avail_mem_limit - 2G.

Fields of the sql_audit View

Column	Type	Description
SVR_IP	VARCHAR2(46)	The IP address of the server.
SVR_PORT	NUMBER(38)	The port number of the server.
REQUEST_ID	NUMBER(38)	The ID of the request.
SQL_EXEC_ID	NUMBER(38)	The ID of the current execution.
TRACE_ID	VARCHAR2(128)	The trace ID of the statement.
SID	NUMBER(38)	The ID of the session connection.
CLIENT_IP	VARCHAR2(46)	When the client is connected to the OBServer node by using OceanBase Database Proxy (ODP), this column indicates the IP address of the ODP instance that forwarded the request. When the client is directly connected to the OBServer node, this column indicates the IP address of the client that sent the request.
CLIENT_PORT	NUMBER(38)	The port number of the client that sent the request.
TENANT_ID	NUMBER(38)	The ID of the tenant that sent the request.
EFFECTIVE_TENANT_ID	NUMBER(38)	The ID of the tenant.
TENANT_NAME	VARCHAR2(64)	The name of the tenant that sent the request.
USER_ID	NUMBER(38)	The ID of the user that sent the request.
USER_NAME	VARCHAR2(64)	The name of the user that sent the request.
USER_GROUP	NUMBER(38)	The ID of the resource group to which the user belongs.
USER_CLIENT_IP	VARCHAR2(32)	The IP address of the client that sent the request.
DB_ID	NUMBER(38)	The ID of the database.
DB_NAME	VARCHAR2(128)	The name of the database.
SQL_ID	VARCHAR2(32)	The ID of the SQL statement.
QUERY_SQL	CLOB	The actual SQL statement.
PLAN_ID	NUMBER(38)	The ID of the execution plan.
AFFECTED_ROWS	NUMBER(38)	The number of rows affected.
RETURN_ROWS	NUMBER(38)	The number of rows returned.
PARTITION_CNT	NUMBER(38)	The number of partitions scanned by the request.
RET_CODE	NUMBER(38)	The return code of the execution.
QC_ID	NUMBER(38)	The ID of the scheduler in the parallel execution scenario.
DFO_ID	NUMBER(38)	The ID of the sub-plan being executed in the parallel execution scenario.
SQC_ID	NUMBER(38)	The ID of the local coordinator in the parallel execution scenario.
WORKER_ID	NUMBER(38)	The ID of the worker thread in the parallel execution scenario.
EVENT	VARCHAR2(64)	The name of the wait event with the longest wait time.
P1TEXT	VARCHAR2(64)	The first parameter of the wait event.
P1	NUMBER(38)	The value of the first parameter of the wait event.
P2TEXT	VARCHAR2(64)	The second parameter of the wait event.
P2	NUMBER(38)	The value of the second parameter of the wait event.
P3TEXT	VARCHAR2(64)	The third parameter of the wait event.
P3	NUMBER(38)	The value of the third parameter of the wait event.
LEVEL	NUMBER(38)	The level of the wait event.
WAIT_CLASS_ID	NUMBER(38)	The ID of the class to which the wait event belongs.
WAIT_CLASS#	NUMBER(38)	The subscript of the class to which the wait event belongs.
WAIT_CLASS	VARCHAR2(64)	The name of the class to which the wait event belongs.
STATE	VARCHAR2(19)	The status of the wait event.
WAIT_TIME_MICRO	NUMBER(38)	The wait time of the wait event in microseconds.
TOTAL_WAIT_TIME_MICRO	NUMBER(38)	The total wait time in microseconds during execution.
TOTAL_WAITS	NUMBER(38)	The total number of waits during the execution.
RPC_COUNT	NUMBER(38)	The number of remote procedure calls (RPCs) sent.
PLAN_TYPE	NUMBER(38)	The type of the execution plan.
IS_INNER_SQL	NUMBER(38)	Indicates whether the request is an internal SQL request.
IS_EXECUTOR_RPC	NUMBER(38)	Indicates whether the current request is an RPC request.
IS_HIT_PLAN	NUMBER(38)	Indicates whether the plan cache is hit.
REQUEST_TIME	NUMBER(38)	The time when the execution starts.
ELAPSED_TIME	NUMBER(38)	The amount of time elapsed from when the request was received to when the execution of the request ended.
NET_TIME	NUMBER(38)	The amount of time elapsed from when the RPC was sent to when the request was received.
NET_WAIT_TIME	NUMBER(38)	The amount of time elapsed from when the request was received to when it entered the queue.
QUEUE_TIME	NUMBER(38)	The time that the request waits in the queue.
DECODE_TIME	NUMBER(38)	The time spent on decoding the request after it left the queue.
GET_PLAN_TIME	NUMBER(38)	The amount of time elapsed from when the processing started to when the plan was obtained.
EXECUTE_TIME	NUMBER(38)	The time consumed for plan execution.
APPLICATION_WAIT_TIME	NUMBER(38)	The total amount of time spent on waiting for events of the application class.
CONCURRENCY_WAIT_TIME	NUMBER(38)	The total amount of time spent on waiting for events of the concurrency class.
USER_IO_WAIT_TIME	NUMBER(38)	The total amount of time spent on waiting for events of the user_io class.
SCHEDULE_TIME	NUMBER(38)	The total amount of time spent on events of the schedule class.
ROW_CACHE_HIT	NUMBER(38)	The number of row cache hits.
BLOOM_FILTER_CACHE_HIT	NUMBER(38)	The number of Bloom filter cache hits.
BLOCK_CACHE_HIT	NUMBER(38)	The number of block cache hits.
BLOCK_INDEX_CACHE_HIT	NUMBER(38)	The number of block index cache hits.
DISK_READS	NUMBER(38)	The number of physical reads.
RETRY_CNT	NUMBER(38)	The number of retries.
TABLE_SCAN	NUMBER(38)	Indicates whether the request contains a full table scan.
CONSISTENCY_LEVEL	NUMBER(38)	The consistency level.
MEMSTORE_READ_ROW_COUNT	NUMBER(38)	The number of rows read in the MemStore.
SSSTORE_READ_ROW_COUNT	NUMBER(38)	The number of rows consecutively read in the SSStore.
REQUEST_MEMORY_USED	NUMBER(38)	The memory consumed by the request.
EXPECT_WORKER_COUNT	NUMBER(38)	The number of worker threads expected by the request.
USED_WORKER_COUNT	NUMBER(38)	The number of worker threads used by the request.
SCHED_INFO	VARCHAR2(16384)	The scheduling information about the request.
PS_STMT_ID	NUMBER(38)	The ID of the prepared statement corresponding to the request.
TRANSACTION_HASH	NUMBER(38)	The hash value of the transaction corresponding to the request.
REQUEST_TYPE	NUMBER(38)	The type of the request.
IS_BATCHED_MULTI_STMT	NUMBER(38)	Indicates whether optimization has been performed for batch execution of multiple statements.
OB_TRACE_INFO	VARCHAR2(4096)	The trace information configured by the user.
PLAN_HASH	NUMBER(38)	The hash value of the execution plan.
PARAMS_VALUE	CLOB	The PS-bound parameter value.

Diagnostic cases based on sql_audit

Monitor the top SQL queries in the last 100 seconds.

Statement:

-- For OceanBase 4.0.0.0 and later，replace tenant_name with actual tenant name 
select /*+read_consistency(weak),query_timeout(100000000)*/ SQL_ID,count(1),avg(ELAPSED_TIME),avg(EXECUTE_TIME),avg(QUEUE_TIME),avg(AFFECTED_ROWS),avg(GET_PLAN_TIME) 
from gv$ob_sql_audit  
where time_to_usec(now(6))-request_time <1000000000 
and tenant_name='test_tenant' 
group by SQL_ID order by avg(ELAPSED_TIME)*count(1) desc limit 20;

-- For versions earlier than OceanBase 4.0.0.0，replace tenant_name with actual tenant name 
select /*+read_consistency(weak),query_timeout(100000000)*/ SQL_ID,count(1),avg(ELAPSED_TIME),avg(EXECUTE_TIME),avg(QUEUE_TIME),avg(AFFECTED_ROWS),avg(GET_PLAN_TIME) 
from gv$sql_audit  
where time_to_usec(now(6))-request_time <1000000000 and tenant_name='test_tenant'  
group by SQL_ID order by avg(ELAPSED_TIME)*count(1) desc limit 20 ;

Expected value: Check if the overall SQL time duration, cpu_time, physical and logical reads are reasonable. Generally, single row insert or query with primary key should be within 500us.

Suggestion: Compare the SQL with the table structure to confirm if the execution plan is reasonable and the time duration is normal.

Check if the SQL request traffic in the cluster is evenly distributed.

Approach: You can start by retrieving all the SQL queries in the database within a specific time period, aggregating them at the server level, and then calculating the queries per second (QPS) for each server within that time frame.

Statement:

-- For OceanBase 4.0.0.0 and later，replace t1.tenant_id with actual tenant id
select t2.zone, t1.svr_ip,  count(*) as QPS
from oceanbase.gv$ob_sql_audit t1, oceanbase.__all_server t2
where t1.svr_ip = t2.svr_ip and t1.tenant_id = 1001
and IS_EXECUTOR_RPC = 0 and request_time > (time_to_usec(now()) - 1000000)
and request_time < time_to_usec(now())
group by t1.svr_ip   order by QPS;

-- For versions earlier than OceanBase 4.0.0.0，replace t1.tenant_id with actual tenant id
select t2.zone, t1.svr_ip,  count(*) as QPS
from oceanbase.gv$sql_audit t1, oceanbase.__all_server t2
where t1.svr_ip = t2.svr_ip and t1.tenant_id = 1001
and IS_EXECUTOR_RPC = 0 and request_time > (time_to_usec(now()) - 1000000)
and request_time < time_to_usec(now())
group by t1.svr_ip   order by QPS;

Analyze the SQL with the Top-N number of requests within a specific time frame.

Approach: Retrieve all the SQL queries in the database within the specified time period, aggregate them at the sql_id level, and then calculate the QPS for each sql_id within that time frame. Finally, get the TOP-N values.

Statement:

-- For OceanBase 4.0.0.0 and later，replace tenant_id with actual tenant id
select SQL_ID, count(*) as QPS, avg(t1.elapsed_time) RT
from oceanbase.gv$ob_sql_audit t1
where   tenant_id = 1001       and IS_EXECUTOR_RPC = 0
and request_time > (time_to_usec(now()) - 10000000)
and request_time < time_to_usec(now())
group by t1.sql_id order by QPS desc limit 10;

-- For versions earlier than OceanBase 4.0.0.0，replace tenant_id with actual tenant id
select SQL_ID, count(*) as QPS, avg(t1.elapsed_time) RT
from oceanbase.gv$sql_audit t1
where   tenant_id = 1001       and IS_EXECUTOR_RPC = 0
and request_time > (time_to_usec(now()) - 10000000)
and request_time < time_to_usec(now())
group by t1.sql_id order by QPS desc limit 10;

Identify the SQL with the highest CPU consumption.

Approach: The CPU usage time can be calculated as elapsed_time - queue_time, as the queuing process does not consume CPU. This can be particularly useful for identifying SQL queries that contribute to high CPU usage.

Statement:

-- For OceanBase 4.0.0.0 and later，replace tenant_id with actual tenant id
select sql_id, substr(query_sql, 1, 20) as query_sql,       
    sum(elapsed_time - queue_time) sum_t, count(*) cnt,       
    avg(get_plan_time), avg(execute_time)    
from oceanbase.gv$ob_sql_audit     
where   tenant_id = 1001             
    and request_time > (time_to_usec(now()) - 10000000)            
    and request_time < time_to_usec(now())
group by sql_id order by sum_t desc   limit 10;

-- For versions earlier than OceanBase 4.0.0.0，replace tenant_id with actual tenant id
select sql_id, substr(query_sql, 1, 20) as query_sql,       
    sum(elapsed_time - queue_time) sum_t, count(*) cnt,       
    avg(get_plan_time), avg(execute_time)    
from oceanbase.gv$sql_audit     
where   tenant_id = 1001             
    and request_time > (time_to_usec(now()) - 10000000)            
    and request_time < time_to_usec(now())
group by sql_id order by sum_t desc   limit 10;

Check if there is a large number of unreasonable remote execution.

Approach: The "plan_type" field in sql_audit indicates the type of execution plan for the SQL query:

● plan_type=1: Local execution plan, which offers the best performance.

● plan_type=2: Remote execution plan.

● plan_type=3: Distributed execution plan, which includes both local and remote execution plans.

In general, when there is a significant amount of remote execution, it may lead to leader switching or inaccurate routing of the the proxy client.

Statement:

-- For OceanBase 4.0.0.0 and later，replace tenant_id with actual tenant id
select count(*), plan_type    
from oceanbase.gv$ob_sql_audit     
where tenant_id = 1001                
    and IS_EXECUTOR_RPC = 0                
    and request_time > (time_to_usec(now()) - 10000000)               
    and request_time < time_to_usec(now()) 
group by plan_type ;

-- For versions earlier than OceanBase 4.0.0.0，replace tenant_id with actual tenant id
select count(*), plan_type    
from oceanbase.gv$sql_audit     
where tenant_id = 1001                
    and IS_EXECUTOR_RPC = 0                
    and request_time > (time_to_usec(now()) - 10000000)               
    and request_time < time_to_usec(now()) 
group by plan_type ;

Query SQL statements that perform full table scans.

Approach: The table_scan field in sql_audit identifies whether a statement performs a full table scan. Further analysis can be conducted to determine if adding indexes can prevent full table scans for these SQL statements.

Statement:

-- For OceanBase 4.0.0.0 and later，replace tenant_id with actual tenant id
select query_sql 
from oceanbase.gv$ob_sql_audit 
where table_scan = 1 and tenant_id = 1001 
group by sql_id;

-- For versions earlier than OceanBase 4.0.0.0，replace tenant_id with actual tenant id
select query_sql 
from oceanbase.gv$sql_audit 
where table_scan = 1 and tenant_id = 1001 
group by sql_id;

Analyze SQL queries that cause an abrupt RT jitter.

Approach: If RT jitter occurs online, but the RT is not consistently high, consider disabling sql_audit immediately after the jitter occurs (alter system set ob_enable_sql_audit = 0) to ensure the presence of the target SQL in sql_audit. Then identify the abnormal SQL by analyzing the SQL queries with the Top-N number of requests.

Upon identifying the RT abnormal requests in sql_audit, the following aspects can be further analyzed:

● Check the retry count (RETRY_CNT). If the count is high, there may be locking conflicts or leader switches.

● Check if the queue time value is too high (QUEUE_TIME).

● Check the time taken to acquire the execution plan (GET_PLAN_TIME). If the time is long, usually IS_HIT_PLAN = 0, indicating a plan cache miss.

● Check the execution time (EXECUTE_TIME). If the value is abnormally high, check if there are long wait events or if the number of rows read is high using SSSTORE_READ_ROW_COUNT and MEMSTORE_READ_ROW_COUNT.

Tutorial Query Optimization Sql Optimization

OceanBase

Technology Trend | Product Interpretation | User Practice

Content