You can use wait events for performance analysis. At any given point in time, the worker threads of OceanBase Database are calculating the CPU utilization or waiting for resources, such as I/O resources, network bandwidth, and critical sections. You can identify the system performance bottlenecks, especially the performance of high-concurrency small queries, based on where the wait events occur and the statistics about the events. Wait events are an important tool for performance optimization.
Typical scenarios in which the causes of an exception are located based on wait events:
The CPU utilization stays low regardless of the increasing system workload. This is often caused by concurrency bottlenecks in the system, such as the resource contention in frequently accessed critical sections. You can locate the cause based on the top N wait events that wait the longest time in these critical sections.
In general, the response time (RT) of a business SQL statement is less than 1 ms. However, occasionally, the RT may spike to greater than 100 ms. This is usually because the execution of the SQL statement is stuck somewhere. You can locate the cause based on the recorded information about the wait events during the execution of the SQL statement, especially the details of the wait event that takes the longest time.
The system suddenly stops the execution of an SQL statement. This is usually because some sessions are blocked by factors such as lock contention when querying frequently accessed rows and deadlocks caused by unknown reasons. You can locate the cause based on the recorded details of the current wait event in the session.
At present, OceanBase Database supports more than 200 wait events, such as events related to network, scheduling, critical section, latches, clusters, transaction commit, user I/O, and system I/O. These events are classified into 11 categories. User I/O wait events, such as a page fault wait, are directly triggered by users, while system I/O wait events, such as a major compaction wait, are triggered by background tasks. The monitoring information related to wait events is classified into wait statistics and wait details, which are displayed in four views. Wait statistics views display the number and time of wait events at the session and tenant levels. We recommend that you view the wait statistics on the visual dashboard of an external monitoring system. Wait details views display details of wait events at the session level. One wait details view displays the details of the current wait events in each session or, if no wait events in the current session, the details of the last wait event. The other wait details view displays the details of the last 10 historical wait events in each session.
| View | Description |
|---|---|
GV$SYSTEM_EVENT |
Displays the statistics of tenant-level wait events. |
GV$SESSION_EVENT |
Displays the statistics of session-level wait events. |
GV$SESSION_WAIT |
Displays the details of session-level wait events. |
GV$SESSION_WAIT_HISTORY |
Displays the details of historical session-level wait events. |
The monitoring information related to wait events is classified into wait statistics and wait details for different purposes.
Wait statistics
Wait statistics provide information such as the total number of waits, total wait time, total number of timeouts, and average wait time of all wait events. You can use wait statistics for performance optimization.
The GV$SYSTEM_EVENT view describes the important statistics of a tenant-level wait event in the following fields:
CON_ID: the ID of the tenant.SVR_IP: the IP address of the node.WAIT_CLASS: the class of the wait event.EVENT_ID: the ID of the wait event.EVENT: the name of the wait event.TOTAL_WAITS: the total number of waits for the event.TIME_WAITED: the total wait time of the event.TOTAL_TIMEOUTS: the total number of wait timeouts for the event.AVERAGE_WAIT: the average wait time for the event.
The following sample statement shows how to query the statistics of wait events in a tenant:
obclient>
select tenant_id, tenant_name, sum(total_waits) as total_waits, sum(time_waited_micro) / 1000000 as time_waited_seconds
from v$system_event join DBA_OB_TENANTS
on v$system_event.con_id = DBA_OB_TENANTS.tenant_id
where v$system_event.wait_class <> 'IDLE' and DBA_OB_TENANTS.tenant_type!='META'
group by tenant_name;
+-----------+---------------+-------------+---------------------+
| tenant_id | tenant_name | total_waits | time_waited_seconds |
+-----------+---------------+-------------+---------------------+
| 1 | sys | 14669140176 | 522012421.8556 |
| 1008 | mq_t1 | 3234084699 | 32159617.6839 |
+-----------+---------------+-------------+---------------------+
2 rows in set (0.06 sec)
Wait details
Wait details provide information such as the wait start time, current status, and remaining time of a specific wait event. You can use wait details for troubleshooting. To help locate the causes of a wait, OceanBase Database allows you to pass three parameters to the wait details. For example, to locate the causes of a latch-triggered wait event, you can record in the wait details the latch applier, latch mode, and current latch holder. When the wait event is triggered, these details are displayed.
The GV$SESSION_WAIT view describes the important details of a session-level wait event in the following fields:
SID: the session ID.WAIT_CLASS: the class of the wait event.EVENT: The name of the wait event.P1TEXT,P1,P2TEXT,P2,P3TEXT, andP3: the names and values of the three parameters that you can pass to the wait details.STATE: the current status.WAIT_TIME_MICRO: the wait time that has elapsed, in microseconds.TIME_REMAINING_MICRO: the remaining wait time, in microseconds.
The following sample statement shows how to query the wait details of a session in WAITING state:
obclient> select * from GV$SESSION_WAIT where STATE='WAITING' limit 1\G
*************************** 1. row ***************************
SID: 3222786366
CON_ID: 1
SVR_IP: xx.xx.xx.xx
SVR_PORT: 2882
EVENT: px loop condition wait
P1TEXT: address
P1: 140578612358640
P2TEXT:
P2: 0
P3TEXT:
P3: 0
WAIT_CLASS_ID: 104
WAIT_CLASS#: 4
WAIT_CLASS: CONCURRENCY
STATE: WAITING
WAIT_TIME_MICRO: 926
TIME_REMAINING_MICRO: 74
TIME_SINCE_LAST_WAIT_MICRO: 0
1 row in set (0.06 sec)