Description
This alert is triggered when the number of active sessions of an OceanBase cluster exceeds the threshold.
Principle
The following table describes the key parameters that are involved in the monitoring and alerting logic.
| Parameter | Value |
|---|---|
| Metric | active_session
|
| Source | select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK)*/ case when cnt is null then 0 else cnt end as cnt, tenant_name, tenant_id from (select __all_tenant.tenant_name,__all_tenant.tenant_id, cnt from __all_tenant left join (select count(`state`='ACTIVE' OR NULL) as cnt, tenant as tenant_name from __all_virtual_processlist where svr_ip = ? and svr_port = ? group by tenant) t1 on __all_tenant.tenant_name = t1.tenant_name) t2;
|
| Collected metric | ob_active_session_num |
| Metric expression | sum(ob_active_session_num{@LABELS}) by (@GBLABELS) |
| Collection cycle | 1 minute |
Alert rule
| Metric | Default threshold | Duration | Alert cycle | Elimination cycle |
|---|---|---|---|---|
| active_session | 10000 | 0 seconds | 60 seconds | 5 minutes |
Alert information
| Trigger method | Alert level | Scope |
|---|---|---|
| Based on the expression of the metric | Critical | Cluster |
Alert templates
Overview: ${alarm_target} ${alarm_name}
Details: Cluster: ${ob_cluster_name}, Alert: ${alarm_name}, The number of active sessions of an OceanBase cluster is ${value}, exceeding the threshold of ${alarm_threshold}.
Overview example: ob_cluster=obcluster1-1 Excessive number of active sessions of an OceanBase cluster
Details example: Cluster: obcluster1-1, Alert: Excessive number of active sessions of an OceanBase cluster. The number of active sessions of an OceanBase cluster is 11000, exceeding the threshold of 10000.
Impact on the system
The number of active sessions indicates the number of services provided for the client. This describes how busy the system is. The threshold of active sessions is affected by the available resources of OceanBase Database tenants in the cluster, especially the available CPU resources. If resources are sufficient, you can set the threshold to a relatively high value to meet your business requirements. The threshold is also affected by the number of session connections. Pay attention to this number in the system when you set the threshold.
If the number of active sessions exceeds the threshold, it may lead to the following results:
The CPU utilization exceeds the limit, and therefore services fail to run.
The OceanBase cluster is busy. If it fails to process a large amount of data, the memory utilization may reach the upper limit.
The excessive number of active sessions is not the root cause of the alert. You also need to analyze other related alert metrics such as CPU and memory to locate the root cause of the alert.
Possible causes
The number of active sessions may exceed the threshold on the following occasions:
The available CPU resources of an OceanBase Database tenant are insufficient. Then, requests cannot be processed by the tenant in time. As a result, active sessions accumulate.
The available memory resources of an OceanBase Database tenant are insufficient, which contributes to delay in request processing by the OceanBase cluster.
If the threshold of active sessions is set to a relatively low value, alerts can be frequently triggered during peak hours.
Suggested solutions
Check whether the tenant_cpu_percent_over_threshold or tenant_memstore_percent_over_threshold alert has been triggered. If so, the alert in this topic is triggered for lack of resources. Then, you can perform steps in the Suggested solutions section of the linked topics and check whether this alert is automatically eliminated.
Check whether spikes occur in peak hours.
You can view the metric trend charts on the Database Performance tab of the
Performance Monitoring page. For more information, see Performance monitoring.If all metrics are steadily higher than usual in the alert period, the period is business peak hours.
In this case, you can ignore the alert and raise the threshold.
If the value of active_session suddenly increases at specific time points but other metrics remain normal in the alert period, spikes of active sessions may occur.
In this case, you can modify the value of
Duration when specifyingTrigger Conditions for the alert.