Alert description
Note
This alert takes effect only for OceanBase clusters of version V4.3.5.3 or later.

This alert monitors whether the actual memory usage percentage of tenant vector index data in an OceanBase cluster exceeds the limit.
Alert principle
The following table lists the key parameters involved in the monitoring logic of this alert.
| Parameter | Value |
|---|---|
| Monitoring Metrics | ob_tenant_vector_mem_used_percent: the actual memory usage percentage of tenant vector index data. An alert is triggered when this value exceeds the threshold. |
| Monitoring Expression | 100 * sum(ob_tenant_vector_mem_used{@LABELS}) by (@GBLABELS) / sum(ob_tenant_vector_mem_limit{@LABELS}) by (@GBLABELS) |
| Metric Collection | |
| Data Source | SQL collection: select /*+ MONITOR_AGENT READ_CONSISTENCY(WEAK) */ tenant_id, svr_ip, svr_port, vector_mem_hold, vector_mem_used, vector_mem_limit from GV$OB_VECTOR_MEMORY. This SQL statement runs in the sys tenant of each cluster. vector_mem_hold, vector_mem_used, and vector_mem_limit are, respectively, the memory held, the memory in use, and the maximum memory that can be allocated. |
| Collection Cycle | 5 Seconds |
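
As a rough illustration of the monitoring expression, the following Python sketch aggregates sampled rows by tenant and computes 100 * sum(used) / sum(limit). The row layout mirrors the columns returned by the collection SQL. The function name and the sample values are made up for illustration, and grouping by tenant_id is an assumption about how @GBLABELS expands for a tenant-scoped alert.

```python
from collections import defaultdict

def vector_mem_used_percent(rows):
    """Return {tenant_id: percent}, computed as 100 * sum(used) / sum(limit)."""
    used = defaultdict(int)
    limit = defaultdict(int)
    for row in rows:
        used[row["tenant_id"]] += row["vector_mem_used"]
        limit[row["tenant_id"]] += row["vector_mem_limit"]
    # Skip tenants whose summed limit is zero to avoid dividing by zero.
    return {t: 100 * used[t] / limit[t] for t in used if limit[t] > 0}

# Hypothetical samples: one row per (tenant, server), sizes in bytes.
rows = [
    {"tenant_id": 1002, "svr_ip": "10.0.0.1", "vector_mem_used": 600, "vector_mem_limit": 1000},
    {"tenant_id": 1002, "svr_ip": "10.0.0.2", "vector_mem_used": 900, "vector_mem_limit": 1000},
    {"tenant_id": 1004, "svr_ip": "10.0.0.1", "vector_mem_used": 100, "vector_mem_limit": 1000},
]
percent = vector_mem_used_percent(rows)
```

Because usage and limits are summed before dividing, the result is a tenant-wide percentage weighted across servers rather than an average of per-server percentages.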
Rule information
| Monitoring Metrics | Default Threshold (Unit: %) | Duration | Detection Cycle | Elimination Cycle |
|---|---|---|---|---|
| ob_tenant_vector_mem_used_percent | This metric has two default thresholds: | 0 Seconds | 10 Seconds | 5 Minutes |
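
The timing values in this rule can be pictured with a small simulation. The sketch below is not OCP's implementation. It assumes that a duration of 0 seconds means the alert fires on the first sample above the threshold, and that the elimination cycle means the alert clears once no breach has been seen for that long. The function name, the sample series, and the threshold of 70 (borrowed from the example message in this document, not a confirmed default) are illustrative.

```python
def evaluate(samples, threshold, elimination_s=300):
    """Replay (timestamp_s, percent) samples and return fire/clear events.

    Assumed semantics: fire on the first breach (duration 0 s), clear once
    elimination_s seconds have passed since the most recent breach.
    """
    events = []
    firing = False
    last_breach = None
    for ts, value in samples:
        if value > threshold:
            last_breach = ts
            if not firing:
                firing = True
                events.append((ts, "fire"))
        elif firing and ts - last_breach >= elimination_s:
            firing = False
            events.append((ts, "clear"))
    return events

# One sample per 10-second detection cycle; a single breach at t = 10 s.
samples = [(t, 75 if t == 10 else 50) for t in range(0, 400, 10)]
events = evaluate(samples, threshold=70)
```

Under these assumptions a single spike fires immediately and stays active for the full five-minute elimination cycle, which is why short bursts can still produce visible alerts.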
Alert information
| Alert Trigger Method | Alert Level | Scope |
|---|---|---|
| Based on monitoring metric expressions | | Tenant |
Alert template
Alert overview
- Template: ${alarm_target} ${alarm_name}
- Example: alarm_template_id=0:ob_cluster_name=obcluster:ob_cluster_id=4:tenant_name=mysql host=xx.xx.xx.xx OceanBase tenant vector index data actual memory usage exceeds the limit
Alert details
- Template: Cluster: ${ob_cluster_name}, Tenant: ${tenant_name}, Alert: ${alarm_name}. The actual memory usage of the vector index data ${value_shown} exceeds ${alarm_threshold} %.
- Example: Cluster: obcluster, Tenant: mysql, Alert: OceanBase tenant vector index data actual memory usage exceeds the limit. The actual memory usage percentage of vector index data is 90%, which exceeds 70%.
Alert recovery
- Template: Alert: ${alarm_name}, OceanBase Tenant Vector Index Actual Memory Usage Exceeds Limit: ${value_shown}
- Example: Alert: OceanBase tenant vector index data actual memory usage exceeds the limit. OceanBase tenant vector index data memory usage: 10%
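
To see how the ${...} placeholders in these templates expand, the alert details template can be rendered with Python's string.Template, which happens to use the same ${name} syntax. This is only an illustration: how OCP itself renders templates is not described here, and the values passed in are taken from the examples in this section.

```python
from string import Template

# Alert details template, copied from this section.
DETAILS = Template(
    "Cluster: ${ob_cluster_name}, Tenant: ${tenant_name}, Alert: ${alarm_name}. "
    "The actual memory usage of the vector index data ${value_shown} "
    "exceeds ${alarm_threshold} %."
)

message = DETAILS.substitute(
    ob_cluster_name="obcluster",
    tenant_name="mysql",
    alarm_name="OceanBase tenant vector index data actual memory usage exceeds the limit",
    value_shown="90%",
    alarm_threshold="70",
)
print(message)
```

substitute raises KeyError if a placeholder has no value, which makes a missing variable easy to spot when a template is customized.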
Impact on the system
- When the memory usage of vector index data reaches or approaches 100%, further writes to the index fail, causing inserts into the related tables or updates of the vector index to fail.
- When the memory usage reaches or approaches 100%, the degradation of vector queries becomes irreversible.
- When the memory usage reaches or approaches 100%, background asynchronous tasks of the vector index may retry repeatedly, consuming CPU and I/O resources and further degrading system performance.
Possible causes
- The tenant uses an HNSW-series index, and the volume of newly written data exceeds the memory capacity supported by the vector index specification.
- The tenant uses an HNSWSQ or HNSWBQ index, and a large amount of incremental data is written in a short period, so quantization compression cannot keep up.
- A large number of vector asynchronous tasks run at the same time, causing a temporary increase in memory usage.
Solution
1. Use the INDEX_VECTOR_MEMORY_ESTIMATE function to estimate whether the memory usage is reasonable. For more information about this function, see INDEX_VECTOR_MEMORY_ESTIMATE.
2. Query the number of concurrent asynchronous tasks in the background. For more information about the related view, see oceanbase.DBA_OB_VECTOR_INDEX_TASKS.
3. Based on the results of steps 1 and 2, decide on the fix. If the tenant's vector memory configuration is unreasonable, expand the memory. If too many asynchronous tasks run concurrently in the background, adjust the concurrency of asynchronous tasks. For the related parameter description, see vector_index_optimization_concurrency.
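
The decision in the last step can be summarized as a small helper. This is a minimal sketch, not an official procedure: the function name, its inputs, and the max_reasonable_tasks cutoff are hypothetical, and the inputs are assumed to have been read manually from the memory estimate and from the task view mentioned above.

```python
def triage(estimated_bytes, limit_bytes, running_tasks, max_reasonable_tasks):
    """Suggest follow-up actions from the results of the first two steps.

    estimated_bytes: memory estimate for the vector indexes (step 1).
    limit_bytes: the tenant's current vector memory limit.
    running_tasks: concurrent background asynchronous tasks (step 2).
    max_reasonable_tasks: hypothetical cutoff chosen by the operator.
    """
    actions = []
    if estimated_bytes > limit_bytes:
        actions.append("expand tenant vector memory")
    if running_tasks > max_reasonable_tasks:
        actions.append(
            "reduce asynchronous task concurrency "
            "(see vector_index_optimization_concurrency)"
        )
    return actions or ["no configuration change indicated; keep monitoring"]

# Hypothetical readings: the estimate exceeds the limit, task count is fine.
suggestions = triage(
    estimated_bytes=12 * 1024**3,
    limit_bytes=10 * 1024**3,
    running_tasks=2,
    max_reasonable_tasks=4,
)
```

Both conditions can hold at once, in which case the helper returns both suggestions and the operator decides which to apply first.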
