The following table lists the common performance monitoring metrics on OCP.

Note: This section uses the monitoring metrics of OCP V2.4.4 as an example. For the monitoring metrics of other OCP versions, see the Monitoring Metrics section in the OCP User Guide of the corresponding version.

| Metric group | Metric name | Description | Calculation expression |
|---|---|---|---|
| CPU utilization | cpu_percent | The CPU utilization. | 100 * (1 - sum(rate(node_cpu_seconds_total{mode="idle", @LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(node_cpu_seconds_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| IO throughput rate | read | The amount of data read per second, in MB. | avg(rate(node_disk_read_bytes_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / 1048576 |
| IO throughput rate | write | The amount of data written per second, in MB. | avg(rate(node_disk_written_bytes_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / 1048576 |
| I/O time | read | The average time spent on reads per second, in microseconds. | 1000000 * avg(rate(node_disk_read_time_seconds_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| I/O time | write | The average time spent on writes per second, in microseconds. | 1000000 * avg(rate(node_disk_write_time_seconds_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| IOPS | read | The number of reads per second. | avg(rate(node_disk_reads_completed_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| IOPS | write | The number of writes per second. | avg(rate(node_disk_writes_completed_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Linux system load | load1 | The average system load in the last 1 minute. | avg(node_load1{@LABELS}) by (@GBLABELS) |
| Linux system load | load15 | The average system load in the last 15 minutes. | avg(node_load15{@LABELS}) by (@GBLABELS) |
| Linux system load | load5 | The average system load in the last 5 minutes. | avg(node_load5{@LABELS}) by (@GBLABELS) |
| MEMStore | active | The total size of active memstores. | sum(sysstat_value{metric_group="sysstat",stat_id="130000",@LABELS}) by (@GBLABELS) / 1048576 |
| MEMStore | limit | The upper limit of the total size of all memstores. | sum(sysstat_value{metric_group="sysstat",stat_id="130004",@LABELS}) by (@GBLABELS) / 1048576 |
| MEMStore | total | The total memstore size. | sum(sysstat_value{metric_group="sysstat",stat_id="130001",@LABELS}) by (@GBLABELS) / 1048576 |
| MEMStore | trigger | The threshold that triggers a major compaction. | sum(sysstat_value{metric_group="sysstat",stat_id="130002",@LABELS}) by (@GBLABELS) / 1048576 |
| QPS | all | The number of statements processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40004",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| QPS | delete | The number of Delete statements processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| QPS | insert | The number of Insert statements processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| QPS | replace | The number of Replace statements processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40004",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| QPS | select | The number of Select statements processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| QPS | update | The number of Update statements processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| SQL response time | all | The average processing time of each SQL statement on the server. | (sum(rate(sysstat_value{metric_group="sysstat",stat_id="40003",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40005",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40007",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40009",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) /(sum(rate(sysstat_value{metric_group="sysstat",stat_id="40002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40004",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| SQL execution plan category | distributed | The number of distributed execution plans processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40012",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| SQL execution plan category | local | The number of local execution plans processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40010",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| SQL execution plan category | remote | The number of remote execution plans processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40011",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| TPS | trans_count | The number of transactions processed per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="30005",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Event waiting_time | wait_time | The average wait time of a wait event. | sum(rate(time_wait{metric_group="waitevent",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(total_waits{metric_group="waitevent",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Event waiting_number of times | wait_count | The average number of wait events per second. | sum(rate(total_waits{metric_group="waitevent",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Transaction response time | trans_time | The average processing time of each transaction on the server. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="30006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="30005",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Number of transaction logs | log_count | The number of transaction logs submitted per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="30002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Transaction log time-consuming | sync_time | The average time consumed by each synchronization of transaction logs over the network. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="30000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="30001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Transaction log time-consuming | write_disk | The average time required for writing transaction logs to the disk each time. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="80041",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="80040",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Transaction log volume | log_size | The total size of transaction logs submitted per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="80057",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Memory | buffers | The size of the kernel buffer cache. | avg(node_memory_Buffers_bytes{@LABELS}) by (@GBLABELS) / 1073741824 |
| Memory | free | The size of the available physical memory. | avg(node_memory_MemFree_bytes{@LABELS}) by (@GBLABELS) / 1073741824 |
| Memory | used | The size of the physical memory used. | (avg(node_memory_MemTotal_bytes{@LABELS}) by (@GBLABELS) - avg(node_memory_MemFree_bytes{@LABELS}) by (@GBLABELS) - avg(node_memory_Cached_bytes{@LABELS}) by (@GBLABELS) - avg(node_memory_Buffers_bytes{@LABELS}) by (@GBLABELS)) / 1073741824 |
| Response time | all | The average processing time of each SQL statement on the server. | (sum(rate(sysstat_value{metric_group="sysstat",stat_id="40003",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40005",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40007",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40009",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) /(sum(rate(sysstat_value{metric_group="sysstat",stat_id="40002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40004",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| Response time | delete | The average processing time of each Delete statement on the server. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40009",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="40008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Response time | insert | The average processing time of each Insert statement on the server. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40003",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="40002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Response time | replace | The average processing time of each Replace statement on the server. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40005",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="40004",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Response time | select | The average processing time of each Select statement on the server. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Response time | update | The average processing time of each Update statement on the server. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="40007",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="40006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Capacity_Number of Partitions | partition_count | The number of partitions. | sum(partition_count{metric_group="all_meta_table",@LABELS}) by (@GBLABELS) |
| Capacity_Number of tables | table_count | The number of tables. | max(table_count{metric_group="all_table",@LABELS}) by (@GBLABELS) |
| Query response time | all | The average processing time of each SQL statement on the server. | (sum(rate(sysstat_value{metric_group="sysstat",stat_id="40003",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40005",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40007",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40009",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) /(sum(rate(sysstat_value{metric_group="sysstat",stat_id="40002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40004",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40006",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| Number of active sessions | active_session | The number of active sessions. | sum(active_sessions{metric_group="all_virtual_processlist",@LABELS}) by (@GBLABELS) |
| Waiting for events | wait_count | The average number of wait events per second. | sum(rate(total_waits{metric_group="waitevent",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Waiting for event time | wait_time | The average wait time of a wait event. | sum(rate(time_wait{metric_group="waitevent",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(total_waits{metric_group="waitevent",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Wait for lock time | wait_time | The average wait time of each write lock. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="60023",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / (sum(rate(sysstat_value{metric_group="sysstat",stat_id="60021",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) + sum(rate(sysstat_value{metric_group="sysstat",stat_id="60022",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| Cache hit rate | block_cache | The hit rate of the block cache. | 100 * 1 / (1 + sum(rate(sysstat_value{metric_group="sysstat",stat_id="50009",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="50008",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| Cache hit rate | plan_cache | The hit rate of the plan cache. | 100 * sum(rate(hit_count{metric_group="plan_cache_stat",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(access_count{metric_group="plan_cache_stat",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Cache hit rate | row_cache | The hit rate of the row cache. | 100 * 1 / (1 + sum(rate(sysstat_value{metric_group="sysstat",stat_id="50001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="50000",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)) |
| Cache size | block_cache | The size of the block cache. | sum(cache_size{metric_group="all_virtual_kvcache_info",cache_name="user_block_cache",@LABELS}) by (@GBLABELS) / 1048576 |
| Cache size | plan_cache | The size of the plan cache. | sum(mem_used{metric_group="plan_cache_stat",@LABELS}) by (@GBLABELS) / 1048576 |
| Cache size | row_cache | The size of the row cache. | sum(cache_size{metric_group="all_virtual_kvcache_info",cache_name="user_row_cache",@LABELS}) by (@GBLABELS) / 1048576 |
| Network throughput | receive | The amount of data received per second. | avg(rate(node_network_receive_bytes_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / 1048576 |
| Network throughput | send | The amount of data sent per second. | avg(rate(node_network_transmit_bytes_total{@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / 1048576 |
| Request waiting queue | queue_count | The average number of SQL statements queued up per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="20001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Request waiting queue time-consuming | queue_time | The average waiting time of an SQL statement in a queue. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="20002",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) / sum(rate(sysstat_value{metric_group="sysstat",stat_id="20001",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Lock wait | fail | The number of write lock wait failures per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="60022",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
| Lock wait | success | The number of write lock wait successes per second. | sum(rate(sysstat_value{metric_group="sysstat",stat_id="60021",@LABELS}[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS) |
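
The calculation expressions in the table contain three placeholders: `@LABELS`, `@INTERVAL`, and `@GBLABELS`. This section does not define them. The sketch below assumes that `@LABELS` stands for a list of label matchers, `@INTERVAL` for the time window of `rate`, and `@GBLABELS` for the labels to group by. The function name `expand_expression` and the label names `cluster` and `host` are hypothetical and are used only for illustration. This is not how OCP itself performs the substitution.

```python
from typing import Mapping, Sequence


def expand_expression(
    template: str,
    labels: Mapping[str, str],
    interval: str,
    group_by: Sequence[str],
) -> str:
    """Fill in the @LABELS, @INTERVAL, and @GBLABELS placeholders.

    Assumption: the placeholders are substituted as plain text.
    """
    # Build label matchers such as cluster="demo",host="h1".
    label_matchers = ",".join(f'{name}="{value}"' for name, value in labels.items())
    # Build the group-by list such as host,cluster.
    group_labels = ",".join(group_by)
    return (
        template.replace("@GBLABELS", group_labels)
        .replace("@LABELS", label_matchers)
        .replace("@INTERVAL", interval)
    )


# Expression of the load1 metric, copied from the table.
LOAD1 = "avg(node_load1{@LABELS}) by (@GBLABELS)"

# Expression of the QPS select metric, copied from the table.
QPS_SELECT = (
    'sum(rate(sysstat_value{metric_group="sysstat",stat_id="40000",@LABELS}'
    "[@INTERVAL]) by (@GBLABELS)) by (@GBLABELS)"
)

# Hypothetical label names and values, chosen only for this example.
print(expand_expression(LOAD1, {"cluster": "demo"}, "1m", ["host"]))
print(expand_expression(QPS_SELECT, {"cluster": "demo"}, "1m", ["host"]))
```

With these hypothetical inputs, the `load1` expression expands to `avg(node_load1{cluster="demo"}) by (host)`. Expressions without a `rate` window, such as `load1`, ignore the `interval` argument.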