Metrics provide a general diagnostic for databases. They represent specific values collected during event tracking, such as the number of SQL statement executions, execution time, and I/O throughput. OceanBase Database supports more than 600 metrics, which are categorized into 14 classes to provide all the necessary information for key execution details such as network information, request queues, transactions, SQL statements, cache, storage, resource quotas, and logs. Some metrics, including QPS, TPS, QPS_RT, TPS_RT, and Active_sessions, provide real-time information and are collected every second in OceanBase Database. Metrics for other information are collected by minute.
The data source of the metrics is the GV$SYSSTAT view, which contains the following key fields:
CON_ID: the ID of the tenant.SVR_IP: the IP address of the node.CLASS: the class of the metric.STAT_ID: the ID of the metric.NAME: the name of the metric.VALUE: the value of the metric.VALUE_TYPE: the value type of the metric.
The CLASS field indicates the class of a metric. The following table describes the 14 classes supported by OceanBase Database V4.0:
| Category code | Category name |
|---|---|
| 1 | Network |
| 2 | Request queue |
| 4 | Transaction |
| 8 | SQL |
| 16 | Cache |
| 32 | Storage |
| 64 | Resource |
| 128 | Log |
| 256 | CLOG |
| 512 | Election |
| 1024 | OBServer |
| 2048 | Root Service |
| 3072 | Analyze Report |
| 4096 | TableAPI |
The STAT_ID and NAME fields describe the ID and name of a metric, respectively. For example, the following table displays the information about all metrics in the SQL class:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=8;
+-------+---------+------------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+------------------------------------+------------+
| 8 | 40000 | sql select count | ADD_VALUE |
| 8 | 40001 | sql select time | ADD_VALUE |
| 8 | 40002 | sql insert count | ADD_VALUE |
| 8 | 40003 | sql insert time | ADD_VALUE |
| 8 | 40004 | sql replace count | ADD_VALUE |
| 8 | 40005 | sql replace time | ADD_VALUE |
| 8 | 40006 | sql update count | ADD_VALUE |
| 8 | 40007 | sql update time | ADD_VALUE |
| 8 | 40008 | sql delete count | ADD_VALUE |
| 8 | 40009 | sql delete time | ADD_VALUE |
| 8 | 40018 | sql other count | ADD_VALUE |
| 8 | 40019 | sql other time | ADD_VALUE |
| 8 | 40020 | ps prepare count | ADD_VALUE |
| 8 | 40021 | ps prepare time | ADD_VALUE |
| 8 | 40022 | ps execute count | ADD_VALUE |
| 8 | 40023 | ps close count | ADD_VALUE |
| 8 | 40024 | ps close time | ADD_VALUE |
| 8 | 40030 | opened cursors current | ADD_VALUE |
| 8 | 40031 | opened cursors cumulative | ADD_VALUE |
| 8 | 40010 | sql local count | ADD_VALUE |
| 8 | 40011 | sql remote count | ADD_VALUE |
| 8 | 40012 | sql distributed count | ADD_VALUE |
| 8 | 40013 | active sessions | ADD_VALUE |
| 8 | 40014 | single query count | ADD_VALUE |
| 8 | 40015 | multiple query count | ADD_VALUE |
| 8 | 40016 | multiple query with one stmt count | ADD_VALUE |
| 8 | 40100 | sql inner select count | ADD_VALUE |
| 8 | 40101 | sql inner select time | ADD_VALUE |
| 8 | 40102 | sql inner insert count | ADD_VALUE |
| 8 | 40103 | sql inner insert time | ADD_VALUE |
| 8 | 40104 | sql inner replace count | ADD_VALUE |
| 8 | 40105 | sql inner replace time | ADD_VALUE |
| 8 | 40106 | sql inner update count | ADD_VALUE |
| 8 | 40107 | sql inner update time | ADD_VALUE |
| 8 | 40108 | sql inner delete count | ADD_VALUE |
| 8 | 40109 | sql inner delete time | ADD_VALUE |
| 8 | 40110 | sql inner other count | ADD_VALUE |
| 8 | 40111 | sql inner other time | ADD_VALUE |
| 8 | 40112 | user logons cumulative | ADD_VALUE |
| 8 | 40113 | user logouts cumulative | ADD_VALUE |
| 8 | 40114 | user logons failed cumulative | ADD_VALUE |
| 8 | 40115 | user logons time cumulative | ADD_VALUE |
+-------+---------+------------------------------------+------------+
42 rows in set (0.07 sec)
Some metrics, such as QPS, TPS, QPS_RT, and TPS_RT, provide real-time information and are collected by second.
The CON_ID and SVR_IP fields describe the monitored objects of a metric, so that metrics can be flexibly aggregated in different scenarios. For example, you can aggregate the QPS and TPS metrics based on the CON_ID field to observe the changes in the number of requests of a tenant, or the SVR_IP field to observe the changes in the number of requests of a node. When you perform load balancing on a cluster, you can aggregate the QPS and TPS metrics for resource units based on both the CON_ID and SVR_IP fields.
The VALUE_TYPE field describes the value type of the metric. Valid values: ADD_VALUE, which indicates the cumulative type, and SET_VALUE, which indicates the status type. The values of cumulative metrics, such as SQL COUNT, SQL RT, and IO READ BYTES, keep increasing during event tracking in OceanBase Database. External monitoring systems regularly pull the cumulative metric values and calculate the differences to show the data change trend of these metrics. The latest values of status metrics, such as CACHE SIZE and CPU USAGE, are maintained during event tracking in OceanBase Database. External monitoring systems regularly pull and directly display the status metric values. In general, status metrics support aggregation only at the tenant level, while most cumulative metrics support aggregation both at the session and tenant levels.
Metrics of OceanBase Database are categorized into 14 classes based on the components that they describe. The following sections describe the most frequently used metrics.
Network metrics
The following sample code shows how to query the network metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=1 and stat_id in (10000,10001,10002,10003,10004,10005,10006);
+-------+---------+----------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+----------------------+------------+
| 1 | 10000 | rpc packet in | ADD_VALUE |
| 1 | 10001 | rpc packet in bytes | ADD_VALUE |
| 1 | 10002 | rpc packet out | ADD_VALUE |
| 1 | 10003 | rpc packet out bytes | ADD_VALUE |
| 1 | 10004 | rpc deliver fail | ADD_VALUE |
| 1 | 10005 | rpc net delay | ADD_VALUE |
| 1 | 10006 | rpc net frame delay | ADD_VALUE |
+-------+---------+----------------------+------------+
7 rows in set (0.06 sec)
Request queue metrics
The following sample code shows how to query the request queue metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=2 and stat_id in (20000,20001,20002);
+-------+---------+-----------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-----------------------+------------+
| 2 | 20000 | request enqueue count | ADD_VALUE |
| 2 | 20001 | request dequeue count | ADD_VALUE |
| 2 | 20002 | request queue time | ADD_VALUE |
+-------+---------+-----------------------+------------+
3 rows in set (0.07 sec)
Transaction metrics
The following sample code shows how to query the transaction metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=4 and stat_id in (30000,30001,30002,30005,30006,30007,30008,30009,30010,30011,30012,30013);
+-------+---------+-------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-------------------------------+------------+
| 4 | 30000 | trans commit log sync time | ADD_VALUE |
| 4 | 30001 | trans commit log sync count | ADD_VALUE |
| 4 | 30002 | trans commit log submit count | ADD_VALUE |
| 4 | 30005 | trans start count | ADD_VALUE |
| 4 | 30006 | trans total used time | ADD_VALUE |
| 4 | 30007 | trans commit count | ADD_VALUE |
| 4 | 30008 | trans commit time | ADD_VALUE |
| 4 | 30009 | trans rollback count | ADD_VALUE |
| 4 | 30010 | trans rollback time | ADD_VALUE |
| 4 | 30011 | trans timeout count | ADD_VALUE |
| 4 | 30012 | trans local trans count | ADD_VALUE |
| 4 | 30013 | trans distribute trans count | ADD_VALUE |
+-------+---------+-------------------------------+------------+
12 rows in set (0.07 sec)
SQL statement metrics
The following sample code shows how to query the SQL statement metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=8 and stat_id in (40000, 40001, 40002, 40003, 40004, 40005, 40006, 40007, 40008, 40009, 40010, 40011, 40012, 40018, 40019);
+-------+---------+-----------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-----------------------+------------+
| 8 | 40000 | sql select count | ADD_VALUE |
| 8 | 40001 | sql select time | ADD_VALUE |
| 8 | 40002 | sql insert count | ADD_VALUE |
| 8 | 40003 | sql insert time | ADD_VALUE |
| 8 | 40004 | sql replace count | ADD_VALUE |
| 8 | 40005 | sql replace time | ADD_VALUE |
| 8 | 40006 | sql update count | ADD_VALUE |
| 8 | 40007 | sql update time | ADD_VALUE |
| 8 | 40008 | sql delete count | ADD_VALUE |
| 8 | 40009 | sql delete time | ADD_VALUE |
| 8 | 40018 | sql other count | ADD_VALUE |
| 8 | 40019 | sql other time | ADD_VALUE |
| 8 | 40010 | sql local count | ADD_VALUE |
| 8 | 40011 | sql remote count | ADD_VALUE |
| 8 | 40012 | sql distributed count | ADD_VALUE |
+-------+---------+-----------------------+------------+
15 rows in set (0.08 sec)
KVCache metrics
The following sample code shows how to query the KVCache metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=16 and stat_id in (50000, 50001, 50004, 50005, 50008, 50009, 50010, 50011, 50037, 50038,120000,120001,120008,120009);
+-------+---------+-------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-------------------------+------------+
| 16 | 50000 | row cache hit | ADD_VALUE |
| 16 | 50001 | row cache miss | ADD_VALUE |
| 16 | 50004 | bloom filter cache hit | ADD_VALUE |
| 16 | 50005 | bloom filter cache miss | ADD_VALUE |
| 16 | 50008 | block cache hit | ADD_VALUE |
| 16 | 50009 | block cache miss | ADD_VALUE |
| 16 | 50010 | location cache hit | ADD_VALUE |
| 16 | 50011 | location cache miss | ADD_VALUE |
| 16 | 50037 | tablet ls cache hit | ADD_VALUE |
| 16 | 50038 | tablet ls cache miss | ADD_VALUE |
| 16 | 120000 | location cache size | SET_VALUE |
| 16 | 120001 | tablet ls cache size | SET_VALUE |
| 16 | 120008 | user row cache size | SET_VALUE |
| 16 | 120009 | bloom filter cache size | SET_VALUE |
+-------+---------+-------------------------+------------+
14 rows in set (0.07 sec)
The GV$SYSSTAT view describes the hit rate of KVCache. To learn about the actual memory usage of each module of KVCache, query the GV$OB_KVCACHE view. For example:
obclient> select DBA_OB_TENANTS.tenant_id,tenant_name, svr_ip,svr_port, cache_name, cache_size
from GV$OB_KVCACHE, DBA_OB_TENANTS where DBA_OB_TENANTS.tenant_type!='META' and DBA_OB_TENANTS.tenant_id = GV$OB_KVCACHE.tenant_id limit 10;
+-----------+----------------+------------+----------+-----------------------+------------+
| tenant_id | tenant_name | svr_ip | svr_port | cache_name | cache_size |
+-----------+----------------+------------+----------+-----------------------+------------+
| 1 | sys |xx.xx.xx.xx | 2882 | schema_cache | 14564352 |
| 1 | sys |xx.xx.xx.xx | 2882 | tablet_table_cache | 6241280 |
| 1 | sys |xx.xx.xx.xx | 2882 | vtable_cache | 35372032 |
| 1 | sys |xx.xx.xx.xx | 2882 | opt_column_stat_cache | 6241280 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | index_block_cache | 6241280 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | user_block_cache | 10402816 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | user_row_cache | 6241280 |
| 1010 | oracle_tenant1 |xx.xx.xx.xx | 2882 | index_block_cache | 6241280 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | bf_cache | 4160512 |
| 1010 | oracle_tenant1 |xx.xx.xx.xx | 2882 | user_block_cache | 8322048 |
+-----------+----------------+------------+----------+-----------------------+------------+
10 rows in set (0.06 sec)
Storage metrics
The following sample code shows how to query the storage metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=32 and stat_id in (60000, 60001, 60002, 60003, 60004, 60005, 60019, 60020, 60021, 60022, 60023, 60024,130000,130001,130002,130004);
+-------+---------+--------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+--------------------------------+------------+
| 32 | 60000 | io read count | ADD_VALUE |
| 32 | 60001 | io read delay | ADD_VALUE |
| 32 | 60002 | io read bytes | ADD_VALUE |
| 32 | 60003 | io write count | ADD_VALUE |
| 32 | 60004 | io write delay | ADD_VALUE |
| 32 | 60005 | io write bytes | ADD_VALUE |
| 32 | 60019 | memstore read lock succ count | ADD_VALUE |
| 32 | 60020 | memstore read lock fail count | ADD_VALUE |
| 32 | 60021 | memstore write lock succ count | ADD_VALUE |
| 32 | 60022 | memstore write lock fail count | ADD_VALUE |
| 32 | 60023 | memstore wait write lock time | ADD_VALUE |
| 32 | 60024 | memstore wait read lock time | ADD_VALUE |
| 32 | 130000 | active memstore used | SET_VALUE |
| 32 | 130001 | total memstore used | SET_VALUE |
| 32 | 130002 | major freeze trigger | SET_VALUE |
| 32 | 130004 | memstore limit | SET_VALUE |
+-------+---------+--------------------------------+------------+
16 rows in set (0.06 sec)
Resource metrics
The following sample code shows how to query the resource metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=64 and stat_id in (140002,140003,140005,140006);
+-------+---------+-----------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-----------------+------------+
| 64 | 140002 | max memory size | SET_VALUE |
| 64 | 140003 | memory usage | SET_VALUE |
| 64 | 140005 | max cpus | SET_VALUE |
| 64 | 140006 | cpu usage | SET_VALUE |
+-------+---------+-----------------+------------+
4 rows in set (0.06 sec)
The metrics described in the preceding sample code can be used to calculate the CPU utilization and memory usage of the tenant based on the following formulas:
Tenant CPU utilization = cpu usage (stat_id = 140006) / max cpus (stat_id = 140005)
Tenant memory usage = memory usage (stat_id = 140003) / max memory size (stat_id = 140002)
Clog metrics
The following sample code shows how to query the clog metrics:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=256 and stat_id in (80040,80041,80057);
+-------+---------+---------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+---------------------------+------------+
| 256 | 80040 | clog write count | ADD_VALUE |
| 256 | 80041 | clog write time | ADD_VALUE |
| 256 | 80057 | clog trans log total size | ADD_VALUE |
+-------+---------+---------------------------+------------+
3 rows in set (0.07 sec)
Other metrics
In addition to the GV$SYSSTAT view, OceanBase Database provides many other views to describe the metrics from various aspects.
For example, the GV$OB_PROCESSLIST view describes the status of the current sessions of the system and the IDs of the executed SQL statements. This view also provides important information in fields such as SID, SQL_ID, TRANS_ID, and TRACE_ID. It is a useful tool for end-to-end tracing.
The following example shows how to query the
GV$OB_PROCESSLISTview for information about active sessions:obclient> select TENANT,SVR_IP,SVR_PORT,USER,HOST,COMMAND,STATE,INFO,SQL_ID,TRACE_ID from GV$OB_PROCESSLIST where STATE='ACTIVE' or null limit 10; +--------+------------+----------+-----------+-------------------+---------+--------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------------------------------+ | TENANT | SVR_IP | SVR_PORT | USER | HOST | COMMAND | STATE | INFO | SQL_ID | TRACE_ID | +--------+------------+----------+-----------+-------------------+---------+--------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------------------------------+ | sys |xx.xx.xx.xx | 2882 | DBA_query | xx.xx.xx.xx:51588 | Query | ACTIVE | select TENANT,SVR_IP,SVR_PORT,USER,HOST,COMMAND,STATE,INFO,SQL_ID,TRACE_ID from GV$OB_PROCESSLIST where STATE='ACTIVE' or null limit 10 | D18F6AE855AD0D7478162DD8EF48C781 | YB4206008451-0005F55893CEA363-0-0 | +--------+------------+----------+-----------+-------------------+---------+--------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------------------------------+ 1 row in set (0.04 sec)The following example shows how to query the
GV$OB_PROCESSLISTview for the number of active sessions on each OBServer node:obclient> select tenant_id, tenant_name, svr_ip, case when cnt is null then 0 else cnt end as cnt from ( select DBA_OB_TENANTS.tenant_name, DBA_OB_TENANTS.tenant_id, svr_ip, cnt from DBA_OB_TENANTS left join ( select count(`state`='ACTIVE' OR NULL) as cnt, tenant as tenant_name, svr_ip from GV$OB_PROCESSLIST group by tenant,svr_ip ) t1 on DBA_OB_TENANTS.tenant_name = t1.tenant_name where DBA_OB_TENANTS.tenant_type!='META' ) t2; +-----------+-----------------+-------------+------+ | tenant_id | tenant_name | svr_ip | cnt | +-----------+-----------------+-------------+------+ | 1 | sys | xx.xx.xx.xx | 0 | | 1 | sys | xx.xx.xx.xx | 0 | | 1 | sys | xx.xx.xx.xx | 0 | | 1002 | midas0_3790 | xx.xx.xx.xx | 0 | | 1002 | midas0_3790 | xx.xx.xx.xx | 0 | | 1002 | midas0_3790 | xx.xx.xx.xx | 0 | +-----------+-----------------+-------------+------+ 6 rows in set (0.05 sec)