SYSSTAT metrics are common diagnostic monitoring information that statistics key performance values during system operation, such as the number of SQL executions, execution time, and I/O count.
Metric source
The data of metrics is stored in the GV$SYSSTAT view. The main fields are described as follows:
| Field | Description |
|---|---|
| CON_ID | Tenant ID |
| SVR_IP | Node IP |
| STAT_ID | Metric ID |
| NAME | Metric name |
| CLASS | Metric class |
| VALUE | Result value of the metric |
| VALUE_TYPE | Value type of the metric:
|
Metric query
Using the [G]V$SYSSTAT view as an example, query all metrics in the SQL class.
obclient> SELECT CLASS,STAT_ID,NAME,VALUE_TYPE FROM GV$SYSSTAT WHERE CLASS=8;
The result is as follows:
+-------+---------+------------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+------------------------------------+------------+
| 8 | 40000 | sql select count | ADD_VALUE |
| 8 | 40001 | sql select time | ADD_VALUE |
| 8 | 40002 | sql insert count | ADD_VALUE |
| 8 | 40003 | sql insert time | ADD_VALUE |
| 8 | 40004 | sql replace count | ADD_VALUE |
| 8 | 40005 | sql replace time | ADD_VALUE |
| 8 | 40006 | sql update count | ADD_VALUE |
| 8 | 40007 | sql update time | ADD_VALUE |
| 8 | 40008 | sql delete count | ADD_VALUE |
| 8 | 40009 | sql delete time | ADD_VALUE |
| 8 | 40018 | sql other count | ADD_VALUE |
| 8 | 40019 | sql other time | ADD_VALUE |
| 8 | 40020 | ps prepare count | ADD_VALUE |
| 8 | 40021 | ps prepare time | ADD_VALUE |
| 8 | 40022 | ps execute count | ADD_VALUE |
| 8 | 40023 | ps close count | ADD_VALUE |
| 8 | 40024 | ps close time | ADD_VALUE |
| 8 | 40025 | sql commit count | ADD_VALUE |
| 8 | 40026 | sql commit time | ADD_VALUE |
| 8 | 40027 | sql rollback count | ADD_VALUE |
| 8 | 40028 | sql rollback time | ADD_VALUE |
| 8 | 40030 | opened cursors current | ADD_VALUE |
| 8 | 40031 | opened cursors cumulative | ADD_VALUE |
| 8 | 40010 | sql local count | ADD_VALUE |
| 8 | 40011 | sql remote count | ADD_VALUE |
| 8 | 40012 | sql distributed count | ADD_VALUE |
| 8 | 40013 | active sessions | ADD_VALUE |
| 8 | 40014 | single query count | ADD_VALUE |
| 8 | 40015 | multiple query count | ADD_VALUE |
| 8 | 40016 | multiple query with one stmt count | ADD_VALUE |
| 8 | 40100 | sql inner select count | ADD_VALUE |
| 8 | 40101 | sql inner select time | ADD_VALUE |
| 8 | 40102 | sql inner insert count | ADD_VALUE |
| 8 | 40103 | sql inner insert time | ADD_VALUE |
| 8 | 40104 | sql inner replace count | ADD_VALUE |
| 8 | 40105 | sql inner replace time | ADD_VALUE |
| 8 | 40106 | sql inner update count | ADD_VALUE |
| 8 | 40107 | sql inner update time | ADD_VALUE |
| 8 | 40108 | sql inner delete count | ADD_VALUE |
| 8 | 40109 | sql inner delete time | ADD_VALUE |
| 8 | 40110 | sql inner other count | ADD_VALUE |
| 8 | 40111 | sql inner other time | ADD_VALUE |
| 8 | 40112 | user logons cumulative | ADD_VALUE |
| 8 | 40113 | user logouts cumulative | ADD_VALUE |
| 8 | 40114 | user logons failed cumulative | ADD_VALUE |
| 8 | 40115 | user logons time cumulative | ADD_VALUE |
| 8 | 40116 | sql local execute time | ADD_VALUE |
| 8 | 40117 | sql remote execute time | ADD_VALUE |
| 8 | 40118 | sql distributed execute time | ADD_VALUE |
| 8 | 40119 | sql fail count | ADD_VALUE |
| 8 | 40120 | inner sql local count | ADD_VALUE |
| 8 | 40121 | inner sql remote count | ADD_VALUE |
| 8 | 40122 | inner sql distributed count | ADD_VALUE |
+-------+---------+------------------------------------+------------+
42 rows in set (0.07 sec)
Metric categories
OceanBase Database V4.x supports 15 major categories of monitoring items, detailed as follows:
| Metric class ID | Metric class | Description |
|---|---|---|
| 1 | NETWORK | Network |
| 2 | QUEUE | Request queue |
| 4 | TRANS | Transaction |
| 8 | SQL | SQL |
| 16 | CACHE | Cache |
| 32 | STORAGE | Storage |
| 64 | RESOURCE | Resource |
| 128 | DEBUG | Log |
| 256 | CLOG | CLOG |
| 512 | ELECT | Election |
| 1024 | OBSERVER | System |
| 2048 | RS | Root Service |
| 3072 | Time Model | Time model |
| 4096 | TABLEAPI | TableAPI |
| 8192 | WR | WR |
The following sections describe common categories and their metrics. For the full list of metrics, see Monitoring overview.
Network
You can query network-related monitoring metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=1 and stat_id in (10000,10001,10002,10003,10004,10005,10006);
The result is as follows:
+-------+---------+----------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+----------------------+------------+
| 1 | 10000 | rpc packet in | ADD_VALUE |
| 1 | 10001 | rpc packet in bytes | ADD_VALUE |
| 1 | 10002 | rpc packet out | ADD_VALUE |
| 1 | 10003 | rpc packet out bytes | ADD_VALUE |
| 1 | 10004 | rpc deliver fail | ADD_VALUE |
| 1 | 10005 | rpc net delay | ADD_VALUE |
| 1 | 10006 | rpc net frame delay | ADD_VALUE |
+-------+---------+----------------------+------------+
7 rows in set (0.06 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| rpc packet in | The cumulative number of RPC packets received by the tenant. The count increases by 1 when the tenant receives an RPC request or receives a response after sending an RPC request. |
| rpc packet in bytes | The cumulative bytes of RPC packets received by the tenant, in Bytes. When the tenant receives an RPC request or receives a response after sending an RPC request, the size of the RPC request or response is added. |
| rpc packet out | The cumulative number of RPC packets sent by the tenant. The count increases by 1 when the tenant sends an RPC request or sends a response after receiving an RPC request. |
| rpc packet out bytes | The cumulative bytes of RPC packets sent by the tenant, in Bytes. When the tenant sends an RPC request or sends a response after receiving an RPC request, the size of the RPC request or response is added. |
| rpc deliver fail | The total number of failed RPC requests forwarded to the tenant queue. When the RPC IO thread fails to forward an RPC request to the tenant queue, the count increases by 1. |
| rpc net delay | The cumulative network transmission delay of RPC requests received by the tenant, in microseconds. After the RPC IO thread parses an RPC request, it obtains the current timestamp and subtracts the send timestamp from the packet header to get the difference, which is added to RPC_NET_DELAY. |
| rpc net frame delay | The cumulative network framework processing time of RPC requests received by the tenant, in microseconds. When the RPC IO thread forwards an RPC request to the tenant queue, it obtains the current timestamp and subtracts the timestamp recorded when the RPC packet was parsed to get the difference, which is added to RPC_NET_FRAME_DELAY. |
Request queue
You can query request queue-related monitoring metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=2 and stat_id in (20000,20001,20002);
The result is as follows:
+-------+---------+-----------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-----------------------+------------+
| 2 | 20000 | request enqueue count | ADD_VALUE |
| 2 | 20001 | request dequeue count | ADD_VALUE |
| 2 | 20002 | request queue time | ADD_VALUE |
+-------+---------+-----------------------+------------+
3 rows in set (0.07 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| request enqueue count | The total number of requests that the tenant has put into the queue. The count increases by 1 each time a request is put into the worker thread queue. |
| request dequeue count | The total number of requests that the tenant has taken out of the queue. The count increases by 1 each time a request is taken out of the worker thread queue. |
| request queue time | The total time that all tenant requests waited in the queue, in microseconds. When the tenant worker thread takes a request, it subtracts the time when the request entered the queue from the current time to get the queue wait time, which is added to the REQUEST_QUEUE_TIME metric. |
Transaction
You can query transaction-related monitoring metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=4 and stat_id in (30000,30001,30002,30005,30006,30007,30008,30009,30010,30011,30012,30013,30200,30201,30202,30203,30204,30205,30206,30207,30208,30209,30210,30211,30212,30213,30214,30215,30216,30217,30218,30219);
The result is as follows:
+-------+---------+----------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+----------------------------------+------------+
| 4 | 30000 | trans commit log sync time | ADD_VALUE |
| 4 | 30001 | trans commit log sync count | ADD_VALUE |
| 4 | 30002 | trans commit log submit count | ADD_VALUE |
| 4 | 30005 | trans start count | ADD_VALUE |
| 4 | 30006 | trans total used time | ADD_VALUE |
| 4 | 30007 | trans commit count | ADD_VALUE |
| 4 | 30008 | trans commit time | ADD_VALUE |
| 4 | 30009 | trans rollback count | ADD_VALUE |
| 4 | 30010 | trans rollback time | ADD_VALUE |
| 4 | 30011 | trans timeout count | ADD_VALUE |
| 4 | 30012 | trans local trans count | ADD_VALUE |
| 4 | 30013 | trans distribute trans count | ADD_VALUE |
| 4 | 30200 | xa start total count | ADD_VALUE |
| 4 | 30201 | xa start total used time | ADD_VALUE |
| 4 | 30202 | xa start with rpc total count | ADD_VALUE |
| 4 | 30203 | failed xa start total count | ADD_VALUE |
| 4 | 30204 | xa end total count | ADD_VALUE |
| 4 | 30205 | xa end total used count | ADD_VALUE |
| 4 | 30206 | xa end with rpc total count | ADD_VALUE |
| 4 | 30207 | failed xa end total count | ADD_VALUE |
| 4 | 30208 | xa prepare total count | ADD_VALUE |
| 4 | 30209 | xa prepare total used time | ADD_VALUE |
| 4 | 30210 | xa prepare with rpc total count | ADD_VALUE |
| 4 | 30211 | failed xa prepare total count | ADD_VALUE |
| 4 | 30212 | xa commit total count | ADD_VALUE |
| 4 | 30213 | xa commit total used time | ADD_VALUE |
| 4 | 30214 | xa commit with rpc total count | ADD_VALUE |
| 4 | 30215 | failed xa commit total count | ADD_VALUE |
| 4 | 30216 | xa rollback total count | ADD_VALUE |
| 4 | 30217 | xa rollback total used time | ADD_VALUE |
| 4 | 30218 | xa rollback with rpc total count | ADD_VALUE |
| 4 | 30219 | failed xa rollback total count | ADD_VALUE |
+-------+---------+----------------------------------+------------+
32 rows in set (0.013 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| trans commit log sync time | The cumulative time for the tenant on this machine from successful clog commit to majority success, in microseconds. When the clog majority succeeds and the callback is made to the transaction layer, the current time minus the time when the clog commit succeeded is used to calculate the majority time, which is added to this statistic. |
| trans commit log sync count | The number of clogs that the tenant on this machine has successfully committed and completed majority. Each time the clog majority succeeds and the callback is made to the transaction layer, the count increases when the transaction layer receives the callback. |
| trans commit log submit count | The number of clogs that the tenant on this machine has submitted. The count increases by 1 each time a clog is successfully submitted (majority success not guaranteed). |
| trans start count | The number of transactions started by the tenant on this machine. Each time a transaction is successfully started on the OBServer node where the session is created, the count increases. |
| trans total used time | The cumulative transaction time of the tenant on this machine, in microseconds. Each time a transaction ends on the OBServer node where the session is created, the transaction time is added to this statistic. |
| trans commit count | The number of successful commits by the tenant on this machine. Each time a transaction is successfully committed on the OBServer node where the session is connected, the count increases by 1. |
| trans commit time | The cumulative transaction commit time of the tenant on this machine, in microseconds. Each time a transaction is successfully committed on the OBServer node where the session is created, the commit time is added to this statistic. |
| trans rollback count | The number of successful rollbacks by the tenant on this machine. Each time a transaction is successfully rolled back on the OBServer node where the session is created, the count increases. |
| trans rollback time | The transaction rollback time of the tenant on this machine, in microseconds. Each time a rollback succeeds on the OBServer node where the session is created, the rollback time is added to this statistic. |
| trans timeout count | The number of timed-out transactions of the tenant on this machine. Each time a transaction times out on the OBServer node where the session is created, the count increases. |
| trans local trans count | The number of single-node transactions ended by the tenant on this machine (including both commit and rollback). Each time a transaction ends on the OBServer node where the session is created, the participant count is checked; when the participant count equals 1, this count increases. |
| trans distribute trans count | The number of distributed transactions ended by the tenant on this machine (including both commit and rollback). Each time a transaction ends on the OBServer node where the session is created, the participant count is checked; when the participant count is greater than 1, this count increases. |
| xa start total count | For XA transactions, the total number of XA_START statements, including successful and failed. The count increases by 1 after the transaction side completes XA_START logic. |
| xa start total used time | For XA transactions, the total execution time of XA_START statements, including successful and failed, in microseconds. After the transaction side completes XA_START logic, the local processing time is added. |
| xa start with rpc total count | For XA transactions, the total number of XA_START statements executed remotely, including only successful ones. After the transaction side completes XA_START logic, if the XA_START was remotely executed and succeeded, the count increases by 1. |
| failed xa start total count | For XA transactions, the total number of failed XA_START statements. After the transaction side completes XA_START logic, if execution failed, the count increases by 1. |
| xa end total count | For XA transactions, the total number of XA_END statements, including successful and failed. The count increases by 1 after the transaction side completes XA_END logic. |
| xa end total used count | For XA transactions, the total execution time of XA_END statements, in microseconds. After the transaction side completes XA_END logic, the local processing time is added. |
| xa end with rpc total count | For XA transactions, the total number of remotely executed XA_END statements, including only successful ones. After the transaction side completes XA_END logic, if it was remotely executed and succeeded, the count increases by 1. |
| failed xa end total count | For XA transactions, the total number of failed XA_END statements. When the transaction side processes XA_END, if execution reports an error, the count increases by 1. |
| xa prepare total count | For XA transactions, the total number of XA_PREPARE statements, including successful and failed. The count increases by 1 after the transaction side completes XA_PREPARE logic. |
| xa prepare total used time | For XA transactions, the total execution time of XA_PREPARE statements, including successful and failed, in microseconds. After the transaction side completes XA_PREPARE logic, the local processing time is added. |
| xa prepare with rpc total count | For XA transactions, the total number of remotely executed XA_PREPARE statements, including only successful ones. After the transaction side completes XA_PREPARE logic, if the XA_PREPARE was remotely executed and succeeded, the count increases by 1. |
| failed xa prepare total count | For XA transactions, the total number of failed XA_PREPARE statements. After the transaction side completes XA_PREPARE logic, if execution failed, the count increases by 1. |
| xa commit total count | For XA transactions, the total number of XA_COMMIT statements, including successful and failed. The count increases by 1 after the transaction side completes XA_COMMIT logic. |
| xa commit total used time | For XA transactions, the total execution time of XA_COMMIT statements, in microseconds. After the transaction side completes XA_COMMIT logic, the local processing time is added. |
| xa commit with rpc total count | For XA transactions, the total number of remotely executed XA_COMMIT statements, including only successful ones. After the transaction side completes XA_COMMIT logic, if it was remotely executed and succeeded, the count increases by 1. |
| failed xa commit total count | For XA transactions, the total number of failed XA_COMMIT statements. When the transaction side processes XA_COMMIT, if execution reports an error, the count increases by 1. |
| xa rollback total count | For XA transactions, the total number of XA_ROLLBACK statements, including successful and failed. The count increases by 1 after the transaction side completes XA_ROLLBACK logic. |
| xa rollback total used time | For XA transactions, the total execution time of XA_ROLLBACK statements, including successful and failed, in microseconds. After the transaction side completes XA_ROLLBACK logic, the local processing time is added. |
| xa rollback with rpc total count | For XA transactions, the total number of remotely executed XA_ROLLBACK statements, including only successful ones. After the transaction side completes XA_ROLLBACK logic, if the XA_ROLLBACK was remotely executed and succeeded, the count increases by 1. |
| failed xa rollback total count | For XA transactions, the total number of failed XA_ROLLBACK statements. After the transaction side completes XA_ROLLBACK logic, if execution failed, the count increases by 1. |
SQL
You can query SQL-related metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=8 and stat_id in (40000, 40001, 40002, 40003, 40004, 40005, 40006, 40007, 40008, 40009, 40010, 40011, 40012, 40018, 40019);
The result is as follows:
+-------+---------+-----------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-----------------------+------------+
| 8 | 40000 | sql select count | ADD_VALUE |
| 8 | 40001 | sql select time | ADD_VALUE |
| 8 | 40002 | sql insert count | ADD_VALUE |
| 8 | 40003 | sql insert time | ADD_VALUE |
| 8 | 40004 | sql replace count | ADD_VALUE |
| 8 | 40005 | sql replace time | ADD_VALUE |
| 8 | 40006 | sql update count | ADD_VALUE |
| 8 | 40007 | sql update time | ADD_VALUE |
| 8 | 40008 | sql delete count | ADD_VALUE |
| 8 | 40009 | sql delete time | ADD_VALUE |
| 8 | 40018 | sql other count | ADD_VALUE |
| 8 | 40019 | sql other time | ADD_VALUE |
| 8 | 40010 | sql local count | ADD_VALUE |
| 8 | 40011 | sql remote count | ADD_VALUE |
| 8 | 40012 | sql distributed count | ADD_VALUE |
+-------+---------+-----------------------+------------+
15 rows in set (0.08 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| sql select count | Number of SQL SELECT executions. |
| sql select time | Total SQL SELECT execution time, in microseconds. |
| sql insert count | Total number of INSERT statements issued by the user. |
| sql insert time | Total time spent on INSERT statements. |
| sql replace count | Total number of REPLACE statements issued by the user. |
| sql replace time | Total time spent on REPLACE statements. |
| sql update count | Total number of UPDATE statements issued by the user. |
| sql update time | Total time spent on UPDATE statements, in microseconds. |
| sql delete count | Total number of DELETE statements issued by the user. |
| sql delete time | Total time spent on DELETE statements, in microseconds. |
| sql other count | Number of SQL executions other than SELECT/INSERT/REPLACE/UPDATE/DELETE/COMMIT/ROLLBACK. |
| sql other time | Total time spent on SQL other than SELECT/INSERT/REPLACE/UPDATE/DELETE/COMMIT/ROLLBACK, in microseconds. |
| sql local count | Number of SQL LOCAL executions. |
| sql remote count | Number of SQL REMOTE executions. |
| sql distributed count | Number of distributed SQL executions. |
Cache
You can query KVCache-related metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=16 and stat_id in (50000, 50001, 50004, 50005, 50008, 50009, 50010, 50011, 50037, 50038,120001,120008,120009);
The result is as follows:
+-------+---------+-------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-------------------------+------------+
| 16 | 50000 | row cache hit | ADD_VALUE |
| 16 | 50001 | row cache miss | ADD_VALUE |
| 16 | 50004 | bloom filter cache hit | ADD_VALUE |
| 16 | 50005 | bloom filter cache miss | ADD_VALUE |
| 16 | 50008 | block cache hit | ADD_VALUE |
| 16 | 50009 | block cache miss | ADD_VALUE |
| 16 | 50010 | location cache hit | ADD_VALUE |
| 16 | 50011 | location cache miss | ADD_VALUE |
| 16 | 50037 | tablet ls cache hit | ADD_VALUE |
| 16 | 50038 | tablet ls cache miss | ADD_VALUE |
| 16 | 120001 | tablet ls cache size | SET_VALUE |
| 16 | 120008 | user row cache size | SET_VALUE |
| 16 | 120009 | bloom filter cache size | SET_VALUE |
+-------+---------+-------------------------+------------+
14 rows in set (0.07 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| row cache hit | Number of cache hits when the table get operator prefetches data rows. In the lookup_in_cache function, if the data row is in the cache, this value increases by 1. |
| row cache miss | Number of cache misses when the table get operator prefetches data rows. In the lookup_in_cache function, if the data row is not in the cache, this value increases by 1. |
| bloom filter cache hit | Number of macro block bloom filter cache hits, indicating that the SQL statement used a pre-built macro block bloom filter. |
| bloom filter cache miss | Number of macro block bloom filter cache misses, indicating that the corresponding macro block bloom filter was not built. |
| block cache hit | Number of block cache hits when fetching micro blocks. When fetching a micro block, if it is in the cache, this value increases by 1. |
| block cache miss | Number of block cache misses when fetching micro blocks. When fetching a micro block, if it is not in the cache, this value increases by 1. |
| location cache hit | Number of location cache hits. Includes ls_location_cache and tablet_ls_cache hits. When querying the tablet-to-logstream mapping or logstream location, if valid data is obtained from the local cache, this value increases by 1. Triggered in the get methods of ObTabletLSService and ObLSLocationService. |
| location cache miss | Number of location cache misses. When querying the tablet-to-logstream mapping or logstream location, if valid data cannot be obtained from the local cache (cache does not exist, expired, or invalid), this value increases by 1. In the get methods of ObTabletLSService and ObLSLocationService, when is_cache_hit is false, this value increases by 1. |
| tablet ls cache hit | Number of cache hits when reading tablets. In the get_from_cache_ function, if the tablet is in the cache, this value increases by 1. |
| tablet ls cache miss | Number of cache misses when reading tablets. In the get_from_cache function, if the tablet is not in the cache, this value increases by 1. |
| tablet ls cache size | Tablet cache size, in Bytes. Used in the get_from_cache function. |
| user row cache size | Cache size used when the table get operator prefetches data rows, in Bytes. Used in the lookup_in_cache function. |
| bloom filter cache size | Total size of macro block bloom filter cache, in Bytes. User INSERT statements need to check for primary key conflicts. When the data to be checked is on disk, a common performance optimization is to build bloom filters for the involved data range. This series of metrics are related to bloom filter cache. |
The GV$SYSSTAT view describes the hit rate of KVCache. To learn about the actual memory usage of each module of KVCache, query the GV$OB_KVCACHE view. For example:
obclient> select DBA_OB_TENANTS.tenant_id,tenant_name, svr_ip,svr_port, cache_name, cache_size
from GV$OB_KVCACHE, DBA_OB_TENANTS where DBA_OB_TENANTS.tenant_type!='META' and DBA_OB_TENANTS.tenant_id = GV$OB_KVCACHE.tenant_id limit 10;
The result is as follows:
+-----------+----------------+------------+----------+-----------------------+------------+
| tenant_id | tenant_name | svr_ip | svr_port | cache_name | cache_size |
+-----------+----------------+------------+----------+-----------------------+------------+
| 1 | sys |xx.xx.xx.xx | 2882 | schema_cache | 14564352 |
| 1 | sys |xx.xx.xx.xx | 2882 | tablet_table_cache | 6241280 |
| 1 | sys |xx.xx.xx.xx | 2882 | vtable_cache | 35372032 |
| 1 | sys |xx.xx.xx.xx | 2882 | opt_column_stat_cache | 6241280 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | index_block_cache | 6241280 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | user_block_cache | 10402816 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | user_row_cache | 6241280 |
| 1010 | oracle_tenant1 |xx.xx.xx.xx | 2882 | index_block_cache | 6241280 |
| 1008 | mq_t1 |xx.xx.xx.xx | 2882 | bf_cache | 4160512 |
| 1010 | oracle_tenant1 |xx.xx.xx.xx | 2882 | user_block_cache | 8322048 |
+-----------+----------------+------------+----------+-----------------------+------------+
10 rows in set (0.06 sec)
Storage
You can query storage-related metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=32 and stat_id in (60000, 60001, 60002, 60003, 60004, 60005, 60019, 60020, 60021, 60022, 60023, 60024,130000,130001,130002,130004);
The result is as follows:
+-------+---------+--------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+--------------------------------+------------+
| 32 | 60000 | io read count | ADD_VALUE |
| 32 | 60001 | io read delay | ADD_VALUE |
| 32 | 60002 | io read bytes | ADD_VALUE |
| 32 | 60003 | io write count | ADD_VALUE |
| 32 | 60004 | io write delay | ADD_VALUE |
| 32 | 60005 | io write bytes | ADD_VALUE |
| 32 | 60019 | memstore read lock succ count | ADD_VALUE |
| 32 | 60020 | memstore read lock fail count | ADD_VALUE |
| 32 | 60021 | memstore write lock succ count | ADD_VALUE |
| 32 | 60022 | memstore write lock fail count | ADD_VALUE |
| 32 | 60023 | memstore wait write lock time | ADD_VALUE |
| 32 | 60024 | memstore wait read lock time | ADD_VALUE |
| 32 | 130000 | active memstore used | SET_VALUE |
| 32 | 130001 | total memstore used | SET_VALUE |
| 32 | 130002 | major freeze trigger | SET_VALUE |
| 32 | 130004 | memstore limit | SET_VALUE |
+-------+---------+--------------------------------+------------+
16 rows in set (0.06 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| io read count | Total read I/O count (IOPS). Each completed read I/O increases IO_READ_COUNT by 1. |
| io read delay | Total read I/O latency, in microseconds. IO_READ_DELAY accumulates the hardware service latency of each read I/O. The latency is measured from when the request is dispatched to disk until the disk completes processing and returns. |
| io read bytes | Total read I/O bandwidth. This is the total read I/O volume in Bytes. The average read I/O per second can be calculated from the change over a period. Each completed read I/O adds its read bandwidth to IO_READ_BYTES. |
| io write count | Total write I/O count (IOPS). Each completed write I/O increases IO_WRITE_COUNT by 1. |
| io write delay | Total write I/O latency, in microseconds. IO_WRITE_DELAY accumulates the hardware service latency of each write I/O. The latency is measured from when the request is dispatched to disk until the disk completes processing and returns. |
| io write bytes | Total write I/O bandwidth. This is the total write I/O volume in Bytes. The average can be calculated from the change over a period. Each completed write I/O adds its write bandwidth to IO_WRITE_BYTES. |
| memstore read lock succ count | Number of rows successfully read from MemTable. This value increases by 1 after each successful MemTable read. |
| memstore read lock fail count | Number of rows that failed MemTable read due to read lock conflict timeout. During MemTable read, if the data is in commit state but the commit status and version are not yet determined (e.g., pending commit log or distributed transaction), the read waits on the read lock. If this conflict causes a timeout timeout, this value increases by 1. |
| memstore write lock succ count | Number of rows successfully written to MemTable. During MemTable write, lock conflict, lost update, and primary key conflict are checked. If none of these conflicts exist, this value increases by 1.
NoteValidation success on MemTable may still fail on SSTable. |
| memstore write lock fail count | Number of rows that failed MemTable write due to conflict. During MemTable write, lock conflict, lost update, and primary key conflict are checked. If any of these conflicts exist, this value increases by 1.
NoteDoes not include validation on SSTable. |
| memstore wait write lock time | Wait time for MemTable write lock conflict wake-up, in microseconds. During MemTable write, if lock conflict occurs, the request is put into the lock conflict queue. This value adds the time from being put into the queue until wake-up. |
| memstore wait read lock time | MemTable read lock conflict wait time, in microseconds. During MemTable read, if the data is in commit state but the commit status and version are not yet determined, the read waits on the read lock. This value adds the wait time. |
| active memstore used | Active memstore size on the node, in Bytes. The node uses memstore for replay or write. When memstore usage reaches a certain level, freeze is triggered. All unfrozen memstores are called active memstore. |
| total memstore used | Total memstore memory used by the tenant on the node, in Bytes. This includes both active and frozen memstore. |
| major freeze trigger | Memstore size that triggers freeze, in Bytes. Obtained in the get_tenant_memstore_cond function by adding memstore_freeze_trigger and max_cached_memstore_size. |
| memstore limit | Maximum memstore memory that the tenant can use on the node, in Bytes. The user specifies the tenant memory limit. memstore_limit_percentage and _memstore_limit_percentage configure the maximum memstore limit. memstore_limit_percentage is cluster-level; _memstore_limit_percentage is tenant-level. _memstore_limit_percentage takes precedence over memstore_limit_percentage. When neither is set, tenants smaller than 8G use 40% and tenants larger than 8G use 50%. |
Resource
You can query resource-related metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=64 and stat_id in (140002,140003,140005,140006);
The result is as follows:
+-------+---------+-----------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+-----------------+------------+
| 64 | 140002 | max memory size | SET_VALUE |
| 64 | 140003 | memory usage | SET_VALUE |
| 64 | 140005 | max cpus | SET_VALUE |
| 64 | 140006 | cpu usage | SET_VALUE |
+-------+---------+-----------------+------------+
4 rows in set (0.06 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| max memory size | Tenant memory specification, which determines the tenant memory limit, in Bytes. The OmtNodeBalancer background thread periodically obtains the latest min_memory from the DBA_OB_UNIT_CONFIGS table and updates it to the Limit in ObTenantMemoryMgr. |
| memory usage | Memory used by the tenant, in Bytes. ObTenantMemoryMgr maintains a real-time memory value that records the actual memory usage of the tenant. |
| min cpus | Tenant CPU specification. Determines the number of resident worker threads and the thread count of some background modules. The OmtNodeBalancer background thread periodically obtains the latest max_cpu from the DBA_OB_UNIT_CONFIGS table and syncs it to the multi-tenant module. The OmtNodeBalancer background thread periodically adjusts the worker thread count based on the synced max_cpu. |
| max cpus | Tenant CPU specification. Determines the number of resident worker threads and the thread count of some background modules. When cgroup is enabled, it determines the CPU usage limit of tenant threads. The OmtNodeBalancer background thread periodically obtains the latest max_cpu from the DBA_OB_UNIT_CONFIGS table and syncs it to the multi-tenant module. The OmtNodeBalancer background thread periodically adjusts the worker thread count based on the synced max_cpu. |
| cpu usage | Thread usage rate of tenant worker threads, reflecting how busy the tenant worker threads are, in percentage. The OmtNodeBalancer background thread collects the ratio of request processing time to total time for tenant worker threads every 10 seconds, normalizes it, and multiplies by the tenant min_cpu to get the cpu_usage value. |
Based on the above metrics, you can calculate the tenant CPU usage and memory usage:
Tenant CPU usage = cpu usage (stat_id = 140013) / max cpus (stat_id = 140005)
Tenant thread usage = cpu usage (stat_id = 140006) / max cpus (stat_id = 140005)
Tenant memory usage = memory usage (stat_id = 140003) / max memory size (stat_id = 140002)
Log
You can query system log-related metrics using the following example:
obclient> select distinct CLASS,STAT_ID,NAME,VALUE_TYPE from GV$SYSSTAT where class=128 and stat_id in (160001,160002,160004,160019,160020,160021,160022,160023,160024,160025,160026,160027,160028,160029,160030,160031,160032,160033,160034,160035,160036);
The result is as follows:
+-------+---------+---------------------------------------+------------+
| CLASS | STAT_ID | NAME | VALUE_TYPE |
+-------+---------+---------------------------------------+------------+
| 128 | 160001 | oblogger log bytes | SET_VALUE |
| 128 | 160002 | election log bytes | SET_VALUE |
| 128 | 160004 | oblogger total log count | SET_VALUE |
| 128 | 160019 | async error log dropped count | SET_VALUE |
| 128 | 160020 | async warn log dropped count | SET_VALUE |
| 128 | 160021 | async info log dropped count | SET_VALUE |
| 128 | 160022 | async trace log dropped count | SET_VALUE |
| 128 | 160023 | async debug log dropped count | SET_VALUE |
| 128 | 160024 | async log flush speed | SET_VALUE |
| 128 | 160025 | async generic log write count | SET_VALUE |
| 128 | 160026 | async user request log write count | SET_VALUE |
| 128 | 160027 | async data maintain log write count | SET_VALUE |
| 128 | 160028 | async root service log write count | SET_VALUE |
| 128 | 160029 | async schema log write count | SET_VALUE |
| 128 | 160030 | async force allow log write count | SET_VALUE |
| 128 | 160031 | async generic log dropped count | SET_VALUE |
| 128 | 160032 | async user request log dropped count | SET_VALUE |
| 128 | 160033 | async data maintain log dropped count | SET_VALUE |
| 128 | 160034 | async root service log dropped count | SET_VALUE |
| 128 | 160035 | async schema log dropped count | SET_VALUE |
| 128 | 160036 | async force allow log dropped count | SET_VALUE |
+-------+---------+---------------------------------------+------------+
24 rows in set (0.052 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| oblogger log bytes | Cumulative size of printed logs, in Bytes. Refers to the output process of system logs such as observer.log. |
| election log bytes | Cumulative size of printed election logs, in Bytes. Refers to the cumulative print size of election.log. |
| oblogger total log count | Cumulative number of log prints. Refers to the output process of system logs such as observer.log. |
| async error log dropped count | Number of ERROR log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async warn log dropped count | Number of WARN log print failures. Refers to the output process of system logs such as observer.log. |
| async info log dropped count | Number of INFO log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async trace log dropped count | Number of TRACE log print failures. Refers to the output process of system logs such as observer.log. |
| async debug log dropped count | Number of DEBUG log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async log flush speed | Average number of log entries written per second. Refers to the output process of system logs such as observer.log. |
| async generic log write count | Number of successful default log prints. Refers to the output process of system logs such as observer.log. |
| async user request log write count | Deprecated, retained in code only. Refers to the output process of system logs such as observer.log. |
| async data maintain log write count | Number of successful dump and merge log prints. Refers to the output process of system logs such as observer.log. |
| async root service log write count | Number of successful RS log prints. Refers to the output process of system logs such as observer.log. |
| async schema log write count | Number of Schema log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async force allow log write count | Number of successful forced log prints. Refers to the output process of system logs such as observer.log. |
| async generic log dropped count | Number of default log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async user request log dropped count | Deprecated, retained in code only. Refers to the output process of system logs such as observer.log. |
| async data maintain log dropped count | Number of dump and merge log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async root service log dropped count | Number of RS log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async schema log dropped count | Number of Schema log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
| async force allow log dropped count | Number of forced log print failures (e.g., due to rate limiting). Refers to the output process of system logs such as observer.log. |
CLOG
You can query CLOG-related metrics using the following example:
obclient> select distinct class,stat_id,name,VALUE_TYPE from gv$sysstat where class=256 and stat_id in (80001,80002,80057);
The result is as follows:
+-------+---------+--------------------------------------+------------+
| class | stat_id | name | VALUE_TYPE |
+-------+---------+--------------------------------------+------------+
| 256 | 80001 | palf write io count to disk | ADD_VALUE |
| 256 | 80002 | palf write size to disk | ADD_VALUE |
| 256 | 80057 | clog trans log total size | ADD_VALUE |
+-------+---------+--------------------------------------+------------+
3 rows in set (0.07 sec)
The following table describes the fields.
| Name | Description |
|---|---|
| palf write io count to disk | Number of clog disk writes. When clog is successfully written to disk, this statistic increases by 1. |
| palf write size to disk | Clog write size, in Bytes. When clog is successfully written to disk, the size of the write is added. |
| clog trans log total size | Amount of data committed to PALF by transactions, in Bytes. When a transaction commits logs to PALF and the commit completes, this value increases by the log size. |
ASH
You can query ASH-related metrics using the following example:
obclient> select * from oceanbase.V$STATNAME where class = 3072;
The result is as follows:
+--------+---------+------------+----------------------------------------+----------------------------------------+-------+
| CON_ID | STAT_ID | STATISTIC# | NAME | DISPLAY_NAME | CLASS |
+--------+---------+------------+----------------------------------------+----------------------------------------+-------+
| 1002 | 200001 | 368 | DB time | DB time | 3072 |
| 1002 | 200002 | 369 | DB CPU | DB CPU | 3072 |
| 1002 | 200005 | 370 | background elapsed time | background elapsed time | 3072 |
| 1002 | 200006 | 371 | background cpu time | background cpu time | 3072 |
| 1002 | 200010 | 372 | non idle wait time | non idle wait time | 3072 |
| 1002 | 200011 | 373 | idle wait time | idle wait time | 3072 |
| 1002 | 200012 | 374 | background database time | background database time | 3072 |
| 1002 | 200013 | 375 | background database non-idle wait time | background database non-idle wait time | 3072 |
| 1002 | 200014 | 376 | background database idle wait time | background database idle wait time | 3072 |
| 1002 | 220001 | 388 | concurrency wait total time | concurrency wait total time | 3072 |
| 1002 | 220002 | 389 | user io wait total time | user io wait total time | 3072 |
| 1002 | 220003 | 390 | application wait total time | application wait total time | 3072 |
+--------+---------+------------+----------------------------------------+----------------------------------------+-------+
12 rows in set (0.184 sec)
The following table describes the fields.
Notice
Metrics that do not include background do not include the background components. For example, idle wait time only records the time related to foreground sessions and does not include the content related to background database idle wait time. The statistical values of these two metrics are mutually exclusive.
| Name | Description |
|---|---|
| DB time | Total time of foreground processes executing database operations, including the sum of CPU and non-idle wait time, in microseconds. OceanBase Database periodically updates Time Model statistics. When a thread is active, it is either in ON_CPU state or wait event state. The duration in each state is added to the corresponding Time Model statistics. |
| DB CPU | The CPU time consumed by the database in the active state. |
| background elapsed time | The total amount of time spent on background process execution. |
| background cpu time | The CPU time consumed by background processes. |
| non idle wait time | The amount of non-idle wait time, namely, the total amount of time elapsed when the session is in the wait but not idle state. |
| idle wait time | The amount of idle wait time, namely, the total amount of time elapsed when the session is in the idle wait state. |
| background database time | The total amount of time spent by background processes in executing database operations. |
| background database non-idle wait time | The amount of time spent by background processes in executing database operations in the non-idle wait state. |
| background database idle wait time | The amount of time spent by background processes in executing database operations in the idle wait state. |
| concurrency wait total time | The total amount of wait time caused by concurrency issues, such as the wait time caused by resource locking. |
| user io wait total time | The total amount of time spent by user processes in waiting for the I/O operations, such as data reads from or data writes to the disk, to complete. |
| application wait total time | The total amount of wait time caused by the application code, such as lock waits caused by row-level locking or explicit lock commands. |
Other metrics
In addition to the GV$SYSSTAT view, OceanBase Database provides many other views to describe the metrics from various aspects.
For example, the GV$OB_PROCESSLIST view describes the status of the current sessions of the system and the IDs of the executed SQL statements. This view also provides important information in fields such as SID, SQL_ID, TRANS_ID, and TRACE_ID. It is a useful tool for end-to-end tracing.
The following example shows how to query the
GV$OB_PROCESSLISTview for information about active sessions:obclient> select TENANT,SVR_IP,SVR_PORT,USER,HOST,COMMAND,STATE,INFO,SQL_ID,TRACE_ID from GV$OB_PROCESSLIST where STATE='ACTIVE' or null limit 10;The result is as follows:
+--------+------------+----------+-----------+-------------------+---------+--------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------------------------------+ | TENANT | SVR_IP | SVR_PORT | USER | HOST | COMMAND | STATE | INFO | SQL_ID | TRACE_ID | +--------+------------+----------+-----------+-------------------+---------+--------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------------------------------+ | sys |xx.xx.xx.xx | 2882 | DBA_query | xx.xx.xx.xx:51588 | Query | ACTIVE | select TENANT,SVR_IP,SVR_PORT,USER,HOST,COMMAND,STATE,INFO,SQL_ID,TRACE_ID from GV$OB_PROCESSLIST where STATE='ACTIVE' or null limit 10 | D18F6AE855AD0D7478162DD8EF48C781 | YB4206008451-0005F55893CEA363-0-0 | +--------+------------+----------+-----------+-------------------+---------+--------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+-----------------------------------+ 1 row in set (0.04 sec)The following example shows how to query the
GV$OB_PROCESSLISTview for the number of active sessions on each OBServer node:obclient> Select tenant_id, tenant_name, svr_ip, case when cnt is null then 0 else cnt end as cnt from ( select DBA_OB_TENANTS.tenant_name, DBA_OB_TENANTS.tenant_id, svr_ip, cnt from DBA_OB_TENANTS left join ( select count(`state`='ACTIVE' OR NULL) as cnt, tenant as tenant_name, svr_ip from GV$OB_PROCESSLIST group by tenant,svr_ip ) t1 on DBA_OB_TENANTS.tenant_name = t1.tenant_name where DBA_OB_TENANTS.tenant_type!='META' ) t2;The result is as follows:
+-----------+-----------------+-------------+------+ | tenant_id | tenant_name | svr_ip | cnt | +-----------+-----------------+-------------+------+ | 1 | sys | xx.xx.xx.xx | 0 | | 1 | sys | xx.xx.xx.xx | 0 | | 1 | sys | xx.xx.xx.xx | 0 | | 1002 | midas0_3790 | xx.xx.xx.xx | 0 | | 1002 | midas0_3790 | xx.xx.xx.xx | 0 | | 1002 | midas0_3790 | xx.xx.xx.xx | 0 | +-----------+-----------------+-------------+------+ 6 rows in set (0.05 sec)