Real-time monitoring presents the distribution and relationships of cluster resources from a global perspective, helping you understand system conditions, monitor business operations, and quickly locate faults. This facilitates better management and optimization of system performance, enhancing system stability and reliability.
Background information
To address the complexity of end-to-end performance monitoring and fault location in distributed database environments, OCP introduces an end-to-end real-time monitoring solution, establishing a five-level monitoring system covering OBProxy → OceanBase cluster → tenant → host → Unit. It reshapes the O&M experience through three core capabilities:
- 3D topology visualization, revealing the full architecture
A dynamic 3D visualization engine renders the logical relationships and physical deployment between OBProxy and clusters, clusters and hosts, and tenants and OBServer nodes. You can drill down through the topology hierarchy from the global architecture to individual Units and see resource distribution and dependencies at a glance. This makes the operational status of a distributed architecture, often described as a "black box," transparent.
- Multi-dimensional metric monitoring, second-level exception location
It aggregates real-time data on more than 10 core performance metrics, such as QPS, TPS, RT, connection count, and CPU and memory usage. Combined with trend analysis, it helps you identify performance inflection points. You can view the number of associated alerts directly on the topology diagram and jump to the alert center, tracing a metric anomaly to the root cause of a fault in one click and improving fault location efficiency.
- Dual-mode data presentation, meeting multi-dimensional O&M scenarios
In addition to the 3D view, it provides a structured tabular data view. By complementing the graphical view with numerical data, it meets managers' needs for global situational awareness while giving technical personnel the data they need for in-depth analysis, effectively lowering the cognitive and management barriers of distributed systems.
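The five-level hierarchy and the drill-down behavior described above can be pictured as a simple tree. The following is a minimal sketch for illustration only: the `Node` class, the `drill_down` function, and all resource names are hypothetical and are not part of OCP.

```python
from dataclasses import dataclass, field

# The five monitoring levels named in the text, from outermost to innermost.
LEVELS = ("obproxy", "cluster", "tenant", "host", "unit")


@dataclass
class Node:
    """One resource object in the topology (hypothetical model, not an OCP type)."""
    level: str
    name: str
    children: list = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        # A child must sit exactly one level below its parent.
        idx = LEVELS.index(self.level)
        if idx + 1 >= len(LEVELS) or child.level != LEVELS[idx + 1]:
            raise ValueError(f"{self.level} cannot contain {child.level}")
        self.children.append(child)
        return child


def drill_down(root: Node, path: list) -> Node:
    """Follow a list of child names from the root down to the requested node."""
    node = root
    for name in path:
        matches = [c for c in node.children if c.name == name]
        if not matches:
            raise KeyError(f"{name!r} not found under {node.name!r}")
        node = matches[0]
    return node


# Build a tiny example topology with made-up names.
proxy = Node("obproxy", "proxy-1")
cluster = proxy.add(Node("cluster", "ob-cluster"))
tenant = cluster.add(Node("tenant", "tenant_a"))
host = tenant.add(Node("host", "10.0.0.1"))
host.add(Node("unit", "unit-1001"))

unit = drill_down(proxy, ["ob-cluster", "tenant_a", "10.0.0.1", "unit-1001"])
```

Each step of `drill_down` corresponds to one click deeper into the topology, from the OBProxy level down to a single Unit.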
Prerequisites
- Only OceanBase Database V4.x and later support viewing real-time monitoring data.
- The 3D view supports displaying up to 30 hosts. When the number of hosts exceeds this limit, only the tabular view is supported.
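The host limit above amounts to a simple rule for which views are available. The following sketch illustrates that rule; the function name and the view labels are hypothetical, and only the limit of 30 hosts comes from this document.

```python
MAX_3D_HOSTS = 30  # limit stated in the prerequisites above


def available_views(host_count: int) -> list:
    """Return the views that can be shown for a cluster with host_count hosts."""
    if host_count < 0:
        raise ValueError("host_count must not be negative")
    if host_count <= MAX_3D_HOSTS:
        return ["3d", "table"]
    # Above the limit, only the tabular view is supported.
    return ["table"]
```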
Considerations
- If your cluster contains a large number of hosts, tenants, or Unit resources, real-time monitoring may encounter performance limitations. In this case, we recommend that you switch to the tabular view to check the monitoring metrics.
- Due to differences in the metric collection timing for some monitored objects, data may exhibit temporal inconsistencies.
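To make the second consideration concrete, the sketch below measures how far apart the collection times of several monitored objects are. It is an illustration under assumed inputs: the function name and the sample timestamps are hypothetical, and OCP does not necessarily expose collection times in this form.

```python
def collection_skew(collected_at: dict) -> float:
    """Return the gap in seconds between the oldest and newest sample.

    collected_at maps an object name to the time (epoch seconds) at which
    its metrics were collected.
    """
    if not collected_at:
        return 0.0
    times = list(collected_at.values())
    return max(times) - min(times)


# Hypothetical samples: the host metrics were collected 12.5 seconds later.
samples = {"tenant_a": 100.0, "10.0.0.1": 112.5}
skew = collection_skew(samples)
```

A nonzero skew means that values shown side by side may describe slightly different moments, which is why small mismatches between related metrics are expected.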
Procedure
Log in to OCP.
In the left navigation bar, click Cluster. The system defaults to the Clusters tab.
On the Clusters page, select the target cluster and click its name to go to the cluster's Overview page.
Click Real-Time Monitoring in the left navigation bar.
View real-time monitoring data in 3D
The system displays real-time monitoring data in 3D by default. The 3D canvas presents the data dynamically in two modules: the data panel on the left and the topology diagram on the right.
View overview data
The data panel on the left displays an overview of the current cluster's resource objects and an overview of alert statistics.
The topology diagram on the right displays the resource information of the services associated with the current cluster, giving you an intuitive overview of the topology among the OBProxy cluster, OceanBase cluster, and zones.
It supports displaying all resource objects of the associated services, including all OBProxy server resources, all tenant resources, and all host resources.
It supports displaying the health status of resource objects using different colors. Different colors represent different levels of alert information for the resource objects. Specifically: purple indicates a service-disruption alert, red indicates a critical alert, orange indicates a warning alert, and green indicates no alerts.
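The color rule above can be sketched as a lookup on the most severe active alert. The color-to-level mapping comes from this document. The severity ordering (service disruption above critical above warning) and all names in the code are assumptions made for illustration.

```python
from enum import IntEnum


class AlertLevel(IntEnum):
    # Higher value = more severe. This ordering is an assumption.
    NONE = 0
    WARNING = 1
    CRITICAL = 2
    SERVICE_DISRUPTION = 3


# Color per alert level, as described in the text above.
COLOR_BY_LEVEL = {
    AlertLevel.NONE: "green",
    AlertLevel.WARNING: "orange",
    AlertLevel.CRITICAL: "red",
    AlertLevel.SERVICE_DISRUPTION: "purple",
}


def health_color(active_alerts: list) -> str:
    """Return the color for a resource object given its active alert levels."""
    worst = max(active_alerts, default=AlertLevel.NONE)
    return COLOR_BY_LEVEL[worst]
```

For example, a resource with one warning alert and one critical alert would be shown in red under this assumed ordering, and a resource with no alerts would be shown in green.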
View detailed data
Click or hover the cursor over the resource object whose detailed data you want to view:
The data panel on the left displays the basic information, alert overview, and key performance metrics of the resource. The metrics displayed vary by resource object, as listed in the following table.
| Object Type | Metric Information |
| --- | --- |
| OBProxy Cluster | Cluster name, status, QPS, TPS, RT, number of request failures, number of client connections, client connection usage, process memory usage. |
| OBProxy Server | IP, status, QPS, TPS, RT, number of request failures, number of client connections, client connection usage, process memory usage. |
| OceanBase Cluster | Cluster name, status, QPS, TPS, number of sessions, number of active sessions, number of leader replicas, RT. |
| Tenant | Tenant name, status, QPS, TPS, RT, number of sessions, number of active sessions, thread usage, CPU consumption, number of leader replicas, request queue time, request queue size, and all unit information of the tenant. |
| OBServer | IP, status, QPS, TPS, number of sessions, number of active sessions, number of leader replicas, and corresponding unit information. |
| Host | IP, status, CPU usage, memory usage, disk I/O usage, TCP retransmission rate, disk usage, NTP clock offset, network throughput (send), network throughput (receive). |

Where, for the Tenant object:
- Unit metric information includes the affiliated OBServer, specification, QPS, TPS, RT, number of sessions, number of active sessions, thread usage, CPU consumption, number of leader replicas, request queue size, and request queue time. By default, the unit displays all monitoring items. If you do not need a certain monitoring item, you can click the filter button and uncheck that monitoring item in the panel.
- Click or hover the cursor over the IP of the affiliated OBServer to view the metric information of that OBServer and host.

Where, for the OBServer object:
- Unit metric information includes the affiliated tenant, specification, QPS, TPS, RT, number of sessions, number of active sessions, thread usage, CPU consumption, number of leader replicas, request queue size, and request queue time. By default, the unit displays all monitoring items. If you do not need a certain monitoring item, you can click the filter button and uncheck that monitoring item in the panel.
- Click or hover the cursor over the affiliated tenant to view the metric information of that tenant.

The topology diagram on the right uses flowing lines to display the upstream and downstream dependencies of resource objects.
View real-time monitoring data in table format
Click the Tables button in the upper-right corner to switch to table view. The system will display the structured data of different resource objects in a table format.
The OBProxy Cluster list supports displaying the real-time monitoring data of clusters, OBProxy servers, and hosts, as well as the alert overview data of clusters.
| Object Type | Metric Information |
| --- | --- |
| OBProxy Cluster (Cluster) | Cluster name, status, QPS, TPS, RT, number of request failures, number of client connections, client connection usage, process memory usage. |
| OBProxy Cluster (OBProxy Server) | IP, status, QPS, TPS, RT, number of request failures, number of client connections, client connection usage, process memory usage. |
| Host | Alert overview, IP, status, OBProxy cluster, CPU usage, memory usage, disk I/O usage, TCP retransmission rate, disk usage, NTP clock offset, network throughput (send), network throughput (receive). |

The OceanBase Cluster list supports displaying the real-time monitoring data of clusters, OBServer nodes, hosts, and tenants, as well as the alert overview data of clusters.

| Object Type | Metric Information |
| --- | --- |
| OceanBase Cluster (Cluster) | Cluster name, status, QPS, TPS, number of sessions, number of active sessions, number of leader replicas, RT. |
| OceanBase Cluster (OBServer) | IP, Zone, status, QPS, TPS, number of sessions, number of active sessions, number of leader replicas. |
| Host | Alert overview, IP, Zone, status, CPU usage, memory usage, disk I/O usage, TCP retransmission rate, disk usage, NTP clock offset, network throughput (send), network throughput (receive). |
| Tenant (Tenant) | Tenant name, status, QPS, TPS, RT, number of sessions, number of active sessions, thread usage, CPU consumption, number of leader replicas, request queue time, request queue size. |
| Tenant (Unit) | Specification, affiliated OBServer, QPS, TPS, RT, number of sessions, number of active sessions, thread usage, CPU consumption, number of leader replicas, request queue size, request queue time. |
Related operations
- Modify the auto-refresh frequency: The system refreshes monitoring data every 30 seconds by default. You can modify it to 10 Seconds, 20 Seconds, 1 Minute, 2 Minutes, or 5 Minutes as needed. After modification, the system will automatically refresh monitoring data at the selected frequency.
- View alert details: If an alert exists for the monitored object, click the corresponding alert level to navigate to the alert page and view the alert details.
- Adjust the topology diagram:
  - Drag the mouse in the blank area to freely move the topology diagram.
  - Click the full-screen icon to adjust the topology diagram to full-screen mode.
  - Click the zoom-in/out icon or scroll the mouse wheel up and down to zoom in or out on the topology diagram.
  - Click the restore icon to quickly restore the topology diagram to its original scale.
  - Click the fit-to-canvas icon to make the topology diagram dynamically adapt to the canvas size.
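The auto-refresh options listed under Related operations can be summarized as a small validation routine. This is a sketch only: the constant and function names are hypothetical, the interval values come from this document, and treating the 30-second default as a selectable option is an assumption.

```python
from typing import Optional

DEFAULT_REFRESH_SECONDS = 30
# 10 s, 20 s, 30 s (default), 1 min, 2 min, 5 min, expressed in seconds.
REFRESH_OPTIONS_SECONDS = (10, 20, 30, 60, 120, 300)


def resolve_refresh_interval(seconds: Optional[int] = None) -> int:
    """Return the refresh interval to use, falling back to the default."""
    if seconds is None:
        return DEFAULT_REFRESH_SECONDS
    if seconds not in REFRESH_OPTIONS_SECONDS:
        raise ValueError(
            f"unsupported interval {seconds}s; choose one of {REFRESH_OPTIONS_SECONDS}"
        )
    return seconds
```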
