OceanBase Cloud Platform

V4.3.3

Document Overview
Product Introduction
- What is OceanBase Cloud Platform?
- Differences between the features of OCP Enterprise Edition and OCP Community Edition
- System architecture
- Features
- Limits
  - System requirements
  - Dependencies
Deploy OceanBase Cloud Platform (OCP)
- Deploy OceanBase Cloud Platform (OCP) Enterprise Edition
  - Deployment overview
    - Deploy an OCP cluster in single-node mode
    - Deploy an OCP cluster in high-availability mode
  - Prepare the deployment environment
    - Install OAT
    - Add a server
    - Check the environment
  - Deploy OCP
    - Plan resources
      - Plan resources for a single-node OCP
        OCP-Server specifications
        MetaDB resources
        MonitorDB resources
        Host resources
      - Plan resources for a high-availability OCP
        OCP-Server specifications
        MetaDB resources
        MonitorDB resources
        Host resources
    - Deploy OCP
      - Deploy a single-node OCP
        Prepare installation media
        Create MetaDB
        Install OCP
        Initialize system parameters
      - Deploy a high-availability OCP
        Prepare installation media
        Create MetaDB
        Install OCP
        Register an OCP cluster
        Initialize system parameters
    - Check after deployment
  - Deployment FAQ
    - Optimize MonitorDB parameters
    - F5 Big-IP LTM and alert configuration
      - Connect OBProxy to F5 Big-IP LTM
      - Connect OCP-Console to F5 Big-IP LTM
      - Monitor the OCP service health
    - Ubuntu deployment FAQ
    - EulerOS deployment FAQ
    - Client requirements
    - Start and stop OCP
  - Appendix
    - Uninstall OCP
    - Check the NTP offset
- Deploy OceanBase Cloud Platform (OCP) Community Edition
  - Overview
  - Installation process
  - Installation planning
    - User planning
    - Host planning
    - Disk and directory planning
    - Port planning
  - Prepare for the installation
  - Deploy OCP on the GUI
  - Use Docker to deploy OCP
  - Post-deployment check
  - Appendix
    - Install Docker
    - Uninstall OCP
    - Restart OCP
    - FAQ about deployment
Upgrade OceanBase Cloud Platform (OCP)
- Upgrade OceanBase Cloud Platform (OCP) Enterprise Edition
  - Overview
  - Considerations
  - Preparations
  - Upgrade OCP
    - Take over OCP
    - Upgrade OCP
  - Verify after upgrade
    - Verify the cluster feature
    - Verify the tenant feature
    - Verify the host feature
    - Verify the software package feature
    - Verify the OBProxy feature
    - Verify the backup and restore feature
    - Verify the alert feature
    - Verify the task feature
    - Verify the password box feature
  - Appendix
    - Change a user's password
    - Version mapping
- Upgrade OceanBase Cloud Platform (OCP) Community Edition
  - Upgrade OCP on the GUI
  - Upgrade OCP using Docker containers
  - Post-upgrade check
Quick Start
- OCP operations
- Log on to the OCP console
- Upload a software package
- Add a host
- Create a cluster
- Create a tenant
- Create a user
- Create an OBProxy cluster
Cluster Management
- Overview
- Create a cluster
  - Create a distributed cluster
  - Create a standalone centralized database
- Manage clusters
  - Create a standby cluster
  - View the details of a cluster
  - Upgrade a standalone centralized database to a distributed cluster
  - Take over a cluster
  - Restart a cluster
  - Stop a cluster
  - Move out a cluster
  - Delete a cluster
  - Change the password
  - Enable automatic detection of deadlocks
  - Disable automatic detection of deadlocks
  - Manage O&M configuration
    - Overview
    - Manage parameters
    - View the parameter modification history
  - Manage CPU binding configurations
- Manage Arbitration Service
  - Arbitration service overview
  - Create an arbitration service
  - Take over an arbitration service
  - Stop an arbitration service
  - Start an arbitration service
  - Upgrade an arbitration service
  - Restart an arbitration service
  - Migrate an arbitration service
  - Delete an arbitration service
  - Add an arbitration service
  - Replace an arbitration service
  - Remove an arbitration service
- Manage zones of a cluster
  - Add a zone
  - Edit a zone
  - Restart a zone
  - Stop a zone
  - Delete a zone
- Manage OBServers of a cluster
  - Add an OBServer node
  - Restart a faulty OBServer node
  - Stop an OBServer service
  - Stop the observer process
  - Replace an OBServer node
  - Reinstall an OBServer
  - Delete an OBServer
- Upgrade an OceanBase cluster
- View the topology of a cluster
- Overview of cluster tenant management
- Manage cluster resources
  - View the unit distribution
  - View resource usage
- Manage major compaction of a cluster
  - Modify the major compaction settings of a cluster
  - Perform a major compaction
  - View details of a major compaction
  - View statistics of a major compaction
- Manage cluster parameter templates
- Manage cluster parameters
  - Parameter types
  - View parameters
  - Modify parameters
  - View the history of parameter changes
Tenant Management
- Tenant and resource management
- View tenant overview information
- Create a primary tenant
- Manage unit specifications
- View OCP resource unit specifications
- Manage tenants
  - View the details of a tenant
  - Create a standby tenant
  - Copy a tenant
  - Rename a tenant
  - Lock a tenant
  - Clone a tenant
  - Delete a tenant
  - Change the password of the sys tenant
  - Configure full link tracking for a tenant
  - Modify the allowlist of a tenant
  - Modify zone priorities
  - Manage the binlog service for a tenant
  - Manage service names
- Manage the topology of a tenant
  - View the topology of a tenant
  - View the topology of primary/standby relationships of a tenant
- Manage replicas of a tenant
  - Add a replica
  - Edit a tenant replica in a zone
  - Delete a tenant replica from a zone
- Manage databases
- Manage users and permissions under a tenant
  - Manage users in a MySQL tenant
  - Manage users under an Oracle tenant
    - User overview under an Oracle tenant
    - Create a user under an Oracle tenant
    - Change the password of a user under an Oracle tenant
    - Delete a user under an Oracle tenant
    - Create a role under an Oracle tenant
    - Manage users
    - Manage roles
  - System privileges in a MySQL tenant
- Manage resource isolation
  - Overview
  - Create a resource group
  - Create a resource isolation plan
  - Enable or disable a resource isolation plan
  - Modify a resource isolation plan
  - Delete a resource isolation plan
  - Modify a resource group
  - Delete a resource group
- Manage the resources of a tenant
- Manage major compaction under a tenant
  - Modify major compaction settings of a tenant
  - View details of tenant major compaction
  - Perform a major compaction
- Manage statistical information
  - Overview
  - Collect statistics
  - Manage statistics collection tasks
- Manage tenant parameter templates
- Manage tenant parameters
  - View the parameters
  - Modify a parameter
  - View the parameter modification history
OBProxy Management
- OBProxy management overview
- View details on the OBProxy page
- Create an OBProxy cluster
- Manage OBProxy Clusters
  - View details on the Overview page of an OBProxy cluster
  - Add a connectable OceanBase cluster
  - Manage load balancing
  - Change the password of the proxysys user
  - Change the password of the proxyro user
  - Move out an OBProxy cluster
  - Delete an OBProxy cluster
  - Upgrade an OBProxy cluster
  - Manage CPU core binding configurations
  - Stop or start an ODP cluster
  - Restart an ODP cluster
  - Delete a connected OceanBase cluster
  - Manage the proxyro account
- Manage OBProxy Servers
  - Add an OBProxy
  - Take over an OBProxy
  - Delete an OBProxy
  - Restart an OBProxy
  - Stop or start an OBProxy
  - Remove an OBProxy
  - Launch an OBProxy
  - Upgrade an OBProxy
  - Refresh OBProxy configurations
- Request analysis
- Parameter management
- Client configuration
- OBProxy parameter descriptions
Binlog Service Management
- Binlog service overview
- Create a binlog cluster
- Manage Binlog Clusters
  - View information about a binlog cluster
  - Start and stop a binlog cluster
  - Restart a binlog cluster
  - Delete a binlog cluster
  - Manage Binlog Instance
    - View a binlog instance
    - Configure throttling
    - Start and stop a binlog instance
    - Restart a binlog instance
    - Manage binlog instance parameters
  - Manage Binlog Server
    - Add a binlog node
    - Start and stop a binlog server
    - Delete a binlog server
    - Clear binlog instances on a host
- Manage binlog cluster parameters
Host Management
- Host management operations
- View details of a host
- Add a host
- View the details of tasks on a host
- Modify a host
- Restart OCP-Agent
- Reinstall OCP-Agent
- Remove a host
- Perform host standardization checks
- Host standardization check items
- OCP-Agent processes
- OCP-Agent O&M Script Instructions
Alert Management
- Overview
- Alert-related concepts
- Manage alert rules
  - Create an alert rule
  - View an alert rule
  - Copy an alert rule
  - Edit an alert rule
  - Delete an alert rule
  - Alert rule groups
  - Export alert rule configurations
- Manage alert templates
  - Create an alert template
  - Set alert objects
  - Export alert templates
  - View an alert template
  - Copy an alert template
  - Edit an alert template
  - Delete an alert template
- Manage alert channels
  - Create an alert channel
  - View an alert channel
  - Edit an alert channel
  - Copy an alert channel
  - Delete an alert channel
  - Examples of alert channel configuration
- Manage alert push
  - Create an alert push
  - View an alert push
  - Edit an alert push
  - Delete an alert push
- View alert events
- Manage blocking conditions
  - Create a blocking condition
  - Edit an alert blocking condition
  - Delete an alert blocking condition
- OCP alert template tag
- Optional monitoring metrics for custom alerts
Dashboard Management
- Monitoring dashboard overview
- Manage dashboards
- Manage groups
- Manage charts
Performance Monitoring
- Manage monitoring
  - Overview
  - View database performance data
  - View host performance data
  - View OBServer node performance data
  - View the performance and SQL monitoring data of a tenant
  - View transaction performance data
  - View storage and cache monitoring data
  - View database object monitoring data
  - View OBKV-Table monitoring data
  - View OBKV-HBase monitoring data
  - View subscription connection monitoring charts
  - View performance monitoring data of a binlog instance
  - View resource monitoring data of a binlog instance
  - View host resource monitoring data
  - View host process monitoring data
  - View service monitoring data
  - View system monitoring data
  - View performance monitoring
  - Drill-down monitoring
  - Integrate monitoring data to an external time-series system
- Use the custom monitoring feature
  - Overview
  - View collection items
  - Manage indicator items
    - Create a metric
    - Manage metrics
  - Manage charts
    - Create a chart
    - Manage charts
  - OCP metrics
Diagnostics and Tuning
- SQL Diagnostics
  - SQL Diagnostics Introduction
    - SQL diagnostics overview
    - SQL statement execution statistics
    - SQL execution plans
  - View the OceanBase Autonomy Service page
  - Diagnose suspicious SQL statements
  - Diagnose top SQL statements
  - View a comparison of top SQL statements
  - Diagnose slow SQL statements
  - Diagnose parallel SQL statements
  - Diagnose high-risk SQL statements
  - Diagnose new SQL statements
  - View details of an SQL statement
  - View outlines
  - View the SQL collection enable/disable history
  - View SQL request analysis
  - Parameters related to SQL performance diagnostics
- Transaction Diagnostics
  - Overview of transactions
  - Diagnose transactions
  - Diagnose XA transactions and suspended transactions
- Manage sessions
  - Manage the sessions of a tenant
  - View session statistics
  - View row lock analytics
- View the Optimization Center page
- View the capacity center
- View diagnostics reports
  - Manage ASH reports
  - Performance report
O&M Management
- Manage Information Collection
  - Overview
  - Configure information collection items
  - Create an information collection task
  - View information collection details
  - Download an information collection report
- Manage Plans
  - Overview
  - Manage plans
  - View plan execution details
- Manage Inspection
  - Inspection
  - Configure inspection items
  - Configure Scheduling Rules
    - Configure a scheduling rule for a single object
    - Configure a global scheduling rule
  - Initiate Inspection
    - Initiate an inspection for a single object
    - Initiate a global inspection
  - View an inspection task
  - Manage Inspection reports
    - View an inspection report
    - Download an inspection report
Backup and Recovery
- Overview
- Backup and restore guide
- Overview
- Backup now
  - Back up cluster now
  - Back up a tenant now
- Regular backup
  - Manage cluster-level backup strategy
    - Create a cluster-level backup strategy
    - Modify a cluster-level backup strategy
    - Delete a cluster-level backup strategy
    - Pause backup scheduling
    - View a cluster-level backup strategy
  - Manage tenant-level backup strategy
    - Create a tenant-level backup strategy
    - Modify a tenant-level backup strategy
    - Delete a tenant-level backup strategy
    - Pause backup scheduling
    - View a tenant-level backup strategy
- Manage backup tasks
  - View backup tasks
  - View secondary backup tasks
- Initiate a restore task
- Regular recovery
  - Create a sampling strategy for recovery
  - View sampling strategies
  - View sampling tasks
- View restore tasks
- Manage backup and recovery service
  - View services
  - Install a service
  - Add a node
  - Upgrade the version of a service
  - Update service configurations
  - Copy a service
  - Delete a service
  - Stop a service on a node
  - Restart a service on a node
  - View O&M tasks
  - Uninstall a service on a node
Disaster Recovery
- Switching primary and standby tenants
  - Switchover
  - Failover
  - Decouple a standby tenant from its primary tenant
  - Create a standby tenant for a primary tenant managed by another OCP cluster
- Switching primary and standby databases
  - Routine primary/standby cluster switchover
  - Primary/Standby cluster switchover for disaster recovery
  - Decouple a standby cluster from the primary cluster
  - Start the original primary cluster in read-only mode after a failover
- Manage OCP clusters in the multi-cluster mode
  - OCP multi-cluster mode overview
  - Register an OCP cluster
  - View Leader & Follower Details
  - Manage OCP cluster parameters
  - Switch an OCP cluster in daily maintenance
  - Switch an OCP cluster in a failover
  - Delete a faulty follower OCP cluster
  - Unbind a leader and a follower OCP cluster
  - Enable the OCP multi-cluster mode
Log Management
- Query logs
- Download logs
- Configure trace query parameters
- Trace query
- About OpenSearch
Software Package Management
- Upload a software package
- Download a software package
- Delete a software package
System Management
- Manage password box
  - Create a credential
  - Export a credential
  - Import a credential
  - Verify a credential
  - Edit a credential
  - Delete a credential
  - Batch operations on credentials
- Manage users
  - Manage a user
    - Users overview
    - Create a user
    - View users
    - Edit a user
    - Copy a user
    - Change a user password
    - Delete a user
    - Logon history
  - Manage a role
    - Role overview
    - Create a role
    - Copy a role
    - View a role
    - Edit a role
    - Delete a role
    - Default OCP roles
- Manage tags
  - Overview
  - Create a tag
  - Modify objects bound to a tag
  - Edit a tag
  - Delete a tag
- View operation records
- Manage external integration
  - Overview
  - Create an SSO integration task
  - Enable/Disable an SSO integration task
  - Edit an SSO integration task
  - Delete an SSO integration task
- Manage system parameters
  - View system parameters
  - Modify system parameters
  - OCP configuration parameters
- Manage tasks
- User center management
  - Configure personal information
  - Change your logon password
  - Log off the OCP console
SQL Tuning Practices
- Locate General Abnormal SQL Statements
  - How can I locate SQL statements with high CPU load in a tenant?
  - How can I locate slow SQL statements that take a long time in a tenant?
  - How can I locate SQL statements for full-table scans in a tenant?
  - How can I locate distributed SQL statements in a tenant?
  - How can I locate remote SQL statements in a tenant?
  - How can I locate slow hard-parsed SQL statements in a tenant?
  - How can I locate erroneous SQL statements in a tenant?
- Customize Abnormal SQL Statement Determining Rules
  - Conditions
  - Custom column rules
- General SQL Performance Tuning Scenarios
  - Execution Plan Optimization
    - SQL statement performance deterioration caused by plan changes
  - Index Optimization
    - No suitable indexes
    - Inappropriate index of an SQL statement
    - Invalid index of an SQL statement
O&M Best Practices
- Upgrade an OceanBase cluster
- Migrate a resource unit from an OceanBase Database tenant
- Expand the high availability of OceanBase clusters and tenants
- Reduce the high availability of OceanBase clusters and tenants
- Scale out an OceanBase cluster and scale up an OceanBase Database tenant
- Scale in an OceanBase cluster and scale down an OceanBase Database tenant
- Troubleshoot host issues in an OceanBase cluster
- Migrate a cluster between IDCs in OCP
- Migrate an OceanBase cluster to another IDC
- Perform a failover between primary and standby OceanBase clusters for disaster recovery in a scenario with two IDCs
- Perform a switchover between primary and standby OceanBase Database tenants in a scenario with two IDCs
- Use the arbitration service of OceanBase Database to achieve HA in a scenario with two IDCs
- Scale out an OBProxy cluster
- Scale in an OBProxy cluster
- Replace an OBProxy
- Move out an OBProxy cluster
- Take over OBProxies
- Delete an OBProxy
- Perform a failover
- Decouple a standby tenant from the primary tenant
- Automatic routing to the new primary tenant after a switchover
- Back up an OceanBase cluster
- Use the backup and restore module to restore data
- Monitor the business load of OceanBase Database
- Integrate OCP monitoring into Prometheus
- Configure custom monitoring
- Perform an inspection in OCP to detect potential risks in an OceanBase cluster
- Check clog synchronization
- Check the NIC rate
- Check the availability of an auto-increment column
- Send alert messages to a DingTalk group
- Send alert messages to a Feishu group
- Send alert messages to a WeChat Enterprise group
- Send alert messages through email
- Push alert messages through HTTP
Reference Guide
- Alarm reference
  - Overview
  - OceanBase alerts
    - ob_cannot_connected
    - ob_cluster_rs_not_same
    - ob_cluster_status_check_failed
    - ob_cluster_exists_inactive_server
    - ob_cluster_exists_index_fail_table
    - ob_cluster_frozen_version_delta_over_threshold
    - ob_cluster_merge_error
    - ob_cluster_merge_timeout
    - ob_cluster_no_frozen
    - ob_cluster_no_merge
    - ob_cluster_operation_info
    - ob_cluster_sync_failed
    - ob_cpu_assigned_percent_over_threshold
    - ob_cpu_percent_over_threshold
    - ob_host_connection_percent_over_threshold
    - ob_host_disk_readonly
    - ob_tenant_expired_trans_exist
    - ob_host_load1_per_cpu_over_threshold
    - ob_host_partition_count_over_threshold
    - OceanBase log alerts
    - ob_mem_assigned_percent_over_threshold
    - ob_server_sstable_percent_over_threshold
    - ob_tenant_long_trans_exist
    - ob_tenant_operation_info
    - ob_tenant500_mem_hold_over_threshold
    - ob_tenant500_mem_hold_percent_over_threshold
    - tenant_active_memstore_percent_over_threshold
    - tenant_cpu_percent_over_threshold
    - tenant_memstore_percent_over_threshold
    - obproxy_process_dead
    - obproxyd_process_dead
    - obproxy_cannot_connected
    - ob_cluster_sync_delay_time_too_long
    - ob_host_data_disk_percent_over_threshold
    - ob_host_log_disk_percent_over_threshold
    - ob_host_install_disk_percent_over_threshold
    - ob_tenant_exists_expired_xa_trans
    - ob_cluster_active_session_count_over_threshold
    - ob_tenant_active_session_count_over_threshold
    - ob_host_active_session_count_over_threshold
    - ob_tenant_slow_sql_exists
    - ob_tenant_large_trans_exist
    - same_alarm_rule_detect_too_many_targets
    - ob_tenant_expired_trans_exist
    - ob_tenant_long_trans_exist
    - ob_tenant_task_timeout
    - ob_host_task_timeout
    - ob_tenant_log_disk_usage_high
    - ob_tenant_no_compaction
    - ob_tenant_no_frozen
    - ob_tenant_compaction_error
    - host_operation
    - obproxy_host_operation
    - obproxy_cluster_operation
    - ob_host_operation
    - refresh_location_cache_failed
    - ob_tenant_log_stream_degraded
    - arbitration_service_unavailable
    - observer_process_stop
    - obproxyd_process_stop
    - obproxy_process_stop
    - standby_tenant_sync_delay_too_long
    - standby_tenant_sync_status_error
    - obproxy_client_connections_usage_over_threshold
    - ob_tenant_request_queue_over_threshold
    - oas_anomaly_sql_from_anomaly_event_analysis_cpu_percent_high
    - oas_anomaly_sql_from_anomaly_event_analysis_perf_degradation
    - oas_anomaly_sql_from_anomaly_event_analysis_plan_changed
    - oas_anomaly_sql_from_sql_inspection_perf_degradation
    - oas_anomaly_sql_from_sql_inspection_plan_changed
    - os_observer_not_exist
    - refresh_location_cache_failed_by_metric
    - ob_tenant_cpu_usage_over_threshold
    - ob_tenant_partition_replica_absent
    - ob_tenant_partition_leader_absent
    - ob_tenant500_storage_short_meta_mem_hold_high
    - obproxy_core_dump
    - ob_server_cannot_connect_arbitration
    - agent_process_count_abnormal
  - Application alerts
    - ob_cluster_inspection_not_passed
    - ob_host_ssd_wear_indicator_over_threshold
    - ob_host_mem_percent_over_threshold
    - ob_host_net_recv_percent_over_threshold
    - ob_host_net_send_percent_over_threshold
    - ob_host_tcp_retrans_percent_over_threshold
    - ob_host_cpu_percent_over_threshold
    - inc_backup_delay
    - base_backup_fail
    - base_backup_too_long_time_no_one_success_task
    - backup_process_dead
    - backup_storage_capacity_over_threshold
    - backup_storage_capacity_retry_times_exceeded
    - backup_storage_capacity_timeout_or_interrupted
    - ocp_remote_server_time_diff_too_large
    - monitor_exporter_unavailable
    - system_obproxy_unavailable
    - host_unavailable
    - host_ntp_offset_too_large
    - host_ntp_service_not_exist
    - host_agent_res_memory_over_threshold
    - host_agent_open_fd_count_over_threshold
    - host_agent_goroutine_count_over_threshold
    - partition_create_failed
    - obagent_dead
    - host_disk_readonly
    - ic_server_connect_failed Inter-Connector
    - vpc_connect_failed
    - node_load1_peak
    - Host log alerts
    - base_backup_timeout
    - base_secondary_backup_fail
    - ocp_meta_db_disconnected
    - ocp_http_request_timeout
    - ocp_http_request_too_many_errors_occur
    - ocp_alarm_detect_timeout
    - odp_instance_compress_failed
    - odp_instance_expanse_failed
    - odp_sql_execute_failed
    - odp_sql_query_slow
    - agentd_process_stop
    - mgragent_process_stop
    - monagent_process_stop
    - ocp_contingency_failed
    - ob_host_invalid_disk_exists
    - ob_cluster_recyclebin_disk_used_over_threshold
    - ob_host_monitordb_disconnected
    - upgrade_ocp_agent_failed
    - host_agent_version_not_same
    - ocp_collect_metric_failure_rate_high
  - OAS alerts
    - os_cpu_irq_error
    - os_tsar_cpu_sys_abnormal
    - os_observer_fd_usage
    - os_tsar_cpu_util_full
    - os_tsar_cpu_util_hwm
    - os_kernel_io_hang
    - os_tsar_sda_ioawait
    - os_tsar_nvme_ioawait
    - os_tsar_traffic_drop
    - os_tsar_traffic_error
    - os_observer_core_dump
    - os_nic_1000m_full
    - os_nic_1000m_hwm
    - os_tsar_traffic_overload
    - node_file_root_usage
    - node_file_inode_usage
    - os_kernel_ntp_down
    - os_kernel_ntp_delay
    - node_memory_peak
    - os_home_file_usage
    - node_file_data1_usage
    - node_file_datalog1_usage
    - sql_audit_collect_lost_percent_over_threshold
  - Appendix
    - Exception handling for OceanBase cluster compaction
    - Apply throttling to an OceanBase cluster
    - Network troubleshooting
    - Execute the alert clearance plan
- API Reference
  - Overview
  - API call description
  - Rules for generating a signature by using AK/SK
  - Task return structure
  - Cluster management
    - Query OceanBase clusters
    - Create an OceanBase cluster
    - Delete an OceanBase cluster
    - Stop an OceanBase cluster
    - Start an OceanBase cluster
    - Restart an OceanBase cluster
    - Upgrade an OceanBase cluster
    - Move out an OceanBase cluster
    - Perform a takeover precheck on an OceanBase cluster
    - Take over an OceanBase cluster
    - Query zones of an OceanBase cluster
    - Create a zone for an OceanBase cluster
    - Delete a zone from an OceanBase cluster
    - Stop a zone in an OceanBase cluster
    - Start a zone in an OceanBase cluster
    - Restart a zone in an OceanBase cluster
    - Add an OBServer
    - Delete multiple OBServers at a time
    - Stop an OBServer
    - Start an OBServer
    - Restart an OBServer
    - Replace an OBServer
    - Change the password of an OceanBase cluster
    - Query the description information of the OceanBase cluster parameters
    - Query parameters of an OceanBase cluster
    - Modify parameters of an OceanBase cluster
    - Query the list of OBServers in an OceanBase cluster
    - Query the list of OBServers in a zone
    - Perform a switchover pre-check
    - Perform a failover pre-check
    - Perform a switchover
    - Perform a failover
    - Obtain resource statistics of an OceanBase cluster
    - Obtain resource statistics of all OBServer nodes in a cluster
    - Query the list of resource units in an OceanBase cluster
  - Tenant management
    - Query tenants of a cluster
    - Query all tenants
    - Query details about a tenant
    - Create a tenant
    - Create a standby tenant
    - Delete a tenant
    - Lock a tenant
    - Unlock a tenant
    - Query units of a tenant
    - Delete a unit from a tenant
    - Add a replica for a tenant
    - Delete a replica of a tenant
    - Modify a replica of a tenant
    - Modify zone priorities of a tenant
    - Change the administrator password of a tenant
    - Modify the whitelist of a tenant
    - Query parameters of a tenant
    - Modify parameters of a tenant
    - Obtain a list of unit specifications
    - Create a resource unit specification
    - Delete a resource unit specification
    - Perform a switchover precheck
    - Perform a switchover
    - Perform a batch switchover precheck
    - Perform a batch switchover
    - Perform a failover precheck
    - Perform a failover
    - Perform a batch failover precheck
    - Perform a batch failover
  - OBProxy management
    - Create an OBProxy cluster
    - Delete an OBProxy cluster
    - Query OBProxy clusters
    - Query details about an OBProxy cluster
    - Update configurations of an OBProxy cluster
    - Add an OBProxy
    - Take over an OBProxy
    - Delete an OBProxy
    - Restart an OBProxy
    - Upgrade an OBProxy
    - Add a connectable OceanBase cluster for an OBProxy cluster
    - Remove a connectable OceanBase cluster from an OBProxy cluster
    - Query the description information of the OBProxy parameters
    - Query parameters of an OBProxy cluster
  - Database management
    - Query databases
    - Create a database
    - Modify a database
    - Delete a database
    - Query the list of database users
    - Query database user details
    - Create a database user
    - Delete a database user
    - Change the password of a database user
    - Lock a database user
    - Unlock a database user
    - Query the list of database roles
    - Query database role details
    - Create a database role
    - Delete a database role
    - Query database objects
    - Grant global privileges to a user
    - Revoke global privileges of a user
    - Change global privileges of a user
    - Grant global privileges to a role
    - Revoke global privileges of a role
    - Change global privileges of a role
    - Grant roles to a user
    - Revoke roles of a user
    - Change roles of a user
    - Grant roles to a role
    - Revoke roles of a role
    - Change roles of a role
    - Grant database privileges to a user
    - Revoke database privileges of a user
    - Change database privileges of a user
    - Grant object privileges to a user
    - Revoke object privileges of a user
    - Change object privileges of a user
    - Grant object privileges to a role
    - Revoke object privileges of a role
    - Change object privileges of a role
  - Monitoring
    - Query the description information of monitoring metrics
    - Query monitoring data
    - Query monitoring data by tag
  - Alerts
    - Call an alert API
    - Alert events
      - Query the alert event list
      - Query alert events
    - Alert notifications
      - Query alert notifications
  - Inspection
    - Query all inspection tasks
    - Query inspection objects
    - View the details of an inspection report
    - Initiate an inspection
    - Query the last inspection result of an inspection item
    - Query the last inspection result of an inspection object
  - SQL performance
    - Query performance indicators of an SQL statement
    - Query the performance indicator trend of an SQL statement
    - Query SQL text
    - Query performance indicators of an execution plan
    - Query the performance indicator trend of an execution plan
    - Query the operator structure of an execution plan
    - Query the list of slow SQL statements
    - Query snapshots
    - Generate a performance report
    - Query a performance report
  - Backup and recovery
    - Query backup capabilities of a cluster
    - Create a backup strategy for a cluster
    - Modify the backup strategy of a cluster
    - Delete the backup strategy of a cluster
    - Query the backup strategy of a cluster
    - Query the backup overview of a cluster
    - Query data backup tasks of a cluster
    - Query log backup tasks of a cluster
    - Query recovery tasks of a cluster
    - Immediately back up a cluster
    - Parse cluster backup data
    - Start backup scheduling for a cluster
    - Stop backup scheduling for a cluster
    - Initiate tenant recovery
    - Preview a restore task
    - Add restore resources
    - Clear added restore resources
    - Initiate a data backup for a tenant
    - Create a backup strategy for a tenant
    - Delete the backup strategy of a tenant
    - View the data backup tasks of a tenant
    - View the log backup tasks of a tenant
    - View restore tasks
    - Parse the backup information of a tenant
    - Query the backup strategy of a tenant
    - Modify the backup strategy of a tenant
    - Start backup scheduling for a tenant
    - Stop backup scheduling for a tenant
  - Host management
    - Query regions
    - Query details about a region
    - Add region information
    - Delete a region
    - Query IDCs
    - Query details about an IDC
    - Add IDC information
    - Delete an IDC
    - Query host types
    - Query details about a host type
    - Add host type information
    - Delete a host type
    - Query hosts
    - Query details about a host
    - Add multiple hosts at a time
    - Delete a host
    - Delete multiple hosts at a time
  - Software packages
    - Query software packages
    - Upload a software package
    - Delete a software package
  - O&M tasks management
    - Query tasks
    - Query details about a task
    - Retry a task
    - Roll back a task
    - Query subtask logs
    - Retry a subtask
    - Skip a subtask
    - Cancel a subtask
  - System management
    - Health examination
      - Query the basic information of an OCP application
      - Query OCP server time
      - Query OCP application status
    - Operation Audit
      - Query event history
    - Agent management
      - Query OCP Agent details on a host
      - Stop the OCP Agent process on a host
      - Batch stop the OCP Agent processes on hosts
      - Restart the OCP Agent process on a host
      - Batch restart the OCP Agent processes on hosts
      - Query OCP Agent processes on a host
      - Restart the OCP Agent process on a host
      - Stop the OCP Agent process on a host
- Metric reference
  - Overview of metrics
  - Monitoring metrics list
  - OceanBase cluster
    - QPS
    - Query response time
    - TPS
    - Transaction response time
    - Request queue time
    - Number of sessions
    - TPS
    - Number of transaction logs
    - Transaction log volume
    - Time consumed by transaction logs
    - IOPS
    - IO time-consuming
    - I/O Throughput
    - Process memory
    - Process CPU utilization
    - Process file descriptors
    - Number of process threads
    - sys500 memory occupation
    - Error logs
  - OceanBase Database tenant
    - QPS
    - Response time (SQL mode)
    - Sessions
    - SQL execution plan category
    - SQL execution plan time
    - Wait events
    - Metrics related to time consumption of wait events
    - Request waiting queue
    - Request waiting queue time consuming
    - Request queue size
    - Tenant CPU cost
    - Tenant thread usage
    - Memory usage percent
    - MemStore usage
    - Rpc package rt
    - RPC packet throughput
    - Cursors
    - Clog synchronization delay
    - DB time metrics
    - Metrics for the SQL execution phase
    - Frontend workload metrics
    - Background workload metrics
    - PX worker thread usage
    - TPS
    - Transaction response time
    - Transaction response time details
    - Number of transaction logs
    - Transaction log volume
    - Time consumed by transaction logs
    - Lock waits
    - Time consumed by lock waits
    - TPS
    - Partitions
    - Number of XA transactions
    - Number of XA statements
    - Transaction table read request hits
    - Average execution duration of XA statements
    - MemStore
    - IOPS
    - IO time-consuming
    - I/O Throughput
    - Cache size
    - Cache hit rates
    - Number of cache requests
    - Frozen MemStores
    - Log disk
    - Data disk
    - Objects
    - QPS (OBKV-Table)
    - Response time (OBKV-Table)
    - Average number of statement lines processed (OBKV-Table)
    - QPS (OBKV-HBase)
    - Response time (OBKV-HBase)
    - Average number of statement lines processed (OBKV-HBase)
  - Binlog Service
    - Sending Delay
    - Associate RPS
    - Network Traffic
    - Binlog Delay
    - Binlog RPS
    - CPU
    - memory used
    - memory
    - disk used
    - Binlog Disk Write Rate
    - disk ratio
    - FD count
    - network bytes
  - OBProxy cluster
    - TPS
    - QPS
    - Client connections
    - Server connections
    - Average response time for each SQL statement
    - Memory for the obproxy process
    - OBProxy CPU usage
    - Number of file descriptors for the obproxy process
    - Number of threads in the obproxy process
    - Log error
    - Route table queries
    - Net bytes
  - Host
    - Linux system load
    - CPU Utilization
    - I/O usage
    - I/O queue length
    - IOPS
    - Time consumed by I/O operations
    - I/O throughput
    - Network throughput
    - TCP retransmission rate
    - Packet forwarding
    - NTP clock offset
    - Inode usage
    - Memory
    - Memory usage
    - Disk
    - Process resident memory
    - Process CPU utilization
    - Number of process file descriptors
    - Number of process threads
    - Number of OCP-Agent process goroutines
- Information Collection Reference
  - Major Compaction Exceptions
    - Collect information about the major compaction status of zones of a cluster
    - Collect information about the ongoing major compaction in a cluster
    - Collect suggestions provided based on historical major compactions
    - Collect diagnostic information about major compactions
    - Collect information about DAG warnings in OceanBase Database
    - Collect RootService task records
    - Collect the last-one-day records of major compaction events scheduled by RootService
    - Collect the checksum information about tablets
    - Collect the checksum information about columns
    - Collect major compaction parameter settings
    - Collect the major compaction time
  - OCP Agent Exceptions
    - Collect panic logs of the mgragent process
    - Collect panic logs of the monagent process
    - Collect information about goroutines of the agent processes
  - Cluster Exceptions
    - Collect information about bad disks
    - Collect information about OBServer nodes in a cluster
    - Collect obstack information on a cluster node
    - Collect information about the core dump file of the observer process
    - Collect host logs containing the CRASH ERROR keyword
    - Collect connectivity information of the sys tenant
  - CPU Exceptions
    - Collect information about processes with high CPU utilization on a cluster node
    - Collect information about threads with high CPU utilization in an observer process
  - Data Disk Full
    - Collect the data disk usage information of nodes in a cluster
    - Collect incremental data generated during cluster compactions
    - Collect information about the disk space occupied by temporary files of a tenant
    - Collect information about sizes of data of different versions in major SSTables
    - Collect information about the macroblock utilization of nodes
  - Memory Exceptions
    - Collect information about modules with highest memory usage
    - Collect memory allocation sampling data of OBServer nodes
    - Collect KV cache information of a tenant
    - Collect memory leakage detection information
    - Collect MemStore information of a tenant
  - Host Exceptions
    - Collect logs from the /var/log/messages directory of a host
    - Collect host information by running the df -h command
  - Other Tenants Exceptions
    - Collect host logs containing the dump tenant info keyword
    - Collect host logs containing the dump tenant disk usage keyword
    - Collect host logs containing the dump tenant info keyword
- Error codes reference
  - OCP error codes
  - OBServer error codes
    - MySQL mode
      - Overview
      - 0001 to 3999
      - 4000 to 4499
      - 4500 to 4999
      - 5000 to 5999
      - 6000 to 6999
      - 7000 to 7999
      - 8000 to 8999
      - 9000 to 9499
      - 9500 to 9999
      - 10000 to 12000
      - 22998, 30926, 38104, and 38105
    - Oracle mode
      - Overview
      - ORA-00000 to ORA-00999
      - ORA-01000 to ORA-01499
      - ORA-01500 to ORA-01999
      - ORA-02000 to ORA-04999
      - ORA-05000 to ORA-10000
      - ORA-10000 to ORA-19999
      - ORA-20000 to ORA-29999
      - ORA-30000 to ORA-49999
      - ORA-50000 to ORA-99999
      - PLS-00000 to PLS-00999
FAQ
- FAQ about deployment
- FAQ on upgrade
- FAQ on O&M
  - FAQ on accounts and passwords
  - FAQ on host management
  - FAQ on OceanBase cluster
  - FAQ on OceanBase tenants
  - FAQ on OBProxy
  - FAQ on the OCP software package
  - FAQ on backup and recovery
- FAQ on monitoring
  - FAQ on monitoring metrics
  - FAQ on SQL monitoring
  - FAQ about resource usage
  - Use OCP-Agent to pull time-series monitoring data
- FAQ on alerts
- FAQ on the OCP system
- OCP multi-cluster FAQ
Appendix
- OCP background tasks
- Tables managed by the daemon
- Component listening port list
- Processes
- Install and configure OCI
- AWS S3 protocol
Release Notes
- V4.3
  - OCP V4.3.3
  - OCP V4.3.2
  - OCP V4.3.1
  - OCP V4.3.0

ob_host_mem_percent_over_threshold

Last Updated: 2025-01-10 06:15:54

Description

This alert is triggered when the total physical memory usage of the OBServer exceeds the threshold.

Principle

The following table describes the key parameters that are involved in the monitoring and alerting logic.

Parameter	Value
Metric	ob_host_mem_percent
Source	The data is collected by using the node_exporter process.
Collected metrics	node_memory_MemFree_bytes, node_memory_Cached_bytes, node_memory_Buffers_bytes, and node_memory_MemTotal_bytes
Metric expression	(1 - (avg(node_memory_MemFree_bytes{@LABELS}) by (@GBLABELS) + avg(node_memory_Cached_bytes{@LABELS}) by (@GBLABELS) + avg(node_memory_Buffers_bytes{@LABELS}) by (@GBLABELS)) / avg(node_memory_MemTotal_bytes{@LABELS}) by (@GBLABELS)) * 100
Collection cycle	1 second

Note

The metric source of this alert is special. OCP-Agent collects monitoring data by using the node_exporter process.

The value of the metric ob_host_mem_percent indicates the memory usage of the OBServer. When this value exceeds the threshold, this alert is triggered. The default threshold is 90%.
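The metric expression above can be reproduced by hand on a single host. The following is a minimal sketch, not OCP's implementation: it assumes that the four node_exporter series correspond to the `MemFree`, `Cached`, `Buffers`, and `MemTotal` fields of `/proc/meminfo` converted to bytes, and that with one host the `avg(...)` aggregation reduces to the raw value. The function names and the sample numbers are hypothetical.

```python
def host_mem_percent(mem_free, cached, buffers, mem_total):
    """Apply the metric expression: (1 - (free + cached + buffers) / total) * 100."""
    if mem_total <= 0:
        raise ValueError("mem_total must be positive")
    return (1 - (mem_free + cached + buffers) / mem_total) * 100


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in bytes."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if not fields:
            continue
        amount = int(fields[0])
        # Fields reported with a kB unit are converted to bytes.
        if len(fields) > 1 and fields[1] == "kB":
            amount *= 1024
        values[key.strip()] = amount
    return values


# Hypothetical sample in the /proc/meminfo format.
sample = """MemTotal:       16000000 kB
MemFree:          500000 kB
Buffers:          100000 kB
Cached:           840000 kB
"""

info = parse_meminfo(sample)
percent = host_mem_percent(
    info["MemFree"], info["Cached"], info["Buffers"], info["MemTotal"]
)
print(f"ob_host_mem_percent is about {percent:.1f}%")
```

On a real host, the same functions could be fed the content of `/proc/meminfo` to cross-check the value that OCP displays.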

Alert rule

Metric	Default threshold (unit: %)	Duration	Detection cycle	Time before clearance
ob_host_mem_percent	90	0 seconds	60 seconds	5 minutes
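To make the rule columns concrete, the following sketch simulates how such a rule could be evaluated. It rests on assumptions that this topic does not confirm: that the metric is checked once per detection cycle, that a duration of 0 seconds means the alert fires on the first sample above the threshold, and that "time before clearance" means the alert clears after the value has stayed at or below the threshold for that long. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    threshold: float = 90.0       # percent
    duration_s: int = 0           # time the value must stay above the threshold
    detection_cycle_s: int = 60   # interval between checks
    clearance_s: int = 300        # assumed: quiet time needed before clearing


def evaluate(rule, samples):
    """Return (timestamp, firing) for each (timestamp, value) sample."""
    firing = False
    above_since = None
    below_since = None
    states = []
    for ts, value in samples:
        if value > rule.threshold:
            below_since = None
            if above_since is None:
                above_since = ts
            if ts - above_since >= rule.duration_s:
                firing = True
        else:
            above_since = None
            if firing:
                if below_since is None:
                    below_since = ts
                if ts - below_since >= rule.clearance_s:
                    firing = False
                    below_since = None
        states.append((ts, firing))
    return states


rule = AlertRule()
values = [85, 91, 89, 89, 89, 89, 89, 89]
# One sample per detection cycle, starting at t = 0 seconds.
samples = [(i * rule.detection_cycle_s, v) for i, v in enumerate(values)]
states = evaluate(rule, samples)
flags = [firing for _, firing in states]
print(flags)
```

Under these assumptions, a single sample of 91% raises the alert immediately, and the alert clears only at the sample taken 300 seconds after the value first dropped back to 89%.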

Alert information

Trigger method	Alert level	Scope
Metric expression	Critical	Server

Alert templates

Overview: ${alarm_target} ${alarm_name}
Details: Cluster: ${ob_cluster_name}, Host: ${host}, Alert: ${alarm_name}. The memory usage is ${value}%, exceeding the threshold of ${alarm_threshold}%.
Overview example: ob_cluster=C1-1000:svr_ip=xxx.xxx.xxx.xxx. The memory usage of the OBServer exceeds the threshold.
Details example: Cluster: obcluster-1, Host: xxx.xxx.xxx.xxx, Alert: The memory usage of the OBServer exceeds the threshold. The memory usage is 91%, exceeding the threshold of 90%.
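The `${...}` placeholders in the templates can be illustrated with Python's `string.Template`, which happens to use the same syntax. This is only a sketch of the substitution. How OCP itself renders templates is not described here, and the field values below are taken from the example in this topic.

```python
from string import Template

# Details template copied from the alert templates above.
DETAILS = Template(
    "Cluster: ${ob_cluster_name}, Host: ${host}, Alert: ${alarm_name}. "
    "The memory usage is ${value}%, exceeding the threshold of ${alarm_threshold}%."
)

fields = {
    "ob_cluster_name": "obcluster-1",
    "host": "xxx.xxx.xxx.xxx",
    "alarm_name": "The memory usage of the OBServer exceeds the threshold",
    "value": 91,
    "alarm_threshold": 90,
}

# substitute() raises KeyError if a placeholder has no value,
# which helps catch a misspelled variable name in a custom template.
details = DETAILS.substitute(fields)
print(details)
```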

Impact on the system

The OBServer cannot work properly if the server memory is insufficient.

Possible causes

A non-observer process is executing a memory-intensive task.
An OBServer malfunctions. For example, the memory usage of the sys tenant (ID = 500) of the OceanBase cluster exceeds the threshold.

Suggested solutions

Confirm the OBServer status.

In the OBServers list on the Overview page of the cluster, view the status of the server. Alternatively, you can connect to the sys tenant of the OceanBase cluster and execute the following SQL statement:
```
select status from oceanbase.__all_server where svr_ip='your server ip';
```
- If the status is Inactive, restart the observer process.
- If the status is Running, go to the next step.
Verify whether the excessive memory usage is caused by daily business operations.

On the Host Overview page of the OCP console, find the OBServer that triggered the alert and go to its details page. Then, choose Monitoring > Host Resources to view the memory curve chart.
- If the memory usage rises sharply at the alert time, the increase is abnormal. Go to the next step for troubleshooting.
- If the memory usage rises gradually up to the alert time, the increase reflects routine usage by the applications.
  
  You can replace the OBServer with one that has a larger memory size. For more information, see Replace an OBServer.
Run the following command to check for processes that use the most memory resources:
```
# Find out the top five processes that use the most memory resources and sort them by memory usage in descending order.
ps -o %mem,pid,cmd  -ax | sort -rn | head -5
```
- If a non-observer process uses too many memory resources, analyze the corresponding application to find the cause of the high memory usage.
  
  Alternatively, you can stop the processes that use the most memory resources.
- If an observer process uses too many memory resources, the OBServer may have encountered an error.
  
  In this case, the ob_tenant500_mem_hold_over_threshold alert is usually triggered at the same time. We recommend that you resolve that alert first. Then, check whether the ob_host_mem_percent_over_threshold alert recurs 5 minutes later.
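The last check, deciding whether the top memory consumer is an observer process, can be sketched as a small parser for the `ps` output. This is an illustrative helper, not part of OCP. It assumes that the observer executable is named `observer`, and the sample output, paths, and process names are hypothetical.

```python
import os


def top_memory_processes(ps_output, limit=5):
    """Parse `ps -o %mem,pid,cmd -ax` output and return the top rows by %MEM."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            mem = float(parts[0])
            pid = int(parts[1])
        except ValueError:
            continue  # skip the header line
        rows.append((mem, pid, parts[2]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


def is_observer(cmd):
    """Assumption: the observer executable's base name is `observer`."""
    executable = cmd.split()[0]
    return os.path.basename(executable) == "observer"


# Hypothetical sample output.
sample = """%MEM   PID CMD
 1.2  1033 /usr/sbin/sshd -D
62.5  2290 /home/admin/oceanbase/bin/observer
20.1  4711 java -jar report-job.jar
"""

top = top_memory_processes(sample)
for mem, pid, cmd in top:
    kind = "observer" if is_observer(cmd) else "non-observer"
    print(f"{mem:5.1f}% pid={pid} {kind}: {cmd}")
```

If the first row is an observer process, the troubleshooting path for ob_tenant500_mem_hold_over_threshold applies. Otherwise, the application that owns the process is the place to investigate.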
