FAQ about OceanBase clusters

2025-09-08 08:15:43  Updated

This topic describes common issues related to the management of OceanBase clusters.

Feature implementation

Q1: Can OCP in the x86 environment manage an OceanBase cluster in the ARM architecture?

A: OCP V2.5.2 and later support this.

Q2: Does OCP support deploying a primary x86 OceanBase cluster with a standby ARM OceanBase cluster?

A: OCP V3.1.0 and later support this.

OceanBase cluster management

Q1: Where can I view the latency between the primary and standby clusters?

A: Go to the cluster overview page and switch to the cluster topology view, which shows the latency between the primary and standby clusters.

Q2: What do I do if the following error is reported: machine resource is not enough to hold a new unit?

A: The error indicates that the machine resources are insufficient, which prevents the execution of replica migration-related maintenance tasks. You can add hosts to resolve the issue.

Q3: What is the log cleanup strategy of OCP-Agent for observer.log?

A: When the space utilization reaches 80%, only the most recent 100 log records are retained.
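
The retention rule above can be sketched as follows. This is only an illustration of the policy as stated, not the OCP-Agent implementation. The function name and the (file name, modification time) input are hypothetical, and the sketch assumes that "log records" refers to rotated log files.

```python
from typing import List, Tuple

USAGE_THRESHOLD = 0.80  # cleanup starts when space utilization reaches 80%
KEEP_RECENT = 100       # number of most recent log records that are retained


def logs_to_delete(logs: List[Tuple[str, float]], used_ratio: float) -> List[str]:
    """Return the names of the log files that a cleanup pass would remove.

    `logs` is a hypothetical list of (file_name, modification_time) pairs.
    """
    if used_ratio < USAGE_THRESHOLD:
        return []  # below the threshold, nothing is cleaned up
    newest_first = sorted(logs, key=lambda item: item[1], reverse=True)
    # Everything after the newest KEEP_RECENT entries is a deletion candidate.
    return [name for name, _ in newest_first[KEEP_RECENT:]]
```

With 120 log files and 85% utilization, the sketch selects the 20 oldest files. Below 80% utilization it selects none.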

Q4: How is the topology status of a cluster obtained?

A: The status is obtained based on the records in the ob_server table of MetaDB.

Q5: How is the total number of sessions in OCP calculated?

A: The total number of sessions is queried from the internal table __all_virtual_processlist of OceanBase Database.

Q6: What is an inactive session and how is it managed?

A: An inactive session is a session that has been established but has had no operations for a period of time. By default, a session is marked as inactive after 28800 seconds without operations. OceanBase Database does not actively release the connection. The connection is released when the client times out.

Q7: Do inactive sessions occupy the maximum number of connections of a unit?

A: The maximum number of connections of a unit does not affect the allocation of real connections. What really matters is the upper limit of connections of OBServer nodes. OCP uses the number of connections of a unit to trigger alerts.

Q8: Which ports need to be opened if OCP and OceanBase Database are not in the same subnet?

A: For more information, see Component listening port list.

Q9: How do I quickly delete a cluster that has failed tasks?

A: Change the cluster status from Operating to NORMAL in the ob_cluster table of MetaDB.

Q10: What does the inspection result description Number of zones: unknown > 100000, serious indicate when OceanBase cluster health inspection is performed?

A: The value unknown indicates that the partition count could not be obtained. To locate the issue, perform the following steps:

  1. Check whether the Tenant Management page of the Capacity module displays related data in the Partition Count chart. If not, proceed to the next step.

  2. Check whether the Processes list on the corresponding host shows that the processes are running normally. If the obstat and logtailer processes have stopped unexpectedly, proceed to the next step.

  3. The issue may be due to domain name resolution failure, which prevents the host from accessing the database, thereby causing obstat and logtailer startup failure.

We recommend that you configure the IP address of the domain name server in the /etc/resolv.conf file on the host that reported the error and then check whether the issue persists.
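
To confirm that a domain name server is configured, you can list the nameserver entries in the resolver configuration file (/etc/resolv.conf on most Linux hosts). The following is a minimal sketch. The function name is hypothetical, and the function takes the file content as text so that it can be tried without access to the host.

```python
from typing import List


def nameservers(resolv_conf_text: str) -> List[str]:
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers
```

On the affected host, pass the content of the file to the function, for example nameservers(open("/etc/resolv.conf").read()). An empty result means that no domain name server is configured.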

Q11: Why can't I delete a cluster in OCP?

A: This can happen when the hosts of the OceanBase cluster were deleted in the GUI, which took the hosts offline and left the cluster information in MetaDB inconsistent with that in OceanBase Database. If some OBServer nodes in the OceanBase cluster are offline, we recommend that you do not forcibly delete the cluster. Instead, contact Technical Support to manually correct the related data in MetaDB.

OceanBase cluster deployment

Q1: What do I do if an error is returned when I create a cluster, indicating that the argument is illegal?

A: Check whether the RPM package name is duplicated.

Q2: What is the cause of the error returned when I install OceanBase Database and click OK on the confirmation dialog box?

A: Check whether the OceanBase Database package name has been modified.

Q3: The primary cluster is an x86 cluster. I want to add an AArch64 cluster as the standby cluster. The standby cluster uses the x86 package. Is this feasible?

A: It is not recommended to deploy heterogeneous primary and standby clusters.

Q4: What is the cause of the error returned when I create a cluster in OCP, indicating that the zone priority is invalid?

A: The region corresponding to the zone with the highest priority must contain at least two zones.
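
The rule can be checked before you submit the cluster creation request. The following is a minimal sketch of the rule as stated above, not the validation code of OCP. The function name and both input structures are hypothetical.

```python
from typing import Dict, List


def primary_region_is_valid(zone_region: Dict[str, str], priority: List[str]) -> bool:
    """Check that the region of the highest-priority zone has at least two zones.

    zone_region maps a zone name to its region name. priority lists the zones
    from the highest to the lowest priority. Both structures are hypothetical.
    """
    if not priority:
        return False
    top_region = zone_region[priority[0]]
    zones_in_region = [zone for zone, region in zone_region.items() if region == top_region]
    return len(zones_in_region) >= 2
```

For example, if zone1 and zone2 belong to one region and zone3 belongs to another, a priority list that starts with zone1 passes the check, and a priority list that starts with zone3 fails it.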

Q5: When I create a standby database in OCP, is primary-standby data synchronization complete when the task to create the standby database ends?

A: The time when the task to create a standby database ends is not directly related to the time when the standby database starts to synchronize data. However, a standby database can provide services only after the task to create the standby database ends.

Q6: When I create a primary and standby cluster in OCP V2.5.x, I set the tenant and table primary_zone to RANDOM. However, an error is returned during the task execution, indicating that Add cluster not allowed and CHECK PRIMARY ZONE OR LOCALITY CONFIG. What is the cause?

A: User tables are configured with primary_zone or locality. Perform the following operations to modify the configurations:

  1. View the tables configured with primary_zone or locality in the OceanBase database of the sys tenant.

    select table_id,tenant_id,primary_zone,locality,table_name from __all_virtual_table;
    

    In the following steps, the table name t and the tenant ID 1001 are used as examples.

  2. In the database of the corresponding tenant, execute alter table t set primary_zone = default; to cancel the primary zone setting at the table level.

  3. In the database of the corresponding tenant, execute the following statements to cancel the locality setting at the table level.

    select locality from __all_tenant where tenant_id = 1001;     #Record the query result as F@zone1,F@zone2,F@zone3.
    alter table t set locality = 'F@zone1,F@zone2,F@zone3';
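
When many tables are affected, the statements in steps 2 and 3 can be generated from the result of the query in step 1. The following is a minimal sketch, not an OCP tool. The function name and the row layout are hypothetical, and the sketch assumes that an empty primary_zone or locality value means that the setting is not configured at the table level.

```python
from typing import Iterable, List, Optional, Tuple

Row = Tuple[str, Optional[str], Optional[str]]  # (table_name, primary_zone, locality)


def reset_statements(rows: Iterable[Row], tenant_locality: str) -> List[str]:
    """Generate the ALTER TABLE statements described in steps 2 and 3.

    rows holds values taken from the query in step 1. tenant_locality is the
    locality recorded from __all_tenant for the tenant that owns the tables.
    """
    statements = []
    for table_name, primary_zone, locality in rows:
        if primary_zone:  # assumption: empty or None means "not set"
            statements.append(f"alter table {table_name} set primary_zone = default;")
        if locality:      # assumption: empty or None means "not set"
            statements.append(f"alter table {table_name} set locality = '{tenant_locality}';")
    return statements
```

Review the generated statements before you execute them in the database of the corresponding tenant.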
    

Q7: In the same private cloud network environment, there is already an OCP that has created OceanBase Database. I want to deploy another OCP and create OceanBase V2.2.7x on it. In addition to modifying the default value of cluster_id to 100000 in /root/t-oceanbase-antman/obcluster.conf during OCP deployment, do I need to make any other modifications when creating a business cluster?

A: When you use OCP to create an OceanBase cluster, make sure that the clusterid and clustername are unique.

OceanBase cluster takeover

Q1: When I import a cluster, the host preparation task reports python dependency does not exist. How do I solve this issue?

A: If the host was not installed by using the installation template, install the Python dependencies manually. We recommend that you use the installation template.

Q2: When I use OCP V2.5.x to take over a cluster, an error is reported: error code:11042 Pre-check for cluster takeover failed. The failure reason is that the OBServer node check failed. How do I solve this issue?

A: We recommend that you call the pre-check interface to view the specific reasons why the OBServer node check failed. The method for calling the interface is as follows:

# Replace the xxx placeholders before you run the command.
# rootSysPassword: Required. The root@sys password.
# address:         Required. The cluster connection address.
# port:            Required. The cluster connection port.
# connectionMode:  Optional. The connection mode. Valid values: direct | proxy. Default value: direct.
# clusterName:     Optional. Required when the connection mode is proxy.
# obClusterId:     Optional. Required when the connection mode is proxy and the cluster to be taken over is a standby cluster.
curl -X POST --user admin:xxx \
  -H "Content-Type:application/json" \
  -d '{"rootSysPassword":"xxxxx",
       "address":"xxxxx",
       "port":2888,
       "connectionMode":"proxy",
       "clusterName":"xxx",
       "obClusterId":xxx
      }' \
  "http://example.com:8080/api/v2/ob/clusters/takeOverPreCheck"

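Because the request body must be valid JSON, it can be convenient to build it programmatically instead of editing the string by hand. The following is a minimal sketch that uses only the field names and rules listed for the request above. The function name and its parameter names are hypothetical, and the sketch does not send the request.

```python
import json
from typing import Optional


def takeover_precheck_payload(root_sys_password: str, address: str, port: int,
                              connection_mode: str = "direct",
                              cluster_name: Optional[str] = None,
                              ob_cluster_id: Optional[int] = None) -> str:
    """Build the JSON body for the takeOverPreCheck request."""
    if connection_mode not in ("direct", "proxy"):
        raise ValueError("connectionMode must be 'direct' or 'proxy'")
    if connection_mode == "proxy" and not cluster_name:
        raise ValueError("clusterName is required when connectionMode is 'proxy'")
    body = {
        "rootSysPassword": root_sys_password,
        "address": address,
        "port": port,
        "connectionMode": connection_mode,
    }
    # Optional fields are included only when a value is supplied.
    if cluster_name is not None:
        body["clusterName"] = cluster_name
    if ob_cluster_id is not None:
        body["obClusterId"] = ob_cluster_id
    return json.dumps(body)
```

The returned string can be passed to the -d option of curl. The sketch cannot tell whether the cluster is a standby cluster, so supplying obClusterId in that case remains the responsibility of the caller.
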
Q4: What are the considerations when other OCPs take over a cluster deployed by using OCP?

A: When other OCPs take over a cluster deployed by using OCP, note the following points:

  • Confirm the access method to the OceanBase cluster to be taken over and the impact on the business.

  • Migrate the cluster to be taken over from other OCPs or stop other OCPs.

  • If you are using a single OCP, take over the OceanBase primary cluster first, and then take over the OceanBase standby cluster.

  • If you are using multiple OCPs, take over the OceanBase primary cluster in one OCP, and then take over the OceanBase standby cluster in any OCP.

OceanBase cluster upgrade

Q1: How do I upgrade the primary and standby clusters?

A: On the overview page of the primary and standby clusters, you can click Upgrade Version in the upper-right corner. You can initiate an upgrade for the primary and standby clusters from the upgrade entry of either cluster.

Q2: What do I do if a timeout error is returned during schema refresh when I upgrade an OceanBase cluster by using OCP?

A: Run the following statements to check the refreshed schema versions and the value of min_observer_version. Replace %s and %d with the IP address and port of the OBServer node:

select tenant_id, refreshed_schema_version from __all_virtual_server_schema_info where svr_ip = '%s' and svr_port = %d and refreshed_schema_version >
select max(value) value from oceanbase.__all_virtual_sys_parameter_stat where name = 'min_observer_version';

Then manually retry the CheckRefreshSchemaTask task.

Q3: What does an error returned during the execute upgrade checker step indicate when I upgrade an OceanBase cluster?

A: It may be caused by a timeout when downloading the RPM package. We recommend that you retry.

Q4: What causes the following error when I upgrade OceanBase V2.2.71 to OceanBase V2.2.74: no corresponding version node?

A: The upgrade path defined in the /home/admin/ocp-server/etc/oceanbase_upgrade_dep.yml file on the OCP server is incomplete. We recommend that you add the version information for OceanBase V2.2.74. The sample code is as follows:

#The preceding lines are omitted.
- version: 2.2.70
  can_be_upgraded_to:
      - 2.2.71
- version: 2.2.71
  can_be_upgraded_to:
      - 2.2.72
- version: 2.2.72
  can_be_upgraded_to:
      - 2.2.73
- version: 2.2.73         # The last version must be defined. Otherwise, no path is found for upgrading to the same version. Future last versions need this definition as well.
  can_be_upgraded_to:
      - 2.2.74
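
To see why a missing entry produces the error, the file can be viewed as a graph in which each can_be_upgraded_to entry is an edge. The following is a minimal sketch of a path search over that graph. It is an illustration only and not the way OCP is known to resolve upgrade paths. The function name is hypothetical, and the entries are written as a Python dict so that no YAML library is required.

```python
from collections import deque
from typing import Dict, List, Optional


def find_upgrade_path(deps: Dict[str, List[str]], start: str, target: str) -> Optional[List[str]]:
    """Breadth-first search for a chain of versions from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_version in deps.get(path[-1], []):
            if next_version not in seen:
                seen.add(next_version)
                queue.append(path + [next_version])
    return None  # no chain of can_be_upgraded_to entries reaches the target


# The same entries as in the sample file, written as a dict.
deps = {
    "2.2.70": ["2.2.71"],
    "2.2.71": ["2.2.72"],
    "2.2.72": ["2.2.73"],
    "2.2.73": ["2.2.74"],
}
```

With these entries, a path from 2.2.71 to 2.2.74 exists. If the 2.2.73 entry is removed, the search returns None, which corresponds to the situation in which no upgrade path can be found.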

Q5: What does the message A binary replacement is required indicate on the interface for confirming the upgrade path of an OceanBase cluster?

A: A binary replacement means that the binary files on the OBServer nodes need to be replaced. For V2.2.72 and V2.2.73, OBServer nodes only need to run the pre- and post-upgrade scripts and do not need a binary replacement. This message is for your reference only and requires no manual operation. The upgrade task completes the replacement automatically.

Q6: What do I do when an error [alter system stop zone ?]; SQL state [25000]; error code [4012]; Statement is timeout is returned during the upgrade of an OceanBase cluster?

A: We recommend that you retry.

Q7: What is the cause of the failure in upgrading an OceanBase cluster from V2.2.30 to V2.2.50?

A: The RPM package of OceanBase V2.2.40, which is required for the upgrade, has not been uploaded. We recommend that you upload the RPM package and then retry.

Q8: If a customer wants to upgrade the minor version and then the major version of an OceanBase cluster, does OCP support upgrading the cluster to the same version?

A: Yes, but the minor version numbers must be different.

Q9: What do I do when an error Cannot allocate memory is returned on one of the servers during the Install rpm step when I upgrade an OceanBase cluster by using OCP?

A: The memory is insufficient. We recommend that you either clear the memory or expand the memory based on the actual situation.

Q10: What causes the following error when upgrade_checker.py is executed during the upgrade of an OceanBase cluster: MyError: "(1, Decimal('1.0000')) check_logonly_replica_unit failed, not found log_type resource_pool for this tenant"?

A: Check whether log-only replicas exist in the OceanBase cluster to be upgraded. If they do, we recommend that you replace the log-only replicas with F replicas.

Q11: What do I do if an OceanBase cluster has tenants other than sys, ocp-meta, and ocp-monitor, such as omsmeta and odpmeta, and needs to be upgraded?

A: Contact OceanBase Technical Support to obtain the upgrade manual and proceed with the upgrade. Upgrading one cluster will not affect the tenants of other applications.

OceanBase cluster monitoring

Q1: When a cluster is created in OCP, will the ocp_monitor user be automatically created, and what is the default password for this user?

A: The ocp_monitor user will be automatically created. However, the default password is only used by the system when this user is called upon during monitoring. You do not need to be aware of this password.

Q2: On the performance monitoring page of a cluster, does the transaction log time refer to the average time of SQL or the average time of a transaction?

A: The average time of a transaction.

Cluster inspection

Q1: By default, the /data/1 data disk of OceanBase can occupy up to 90% of the host's disk space, while the host disk usage threshold for cluster health inspection is set to 85%. Is this setting reasonable?

A: During cluster health inspection, the inspection of OceanBase data disks will be ignored.
