FAQ on OceanBase clusters |V3.3.1|OceanBase Cloud Platform| docs|Distributed Database

FAQ on OceanBase clusters

Last updated: 2023-08-15 11:20:59

This topic describes frequently asked questions about OceanBase cluster management.

Feature implementation

Q1: Can I use an X86 OCP system to manage an OceanBase cluster in the ARM architecture?

A: Yes, you can. We recommend that you use OCP V2.5.2 or later.

Q2: Which OCP version can I use to deploy a V1.4.79 OceanBase cluster?

A: You can deploy the cluster by using any OCP version. We recommend that you use the latest OCP version.

Q3: Does OCP support a deployment plan that uses the X86 architecture for the primary OceanBase cluster and the ARM architecture for the standby OceanBase cluster?

A: This deployment plan is supported in OCP V3.1.0.

Manage OceanBase clusters

Q1: Where can I query the delay between the primary cluster and the standby cluster?

A: On the Cluster Overview page, switch to the topology of the cluster, and then view the delay between the primary and standby clusters.

Q2: What can I do when I receive the following system error? Error message: machine resource is not enough to hold a new unit.

A: If the server resources are insufficient, replica migration tasks cannot proceed. You can add hosts to solve this issue.

Q3: What is the strategy used by OCP-Agent in automatically clearing the observer.log?

A: When the memory usage reaches 80%, the latest 100 log records are retained and the earlier ones are automatically cleared.
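
The retention rule described above can be sketched as follows. This is an illustration only, not OCP-Agent code: the function name `records_to_clear`, its parameters, and the assumption that records are ordered from oldest to newest are all hypothetical.

```python
def records_to_clear(records, usage_percent, threshold=80, keep=100):
    """Return the log records that would be cleared under the stated rule.

    Assumes `records` is ordered from oldest to newest (hypothetical layout).
    """
    if usage_percent < threshold:
        return []  # Usage is below the threshold, so nothing is cleared.
    if len(records) <= keep:
        return []  # Fewer records than the retention count, so all are kept.
    # Everything except the newest `keep` records is cleared.
    return records[:len(records) - keep]
```

With 150 records and a usage of 80%, the sketch returns the 50 oldest records and keeps the latest 100.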

Q4: How can I obtain the topology status of a cluster?

A: You can obtain the topology status of a cluster based on the records in the ob_server table of MetaDB.

Q5: Where can I get the total number of sessions of OCP?

A: You can query the __all_virtual_processlist table.

Q6: What is an inactive session? How are inactive sessions managed?

A: If a session remains idle for a specified period after it is created, the session becomes inactive. By default, a session is marked as inactive after it remains idle for 28,800 seconds. Idle sessions are released by the client upon timeout. OceanBase Database does not automatically release these sessions.

Q7: Does an inactive session occupy the number of connections supported by a unit?

A: The actual distribution of connections is not affected by the maximum unit connections. Instead, it is affected by the maximum OBServer connections. Unit connections are used as an alert condition in OCP.

Q8: Which ports do I need to enable if OCP is deployed in a CIDR block that is different from that of OceanBase Database?

A: For more information, see Component listening port list.

Q9: How can I delete a cluster with failed tasks?

A: Change the status of the cluster from Operating to NORMAL in the ob_cluster table of MetaDB. Then, you can delete the cluster.

Q10: In the health inspection result description of the OceanBase cluster, the following message appears: Replicas: unknown > 100000, Critical. What does it mean?

A: unknown indicates that the system failed to get the number of partitions. We recommend that you perform the following steps to identify the problem:

Check whether related data is displayed in the Capacity_Number of Partitions diagram on the Tenant Management page of the OceanBase cluster. In this case, no data is displayed.
Check whether the processes in the Processes list of the corresponding host are normal. In this case, the obstat and logtailer processes have stopped abnormally.
Check whether DNS resolution is incorrect, which prevents the host from accessing the database and the obstat and logtailer processes from starting.

We recommend that you configure the DNS address in the /etc/resolv.conf file of the host that returns the error, and then check whether the problem persists.
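
As a quick way to see which DNS servers a host is configured to use, the following minimal sketch extracts the `nameserver` entries from resolv.conf-style text. The helper name `nameservers` is hypothetical and is not part of OCP. The sketch only reads the configuration and does not test whether the servers actually resolve names.

```python
def nameservers(resolv_conf_text):
    """Return the DNS server addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (which start with '#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers
```

On the affected host, you could pass the content of /etc/resolv.conf to this function. An empty result suggests that no DNS address is configured.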

Q11: Why am I unable to delete a cluster or quit the deletion task in OCP?

A: If you delete a host in the OBServer cluster by using a CLI tool, the host goes offline abnormally and the cluster information in MetaDB becomes inconsistent with that of the OceanBase cluster. We recommend that you do not forcibly delete an OceanBase cluster if any OBServer in the cluster is offline. Instead, contact a technical engineer to manually modify the related data in MetaDB.

Q12: Can the node-exporter process on the OceanBase server automatically recover after it is stopped? If not, how can I manually recover the process?

A: The node-exporter process is automatically restarted as long as the ocp_agentd process is running. For more information, see OCP-Agent processes.

Deploy OceanBase clusters

Q1: An error occurred when I was creating a cluster. Error message: Illegal argument exception. What can I do?

A: Check whether the name of the RPM package is duplicated.

Q2: An error occurred at the Confirm Information step when I was installing OceanBase Database. Error message: An unknown error has occurred. Cause: {0}. Error message: {1}. Contact the administrator. What is the cause of this problem?

A: Check whether the name of the OceanBase software package has been modified.

Q3: The architecture of my primary cluster is X86. I plan to add AArch64 hosts into the standby cluster. Can I use the X86 software package to deploy the standby cluster?

A: This deployment plan is risky. We recommend that you use servers of the same architecture for both the primary and standby clusters.

Q4: Why does the system report invalid zone priority when I use OCP to create a cluster?

A: The region of the top priority zone must have at least two zones.
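
The rule can be checked with a few lines of code before you submit the cluster creation task. The sketch below is illustrative and is not OCP's validation logic: the function name `top_priority_region_ok` and the zone-to-region dictionary are hypothetical.

```python
def top_priority_region_ok(zone_regions, top_priority_zone):
    """Check that the region of the top priority zone has at least two zones.

    `zone_regions` maps each zone name to its region name (hypothetical layout).
    """
    region = zone_regions[top_priority_zone]
    zones_in_region = [z for z, r in zone_regions.items() if r == region]
    return len(zones_in_region) >= 2


zones = {"zone1": "hangzhou", "zone2": "hangzhou", "zone3": "shanghai"}
print(top_priority_region_ok(zones, "zone1"))  # True: hangzhou has two zones
print(top_priority_region_ok(zones, "zone3"))  # False: shanghai has one zone
```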

Q5: When I create a standby cluster in OCP, is it true that the primary/standby data synchronization task is implemented and ends together with the cluster creation task?

A: The standby cluster creation task has nothing to do with the primary/standby synchronization task. The standby database provides services to applications only after it is created.

Q6: I used OCP V2.5.x to create primary and standby clusters, and I set the primary_zone of both the tenant and table to RANDOM. Why does the system report Add cluster not allowed. CHECK PRIMARY ZONE OR LOCALITY CONFIG when I run a task?

A: The primary_zone or locality is configured in a table. Perform the following steps to solve the problem:

Check for the table in which primary_zone or locality is configured in the OceanBase database of the system tenant.
```
select table_id,tenant_id,primary_zone,locality,table_name from __all_virtual_table;
```
In this example, the table in which primary_zone or locality is configured is named t, and the tenant ID is 1001.
Execute the alter table t set primary_zone = default; statement in the database of the corresponding tenant to cancel the primary_zone setting of the table.

Execute the following statements in the database of the corresponding tenant to cancel the locality setting of the table by setting it to the locality of the tenant.

```
select locality from __all_tenant where tenant_id = 1001;     # The query result is F@zone1,F@zone2,F@zone3.
alter table t set locality = 'F@zone1,F@zone2,F@zone3';
```

Q7: What can I do if the system reports the following error when I install OceanBase Database 3.1? Error message: arm Kylin OS v10. Default page_size: 65536: Unsupported system page size: Unsupported system page size.

A: For more information about how to solve this problem, see Modify the page size.

Q8: I have an OCP system that manages an OceanBase cluster in a private cloud. I want to deploy a new OCP system in the same environment and create an OceanBase Database V2.2.7x cluster in it. I changed the default value of cluster_id to 100000 in /root/t-oceanbase-antman/obcluster.conf when I deployed OCP. What else do I need to modify when I create the OceanBase cluster?

A: When you use OCP to create an OceanBase cluster, you must ensure that the values of the clusterid and clustername parameters are unique.

Take over OceanBase clusters

Q1: An error occurred on the host when I was adding a server to the cluster. Error message: Python dependency does not exist. What can I do?

A: The host was not installed by using the installation template. Manually install the missing Python dependency. To avoid this problem, we recommend that you install the host by using the installation template.

Q2: An error occurred when I was using OCP V2.5.x to manage an OceanBase cluster. Error code: 11042. Precheck of the cluster to be taken over failed. Cause: OBServer check failed. What can I do?

A: We recommend that you call the precheck API to find the cause of the OBServer check failure. Run the following command to call the API:

```
# Request body fields:
#   rootSysPassword: Required. The password of the root@sys account.
#   address:         Required. The address used to connect to the cluster.
#   port:            Required. The port used to connect to the cluster.
#   connectionMode:  Optional. The connection mode. Valid values: direct and proxy. Default value: direct.
#   clusterName:     Optional in most cases. If the connection mode is set to proxy, it is required.
#   obClusterId:     Optional in most cases. When the connection mode is set to proxy and the cluster to be managed is a standby cluster, it is required.
curl -X POST --user admin:xxx \
  -H "Content-Type:application/json" \
  -d '{"rootSysPassword":"xxxxx","address":"xxxxx","port":2888,"connectionMode":"proxy","clusterName":"xxx","obClusterId":xxx}' \
  "http://192.168.0.1:xxxx/api/v2/ob/clusters/takeOverPreCheck"
```

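Because the required fields of the precheck request depend on the connection mode, it can help to validate the request body before sending it. The following minimal sketch builds the JSON body from the fields described above. The function name `build_precheck_body` and the `is_standby` flag are hypothetical: `is_standby` is a local helper argument, not an API field.

```python
import json


def build_precheck_body(root_sys_password, address, port,
                        connection_mode="direct", cluster_name=None,
                        ob_cluster_id=None, is_standby=False):
    """Build the JSON body for takeOverPreCheck, enforcing the documented rules."""
    if connection_mode not in ("direct", "proxy"):
        raise ValueError("connectionMode must be 'direct' or 'proxy'")
    if connection_mode == "proxy" and not cluster_name:
        raise ValueError("clusterName is required in proxy mode")
    if connection_mode == "proxy" and is_standby and ob_cluster_id is None:
        raise ValueError("obClusterId is required for a standby cluster in proxy mode")

    body = {
        "rootSysPassword": root_sys_password,
        "address": address,
        "port": port,
        "connectionMode": connection_mode,
    }
    # Optional fields are included only when they are provided.
    if cluster_name:
        body["clusterName"] = cluster_name
    if ob_cluster_id is not None:
        body["obClusterId"] = ob_cluster_id
    return json.dumps(body)
```

The returned string can be passed to the -d option of curl. The sketch does not send the request itself.
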
Upgrade OceanBase clusters

Q1: How can I upgrade the primary and standby clusters?

A: You can click Upgrade Version in the upper-right corner of the Overview page of the primary and standby clusters to upgrade them. The upgrade applies to both the primary and standby clusters, regardless of whether you click Upgrade Version on the Overview page of the primary or standby cluster.

Q2: The connection timed out when I was refreshing schema info to upgrade my OceanBase cluster in OCP. What can I do?

A: Run the following statements to check the refreshed schema version and the minimum OBServer version. Replace %s and %d with the IP address and port of the OBServer:

```
select tenant_id, refreshed_schema_version from __all_virtual_server_schema_info where svr_ip = '%s' and svr_port = %d and refreshed_schema_version >
select max(value) value from oceanbase.__all_virtual_sys_parameter_stat where name = 'min_observer_version'
```

If the schema and version are correct, manually retry the CheckRefreshSchemaTask.

Q3: Why does the system report execute upgrade checker when I upgrade the OceanBase cluster?

A: The RPM package download may have timed out. We recommend that you try to upgrade the cluster again.

Q4: Why does the system report No node of the corresponding version when I try to upgrade OceanBase Database V2.2.71 to OceanBase Database V2.2.74?

A: The upgrade path defined in the /home/admin/ocp-server/etc/oceanbase_upgrade_dep.yml file on the OCP server is incomplete. We recommend that you add the following version information to the file so that the path to OceanBase Database V2.2.74 is complete:

```
# Previous content is omitted.
- version: 2.2.70
  can_be_upgraded_to:
      - 2.2.71
- version: 2.2.71
  can_be_upgraded_to:
      - 2.2.72
- version: 2.2.72
  can_be_upgraded_to:
      - 2.2.73
- version: 2.2.73         # The last version must be defined. Otherwise, the upgrade path is unavailable for same version upgrade.
  can_be_upgraded_to:
      - 2.2.74
```
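
To see why a missing entry breaks the upgrade, you can treat the entries as a directed graph and search for a path from the current version to the target version. The sketch below is a generic breadth-first search, not the algorithm that OCP uses, which is not described here. The function name `find_upgrade_path` is hypothetical, and the entries are written as Python dictionaries to avoid a YAML dependency.

```python
from collections import deque


def find_upgrade_path(entries, source, target):
    """Return the list of versions from source to target, or None if no path exists."""
    graph = {e["version"]: e.get("can_be_upgraded_to", []) for e in entries}
    if source not in graph:
        return None  # The source version has no node in the file.
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


entries = [
    {"version": "2.2.70", "can_be_upgraded_to": ["2.2.71"]},
    {"version": "2.2.71", "can_be_upgraded_to": ["2.2.72"]},
    {"version": "2.2.72", "can_be_upgraded_to": ["2.2.73"]},
    {"version": "2.2.73", "can_be_upgraded_to": ["2.2.74"]},
]
print(find_upgrade_path(entries, "2.2.71", "2.2.74"))
# ['2.2.71', '2.2.72', '2.2.73', '2.2.74']
```

If the entries for 2.2.72 and 2.2.73 are removed, the function returns None, which corresponds to an incomplete upgrade path.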

Q5: An error occurred when I was upgrading my OceanBase cluster. Error message: rpc call failed! What can I do?

A: Check whether the pos_proxy process on the corresponding host is normal.

Q6: What does Replace Binary Files mean on the Confirm Upgrade Path page of the OceanBase cluster?

A: It means to replace the binary file of the OBServer. In OceanBase Database V2.2.72 and V2.2.73, you only need to run the pre/post script without replacing the binary file of the OBServer. The message is informational and does not require manual operation. The binary file will be automatically replaced during an upgrade.

Q7: What can I do if the system reports [alter system stop zone ?]; SQL state [25000]; error code [4012]; Statement is timeout when I upgrade the OceanBase cluster?

A: We recommend that you try to upgrade the cluster again.

Q8: Why do I fail to upgrade an OceanBase cluster from V2.2.30 to V2.2.50?

A: You have not uploaded the V2.2.40 RPM package that is required for OceanBase upgrades. We recommend that you upload the package and try to upgrade the cluster again.
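
Before you retry, you can compare the versions on the upgrade path with the packages that you have uploaded. The following sketch is illustrative only: the function name `missing_packages` is hypothetical, and the path from V2.2.30 through V2.2.40 to V2.2.50 is assumed from the answer above.

```python
def missing_packages(upgrade_path, uploaded_versions):
    """Return the versions on the path, after the current one, that have no uploaded package."""
    uploaded = set(uploaded_versions)
    # The first element is the current version, which is already installed.
    return [v for v in upgrade_path[1:] if v not in uploaded]


print(missing_packages(["2.2.30", "2.2.40", "2.2.50"], ["2.2.50"]))
# ['2.2.40']
```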

Q9: I want to upgrade the minor version before I upgrade the major version. Does OCP support upgrading to the same version?

A: Yes, OCP supports upgrading to the same major version. However, the minor versions must be different.

Q10: I use OCP to upgrade the OceanBase cluster. Why does a server fail and report Cannot allocate memory at the Install rpm step?

A: The memory is insufficient. We recommend that you clear the content in the memory or increase the memory as needed.

Q11: What can I do if the upgrade_checker.py reports MyError: "(1, Decimal('1.0000')) check_logonly_replica_unit failed, not found log_type resource_pool for this tenant when I upgrade the OceanBase cluster?

A: Check for log replicas of the OceanBase cluster to be upgraded. If any, we recommend that you replace the log replicas with full-featured replicas.

Q12: If an OceanBase cluster includes the omsmeta and odpmeta tenants in addition to the sys, ocp-meta, and ocp-monitor tenants, how can I upgrade the cluster?

A: You can contact OceanBase technical support to obtain the corresponding upgrade document to upgrade the cluster. The operation will not affect the tenants of other applications.

Monitor OceanBase clusters

Q1: What can I do if the system reports obcluster does not exist after obstat failed to run?

A: Check the MetaDB for the number of replicas on a single OBServer. If the number of replicas on an OBServer exceeds the threshold, we recommend that you delete useless tables or scale out the OceanBase cluster to reduce the number of replicas on a single server. Then, restart OCP and wait for OCP to re-create the monitoring partition.

Q2: When I use OCP to create a cluster, can the ocp_monitor user be automatically created? If yes, what is the default password of this user?

A: Yes, the ocp_monitor user is automatically created. You do not need to know the password of this user. The system manages the password and uses it internally whenever it connects as the ocp_monitor user.

Q3: Does transaction log time on the Performance Monitoring page mean the average time consumed by SQL statements or the average time consumed by transactions?

A: It means the average time consumed by transactions.

Cluster inspection

Q1: The maximum usage of the /data/1 data disk in OBServer is 90% of the disk storage of the host, and the disk usage threshold of the host in health inspection of the cluster is set to 85%. Is the setting proper?

A: During health inspection, the data disk of OceanBase Database will be ignored.
