FAQ on OceanBase cluster|V4.2.2| docs|Distributed Database

FAQ on OceanBase cluster

Last Updated: 2024-03-22 09:34:34

This topic provides answers to the frequently asked questions about OceanBase cluster management.

Questions about feature implementation

Q1: Can I use an x86 OCP system to manage an OceanBase cluster in the Arm architecture?

A: You can do so in OCP V2.5.2 or later.

Q2: Which version of OCP can I use to deploy an OceanBase V1.4.79 cluster?

A: You can deploy such a cluster by using OCP of any version. We recommend that you use the latest version of OCP.

Q3: Does OCP support a deployment plan of using the x86 architecture for the primary OceanBase cluster and the Arm architecture for standby OceanBase clusters?

A: Such a deployment plan is supported in OCP V3.1.0 and later.

Questions about OceanBase cluster management

Q1: Where can I view the delay between the primary and standby clusters?

A: Click Cluster Overview, switch to the topology of the cluster, and then view the delay between the primary and standby clusters.

Q2: What should I do when I receive the error message machine resource is not enough to hold a new unit?

A: If the server resources are insufficient, replica migration tasks cannot proceed. You can add hosts to solve this issue.

Q3: What is the strategy used by OCP-Agent in automatically clearing observer.log?

A: When the memory usage reaches 80%, the latest 100 log records are retained and the earlier ones are automatically cleared.
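
The retention rule in this answer can be sketched as follows. This is only an illustration of the stated rule (80% threshold, latest 100 records kept), not the OCP-Agent implementation. The function name, the input format (a list of records ordered from oldest to newest), and the usage argument are assumptions made for the example.

```python
def records_to_clear(records, usage_ratio, threshold=0.80, keep_latest=100):
    """Return the records that would be cleared under the rule in the answer.

    records     -- hypothetical list of log records, ordered oldest to newest
    usage_ratio -- current usage as a fraction between 0.0 and 1.0
    """
    if usage_ratio < threshold:
        return []  # below the threshold, nothing is cleared
    if len(records) <= keep_latest:
        return []  # everything fits within the retained window
    # Keep the newest `keep_latest` records and clear the earlier ones.
    return records[:len(records) - keep_latest]


logs = [f"observer.log.{i}" for i in range(1, 121)]  # 120 made-up records
cleared = records_to_clear(logs, usage_ratio=0.85)
print(len(cleared), cleared[0], cleared[-1])
```

With 120 records and usage at 85%, the 20 oldest records are selected for clearing and the latest 100 remain.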

Q4: How can I obtain the topology status of a cluster?

A: You can obtain the topology status of a cluster based on the records in the ob_server table of MetaDB.

Q5: Where can I get the total number of sessions of OCP?

A: You can get the number from the internal table __all_virtual_processlist of OceanBase Database.

Q6: What is an inactive session? How are inactive sessions managed?

A: If a session remains idle for a specified period after it is created, the session changes to the inactive state. By default, a session is marked as inactive after it remains idle for 28,800 seconds (8 hours). OceanBase Database does not proactively release inactive sessions. They are released by clients that have a timeout mechanism.
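
As a minimal sketch of the idle-time rule, the following example splits sessions into active and inactive ones by using the default 28,800-second limit. The function name and the input format (a mapping from session ID to idle seconds) are hypothetical. In practice the idle times would come from the database or a client-side connection pool.

```python
INACTIVE_AFTER_SECONDS = 28_800  # default idle period stated in the answer


def split_sessions(idle_seconds_by_session, limit=INACTIVE_AFTER_SECONDS):
    """Split session IDs into (active, inactive) lists by idle time."""
    active, inactive = [], []
    for session_id, idle in sorted(idle_seconds_by_session.items()):
        if idle >= limit:
            inactive.append(session_id)
        else:
            active.append(session_id)
    return active, inactive


sessions = {101: 12, 102: 28_800, 103: 90_000}  # made-up idle times in seconds
active, inactive = split_sessions(sessions)
print(active, inactive)
```

Because the database does not release inactive sessions on its own, a client could use a check like this to decide which connections to close.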

Q7: Does an inactive session occupy the maximum number of connections supported by a unit?

A: The actual distribution of connections is not affected by the maximum unit connections. Instead, it is affected by the maximum OBServer connections. Unit connections are used as an alert condition in OCP.

Q8: Which ports do I need to enable if OCP and OceanBase Database are deployed in different CIDR blocks?

A: For more information, see Component listening port list.

Q9: How do I quickly delete a cluster with failed tasks?

A: Change the status of the cluster from Operating to NORMAL in the ob_cluster table of MetaDB. Then, you can delete the cluster.

Q10: In the health inspection result description of the OceanBase cluster, the following message appears: Replicas: unknown > 100000, Critical. What does it mean?

A: unknown indicates that the system failed to get the number of partitions. We recommend that you perform the following steps to locate the fault:

Check whether related data is displayed in the Capacity_Number of Partitions diagram on the Tenant Management page of the OceanBase cluster. In this case, no data is displayed.
Check whether the processes in the Processes list of the corresponding host are normal. In this case, the obstat and logtailer processes have stopped abnormally.
Check whether the DNS is incorrectly resolved, which would prevent the host from accessing the database and the obstat and logtailer processes from starting.

We recommend that you configure the DNS address in the /etc/resolv.conf file of the host that returns the error, and then check whether the problem persists.
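
To confirm that a host has a DNS server configured, the resolver configuration can be inspected for nameserver lines. The following is a minimal sketch that parses such content. The function name and the sample addresses are placeholders chosen for illustration.

```python
def parse_nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines, ignoring comments."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


sample = """
# example content; the addresses are placeholders
search example.com
nameserver 192.0.2.53
nameserver 192.0.2.54
"""
print(parse_nameservers(sample))
```

On a Linux host, the text would typically be read from /etc/resolv.conf. An empty result suggests that no DNS server is configured on that host.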

Q11: Why am I unable to delete a cluster or quit the deletion task in OCP?

A: If you delete a host in the OBServer cluster by using a CLI tool, the host is abnormally offline and the cluster information in MetaDB is inconsistent with that of the OceanBase cluster. We recommend that you do not delete an OceanBase cluster forcibly if some OBServers in the cluster are offline. We recommend that you contact a technical engineer to manually modify the related data in MetaDB.

Questions about OceanBase cluster deployment

Q1: An error occurred when I was creating a cluster. Error message: illegal argument exception. What should I do?

A: Check whether the name of the RPM package is duplicated.

Q2: An error occurred at the Confirm Information step when I was installing OceanBase Database. Error message: An unknown error has occurred. Cause: {0}. Error message: {1}. Contact the administrator. What is the cause of this problem?

A: Check whether the name of the OceanBase Database software package is modified.

Q3: The architecture of my primary cluster is x86. I plan to add AArch64 hosts to the standby cluster. Can I use the x86 software package to deploy standby clusters?

A: This deployment plan is risky. We recommend that you use servers of the same architecture for both the primary and standby clusters.

Q4: Why does the system report invalid zone priority when I use OCP to create a cluster?

A: The region of the top priority zone must have at least two zones.
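
The rule can be checked before a cluster is created. The following sketch assumes a mapping from zone name to region name and a list of zones ordered by priority. Both structures, the function name, and the region names are hypothetical and only illustrate the rule stated in the answer.

```python
def top_priority_region_is_valid(zone_regions, zone_priority):
    """Check that the region of the top-priority zone has at least two zones."""
    top_zone = zone_priority[0]
    top_region = zone_regions[top_zone]
    zones_in_region = [z for z, r in zone_regions.items() if r == top_region]
    return len(zones_in_region) >= 2


zones = {"zone1": "region_a", "zone2": "region_a", "zone3": "region_b"}
print(top_priority_region_is_valid(zones, ["zone1", "zone2", "zone3"]))
print(top_priority_region_is_valid(zones, ["zone3", "zone1", "zone2"]))
```

In the second call, zone3 has the top priority but is the only zone in its region, so the check fails. This is the situation in which the invalid zone priority error would be expected.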

Q5: When I create a standby cluster in OCP, is it true that the primary/standby data synchronization task is implemented and ends together with the cluster creation task?

A: The standby cluster creation task has nothing to do with the primary/standby synchronization task. The standby database provides services to applications only after it is created.

Q6: I used OCP V2.5.x to create primary and standby clusters, and I set the primary_zone of both the tenant and table to RANDOM. Why does the system report Add cluster not allowed and display the prompt CHECK PRIMARY ZONE OR LOCALITY CONFIG when I run a task?

A: The primary_zone or locality is configured in a table. Perform the following steps to solve the problem:

Check for the table in which primary_zone or locality is configured in the OceanBase database of the SYS tenant.
```
select table_id,tenant_id,primary_zone,locality,table_name from __all_virtual_table;
```
In this example, the table in which primary_zone or locality is configured is named t, and the tenant ID is 1001.
Execute the alter table t set primary_zone = default; statement in the database of the corresponding tenant to cancel the primary_zone setting of the table.

Execute the following statements in the database of the corresponding tenant to cancel the locality setting of the table:

```
select locality from __all_tenant where tenant_id = 1001;     -- The query result is F@zone1,F@zone2,F@zone3.
alter table t set locality = 'F@zone1,F@zone2,F@zone3';
```
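
The query in the first step returns every table, so the rows still need to be filtered. The following is a minimal sketch of that filtering and of building the two statements from the later steps. It assumes that an unset primary_zone or locality is returned as NULL or an empty string, which the steps above do not state, and that the rows have already been fetched as dictionaries by a client of your choice. The function names and sample rows are made up.

```python
def tables_with_overrides(rows):
    """Return rows whose primary_zone or locality is set at the table level.

    Assumption: an unset value is None or an empty string.
    """
    return [r for r in rows if r.get("primary_zone") or r.get("locality")]


def reset_statements(table_name, tenant_locality):
    """Build the two statements from the steps above for one table."""
    return [
        f"alter table {table_name} set primary_zone = default;",
        f"alter table {table_name} set locality = '{tenant_locality}';",
    ]


rows = [  # made-up rows shaped like the query result
    {"table_id": 1, "tenant_id": 1001, "primary_zone": "RANDOM",
     "locality": "", "table_name": "t"},
    {"table_id": 2, "tenant_id": 1001, "primary_zone": None,
     "locality": "", "table_name": "u"},
]
for row in tables_with_overrides(rows):
    for stmt in reset_statements(row["table_name"], "F@zone1,F@zone2,F@zone3"):
        print(stmt)
```

The sketch only prints the statements so that they can be reviewed before being executed in the corresponding tenant.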

Q7: I have an OCP system that manages an OceanBase cluster in a private cloud. I want to deploy a new OCP system in the same environment and create an OceanBase V2.2.7x cluster in it. I changed the default value of cluster_id in the /root/t-oceanbase-antman/obcluster.conf file to 100000 when I deployed OCP. What else do I need to modify when I create the OceanBase cluster?

A: When you use OCP to create an OceanBase cluster, you must ensure that the cluster ID and cluster name are unique.

Questions about OceanBase cluster takeover

Q1: An error occurred on the host when I was adding a server to the cluster. Error message: Python dependency does not exist. What should I do?

A: The host was not installed using the installation template. Manually install the Python script. We recommend that you install the host using the installation template.

Q2: An error occurred when I was using OCP V2.5.x to take over an OceanBase cluster. Error code: 11042. Precheck of the cluster to be taken over failed. Cause: OBServer check failed. What should I do?

A: We recommend that you call the precheck API to find the cause of the OBServer check failure. Run the following command to call the API, replacing the xxx placeholders with actual values:

```
curl -X POST --user admin:xxx \
  -H "Content-Type:application/json" \
  -d '{
        "rootSysPassword": "xxxxx",
        "address": "xxxxx",
        "port": 2888,
        "connectionMode": "proxy",
        "clusterName": "xxx",
        "obClusterId": xxx
      }' \
  "http://example.com:8080/api/v2/ob/clusters/takeOverPreCheck"
```

The request body contains the following fields:

rootSysPassword: Required. The password of the root@sys account.
address: Required. The address used to connect to the cluster.
port: Required. The port used to connect to the cluster.
connectionMode: Optional. The connection mode. Valid values: direct and proxy. Default value: direct.
clusterName: Optional in most cases. If the connection mode is set to proxy, it is required.
obClusterId: Optional in most cases. When the connection mode is set to proxy and the cluster to be taken over is a standby cluster, it is required.

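The rules for required and optional fields can be checked before the request is sent. The following sketch builds the JSON request body and enforces those rules. The function name and the is_standby argument are hypothetical helpers for this example and are not part of the OCP API. Only the field names come from the request shown above.

```python
import json


def build_precheck_payload(root_sys_password, address, port,
                           connection_mode="direct", cluster_name=None,
                           ob_cluster_id=None, is_standby=False):
    """Build the JSON body for the precheck call and validate the fields."""
    if connection_mode not in ("direct", "proxy"):
        raise ValueError("connectionMode must be 'direct' or 'proxy'")
    if connection_mode == "proxy":
        if not cluster_name:
            raise ValueError("clusterName is required in proxy mode")
        if is_standby and ob_cluster_id is None:
            raise ValueError(
                "obClusterId is required for a standby cluster in proxy mode")
    payload = {
        "rootSysPassword": root_sys_password,
        "address": address,
        "port": port,
        "connectionMode": connection_mode,
    }
    # Optional fields are included only when a value is supplied.
    if cluster_name:
        payload["clusterName"] = cluster_name
    if ob_cluster_id is not None:
        payload["obClusterId"] = ob_cluster_id
    return json.dumps(payload)


print(build_precheck_payload("xxxxx", "xxxxx", 2888,
                             connection_mode="proxy", cluster_name="xxx"))
```

The returned string can be passed as the request body, for example with the -d option of curl.
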
Q4: What should I pay attention to when I take over an OceanBase cluster from another OCP?

A: Pay attention to the following when you take over an OceanBase cluster from another OCP:

Confirm how business applications access the OceanBase cluster so that you can assess the impact of the takeover on them.
Migrate the cluster from its original OCP, or stop this OCP.
In a single-OCP scenario, the primary OceanBase cluster is taken over before the standby OceanBase cluster.
In a multi-OCP scenario, the primary OceanBase cluster is taken over in any OCP before the standby OceanBase cluster is taken over in any OCP.

Questions about OceanBase cluster upgrade

Q1: How do I upgrade the primary and standby clusters?

A: You can click the Upgrade Version button in the upper-right corner of the Overview page of the primary and standby clusters to upgrade them. The upgrade applies to both the primary and standby clusters, regardless of whether you click the Upgrade Version button on the Overview page of the primary or standby cluster.

Q2: The connection timed out when I was refreshing schema information to upgrade my OceanBase cluster in OCP. What should I do?

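A: Run the following statements to view the refreshed schema version and the value of min_observer_version. In the first statement, '%s' and %d are placeholders for the IP address and port of the OBServer node, and the value to compare with refreshed_schema_version is omitted here:

```
select tenant_id, refreshed_schema_version from __all_virtual_server_schema_info where svr_ip = '%s' and svr_port = %d and refreshed_schema_version >
select max(value) value from oceanbase.__all_virtual_sys_parameter_stat where name = 'min_observer_version';
```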

If the schema and version are correct, manually retry CheckRefreshSchemaTask.

Q3: Why does the system report "execute upgrade checker" when I upgrade the OceanBase cluster?

A: The RPM package download may have timed out. We recommend that you try to upgrade the cluster again.

Q4: Why does the system report No node of the corresponding version when I try to upgrade OceanBase Database from V2.2.71 to V2.2.74?

A: The upgrade path defined in the /home/admin/ocp-server/etc/oceanbase_upgrade_dep.yml file on the OCP server is incomplete. We recommend that you add the following version information to the file so that the upgrade path reaches OceanBase Database V2.2.74:

```
# Previous content is omitted.
- version: 2.2.70
  can_be_upgraded_to:
      - 2.2.71
- version: 2.2.71
  can_be_upgraded_to:
      - 2.2.72
- version: 2.2.72
  can_be_upgraded_to:
      - 2.2.73
- version: 2.2.73         # The last version must be defined. Otherwise, the upgrade path is unavailable for same version upgrade.
  can_be_upgraded_to:
      - 2.2.74
```
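
To see why a missing entry breaks the upgrade, the dependency list can be treated as a directed graph and searched for a chain of versions. The following sketch does this with a breadth-first search. It only illustrates the idea and is not how OCP itself resolves the path. The entries are written as a plain dictionary to avoid a YAML parser dependency, and the function name is hypothetical.

```python
from collections import deque

# Same edges as the version entries above, written as a plain dictionary.
UPGRADE_DEPS = {
    "2.2.70": ["2.2.71"],
    "2.2.71": ["2.2.72"],
    "2.2.72": ["2.2.73"],
    "2.2.73": ["2.2.74"],
}


def find_upgrade_path(deps, source, target):
    """Breadth-first search for a chain of versions from source to target."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in deps.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of versions leads to the target


print(find_upgrade_path(UPGRADE_DEPS, "2.2.71", "2.2.74"))
```

If the 2.2.73 entry is removed from the dictionary, the function returns None because no chain reaches 2.2.74.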

Q5: What does Replace Binary Files mean on the Confirm Upgrade Path page of the OceanBase cluster?

A: It means to replace the binary file of the OBServer. In OceanBase Database V2.2.72 and V2.2.73, you only need to run the pre/post script without replacing the binary file of the OBServer. The message is informational and does not require manual operations. The binary file will be automatically replaced during an upgrade.

Q6: An error is reported when I upgrade an OceanBase cluster. Error message: [alter system stop zone ?]; SQL state [25000]; error code [4012]; Statement is timeout. What should I do?

A: We recommend that you try again.

Q7: Why do I fail to upgrade an OceanBase cluster from V2.2.30 to V2.2.50?

A: You have not uploaded the V2.2.40 RPM package that is required for OceanBase upgrades. We recommend that you upload the package and try to upgrade the cluster again.

Q8: I want to upgrade the minor version before I upgrade the major version. Does OCP support upgrading to the same version?

A: Yes, OCP supports upgrading to the same major version. However, the minor versions must be different.

Q9: I use OCP to upgrade the OceanBase cluster. Why does a server fail and report Cannot allocate memory at the Install rpm step?

A: The memory is insufficient. We recommend that you clear the content in the memory or increase the memory as needed.

Q10: When I upgrade an OceanBase cluster, the upgrade_checker.py script reports an error. Error message: MyError: "(1, Decimal('1.0000')) check_logonly_replica_unit failed, not found log_type resource_pool for this tenant. Why?

A: Check whether the OceanBase cluster to be upgraded contains log replicas. If it does, we recommend that you replace the log replicas with full-featured replicas.

Q11: If an OceanBase cluster includes the omsmeta and odpmeta tenants in addition to the SYS, ocp-meta, and ocp-monitor tenants, how can I upgrade the cluster?

A: You can contact OceanBase Technical Support to obtain the corresponding upgrade document to upgrade the cluster. The operation will not affect the tenants of other applications.

Questions about OceanBase cluster monitoring

Q1: When I use OCP to create a cluster, can the ocp_monitor user be automatically created? If yes, what is the default password of this user?

A: Yes, the ocp_monitor user is automatically created. You do not need to know the password of this user. After the user is created, the system uses the password internally whenever it calls this user.

Q2: Does transaction log time on the Performance Monitoring page refer to the average time consumed by SQL statements or the average time consumed by transactions?

A: It refers to the average time consumed by transactions.

Questions about cluster inspection

Q1: The maximum usage of the /data/1 data disk in OBServer is 90% of the disk storage of the host, and the disk usage threshold of the host in health inspection of the cluster is set to 85%. Is the setting appropriate?

A: During health inspection, the data disk of OceanBase Database will be ignored.
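
The effect of ignoring the data disk can be sketched as follows. The example flags mount points whose usage exceeds the 85% threshold and skips /data/1. The function name, the input mapping, the other mount points, and the choice of a strict "greater than" comparison are assumptions made for illustration and are not taken from OCP.

```python
def disks_over_threshold(disk_usage, threshold=85.0, ignored_mounts=("/data/1",)):
    """Return mount points above the threshold, skipping ignored ones."""
    return sorted(
        mount for mount, used_percent in disk_usage.items()
        if mount not in ignored_mounts and used_percent > threshold
    )


usage = {"/": 40.0, "/home": 88.5, "/data/1": 90.0}  # made-up usage in percent
print(disks_over_threshold(usage))
```

Although /data/1 is at 90%, it is not reported because the data disk is excluded from the check.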