This topic describes the concepts, parameters, and scenarios of the high availability (HA) feature and the procedure for configuring HA parameters in the OceanBase Migration Service (OMS) console.
Overview
HA clusters are an effective way to ensure business continuity. For enterprise users that run 24/7, even a few minutes of interrupted data synchronization affects the business.
With years of application in enterprise scenarios, OMS provides a high-availability cluster architecture that allows recovery in seconds. It supports disaster recovery for hosts, IDCs, and even regions, to meet the requirements of latency-sensitive enterprise users.
The store, connector, and JDBCWriter components of OMS support HA.
Basic principles of store HA
OMS manages the HA feature at the subtopic level. HA is enabled for stores under a subtopic based on the exceptions of the subtopic. Therefore, when store HA is enabled, a store exception may not trigger the generation of a new store for disaster recovery. Store HA is enabled based on the following basic principles:
When the HA module detects a store exception, if the difference between the latest heartbeat time of the store and the current time does not exceed the value of the checkModuleExceptionIntervalSec parameter, the HA module determines that the store is normal.
HA is not enabled for a subtopic if at least one store under the subtopic is normal.
When none of the stores under a subtopic is normal, the following rules apply:
If a store is scheduled for a primary/standby switchover, the HA module does not trigger the generation of a new store, but waits for the store to recover.
If none of the stores under a subtopic is normal and the number of stores under the subtopic does not reach the value of the subtopicStoreNumberThreshold parameter, the HA module triggers a ticket to create new stores. Otherwise, the HA module does not take any action.
If some stores are faulty and some stores are stopped under a subtopic, the HA module cannot determine the cause. If the number of stores under the subtopic does not reach the value of the subtopicStoreNumberThreshold parameter, the HA module triggers a ticket to create new stores. Otherwise, the HA module does not take any action.
If all stores under a subtopic are stopped, the HA module does not take any action.
If a store exits for a reason other than an exception, the HA module treats it as a normally stopped store.
The HA module creates a store based on the value of the refetchStoreIntervalMin parameter, which indicates the number of minutes before the current time.
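The store HA rules above can be sketched as a small decision function. This is a minimal illustration of the documented rules, not the OMS implementation: the `Store` class, the `StoreState` enum, and both function names are hypothetical, and the defaults mirror the values of checkModuleExceptionIntervalSec and subtopicStoreNumberThreshold in ha.config.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class StoreState(Enum):
    NORMAL = "normal"    # heartbeat is recent enough
    FAULTY = "faulty"    # exited or stalled because of an exception
    STOPPED = "stopped"  # exited for a reason other than an exception


@dataclass
class Store:
    # Hypothetical model of a store under a subtopic.
    state: StoreState
    switchover_scheduled: bool = False  # primary/standby switchover pending


def is_normal(heartbeat_age_sec: float,
              check_module_exception_interval_sec: int = 240) -> bool:
    """A store counts as normal while its heartbeat age does not exceed the threshold."""
    return heartbeat_age_sec <= check_module_exception_interval_sec


def should_create_store(stores: Sequence[Store],
                        subtopic_store_number_threshold: int = 5) -> bool:
    """Return True if the rules call for a ticket that creates new stores."""
    # At least one normal store: HA is not enabled for the subtopic.
    if any(s.state is StoreState.NORMAL for s in stores):
        return False
    # A scheduled primary/standby switchover: wait for the store to recover.
    if any(s.switchover_scheduled for s in stores):
        return False
    # All stores stopped: no action.
    if all(s.state is StoreState.STOPPED for s in stores):
        return False
    # All faulty, or a mix of faulty and stopped: create only below the threshold.
    return len(stores) < subtopic_store_number_threshold
```

For example, a subtopic with one faulty store and one stopped store yields `True`, while a subtopic that already holds five faulty stores yields `False` because the threshold is reached.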
HA parameters and definitions
The ha.config file is as follows:
{
"enable":false,
"enableHost":false,
"enableStore":true,
"enableConnector":true,
"enableJdbcWriter":true,
"subtopicStoreNumberThreshold":5,
"checkRequestIntervalSec":600,
"checkJdbcWriterIntervalSec":600,
"checkHostDownIntervalSec":540,
"checkModuleExceptionIntervalSec":240,
"clearAbnormalResourceHours":72,
"refetchStoreIntervalMin":30
}
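Before pasting an edited value into the console, it can help to check that the text is still valid JSON with the expected value types. The following sketch uses only the Python standard library. The function name `load_ha_config` is hypothetical, and the rule that numeric parameters must be positive integers is an assumption made for illustration, not a documented OMS constraint.

```python
import json

HA_CONFIG_TEXT = """
{
  "enable": false,
  "enableHost": false,
  "enableStore": true,
  "enableConnector": true,
  "enableJdbcWriter": true,
  "subtopicStoreNumberThreshold": 5,
  "checkRequestIntervalSec": 600,
  "checkJdbcWriterIntervalSec": 600,
  "checkHostDownIntervalSec": 540,
  "checkModuleExceptionIntervalSec": 240,
  "clearAbnormalResourceHours": 72,
  "refetchStoreIntervalMin": 30
}
"""

BOOL_KEYS = ("enable", "enableHost", "enableStore",
             "enableConnector", "enableJdbcWriter")
INT_KEYS = ("subtopicStoreNumberThreshold", "checkRequestIntervalSec",
            "checkJdbcWriterIntervalSec", "checkHostDownIntervalSec",
            "checkModuleExceptionIntervalSec", "clearAbnormalResourceHours",
            "refetchStoreIntervalMin")


def load_ha_config(text: str) -> dict:
    """Parse ha.config text and check the type of each known parameter."""
    config = json.loads(text)  # raises ValueError on malformed JSON
    for key in BOOL_KEYS:
        if not isinstance(config.get(key), bool):
            raise ValueError(f"{key} must be true or false")
    for key in INT_KEYS:
        value = config.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        # Requiring a positive value is an assumption of this sketch.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
    return config


config = load_ha_config(HA_CONFIG_TEXT)
print(config["subtopicStoreNumberThreshold"])
```

A misplaced quotation mark or line break inside a key makes `json.loads` raise an error, so the problem is caught before the value reaches the console.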
| Parameter | Description | Default value |
|---|---|---|
| enable | Specifies whether to enable HA. Valid values: true and false. | false |
| enableHost | Specifies whether to enable HA for host nodes in the OMS cluster. Valid values: true and false. When hosts are down, the HA module triggers tickets for the batch migration of the connector and JDBCWriter component instances. For more information about OMS downtime migration, see oms_host_down. | false |
| enableStore | Specifies whether to enable HA for store component instances. Valid values: true and false. The HA module supports primary/standby store switchover and can recycle stores with exceptions. | true |
| enableConnector | Specifies whether to enable the connector daemon. Valid values: true and false. This parameter applies to scenarios with abnormal connectors. | true |
| enableJdbcWriter | Specifies whether to enable the JDBCWriter daemon. Valid values: true and false. This parameter applies to scenarios with abnormal JDBCWriters. | true |
| subtopicStoreNumberThreshold | The maximum number of stores under a subtopic. The value of this parameter is verified before the HA module performs an operation. | 5 |
| checkRequestIntervalSec | The interval, in seconds, for triggering HA tickets for the same operation object, for example, the IP, store, connector, or JDBCWriter component. | 600 |
| checkJdbcWriterIntervalSec | The duration, in seconds, before HA is enabled for a faulty JDBCWriter component. HA is enabled for a faulty JDBCWriter component only when the JDBCWriter component has become inactive for the specified duration. | 600 |
| checkHostDownIntervalSec | The duration, in seconds, without a reported heartbeat after which a host is considered down in downtime migration scenarios. | 540 |
| checkModuleExceptionIntervalSec | The time threshold, in seconds, for which a faulty store can stay active before HA is enabled. This parameter is available if store HA is enabled. | 240 |
| clearAbnormalResourceHours | The time interval, in hours, for clearing redundant faulty stores when store HA is enabled, based on the GmtModified parameter of the store component. | 72 |
| refetchStoreIntervalMin | The time interval, in minutes, for pulling stores. | 30 |
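The time-based parameters in the table can be read as simple comparisons against the current time. The helpers below are a sketch of that reading, not OMS code: the function names are hypothetical, and the exact boundary behavior (strictly greater than for the heartbeat check, at least one full interval between tickets) is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def host_is_down(last_heartbeat: datetime, now: datetime,
                 check_host_down_interval_sec: int = 540) -> bool:
    """A host is treated as down once its heartbeat is older than the interval."""
    return (now - last_heartbeat).total_seconds() > check_host_down_interval_sec


def ticket_allowed(last_ticket: Optional[datetime], now: datetime,
                   check_request_interval_sec: int = 600) -> bool:
    """Allow a new HA ticket for the same object only after the interval has passed."""
    if last_ticket is None:
        return True  # no earlier ticket for this object
    return (now - last_ticket).total_seconds() >= check_request_interval_sec


def refetch_start_time(now: datetime,
                       refetch_store_interval_min: int = 30) -> datetime:
    """Point in time that lies refetchStoreIntervalMin minutes before now."""
    return now - timedelta(minutes=refetch_store_interval_min)
```

With the default values, a host whose last heartbeat is 10 minutes old is treated as down, and a second ticket for the same store is held back until 600 seconds after the first one.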
Modify HA configurations
You can modify the parameters of the ha.config file in the OMS console.
Log on to the OMS console.
In the left-side navigation pane, choose System Management > System Parameters.
On the System Parameters page, find ha.config.
Click the edit icon next to Value of the parameter name.
In the Modify Value dialog box, change the value as needed.
Click OK.
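Because the value of ha.config is a single JSON document, one way to avoid typing mistakes is to generate the new value from the current one and paste the result into the Modify Value dialog box. The helper below is a hypothetical convenience script, not part of OMS. It rejects parameter names that are not already present in the current value.

```python
import json


def updated_ha_config(current_value: str, **changes) -> str:
    """Return the ha.config value with the given parameters changed."""
    config = json.loads(current_value)
    unknown = set(changes) - set(config)
    if unknown:
        # Guard against misspelled parameter names such as enableJDBCWriter.
        raise KeyError(f"unknown HA parameters: {sorted(unknown)}")
    config.update(changes)
    # Compact separators match the layout of the value shown in this topic.
    return json.dumps(config, separators=(",", ":"))


current = '{"enable":false,"enableHost":false,"subtopicStoreNumberThreshold":5}'
print(updated_ha_config(current, enable=True))
# prints {"enable":true,"enableHost":false,"subtopicStoreNumberThreshold":5}
```

The shortened `current` string is only for illustration. In practice, copy the full value from the System Parameters page.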