You can create a data migration task to seamlessly migrate the existing business data and incremental data from the source database to the target database in the HBase API compatible mode of OceanBase Database through schema migration, full migration, and incremental synchronization.
Notice
If a data migration task remains inactive for an extended period (with a status of Failed, Paused, or Completed), it may become unrecoverable due to factors such as the retention period of incremental logs. The data migration services will proactively release tasks that have been inactive for more than 7 days to reclaim related resources. It is recommended that you configure alerts for your tasks and promptly handle any task-related exceptions.
Prerequisites
You have created a source database instance.
You have created an instance and a tenant in the target OceanBase Database. For more information, see Create an instance and Create a tenant.
You have created dedicated database users for data migration on both the source and target and granted them the required permissions. For more information, see User privileges.
Limitations
Only users with the project role of Project Owner, Project Admin, or Data Services Admin can create a data migration task.
Currently, data transmission supports Lindorm/HBase wide-table engine version V2.x and OceanBase Database HBase API compatibility mode V4.2.5 or later.
Data transmission supports migrating data from Lindorm/HBase databases to OceanBase Database HBase API compatibility mode only when the table and column names consist solely of digits (0-9), letters (a-z, A-Z), and underscores (_).
When you migrate data from a Lindorm/HBase database to OceanBase Database HBase API compatibility mode, only table objects in the HBase model are supported. Selecting other types of table objects may result in data quality issues.
It is recommended to set the Kafka partition count in the incremental synchronization configuration to 1.
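The naming limitation above can be checked before you create a task. The following is a minimal sketch, not part of the data transmission service: the helper names `is_migratable_name` and `find_unsupported_names` are hypothetical, and the check assumes that names are compared as plain ASCII strings.

```python
import re

# Allowed characters per the limitation above: digits, ASCII letters, underscore.
# An explicit character class is used instead of \w, which also matches
# non-ASCII letters in Python.
_ALLOWED_NAME = re.compile(r"[0-9A-Za-z_]+")


def is_migratable_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return _ALLOWED_NAME.fullmatch(name) is not None


def find_unsupported_names(names):
    """Return the names that violate the naming limitation, in input order."""
    return [n for n in names if not is_migratable_name(n)]
```

For example, `find_unsupported_names(["orders_2024", "user-events"])` flags only `user-events`, because the hyphen is outside the allowed character set.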
Considerations
We recommend that you do not use data migration tasks from Alibaba Cloud's Lindorm/HBase to OceanBase Database HBase API compatibility mode as a long-term synchronization method; use it only for data migration. If either of the following two situations exists, data quality issues may occur during the incremental synchronization phase and be detected during the full verification phase:
If a value is updated by writing it with an explicitly specified version (timestamp), data transmission does not recognize the update.
If there is an operation to set the field data for a certain rowkey to an empty string (''), it may be judged as a deletion of that value during the incremental synchronization phase.
When migrating data from Lindorm/HBase to OceanBase Database HBase API compatibility mode, full verification only supports pulling data in IN mode. It cannot verify scenarios where data exists at the target side that does not exist at the source side, and verification performance will be degraded to some extent.
If table objects differing only in case exist at the source or target, the data migration result may not meet expectations due to case-insensitivity at the source or target.
Clock desynchronization between nodes, or between the client terminal and the server, may cause inaccurate latency measurements for incremental synchronization and reverse incremental synchronization. For example, if a clock is ahead of standard time, the latency may appear negative. If a clock is behind standard time, a latency may be reported even though synchronization is not actually delayed.
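The effect of clock skew on the displayed latency can be illustrated with simple arithmetic. This sketch assumes that latency is computed as the observer's current time minus the timestamp attached to the change at the source; that formula, the function name `reported_latency`, and the sample values are illustrative assumptions, not the service's documented implementation.

```python
from datetime import datetime, timedelta, timezone


def reported_latency(observer_now: datetime, change_timestamp: datetime) -> float:
    """Latency in seconds as an observer would compute it (assumed formula)."""
    return (observer_now - change_timestamp).total_seconds()


# Reference ("standard") time and a change that really happened 2 seconds ago.
true_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
true_change_time = true_now - timedelta(seconds=2)

# A source clock that runs 5 seconds fast stamps the change 5 seconds too late.
fast_stamp = true_change_time + timedelta(seconds=5)
# A source clock that runs 5 seconds slow stamps the change 5 seconds too early.
slow_stamp = true_change_time - timedelta(seconds=5)

latency_with_fast_clock = reported_latency(true_now, fast_stamp)  # negative
latency_with_slow_clock = reported_latency(true_now, slow_stamp)  # inflated
```

With a real latency of 2 seconds, the fast clock yields a negative value and the slow clock yields a value larger than 2 seconds, which matches the two symptoms described above.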
If the TTL (Time to Live) configurations for Lindorm/HBase and OceanBase Database HBase API compatibility mode are inconsistent, data inconsistency between the source and target may occur.
If you configure only Incremental Synchronization when creating a data migration task, data transmission requires the Kafka at the source database to retain data for more than 48 hours.
If you configure Full Migration + Incremental Synchronization when creating a data migration task, data transmission requires the Kafka at the source database to retain data for at least 7 days. Otherwise, the data migration task may fail or cause data inconsistency between the source and target due to the inability to obtain retained Kafka data.
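The two Kafka retention requirements above can be expressed as a small check. This is a sketch under stated assumptions: the function names are hypothetical, the topic retention is taken from Kafka's `retention.ms` topic setting (where `-1` means no time limit), and any task without Full Migration is treated as the "only Incremental Synchronization" case.

```python
from typing import Optional

HOURS_48_MS = 48 * 60 * 60 * 1000      # 172,800,000 ms
DAYS_7_MS = 7 * 24 * 60 * 60 * 1000    # 604,800,000 ms


def required_retention_ms(full_migration: bool, incremental_sync: bool) -> Optional[int]:
    """Return the retention threshold in ms, or None if Kafka is not used."""
    if not incremental_sync:
        return None
    return DAYS_7_MS if full_migration else HOURS_48_MS


def retention_is_sufficient(topic_retention_ms: int,
                            full_migration: bool,
                            incremental_sync: bool) -> bool:
    """Compare a topic's retention.ms with the requirement stated above."""
    required = required_retention_ms(full_migration, incremental_sync)
    if required is None:
        return True
    if topic_retention_ms == -1:  # Kafka: -1 disables time-based deletion
        return True
    if full_migration:
        return topic_retention_ms >= required  # "at least 7 days"
    return topic_retention_ms > required       # "more than 48 hours"
```

A topic with `retention.ms=259200000` (3 days) would pass for an incremental-only task but fail for a task that combines Full Migration and Incremental Synchronization.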
Supported source and target instance types
| Cloud vendor | Source | Target |
|---|---|---|
| Alibaba Cloud | Alibaba Cloud Lindorm | OceanBase Database HBase API Compatible |
| Alibaba Cloud | Alibaba Cloud ApsaraDB for HBase | OceanBase Database HBase API Compatible |
Procedure
Create a data migration task.
Log in to the OceanBase Cloud console.
In the left-side navigation pane, select Data Services > Migrations.
On the Migrations page, click the Migrate Data tab.
In the upper-right corner of the Migrate Data tab, click Create Task.
In the Edit Task Name text box, enter a custom name for the migration task.
We recommend that you use a combination of Chinese characters, numbers, and letters. The name must not contain spaces and cannot exceed 64 characters in length.
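The two hard constraints on the task name (no spaces, at most 64 characters) can be checked up front, for example when task names are generated by a script. This is a minimal sketch: the function name `task_name_problems` is hypothetical, and it assumes that length is counted in characters the way Python's `len` counts them, which may differ from how the console counts.

```python
MAX_TASK_NAME_LENGTH = 64


def task_name_problems(name: str) -> list:
    """Return a list of violated constraints; an empty list means none found."""
    problems = []
    if " " in name:
        problems.append("name contains spaces")
    if len(name) > MAX_TASK_NAME_LENGTH:
        problems.append("name exceeds 64 characters")
    return problems
```

A name such as `hbase_to_ob_01` produces an empty list, while `my task` is reported because it contains a space.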
On the Configure Source & Target page, configure the parameters.
In the Source section, configure the parameters.
If you need to reference an existing data source, click Quick Fill next to Source and select the desired data source from the drop-down list. After you make the selection, the configurations in the Source section are automatically populated. If you want to save the current configuration as a new data source, click the Save icon in the upper-right corner of the Source section.
You can also click Manage Data Source in the Quick Fill drop-down list to go to the Data Sources page to view and manage data sources. This page provides centralized management for different types of data sources. For more information, see Data sources.
| Parameter | Description |
|---|---|
| Cloud Vendor | Currently, Alibaba Cloud is supported. |
| Database Type | Select Lindorm or HBase as the source database type based on your actual situation. |
| Region | Select the region where the source database is located. |
| Instance Type | Alibaba Cloud Lindorm or Alibaba Cloud HBase Enhanced Edition. |
| Connection Type | Select Public IP. You must first add the displayed data source IP address to the allowlist of the source database instance to ensure connectivity. For details, see the Select a public network connection module. |
| Connection Information | Enter the address or IP of the source database. |
| Port | Enter the port of the source database. |
| Database Account | The database account used for data migration. |
| Password | The password for the database account. |

To obtain incremental data from the source database through Kafka, specify the Kafka data source and topic. If these settings are missing, incremental synchronization may be unavailable.
| Parameter | Description |
|---|---|
| Kafka Data Source | You can select a Kafka data source saved in data transmission. For details, see Create a data source. Note: Only Kafka data sources from the same cloud vendor can be selected. |
| Topic | You can select or enter a topic. |
| Incremental Data Format | Select the supported incremental data format. Currently, DEBEZIUM_V2 is supported. |

In the Target section, configure the parameters.
If you need to reference an existing data source, click Quick Fill next to Target and select the desired data source from the drop-down list. After you make the selection, the configurations in the Target section are automatically populated. If you want to save the current configuration as a new data source, click the Save icon in the upper-right corner of the Target section.
You can also click Manage Data Sources in the Quick Fill drop-down list to go to the Data Source page to view and manage data sources. This page provides centralized management for different types of data sources. For more information, see Data sources.
| Parameter | Description |
|---|---|
| Cloud Vendor | Currently, only Alibaba Cloud is supported. |
| Region | Select the region where the target database is located. |
| Database Type | Select OceanBase HBase API Compatible as the target database type. |
| Instance Type | Select Dedicated (Key-Value) or Cluster Instance (Flagship) as the target instance type. |
| Instance | The ID or name of the instance where the tenant in OceanBase Database's HBase API compatible mode resides. You can view the target instance's ID or name on the Instances page. Note: When the cloud vendor is Alibaba Cloud, you can also select an Alibaba Cloud root account instance for cross-account authorization. For details, see Authorize an Alibaba Cloud account. |
| Tenant | The ID or name of the tenant in OceanBase Database's HBase API compatible mode. You can view the target tenant's ID or name by expanding the target instance on the Instances page. |
| Database Account | The username of the tenant in OceanBase Database's HBase API compatible mode used for data migration. |
| Password | The password of the database user. |
Click Test and Continue.
On the Select Type & Objects page, configure the parameters.
In the Migration Type section, select the migration type for the current data migration task.
Migration Type includes Schema Migration, Full Migration, and Incremental Synchronization.
| Parameter | Description |
|---|---|
| Schema Migration | Schema migration requires you to define the character set mapping relationship. Data migration only replicates a copy of the source database's data (schema) to the target database without affecting the source data (schema). |
| Full Migration | After a full migration task starts, the data migration service migrates the existing data from the source database tables to the corresponding tables in the target database. |
| Incremental Synchronization | After an incremental synchronization task starts, data migration synchronizes the changed data (added, modified, or deleted) from the source database to the corresponding tables in the target database. |

Note
- If you did not bind a Kafka data source when creating the Lindorm/HBase data source, you cannot select Incremental Synchronization.
- When you select Incremental Synchronization > DML synchronization, contact Alibaba Cloud Technical Support to confirm that the data delivered to Kafka is ordered. Otherwise, data inconsistency may occur. For Lindorm configuration details, see Real-time Data Subscription in Overview of Real-time Data Subscription. For HBase configuration details, see Streams (Real-time Data Subscription) Feature Introduction.
In the Select Migration Scope section, configure the method for selecting migration objects.
Currently, you can select migration objects by using Specify Objects.
In the Select Migration Scope section, select the objects to migrate.
When you select Specify Objects, you can select one or more tables in one or more databases as migration objects. Select the objects you want to migrate on the left and click > to add them to the list on the right.
Data migration supports importing objects via text. You can also rename target objects, set partitions, and remove individual or all migration objects.
| Operation | Description |
|---|---|
| Import Objects | In the list on the right side of the selection area, click Import Object in the upper-right corner. For more information, see Import migration objects. |
| Rename | Data migration supports renaming migration objects. For more information, see Rename a database or table. |
| Partition Settings | Partition settings are supported only when Schema Migration is selected. In the selected box, click Modify Filter Condition next to the target table to set partitioning for the target table. For more information, see Partition settings. |
| Remove/Clear All | During data mapping, you can remove one or more objects that you have selected for migration to the target. To remove a single migration object, click the Remove icon next to the object in the list on the right side of the selection area. To remove all migration objects, click Clear All in the upper-right corner of the list on the right side of the selection area, and then click OK in the dialog box. |
Click Next.
On the Migration Options page, configure the parameters.
Full migration
| Parameter | Description |
|---|---|
| Read Concurrency | The number of concurrent reads from the source during the full migration phase. The maximum value is 512. A high concurrency value may cause excessive load on the source and affect business operations. |
| Write Concurrency | The number of concurrent writes to the target during the full migration phase. The maximum value is 512. A high concurrency value may cause excessive load on the target and affect business operations. |
| Data Version for Migration | You can select All Versions (default) or Latest Version. Lindorm/HBase supports multi-version storage, with data row versions marked by timestamp. If you select All Versions, full migration migrates all versions of the data rows. If you select Latest Version, full migration migrates only the data row version with the latest timestamp. |
| Handle Non-empty Tables in Target Database | The handling strategies are Stop Migration and Ignore. If you select Stop Migration and the target table has data, full migration reports an error indicating that migration is not allowed. Handle the data in the target table before you continue the migration. Notice: If you click Resume after an error occurs, data migration ignores this configuration option and continues migrating table data. Proceed with caution. If you select Ignore and the target table has data, when the original data conflicts with the data to be written, data migration logs the conflicting data and retains the original data unchanged during write operations. |
| Full Migration Rate Limit | You can decide whether to enable the full migration rate limit based on your actual needs. If enabled, set the RPS (the maximum number of data rows that can be migrated to the target per second during the full migration phase) and BPS (the maximum amount of data that can be migrated to the target per second during the full migration phase). Note: The RPS and BPS set here are only for rate limiting. The actual achievable performance during full migration is subject to factors such as the source, target, and instance specifications. |
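The difference between the All Versions and Latest Version options for Data Version for Migration can be sketched as follows. This is only an illustration of the selection logic, not the service's implementation: the `Cell` tuple layout and the function `select_versions` are hypothetical, and versions are assumed to be tracked per rowkey and column, as in the HBase data model.

```python
from typing import Dict, List, Tuple

# Hypothetical cell layout: (rowkey, column, timestamp_ms, value)
Cell = Tuple[str, str, int, bytes]


def select_versions(cells: List[Cell], mode: str = "all") -> List[Cell]:
    """Return the cells that would be migrated under the given mode."""
    if mode == "all":
        # All Versions: every stored version is migrated.
        return list(cells)
    if mode != "latest":
        raise ValueError("mode must be 'all' or 'latest'")
    # Latest Version: keep only the newest timestamp per (rowkey, column).
    latest: Dict[Tuple[str, str], Cell] = {}
    for cell in cells:
        key = (cell[0], cell[1])
        if key not in latest or cell[2] > latest[key][2]:
            latest[key] = cell
    return list(latest.values())
```

Given two versions of the same cell with timestamps 100 and 200, the `"all"` mode returns both, while the `"latest"` mode returns only the version with timestamp 200.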
Incremental synchronization
The following parameters are displayed only if Incremental Synchronization is selected on the Select Type & Objects page.
| Parameter | Description |
|---|---|
| Write Concurrency | The number of concurrent writes to the target during the incremental synchronization phase. The maximum value is 512. A high concurrency value may cause excessive load on the target and affect business operations. |
| Incremental Synchronization Start Timestamp | If you selected Full Migration as the migration type, this parameter is not displayed. If you did not select Full Migration but selected Incremental Synchronization, specify a point in time after which data should be migrated. The default is the current system time. For details, see Set an incremental synchronization timestamp. |
| Incremental Migration Rate Limit | You can decide whether to enable the incremental synchronization rate limit based on your actual needs. If enabled, set the RPS (the maximum number of data rows that can be synchronized to the target per second during incremental synchronization) and BPS (the maximum amount of data that can be synchronized to the target per second during incremental synchronization). Note: The RPS and BPS set here are only for throttling. The actual achievable performance during incremental synchronization is limited by factors such as the source, target, and instance specifications. |
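Because RPS and BPS are upper limits, they imply a lower bound on how long a given volume of data takes to transfer. The sketch below estimates that bound. The function name `min_transfer_seconds` is hypothetical, BPS is assumed to be expressed in bytes per second, and actual durations will be longer when the source, target, or instance specifications are the bottleneck.

```python
def min_transfer_seconds(total_rows: int, total_bytes: int,
                         rps_limit: float, bps_limit: float) -> float:
    """Lower bound on transfer time when both limits are enforced."""
    if rps_limit <= 0 or bps_limit <= 0:
        raise ValueError("rate limits must be positive")
    # Whichever limit is reached first determines the minimum duration.
    return max(total_rows / rps_limit, total_bytes / bps_limit)
```

For example, 1,000,000 rows totaling 2,000,000,000 bytes with limits of 10,000 rows per second and 10,000,000 bytes per second cannot finish in under 200 seconds, because the byte limit is the tighter constraint.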
Click Pre-check to perform a pre-check on the data migration task.
In the Pre-check step, the system checks whether the read and write permissions of the database user and the network connection meet the requirements. You can only start the data migration task after all checks pass. If an error occurs during the pre-check:
You can troubleshoot and fix the issue, then rerun the pre-check until it succeeds.
Alternatively, you can click Skip in the Actions column of the failed pre-check item. A dialog box will appear, informing you of the specific impact of skipping this operation. After confirming that it is acceptable, click OK in the dialog box.
After the pre-check succeeds, click Purchase to go to the Purchase Data Migration Instance page.
After the purchase succeeds, you can start the data migration task. For more information about how to purchase a data migration instance, see Purchase a data migration instance. If you do not need to purchase a data migration instance at this time, click Save to go to the details page of the data migration task. You can manually purchase a data migration instance later as needed.
You can click Configure Validation Task in the upper-right corner of the details page to compare the data differences between the source database and the target database. For more information, see Create a data validation task.
The data migration service allows you to modify the migration objects when the task is running. For more information, see View and modify migration objects. After the data migration task is started, it is executed based on the selected migration types. For more information, see the "View migration details" section in View details of a data migration task.
