Migrate data from HBase to OBKV (OceanBase Migration Service V4.2.11)

This topic describes how to use OceanBase Migration Service (OMS) Community Edition to migrate data from an HBase database to OBKV.

Background information

You can create a data migration task in the console of OMS Community Edition to seamlessly migrate the existing business data and incremental data from an HBase database to OBKV through schema migration, full migration, and incremental synchronization.

Prerequisites

You have created the corresponding schema in OBKV. OMS Community Edition migrates tables and columns, so the schema itself must exist in the target database before the migration starts.

To migrate data in Flink mode, you need to create a Flink cluster of version 1.14.0.

To use the incremental synchronization feature, you need to enable replication for the HBase tables by setting the REPLICATION_SCOPE parameter to 1.
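
As a rough sketch of the REPLICATION_SCOPE prerequisite, the helper below builds HBase shell `alter` statements as plain strings. It does not connect to HBase. The table and column family names are hypothetical, and the statement form should be checked against your HBase version before you run it in `hbase shell`.

```python
def replication_alter_statements(table, column_families, scope=1):
    """Build HBase shell `alter` statements that set REPLICATION_SCOPE.

    Nothing is executed here; the strings are meant to be reviewed and
    then pasted into `hbase shell`.
    """
    return [
        f"alter '{table}', {{NAME => '{cf}', REPLICATION_SCOPE => {scope}}}"
        for cf in column_families
    ]


# Hypothetical table and column families, for illustration only.
for stmt in replication_alter_statements("ns1:orders", ["cf1", "cf2"]):
    print(stmt)
```

One statement is generated per column family because the replication scope is a column family attribute.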

Limitations

Limitations on operations in the source database

Do not perform DDL operations that modify database or table schemas during schema migration or full migration. Otherwise, the data migration task may be interrupted.
Only HBase 1.2.0-cdh5.15.2 and 2.4.13 are supported.
Direct load cannot write data in the Latin1 character set. An attempt to do so throws an exception.
The migration of tag data and full verification are not supported.
Incremental synchronization does not migrate data that is written to the source HBase database in bulkload mode.
OMS Community Edition supports migrating databases, tables, and columns with ASCII-compliant names that do not contain special characters (spaces, line breaks, or |"'`()=;/&).
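
The naming limitation above can be checked before you create a task. The following is a minimal sketch, not the validation that OMS Community Edition itself performs; the function name and the choice to treat both `\n` and `\r` as line breaks are assumptions.

```python
# Characters listed in the limitation: spaces, line breaks, and |"'`()=;/&
FORBIDDEN = set(" \n\r|\"'`()=;/&")


def is_supported_name(name):
    """Return True if `name` is non-empty ASCII without the listed characters."""
    return bool(name) and name.isascii() and not (set(name) & FORBIDDEN)


# Hypothetical names, for illustration only.
for candidate in ["user_events", "user events", "orders;2024"]:
    print(candidate, is_supported_name(candidate))
```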

Considerations

To ensure the performance of a data migration task, we recommend that you migrate no more than 1,000 tables at a time.
OMS Community Edition cannot obtain the statistics of tables in an HBase database. Therefore, the progress and estimated time displayed on all pages are not actual values and are for reference only. You can calculate the amount of time required based on the displayed real-time requests per second (RPS) value and the actual number of table records.
The incremental data of an HBase database is replicated by using a peer, which is created in the source HBase database. If the incremental synchronization process of OMS Community Edition is paused for a long time, the source HBase database cannot send its incremental data to the incremental synchronization task. The unsent data accumulates and can eventually fill up the disks.
The start timestamp of incremental synchronization for a peer depends only on the time when the peer was created and cannot be specified. If the peer is deleted, you need to create a new data migration task and initialize the full migration task.
During migration in Flink mode, if you want to stop the migration process, you must stop the Flink jobs as well as the full migration and incremental synchronization tasks in OMS Community Edition.

You can find the corresponding Flink jobs based on the ID of the data migration task in the console of OMS Community Edition.
- Full migration job: OMS_BATCH_{task ID in OMS Community Edition}
- Incremental synchronization job: OMS_STREAM_{task ID in OMS Community Edition}
  
  You can also obtain information about the corresponding Flink jobs by viewing the flink_jobid file in the /home/ds/run/{component ID}/conf directory in the container of OMS Community Edition.
No statistics are provided for the incremental synchronization task that migrates incremental data from the HBase database to OBKV.
If you select only Incremental Synchronization when you create the data migration task, OMS Community Edition requires that the local incremental logs in the source database be retained for more than 48 hours.

If you select Full Migration and Incremental Synchronization when you create the data migration task, OMS Community Edition requires that the local incremental logs of the source database be retained for at least 7 days. Otherwise, the data migration task may fail or the data in the source and target databases may be inconsistent because OMS Community Edition cannot obtain incremental logs.
When you synchronize incremental data from HBase to OBKV, make sure that the host name of the server where OMS Community Edition is deployed can be pinged from the HBase server. You can configure the host name in the hosts file in the /etc directory of the HBase server.
If an HBase table uses Snappy compression, you need to copy the files in the lib/native directory of Hadoop to the /home/ds/plugins/jdbc_connector directory in the container of OMS Community Edition.

Notice

After you copy the files, run the chown -R ds:ds command on the copied files so that they are owned by the ds user.
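
Because the displayed progress is for reference only, the remaining time mentioned earlier in this section can be approximated by dividing the number of records left by the displayed RPS value. The sketch below assumes that the record count is in the same unit that the RPS value counts and that the RPS stays roughly constant; the figures are hypothetical.

```python
def estimate_remaining_seconds(total_records, migrated_records, rps):
    """Approximate remaining time as (records left) / (records per second)."""
    if rps <= 0:
        raise ValueError("rps must be positive")
    remaining = max(total_records - migrated_records, 0)
    return remaining / rps


# Hypothetical figures: 120 million records, 30 million migrated, 25,000 RPS.
seconds = estimate_remaining_seconds(120_000_000, 30_000_000, 25_000)
print(f"about {seconds / 3600:.1f} hours remaining")  # about 1.0 hours remaining
```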

Data type mappings

By default, a column family in the HBase database maps to a table schema in OBKV.

create table if not exists {TABLE_NAME} -- Maps HBase {namespace}.{table_name}${column_family}.
(
    `K` varbinary(1024) not null, -- Maps the HBase rowkey.
    `Q` varbinary(256) not null,  -- Maps the column in the HBase column family.
    `T` bigint not null,          -- Maps the HBase version/timestamp.
    `V` varbinary(1048576),       -- Maps the HBase value.
    primary key(`K`, `Q`, `T`)
)
partition by key(`K`) partitions 64;

You can specify the default table creation statement in OBKV for schema migration by modifying the value of the struct.obkv.createtable parameter.

Parameter	Description	Table creation statement
struct.obkv.createtable	On the System Parameters page, you can modify the value of this parameter to specify the default table creation statement in OBKV for schema migration in all tasks.	`create table if not exists {TABLE_NAME} (`K` varbinary(1024) not null, `Q` varbinary(256) not null, `T` bigint not null, `V` varbinary(1048576), `TTL` bigint, primary key(`K`, `Q`, `T`)) partition by key(`K`) partitions 64;`
structObkvCreatetable	In the Incremental Synchronization section on the Migration Type page, click Configuration Details. Then, you can specify the default table creation statement in OBKV for schema migration in the current task by modifying the `sink.json` file.	`create table if not exists {TABLE_NAME} (`K` varbinary(1024) not null, `Q` varbinary(256) not null, `T` bigint not null, `V` varbinary(1048576), `TTL` bigint, primary key(`K`, `Q`, `T`)) partition by key(`K`) partitions 64;`
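
To illustrate how the {TABLE_NAME} placeholder could be filled in, the sketch below renders one statement per column family from the default template in the table above. It is an illustration only, not the code that OMS Community Edition runs. The `{table_name}${column_family}` naming follows the comment in the mapping statement, and the table and column family names are hypothetical.

```python
DEFAULT_TEMPLATE = (
    "create table if not exists {TABLE_NAME} ("
    "`K` varbinary(1024) not null, "
    "`Q` varbinary(256) not null, "
    "`T` bigint not null, "
    "`V` varbinary(1048576), "
    "`TTL` bigint, "
    "primary key(`K`, `Q`, `T`)) "
    "partition by key(`K`) partitions 64;"
)


def obkv_table_name(table_name, column_family):
    """Join the HBase table name and column family as {table_name}${column_family}."""
    return f"{table_name}${column_family}"


def render_create_statements(template, table_name, column_families):
    """Return one statement per column family, replacing {TABLE_NAME} in each."""
    return [
        template.replace("{TABLE_NAME}", obkv_table_name(table_name, cf))
        for cf in column_families
    ]


# Hypothetical HBase table with two column families, so two OBKV tables.
for stmt in render_create_statements(DEFAULT_TEMPLATE, "orders", ["cf1", "cf2"]):
    print(stmt)
```

`str.replace` is used instead of `str.format` so that a customized template containing other braces would not raise an error.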

Procedure

Create a data migration task.
1. Log in to the console of OMS Community Edition.
2. In the left-side navigation pane, click Data Migration.
3. On the Data Migration page, click Create Task in the upper-right corner.

On the Select Source and Target page, configure the parameters.

Parameter	Description
Migration Task Name	We recommend that you set it to a combination of digits and letters. It must not contain any spaces and cannot exceed 64 characters in length.
Tag	Click the field and select a tag from the drop-down list. You can also click Manage Tags to create, modify, and delete tags. For more information, see Use tags to manage data migration tasks.
Source	If you have created an HBase data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create an HBase data source.
Target	If you have created an OceanBase-CE data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create an OceanBase-CE data source.
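
The naming rules for Migration Task Name can be checked with a few lines of code. This is a minimal sketch with a hypothetical function name. It separates the two hard rules (no spaces, at most 64 characters) from the recommendation to use only digits and letters.

```python
def check_task_name(name):
    """Return a list of problems; an empty list means the name looks fine."""
    problems = []
    if len(name) > 64:
        problems.append("longer than 64 characters")
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace")
    if not (name.isascii() and name.isalnum()):
        problems.append("recommended: use only digits and letters")
    return problems


# Hypothetical task names, for illustration only.
print(check_task_name("hbase2obkv01"))
print(check_task_name("hbase to obkv"))
```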

Click Next. On the Select Migration Type page, configure the parameters.

Options for Migration Type are Schema Migration, Full Migration, Incremental Synchronization, and Reverse Increment.

Migration type	Description
Schema migration	The definitions of data objects, such as tables, indexes, constraints, comments, and views, are migrated from the source database to the target database. Temporary tables are automatically filtered out.
Full migration	After a full migration task is started, OMS Community Edition migrates existing data of tables in the source database to corresponding tables in the target database.
Incremental synchronization	Changed data in the source database is synchronized to the corresponding tables in the target database after an incremental synchronization task starts. Data changes are data addition, modification, and deletion.
Reverse increment	When a reverse increment task starts, OMS Community Edition migrates the data changed in the target database after the business switchover back to the source database in real time.

Click Next. On the Select Migration Objects page, select the migration objects and migration scope.

You can select Specify Objects or Match Rules to specify the migration objects. The following procedure describes how to specify migration objects by using the Specify Objects option. For information about the procedure for specifying migration objects by using the Match Rules option, see Configure matching rules for migration objects.

Notice

The names of tables to be migrated, as well as the names of columns in the tables, must not contain Chinese characters.
If a database or table name contains double dollar signs ("$$"), you cannot create the migration task.

In the Select Migration Objects section, select Specify Objects.
In the Specify Migration Scope section, select the objects to be migrated from the Source Object(s) list. OBKV supports only tables with a single column family. Therefore, a table with multiple column families in the HBase database corresponds to multiple tables in OBKV.
Click > to add the selected objects to the Target Object(s) list.

OMS Community Edition also allows you to import objects from text, rename objects, set row filters, view column information, and remove a single migration object or all migration objects.

Note

When you select Match Rules to specify migration objects, object renaming is implemented based on the syntax of the specified matching rules. In the operation area, you can only set filter conditions. For more information, see Configure matching rules for migration objects.

Operation	Description
Import objects	In the list on the right of the Specify Migration Scope section, click Import Objects in the upper-right corner. In the dialog box that appears, click OK. (Notice: This operation overwrites previous selections. Proceed with caution.) In the Import Objects dialog box, import the objects to be migrated. You can import CSV files to rename databases/tables and set row filtering conditions. For more information, see Download and import the settings of migration objects. Click Validate. After the validation succeeds, click OK.
Rename objects	OMS Community Edition allows you to rename migration objects. For more information, see Rename a database table.
Configure settings	OMS Community Edition allows you to filter rows by using `WHERE` conditions. For more information, see Use SQL conditions to filter data. You can also view column information of the migration objects in the View Column section.
Remove one or all objects	OMS Community Edition allows you to remove a single object or all objects to be migrated to the target database during data mapping. To remove a single migration object: In the list on the right of the Specify Migration Scope section, move the pointer over the target object and click Remove. To remove all migration objects: In the list on the right of the Specify Migration Scope section, click Remove All in the upper-right corner. In the dialog box that appears, click OK.

Click Next. On the Migration Options page, configure the parameters.

If you want to view or modify the parameters of the full migration component, click Configuration Details in the upper-right corner of the Full Migration section. If you want to view or modify the parameters of the incremental synchronization or reverse increment component, click Configuration Details in the upper-right corner of the Incremental Synchronization or Reverse Increment section. For more information about the parameters, see the documentation in the Component Parameters section.

Full migration

The following parameters are displayed only if you have selected Full Migration on the Select Migration Type page.

Parameter	Description
Concurrency Speed	Valid values: Stable, Normal, Fast, and Custom. The amount of resources to be consumed by a full migration task varies based on the migration performance. If you select Custom, you can set Read Concurrency, Write Concurrency, and JVM Memory as needed.
Processing Strategy When Records Exist in Target Object	Valid values: Ignore and Stop Migration. If you select Ignore, when the data to be inserted conflicts with the existing data of a target table, OMS Community Edition retains the existing data and records the conflict data. (Notice: If you select Ignore, data is pulled in IN mode for full verification. In this case, the scenario where the target contains more data than the source cannot be verified, and the verification efficiency decreases.) If you select Stop Migration and a target table contains data, an error is returned during full migration, indicating that the migration is not allowed. In this case, you must clear the data in the target table before you can continue with the migration. (Notice: After an error is returned, if you click Resume in the dialog box, OMS Community Edition ignores this error and continues to migrate data. Proceed with caution.)
Computing Platform	The default value is `local`, which indicates the local running mode. You can also choose to run the task on the Flink computing platform. To add a computing platform, click Manage Computing Platform in the drop-down list. For more information, see Manage computing platforms.
Writing Method	Valid values: SQL (specifies to write data to tables by using `INSERT` or `REPLACE`) and Direct Load (specifies to write data through direct load). For more information about the direct load method, see Direct load.

You can specify the query method for full migration by setting the queryType parameter in the source section. Valid values are hfile and scan. The default value is hfile, which obtains the full data by reading HFiles. By default, a table flush operation is performed before the full migration starts. To disable the flush, set the flushTable parameter in the source section to false. To view or modify parameters related to full migration, click Configuration Details in the upper-right corner of the Full Migration section. For more information about the parameters, see Component parameters.

Incremental synchronization

The following parameters are displayed only if you have selected Incremental Synchronization on the Select Migration Type page.

Parameter	Description
Concurrency Speed	Valid values: Stable, Normal, Fast, and Custom. The amount of resources to be consumed by an incremental synchronization task varies based on the synchronization performance. If you select Custom, you can set Read Concurrency, Write Concurrency, and JVM Memory as needed.
Peer ID	Use the default value.
rootDir	Use the default value.
zkHost	Required. Specify the ZooKeeper configuration used by the Incr-Sync component to simulate the startup of HBase.
zkPath	Use the default value.
Computing Platform	The default value is `local`, which indicates the local running mode. You can also choose to run the task on the Flink computing platform. To add a computing platform, click Manage Computing Platform in the drop-down list. For more information, see Manage computing platforms.

By default, OMS Community Edition starts one simulated region for incremental synchronization. You can change the number by modifying the regions parameter in the source section. If the traffic of incremental data is heavy, you can specify multiple regions to accelerate incremental synchronization. To view or modify parameters related to incremental synchronization, click Configuration Details in the upper-right corner of the Incremental Synchronization section. For more information about the parameters, see Component parameters.

Reverse increment

On the Select Migration Type page, select Reverse Increment to display the following parameters.

Parameter	Description
Concurrency Speed	Valid values: Stable, Normal, Fast, and Custom. The amount of resources to be consumed by a reverse increment task varies based on the synchronization performance. If you select Custom, you can set Read Concurrency, Write Concurrency, and JVM Memory as needed.

Click Precheck to start a precheck on the data migration task.

During the precheck, OMS Community Edition checks the read and write privileges of the database users and the network connectivity of the databases. A data migration task can be started only after it passes all check items. If an error is returned during the precheck, you can perform the following operations:
- Identify and troubleshoot the problem and then perform the precheck again.
- Click Skip in the Actions column of the failed precheck item. In the dialog box that describes the consequences of skipping the item, click OK.
Click Start Task. If you do not need to start the task now, click Save to go to the details page of the data migration task. You can start the task later as needed.

OMS Community Edition allows you to modify the migration objects when the data migration task is running. For more information, see View and modify migration objects. After the data migration task is started, it is executed based on the selected migration types. For more information, see the Migration Details section in the View details of a data migration task topic.
