This topic describes how to use OceanBase Migration Service (OMS) Community Edition to migrate data from an HBase database to OBKV.
Background information
You can create a data migration task in the console of OMS Community Edition to seamlessly migrate the existing business data and incremental data from an HBase database to OBKV through schema migration, full migration, and incremental synchronization.
Prerequisites
- You have created a corresponding schema in OBKV. OMS Community Edition allows you to migrate tables and columns. You must create a corresponding schema in the target database before the migration.
- To migrate data in Flink mode, you need to create a Flink cluster of version 1.14.0.
- To use the incremental synchronization feature, you need to enable synchronous replication for HBase tables by setting the REPLICATION_SCOPE parameter to 1.
Limitations
Limitations on the source database
Do not perform DDL operations that modify database or table schemas during schema migration or full migration. Otherwise, the data migration task may be interrupted.
Only HBase 1.2.0-cdh5.15.2 and 2.4.13 are supported.
The migration of tag data and full verification are not supported.
The incremental data of the source HBase database cannot be migrated in bulkload mode during incremental synchronization.
Considerations
To ensure the performance of a data migration task, we recommend that you migrate no more than 1,000 tables at a time.
OMS Community Edition cannot obtain the statistics of tables in an HBase database. Therefore, the progress and estimated time displayed on all pages are not actual values and are for reference only. You can calculate the amount of time required based on the displayed real-time requests per second (RPS) value and the actual number of table records.
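
The manual estimate described above can be sketched as follows. This is a minimal illustration, not part of OMS Community Edition: the function name and its arguments are hypothetical, the record counts are numbers you obtain yourself, and the sketch assumes that one request corresponds to one record.

```python
def estimate_remaining_seconds(total_records: int, migrated_records: int, rps: float) -> float:
    """Estimate the seconds left in a full migration.

    total_records and migrated_records are counts you determine yourself;
    rps is the real-time RPS value shown in the console.
    Assumption: one request corresponds to one record.
    """
    if rps <= 0:
        raise ValueError("rps must be positive")
    # Never report a negative amount of remaining work.
    remaining = max(total_records - migrated_records, 0)
    return remaining / rps


# Example: 120 million records, nothing migrated yet, 20,000 RPS observed.
seconds = estimate_remaining_seconds(120_000_000, 0, 20_000)
print(f"about {seconds / 60:.0f} minutes remaining")
```

Because the RPS value fluctuates during a task, recompute the estimate from time to time rather than relying on a single reading.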
The incremental data of an HBase database is replicated by using a peer, which is created in the source HBase database. If the process for incremental synchronization of OMS Community Edition is paused for a long time, the incremental data of the source HBase database may not be sent to the incremental synchronization task of OMS Community Edition for processing. As a result, the unsent data can accumulate until the disks are full.
The start timestamp of incremental synchronization from the HBase database for a peer is related only to the time when the peer was created, and cannot be specified. If the peer is deleted, you need to create a new data migration task and initialize the full migration task.
During migration in Flink mode, if you want to stop the migration process, you must stop the Flink jobs as well as the full migration and incremental synchronization tasks in OMS Community Edition.
You can find the corresponding Flink jobs based on the ID of the data migration task in the console of OMS Community Edition.
Full migration job: OMS_BATCH_{task ID in OMS Community Edition}
Incremental synchronization job: OMS_STREAM_{task ID in OMS Community Edition}
You can also obtain information about the corresponding Flink jobs by viewing the flink_jobid file in the /home/ds/run/{component ID}/conf directory in the container of OMS Community Edition.
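
The naming pattern above can be applied mechanically when you look for the jobs to stop. The following is a small sketch; the helper names and the sample task ID are hypothetical, and the list of job names is assumed to come from whatever Flink tooling you already use.

```python
def flink_job_names(task_id: str) -> dict:
    """Build the Flink job names for an OMS data migration task ID."""
    return {
        "full_migration": f"OMS_BATCH_{task_id}",
        "incremental_sync": f"OMS_STREAM_{task_id}",
    }


def jobs_for_task(running_job_names: list, task_id: str) -> list:
    """Return the running Flink jobs that belong to the given task."""
    wanted = set(flink_job_names(task_id).values())
    return [name for name in running_job_names if name in wanted]


# "np_example" is a made-up task ID used only for illustration.
running = ["OMS_BATCH_np_example", "OMS_STREAM_np_other", "unrelated_job"]
print(jobs_for_task(running, "np_example"))
```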
No statistics are provided for the incremental synchronization task that migrates incremental data from the HBase database to OBKV.
If you select only Incremental Synchronization when you create the data migration task, OMS Community Edition requires that the local incremental logs in the source database be retained for more than 48 hours.
If you select Full Migration and Incremental Synchronization when you create the data migration task, OMS Community Edition requires that the local incremental logs of the source database be retained for at least 7 days. Otherwise, the data migration task may fail or the data in the source and target databases may be inconsistent because OMS Community Edition cannot obtain incremental logs.
When you synchronize incremental data from HBase to OBKV, make sure that the host name of the server where OMS Community Edition is deployed can be successfully pinged from the HBase server. You can configure the host name in the hosts file in the /etc directory of the HBase server.
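
Before testing connectivity, it can help to confirm that the hosts file on the HBase server actually contains an entry for the OMS host. The sketch below only parses hosts-file text; it does not replace the ping test. The function name, the host name oms-host-01, and the IP addresses are hypothetical.

```python
def hosts_file_maps(hosts_text: str, hostname: str) -> bool:
    """Return True if hosts-file text contains an entry for hostname."""
    target = hostname.lower()
    for raw_line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the IP address; the rest are host names and aliases.
        if target in (name.lower() for name in fields[1:]):
            return True
    return False


sample = """
127.0.0.1   localhost
10.0.0.5    oms-host-01   # server where OMS Community Edition runs
"""
print(hosts_file_maps(sample, "oms-host-01"))
```

On the HBase server, you could pass the contents of the hosts file in the /etc directory to this function instead of the sample string.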
Data type mappings
By default, a column family in the HBase database maps to a table schema in OBKV.
create table if not exists {TABLE_NAME} -- Maps HBase {namespace}.{table_name}${column_family}.
(
`K` varbinary(1024) not null, -- Maps the HBase rowkey.
`Q` varbinary(256) not null, -- Maps the column in the HBase column family.
`T` bigint not null, -- Maps the HBase version/timestamp.
`V` varbinary(1048576), -- Maps the HBase value.
primary key(`K`, `Q`, `T`))
partition by key(`K`) partitions 64
You can specify the default table creation statement in OBKV for schema migration by modifying the value of the struct.obkv.createtable parameter.
| Parameter | Description | Table creation statement |
|---|---|---|
| struct.obkv.createtable | On the | create table if not exists {TABLE_NAME} (`K` varbinary(1024) not null, `Q` varbinary(256) not null, `T` bigint not null, `V` varbinary(1048576), primary key(`K`, `Q`, `T`)) partition by key(`K`) partitions 64 |
| structObkvCreatetable | In the sink.json file. | create table if not exists {TABLE_NAME} (`K` varbinary(1024) not null, `Q` varbinary(256) not null, `T` bigint not null, `V` varbinary(1048576), primary key(`K`, `Q`, `T`)) partition by key(`K`) partitions 64 |
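
To preview the statements that the default template would produce, you can expand the {TABLE_NAME} placeholder once per column family. The following sketch is illustrative only. It assumes that the OBKV table name is the HBase table name and the column family joined by a dollar sign, as the comment in the default statement suggests, and that the name is quoted with backticks. Namespace handling is left out, and the helper names are hypothetical.

```python
DEFAULT_TEMPLATE = (
    "create table if not exists {TABLE_NAME} ("
    "`K` varbinary(1024) not null, "
    "`Q` varbinary(256) not null, "
    "`T` bigint not null, "
    "`V` varbinary(1048576), "
    "primary key(`K`, `Q`, `T`)) "
    "partition by key(`K`) partitions 64"
)


def obkv_table_name(hbase_table: str, column_family: str) -> str:
    """Assumed naming rule: {table_name}${column_family}."""
    return f"{hbase_table}${column_family}"


def render_create_table(template: str, hbase_table: str, column_family: str) -> str:
    """Fill the {TABLE_NAME} placeholder for one column family."""
    name = obkv_table_name(hbase_table, column_family)
    # str.replace is used so that other braces in a custom template are left alone.
    return template.replace("{TABLE_NAME}", f"`{name}`")


# An HBase table with two column families yields two OBKV tables.
for family in ["cf1", "cf2"]:
    print(render_create_table(DEFAULT_TEMPLATE, "orders", family))
```

If you change the value of struct.obkv.createtable, pass the modified statement as the template to check the result before you start schema migration.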
Procedure
Create a data migration task.
Log in to the console of OMS Community Edition.
In the left-side navigation pane, click Data Migration.
On the Data Migration page, click Create Migration Task in the upper-right corner.
On the Select Source and Target page, configure the parameters.
| Parameter | Description |
|---|---|
| Data Migration Task Name | We recommend that you set it to a combination of digits and letters. It must not contain any spaces and cannot exceed 64 characters in length. |
| Tag | Click the field and select a tag from the drop-down list. You can also click Manage Tags to create, modify, and delete tags. For more information, see Use tags to manage data migration tasks. |
| Source | If you have created an HBase data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create an HBase data source. |
| Target | If you have created an OceanBase-CE data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create an OceanBase-CE data source. |
Click Next. On the Select Migration Type page, configure the parameters.
Options for Migration Type are Schema Migration, Full Migration, Incremental Synchronization, and Reverse Incremental Migration.
| Migration type | Description |
|---|---|
| Schema migration | The definitions of data objects, such as tables, indexes, constraints, comments, and views, are migrated from the source database to the target database. Temporary tables are automatically filtered out. |
| Full migration | After a full migration task is started, OMS Community Edition migrates existing data of tables in the source database to corresponding tables in the target database. |
| Incremental synchronization | Changed data in the source database is synchronized to the corresponding tables in the target database after an incremental synchronization task starts. Data changes are data addition, modification, and deletion. |
| Reverse incremental migration | When a reverse incremental migration task starts, OMS migrates the data changed in the target database after the business switchover back to the source database in real time. |
Click Next. On the Select Migration Objects page, select the migration objects and migration scope.
You can select Specify Objects or Match Rules to specify the migration objects. The following procedure describes how to specify migration objects by using the Specify Objects option. For information about the procedure for specifying migration objects by using the Match Rules option, see Configure matching rules for migration objects.
Notice
The names of tables to be migrated, as well as the names of columns in the tables, must not contain Chinese characters.
If a database or table name contains double dollar signs ("$$"), you cannot create the migration task.
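
You can screen table names against these two restrictions before you create the task. The sketch below is a rough pre-check, not the validation that OMS Community Edition performs. The function name is hypothetical, and the Chinese-character test only covers the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which is an approximation.

```python
import re

# Approximation: basic CJK Unified Ideographs block only.
_CHINESE = re.compile(r"[\u4e00-\u9fff]")


def check_table_name(name: str) -> list:
    """Return a list of reasons why a table name would be rejected."""
    problems = []
    if _CHINESE.search(name):
        problems.append("contains Chinese characters")
    if "$$" in name:
        problems.append('contains "$$"')
    return problems


for candidate in ["orders", "ord$$ers", "订单"]:
    print(candidate, check_table_name(candidate))
```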
In the Select Migration Objects section, select Specify Objects.
In the Specify Migration Scope section, select the objects to be migrated from the Source Object(s) list. OBKV supports only tables with a single column family. Therefore, a table with multiple column families in the HBase database corresponds to multiple tables in OBKV.
Click > to add the selected objects to the Target Object(s) list.
OMS Community Edition also allows you to import objects from text, rename objects, set row filters, view column information, and remove a single migration object or all migration objects.
Note
When you select Match Rules to specify migration objects, object renaming is implemented based on the syntax of the specified matching rules. In the operation area, you can only set filter conditions. For more information, see Configure matching rules for migration objects.
| Operation | Description |
|---|---|
| Import objects | 1. In the list on the right of the Specify Migration Scope section, click Import Objects in the upper-right corner. 2. In the dialog box that appears, click OK. Notice: This operation will overwrite previous selections. Proceed with caution. 3. In the Import Objects dialog box, import the objects to be migrated. You can import CSV files to rename databases/tables and set row filtering conditions. For more information, see Download and import the settings of migration objects. 4. Click Validate. 5. After the validation succeeds, click OK. |
| Rename objects | OMS Community Edition allows you to rename migration objects. For more information, see Rename a database table. |
| Configure settings | OMS Community Edition allows you to filter rows by using WHERE conditions. For more information, see Use SQL conditions to filter data. You can also view column information of the migration objects in the View Column section. |
| Remove one or all objects | OMS Community Edition allows you to remove a single object or all objects to be migrated to the target database during data mapping. To remove a single migration object: In the list on the right of the Specify Migration Scope section, move the pointer over the target object and click Remove. To remove all migration objects: In the list on the right of the Specify Migration Scope section, click Remove All in the upper-right corner. In the dialog box that appears, click OK. |
Click Next. On the Migration Options page, configure the parameters.
Full migration
The following parameters are displayed only if you have selected Full Migration on the Select Migration Type page.
| Parameter | Description |
|---|---|
| Concurrency Speed | Valid values: Stable, Normal, Fast, and Custom. The amount of resources to be consumed by a full migration task varies based on the migration performance. If you select Custom, you can set Read Concurrency, Write Concurrency, and JVM Memory as needed. |
| Handle Non-empty Tables in Target Database | Valid values: Ignore and Stop Migration. If you select Ignore, when the data to be inserted conflicts with the existing data of a target table, OMS Community Edition retains the existing data and records the conflict data. Notice: If you select Ignore, data is pulled in IN mode for full verification. In this case, the scenario where the target contains more data than the source cannot be verified, and the verification efficiency will be decreased. If you select Stop Migration and a target table contains data, an error is returned during full migration, indicating that the migration is not allowed. In this case, you must clear the data in the target table before you can continue with the migration. Notice: After an error is returned, if you click Resume in the dialog box, OMS Community Edition ignores this error and continues to migrate data. Proceed with caution. |
| Processing Strategy When Target Table Has Records | Valid values: Ignore and Stop Migration. If you select Ignore, when the data to be inserted conflicts with the existing data of a target table, OMS Community Edition retains the existing data and records the conflict data. Notice: If you select Ignore, data is pulled in IN mode for full verification. In this case, the scenario where the target contains more data than the source cannot be verified, and the verification efficiency will be decreased. If you select Stop Migration and a target table contains data, an error is returned during full migration, indicating that the migration is not allowed. In this case, you must clear the data in the target table before you can continue with the migration. Notice: After an error is returned, if you click Resume in the dialog box, OMS Community Edition ignores this error and continues to migrate data. Proceed with caution. |
| Computing Platform | The default value is local, which indicates the local running mode. You can also choose to run the task on the Flink computing platform. To add a computing platform, click Manage Computing Platform in the drop-down list. For more information, see Manage computing platforms. |
| Writing Method | Valid values: SQL (specifies to write data to tables by using INSERT or REPLACE) and Direct Load (specifies to write data through direct load). For more information about the direct load method, see Direct load. |
You can specify the query method for full migration by setting the queryType parameter in the source section. Valid values are hfile and scan. The default value is hfile, which specifies that the full data is obtained by reading HFiles.
By default, a table flush operation is performed before the full migration starts. To disable the operation, set the flushTable parameter in the source section to false.
To view or modify parameters related to full migration, click Configuration Details in the upper-right corner of the Full Migration section. For more information about the parameters, see Component parameters.
Incremental synchronization
The following parameters are displayed only if you have selected Incremental Synchronization on the Select Migration Type page.
| Parameter | Description |
|---|---|
| Concurrency Speed | Valid values: Stable, Normal, Fast, and Custom. The amount of resources to be consumed by an incremental synchronization task varies based on the synchronization performance. If you select Custom, you can set Read Concurrency, Write Concurrency, and JVM Memory as needed. |
| Peer ID | Use the default value. |
| rootDir | Use the default value. |
| zkHost | Required. Specify the ZooKeeper configuration used by the Incr-Sync component to simulate the startup of HBase. |
| zkPath | Use the default value. |
| Computing Platform | The default value is local, which indicates the local running mode. You can also choose to run the task on the Flink computing platform. To add a computing platform, click Manage Computing Platform in the drop-down list. For more information, see Manage computing platforms. |
By default, OMS Community Edition starts one simulated region for incremental synchronization. You can change the number by modifying the regions parameter in the source section. If the traffic of incremental data is heavy, you can specify multiple regions to accelerate incremental synchronization.
To view or modify parameters related to incremental synchronization, click Configuration Details in the upper-right corner of the Incremental Synchronization section. For more information about the parameters, see Component parameters.
Click Precheck to start a precheck on the data migration task.
During the precheck, OMS Community Edition checks the read and write privileges of the database users and the network connectivity of the databases. A data migration task can be started only after it passes all check items. If an error is returned during the precheck, you can perform the following operations:
- Identify and troubleshoot the problem and then perform the precheck again.
- Click Skip in the Actions column of the failed precheck item. In the dialog box that prompts the consequences of the operation, click OK.
Click Start Task. If you do not need to start the task now, click Save to go to the details page of the data migration task. You can start the task later as needed.
OMS Community Edition allows you to modify the migration objects when the data migration task is running. For more information, see View and modify migration objects. After the data migration task is started, it is executed based on the selected migration types. For more information, see the Migration Details section in the View details of a data migration task topic.