This topic describes the components of OceanBase Migration Service (OMS).
DBCat
DBCat is the core schema migration component of OMS. As the native schema conversion engine of OceanBase Database, DBCat accurately maps or converts schemas based on the source data types, the destination data types, and the character encodings. It can convert and migrate database objects such as tables, constraints, indexes, and views.
In addition, DBCat maximizes compatibility with the tenant types of OceanBase Database. For example, if a version of OceanBase Database does not support some data types of the source database, DBCat selects the closest compatible data types for conversion and mapping.
The following problems may occur when you migrate data between heterogeneous databases. In these cases, you can use DBCat to automatically convert the schema of the source database:
The database engines in the source and destination databases are different.
The definitions of data objects vary greatly between databases.
The destination database does not support some data objects in the source database.
The precision settings of the same data type do not match.
In a complex data migration between heterogeneous databases, you can use DBCat to generate basic scripts, which the database administrator (DBA) then manually adjusts to produce the schema in the destination database.
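For illustration, the following sketch shows the kind of type mapping such a conversion involves. The mapping table and the fallback type are simplified assumptions for this example, not DBCat's actual conversion rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a minimal type-mapping table in the spirit of DBCat.
// The mappings below are simplified examples, not DBCat's actual rules.
public class TypeMappingSketch {
    private static final Map<String, String> ORACLE_TO_OCEANBASE_MYSQL = new LinkedHashMap<>();
    static {
        ORACLE_TO_OCEANBASE_MYSQL.put("VARCHAR2", "VARCHAR");
        ORACLE_TO_OCEANBASE_MYSQL.put("NUMBER", "DECIMAL");
        ORACLE_TO_OCEANBASE_MYSQL.put("DATE", "DATETIME");
        ORACLE_TO_OCEANBASE_MYSQL.put("CLOB", "LONGTEXT");
    }

    // Fall back to the most similar compatible type when no exact match exists.
    static String convert(String sourceType) {
        return ORACLE_TO_OCEANBASE_MYSQL.getOrDefault(sourceType, "VARCHAR(4096)"); // hypothetical fallback
    }

    public static void main(String[] args) {
        System.out.println("VARCHAR2 -> " + convert("VARCHAR2")); // VARCHAR
        System.out.println("XMLTYPE  -> " + convert("XMLTYPE"));  // falls back
    }
}
```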
Checker-Full and Checker-Verify
The Checker component consists of the Checker-Full and Checker-Verify components. The Checker-Full component migrates the existing (inventory) data from the source database, and the Checker-Verify component verifies all fields after the migration. For flexible extension and reuse, the Checker component is divided, from the bottom up, into the Reader, Writer, and Broker modules and a unified data model layer.
Reader: reads data from the source database. Each supported database type has a corresponding Reader plug-in. The Reader plug-in converts the records it reads into the unified data model and places them in the Broker module for other modules to consume.
Writer: subscribes to the records of a table from the Broker module, converts the records from the unified data model into INSERT statements suited to the downstream database based on the type of the Writer plug-in, and writes them to the downstream database.
Broker: decouples the Reader, Writer, and other modules to improve performance. After decoupling, upstream and downstream modules are independent of each other, which facilitates maintenance and extension.
Unified data model layer: provides a unified data model. Data read by the Reader module must be converted into the unified data model before it is written to the Broker module, and records obtained from the Broker module must be converted into objects or statements suited to the downstream database.
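The following minimal sketch illustrates how a Broker, modeled here as a bounded queue, can decouple a Reader and a Writer that exchange records in a unified data model. The Row type and the SQL generation are hypothetical simplifications, not OMS's internal classes.

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: a Reader and a Writer decoupled by a Broker (bounded queue),
// exchanging records in a simplified unified data model.
public class PipelineSketch {
    record Row(String table, Map<String, Object> columns) {} // unified data model (simplified)

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Row> broker = new ArrayBlockingQueue<>(1024); // Broker: bounded buffer

        Thread reader = new Thread(() -> { // Reader: converts source records into the unified model
            for (int i = 0; i < 3; i++) {
                try {
                    broker.put(new Row("t_order", Map.of("id", i, "amount", i * 10)));
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); return; }
            }
        });

        Thread writer = new Thread(() -> { // Writer: turns unified records into INSERT statements
            for (int i = 0; i < 3; i++) {
                try {
                    Row row = broker.take();
                    System.out.println("INSERT INTO " + row.table() + " VALUES " + row.columns());
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); return; }
            }
        });

        reader.start(); writer.start();
        reader.join(); writer.join();
    }
}
```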
With the preceding underlying modules, OMS implements data migration, verification, and correction.
To migrate data, the Checker-Full component creates a Reader > Broker > Writer path for each table to be migrated, and the upper-layer migration program schedules a migration task for each table. Multiple tables can be migrated concurrently, and within each table the Reader and Writer modules run simultaneously.
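A minimal scheduling sketch of this table-level concurrency follows; the migrateTable() placeholder stands in for wiring up one Reader > Broker > Writer pipeline.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: one migration task per table, several tables in flight at once.
public class SchedulerSketch {
    static void migrateTable(String table) {
        // A real pipeline would wire a Reader, a Broker queue, and a Writer here.
        System.out.println("migrating " + table + " on " + Thread.currentThread().getName());
    }

    public static void main(String[] args) {
        List<String> tables = List.of("t_order", "t_user", "t_item");
        ExecutorService pool = Executors.newFixedThreadPool(2); // table-level concurrency
        for (String table : tables) {
            pool.submit(() -> migrateTable(table));
        }
        pool.shutdown();
    }
}
```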
To perform full verification and correction, the Checker-Verify component creates the SrcReader > Broker > DstReader path and the Broker > Verifier path for each table to be verified.
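The sketch below illustrates the field-by-field comparison the Verifier performs on a pair of rows; the row representation is a simplified assumption, and real verification streams rows from the SrcReader and DstReader through the Broker.

```java
import java.util.Map;
import java.util.Objects;

// Illustrative only: field-by-field comparison in the spirit of the Verifier.
public class VerifySketch {
    static boolean rowsMatch(Map<String, Object> src, Map<String, Object> dst) {
        if (!src.keySet().equals(dst.keySet())) return false; // column sets differ
        return src.entrySet().stream()
                  .allMatch(e -> Objects.equals(e.getValue(), dst.get(e.getKey())));
    }

    public static void main(String[] args) {
        Map<String, Object> src = Map.of("id", 1, "amount", 10);
        Map<String, Object> dst = Map.of("id", 1, "amount", 11);
        System.out.println(rowsMatch(src, dst)); // false -> record the row for correction
    }
}
```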
Store
The Store component pulls incremental logs in different ways for different types of databases. For example, for OceanBase Database, the Store component is implemented based on Liboblog.
Liboblog is an incremental data synchronization tool for OceanBase Database. It pulls redo logs from each partition of an OceanBase database over remote procedure calls (RPCs), converts the logs into an intermediate data format based on the schema information of each table and column, and outputs the changed data as transactions.
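The sketch below shows one plausible shape for such an intermediate format: change records grouped into a committed transaction. These types are illustrative assumptions, not Liboblog's actual output structures.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical intermediate format for change records,
// grouped into a transaction the way a log-pulling component might emit them.
public class ChangeRecordSketch {
    enum Op { INSERT, UPDATE, DELETE }

    record ChangeRecord(Op op, String table, Map<String, Object> before, Map<String, Object> after) {}
    record Txn(long commitVersion, List<ChangeRecord> records) {} // emitted only after commit

    public static void main(String[] args) {
        Txn txn = new Txn(42L, List.of(
            new ChangeRecord(Op.INSERT, "t_order", Map.of(), Map.of("id", 1, "amount", 10)),
            new ChangeRecord(Op.UPDATE, "t_order", Map.of("amount", 10), Map.of("amount", 20))));
        txn.records().forEach(r -> System.out.println(r.op() + " " + r.table() + " -> " + r.after()));
    }
}
```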
Real-time synchronization components
Two real-time synchronization components are provided: JDBCWriter and Connector.
The JDBCWriter component pulls incremental data from the Store component, translates it into INSERT, UPDATE, DELETE, or other SQL statements, and writes the data to the destination database. The Store component records streaming incremental data and preserves data ordering through pipelining. A Writer that executes transactions sequentially on a single thread meets basic requirements but cannot scale in performance, so OMS introduces a concurrent write mechanism.
To ensure data consistency while improving synchronization performance, OMS uses a conflict matrix mechanism to implement out-of-order concurrent writes while guaranteeing the eventual consistency of each transaction.
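The sketch below illustrates the idea behind such a mechanism: two transactions may be applied in parallel only if the row keys they touch do not intersect; otherwise they must be applied in order. This is a simplified model, not OMS's actual conflict matrix implementation.

```java
import java.util.Collections;
import java.util.Set;

// Illustrative only: the conflict check behind out-of-order concurrent writes.
public class ConflictSketch {
    record Txn(long seq, Set<String> rowKeys) {}

    static boolean conflicts(Txn a, Txn b) {
        return !Collections.disjoint(a.rowKeys(), b.rowKeys()); // shared row => conflict
    }

    public static void main(String[] args) {
        Txn t1 = new Txn(1, Set.of("t_order:1"));
        Txn t2 = new Txn(2, Set.of("t_order:2")); // no overlap with t1 -> can run in parallel
        Txn t3 = new Txn(3, Set.of("t_order:1")); // overlaps t1 -> must wait for t1
        System.out.println("t1/t2 conflict: " + conflicts(t1, t2)); // false
        System.out.println("t1/t3 conflict: " + conflicts(t1, t3)); // true
    }
}
```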
To prevent circular replication, the data written by the OMS JDBCWriter component is tagged in the Store component, and tagged data is not consumed by other components.
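A minimal sketch of that filtering idea follows; the tag name and record structure are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: skip records tagged by the writer so they are not
// replicated back to their origin.
public class LoopFilterSketch {
    record ChangeRecord(Map<String, String> tags, String payload) {}

    static boolean writtenByOms(ChangeRecord r) {
        return "oms-jdbcwriter".equals(r.tags().get("source")); // hypothetical tag
    }

    public static void main(String[] args) {
        List<ChangeRecord> records = List.of(
            new ChangeRecord(Map.of("source", "app"), "row-1"),
            new ChangeRecord(Map.of("source", "oms-jdbcwriter"), "row-2"));
        records.stream().filter(r -> !writtenByOms(r))
               .forEach(r -> System.out.println("replicate " + r.payload())); // only row-1
    }
}
```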
The Connector component provides source and sink plug-ins that integrate the features of the JDBCWriter component. For example, when data is synchronized from OceanBase Database to Kafka, OB-Store-Source is the source plug-in, and Kafka-Sink is the sink plug-in.
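The following sketch shows the general source/sink plug-in shape such a pipeline implies. The interfaces are hypothetical stand-ins, not the Connector component's actual API.

```java
import java.util.List;

// Illustrative only: the source/sink plug-in shape of a Connector-style pipeline.
public class ConnectorSketch {
    interface Source { List<String> poll(); }
    interface Sink { void write(List<String> records); }

    public static void main(String[] args) {
        Source source = () -> List.of("{\"table\":\"t_order\",\"op\":\"INSERT\"}"); // stands in for OB-Store-Source
        Sink sink = records -> records.forEach(System.out::println);                // stands in for Kafka-Sink
        sink.write(source.poll()); // one poll-write cycle of the pipeline
    }
}
```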
The Connector component provides the following benefits:
Achieves high scalability and implements synchronization between the source and the sink.
Provides unified resource management, monitoring, and O&M for synchronization tasks.
Serves as a unified intermediate layer that structures the records from different sources so that they can be filtered and transformed.