DataX is an offline data synchronization tool and platform widely used within Alibaba Group. It efficiently synchronizes data between heterogeneous data sources such as MySQL, Oracle, HDFS, Hive, OceanBase Database, HBase, OTS, and ODPS. This topic describes how to use the OceanBase Database reader and writer plug-ins, which build on the DataX synchronization mechanism, to migrate data between OceanBase databases and clusters and between OceanBase Database and heterogeneous databases.
Framework design

DataX is an offline data synchronization framework that is designed based on the "framework + plug-in" architecture. Data source reads and writes are abstracted as the reader and writer plug-ins and are integrated into the entire framework.
The reader plug-in is a data collection module that collects data from a data source and sends the data to the framework.
The writer plug-in is a data write module that retrieves data from the framework and writes the data to the destination.
The framework builds a data transmission channel that connects the reader to the writer, and handles core technical concerns such as caching, throttling, concurrency, and data conversion.
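The division of roles described above can be illustrated with a small Python sketch. This is not DataX code (DataX itself is written in Java); the names `reader`, `writer`, and `run_job` are hypothetical, and a bounded queue stands in for the framework's transmission channel.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the record stream


def reader(source_rows, channel):
    """Reader role: collect rows from the source and send them to the framework."""
    for row in source_rows:
        channel.put(row)  # blocks while the channel is full (simple back-pressure)
    channel.put(_DONE)


def writer(channel, destination):
    """Writer role: retrieve rows from the framework and write them to the destination."""
    while True:
        row = channel.get()
        if row is _DONE:
            break
        destination.append(row)


def run_job(source_rows, capacity=2):
    """Framework role: build the channel and connect one reader to one writer."""
    channel = queue.Queue(maxsize=capacity)  # bounded buffer plays the role of the cache
    destination = []
    threads = [
        threading.Thread(target=reader, args=(source_rows, channel)),
        threading.Thread(target=writer, args=(channel, destination)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return destination
```

With a single reader and a single writer, rows arrive in the order they were read. The real framework additionally runs several channels concurrently and applies throttling and type conversion, which this sketch leaves out.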
Examples
Use DataX to migrate MySQL data to OceanBase Database
If the source and destination databases cannot connect to the DataX server at the same time, you can export the MySQL data as CSV files and then import the CSV files into the OceanBase database. If the source and destination databases can connect to the DataX server at the same time, you can use DataX to migrate the MySQL data to the OceanBase database. The following example shows the content of the configuration file:
{
  "job": {
    "setting": {
      "speed": {
        "channel": 4
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.1
      }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "tpcc",
            "password": "********",
            "column": [
              "*"
            ],
            "connection": [
              {
                "table": [
                  "bmsql_oorder"
                ],
                "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/tpccdb?useUnicode=true&characterEncoding=utf8"]
              }
            ]
          }
        },
        "writer": {
          "name": "oceanbasev10writer",
          "parameter": {
            "obWriteMode": "insert",
            "column": [
              "*"
            ],
            "preSql": [
              "truncate table bmsql_oorder"
            ],
            "connection": [
              {
                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                "table": [
                  "bmsql_oorder"
                ]
              }
            ],
            "username": "tpcc",
            "password": "********",
            "writerThreadCount": 10,
            "batchSize": 1000,
            "memstoreThreshold": "0.9"
          }
        }
      }
    ]
  }
}
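A job file like the one above can also be generated from a script, which helps when many tables share the same settings. The following Python sketch assembles the same `job` layout and builds the command line for the `datax.py` launcher in the DataX `bin` directory. The helper names and the `/opt/datax` installation path are assumptions made for illustration.

```python
import json


def build_job(reader, writer, channel=4):
    """Assemble a job dict with the same layout as the configuration file above."""
    return {
        "job": {
            "setting": {
                "speed": {"channel": channel},
                "errorLimit": {"record": 0, "percentage": 0.1},
            },
            "content": [{"reader": reader, "writer": writer}],
        }
    }


def save_job(job, path):
    """Write the job dict to a JSON file that DataX can read."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(job, f, indent=2)


def datax_command(job_path, datax_home="/opt/datax"):
    """Return the argv list for launching the job.

    datax_home is a hypothetical installation path; adjust it to your environment.
    """
    return ["python", f"{datax_home}/bin/datax.py", job_path]
```

The returned list can be passed to `subprocess.run` on the DataX server once the job file has been saved.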
Use DataX to migrate data from an OceanBase database to a MySQL or Oracle database
Synchronize data from an OceanBase database to a MySQL database.
The following example shows the content of the configuration file:
{
  "job": {
    "setting": {
      "speed": {
        "channel": 16
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.1
      }
    },
    "content": [
      {
        "reader": {
          "name": "oceanbasev10reader",
          "parameter": {
            "where": "",
            "readBatchSize": 10000,
            "column": [
              "*"
            ],
            "connection": [
              {
                "jdbcUrl": ["||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc"],
                "table": [
                  "bmsql_oorder"
                ]
              }
            ],
            "username": "tpcc",
            "password": "********"
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "writeMode": "replace",
            "username": "tpcc",
            "password": "******",
            "column": [
              "*"
            ],
            "session": [
              "set session sql_mode='ANSI'"
            ],
            "preSql": [
              "truncate table bmsql_oorder"
            ],
            "batchSize": 512,
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/tpccdb?useUnicode=true&characterEncoding=utf8",
                "table": [
                  "bmsql_oorder"
                ]
              }
            ]
          }
        }
      }
    ]
  }
}
Synchronize data from an OceanBase database to an Oracle database.
The following example shows the content of the configuration file:
{
  "job": {
    "setting": {
      "speed": {
        "channel": 16
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.1
      }
    },
    "content": [
      {
        "reader": {
          "name": "oceanbasev10reader",
          "parameter": {
            "where": "",
            "readBatchSize": 10000,
            "column": [
              "*"
            ],
            "connection": [
              {
                "jdbcUrl": ["||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc"],
                "table": [
                  "bmsql_oorder"
                ]
              }
            ],
            "username": "tpcc",
            "password": "********"
          }
        },
        "writer": {
          "name": "oraclewriter",
          "parameter": {
            "username": "tpcc",
            "password": "********",
            "column": [
              "*"
            ],
            "preSql": [
              "truncate table bmsql_oorder"
            ],
            "batchSize": 512,
            "connection": [
              {
                "jdbcUrl": "jdbc:oracle:thin:@127.0.0.1:1521:helowin",
                "table": [
                  "bmsql_oorder"
                ]
              }
            ]
          }
        }
      }
    ]
  }
}
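The OceanBase `jdbcUrl` values in these examples carry a prefix wrapped in `||_dsc_ob10_dsc_||` markers. Reading `obdemo:oboracle` as a cluster name and a tenant name is an interpretation of the example values, not something the text above states. Under that assumption, the following Python sketch builds and splits such URLs; both function names are hypothetical.

```python
OB_MARKER = "||_dsc_ob10_dsc_||"


def ob_jdbc_url(cluster, tenant, host, port, database):
    """Build a URL in the same shape as the OceanBase jdbcUrl values above.

    Treating the prefix as "cluster:tenant" is an assumption.
    """
    return (
        f"{OB_MARKER}{cluster}:{tenant}{OB_MARKER}"
        f"jdbc:oceanbase://{host}:{port}/{database}"
    )


def split_ob_jdbc_url(url):
    """Split a prefixed URL into (cluster, tenant, plain_jdbc_url)."""
    parts = url.split(OB_MARKER)
    # A well-formed value starts with the marker, so the first piece is empty.
    if len(parts) != 3 or parts[0] != "":
        raise ValueError("URL does not have the expected marker-delimited prefix")
    cluster, _, tenant = parts[1].partition(":")
    return cluster, tenant, parts[2]
```

For example, `ob_jdbc_url("obdemo", "oboracle", "127.0.0.1", 2883, "tpcc")` reproduces the reader URL used in the two configuration files above.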
Parameters
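| Parameter | Description |
|---|---|
| jdbcUrl | The JDBC connection URL of the database. In the examples in this topic, readers take a list of URLs and writers take a single URL. |
| username | The username used to log in to the database. |
| password | The password of the specified user. |
| table | The tables to be synchronized. |
| column | The columns to be synchronized. An asterisk (*) indicates all columns. |
| where | The filter condition applied when data is read. An empty string means no filtering. |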
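As a rough mental model, the reader's `table`, `column`, and `where` parameters combine into a query of the form shown in the Python sketch below. This is only a conceptual illustration, not the plug-in's actual code, and `build_select` is a hypothetical helper.

```python
def build_select(table, columns, where=""):
    """Combine table, column, and where settings into a SELECT statement.

    Conceptual only: the real plug-in may split, quote, or batch queries differently.
    """
    column_list = ", ".join(columns)
    sql = f"SELECT {column_list} FROM {table}"
    if where.strip():  # an empty where value means no filter
        sql += f" WHERE {where}"
    return sql
```

For instance, `build_select("bmsql_oorder", ["*"])` corresponds to the reader settings used in the examples in this topic, where `where` is left empty.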
More information
For more information about DataX, see DataX.