#docslug#/oceanbase-database/oceanbase-database/V3.2.2/use-datax-to-migrate-data-1 DataX is an offline data synchronization tool/platform widely used within Alibaba Group. It enables efficient data synchronization between various heterogeneous data sources such as MySQL, Oracle, HDFS, Hive, OceanBase, HBase, OTS, and ODPS. This topic shows you how to migrate data across Oceanbase databases and clusters and heterogeneous databases by using the reader and writer plug-ins of OceanBase Database based on the synchronization mechanism of DataX.
Framework design

DataX is an offline data synchronization framework that is designed based on the "framework + plug-in" architecture. Data source reads and writes are abstracted as the reader and writer plug-ins and are integrated into the entire framework.
The reader plug-in is a data collection module that collects data from a data source and sends the data to the framework.
The writer plug-in is a data write module that retrieves data from the framework and writes the data to the destination.
The framework builds a data transmission channel to connect the reader and the writer and processes core technical issues such as caching, throttling, concurrency, and data conversion.
Examples
Use DataX to migrate MySQL data to OceanBase Database
If the source and destination databases cannot connect to the DataX server at the same time, you can export the MySQL data as CSV files and then import the CSV files into the OceanBase database. If the source and destination databases can connect to the DataX server at the same time, you can use DataX to migrate the MySQL data to the OceanBase database. The following example shows the content of the configuration file:
{
"job": {
"setting": {
"speed": {
"channel": 4
},
"errorLimit": {
"record": 0,
"percentage": 0.1
}
},
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"username": "tpcc",
"password": "********",
"column": [
"*"
],
"connection": [
{
"table": [
"bmsql_oorder"
],
"jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/tpccdb?useUnicode=true&characterEncoding=utf8"]
}
]
}
},
"writer": {
"name": "oceanbasev10writer",
"parameter": {
"obWriteMode": "insert",
"column": [
"*"
],
"preSql": [
"truncate table bmsql_oorder"
],
"connection": [
{
"jdbcUrl": "||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
"table": [
"bmsql_oorder"
]
}
],
"username": "tpcc",
"password":"********",
"writerThreadCount":10,
"batchSize": 1000,
"memstoreThreshold": "0.9"
}
}
}
]
}
}
Use DataX to migrate data of OceanBase Database to MySQL or Oracle Database
Synchronize data of an OceanBase database to a MySQL database
The following example shows the content of the configuration file:
{ "job": { "setting": { "speed": { "channel": 16 }, "errorLimit": { "record": 0, "percentage": 0.1 } }, "content": [ { "reader": { "name": "oceanbasev10reader", "parameter": { "where": "", "readBatchSize": 10000, "column": [ "*" ], "connection": [ { "jdbcUrl": ["||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc"], "table": [ "bmsql_oorder" ] } ], "username": "tpcc", "password":"********" } }, "writer": { "name": "mysqlwriter", "parameter": { "writeMode": "replace", "username": "tpcc", "password": "******", "column": [ "*" ], "session": [ "set session sql_mode='ANSI'" ], "preSql": [ "truncate table bmsql_oorder" ], "batchSize": 512, "connection": [ { "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/tpccdb?useUnicode=true&characterEncoding=utf8", "table": [ "bmsql_oorder" ] } ] } } } ] } }Synchronize data of an OceanBase database to an Oracle database The following example shows the content of the configuration file:
{ "job": { "setting": { "speed": { "channel": 16 }, "errorLimit": { "record": 0, "percentage": 0.1 } }, "content": [ { "reader": { "name": "oceanbasev10reader", "parameter": { "where": "", "readBatchSize": 10000, "column": [ "*" ], "connection": [ { "jdbcUrl": ["||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc"], "table": [ "bmsql_oorder" ] } ], "username": "tpcc", "password":"********" } }, "writer": { "name": "oraclewriter", "parameter": { "username": "tpcc", "password": "********", "column": [ "*" ], "preSql": [ "truncate table bmsql_oorder" ], "batchSize": 512, "connection": [ { "jdbcUrl": "jdbc:oracle:thin:@127.0.0.1:1521:helowin", "table": [ "bmsql_oorder" ] } ] } } } ] } }
Parameters
| Parameter | Description |
|---|---|
| jdbcUrl | * The JDBC URL of the database to which you want to connect. The value is a JSON array and multiple URLs can be entered for a database. You need to enter at least one JDBC URL in the JSON array. The value must be entered in compliance with the MySQL official format. You can also specify a configuration property in the URL. For more information, see Configuration Properties in the MySQL documentation. Notice The JDBC URL must be included in the connection section of the code. * Required: Yes. * Default value: None. |
| username | * The username for logging on to the database. * Required: Yes. * Default value: None. |
| password | * The password of the specified username required to log on to the database. * Required: Yes. * Default value: None. |
| table | * The table to be synchronized. The value is a JSON array and multiple tables can be specified at the same time. When you specify multiple tables, make sure that they use the same schema structure. The MySQL Reader does not verify whether the specified tables belong to the same logic table. Notice The table string must be included in the connection section of the code. * Required: Yes. * Default value: None. |
| column | * The set of names of columns to be synchronized in the configured table. The values are specified in a JSON array. You can use an asterisk (*) to indicate all columns in the ['*'] format. Column pruning is supported. You can export only the specified columns. Column reordering is supported. You can export columns without following the column order in the table schema. You can specify constants in the MySQL SQL format: ["id", "`table`", "1", "'bazhen.csy'", "null", "to_char(a + 1)", "2.3" , "true"]. Note * id is a regular column name. * table is the name of the column that includes a reserved word. * 1 is an integral constant. * bazhen.csy is a string constant. * null is a null pointer. * to_char(a + 1) is an expression. * 2.3 is a floating point number. * true is a Boolean value. * Required: Yes. * Default value: None. |
| where | * The filter condition. The MySQL reader assembles the specified column, table, and WHERE clause into an SQL statement. Then, the MySQL reader extracts data based on this SQL statement. To synchronize data of the current day, you can specify the WHERE clause asgmt_create > $bizdate. Notice You cannot set the WHERE clause to limit 10, because limit is not a valid WHERE clause of an SQL statement. A WHERE clause allows you to orderly synchronize the incremental business data. If you do not specify the WHERE clause or do not specify the key or value of the WHERE clause, DataX performs full synchronization. * Required: No. * Default value: None. |
More information
For more information about DataX, see DataX.