This topic describes the data import and export paths and solutions for various scenarios in OceanBase Database.
OceanBase Database supports the following data transfer solutions:
- Data import into or export from from OceanBase Database
- Primary/Standby synchronization in OceanBase Database
- Data migration and synchronization to OceanBase Database
- Data transfer from OceanBase Database to an incremental parsing system
- Data transfer from OceanBase Database to a message queue
In these scenarios, an OceanBase database can either serve as the primary database for data reads and writes or as the standby database for data synchronization. It can also serve as the source or target for data migration.
Data import and export
You can import data into or export data from OceanBase Database in the following ways:
Using SQL statements
- To export data, use the
SELECT INTO OUTFILEstatement. - To import data, use the
LOAD DATA INFILEstatement.
- To export data, use the
Using tools
- To export data, you can use obdumper. It allows you to export object definitions and table data from OceanBase Database in a specific file format to storage media.
- To import data, you can use obloader.
You can use obloader or the LOAD DATA statement to import data files of different sizes in standard formats such as CSV, INSERT SQL, ORC, or Parquet into OceanBase Database. Choose the method that best suits your needs for data import/export.
Primary/Standby synchronization
OceanBase Database V4.x provides the Physical Standby Database solution at the tenant level. This solution consists of the following components:
- A primary tenant, which provides read and write services.
- One or more standby tenants, which synchronize data changes in the primary tenant in real time through redo logs.
Please note that the Physical Standby Database solution of OceanBase Database supports only the asynchronous synchronization mode.
In OceanBase Database V4.x, you can choose between the following two deployment methods for the Physical Standby Database solution:
- Archive-based deployment
- Network-based deployment
The two deployment modes differ in the data synchronization method. You can select a deployment mode as needed. The synchronization performance of the two deployment modes is described as follows:
- The performance of both deployment modes is subject to multiple factors, such as the database load, storage device performance, and network bandwidth.
- The network-based deployment mode allows synchronization within seconds.
- The archive-based deployment mode increases the synchronization latency to minutes.
The Physical Standby Database solution applies to the following scenarios:
- The primary and standby tenants are homogeneous and support physical replication.
- The primary and standby tenants support zone- and geo-disaster recovery, as well as read-write splitting.
Archive-based deployment
In the archive-based deployment mode, redo logs are sourced from the archive logs of the primary tenant or a standby tenant. A standby tenant only synchronizes logs from the source through archive logs, without communicating with the primary tenant or other standby tenants.
Network-based deployment
In the network-based deployment mode, a standby tenant directly connects to the primary tenant or another standby tenant over the network to read logs in real time. The network connection between the primary and standby tenants must be smooth to ensure that the standby tenant can constantly request logs from the log transfer service of the primary tenant. The logs can be sourced from the online logs or archive logs (if log archiving is enabled) of the primary tenant. The system can automatically switch between the two log sources, which is imperceptible to the standby tenant and users.
Data migration and synchronization
OMS
OceanBase Migration Service (OMS) is a tool provided by OceanBase Database for data migration and synchronization. You can use OMS for data interaction between homogeneous or heterogeneous data sources and OceanBase Database. OMS supports online migration of existing data and real-time synchronization of incremental data. The data migration feature of OMS applies to business scenarios such as database upgrade, cross-instance data migration, database splitting, and database scaling.
OMS can synchronize incremental data from OceanBase Database and other databases to the message queues of self-managed instances, such as Kafka and RocketMQ instances, in real time. The data synchronization feature of OMS applies to business scenarios such as cloud business intelligence analysis, real-time data warehouse construction, data query, report distribution, active geo-redundancy, remote disaster recovery, and data aggregation.
CDC
OceanBase Database provides the binlog service to implement Change Data Capture (CDC). With this service, you can directly use Canal or Debezium for CDC when you switch from MySQL Database to OceanBase Database's MySQL mode. Designed for scenarios such as real-time data subscription, the binlog service collects transaction logs of OceanBase Database and converts them into MySQL binlogs.
Comparison of migration solutions
| Comparison item | OceanBase Database | MySQL Database | Oracle Database |
|---|---|---|---|
| Data import and export |
|
LOAD DATA/OUTFILE |
|
| Primary/Standby synchronization |
|
Based on binlog master-slave |
|
| Data migration and synchronization | OMS/Binlog service | Replication to ecosystem tools such as Canal | Oracle GoldenGate |