Migrate data from a PostgreSQL database to the MySQL compatible mode of OceanBase Database | V4.3.1 | OceanBase Migration Service

This topic describes how to use OceanBase Migration Service (OMS) to migrate data from a PostgreSQL database to the MySQL compatible mode of OceanBase Database. The target can be a physical data source, a public cloud OceanBase data source, or a standalone data source.

Background information

You can create a data migration task in the OMS console to seamlessly migrate the existing business data and incremental data from a PostgreSQL database to the MySQL compatible mode of OceanBase Database through schema migration, full migration, and incremental synchronization.

PostgreSQL databases in the following modes are supported as the sources: primary database only, standby database only, and primary/standby databases. The following table describes the data migration operations supported by each mode.

Mode	Supported operation
Primary database only	Schema migration, full migration, incremental synchronization, and reverse increment.
Standby database only	Schema migration and full migration.
Primary/Standby databases	Primary database: incremental synchronization and reverse increment. Standby database: schema migration and full migration.

Prerequisites

You have created a corresponding schema in the MySQL compatible mode of OceanBase Database.
You have created database users for the source PostgreSQL database and the MySQL compatible mode of OceanBase Database, and granted the necessary permissions to the users. For more information, see Create a database user.
If you need to perform incremental synchronization, complete the following prerequisite steps:
- To ensure that incremental DML statements can be correctly parsed after DDL statements are executed, you must create the corresponding triggers and tables for recording DDL statements. For more information, see Create a trigger.
- Incremental synchronization requires the wal_level parameter of the PostgreSQL instance to be set to logical. For more information, see Change the log level for a PostgreSQL instance.

Limitations

Limitations on the source database

Do not perform DDL operations that modify the database or table schema during schema migration or full migration. Otherwise, the data migration task may be interrupted.
At present, PostgreSQL 10.x, 11.x, 12.x, 13.x, 14.x, and 15.x are supported.
OMS does not support migrating partitioned tables, unlogged tables, or temporary tables from a PostgreSQL database.

OMS supports migrating tables with primary keys and tables with NOT NULL unique keys from a PostgreSQL database to the MySQL compatible mode of OceanBase Database.
OMS does not support triggers in the target database. If triggers exist, the data migration may fail.
Data source identifiers and user accounts must be globally unique in OMS.
OMS only supports migrating databases, tables, and column objects with ASCII-compliant names that do not contain special characters (spaces, line breaks, or .|"'`()=;/&\).
Incremental synchronization is supported only from the primary database.
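
The naming limitation above can be checked before you create a task. The following is a minimal Python sketch and is not part of OMS. The function name `is_supported_object_name` is hypothetical, and the set of rejected characters is copied from the limitation above.

```python
# Characters that the naming limitation lists as unsupported in object names.
UNSUPPORTED_CHARS = set(" \n\r.|\"'`()=;/&\\")


def is_supported_object_name(name: str) -> bool:
    """Return True if a database, table, or column name is ASCII-only and
    contains none of the listed special characters."""
    return name.isascii() and not (set(name) & UNSUPPORTED_CHARS)


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    for candidate in ["orders_2024", "order items", "sales.orders"]:
        print(candidate, is_supported_object_name(candidate))
```

Running such a check over the object names in the source database gives an early list of objects that would need to be renamed or excluded.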

Considerations

After incremental synchronization is enabled, the requirements for the table-level replication identifier REPLICA IDENTITY are as follows:
- If you select the migration objects by using the Specify Objects option, the specified tables must have a primary key or the table-level replication identifier REPLICA IDENTITY must be set to FULL. Otherwise, the operation to update or delete the business data will fail.
- If you select the migration objects by using the Match by Rule option, the PostgreSQL database must subscribe to all tables in the selected database (including the selected tables, unselected tables, and new tables). All tables must have a primary key or the table-level replication identifier REPLICA IDENTITY must be set to FULL. Otherwise, the operations to update or delete the business data will fail.
- If the primary key or unique key of the source database does not align with that of the destination database, the table-level replication identifier REPLICA IDENTITY of the corresponding table must be set to FULL.
- By default, PostgreSQL returns only the current value of a changed row, not the full image. To ensure data quality, data migration then processes the corresponding tables sequentially, which may reduce the efficiency of incremental synchronization. Therefore, we recommend that you set the table-level replication identifier REPLICA IDENTITY of all tables to FULL.
The command to modify the table-level replication identifier REPLICA IDENTITY to FULL is as follows.
```
ALTER TABLE table_name REPLICA IDENTITY FULL;
```

Notice

If row filtering conditions are set for the migrated table objects, the FULL mode must be enabled for the tables.

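
When many tables need the REPLICA IDENTITY FULL setting, the ALTER TABLE statements can be generated rather than typed by hand. The following is a minimal Python sketch, assuming you already have a list of (schema, table) pairs. The helper names are hypothetical, and the script only builds SQL text; it does not connect to any database.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def replica_identity_full_sql(tables):
    """Build one ALTER TABLE ... REPLICA IDENTITY FULL statement per table.

    `tables` is an iterable of (schema, table) pairs supplied by the caller.
    """
    return [
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} REPLICA IDENTITY FULL;"
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Hypothetical table list for illustration only.
    for statement in replica_identity_full_sql([("public", "t1"), ("public", "orders")]):
        print(statement)
```

Quoted identifiers are case-sensitive in PostgreSQL, so pass each name exactly as it is stored in the database.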
When incremental components are created in a PostgreSQL database, publications and slots are automatically created, so you must monitor the disk usage of the log files in the PostgreSQL database. Every 10 minutes, OMS notifies the PostgreSQL database of the confirmed_flush_lsn of the slot, which is the LSN from 10 minutes earlier. Therefore, the PostgreSQL database retains at least 10 minutes of log files for each incremental component.

Note

If you want to modify the notification interval or the retention period of log files in the PostgreSQL database, contact technical support.

If the log files in the PostgreSQL database cannot be cleared because of the slots, you must delete the data migration task and then clear the log files in the PostgreSQL database. PostgreSQL retains all log files from the earliest restart_lsn among all slots onward, so a log file can be cleared only after the restart_lsn of every slot has advanced past it.
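
To judge how much log data a slot is holding back, you can compare its restart_lsn with the current write position. PostgreSQL prints an LSN as two hexadecimal numbers separated by a slash, which together form a 64-bit byte position. The following is a minimal Python sketch of that arithmetic. The function names are hypothetical and the LSN values are made up; in practice you would read them from the pg_replication_slots view and the pg_current_wal_lsn() function.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_bytes(restart_lsn: str, current_lsn: str) -> int:
    """Bytes of log data between a slot's restart_lsn and the current position."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


if __name__ == "__main__":
    # Made-up LSN values for illustration only.
    print(retained_bytes("0/1000000", "0/3000000"))
```

A value that keeps growing across checks suggests that a slot is not advancing and that log files are accumulating on disk.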
If a table has neither a primary key nor a unique key whose columns are all NOT NULL, duplicate data may be migrated to the target.
In the reverse increment, if you migrate data by using full-column matching for the UPDATE and DELETE operations on tables without a primary key, the following issues may occur:
- Performance issues may occur.
  
  Since no primary key index exists, the UPDATE and DELETE operations are performed after a full-table scan.
- Data consistency issues may occur.
  
  The UPDATE and DELETE operations in the PostgreSQL database do not support the LIMIT syntax. If multiple data records are matched by using full-column matching, the data in the source may be more than that in the target. For example, a table without a primary key named t1 contains columns c1 and c2. Two data records exist in the source with c1 = 1 and c2 = 2. If you delete one data record from the source, the two data records with c1 = 1 and c2 = 2 in the target are deleted because the matching condition is where c1 = 1 and c2 = 2. This causes data inconsistency between the source and the target.
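
The inconsistency in the t1 example can be reproduced in a few lines. The following Python sketch uses two in-memory SQLite databases as stand-ins for the source and the target, purely for illustration. It does not involve PostgreSQL, OceanBase Database, or OMS, and the function name is hypothetical.

```python
import sqlite3


def replay_delete_with_full_column_matching():
    """Return (source_rows, target_rows) after one row is deleted at the source."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for db in (source, target):
        db.execute("CREATE TABLE t1 (c1 INTEGER, c2 INTEGER)")  # no primary key
        db.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 2), (1, 2)])

    # The source deletes exactly one of the two identical rows.
    source.execute(
        "DELETE FROM t1 WHERE rowid = "
        "(SELECT MIN(rowid) FROM t1 WHERE c1 = 1 AND c2 = 2)"
    )
    # The change is replayed on the target by matching on all columns,
    # which hits both identical rows.
    target.execute("DELETE FROM t1 WHERE c1 = 1 AND c2 = 2")

    count = "SELECT COUNT(*) FROM t1"
    result = (source.execute(count).fetchone()[0], target.execute(count).fetchone()[0])
    source.close()
    target.close()
    return result


if __name__ == "__main__":
    print(replay_delete_with_full_column_matching())  # (1, 0)
```

The source keeps one row while the target keeps none, which is the mismatch described above. A primary key or a non-null unique key on the table removes the ambiguity.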
If data of the tsvector type is written back to the PostgreSQL database in reverse increment, the data that OceanBase Database writes must conform to the tsvector format. For example:
- If OceanBase Database writes 'a b c' to the PostgreSQL database, it is converted to "'a' 'b' 'c'".
- If OceanBase Database writes 'a:1 b:2 c:3' to the PostgreSQL database, it is converted to "'a':1 'b':2 'c':3".
If OceanBase Database writes data in a non-tsvector format, such as "'a':cccc", to the PostgreSQL database, an error is returned. For more information about the tsvector format, see the official PostgreSQL documentation.
In the reverse increment from a PostgreSQL database to the MySQL compatible mode of OceanBase Database, if the version of the MySQL compatible mode of OceanBase Database is earlier than V3.2.x and a multi-partition table has a global unique index, data may be lost during the migration if you update the partition key values of the table.
If the character set of the source is UTF-8, we recommend that you use a character set compatible with the source at the target, such as UTF-8 or UTF-16, to avoid garbled characters at the target due to character set incompatibility.
Confirm whether the precision of the target columns such as DECIMAL, FLOAT, or DOUBLE conforms to your expectations. If the precision of the target column is smaller than that of the source column, data may be truncated, which causes data inconsistency between the source and the target.
If you change the unique index at the target, you must restart the incremental synchronization component. Otherwise, data may be inconsistent.
If the clocks between nodes or between the client and the server are out of synchronization, the latency (incremental synchronization or reverse increment) may be inaccurate.

For example, if the clock is earlier than the standard time, the latency may be negative. If the clock is later than the standard time, the latency may be positive.
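
The effect of clock skew can be shown with plain arithmetic. The following Python sketch assumes that latency is computed as the observer's clock reading minus the commit timestamp of the change. That formula is an assumption for illustration, not a description of how OMS measures latency, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def displayed_latency(commit_time: datetime, true_now: datetime,
                      observer_skew: timedelta) -> timedelta:
    """Latency as seen by an observer whose clock is off by `observer_skew`.

    A negative skew means the observer's clock is behind the standard time.
    """
    return (true_now + observer_skew) - commit_time


if __name__ == "__main__":
    commit = datetime(2024, 1, 1, 12, 0, 0)
    now = commit + timedelta(seconds=2)  # the true latency is 2 seconds
    print(displayed_latency(commit, now, timedelta(seconds=-5)))  # clock 5 s behind
    print(displayed_latency(commit, now, timedelta(seconds=5)))   # clock 5 s ahead
```

With the clock 5 seconds behind, the computed latency is negative even though the real delay is 2 seconds. With the clock 5 seconds ahead, the computed latency overstates the real delay.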
In multi-table aggregation scenarios:
- We recommend that you configure the mappings between the source and target databases by specifying matching rules.
- We recommend that you manually create schemas in the target database. If you use OMS to create schemas, skip failed objects in the schema migration step.
A difference between the source and target table schemas may result in data inconsistency. Some known scenarios are described as follows:
- When you manually create a table schema in the target database, if the data types of any columns are not supported by OMS, implicit data type conversion may occur in the target database, which causes inconsistent column types between the source and target databases.
- If the length of a column in the target database is shorter than that in the source database, the data of this column may be automatically truncated, which causes data inconsistency between the source and target databases.
If you select only Incremental Synchronization when you create the data migration task, OMS requires that the archive logs in the source database be retained for more than 48 hours.

If you select Full Migration and Incremental Synchronization when you create the data migration task, OMS requires that the archive logs in the source database be retained for at least seven days. Otherwise, the data migration task may fail or the data in the source and target databases may be inconsistent because OMS cannot obtain incremental logs.
If the source and target table objects differ only in capitalization of their names, the data migration result may not be as expected because the object names in the source or target database are case-insensitive.
If the unique constraint columns allow NULL values, data may be lost. When multiple NULL values are synchronized from the PostgreSQL database to the MySQL compatible mode of OceanBase Database, only the first NULL value is successfully inserted. Subsequent NULL values are discarded due to unique constraint column conflicts.
At present, the data migration task does not support tables without a non-null unique key. To avoid duplicate data in case of task restart and other exceptions, we recommend that you configure a non-null unique key for each table.

Data type mappings

PostgreSQL database	MySQL compatible mode of OceanBase Database
bigint	BIGINT
bigserial	BIGINT
bit [ (n) ]	BIT
boolean	TINYINT(1)
bytea	LONGBLOB
character [ (n) ]	CHAR, LONGTEXT
character varying [ (n) ]	VARCHAR, MEDIUMTEXT, LONGTEXT
cidr	VARCHAR(43)
date	DATE
double precision	DOUBLE
inet	VARCHAR(43)
integer	INTEGER
interval [ fields ] [ (p) ]	TIME
json	LONGTEXT, JSON (MySQL compatible mode of OceanBase Database V3.2.2 and later versions)
jsonb	LONGTEXT, JSON (MySQL compatible mode of OceanBase Database V3.2.2 and later versions)
macaddr	VARCHAR(17)
money	DECIMAL(19,2)
numeric [ (p, s) ]	DECIMAL
path	LINESTRING
real	FLOAT
smallint	SMALLINT
smallserial	SMALLINT
serial	INT
text	LONGTEXT
time [ (p) ] [ without time zone ]	TIME
time [ (p) ] with time zone	TIME
timestamp [ (p) ] [ without time zone ]	DATETIME
timestamp [ (p) ] with time zone	DATETIME
tsquery	LONGTEXT
tsvector	LONGTEXT
txid_snapshot	VARCHAR
uuid	VARCHAR(36)
xml	LONGTEXT
point	POINT
polygon	POLYGON
Geometry	GEOMETRY
GEOMETRY(Point)	POINT
GEOMETRY(LineString)	LINESTRING
GEOMETRY(Polygon)	POLYGON
GEOMETRY(MultiPoint)	MULTIPOINT
GEOMETRY(MultiLineString)	MULTILINESTRING
GEOMETRY(MultiPolygon)	MULTIPOLYGON
GEOMETRY(GeometryCollection)	GEOMETRYCOLLECTION
array	array. The following data types are supported: TINYINT, SMALLINT, INT, INTEGER, BIGINT, FLOAT, DOUBLE, and VARCHAR. Nested arrays are also supported, with a maximum nesting depth of 6.
citext	LONGTEXT
vector	VECTOR
halfvec	VECTOR
sparsevec	SPARSEVECTOR
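
For scripted checks of a manually created target schema, part of the table above can be expressed as a lookup. The following Python sketch covers only a subset of source types that map to a single target type. The dictionary and function names are hypothetical, and this is not an OMS API.

```python
# A subset of the mapping table above: source types with a single target type.
PG_TO_OB_MYSQL = {
    "bigint": "BIGINT",
    "boolean": "TINYINT(1)",
    "bytea": "LONGBLOB",
    "date": "DATE",
    "double precision": "DOUBLE",
    "integer": "INTEGER",
    "money": "DECIMAL(19,2)",
    "real": "FLOAT",
    "smallint": "SMALLINT",
    "text": "LONGTEXT",
    "uuid": "VARCHAR(36)",
    "xml": "LONGTEXT",
}


def target_type(pg_type: str) -> str:
    """Look up the target column type for a PostgreSQL type name.

    Raises KeyError for types outside this subset so gaps are not silently ignored.
    """
    return PG_TO_OB_MYSQL[pg_type.strip().lower()]


if __name__ == "__main__":
    print(target_type("uuid"))
```

Types with several possible targets, such as character varying or json, are left out on purpose because the table does not state which target applies in which case.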

Procedure

Create a data migration task.
1. Log in to the OMS console.
2. In the left-side navigation pane, click Data Migration.
3. On the Data Migration page, click Create Task in the upper-right corner.
On the Create Task page, specify the name of the migration task.

We recommend that you set it to a combination of digits and letters. It must not contain any spaces and cannot exceed 64 characters in length.

Notice

The task name must be a unique identifier in the OMS system.

In the Select Source and Target step, configure the parameters.

Parameter	Description
Source	If you have created a PostgreSQL data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information, see Create a PostgreSQL data source. You can select a PostgreSQL data source in primary database only mode or primary/standby databases mode. This topic describes how to create a data migration task with a PostgreSQL data source in primary/standby databases mode.
Target	If you have created a data source for the MySQL compatible mode of OceanBase Database, which can be a physical data source, a public cloud OceanBase data source, or a standalone data source, select it from the drop-down list. If not, click New Data Source in the drop-down list and create one in the dialog box that appears on the right. For more information about the parameters, see Create a physical OceanBase data source, Create a public cloud OceanBase data source, or Create a standalone OceanBase data source.
Tag (Optional)	Click the text box and select a tag from the drop-down list. You can also click Manage Tags to create, modify, and delete tags. For more information, see Use tags to manage data migration tasks.

Click Next. In the Select Migration Type step, specify the migration types for the migration task.

Options for Migration Type are Schema Migration, Full Migration, Incremental Synchronization, and Reverse Increment.

Migration type	Limitations
Schema migration	The definitions of data objects, such as tables, indexes, constraints, comments, and views, are migrated from the source database to the target database. Temporary tables are automatically filtered out.
Full migration	After a full migration task is started, OMS migrates existing data of tables in the source database to corresponding tables in the target database. If you select Full Migration, we recommend that you use the `ANALYZE` statement to collect the statistics of the PostgreSQL database before data migration.
Incremental synchronization	Changed data in the source database is synchronized to the corresponding tables in the target database after an incremental synchronization task starts. Supported data changes are data addition, modification, and deletion. Options for Incremental Synchronization are DML synchronization and DDL synchronization. Select the options as needed. For more information, see Configure DDL/DML synchronization.
Reverse increment	When a reverse increment task starts, OMS migrates the data changed in the target database after the business switchover back to the source database in real time. Generally, incremental synchronization configurations are reused for reverse increment. You can also customize the configurations for reverse increment as needed. You cannot select Reverse Increment in the following cases: multi-table aggregation is involved, or multiple source schemas map to the same target schema.

(Optional) Click Next.

If you have selected Reverse Increment without configuring the related parameters for the target MySQL compatible mode of OceanBase Database, the Add Data Source Information dialog box appears, prompting you to configure related parameters. For more information about the parameters, see Create a physical OceanBase data source, Create a public cloud OceanBase data source, or Create a standalone OceanBase data source.

After you configure the parameters, click Test connectivity. After the test succeeds, click Save.

Click Next. In the Select Migration Objects step, specify the migration objects for the migration task.

You can select Specify Objects or Match by Rule to specify the migration objects. The following procedure describes how to specify migration objects by using the Specify Objects option. For information about the procedure for specifying migration objects by using the Match by Rule option, see Configure matching rules.

Notice

If a database or table name contains double dollar signs ("$$"), you cannot create the migration task.
OMS automatically filters out unsupported tables. For information about the SQL statements for querying table objects, see SQL statements for querying table objects.

In the Select Migration Objects section, select Specify Objects.
In the Source Object(s) list, select the objects to be migrated. You can select tables and views of one or more databases as the migration objects.
Click > to add the selected objects to the Target Object(s) list.

OMS allows you to import objects by using a text file, rename objects, configure row filters, select columns, and remove one or all migration objects.

Note

When you select Match by Rule to specify migration objects, object renaming is implemented based on the syntax of the specified matching rules. In the operation area, you can only set filter conditions. For more information, see Configure matching rules.

Operation	Steps
Import objects	In the Target Object(s) list, click Import Objects in the upper-right corner. In the dialog box that appears, click Create. Notice: This operation will overwrite previous selections. Proceed with caution. In the Import Objects dialog box, import the objects to be migrated. You can import CSV files to rename databases/tables and set row filtering conditions. For more information, see Download and import the settings of migration objects. Click Validate. After the validation succeeds, click OK.
Rename objects	OMS allows you to rename migration objects. For more information, see Rename a migration or synchronization object.
Configure settings	OMS allows you to configure row filters and specify columns to be migrated. Hover the pointer over the target object in the right-side list of the selection area. Click Settings that appears. In the Settings dialog box, you can perform the following operations: In the Row Filters section, configure row filters by entering WHERE clauses of standard SQL statements. For more information, see Filter data by using SQL conditions. In the Select Columns section, select the columns to be migrated. For more information, see Column filtering.
Remove one or all objects	OMS allows you to remove one or all objects to be migrated to the target database during data mapping. To remove one migration object: In the Target Object(s) list, move the pointer over the target object and click Remove. To remove all migration objects: In the Target Object(s) list, click Remove All in the upper-right corner. In the dialog box that appears, click OK.

Click Next. On the Migration Options page, configure the parameters.

Schema migration

The following parameters are displayed only if you have selected One-way Synchronization > Schema Migration in the Select Migration Type step.

Parameter	Description
Automatically Enter Next Stage upon Completion	If you select schema migration and any other migration type, you can specify whether to automatically proceed to the next stage after schema migration is completed. The default value is Yes. You can also view and modify this value on the Schema Migration tab of the data migration task details page.
Normal Index Migration Method	The migration method for non-unique key indexes associated with the migrated table objects, including Do Not Migrate, Migrate with Schema, and Post-Full-Migration (displayed only when Full Migration is selected).

Full migration

The following parameters are displayed only if you have selected Full Migration in the Select Migration Type step.

Parameter	Description
Full Migration Rate Limit	You can choose whether to limit the full migration rate as needed. If you choose to limit it, you must specify the RPS and BPS. The RPS specifies the maximum rows of data migrated to the target database per second during full migration, and the BPS specifies the maximum amount of data in bytes migrated to the target database per second during full migration. Note: The RPS and BPS values specified here are only for throttling. The actual full migration performance is subject to factors such as the settings of the source and target databases and the instance specifications.
Full Migration Resource Configuration	You can select Small, Medium, or Large to use the corresponding default values of Read Concurrency, Write Concurrency, and Memory. You can also customize the resource configurations for full migration. By setting the resource configuration for the Full-Import component, you can limit the resource consumption of a task in the full migration phase. Notice: In the case of custom configurations, the minimum value is `1`, and only integers are supported.
Handle Non-empty Tables in Target Database	Valid values: Ignore and Stop Migration. If you select Ignore, when the data to be inserted conflicts with the existing data of a target table, OMS retains the existing data and records the conflict data. If you select Stop Migration and a target table contains data, an error is returned during full migration, indicating that the migration is not allowed. In this case, you must clear the data in the target table before you can continue with the migration. Notice: After an error is returned, if you click Resume in the dialog box, OMS ignores this error and continues to migrate data. Proceed with caution.

Incremental synchronization

The following parameters are displayed only if you have selected Incremental Synchronization in the Select Migration Type step.

Parameter	Description
Incremental Synchronization Rate Limit	You can choose whether to limit the incremental synchronization rate as needed. If you choose to limit it, you must specify the RPS and BPS. The RPS specifies the maximum rows of data synchronized to the target database per second during incremental synchronization, and the BPS specifies the maximum amount of data in bytes synchronized to the target database per second during incremental synchronization. Note: The RPS and BPS values specified here are only for throttling. The actual incremental synchronization performance is subject to factors such as the settings of the source and target databases and the instance specifications.
Incremental Log Pull Resource Configuration	You can select Small, Medium, or Large to use the corresponding default value of Memory. You can also customize the resource configurations for incremental log pull. By setting the resource configuration for the Store component, you can limit the resource consumption of a task in log pull in the incremental synchronization stage. Notice: In the case of custom configurations, the minimum value is `1`, and only integers are supported.
Incremental Data Write Resource Configuration	You can select Small, Medium, or Large to use the corresponding default values of Write Concurrency and Memory. You can also customize the resource configurations for incremental data writes. By setting the resource configuration for the Incr-Sync component, you can limit the resource consumption of a task in data writes in the incremental synchronization stage. Notice: In the case of custom configurations, the minimum value is `1`, and only integers are supported.
Incremental Record Retention Duration	The duration that incremental parsed files are cached in OMS. A longer retention duration results in more disk space occupied by the Store component.
Incremental Synchronization Start Timestamp	If you have selected Full Migration as the migration type, this parameter is not displayed. If you have selected Incremental Synchronization but not Full Migration, specify a point in time after which the data is to be synchronized. The default value is the current system time. For more information, see Set an incremental synchronization timestamp.

Reverse increment

The following parameters are displayed only if you have selected Reverse Increment in the Select Migration Type step. The parameters for reverse increment are consistent with those for incremental synchronization. You can select Reuse Incremental Synchronization Configuration in the upper-right corner.

Advanced options

The following parameters are displayed only if the target is a MySQL-compatible tenant of OceanBase Database V4.3.0 or later and you have selected Schema Migration in the Select Migration Type step.

The table object storage types of the target database include Default, Row Storage, Column Storage, and Hybrid Row-Column Storage. This parameter specifies the storage type for target table objects during schema migration or incremental synchronization. For more information, see default_table_store_format.

Note

The value Default means that other parameters are automatically set based on the parameter configurations of the target database. Table objects in schema migration are written to corresponding schemas based on the specified storage type.

If the parameter settings on the page cannot meet your requirements, you can click Parameter Configuration in the lower part of the page to configure more specific settings. You can also reference an existing task or component template.

Click Precheck to start a precheck on the data migration task.

During the precheck, OMS checks the read and write privileges of the database users and the network connectivity of the databases. A data migration task can be started only after it passes all check items. If an error is returned during the precheck, you can perform the following operations:
- Identify and troubleshoot the issue and then perform the precheck again.
- Click Skip in the Actions column of a failed precheck item. In the dialog box that prompts the consequences of the operation, click OK.
Click Start Task. If you do not need to start the task now, click Save to go to the details page of the task. You can start the task later as needed.

You can click Configure Validation Task in the upper-right corner of the data migration details page to compare the data between the source and target databases. For more information, see Create a data validation task.

OMS allows you to modify the migration objects when the data migration task is running. For more information, see View and modify migration objects. After the data migration task is started, it is executed based on the selected migration types. For more information, see the View migration details section in View details of a data migration task.
