OBLOADER
What is OBLOADER?
OBLOADER is a client tool that is developed in Java. At present, OBLOADER applies only to OceanBase Database. You can import the definition files and table data files of database objects in storage media to OceanBase Database. Generally, we recommend that you use OBLOADER in combination with OBDUMPER. You can also use OBLOADER for data migration. It is compatible with CSV files exported by client tools such as mysqldump and Mydumper.
OBLOADER provides various types of built-in data preprocessing functions to improve its data import performance. The automatic fault tolerance mechanism ensures data import stability. Rich monitoring information is provided for you to observe the import performance and progress in real time.
Features
OBLOADER mainly provides the following features:
Allows you to import database object definitions and table data from local disks, Apache Hadoop, Alibaba Cloud Object Storage Service (OSS), Amazon Simple Storage Service (S3), Tencent Cloud Object Storage (COS), and Huawei Cloud Object Storage Service (OBS).
Allows you to import SQL files exported by mysqldump.
Allows you to import data files in the standard CSV, INSERT SQL, ORC, Parquet, or Avro format.
Provides a wide range of data cleansing features.
Provides various error handling strategies.
Allows you to logically split a data file before import to leverage the performance of multiple CPU cores.
Supports encryption of sensitive parameters specified in commands, such as the database account and password and the cloud storage account and password.
Allows you to decompress and load compressed files in the CSV, CUT, POS, or SQL format that are exported by OBDUMPER.
Considerations
For more information about the standard CSV format, see the RFC 4180 specifications. We recommend that you import data in strict accordance with the RFC 4180 specifications.
To improve performance, you can modify the Java virtual machine (JVM) memory parameter in the script when you try to import a large amount of data.
The object names, data file names, and rule file names specified by command-line options must be capitalized in the same way. By default, uppercase letters are used in Oracle mode, and lowercase letters are used in MySQL mode. If table names are case-sensitive, enclose them in brackets ([ ]). For example,
--table '[test]'indicates the table namedtest, and the file name is in the format oftest.group.sequence.suffix.--table '[TEST]'indicates the table namedTEST, and the file name is in the format ofTEST.group.sequence.suffix. In the file name format,groupindicates the sub-task ID (determined based on the task splitting strategy),sequenceindicates the file sequence number (a new file is generated when the file size exceeds the value specified by--block-size), andsuffixindicates the file name extension.During import, the file name format that can be identified by OBLOADER is
table name.file name extension. You can customize a file search rule by specifying the--file-regular-expressionoption.If the database objects to be imported have dependencies such as foreign key dependencies among tables and dependencies of triggers on sequences, we recommend that you import the objects based on the dependency order. If you specify the
--allor--table '*'option for import, the import order cannot be strictly ensured.When you resolve the primary key conflict in OceanBase Database V1.4.79 in MySQL mode, the use of the
INSERT ... WHERE NOT EXISTSstatement may result in cross-partition insertion errors.In OceanBase Database V1.4.x in MySQL mode, the metadata of the RANGE COLUMNS-KEY composite partitioned table is defective in the virtual routing view.
At present, resumable transmission is not supported for tables without a primary key.
Before you use OBLOADER to import data to OceanBase Database V3.2.4 or later, set the system parameter
open_cursorsto a large value. Otherwise, an error may occur during the import. After the data is imported, reset the system parameter to the initial value, such asALTER SYSTEM SET open_cursors = 65535;.When you import DDL statements, use the
--mixoption instead of the--ddloption if-fis set to a non-standard directory structure (a directory structure not generated by OBDUMPER). This is because the--ddloption can be used to import files that contain only one DDL statement whereas the--mixoption does not have this limitation.When the
--sqloption is specified, for data in the file data format, ensure that one statement inserts only one record. Otherwise, specify--mixinstead of--sqlto import the data.OBLOADER supports the following file formats:
DDL: A file in the DDL format contains only DDL statements but no table data.
CSV: A file in the CSV format contains content that complies with the RFC 4180 specification.
SQL: A file in the SQL format contains only
INSERT SQLstatements without line breaks.ORC: A file in the ORC format contains content that complies with the Apache ORC specification and uses zstd compression by default.
Parquet: A file in the Parquet format contains content that complies with the Apache Parquet specification and uses zstd compression by default.
MIX: A file in the MIX format contains mixed types of standard SQL statements, such as DDL and DML statements.
POS: A file in the POS format contains data at a fixed byte length. Currently, the length cannot be specified.
CUT: A file in the CUT format uses a single character or multiple characters to separate data columns and does not contain delimiters. However, in a file in the CSV format, data is separated by a single character.
Avro: A file in the Avro format contains content that complies with the Apache Avro specification. The file name extension is
.avro.
OBDUMPER
What is OBDUMPER?
OBDUMPER is a client tool that is developed in Java. At present, OBDUMPER applies only to OceanBase Database. You can use OBDUMPER to export object definitions and table data from OceanBase Database in a specific file format to the storage media.
You can integrate OBDUMPER into a database O&M system for logical backup. In this case, incremental backup is not supported. Compared with other client export tools such as mysqldump, OBDUMPER has the following advantages:
Quick data export: Multiple data query strategies are designed to significantly improve the export performance.
Versatile data exchange capabilities: Data in tables can be exported to multiple types of storage media in different formats.
Powerful data processing capabilities: Data is compressed, encrypted, desensitized, and preprocessed before being exported.
Features
OBDUMPER provides the following features:
Allows you to export database object definitions and table data to local disks, Alibaba Cloud OSS, Amazon S3, Tencent Cloud COS, and Huawei Cloud OBS.
Allows you to export table data to a file in the CSV, INSERT SQL, ORC, Parquet, or Avro format.
Allows you to specify a partition name to export only the data in the specified table partition.
Allows you to specify global filters to export only the data that meets the specified conditions.
Allows you to configure data preprocessing rules to convert and desensitize data before export.
Allows you to specify system change numbers (SCNs) or timestamps to export only the historical snapshot data of desired transaction points or time points.
Allows you to export data from a follower replica, which is different from a standby cluster, of an OceanBase cluster.
Allows you to specify a custom query statement to export only the result set of the query statement.
Allows you to use the latest snapshot version to export globally consistent data without locking tables.
Supports encryption of sensitive parameters specified in commands, such as the database account and password and the cloud storage account and password.
Allows you to compress and export files in the CSV, CUT, POS, or SQL format.
Supports OceanBase Database V4.3.0 and allows you to export schemas of columnstore tables.
Supports generating an empty file when the partition or query result set is empty.
Considerations
For more information about the standard CSV format, see the RFC 4180 specifications. We recommend that you export data in strict accordance with the RFC 4180 specifications.
You need to modify the VM memory parameter in the script when you try to export a large amount of data.
The object names, data file names, and control file names specified by command-line options must be capitalized in the same way. By default, the names are in uppercase in Oracle mode and lowercase in MySQL mode. If table names are case-sensitive, enclose them in brackets ([ ]). For example,
--table '[test]'indicates that the table name istestand the file name format istest.group.sequence.suffix;.--table '[TEST]'indicates that the table name isTESTand the file name format isTEST.group.sequence.suffix. In the file name format,groupindicates the sub-task ID (determined based on the task splitting strategy),sequenceindicates the file sequence number (a new file is generated when the file size exceeds the value specified by--block-size), andsuffixindicates the file name extension.If a foreign key contains several columns, the order of the columns might change after the export. Example: FOREIGN KEY(c1,c2) REFERENCE (c1,c2).
OceanBase Database V1.4.72 in MySQL mode does not support exporting the definitions of unique indexes, such as UNIQUE(c1(10)).
OceanBase Database V1.4.x in MySQL mode does not support exporting definitions of generated columns. For example, columns generated by using the
GENERATED ALWAYS AS (expr)statement cannot be exported.OceanBase Database V2.2.7x in MySQL mode does not support exporting vertical partitions. For example, partitions created by using the
PARTITION BY COLUMNstatement cannot be exported.OceanBase Database V2.2.7x in MySQL mode does not support exporting definitions of synonym objects.
OceanBase Database V2.2.7x in MySQL mode does not support exporting stored columns of indexes. For example, columns in STORING(COLUMN_LIST) cannot be exported.
OceanBase Database in Oracle mode does not allow you to export data of the INTERVAL DAY(2) TO SECOND(0) type.
On the Windows OS, the exclamation point (!) cannot be used as a separator. An object name in the database cannot contain special characters. Special characters are \ / : * ? " < > | Otherwise, neither the database object definitions nor the table data can be exported.
By default, when the table data to be exported contains a generated column, the data of the generated column is exported. You can use the
--exclude-virtual-columnsoption to specify not to export the data of the generated column.When you use the
--date-value-format,--time-value-format,--datetime-value-format,--timestamp-value-format,--timestamp-tz-value-format, and--timestamp-ltz-value-formatoptions, use a format template that can be identified by Java to avoid errors.When you export a CUT file, if a field contains the specified field separator, the separator character in the field is escaped. For example, if the content of a field is
abc|defand | is the field separator (--column-splitter "|"), the exported field is converted toabc|def.The
--no-sysoption specifies that the password of the sys tenant cannot be provided in OceanBase Database. OBDUMPER V3.3.0 and later distinguish ApsaraDB for OceanBase and OceanBase Database. The--public-cloudoption applies only to ApsaraDB for OceanBase, and the--no-sysoption applies only to OceanBase Database.Before you use OBDUMPER to export data from OceanBase Database V3.2.4 or later, set the system parameter
open_cursorsto a large value. Otherwise, an error may occur during the export. After the data is exported, reset the system parameter to the initial value, such asALTER SYSTEM SET open_cursors = 65535;.In OceanBase Database V4.0.0 or later, if the schema of a table has been changed, you cannot use OBDUMPER to export the baseline data, namely the consistent snapshot data, obtained after the last major compaction. You can manually initiate a major compaction and then export the most recent baseline data.
When you use OBDUMPER of a version earlier than V4.2.0 to export data from a MySQL tenant of OceanBase Database, OceanBase Connector/J converts the zero values of date and time type fields in the database to
NULL. If a column has aNOT NULLconstraint, an error will be reported during the export. OBDUMPER V4.2.0 can export zero values of date and time type fields, but cannot identify whether the original data isNULLor zero values during export and forcibly converts the data to zero values by default. In addition, exported zero values of the DATETIME and TIMESTAMP data types may be distorted into non-zero approximate values. If theNO_ZERO_DATEorNO_ZERO_IN_DATEconstraint is specified for thesql_modevariable in the database, an error will be reported when zero-value data is exported. Take note of this consideration on zero values when you export data of the DATE, DATETIME, TIME, YEAR, or TIMESTAMP type from a MySQL tenant of OceanBase Database.Feature differences in schema export in Oracle and MySQL tenants
| Tenant type | Password of the sys tenant provided | Password of the sys tenant not provided |
| MySQL | Tables, views, table groups, stored procedures, and functions are supported. | The export behavior is basically the same as that when the password of the sys tenant is provided, except for the following known issues:
|
| Oracle | Tables, views, triggers, synonyms, sequences, stored procedures, functions, packages, table groups, and types are supported |
Flashback export
To use the flashback export feature, you need to set the system variable undo_retention to an appropriate value.
Assume that t1 is the point in time of the Undo operation, and t2 = t1 + 900s. You can find data within the [t1, t2] range when you execute a query at t2. After you set undo_retention, the current session takes effect only on the data that is generated after t1. The default value of this system variable is 0, in seconds. Sample statement for setting the undo_retention variable:
SET global undo_retention=900;
You can query the
v$ob_timestamp_serviceview in the sys tenant for the valid OceanBase system change number (SCN).You can query data after the latest major compaction at a specific point in time. For example, if you started a major compaction at t1, the earliest data that you can query is the data at t1.
If the table that you want to query has been deleted and is in the recycle bin, you must restore it from the recycle bin.
Flashback queries are affected by minor compactions. If a minor compaction has been performed and the
undo_retentionvariable is not specified, flashback queries cannot be performed.After the
undo_retentionvariable is set, you can query data within the time range of [t1,t1 +undo_retention], where t1 is the point in time when the minor compaction occurs.
Supported OceanBase Database versions
The following table lists the OceanBase Database versions and modes that are supported.
| OceanBase Database mode | Supported version |
|---|---|
| Oracle mode | V2.0.x, V2.1.x, V2.2.20, V2.2.30, V2.2.50, V2.2.70 V2.2.71, V2.2.72, V2.2.76, V2.2.77, V3.1.x, V3.2.x, V4.0.0, V4.1.0, V4.2.0, V4.2.1, V4.2.2, V4.2.3, and V4.3.0 |
| MySQL mode | V1.4.70, V1.4.72, V1.4.75, V1.4.78, V1.4.79, V2.2.30, V2.2.50, V2.2.70, V2.2.71, V2.2.72, V2.2.76, V2.2.77, V3.1.x, V3.2.x, V4.0.0, V4.1.0, V4.2.0, V4.2.1, V4.2.2, V4.2.3, and V4.3.0 |
Note
- OBLOADER & OBDUMPER V3.0.0 supports OceanBase Database Community Edition V3.1.2 and later.
- OBLOADER & OBDUMPER is available in only one unified edition since V4.2.1, without providing the community and enterprise editions.