Command-line options|V4.3.1|OceanBase Loader and Dumper| docs|Distributed Database

OBLOADER allows you to specify the information required for import in command-line options. For more information about the options and their scenarios and examples, see Options and Scenarios and examples.

Overview

Option styles

OBLOADER supports the Unix and GNU styles of command-line options.

Unix style: An option is prefixed with a hyphen and each option is a single character, such as ps -e. In this style, you can omit the space between an option and its value, such as -p******.
GNU style: An option is prefixed with double hyphens and each option is a single character or a string, such as ps --version. An option and its value must be separated with a space, such as --table 'test'.

Option categories

Command-line options in OBLOADER are classified into basic options and advanced options.

Basic options: general options of OBLOADER, including connection options (such as the database connection method), feature options (such as the file format, database object type, and storage path), and other options.
Advanced options: feature options (such as the timestamp format, allowlist- and blocklist-based table/column filtering settings, and error handling method), performance options, and other options.

Required options

When you use OBLOADER to import data, you must specify at least the connection options, format option, database object type option, and storage path option.

Here is a sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** -D USERA --csv --table '*' -f /output

In this example, -h, -P, -u, -p, and -D are connection options, the --csv option specifies the file format, the --table option specifies the database object type, and the -f option specifies the storage path.

Basic options

Connection options

OBLOADER can read data from and write data to an OceanBase database only after it is connected to the database. You can connect to an OceanBase database by specifying the following options:

Import type

Import type	Relevant command-line option
OceanBase Database	-h host_name, --host= host_name The host IP address for connecting to OceanBase Database Proxy (ODP) or a physical OceanBase Database node. -P port_num, --port= port_num The host port for connecting to ODP or a physical OceanBase Database node. -c cluster_name, --cluster= cluster_name The OceanBase cluster to connect to. If this option is not specified, a physical node of the database is connected. The `-h` and `-P` options specify the host IP address and port number of the physical node. If this option is specified, ODP is connected. The `-h` and `-P` options specify the host IP address and port number of ODP. -u,--user The username, tenant name, and cluster name for connecting to OceanBase Database. The value is in the format of `<user>@<tenant>#<cluster>`, for example, `-u user@tenant#cluster`. If you specify an incorrect username, OBLOADER cannot connect to the database. If no cluster name is specified, a physical node of the database is connected. The `-h` and `-P` options specify the host IP address and port number of the physical node. If a cluster name is specified, ODP is connected. The `-h` and `-P` options specify the host IP address and port number of ODP. -D database_name, --database= database_name The name of the database to which OBLOADER imports database object definitions and table data. -p ' password', --password='password' The account password for connecting to OceanBase Database. If no password is set for the current database account, you do not need to specify this option. To specify this option on the command line, you must enclose the value in single quotation marks (' '), for example, `-p'****'`. Note If you are using Windows OS, enclose the value in double quotation marks (" "). This rule also applies to string values of other options. --sys-user sys_username The username of a user with required privileges in the `sys` tenant, such as `root` or `proxyro`. OBLOADER requires a special user in the `sys` tenant to query the metadata in system tables. The default value is `root`. You do not need to specify this option for OceanBase Database V4.0.0 and later. --sys-password 'sys_password' The password of a user with required privileges in the `sys` tenant. This option is used in combination with the `--sys-user` option. By default, the password of the `root` user in the `sys` tenant is left empty. When you specify this option on the command line, enclose the value in single quotation marks (' '), for example, `--sys-password '****'`. You do not need to specify this option for OceanBase Database V4.0.0 and later. Note If this option is not specified, OBLOADER cannot query metadata in system tables, and the import features and performance may be greatly affected. --no-sys Imports database object definitions or table data from an OceanBase cluster when the password of the sys tenant cannot be provided. Unlike the `--public-cloud` option, the `--no-sys` option requires you to specify the tenant name option (`-u <user>@<tenant>`) on the command line, and also add the cluster option (`-u <user>@<tenant>#<cluster>`) to connect to ODP. In OceanBase Database V4.0.0 and earlier, if you do not specify the `--public-cloud` or `--no-sys` option, the `--sys-user` and `--sys-password` options must be specified for OBLOADER.
ApsaraDB for OceanBase	--public-cloud Imports database object definitions or table data from an ApsaraDB for OceanBase cluster. If you specify this option on the command line, you do not need to specify the tenant name and cluster name options (`-u <user>@<tenant>#<cluster>`). OBLOADER enables the `--no-sys` option by default. For more information about the `--no-sys` option, see the corresponding option description. The use of the `--public-cloud` or `--no-sys` option will affect the import features, performance, and stability. OceanBase Database V2.2.30 and later support throttling on the server. Therefore, to ensure the stability of data import, before you use the `--public-cloud` or `--no-sys` option, you can run the following commands to set throttling thresholds as required on the server: `alter system set freeze_trigger_percentage=50;` `alter system set minor_merge_concurrence=64;` `alter system set writing_throttling_trigger_percentage=80 tenant='xxx';`
ODP (Sharding)	-h host_name, --host= host_name The host IP address for connecting to ODP or a physical OceanBase Database node. -P port_num, --port= port_num The host port for connecting to ODP or a physical OceanBase Database node. -c cluster_name, --cluster= cluster_name The OceanBase cluster to connect to. If this option is not specified, a physical node of the database is connected. The `-h` and `-P` options specify the host IP address and port number of the physical node. If this option is specified, ODP is connected. The `-h` and `-P` options specify the host IP address and port number of ODP. --logical-database Specifies to import data by using ODP (Sharding). If you specify the `--logical-database` option on the command line, the definition of a random physical database shard is exported and the shard cannot be directly imported into the database. You need to manually convert the exported physical shard to a logical one before you import it to the database for business use.

Bypass import

OBLOADER V4.2.6 and later support bypass import.

--rpc-port= rpc_port_num

The RPC port for connecting to the OBServer node. This option is used in combination with the --direct and --parallel options to specify to connect to the RPC port of an OBServer node to import data in bypass import mode.
Note
- This option applies only to OceanBase Database V4.2.0 RC2 and later.
- You can query the RPC port of an OBServer node in the following ways:
  - In the case of connection to the ODP server:
    - In ApsaraDB for OceanBase, the RPC port of the ODP server is 3307 by default.
    - In OceanBase Database, the RPC port of the ODP server is 2885 by default. You can change the port by using the -s option when you start ODP.
  - In the case of direct connection to an OBServer node, you can query the DBA_OB_SERVER system view in the sys tenant for the RPC port of the OBServer node, which is 2882 by default.
--direct

Specifies to use the bypass import mode. This option is used in combination with the --rpc-port and --parallel options.
Note
- The bypass import mode of OBLOADER does not support binary data types.
- The bypass import mode of OBLOADER supports direct connections to an OBServer node and connections through OceanBase Database Proxy (ODP). The version requirements are as follows:
  - OBServer node: V4.2.0 or later.
  - ODP: V4.1.3, or V4.3.0 or later while the OBServer node version must be V4.2.1 or later.
--parallel= parallel_num

The degree of parallelism (DOP) for loading data in bypass import mode. This option is used in combination with the--rpc-port and --direct options.

Feature options

File formats

Format (command-line option)	Relevant command-line option	Scenario
--csv Imports data files in the CSV format.	--skip-header Skips headers of CSV files when the files are imported. Headers are data in the first line. --column-delimiter 'column_delimiter_char' The string delimiter. --line-separator 'line_separator_string' The line separator. --escape-character 'escape_char' The escape character. --column-separator 'column_separator_char' The column separator, which is a comma (,) by default. --with-trim Deletes the space characters on the left and right sides of data. --ignore-unhex Ignores hexadecimal strings in decoding. This option applies only to binary data types. --character-set 'character_set_string' The character set for a database connection. --null-string 'null_replacer_string' Replaces the specified character with `NULL`. --empty-string 'empty_replacer_string' Replaces the specified character with an empty string (' '). Default value: \E. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set.	This option applies to almost all business scenarios.
--cut Imports data files in the CUT format.	--skip-header Skips headers of CUT files when the files are imported. Headers are data in the first line. --skip-footer Skips footers of CUT files when the files are imported. Footers are data in the last line. --column-splitter 'split_string' The column separator. --trail-delimiter Truncates the last column separator in a line. --line-separator 'line_separator_string' The line separator. --with-trim Deletes the space characters on the left and right sides of data. --ignore-escape Ignores escape operations on characters when CUT files are imported. By default, escape operations are not ignored. --ignore-unhex Ignores hexadecimal strings in decoding. This option applies only to binary data types. --character-set 'character_set_string' The character set for a database connection. --null-string 'null_replacer_string' Replaces the specified character with `NULL`. --escape-character 'escape_char' The escape character. --empty-string 'empty_replacer_string' Replaces the specified character with an empty string (' '). Default value: \E. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set.	This option applies to almost all business scenarios. You can use this option to import data of flexible data formats that is exported from outdated platforms.
--pos Imports data files in the POS format.	This option must be used in combination with control files.	This option is used for interaction with data extraction, transformation, and loading (ETL) platforms that support only a fixed-length format.
--sql Imports data files in the SQL format.	--line-separator 'line_separator_string' The line separator. --ignore-unhex Ignores hexadecimal strings in decoding. This option applies only to binary data types. --character-set 'character_set_string' The character set for a database connection. --null-string 'null_replacer_string' Replaces the specified character with `NULL`. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set.	This option applies to scenarios where only data but no schema is imported. Only one SQL syntax with high performance is supported.
--mix Imports MIX files.	--character-set 'character_set_string' The character set for a database connection. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set. --compat-mode Imports MySQL table schema definitions in compatible mode.	This option can be used to import SQL files in any format. It has higher compatibility but poorer performance than the `--sql` option. At present, the whole text is read to the memory for parsing. Therefore, this option is inapplicable if you want to process files of a large data mount.
--par Imports data files in the Parquet format.	--character-set 'character_set_string' The character set for a database connection. --ignore-unhex Ignores hexadecimal strings in decoding. This option applies only to binary data types. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set.	This option can be used to migrate data from Hive, Spark, or another data source that supports exporting data in the standard Parquet format.
--orc Imports data files in the ORC format.	--character-set 'character_set_string' The character set for a database connection. --ignore-unhex Ignores hexadecimal strings in decoding. This option applies only to binary data types. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set.	This option can be used to migrate data from Hive, Spark, or another data source that supports exporting data of the standard ORC format.
--ddl Imports DDL files.	--character-set 'character_set_string' The character set for a database connection. --ignore-unhex Ignores hexadecimal strings in decoding. This option applies only to binary data types. --file-suffix 'suffix_name' The file name extension of data files to be imported. --file-encoding 'encode_name' The character set of the files to read, which is not the database character set.	This option applies to logical backup.
--avro Imports data files in the Avro format.	--file-suffix 'suffix_name' The file name extension of data files to be imported.	Generally, this option applies to the big data field for offline migration of analytical processing (AP) business data and other scenarios.

Note

For more information about command-line options, see the option list in this topic.

Decompress data for import

OBLOADER V4.3.0 and later allow you to decompress CSV, CUT, POS, and SQL files exported by OBDUMPER before import. You can complete related configurations by using the following command-line options:

--compress

Indicates whether the to-be-imported files have been compressed by OBDUMPER during export. It is a Boolean value.
--compression-algo

Indicates the compression algorithm by which the to-be-imported files are compressed by OBDUMPER during export. The value is a string. Valid values are zstd, zlib, gzip, and snappy. The default value is zstd.
--compression-level

The compression level for the corresponding compression algorithm. The value is an integer. Compression levels supported for different compression algorithms are as follows:
- zstd: 1 to 22
- zlib and gzip: -1 to 9.
- snappy: You cannot specify a compression level for this compression algorithm.

Database object types

--all

Imports all supported database object definitions and table data. When this option is used in combination with --ddl, all database object definitions are imported. When this option is used in combination with --csv, --sql, --cut, or --pos, all data files in the specified format are imported. To import all database object definitions and table data, you can specify the --all and --ddl options with a data format option.

Note

The --all option is mutually exclusive with any database object options. It cannot be specified together with other database object options. If both the --all option and a database object option are specified, the --all option will be executed first.
--table-group 'table_group_name [,table_group_name...]' | --table-group '*'

Imports table group definitions. This option is similar to the --table option, except that this option does not support data import.
--table 'table_name [,table_name...]' | --table '*'

Imports table definitions or table data. When this option is used in combination with the --ddl option, only table definitions are imported. When this option is used in combination with any data format option, only table data is imported. To specify multiple tables, separate the table names with commas (,). By default, for OceanBase Database in Oracle mode, the table names are in uppercase, and for OceanBase Database in MySQL mode, the table names are in lowercase. For example, for OceanBase Database in Oracle mode, both --table 'test' and --table 'TEST' indicate the table named TEST. For OceanBase Database in MySQL mode, both --table 'test' and --table 'TEST' indicate the table named test. If table names are case-sensitive, enclose them in brackets ([ ]). For example, --table '[test]' indicates the table named test, while --table '[TEST]' indicates the table named TEST. If the table name is specified as an asterisk (*), all table definitions or table data is imported.

Notice

When you use a control file to import data, the table name specified in the --table option must be in the same letter case as that in the database. Otherwise, the control file fails to take effect.
--view 'view_name [, view_name...]' | --view '*'

Imports view definitions. This option is similar to the --table option, except that this option does not support data import.
--trigger 'trigger_name [, trigger_name...]' | --trigger '*'

Imports trigger definitions. This option is similar to the --table option, except that this option does not support data import.
--sequence 'sequence_name [, sequence_name...]' | --sequence '*'

Imports sequence definitions. This option is similar to the --table option, except that this option does not support data import. This option is supported only for OceanBase Database in Oracle mode.
--synonym 'synonym_name [, synonym_name...]' | --synonym '*'

Imports synonym definitions. This option is similar to the --table option, except that this option does not support data import. This option is supported only for OceanBase Database in Oracle mode.
--type 'type_name [, type_name...]' | --type '*'

Imports type definitions. This option is similar to the --table option, except that this option does not support data import. This option is supported only for OceanBase Database in Oracle mode.
--type-body 'typebody_name [, typebody_name...]' | --type-body '*'

Imports type body definitions. This option is similar to the --table option, except that this option does not support data import. This option is supported only for OceanBase Database in Oracle mode.
--package 'package_name [, package_name...]' | --package '*'

Imports package definitions. This option is similar to the --table option, except that this option does not support data import. This option is supported only for OceanBase Database in Oracle mode.
--package-body 'packagebody_name [, packagebody_name...]' | --package-body '*'

Imports package body definitions. This option is similar to the --table option, except that this option does not support data import. This option is supported only for OceanBase Database in Oracle mode.
--function 'function_name [, function_name...]' | --function '*'

Imports function definitions. This option is similar to the --table option, except that this option does not support data import.
--procedure 'procedure_name [, procedure_name...]' | --procedure '*'

Imports stored procedure definitions. This option is similar to the --table option, except that this option does not support data import.

Storage path

-f 'file_path', --file-path= 'file_path'

The absolute path for storing data files. When data is stored in another file system instead of the local file system, you can use the --tmp-path option to specify a temporary file directory.

OBLOADER allows you to import database object definitions and table data from local disks, Alibaba Cloud Object Storage Service (OSS), Amazon Simple Storage Service (S3), Apache Hadoop, Tencent Cloud Object Storage (COS), and Huawei Cloud Object Storage Service (OBS). The syntax is as follows:

[scheme://host]path[?parameters]
parameters: key[=value],...

Component	Description
scheme	The storage scheme. Alibaba Cloud OSS, Amazon S3, Apache Hadoop, Tencent Cloud COS, and Huawei Cloud OBS are supported. If the scheme is not Alibaba OSS, Amazon S3, Hadoop, Tencent Cloud COS, or Huawei Cloud OBS, an error will be returned.
host	The name of the storage space. When you import data from Alibaba Cloud OSS, Amazon S3, Tencent Cloud COS, or Huawei Cloud OBS, the `host` parameter specifies a bucket. For more information, see OSS Bucket. When you import data from Apache Hadoop, the `host` parameter specifies a Hadoop node, which is in the `<ip>:<port>` or `<cluster_name>` format.
path	The data storage path in the storage space. The path must start with a slash (`/`).
parameters	The parameters required for the request. The value can be a single key or multiple key-value pairs.

The following table describes the supported parameters.

Parameter	Value required?	Description	Supported storage scheme	Supported version
endpoint	Yes	The endpoint of the region where the OSS host resides. The endpoint for accessing Amazon S3, Tencent Cloud COS, or Huawei Cloud OBS.	OSS/S3/COS/OBS	OSS: V4.2.0 S3: V4.2.5 COS/OBS: V4.3.0
region	Yes	The physical location of the bucket.	S3/COS/OBS	S3: V4.2.0 COS/OBS: V4.3.0
storage-class	Yes	The storage class of Amazon S3.	S3	V4.2.0
access-key	Yes	The AccessKey ID used to access the bucket.	OSS/S3/COS/OBS	OSS/S3: V4.2.0 COS/OBS: V4.3.0
secret-key	Yes	The AccessKey secret used to access the bucket.	OSS/S3/COS/OBS	OSS/S3: V4.2.0 COS/OBS: V4.3.0
hdfs-site-file	Yes	A hdfsSiteFile configuration file. The configuration file contains the configuration information of Apache Hadoop, such as the block size and number of replicas. Storage and access rules are set for Apache Hadoop based on the configuration information.	Apache Hadoop	V4.2.1
core-site-file	Yes	A hdfsSiteFile configuration file. The configuration file contains the core configuration information of the Hadoop cluster, such as the URI of the file system and the default file system of Apache Hadoop.	Apache Hadoop	V4.2.1
principal	Yes	The identifier for identity authentication in Kerberos.	Apache Hadoop	V4.2.1
keytab-file	Yes	The absolute path of the Keytab file, which authorizes users or services to access system resources.	Apache Hadoop	V4.2.1
krb5-conf-file	Yes	The path where the Kerberos configuration file resides.	Apache Hadoop	V4.2.1

Here are some examples:

Import data from the local disk
```
-f '/home/admin/foo/bar'
```

Import data from Amazon S3

-f 's3://bucket/path?region={region}&access-key={accessKey}&secret-key={secretKey}'

Import data from Alibaba Cloud OSS

-f 'oss://mybucket/foo/bar?endpoint=myendpoint&access-key=myak&secret-key=mysk'

Import data from Apache Hadoop

-f 'hdfs://***.*.*.*:9000/chang/parquet?hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml

Import data from Tencent Cloud COS

-f 'cos://mybucket/foo?region=ap-shanghai&access-key=myak&secret-key=mysk'
-f 'cos://mybucket/foo?endpoint=yourendpoint&access-key=myak&secret-key=mysk'

Import data from Huawei Cloud OBS

-f 'obs://mybucket/foo?region=cn-north-1&access-key=myak&secret-key=mysk'
-f 'obs://mybucket/foo?endpoint=yourendpoint&access-key=myak&secret-key=mysk'

--ctl-path 'control_path'

The absolute path on a local disk for storing control files. You can configure built-in preprocessing functions in a control file. Data will be preprocessed by these functions before being imported. For example, the functions can perform case conversion and check the data for empty values. For the use of control files, see Data processing. When you specify this option on the command line, enclose the value in single quotation marks (' '), Example: --ctl-path '/home/controls/'.
--log-path 'log_path'

The output directory for the operational logs of OBLOADER. If this option is not specified, OBLOADER operational logs are stored in the directory specified by the -f option. Redirection is not required for log output, unless otherwise specified.

Other options

-H, --help

Shows the help information about the tool.
-V, --version

Shows the version of the tool.

Advanced options

Feature options

Timestamp formats

--nls-date-format 'date-format-string'

The format of dates in database connections in Oracle mode of OceanBase Database. Default value: YYYY-MM-DD HH24:MI:SS.
--nls-timestamp-format 'timestamp-format-string'

The format of timestamps in database connections in Oracle mode of OceanBase Database. Default value: YYYY-MM-DD HH24:MI:SS:FF9.
--nls-timestamp-tz-format 'timestamp-tz-format-string'

The format of timestamps that contains a time zone in database connections in Oracle mode of OceanBase Database. Default value: YYYY-MM-DD HH24:MI:SS:FF9 TZR.

Allowlist- and blocklist-based filtering

Allowlist- and blocklist-based table filtering
- --exclude-table 'table_name [, table_name...]'
  
  Excludes the specified tables from the import of table definitions or table data. Fuzzy match on table names is supported,
  
  such as --exclude-table 'test1,test*,*test,te*st'
  
  The preceding example specifies to exclude the following tables from the import of table definitions or table data:
  - test1
  - All tables with a table name starting with test
  - All tables with a table name ending with test
  - All tables with the table name starting with te and ending with st
- --exclude-data-types 'datatype [, datatype...]'
  
  Excludes the specified data types from the import of table data.
- --file-regular-expression
  
  The regular expression that is used to match the file name during an import. This option applies only to single-table import. For example, --file-regular-expression ".*" specifies to use the ".*" regular expression to match all files in the directory. In the expression, * matches zero or more characters. --file-regular-expression ".*\\.csv" specifies to use the ".*\.csv" regular expression to match files whose file names end with ".csv". In the expression, "\." matches the period ".".
Allowlist- and blocklist-based column filtering
- --exclude-column-names 'column_name [, column_name...]'
  
  Excludes the specified columns during data import.
  Notice
  - The letter case of the specified column name must be the same as that of the column name in the table structure.
  - In the imported data file, the excluded columns must have no corresponding data, and the imported columns must be in the same order as the columns in the table.

Error handling

--max-discards int_num

The maximum amount of duplicate data allowed in a single table to import. If duplicate data in a table exceeds the specified maximum, data import for this table stops. The import failure on the table is logged. Data import for other tables is not affected. The default value is -1, indicating that data import does not stop for duplicate data.

Note

This option takes effect only when the table contains a primary key or a unique key and duplicate data exists in the table.
--retry

Resumes the import task from the breakpoint. We recommend that you specify this option to resume an import task that has more than 80% of data imported, so that you do not have to start a new import. Duplicate data errors may occur during the resumed import. If only a small part of the data has been imported, you can clear tables and start a new import, which is more efficient.
Notice
- The `load.ckpt` file is a savepoint file generated when the tool runs and is located in the directory specified by the -f option. You cannot use this option if the `load.ckpt` file does not exist.
- This option is inapplicable to bypass import.
--max-errors int_num

The maximum number of errors allowed for an import. If the number of import errors of a table exceeds the specified maximum, data import for this table stops. The import failure on the table is logged. This option can be set to 0, -1, or any positive integer N. If you set this option to -1, the errors are ignored and the import continues. The default value is 1000.
--strict= 'strict_string'

Specifies to control the impact of dirty data on the process end status during import. The default value is true, which indicates that the program is to end in the failed state (System exit 1) when the imported data contains a Bad Record or Discard Record error. The value false indicates that the final end state (System exit 0) of the program is not affected when the imported data contains a Bad Record or Discard Record error.

Note

This option can be used in combination with the --max-discards or --max-errors option, which specifies to skip duplicate data or errors and proceed with the process when the amount of duplicate data or the number of errors is within the specified range. For more information, see Error handling.
--replace-data

Replaces duplicate data in the table. This option is only applicable to tables that have primary keys or unique keys with the NOT NULL constraint. If a file has a large amount of duplicate data, such as more than 30% of the total data volume, we recommend that you clear the table and import it again. The performance of data replacement is lower than that of import after table clearing. This option can be used only in combination with the --cut, --csv, and --sql options, and does not take effect for the --ddl option.
Notice
- When the file and the table have duplicate data, the data in the table is replaced with the data in the file.
- For tables without primary keys or unique keys, this option appends data to the tables.
- If you do not need to replace duplicate data, do not specify this option. Otherwise, business may be affected.

Performance options

--rw float_num

The proportion of data file parsing threads. The default value is 1. You can use this option in combination with the --thread option to calculate the number of file parsing threads by using the following formula: Number of file parsing threads = Value of --thread × Value of --rw.
--slow float_num

The threshold for triggering a slow import. When the memory usage of OceanBase Database reaches 75%, OBLOADER slows down to prevent an excessively high memory usage on the database. Default value: 0.75.
--pause float_num

The threshold for triggering an import pause. When the memory usage of OceanBase Database reaches 85%, OBLOADER pauses data import to prevent issues caused by an excessively high memory usage on the database. Default value: 0.85.
--batch int_num

The number of records for a batch of transactions. We recommend that you set this option to a value inversely proportional to the table width. Do not set this option to an excessively high value, which may cause database memory overflow. Default value: 200.
Note
- In OBLOADER V4.2.0 and later, the default value of the --batch option can be adapted based on the Java virtual machine (JVM) memory.
- In OBLOADER V4.2.8.1 and V4.2.8.2, the number of parameters in an executed prepared statement may exceed the upper limit, causing a core dump in ODP. In this case, you can manually set this option to an appropriate value.
--thread int_num

The number of concurrent threads allowed. This option corresponds to the number of import threads. You can use this option with the --rw option to calculate the number of file parsing threads based on the following formula: Number of file parsing threads = Value of --thread × Value of --rw. The default value is the value of the number of CPU cores multiplied by 2. If the number of CPU cores is greater than 16, the maximum value is 32. OceanBase Database executes DDL statements in sequence. Therefore, you do not need to specify this option when importing database object definitions.
--block-size int_num

The size of a file to be imported. When specifying this option, you do not need to explicitly specify the unit. The default unit is MB. By default, OBLOADER automatically splits a large file into multiple logical subfiles (or blocks) sized 64 MB. The logical subfiles do not occupy additional storage space. Default value: 64.
--max-tps int_num

The maximum import speed. You can specify this option to ensure import stability.
--max-wait-timeout int_num

The timeout period of waiting for a database major compaction to complete. When specifying this option, you do not need to explicitly specify the unit. The default unit is hour. When the database is under a major compaction, OBLOADER stops data import and waits up to the specified period. The default value is 3.

Other options

--truncate-table

Truncates tables in the destination database. This option can be used only in combination with a data format option. When being used in combination with the --all or --table '*' option, this option specifies to truncate all tables in the destination database. If you want to truncate only some of the tables, you can explicitly specify them in the format of --table 'test1,test2,[....]'. When being used in combination with the --with-data-files option, this option specifies to truncate tables that have the corresponding data files.
Notice
- If the --all or --table '*' option is used in combination with the --truncate-table option, the tool truncates all tables in the destination database, even if no corresponding data files of the tables exist in the directory specified by the -f option.
- Do not use this option to truncate the destination database or destination tables. We recommend that you manually truncate tables to avoid impacts on business.
--with-data-files

When being used in combination with the --truncate-table or --delete-from-table option, this option specifies to truncate or clear tables that have corresponding data files before data import. If not used in combination with these options, this option does not take effect.
--delete-from-table

Clears tables in the destination database before the import. This option can be used only in combination with a data format option. When being used in combination with the --all or --table '*' option, this option specifies to clear all tables in the destination database. If you want to clear only some of the tables, you can explicitly specify them in the format of --table 'test1,test2,[....]'. When being used in combination with the --with-data-files option, this option specifies to clear tables that have corresponding data files only.
Notice
- If the --all or --table '*' option is used in combination with the --delete-from-table option, the tool clears all tables in the destination database, even if no corresponding data files of the tables exist in the directory specified by the -f option.
- Do not use this option to clear the destination database or destination tables, especially tables with large data volumes. We strongly recommend that you delete table data manually based on your business requirements to avoid impacts on your business.
--replace-object

Replaces existing database object definitions during the import. For tables and synonyms, existing definitions are deleted and then new ones are created. For functions and stored procedures, existing definitions are replaced by using the CREATE OR REPLACE statement. This option can be used only in combination with the --ddl or --mix option, and does not take effect for --csv, --sql, and other data format options.
Notice
- If an object already exists in the destination database, the object definition is forcibly replaced with the object definition stored in the import file.
- If you do not need to replace objects in the destination database, do not use this option. Otherwise, business may be affected.

Options

Option	Required?	Description	Introduced in
-h(--host)	Yes	The host IP address for connecting to OceanBase Database Proxy (ODP) or a physical OceanBase Database node.
-P(--port)	Yes	The host port for connecting to ODP or a physical OceanBase Database node.
--rpc-port	No	The RPC port for connecting to the OBServer node.	V4.2.5
--compat-mode	No	Imports MySQL table schema definitions in a compatible mode. If you have specified this option when you import MySQL table schema definitions, OBLOADER converts the original `CREATE TABLE` statement to a synonymous statement that is supported by OceanBase Database in MySQL mode. Notice You can import only MySQL table definitions. Only MySQL 5.6, 5.7, and 8.0 are supported.	V4.2.7
--direct	No	Specifies to use the bypass import mode.	V4.2.5
--parallel	No	The degree of parallelism (DOP) for loading data in bypass import mode.	V4.2.6
-u(--user)	Yes	The username, tenant name, and cluster name, in the format of `<user>@<tenant>#<cluster>`.
-p(--password)	No	The password that you use to log on to the database.
-D(--database)	No	The name of the database.
-f(--file-path)	Yes	The directory that stores the data file or the absolute path of the data file.
--sys-user	No	The name of the user in the `sys` tenant.
--sys-password	No	The password of the user in the `sys` tenant.
--public-cloud	No	Indicates that the database environment is ApsaraDB for OceanBase.
--file-suffix	No	The file name extension. Generally, the file name extension is correlated with the file format. For example, a CSV file is usually named as `xxx.csv`. If you do not strictly follow the naming conventions, you may name a CSV file with any extension, such as .txt. In this case, OBLOADER cannot identify the file as a CSV file. This option is optional. A default value is available for each data format. The default file name extension is `.csv` for a CSV file, `.sql` for an SQL file, and `.dat` for a CUT or POS file. When you specify this option on the command line, enclose the value in single quotation marks (''), for example: `--file-suffix '.txt'`.
--file-encoding	No	The file character set, which is different from the database character set. When you specify this option on the command line, enclose the value in single quotation marks (' '), for example, `--file-encoding 'GBK'`. Default value: UTF-8.
--ctl-path	No	The directory of the control files.
--log-path	No	The directory where log files are stored.
--ddl	No	Imports DDL files. A DDL file stores the database object definitions, and is named in the format of object name-schema.sql. When this option is specified, only database object definitions are imported, and table data is not imported. Notice Avoid comments or statements to enable/disable a feature in the file. If database objects depend on each other, the import may fail and manual intervention is required.
--csv	No	Imports data files in the CSV format. This option is recommended. A CSV file stores data in the standard CSV format, and is named in the format of table name.csv. For CSV format specifications, see the definitions in RFC 4180. Delimiter errors are the most common errors that occur in CSV files. Single or double quotation marks are usually used as the delimiter. If data in the file contains the delimiter, you must specify escape characters. Otherwise, OBLOADER fails to parse the data due to its incorrect format. We strongly recommend that you use the CSV format. We recommend that you use this option with the `--table` option. When this option is used in combination with the `--all` option, OBLOADER imports only table data but not database object definitions.
--sql	No	Imports data files in the SQL format, which is different from DDL files. An SQL file stores data in the format of INSERT statements, and is named in the format of table name.sql. Each line of table data corresponds to an executable INSERT statement in an SQL file. An SQL file is different from a DDL file in terms of content format. We recommend that you use this option with the `--table` option. When this option is used in combination with the `--all` option, OBLOADER imports only table data but not database object definitions. Notice The data cannot contain SQL functions, special characters, line breaks, and so on. Otherwise, the file may not be correctly parsed.
--orc	No	Imports data files in the ORC format. An ORC file stores data in the column-oriented format, and is named in the format of table name.orc. For more information about ORC format definitions, see ORC Specification.	V4.0.0
--par	No	Imports data files in the Parquet format. A Parquet file stores data in the column-oriented format, and is named in the format of table name.parquet. For more information about Parquet format definitions, see File Format on the official Apache Parquet website. Note When you use OBLOADER V4.2.5 or earlier to import Parquet files, the DECIMAL, DATE, TIME, and TIMESTAMP data types are not supported.	V4.0.0
--mix	No	Imports a mixed file that contains both definitions and data. A MIX file stores a mix of DDL and DML statements, and does not have strict naming conventions. Notice MIX files do not have a strict format and feature a complex processing process and poor performance. Therefore, we recommend that you do not use MIX files.
--pos	No	Imports data files in the POS format. A POS file stores data with a fixed length in bytes, and is named in the format of table name.dat. Data stored in each column of a POS file occupies a fixed number of bytes. A column value shorter than specified is padded with spaces. A column value longer than specified is truncated, in which case data may be garbled. We recommend that you use this option with the `--table` option. When this option is used in combination with the `--all` option, OBLOADER imports only table data but not database object definitions. This format is different from the format that stores data with a fixed length in characters.
--cut	No	Imports data files in the CUT format. A CUT file uses a character or a character string as the separator string, and is named in the format of table name.dat. How to distinguish the CUT format from the CSV format? A file in the CSV format uses a single character, which is usually a comma (,), to separate fields. A file in the CUT format usually uses a string, such as `\\|@\\|`, to separate fields. A file in the CSV format uses single quotation marks or double quotation marks as delimiters between fields, while a file in the CUT format does not have delimiters. We recommend that you use this option with the `--table` option. When this option is used in combination with the `--all` option, OBLOADER imports only table data but not database object definitions. Notice In a CUT file, each data record is stored in an entire line. When a single character is used as the field separator, avoid special characters in the data, such as delimiters, carriage returns, and line breaks. Otherwise, OBLOADER cannot correctly parse the data. When you specify the `--cut` option on the OBLOADER command line, do not use the `--trail-delimiter` option if no field separator or separator string exists at the end of each data line in the file. Otherwise, the data cannot be correctly imported to the database.
--avro	No	Imports data files in the Avro format. A file in the Avro format contains content that complies with the Apache Avro specification. The file name extension is `.avro`. Supported data types and limitations are as follows: All primary types are supported. All logical types are supported. Complex types are not supported.	V4.3.0
--all	No	Imports all supported database object definitions and table data.
--table-group	No	Imports table group definitions.	V3.1.0
--table	No	Imports table definitions or table data.
--view	No	Imports view definitions.
--trigger	No	Imports trigger definitions.
--sequence	No	Imports sequence definitions.
--synonym	No	Imports synonym definitions.
--type	No	Imports type definitions.	V4.0.0
--type-body	No	Imports type body definitions.
--package	No	Imports package definitions.
--package-body	No	Imports package body definitions.
--function	No	Imports function definitions.
--procedure	No	Imports stored procedure definitions.
--replace-object	No	Replaces existing object definitions. We recommend that you replace object definitions manually, rather than using this option.
--rw	No	The proportion of data file parsing threads.
--slow	No	The threshold for triggering a slow import.
--pause	No	The threshold for triggering an import pause.
--batch	No	The number of records for a batch of transactions.
--thread	No	The number of concurrent import threads allowed.
--block-size	No	The size of a file to be imported.
--retry	No	Reimports data from the last savepoint.
--max-tps	No	The maximum import speed. Default unit: lines/second.
--max-wait-timeout	No	The timeout period of waiting for a database major compaction to complete.
--nls-date-format	No	The session-level datetime format, which is supported only for OceanBase Database in Oracle mode.
--nls-timestamp-format	No	The session-level timestamp format, which is supported only for OceanBase Database in Oracle mode.
--nls-timestamp-tz-format	No	The session-level timestamp format with a time zone, which is supported only for OceanBase Database in Oracle mode.
--trail-delimiter	No	Truncates the last column separator in a line.
--with-trim	No	Deletes the space characters on the left and right sides of the data.
--skip-header	No	Skips the first line of data in CSV/CUT files when the files are imported. Only OBLOADER V3.3.0 and later support skipping the first line of data in a CUT file.
--skip-footer	No	Skips the last line of data in a CUT file during import.	V3.3.0
--null-string	No	Replaces the specified character with NULL. Default value: \N.
--empty-string	No	Replaces the specified character with an empty string (' '). Default value: \E.
--line-separator	No	The line separator. Custom line separators are supported for the import of CUT files. Default value: \n. Custom line separators are supported for the import of CUT files. The default value of this option depends on the system platform. Only the following values are supported: \r, \n, and \r\n.
--column-separator	No	The column separator, which is different from the column separator string in the CUT format.
--escape-character	No	The escape character. The value can only be a single character. Default value: \. Note You can use hexadecimal characters as separators, so that OBLOADER takes the separators as special (invisible) characters.
--column-delimiter	No	The string delimiter. This option can be used only in combination with the `--csv` option and supports a single character only. Default value: single quotation mark (').
--ignore-unhex	No	Ignores hexadecimal strings in decoding.
--exclude-table	No	Excludes the specified tables from the import of table definitions and table data.
--exclude-data-types	No	Excludes the specified data types from the import of data.
--column-splitter	No	The column separator string, which is different from the column separator in the CSV format.
--max-discards	No	The maximum amount of duplicate data allowed in a single table to import. Default value:`-1`.
--max-errors	No	The maximum number of errors allowed for importing a table. Default value: `1000`.
--exclude-column-names	No	Excludes the specified columns from the import of data.
--replace-data	No	Replaces duplicate data in the table. This option is only applicable to tables that have primary keys or unique keys with the `NOT NULL` constraint.
--truncate-table	No	Truncates tables in the destination database. We recommend that you truncate tables manually, rather than using this option.
--with-data-files	No	Truncates or clears tables with specified data files.	V3.1.0
--delete-from-table	No	Clears tables in the destination database before the import. We recommend that you clear tables manually, rather than using this option.
-V(--version)	No	Shows the OBLOADER version.
--no-sys	No	Specifies that the password of the sys tenant cannot be provided in OceanBase Database.	V3.3.0
--logical-database	No	Imports data by using ODP (Sharding).	V3.3.0
--file-regular-expression	No	The regular expression that is used to match the file name during a single-table import.	V3.3.0
--ignore-escape	No	Ignores escape operations on characters for the import of CUT format files.	V3.3.0
--character-set	No	The character set for creating a database connection. The default value is the value specified by the session variable `jdbc.url.character.encoding` in the `session.properties` file. The value of the `--character-set` option overrides that of the `jdbc.url.character.encoding` variable. Supported character sets include binary, gbk, gb18030, utf16, and utf8mb4.	V4.2.4
--strict	No	Specifies to control the impact of dirty data on the process end status during import.	V4.2.4
-H(--help)	No	Shows the help information of the OBLOADER command-line tool.

OceanBase

Customer Stories

Documentation

Command-line options

Overview

Option styles

Option categories

Required options

Basic options

Connection options

Import type

Note

Note

Bypass import

Note

Note

Feature options

File formats

Note

Decompress data for import

Database object types

Note

Notice

Storage path

Other options

Advanced options

Feature options

Timestamp formats

Allowlist- and blocklist-based filtering

Notice

Error handling

Note

Notice

Note

Notice

Performance options

Note

Other options

Notice

Notice

Notice

Options

Notice

Notice

Notice

Note

Notice

Notice

Note