OBLOADER is a client-side data import tool developed in Java. It provides extensive command-line options that allow you to import schemas and data into OceanBase Database in many complex scenarios. OBLOADER is usually used in combination with OBDUMPER. You can also use OBLOADER to import SQL or CSV files exported by third-party tools such as Navicat, Mydumper, and SQL Developer. OBLOADER takes full advantage of the distributed architecture of OceanBase Database and is specifically optimized for import performance.
Features
You can use OBLOADER to import table schemas and data to OceanBase Database. OBLOADER provides the following features:
- Allows you to import the DDL statements of database objects.
- Allows you to import data files in the CSV or SQL format.
- Allows you to import files in complex data formats such as POS, CUT, and MIX.
- Allows you to set data preprocessing rules.
- Allows you to configure field mappings between files and tables.
- Supports features such as throttling, memory explosion prevention, resumption after an interruption, and automatic retries.
- Allows you to specify a log directory and store bad data and conflict data.
- Supports data import in limited mode, in which you do not need to specify the account information of users under the sys tenant.
Note
In limited mode, the schema definitions exported by this tool may be incomplete, and import and export performance is reduced.
Supported OceanBase Database versions
The following table lists the OceanBase Database versions that OBLOADER supports.
| Database | Supported version |
|---|---|
| OceanBase Database in Oracle mode | V2.0.x, V2.1.x, V2.2.20, V2.2.30, V2.2.50, V2.2.70, V2.2.71, V2.2.72, V2.2.76, and V3.1.x |
| OceanBase Database in MySQL mode | V1.4.70, V1.4.72, V1.4.75, V1.4.78, V1.4.79, V2.2.30, V2.2.50, V2.2.70, V2.2.71, V2.2.72, V2.2.76, and V3.1.x |
Considerations
- For more information about the standard CSV format, see RFC 4180. We recommend that you prepare data files in strict accordance with this specification; hypothetical sample rows follow the file format descriptions below.
- If you intend to import or export a large amount of data, modify the JVM memory options in the startup script, as shown in the sketch after this list. The default value is -Xms4G -Xmx4G.
- The letter case of the object names, data file names, and rule file names specified in command-line options must be consistent. By default, names are uppercase in Oracle mode and lowercase in MySQL mode.
- Data files to be imported must be named in the table name.<random character>.extension format. For example, the sample logs below import test0.0.0.csv into the table test0.
- If a database contains tables with foreign keys, schemas and data may not be imported in dependency order, which can cause the import to fail.
- Import resumption after an interruption is not supported for tables without primary keys.
- When you import data, make sure that the letter case of the specified object names matches that of the data file names. If a database contains object names in both uppercase and lowercase, enclose the lowercase object names and file names in double quotation marks ("), for example, --table '"t1"'.
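The following sketch shows how the JVM memory edit might look. It assumes that the startup script passes heap settings through a JAVA_OPTS-style variable; the actual variable name and script layout depend on your OBLOADER release, so locate the -Xms/-Xmx flags in the bin/obloader script shipped with your version:

# Hypothetical excerpt from bin/obloader; the variable name JAVA_OPTS is an assumption.
# Raise the initial (-Xms) and maximum (-Xmx) heap from the 4G default for a large import.
JAVA_OPTS="-Xms8G -Xmx8G"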
OBLOADER supports the following file formats:
- DDL: A file in the DDL format contains only DDL statements.
- SQL: A file in the SQL format contains only INSERT statements. Each statement occupies one line, without line breaks.
- MIX: A file in the MIX format can contain mixed types of statements, such as DDL and DML statements.
- POS: In a file in the POS format, data is organized based on the positions of byte offsets.
- CUT: A file in the CUT format stores data separated by a separator string, whereas data in a CSV file is separated by a single character.
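For illustration only, the hypothetical rows below encode the same three fields in both formats. The CUT separator string "|@|" is an arbitrary choice for this sketch, not a fixed OBLOADER default.

A CSV row (single-character separator; the quotes protect the embedded comma, per RFC 4180):
1,"Smith, John",2021-09-29
The same row in the CUT format (multi-character separator string, assumed here to be "|@|"):
1|@|Smith, John|@|2021-09-29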
Limited mode
OBLOADER provides a limited mode. You can import data in limited mode by specifying the --public-cloud option. Limited mode is supported for OceanBase clusters deployed in the public cloud. For OceanBase clusters deployed in a private cloud, you must specify --sys-password or configure a key file before you use this tool.
The import features, performance, and stability of OBLOADER may be compromised in limited mode. In addition, only OceanBase V2.2.30 and later versions support throttling on the server. Therefore, when you use limited mode, run the following commands to modify the throttling settings on the server:
ALTER SYSTEM SET freeze_trigger_percentage=50;
ALTER SYSTEM SET minor_merge_concurrency=64;
ALTER SYSTEM SET writing_throttling_trigger_percentage=80 tenant='xxx';
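As a sketch, a limited-mode invocation reuses the regular connection options but replaces the sys-tenant credentials with --public-cloud. The placeholders match the syntax template in the Examples section below; whether options such as -c and -t apply depends on your deployment:

bin/obloader -h <host IP address> -P <port> -u <user> -p <password> -c <cluster> -t <tenant> -D <database name> --public-cloud --csv --all -f <data file or directory>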
Examples
Import table schemas and data into OceanBase Database separately. The command-line syntax of OBLOADER is as follows:
bin/obloader -h <host IP address> -P <port> -u <user> -p <password> --sys-user <root or proxyro user of the sys tenant> --sys-password <password of the sys tenant user> -c <cluster> -t <tenant> -D <database name> [--ddl] [--csv|--sql] [--all|--table 'table name'] -f <data file or directory>
OBLOADER must obtain the metadata of table schemas during an import, so you must log on as a user under the sys tenant. We recommend that you use the proxyro user for the import.
Import table schemas
Import table schemas into the test2 database of the obmysql tenant. Sample code:
[root@obce-0000 ob-loader-dumper-3.0.0-SNAPSHOT]# bin/obloader -h 10.0.0.0 -P 2883 -u u_loader -p ****** --sys-user=proxyro --sys-password=******** -c obce-3zones -t obmysql -D test2 --ddl --all -f /tmp/obdumper
2021-09-29 21:26:05 [INFO] Parsed args:
-h[--host] 10.0.0.0
-P[--port] 2883
-u[--user] u_loader
-p[--password] ******
[--sys-user] proxyro
[--sys-password] ******
-c[--cluster] obce-3zones
-t[--tenant] obmysql
-D[--database] test2
[--ddl]
[--all]
-f[--file-path] /tmp/obdumper
2021-12-29 14:02:25 [INFO] The security certificate file: "/ob-loader-dumper-3.0.0-SNAPSHOT/conf/secure.crt" is not exists
2021-12-29 14:02:25 [INFO] No control files were defined in the path: "/tmp/obdumper"
2021-12-29 14:02:25 [INFO] Load jdbc driver class: "com.oceanbase.jdbc.Driver" finished
2021-12-29 14:02:25 [INFO] The manifest file: "/tmp/obdumper/data/MANIFEST.bin" has been saved
2021-12-29 14:02:26 [INFO] Init writer thread pool finished
2021-12-29 14:02:26 [WARN] The object type : "SEQUENCE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [INFO] No.1 sql of the file: "/tmp/obdumper/data/test/TABLE/test1-schema.sql" exec success . Elapsed: 141.6 ms
2021-12-29 14:02:26 [INFO] Load file: "test1-schema.sql" finished
2021-12-29 14:02:26 [INFO] No.1 sql of the file: "/tmp/obdumper/data/test/TABLE/test0-schema.sql" exec success . Elapsed: 227.2 ms
2021-12-29 14:02:26 [INFO] Load file: "test0-schema.sql" finished
2021-12-29 14:02:26 [WARN] The object type : "VIEW" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "FUNCTION" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PROCEDURE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "TRIGGER" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PACKAGE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "TYPE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PACKAGE_BODY" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "TYPE_BODY" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "SYNONYM" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PUBLIC_SYNONYM" doesn't exist in the -schema.sql files
2021-12-29 14:02:27 [INFO] Close connection count: 1 of the DataSource. Key: 11_124_5_29_44564_857230679_test
2021-12-29 14:02:27 [INFO] Close connection count: 0 of the DataSource. Key: 11_124_5_29_44564_167438737_oceanbase
2021-12-29 14:02:27 [INFO] Shutdown task context finished
2021-12-29 14:02:27 [INFO]
Finished Tasks: 2 Running Tasks: 0 Progress: 100.00%
2021-12-29 14:02:27 [INFO]
All Load Tasks Finished:
----------------------------------------------------------------------------------------------------------------------------
No.# | Type | Name | Count | Status
----------------------------------------------------------------------------------------------------------------------------
1 | TABLE | test0 | 1 | SUCCESS
2 | TABLE | test1 | 1 | SUCCESS
----------------------------------------------------------------------------------------------------------------------------
Total Count: 2 End Time: 2021-12-29 14:02:27
2021-12-29 14:02:27 [INFO] Load schema finished. Total Elapsed: 1.057 s
2021-12-29 14:02:27 [INFO] System exit 0
[root@obce-0000 ob-loader-dumper-3.0.0-SNAPSHOT]#
Import data
Import the data files into the test2 database of the obmysql tenant. Sample code:
[root@obce-0000 ob-loader-dumper-3.0.0-SNAPSHOT]# bin/obloader -h 10.0.0.0 -P 2883 -u u_loader -p ****** --sys-user=proxyro --sys-password=******** -c obce-3zones -t obmysql -D test2 --csv --all -f /tmp/obdumper
2021-09-29 21:27:53 [INFO] Parsed args:
-h[--host] 10.0.0.0
-P[--port] 2883
-u[--user] u_loader
-p[--password] ******
[--sys-user] proxyro
[--sys-password] ******
-c[--cluster] obce-3zones
-t[--tenant] obmysql
-D[--database] test2
[--csv]
[--all]
-f[--file-path] /tmp/obdumper
2021-12-29 14:07:21 [INFO] The security certificate file: "/ob-loader-dumper-3.0.0-SNAPSHOT/conf/secure.crt" is not exists
2021-12-29 14:07:21 [INFO] No control files were defined in the path: "/tmp/obdumper"
2021-12-29 14:07:21 [INFO] Load jdbc driver class: "com.oceanbase.jdbc.Driver" finished
2021-12-29 14:07:21 [INFO] The manifest file: "/tmp/obdumper/data/MANIFEST.bin" has been saved
2021-12-29 14:07:22 [INFO] Query the column metadata for the table: "test0" finished
2021-12-29 14:07:22 [INFO] Query the column metadata for the table: "test1" finished
2021-12-29 14:07:22 [INFO] Binding table: "test0" to the file: "/tmp/obdumper/data/test/TABLE/test0.0.0.csv" finished
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/test/TABLE/test0-schema.sql" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/test/TABLE/test1-schema.sql" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/MANIFEST.bin" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/CHECKPOINT.bin" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [INFO] Splitted 1 csv subfiles by 64.0 MB. Elapsed: 15.57 ms
2021-12-29 14:07:22 [INFO] Generate 1 subfiles finished
2021-12-29 14:07:22 [INFO] Ignore to clean any tables as --truncate-table or --delete-from-table is not specified
2021-12-29 14:07:22 [INFO] Ignore to query table entry for table: "test1" without datafiles. Remain: 1
2021-12-29 14:07:22 [INFO] Query table entry and primary key for table: "test0" finished. Remain: 0
2021-12-29 14:07:22 [INFO] Ignore to shuffle data files for table: "test1" without datafiles. Remain: 1
2021-12-29 14:07:50 [INFO] Query the leader location "test0" in multi unit finished. Elapsed: 73.12 ms
2021-12-29 14:07:50 [INFO] Query the leader location of "test0" finished. Remain: 0
2021-12-29 14:07:50 [INFO] Calculate leader: xx.xx.xx.xx:11141 of table: "test0", part: 0. Remain: 0
2021-12-29 14:07:50 [INFO] Waiting to refresh observer load status ......
2021-12-29 14:07:50 [INFO] Refresh the observer load status success. Table: "test0". Remain: 0
2021-12-29 14:07:50 [INFO] Refresh observer load status finished. Elapsed: 116.4 ms
2021-12-29 14:07:50 [INFO] Create 16384 slots for ring buffer finished. [xx.xx.xx.xx:11141]
2021-12-29 14:07:51 [INFO] Start 192 database writer threads finished. [xx.xx.xx.xx:11141]
2021-12-29 14:07:51 [INFO] Start 38 csv file reader threads successed
2021-12-29 14:07:51 [INFO] File: "/tmp/obdumper/data/test/TABLE/test0.0.0.csv" has been parsed finished
2021-12-29 14:07:55 [INFO]
1. Enqueue Performance Monitor:
-------------------------------------------------------------------------------------------------------
Dimension \ Metric | Tps | Throughput | Buffer
-------------------------------------------------------------------------------------------------------
1.sec.avg | 2.03 Records/sec | 0.0 KB/sec | 1 Slots
1.min.avg | 0.0 Records/min | 0.0 KB/min | 1 Slots
Total | 10 Records | 0.0 KB | 1 Slots
-------------------------------------------------------------------------------------------------------
2. Dequeue Performance Monitor:
-------------------------------------------------------------------------------------------------------
Dimension \ Metric | Tps | Throughput | Buffer
-------------------------------------------------------------------------------------------------------
1.sec.avg | 2.0 Records/sec | 0.0 KB/sec | 1 Slots
1.min.avg | 2.0 Records/min | 2.0 KB/min | 1 Slots
Total | 10 Records | 0.0 KB | 1 Slots
-------------------------------------------------------------------------------------------------------
2021-12-29 14:08:13 [INFO] Wait for the all the workers to drain of published events then halt the workers
2021-12-29 14:08:13 [INFO] Close connection count: 4 of the DataSource. Key: 11_124_5_29_44564_857230679_test
2021-12-29 14:08:14 [INFO] Close connection count: 17 of the DataSource. Key: 11_124_5_29_44564_167438737_oceanbase
2021-12-29 14:08:14 [INFO] Shutdown task context finished
2021-12-29 14:08:14 [INFO]
Finished Tasks: 1 Running Tasks: 0 Progress: 100.00%
2021-12-29 14:08:14 [INFO]
All Load Tasks Finished:
----------------------------------------------------------------------------------------------------------------------------
No.# | Type | Name | Count | Status
----------------------------------------------------------------------------------------------------------------------------
1 | TABLE | test0 | 10 -> 10 | SUCCESS
----------------------------------------------------------------------------------------------------------------------------
Total Count: 10 End Time: 2021-12-29 14:08:14
2021-12-29 14:08:14 [INFO] Load record finished. Total Elapsed: 2.05 s
2021-12-29 14:08:14 [INFO] System exit 0
Check the imported table schemas and data
MySQL [test]> SHOW create table test2.t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`c1` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8mb4 ROW_FORMAT = COMPACT COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 3 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
1 row in set
MySQL [test]> SELECT * FROM test2.t1;
+----+---------------------+
| id | c1 |
+----+---------------------+
| 1 | 2021-09-29 21:16:03 |
| 2 | 2021-09-29 21:16:05 |
| 3 | 2021-09-29 21:16:05 |
| 4 | 2021-09-29 21:16:06 |
| 5 | 2021-09-29 21:16:06 |
| 6 | 2021-09-29 21:16:18 |
| 7 | 2021-09-29 21:16:18 |
| 8 | 2021-09-29 21:16:19 |
+----+---------------------+
8 rows in set
More information
For more information about how to use OBLOADER, see Data processing.