OBLOADER is a client data import tool developed in Java. OBLOADER provides extensive command-line options that allow you to import schemas and data to OceanBase Database in many complex scenarios.
Background
OBLOADER is usually used in combination with OBDUMPER. You can also use OBLOADER to import files in the SQL or CSV format that are exported by using third-party tools, such as Navicat, Mydumper, and SQL Developer. OBLOADER fully exploits the features of the distributed OceanBase Database system. The tool is particularly optimized for import performance and stability, and is enhanced to provide more operation monitoring information to improve user experience.
Features
You can use OBLOADER to import table structures and data to OceanBase Database. OBLOADER provides the following features:
Allows you to import the DDL statements of database objects.
Allows you to import data files in the standard CSV or SQL format.
Allows you to import files where data is stored with a fixed length in bytes, files where data is split by a string, and files that store a mix of DDL and DML statements.
Allows you to configure rules that control data preprocessing and field mappings between files and tables for data import.
Supports features such as import speed limiting, memory explosion prevention, resumption after an interruption, and automatic retries.
Allows you to specify a log directory and store bad data and conflict data.
Allows you to import data from Object Storage Service (OSS) to OceanBase Database.
Supported OceanBase Database versions
The following table describes OceanBase Database modes and versions that are supported.
| OceanBase Database mode | Supported version |
|---|---|
| Oracle mode | V2.0.x, V2.1.x, V2.2.20, V2.2.30, V2.2.50, V2.2.70, V2.2.71, V2.2.72, V2.2.76, V3.1.x, and V3.2.x |
| MySQL mode | V1.4.70, V1.4.72, V1.4.75, V1.4.78, V1.4.79, V2.2.30, V2.2.50, V2.2.70, V2.2.71, V2.2.72, V2.2.76, V3.1.x, and V3.2.x |
Considerations
For more information about the standard CSV format, see the RFC 4180 specifications. We recommend that you import data in strict accordance with the RFC 4180 specifications.
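As a quick illustration of the RFC 4180 rules (a field that contains a comma, a double quotation mark, or a line break must be enclosed in double quotation marks, and an embedded double quotation mark is escaped by doubling it), the following hypothetical snippet writes a compliant CSV file:

```shell
# Hypothetical sample: write an RFC 4180-compliant CSV file.
# The comma inside Alice's comment requires quoting the field;
# the quotes around "hello" are doubled inside the quoted field.
cat > /tmp/sample.csv <<'EOF'
id,name,comment
1,Alice,"likes apples, pears"
2,Bob,"said ""hello"""
EOF
cat /tmp/sample.csv
```

Files in this shape can be parsed unambiguously by any RFC 4180-compliant reader.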
When you import a large amount of data, you need to increase the Java virtual machine (JVM) memory parameters in the startup script.
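For example, the launch script typically embeds JVM heap options such as `-Xms4G -Xmx4G`; the exact default values and script layout vary by OBLOADER version, so check `bin/obloader` (or `obloader.bat` on Windows) first. The commands below are shown on a stand-in file so they are safe to try:

```shell
# Hypothetical sketch: raise the JVM heap options before a large import.
# /tmp/obloader.demo stands in for the real launch script here.
printf 'JAVA_OPTS="-Xms4G -Xmx4G"\n' > /tmp/obloader.demo
sed -i 's/-Xms4G -Xmx4G/-Xms8G -Xmx8G/' /tmp/obloader.demo
cat /tmp/obloader.demo
```

Keep a backup of the real script before editing it, and size the heap to fit the machine's available memory.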
The object names, data file names, and rule file names specified in command-line options must use consistent capitalization. By default, uppercase letters are used in Oracle mode and lowercase letters are used in MySQL mode. If a database contains object names in both uppercase and lowercase, enclose the lowercase object names and file names in brackets ([ ]).
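For example (with hypothetical connection values and a hypothetical table name), a lowercase table in an otherwise uppercase Oracle-mode schema could be specified as:

./obloader -h 127.0.0.1 -P 2883 -u user -p '****' -D TEST --csv --table '[lower_case_table]' -f /tmp/obdumper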
All imported data files are named in the table.group.sequence.suffix format, for example, test0.0.0.csv.
If a database object has dependencies on other objects, object definitions and data may not be imported in dependency order.
Tables without primary keys do not support import resumption after an interruption or data substitution.
In OceanBase Database V1.4.79 in MySQL mode, resolving primary key conflicts with the insert ... where not exists statement may result in cross-partition insertion errors.
In OceanBase Database V1.4.x in MySQL mode, the metadata of a RANGE COLUMNS-KEY composite partitioned table is defective in the virtual routing view.
OBLOADER supports the DDL, SQL, MIX, and CSV file formats, which correspond to the --ddl, --sql, --mix, and --csv options described in the Syntax section.
Limited mode
Note
The limited mode is applicable to scenarios where you cannot directly specify the --sys-user or --sys-password option.
If you enable the limited mode (by specifying --public-cloud) when you import data to an OceanBase cluster deployed in a public cloud, you do not need to specify the -t or -c option. When you import data in limited mode to an OceanBase cluster deployed in a private cloud, you must specify the -t option, and additionally the -c option if you connect through OceanBase Database Proxy (ODP). If you do not enable the limited mode, you must specify the --sys-user and --sys-password options for OBLOADER.
The limited mode does not support all features of OBLOADER, so import performance and stability may be affected. OceanBase Database V2.2.30 and later versions support throttling on the server. To ensure the stability of data import in limited mode, you can run the following statements to modify the throttling thresholds on the server:
alter system set freeze_trigger_percentage=50;
alter system set minor_merge_concurrence=64;
alter system set writing_throttling_trigger_percentage=80 tenant='xxx';
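To illustrate the limited-mode behavior described above (with placeholder connection values), an import to a public-cloud cluster enables --public-cloud and omits the -c, -t, --sys-user, and --sys-password options:

./obloader -h <host IP address> -P 2883 -u user -p '****' -D test --public-cloud --csv --all -f /tmp/obdumper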
Syntax
You can import table schemas and data into OceanBase Database separately. The syntax of the OBLOADER command is as follows:
./obloader -h <host IP address> -P <port> -u <user> -p <password> --sys-user <root or proxyro user in the sys tenant> --sys-password <password of the user in the sys tenant> -c <cluster> -t <tenant> -D <schema (database) name> [--ddl] [--csv|--sql] [--all|--table 'table name'] -f <data file or directory>
| Parameter | Required | Description | Example |
|---|---|---|---|
| -h(--host) | Yes | The IP address of the OBProxy or OBServer to connect to. | -h 127.0.0.1 |
| -P(--port) | Yes | The port number of the OBProxy or OBServer to connect to. | -P 2881 |
| -c(--cluster) | No | The cluster name of the database. You do not need to specify this option if OBLOADER is directly connected to an OBServer. | -c 'cluster1' |
| -t(--tenant) | No | The tenant name of the user. | -t sys |
| -u(--user) | Yes | The username that you use to log on to the database. | -u user_name |
| -p(--password) | No | The password that you use to log on to the database. | -p '**1***' |
| -D(--database) | No | The name of the database. | -D 'database1' |
| -f(--file-path) | Yes | The directory that stores the data file or the absolute path of the data file. | -f '/path/file' |
| --sys-user | No | The username of a user in the sys tenant. Default value: root@sys. Notice: OBLOADER must obtain the metadata of the table structure during an import, which requires logging on to the sys tenant. We recommend that you use the proxyro user for the import. | --sys-user 'root@sys' |
| --sys-password | No | The password of the user specified by the --sys-user option. By default, this option is left empty. | --sys-password '**1***' |
| --ddl | No | Imports DDL files. Default file extension: -schema.sql. | --ddl |
| --sql | No | Imports files in the SQL format. Default file extension: .sql. | --sql |
| --mix | No | Imports files that contain both schemas and data. | --mix |
| --csv | No | Imports files in the CSV format. We recommend that you use this option. Default file extension: .csv. | --csv |
| --all | No | Imports the schemas of all database objects. | --all |
| --table | No | The table to be imported. Separate multiple tables with commas (,). If you set this option to '*', all tables in the database specified by the -D option are imported. | --table 'table1' |
For more information about OBLOADER parameters, see Command-line options of OBLOADER.
Import table schemas
Import the table schemas into the test2 database of the obmysql tenant. Sample code:
# bin/obloader -h 10.0.0.0 -P 2883 -u u_loader -p ****** --sys-user=proxyro --sys-password=****** -c obce-3zones -t obmysql -D test2 --ddl --all -f /tmp/obdumper
2021-09-29 21:26:05 [INFO] Parsed args:
-h[--host] 10.0.0.0
-P[--port] 2883
-u[--user] u_loader
-p[--password] ******
[--sys-user] proxyro
[--sys-password] ******
-c[--cluster] obce-3zones
-t[--tenant] obmysql
-D[--database] test2
[--ddl]
[--all]
-f[--file-path] /tmp/obdumper
2021-12-29 14:02:25 [INFO] The security certificate file: "/ob-loader-dumper-3.0.0-SNAPSHOT/conf/secure.crt" is not exists
2021-12-29 14:02:25 [INFO] No control files were defined in the path: "/tmp/obdumper"
2021-12-29 14:02:25 [INFO] Load jdbc driver class: "com.oceanbase.jdbc.Driver" finished
2021-12-29 14:02:25 [INFO] The manifest file: "/tmp/obdumper/data/MANIFEST.bin" has been saved
2021-12-29 14:02:26 [INFO] Init writer thread pool finished
2021-12-29 14:02:26 [WARN] The object type : "SEQUENCE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [INFO] No.1 sql of the file: "/tmp/obdumper/data/test/TABLE/test1-schema.sql" exec success . Elapsed: 141.6 ms
2021-12-29 14:02:26 [INFO] Load file: "test1-schema.sql" finished
2021-12-29 14:02:26 [INFO] No.1 sql of the file: "/tmp/obdumper/data/test/TABLE/test0-schema.sql" exec success . Elapsed: 227.2 ms
2021-12-29 14:02:26 [INFO] Load file: "test0-schema.sql" finished
2021-12-29 14:02:26 [WARN] The object type : "VIEW" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "FUNCTION" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PROCEDURE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "TRIGGER" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PACKAGE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "TYPE" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PACKAGE_BODY" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "TYPE_BODY" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "SYNONYM" doesn't exist in the -schema.sql files
2021-12-29 14:02:26 [WARN] The object type : "PUBLIC_SYNONYM" doesn't exist in the -schema.sql files
2021-12-29 14:02:27 [INFO] Close connection count: 1 of the DataSource. Key: 11_124_5_29_44564_857230679_test
2021-12-29 14:02:27 [INFO] Close connection count: 0 of the DataSource. Key: 11_124_5_29_44564_167438737_oceanbase
2021-12-29 14:02:27 [INFO] Shutdown task context finished
2021-12-29 14:02:27 [INFO]
Finished Tasks: 2 Running Tasks: 0 Progress: 100.00%
2021-12-29 14:02:27 [INFO]
All Load Tasks Finished:
----------------------------------------------------------------------------------------------------------------------------
No.# | Type | Name | Count | Status
----------------------------------------------------------------------------------------------------------------------------
1 | TABLE | test0 | 1 | SUCCESS
2 | TABLE | test1 | 1 | SUCCESS
----------------------------------------------------------------------------------------------------------------------------
Total Count: 2 End Time: 2021-12-29 14:02:27
2021-12-29 14:02:27 [INFO] Load schema finished. Total Elapsed: 1.057 s
2021-12-29 14:02:27 [INFO] System exit 0
[root@obce-0000 ob-loader-dumper-3.0.0-SNAPSHOT]#
Import data
Import the data into the test2 database of the obmysql tenant. Sample code:
# bin/obloader -h 10.0.0.0 -P 2883 -u u_loader -p ****** --sys-user=proxyro --sys-password=****** -c obce-3zones -t obmysql -D test2 --csv --all -f /tmp/obdumper
2021-09-29 21:27:53 [INFO] Parsed args:
-h[--host] 10.0.0.0
-P[--port] 2883
-u[--user] u_loader
-p[--password] ******
[--sys-user] proxyro
[--sys-password] ******
-c[--cluster] obce-3zones
-t[--tenant] obmysql
-D[--database] test2
[--csv]
[--all]
-f[--file-path] /tmp/obdumper
2021-12-29 14:07:21 [INFO] The security certificate file: "/ob-loader-dumper-3.0.0-SNAPSHOT/conf/secure.crt" is not exists
2021-12-29 14:07:21 [INFO] No control files were defined in the path: "/tmp/obdumper"
2021-12-29 14:07:21 [INFO] Load jdbc driver class: "com.oceanbase.jdbc.Driver" finished
2021-12-29 14:07:21 [INFO] The manifest file: "/tmp/obdumper/data/MANIFEST.bin" has been saved
2021-12-29 14:07:22 [INFO] Query the column metadata for the table: "test0" finished
2021-12-29 14:07:22 [INFO] Query the column metadata for the table: "test1" finished
2021-12-29 14:07:22 [INFO] Binding table: "test0" to the file: "/tmp/obdumper/data/test/TABLE/test0.0.0.csv" finished
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/test/TABLE/test0-schema.sql" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/test/TABLE/test1-schema.sql" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/MANIFEST.bin" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [WARN] File: "/tmp/obdumper/data/CHECKPOINT.bin" is unmatched on the suffix[.csv], ignore it
2021-12-29 14:07:22 [INFO] Splitted 1 csv subfiles by 64.0 MB. Elapsed: 15.57 ms
2021-12-29 14:07:22 [INFO] Generate 1 subfiles finished
2021-12-29 14:07:22 [INFO] Ignore to clean any tables as --truncate-table or --delete-from-table is not specified
2021-12-29 14:07:22 [INFO] Ignore to query table entry for table: "test1" without datafiles. Remain: 1
2021-12-29 14:07:22 [INFO] Query table entry and primary key for table: "test0" finished. Remain: 0
2021-12-29 14:07:22 [INFO] Ignore to shuffle data files for table: "test1" without datafiles. Remain: 1
2021-12-29 14:07:50 [INFO] Query the leader location "test0" in multi unit finished. Elapsed: 73.12 ms
2021-12-29 14:07:50 [INFO] Query the leader location of "test0" finished. Remain: 0
2021-12-29 14:07:50 [INFO] Calculate leader: 10.10.10.1:11141 of table: "test0", part: 0. Remain: 0
2021-12-29 14:07:50 [INFO] Waiting to refresh observer load status ......
2021-12-29 14:07:50 [INFO] Refresh the observer load status success. Table: "test0". Remain: 0
2021-12-29 14:07:50 [INFO] Refresh observer load status finished. Elapsed: 116.4 ms
2021-12-29 14:07:50 [INFO] Create 16384 slots for ring buffer finished. [10.10.10.1:11141]
2021-12-29 14:07:51 [INFO] Start 192 database writer threads finished. [10.10.10.1:11141]
2021-12-29 14:07:51 [INFO] Start 38 csv file reader threads successed
2021-12-29 14:07:51 [INFO] File: "/tmp/obdumper/data/test/TABLE/test0.0.0.csv" has been parsed finished
2021-12-29 14:07:55 [INFO]
1. Enqueue Performance Monitor:
-------------------------------------------------------------------------------------------------------
Dimension \ Metric | Tps | Throughput | Buffer
-------------------------------------------------------------------------------------------------------
1.sec.avg | 2.03 Records/sec | 0.0 KB/sec | 1 Slots
1.min.avg | 0.0 Records/min | 0.0 KB/min | 1 Slots
Total | 10 Records | 0.0 KB | 1 Slots
-------------------------------------------------------------------------------------------------------
2. Dequeue Performance Monitor:
-------------------------------------------------------------------------------------------------------
Dimension \ Metric | Tps | Throughput | Buffer
-------------------------------------------------------------------------------------------------------
1.sec.avg | 2.0 Records/sec | 0.0 KB/sec | 1 Slots
1.min.avg | 2.0 Records/min | 2.0 KB/min | 1 Slots
Total | 10 Records | 0.0 KB | 1 Slots
-------------------------------------------------------------------------------------------------------
2021-12-29 14:08:13 [INFO] Wait for the all the workers to drain of published events then halt the workers
2021-12-29 14:08:13 [INFO] Close connection count: 4 of the DataSource. Key: 11_124_5_29_44564_857230679_test
2021-12-29 14:08:14 [INFO] Close connection count: 17 of the DataSource. Key: 11_124_5_29_44564_167438737_oceanbase
2021-12-29 14:08:14 [INFO] Shutdown task context finished
2021-12-29 14:08:14 [INFO]
Finished Tasks: 1 Running Tasks: 0 Progress: 100.00%
2021-12-29 14:08:14 [INFO]
All Load Tasks Finished:
----------------------------------------------------------------------------------------------------------------------------
No.# | Type | Name | Count | Status
----------------------------------------------------------------------------------------------------------------------------
1 | TABLE | test0 | 10 -> 10 | SUCCESS
----------------------------------------------------------------------------------------------------------------------------
Total Count: 10 End Time: 2021-12-29 14:08:14
2021-12-29 14:08:14 [INFO] Load record finished. Total Elapsed: 2.05 s
2021-12-29 14:08:14 [INFO] System exit 0
Check the imported table structure and data
MySQL [test]> SHOW create table test2.t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`c1` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8mb4 ROW_FORMAT = COMPACT COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 3 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0
1 row in set (0.012 sec)
MySQL [test]> SELECT * FROM test2.t1;
+----+---------------------+
| id | c1 |
+----+---------------------+
| 1 | 2021-09-29 21:16:03 |
| 2 | 2021-09-29 21:16:05 |
| 3 | 2021-09-29 21:16:05 |
| 4 | 2021-09-29 21:16:06 |
| 5 | 2021-09-29 21:16:06 |
| 6 | 2021-09-29 21:16:18 |
| 7 | 2021-09-29 21:16:18 |
| 8 | 2021-09-29 21:16:19 |
+----+---------------------+
8 rows in set (0.015 sec)