Quick start

2024-03-12 10:15:22  Updated

This topic describes how to have a quick start with OBLOADER & OBDUMPER.

Step 1: Download the software

Note

OBLOADER & OBDUMPER are no longer distinguished by the community edition and enterprise edition since V4.2.1. You can download the software package from OceanBase Download Center.

Click here to download the package of the latest version, and decompress the package:

$ unzip ob-loader-dumper-4.2.1-RELEASE.zip
$ cd ob-loader-dumper-4.2.1-RELEASE

Step 2: Configure the running environment

Note

You must install Java 8 and configure the JAVA_HOME environment variable in the local environment. We recommend that you install JDK 1.8.0_3xx or later. For more information about environment configuration, see Prepare the environment.

This step aims to modify JVM parameters.

A small JVM memory size may affect the performance and stability of the import and export features. For example, a full garbage collection (GC) or GC crash may occur. We recommend that you modify the JVM memory size, which is -Xms4G -Xmx4G by default, to 60% of the available memory of the server. If you are good at Java performance tuning, you can modify the JVM parameters in the JAVA_OPTS option as needed.

  1. Edit the configuration file that contains the JAVA_OPTS option.

    • In Linux, edit the obloader and obdumper scripts in the {ob-loader-dumper}/bin/ directory.

    • In Windows, edit the obloader.bat and obdumper.bat scripts in the {ob-loader-dumper}/bin/windows/ directory.

  2. Modify JVM parameters.

    JAVA_OPTS="$JAVA_OPTS -server -Xms4G -Xmx4G -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=128M -Xss352K"
    JAVA_OPTS="$JAVA_OPTS -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Xnoclassgc -XX:+DisableExplicitGC
    

Step 3: Prepare data

Notice

When you try the export feature of OBDUMPER, you do not need to prepare any data. You can directly go to Step 4.

When you try the import feature of OBLOADER, you can use existing data files or use the TPC-H tool to generate temporary data files. The format of content in the imported file must meet related specifications. Identify the data format in the file based on Is your data ready.

Step 4: Create a database

Notice

When you try the export feature of OBDUMPER, you also need to create tables and insert data into the tables after you create a database.

  1. Deploy an OceanBase cluster by using OceanBase Cloud Platform (OCP) or OceanBase Deployer (OBD).

  2. Create a test database.

  3. Create a test table and insert data into the table. This operation is optional when you import data.

Step 5: View configuration files

The configuration files of OBLOADER & OBDUMPER refer to the connection configuration file session.config.json and log configuration file log4j2.xml.

Connection configuration file

The connection configuration file {ob-loader-dumper}/conf/session.config.json configures database connection parameters. OBLOADER & OBDUMPER builds a JDBC URL to connect to the target database based on the JDBC parameters in the connection configuration file and then sequentially executes SQL statements for initialization in the established connection. You can modify JDBC parameters and SQL statements for initialization in the connection configuration file. Default connection configurations apply to most scenarios. However, in special cases, you need to manually modify parameters to adapt to different OBServer versions and extract, transform, and load (ETL) scenarios. For more information about the connection configuration file, see Connection settings.

Log configuration file

In the log configuration file {ob-loader-dumper}/conf/log4j2.xml, you can view the log output path and log format, and adjust the log level during self-service troubleshooting. For more information, see How do I customize log file names for an import job?

Step 6: Import data

./obloader -h 'IP address' -P'port' -u'user' -t'tenant' -c'cluster' -p'password' -D'database name' --table 'table name' --csv -f 'file path' --sys-password 'password of the sys tenant' --external

Note

This example only imports data. For more information, see Command-line options of OBLOADER.

The following table describes the database information that is used in the examples.

Database information item Example
Cluster name cluster_a
IP address of the OceanBase Database Proxy (ODP) host xx.x.x.x
ODP port number 2883
Tenant name of the cluster mysql
Name of the root/proxyro user under the sys tenant **u***
Password of the root/proxyro user under the sys tenant ******
User account (with read/write privileges) under the business tenant test
Password of the user under the business tenant ******
Schema name USERA

Import DDL definition files

Scenario: Import all supported database object definitions in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --ddl --all -f /output

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 1 -> 1      | SUCCESS  |                
---------------------------------------------------------

Total Count: 1          End Time: 2023-05-11 16:02:08
...

Note

The --sys-user option is used to connect to a user with required privileges in the sys tenant. If the --sys-user option is not specified during import, the default value --sys-user root takes effect.

Import CSV data files

Scenario: Import all supported CSV data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. For more information about CSV data file (.csv file) specifications, see RFC 4180. CSV data files store data in the form of plain text. You can open CSV data files by using a text editor or Excel.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --csv --table '*' -f /output

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 1 -> 1      | SUCCESS  |                
---------------------------------------------------------

Total Count: 1          End Time: 2023-05-11 16:02:08
...

Import SQL data files

Scenario: Import all supported SQL data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. SQL data files (.sql files) store INSERT SQL statements. You can open SQL data files by using a text editor or SQL editor.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --sql --table '*' -f /output

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 1 -> 1      | SUCCESS  |                
---------------------------------------------------------

Total Count: 1          End Time: 2023-05-11 16:02:08
...

Import POS data files

Scenario: Import all supported POS data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. You must specify the path where the control files reside during the import. POS data files (.dat files by default) organize data based on a byte offset position with a fixed length. To import a POS data file, you must define the fixed length of each field by using a control file. You can open POS data files by using a text editor.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql
-D USERA --table '*' -f /output --pos --ctl-path /output

Note

The table name defined in the database must be in the same letter case as its corresponding control file name. Otherwise, OBLOADER fails to recognize the control file. For more information about the rules for defining control files, see Preprocessing functions.

Import CUT data files

Scenario: Import all supported CUT data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. To import a CUT data file, you must specify the path of the control file and use |@| as the column separator string. CUT data files (.dat files) use a character or character string to separate values. You can open CUT data files by using a text editor.

Sample statement:

$./obloader -h127.1 -P2881 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --table '*' -f /output --cut --column-splitter '|@|' --ctl-path /output

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 1 -> 1      | SUCCESS  |                
---------------------------------------------------------

Total Count: 1          End Time: 2023-05-11 16:02:08
...

Note

The table name defined in the database must be in the same letter case as its corresponding control file name. Otherwise, OBLOADER fails to recognize the control file. For more information about the rules for defining control files, see Preprocessing functions.

Import data files from Amazon S3 to OceanBase Database

Scenario: Import all supported CSV data files in an Amazon S3 bucket to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --csv --table '*' -f /output --storage-uri 's3://obloaderdumper/obdumper?region=cn-north-1&access-key=******&secret-key=******'

The --storage-uri 's3://obloaderdumper/obdumper?region=cn-north-1&access-key=******&secret-key=******' option specifies the storage URI, which comprises the following components.

Component Description
s3 The S3 storage scheme.
obloaderdumper The name of the S3 bucket.
/obdumper The data storage path in the S3 bucket.
region=cn-north-1&access-key=******&secret-key=****** The parameters required for the request.
  • region=cn-north-1: the physical location of the S3 bucket.
  • access-key=******: the AccessKey ID for accessing the S3 bucket.
  • secret-key=******: the AccessKey secret for accessing the S3 bucket.

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 3 -> 3      | SUCCESS  |                
---------------------------------------------------------

Total Count: 3          End Time: 2023-05-11 16:02:08
...

Note

The --storage-uri option must be used in combination with the --f option.
OBLOADER imports database object definitions and table data from the S3 bucket specified by --storage-uri to OceanBase Database. The operational logs of OBLOADER are stored in the path specified by -f.

Import data files from Alibaba Cloud OSS to OceanBase Database

Scenario: Import all supported CSV data files in an Alibaba Cloud OSS bucket to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --csv --table '*' -f /output --storage-uri 'oss://antsys-oceanbasebackup/backup_obloader_obdumper/obdumper?endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com&access-key=******&secret-key=******'

The --storage-uri 'oss://antsys-oceanbasebackup/backup_obloader_obdumper/obdumper?endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com&access-key=******&secret-key=******' option specifies the storage URI, which comprises the following components.

Component Description
oss The OSS storage scheme.
antsys-oceanbasebackup The name of the OSS bucket.
/backup_obloader_obdumper/obdumper The data storage path in the OSS bucket.
endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com&access-key=******&secret-key=****** The parameters required for the request.
  • endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com: the endpoint of the region where the OSS host resides.
  • access-key=******: the AccessKey ID for accessing the OSS bucket.
  • secret-key=******: the AccessKey secret for accessing the OSS bucket.

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 3 -> 3      | SUCCESS  |                
---------------------------------------------------------

Total Count: 3          End Time: 2023-05-11 16:02:08
...

Note

The --storage-uri option must be used in combination with the -f option.
OBLOADER imports database object definitions and table data from the Alibaba Cloud OSS bucket specified by --storage-uri to OceanBase Database. The operational logs of OBLOADER are stored in the path specified by -f.

Import data files from Apache Hadoop to OceanBase Database

Scenario: Import all supported CSV data files in an Apache Hadoop node to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** --sys-user **u*** --sys-password ****** -c cluster_a -t mysql -D USERA --csv --table '*' -f /output --storage-uri 'hdfs://***.*.*.*:9000/chang/parquet?hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml'

The --storage-uri 'hdfs://***.*.*.*:9000/chang/parquet?hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml' option specifies the storage URI, which comprises the following components.

Component Description
hdfs The Apache Hadoop storage scheme.
***.*.*.*:9000 The name of the Apache Hadoop node.
/chang/parquet The data storage path in the Apache Hadoop node.
hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml The parameters required for the request.
  • hdfs-site-file=/data/0/zeyang/hdfs-site.xml: the Apache Hadoop hdfsSiteFile configuration file that contains the configurations of Apache Hadoop.
  • core-site-file=/data/0/zeyang/core-site.xml: the Apache Hadoop hdfsSiteFile configuration file that contains the core configurations of the Apache Hadoop cluster.

Sample task result:

...
All Load Tasks Finished:

---------------------------------------------------------
|  No.#  | Type   | Name       | Count       | Status   |
---------------------------------------------------------     
|  1     | TABLE  | table      | 3 -> 3      | SUCCESS  |                
---------------------------------------------------------

Total Count: 3          End Time: 2023-05-11 16:02:08
...

Note

The --storage-uri option must be used in combination with the -f option.
OBLOADER imports database object definitions and table data from the Apache Hadoop node specified by --storage-uri to OceanBase Database. The operational logs of OBLOADER are stored in the path specified by -f.

Import database object definitions and table data to ApsaraDB for OceanBase

Scenario: When you cannot provide the password of the sys tenant, import all database object definitions and table data in the /output directory to the USERA schema in ApsaraDB for OceanBase.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** -D USERA --ddl --csv --public-cloud --all -f /output

Import database object definitions and table data to OceanBase Database

Scenario: When you cannot provide the password of the sys tenant, import all database object definitions and table data in the /output directory to the USERA schema in OceanBase Database.

Sample statement:

$./obloader -h xx.x.x.x -P 2883 -u test -p ****** -c cluster_a -t mysql -D USERA --ddl --csv --no-sys --all -f /output

Step 7: Congratulations

You have had a quick start with OBLOADER & OBDUMPER.

For more information, refer to the following steps:

  • Learn about the operating principles, major features, and differences from other tools of OBLOADER & OBDUMPER from the product introduction. For more information about the tools, see OBLOADER & OBDUMPER.

  • Welcome to join OceanBase community to discuss issues and requirements on import and export and future plans with OceanBase R&D engineers.

Contact Us