This topic describes how to quickly get started with OBLOADER.
Step 1: Download the software
Note
OBLOADER & OBDUMPER is available in only one unified edition since V4.2.1, without providing the community and enterprise editions. You can download the software package from OceanBase Download Center.
Click here to download the package of the latest version and decompress the package:
unzip ob-loader-dumper-4.2.x-RELEASE.zip
cd ob-loader-dumper-4.2.x-RELEASE
Step 2: Configure the runtime environment
Notice
You must install Java 8 and configure the JAVA_HOME environment variable in the local environment. We recommend that you install JDK 1.8.0_3xx or later. For more information about environment configuration, see Prepare the environment.
This step aims to modify JVM parameters.
A small JVM memory size may affect the performance and stability of the import and export features. For example, a full garbage collection (GC) or GC crash may occur.
Edit the configuration file that contains the
JAVA_OPTSoption.In Linux, edit the
obloaderandobdumperscripts in the{ob-loader-dumper}/bin/directory.In Windows, edit the
obloader.batandobdumper.batscripts in the{ob-loader-dumper}/bin/windows/directory.
Modify JVM parameters.
Note
We recommend that you set the JVM memory size to an appropriate value based on the available memory of the server. If you are good at Java performance tuning, you can modify the JVM parameters in the
JAVA_OPTSoption as needed.Example: Assume that the available memory of the server is 16 GB. Set the JVM heap memory size to 8 GB.
JAVA_OPTS="$JAVA_OPTS -server -Xms8G -Xmx8G -XX:MetaspaceSize=128M -XX:MaxMetaspaceSize=128M -Xss352K" JAVA_OPTS="$JAVA_OPTS -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Xnoclassgc -XX:+DisableExplicitGC-Xms8G -Xmx8Gis a JVM parameter whose default value is-Xms4G -Xmx4G.
Step 3: Prepare data
You can use OBLOADER to import existing data files or temporary data files generated by using the TPC-H tool. The format of content in the imported file must meet the related specifications. Identify the data format in the file based on Is your data ready?.
Step 4: Create a database
Deploy an OceanBase cluster by using OceanBase Cloud Platform (OCP) or OceanBase Deployer (OBD).
Create a test database.
Create a test table and insert data into the table. This operation is optional when you import data.
Step 5: View configuration files
The configuration files of OBLOADER & OBDUMPER are the connection configuration file session.config.json and log configuration file log4j2.xml.
Connection configuration file
The connection configuration file {ob-loader-dumper}/conf/session.config.json configures database connection parameters. OBLOADER & OBDUMPER builds a JDBC URL to connect to the target database based on the JDBC parameters in the connection configuration file. Then, it sequentially executes SQL statements for initialization in the established connection. You can modify JDBC parameters and SQL statements for initialization in the connection configuration file. Default connection configurations apply to most scenarios. However, in special cases, you need to manually modify parameters to adapt to different OceanBase Database versions and extract, transform, and load (ETL) scenarios. For more information about the connection configuration file, see Connection settings.
Log configuration file
In the log configuration file {ob-loader-dumper}/conf/log4j2.xml, you can view the log output path and log format, and adjust the log level during self-service troubleshooting. For more information, see How do I customize log file names for an export job?
Step 6: Import data
./obloader -h 'IP address' -P'Port number' -u'User'@'Tenant'#'Cluster' -p'Password' -D'Database name' --table 'Table name' --csv -f 'File path' --sys-password 'Password of the sys tenant' --external
Note
This example imports only data. For more information, see Command-line options of OBLOADER.
The following table describes the database information that is used in the examples.
| Database information item | Example value |
|---|---|
| Cluster name | cluster_a |
| Host IP address of OceanBase Database Proxy (ODP) | xx.x.x.x |
| Port number of ODP | 2883 |
| Tenant name of the cluster | mysql |
| Username of the root or proxyro user in the sys tenant | **u*** |
| Password of the root or proxyro user in the sys tenant | ****** |
| User account (with read/write privileges) in the business tenant | test |
| Password of the user in the business tenant | ****** |
| Schema name | USERA |
Import DDL definition files
Scenario: Import all supported database object definitions in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --ddl --all -f /output
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 1 -> 1 | SUCCESS |
---------------------------------------------------------
Total Count: 1 End Time: 2023-05-11 16:02:08
...
Note
The --sys-user option is used to connect to a user with required privileges in the sys tenant. If the --sys-user option is not specified during import, the default value --sys-user root takes effect.
Import CSV data files
Scenario: Import all supported CSV data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. For more information about CSV data file (.csv file) specifications, see RFC 4180. CSV data files store data in the form of plain text. You can open CSV data files by using a text editor or Excel.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --csv --table '*' -f /output
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 1 -> 1 | SUCCESS |
---------------------------------------------------------
Total Count: 1 End Time: 2023-05-11 16:02:08
...
Import SQL data files
Scenario: Import all supported SQL data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. SQL data files (.sql files) store INSERT SQL statements. You can open SQL data files by using a text editor or SQL editor.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --sql --table '*' -f /output
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 1 -> 1 | SUCCESS |
---------------------------------------------------------
Total Count: 1 End Time: 2023-05-11 16:02:08
...
Import POS data files
Scenario: Import all supported POS data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. You must specify the path where the control files reside during the import. POS data files (.dat files by default) organize data based on a byte offset position with a fixed length. To import a POS data file, you must define the fixed length of each field by using a control file. You can open POS data files by using a text editor.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ******
-D USERA --table '*' -f /output --pos --ctl-path /output
Note
The table name defined in the database must be in the same letter case as its corresponding control file name. Otherwise, OBLOADER fails to recognize the control file. For more information about the rules for defining control files, see Preprocessing functions.
Import CUT data files
Scenario: Import all supported CUT data files in the /output directory to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided. To import a CUT data file, you must specify the path of the control file and use |@| as the column separator string. CUT data files (.dat files) use a character or character string to separate values. You can open CUT data files by using a text editor.
Sample statement:
$./obloader -h127.1 -P2881 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --table '*' -f /output --cut --column-splitter '|@|' --ctl-path /output
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 1 -> 1 | SUCCESS |
---------------------------------------------------------
Total Count: 1 End Time: 2023-05-11 16:02:08
...
Note
The table name defined in the database must be in the same letter case as its corresponding control file name. Otherwise, OBLOADER fails to recognize the control file. For more information about the rules for defining control files, see Preprocessing functions.
Import data files from Amazon S3 to OceanBase Database
Scenario: Import all supported CSV data files in an Amazon S3 bucket to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --csv --table '*' -f 's3://obloaderdumper/obdumper?region=cn-north-1&access-key=******&secret-key=******'
-f 's3://obloaderdumper/obdumper?region=cn-north-1&access-key=******&secret-key=******' is the storage path. The following table describes the components of a storage path.
| Component | Description |
|---|---|
s3 |
The S3 storage scheme. |
obloaderdumper |
The name of the S3 bucket. |
/obdumper |
The data storage path in the S3 bucket. |
region=cn-north-1&access-key=******&secret-key=****** |
The parameters required for the request.
|
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 3 -> 3 | SUCCESS |
---------------------------------------------------------
Total Count: 3 End Time: 2023-05-11 16:02:08
...
Import data files from Alibaba Cloud OSS to OceanBase Database
Scenario: Import all supported CSV data files in an Alibaba Cloud OSS bucket to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --csv --table '*' -f 'oss://antsys-oceanbasebackup/backup_obloader_obdumper/obdumper?endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com&access-key=******&secret-key=******'
-f 'oss://antsys-oceanbasebackup/backup_obloader_obdumper/obdumper?endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com&access-key=******&secret-key=******' is the storage URL. The following table describes the components of a storage URL.
| Component | Description |
|---|---|
oss |
The OSS storage scheme. |
antsys-oceanbasebackup |
The name of the OSS bucket. |
/backup_obloader_obdumper/obdumper |
The data storage path in the OSS bucket. |
endpoint=https://cn-hangzhou-alipay-b.oss-cdn.aliyun-inc.com&access-key=******&secret-key=****** |
The parameters required for the request.
|
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 3 -> 3 | SUCCESS |
---------------------------------------------------------
Total Count: 3 End Time: 2023-05-11 16:02:08
...
Import data files from Apache Hadoop to OceanBase Database
Scenario: Import all supported CSV data files in an Apache Hadoop node to the USERA schema. For OceanBase Database of a version earlier than V4.0.0, the password of the sys tenant must be provided.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** --sys-user **u*** --sys-password ****** -D USERA --csv --table '*' -f 'hdfs://***.*.*.*:9000/chang/parquet?hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml'
-f 'hdfs://***.*.*.*:9000/chang/parquet?hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml' is the storage URL. The following table describes the components of a storage URL.
| Component | Description |
|---|---|
hdfs |
The Apache Hadoop storage scheme. |
***.*.*.*:9000 |
The name of the Apache Hadoop node. |
/chang/parquet |
The data storage path in the Apache Hadoop node. |
hdfs-site-file=/data/0/zeyang/hdfs-site.xml&core-site-file=/data/0/zeyang/core-site.xml |
The parameters required for the request.
|
Sample task result:
...
All Load Tasks Finished:
---------------------------------------------------------
| No.# | Type | Name | Count | Status |
---------------------------------------------------------
| 1 | TABLE | table | 3 -> 3 | SUCCESS |
---------------------------------------------------------
Total Count: 3 End Time: 2023-05-11 16:02:08
...
Import database object definitions and table data to ApsaraDB for OceanBase
Scenario: When you cannot provide the password of the sys tenant, import all database object definitions and table data in the /output directory to the USERA schema in ApsaraDB for OceanBase.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test -p ****** -D USERA --ddl --csv --public-cloud --all -f /output
Import database object definitions and table data to OceanBase Database
Scenario: When you cannot provide the password of the sys tenant, import all database object definitions and table data in the /output directory to the USERA schema in OceanBase Database.
Sample statement:
$./obloader -h xx.x.x.x -P 2883 -u test@mysql#cluster_a -p ****** -D USERA --ddl --csv --no-sys --all -f /output
Step 7: Congratulations
You have got started with OBLOADER.
For more information, refer to the following steps:
Learn about the operating principles, major features, and differences from other tools of OBLOADER & OBDUMPER from the product introduction. For more information about the tools, see OBLOADER & OBDUMPER.
Welcome to join OceanBase community to discuss issues and requirements on import and export and plans with OceanBase R&D engineers.