Use cases of command-line options|V4.3.4.1| docs|Distributed Database

Use cases of command-line options

Last Updated：2025-09-26 03:14:35 Updated

This topic describes use cases of command-line options in a complex scenario.

--auto-column-mapping

By default, OBLOADER requires that the columns in the source file exactly match those of the target table. OBLOADER imports data to the target columns in order without identifying column names.

If the number of columns in the source file is different from that of the target table, you can specify the --auto-column-mapping option to define column name mappings, so that OBLOADER imports data based on the column names, and ignores columns that have no matching columns in the target table.

Scenarios

Assume that the CSV file containing the col1, col2, and col3 columns of the my_table table in the /home/admin/obloader directory needs to be imported into the target table containing the col2, col3, and col4 columns in the obloader database.

Data of the source table

$cat my_table.csv
'col1','col2','col3'
'data1',1,'desc1'
'data2',2,'desc2'
'data3',3,'desc3'

Schema of the target table

obclient [obloader]> DESC my_table;
+-------+--------------+------+------+---------+-------+
| Field | Type         | Null | Key  | Default | Extra |
+-------+--------------+------+------+---------+-------+
| col2  | int(11) | YES  |      | NULL    |       |
| col3  | varchar(255)     | YES  |      | NULL    |       |
| col4  | varchar(255) | YES  |      | NULL    |       |
+-------+--------------+------+------+---------+-------+
3 rows in set (0.035 sec)

Import data without specifying --auto-column-mapping

Run the following command to import data:

./bin/obloader -h xxx.xxx.xxx.xxx -P xxxx -u test@mysql -D obloader --csv --table 'my_table' -f /home/admin/obloader

Wait for the execution result. An error is expected.

Notice

If the column names do not match but the column types do, OBLOADER may ignore the column names and import the data, leading to data inconsistency.

Query the data in the target table.

SELECT * FROM my_table;

The query result is as follows:

obclient [obloader]> SELECT * FROM my_table;
Empty set (0.035 sec)

Import data with --auto-column-mapping specified

Run the following command to import data:

./bin/obloader -h xxx.xxx.xxx.xxx -P xxxx -u test@mysql -D obloader --csv --table 'my_table' --auto-column-mapping -f /home/admin/obloader

Wait for the execution result. The expected result is that data of the col2 and col3 columns of the source table is imported, and the col4 column is filled with default values.

Query the data in the target table.

SELECT * FROM my_table;

The query result is as follows:

obclient [obloader]> SELECT * FROM my_table;
+------+---------+------+
| col2 | col3    | col4 |
+------+---------+------+
|    1 | desc1   | NULL |
|    2 | desc2   | NULL |
|    3 | desc3   | NULL |
+------+---------+------+
3 rows in set (0.033 sec)

--source-type hive

By default, OBLOADER maps the source file to the target table and imports data to the target table based on the source file structure and column positions. If the source file stored in Hive has a different number of columns from the target table, you can specify the --source-type hive parameter to parse the names of directories in the source path as column values, and specify the --auto-column-mapping parameter to align the columns with the target table.

Scenarios

Import the Parquet file part-00439-07417647-caab-4f87-823d-7c6dfac92211.c000 that contains the id, name, age, and city columns and that is stored in Hive to the tbl1 table that contains the id, name, age, city, and dt columns in the obloader database.

Data structure of the source file in Hive

hive/
└── tbl1
    └── dt=2025
        └── part-00439-07417647-caab-4f87-823d-7c6dfac92211.c000

Schema of the target table

CREATE TABLE tbl1 (
    id INT NOT NULL,          -- The ID of the user.
    name VARCHAR(50) NOT NULL, -- The name of the user.
    age INT NOT NULL,         -- The age of the user.
    city VARCHAR(50) NOT NULL, -- The city of the user.
    dt YEAR NOT NULL          -- The year.
) PARTITION BY RANGE (dt) (  
PARTITION P0 VALUES LESS THAN (2024),
PARTITION P1 VALUES LESS THAN (MAXVALUE)
);

Procedure

In the import command, specify the --source-type hive parameter to identify the source directory dt=2025 as a column, and specify the --auto-column-mapping parameter to map the source file to the target table based on column names.

Run the following command to import data:

./bin/obloader --user root@sun2 -h xxx.xxx.xxx.xxx -P xxxx -D obloader --table 'tbl1'  --truncate-table --par -f ${hdfs_path} -pxxxx --log-path /tmp/ --source-type hive --auto-column-mapping --file-regular-expression '.*'

Wait for the execution result. It is expected that the directory name dt=2025 in the source file will be imported to the dt column of the target table.

Query the data in the target table.

obclient [obloader]> SELECT * FROM tbl1;
+------+---------+------+-------------+------+
| id   | name    | age  | city        | dt   |
+------+---------+------+-------------+------+
|    1 | Alice   |   25 | New York    | 2025 |
|    2 | Bob     |   30 | Los Angeles | 2025 |
|    3 | Charlie |   35 | Chicago     | 2025 |
|    4 | David   |   40 | Houston     | 2025 |
|    5 | Eve     |   45 | Phoenix     | 2025 |
+------+---------+------+-------------+------+

Use cases of command-line options

--auto-column-mapping

Scenarios

Import data without specifying --auto-column-mapping

Notice

Import data with --auto-column-mapping specified

--source-type hive

Scenarios

Procedure