LOAD DATA|V4.3.1|OceanBase Database| docs|Distributed Database

LOAD DATA

Last Updated：2026-04-15 08:25:15 Updated

Description

The statement is used to import data from an external source.

Notice

Tables with triggers are not supported.
To import data from an external file, you need to have the FILE privilege and meet the following requirements:
- When you import a server file, you must set the system variable secure_file_priv in advance to specify the path that is accessible for importing or exporting files.
- When you import a local file, you must add the --local-infile[=1] option when you start the MySQL/[OBClient](https://www.oceanbase.com/softwarecenter-cloud) client to enable the data loading feature from the local file system.

The LOAD DATA statement in OceanBase Database allows you to load data from the following types of input files:

Server (OBServer) node files: files that are stored on the OBServer nodes of OceanBase Database. You can use the LOAD DATA INFILE statement to load data from server node files to database tables.
Client (local) files: files that are stored in the local file system of the client. You can use the LOAD DATA LOCAL INFILE statement to load data from client local files to database tables.

Note

When OceanBase Database executes the LOAD DATA LOCAL INFILE statement, the system automatically adds the IGNORE option.
OSS files: files that are stored in the OSS file system. You can use the LOAD DATA REMOTE_OSS INFILE statement to load data from OSS files to database tables.

The LOAD DATA statement currently supports importing data from CSV files. The entire import process consists of the following steps:

Parse the file: OceanBase Database reads data from the file based on the specified file name, and then parses the data in the file based on the specified degree of parallelism.
Distribute data: As OceanBase Database is a distributed database, the data in different partitions may be distributed on different OBServer nodes. The LOAD DATA statement calculates where the parsed data should be sent to based on the OBServer nodes.
Insert data: After the data is received by the target OBServer node, an INSERT statement is executed locally to insert the data into the corresponding partition.

Syntax

LOAD DATA
    [/*+ PARALLEL(N) [load_batch_size(M)] [APPEND | direct(bool, int, [load_mode])] */]
    [REMOTE_OSS | LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE table_name
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(column_name_var
        [, column_name_var] ...)]

load_mode:
    'full' 
    | 'inc_replace'

Parameter description

Parameter	Description
parallel(N)	The degree of parallelism for loading data. The default value of `N` is `4`.
load_batch_size(M)	The batch size for inserting data. The default value of `M` is `100`. We recommend that you set this parameter to a value within the range of [100,1000].
APPEND \| direct()	The hint to enable direct load. By default, the `APPEND` hint is equivalent to `direct(true, 0)`, and online statistics collection can also be enabled with the `GATHER_OPTIMIZER_STATISTICS` hint. The parameters of the `direct()` hint are described as follows: `bool`: specifies whether to sort the data to be written. The value `true` specifies to sort the data, and the value `false` specifies not to sort the data. `int`: specifies the maximum number of tolerated error rows. `load_mode`: an optional parameter that specifies the direct load mode. The value must be enclosed in single quotation marks (''). The following modes are supported: `full`: the default mode, which specifies to perform full direct load. `inc_replace`: the incremental direct load mode with the semantics of the `replace` clause. Notice If you set `load_mode` to `inc_replace`, you cannot specify the `REPLACE` or `IGNORE` keyword in the `LOAD DATA` statement. The incremental direct load feature is currently in the experimental stage. To avoid impacting the stability of the system, we recommend that you do not use it in a production environment. For more information about how to use the `LOAD DATA` statement to perform direct load, see Import data or files by using the LOAD DATA statement.
REMOTE_OSS \| LOCAL	Optional. `REMOTE_OSS` specifies whether to read data from the OSS file system. Notice If you use this parameter, you must specify `file_name` as the URL of an OSS bucket. `LOCAL` specifies whether to read data from the local file system of the client. If you do not specify this parameter, data is read from the file system of the server, that is, from the OBServer node.
file_name	The path and name of the input file. The format of `file_name` is described as follows: If the file is stored on an OBServer node or a client, `/$PATH/$FILENAME` is used. If the file is stored on OSS, `oss://$PATH/$FILENAME/?host=$HOST&access_id=$ACCESS_ID&access_key=$ACCESSKEY` is used. The parameters are described as follows: `$PATH`: the path of the file in the bucket, which indicates the directory where the file is stored. `$FILENAME`: the name of the file, which indicates the specific file to be accessed. `$HOST`: the host name or the CDN accelerated domain name of the OSS service, which is the address for accessing the OSS service. `$ACCESS_ID`: the Access Key ID required for accessing the OSS service. It is used for authentication. `$ACCESSKEY`: the Access Key Secret required for accessing the OSS service. It is used for authentication. Note To import a file from OSS, make sure of the following: You have the permissions to access the OSS bucket and file. You must have sufficient permissions to read the specified bucket and file. Generally, you need to set the access permissions in the OSS console or by using OSS APIs, and configure the access keys (Access Key ID and Access Key Secret) as credentials with appropriate permissions. The database server can connect to the specified `$HOST` over the network to access the OSS service. If a CDN accelerated domain name is used, make sure that the CDN is correctly configured and the network connection is normal.
REPLACE \| IGNORE	`REPLACE` specifies to overwrite the conflicting row in case of a unique key conflict, and `IGNORE` specifies to ignore the conflicting row. When you use the `LOAD DATA` statement, it checks for duplicate data based on the primary key of the table. If the table does not have a primary key, the `REPLACE` and `IGNORE` options are equivalent. By default, if duplicate data is found, the `LOAD DATA` statement writes the data to an error file. Notice When you execute the `LOAD DATA LOCAL INFILE` statement in MySQL mode, the system automatically adds the `IGNORE` option. This behavior improves compatibility with MySQL databases. If you specify the `REPLACE` or `IGNORE` option and the degree of parallelism is greater than `1`, the records of the conflicting rows may not be inserted in the same order as that in serial execution. If you want to strictly ensure the insertion order of conflicting records, do not set the parallelism of this statement or set it to `1`.
table_name	The name of the table to which the imported data is to be imported. The table can be a partitioned or non-partitioned table.
FIELDS \| COLUMNS	The format of fields. `ENCLOSED BY`: the delimiter for exported values. `TERMINATED BY`: the delimiter for exported columns. `ESCAPED BY`: the character to be ignored in exported values.
LINES STARTING BY	The starting delimiter of a line.
LINES TERMINATED BY	The ending delimiter of a line.
IGNORE number { LINES \| ROWS }	The first several lines or rows of a file to be ignored. By default, the system matches the fields in the input file with the columns in the table one by one. If the input file does not contain all the columns in the table, the missing columns are filled with default values. The default values are described in the following table. Character type: an empty string. Numeric type: 0. Date type: `0000-00-00`.
column_name_var	The name of the column to be imported.

Considerations

Wildcard rules for direct load

The Hint direct keyword is used in the LOAD DATA statement to specify whether to enable direct load. When direct load is enabled, only one file can be specified that is stored in Object Storage Service (OSS). To import multiple files, you must execute multiple load data statements. To facilitate importing multiple files in the cluster file system (server_disk), the file wildcard feature is introduced. This feature applies to file sources of the server_disk and OSS types, but not to client disks (client_disk).

Enable direct load for server_disk

Examples:
- Matching files: load data /*+ parallel(20) direct(true, 0) */ infile '/xxx/test.*.csv' replace into table t1 fields terminated by '|';
- Matching directories: load data /*+ parallel(20) direct(true, 0) */ infile '/aaa*bb/test.1.csv' replace into table t1 fields terminated by '|';
- Matching both directories and filenames: load data /*+ parallel(20) direct(true, 0) */ infile '/aaa*bb/test.*.csv' replace into table t1 fields terminated by '|';
Considerations:
- At least one matching file must exist. Otherwise, the error OB_FILE_NOT_EXIST is returned.
- For the input load data /*+ parallel(20) direct(true, 0) */ infile '/xxx/test.1*.csv,/xxx/test.6*.csv' replace into table t1 fields terminated by '|';, /xxx/test.1*.csv,/xxx/test.6*.csv is considered as a whole for matching. If no file matches the pattern, the error OB_FILE_NOT_EXIST is returned.
- Only wildcard characters supported by the POSIX GLOB function can be used. For example, test.6*(6|0).csv and test.6*({0.csv,6.csv}|.csv) can be used although they can be listed by using the ls command. However, the GLOB function cannot match them, and the error OB_FILE_NOT_EXIST is returned.

Enable direct load for oss

Examples:
- Matching files: load data /*+ parallel(20) direct(true, 0) */ remote_oss infile 'oss://xxx/test.*.csv?host=xxx&access_id=xxx&access_key=xxx' replace into table t1 fields terminated by '|';
Considerations:
- Directory matching is not supported. For example, the statement load data /*+ parallel(20) direct(true, 0) */ remote_oss infile 'oss://aa*bb/test.*.csv?host=xxx&access_id=xxx&access_key=xxx' replace into table t1 fields terminated by '|'; returns the error OB_NOT_SUPPORTED.
- Only the wildcard characters * and ? are supported for matching filenames. Other wildcard characters, although allowed, cannot match any files.

For more information, see Import data or files by using the LOAD DATA statement

Examples

Import data from a server file

Example 1: Import data from a server file.

Set the global secure path.
```
obclient> SET GLOBAL secure_file_priv = "/"
Query OK, 0 rows affected
obclinet> \q
Bye
```
Note

Since secure_file_priv is a GLOBAL variable, you need to execute \q to make the setting take effect.

Reconnect to the database and import data from an external file.

obclient> LOAD DATA INFILE 'test.sql' INTO TABLE t1;
Query OK, 0 rows affected

Example 2: Use the APPEND hint to enable direct load.

LOAD DATA /*+ PARALLEL(4) APPEND */
   INFILE '/home/admin/a.csv'
   INTO TABLE t;

Example 3: Use the direct(bool, int) hint to enable direct load. The direct load file can be stored in Alibaba Cloud OSS.

load data /*+ parallel(1) direct(false,0)*/ remote_oss infile 'oss://antsys-oceanbasebackup/backup_rd/xiaotao.ht/lineitem2.tbl?host=***.oss-cdn.***&access_id=***&access_key=***' into table lineitem fields terminated by '|' enclosed by '' lines starting by '' terminated by '\n';

Import data from a local client file

Example 4: Use the following statement to import data from a local file to a table in OceanBase Database.

Open the terminal or command prompt window and enter the following command to start the client.

obclient --local-infile -hxxx.xxx.xxx.xxx -P2881 -uroot@mysql001 -p****** -A -Dtest

The return result is as follows:

Welcome to the OceanBase.  Commands end with ; or \g.
Your OceanBase connection id is 3221719526
Server version: OceanBase 4.2.2.0 (r100000022023121309-f536833402c6efe9364d5a4b61830a858ef24d82) (Built Dec 13 2023 09:58:18)

Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

obclient [test]>

Notice

To use the LOAD DATA LOCAL INFILE feature, you must use OBClient V2.2.4 or later. If you do not have OBClient of the required version, you can also use a MySQL client to connect to the database.

In the client, execute the LOAD DATA LOCAL INFILE statement to load the local data file.

obclient [test]> LOAD DATA LOCAL INFILE '/home/admin/test_data/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';

The return result is as follows:

Query OK, 3 rows affected
Records: 3  Deleted: 0  Skipped: 0  Warnings: 0

Import data from an OSS file

Notice

When you use object storage path parameters, separate the parameters with & characters. Make sure that the values you enter for the parameters contain only uppercase and lowercase letters, numbers, and the following characters: \/-_$+=. If you enter characters other than the preceding ones, the setup may fail.

Use the direct(bool, int) hint to enable direct load, and the direct load file can be stored in Alibaba Cloud OSS.

LOAD DATA /*+ direct(true,1024) parallel(16) */ REMOTE_OSS INFILE 'oss://antsys-oceanbasebackup/backup_rd/xiaotao.ht/lineitem2.tbl?host=***.oss-cdn.***&access_id=***&access_key=***' INTO TABLE tbl1 FIELDS TERMINATED BY ',';

References

For more information about how to connect to OceanBase Database, see Overview of connection methods.
For more information about examples that use the LOAD DATA statement, see Import data by using the LOAD DATA statement.
For more information about examples of direct load using the LOAD DATA statement, see Import data by using the LOAD DATA statement.

OceanBase

Customer Stories

Documentation

LOAD DATA

Description

Notice

Note

Syntax

Parameter description

Notice

Notice

Note

Notice

Considerations

Wildcard rules for direct load

Examples

Import data from a server file

Note

Import data from a local client file

Notice

Import data from an OSS file

Notice

References