Description
The statement is used to import data from an external source.
Notice
- Tables with triggers are not supported.
- To import data from an external file, you must have the
FILEprivilege and meet the following requirements:- When loading files on the server, you must set the system variable secure_file_priv in advance to specify the path that is accessible for importing or exporting files.
- When loading local files on the client, you must add the
--local-infile[=1]option when you start the MySQL client or OBClient to enable the data loading feature from the local file system.
The LOAD DATA statement in OceanBase Database allows you to load data from the following sources:
Server files: files stored on OceanBase Server. You can use the
LOAD DATA INFILEstatement to load data from server files into database tables.Client files: files stored in the local file system of the client. You can use the
LOAD DATA LOCAL INFILEstatement to load data from client files into database tables.Note
OceanBase Database in Oracle mode has supported the use of the
LOAD DATA LOCAL INFILEsyntax to load local data files since V4.2.2.OSS files: files stored in the OSS file system. You can use the
LOAD DATA REMOTE_OSS INFILEstatement to load data from OSS files into database tables.
The LOAD DATA statement supports importing data from CSV files. The procedure is as follows:
Parse the file: OceanBase Database reads data from the file based on the file name that you entered and reads data from the file in parallel or in series based on the specified degree of parallelism.
Distribute data: As OceanBase Database is a distributed database, data in partitions can be distributed across multiple OBServer nodes. The
LOAD DATAstatement calculates the destination OBServer node to which the parsed data should be sent.Insert data: After the data is received by the target OBServer node, the node executes an
INSERTstatement to insert the data into the corresponding partition.
To import data from an external file, you must have the FILE privilege. You can execute the GRANT FILE ON *.* TO $user_name; statement to grant the privilege to the user who needs to execute the LOAD DATA statement, where $user_name is the name of the user who needs to execute the LOAD DATA statement.
Syntax
LOAD DATA
[ [/*+ PARALLEL(N) load_batch_size(M) APPEND */]
| [/*+ PARALLEL(N) direct(bool, int) */] ]
[REMOTE_OSS | LOCAL] INFILE 'file_name'
INTO TABLE table_name
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
[IGNORE number {LINES | ROWS}]
[(column_name_var
[, column_name_var] ...)]
Parameter description
| Parameter | Description |
|---|---|
| parallel(N) | The degree of parallelism for data loading. If N is specified, the default value is 4. |
| load_batch_size(M) | The number of rows to be inserted in one batch. The default value is 100. We recommend that you set this parameter to a value within the range of [100, 1000]. |
| APPEND | Use this hint to enable direct load, which allows you to allocate space and write data directly in the data file. By default, the APPEND hint is equivalent to direct(true, 0). It also allows you to collect statistics online (using the GATHER_OPTIMIZER_STATISTICS hint). |
| direct | Use this hint to enable direct load. The bool parameter indicates whether to sort the given CSV file, where true indicates that sorting is required. The int parameter indicates the maximum number of error rows that can be tolerated. |
| REMOTE_OSS | LOCAL | Optional.
|
| file_name | The path and name of the input file. The format of the file_name parameter is described as follows:
NoteMake sure that the following information is correct when you import a file from OSS.
|
| table_name | The name of the table to which the imported data is loaded. The table can be a partitioned or non-partitioned table. |
| FIELDS | COLUMNS | Specify the format of fields.
|
| LINES STARTING BY | Specify the delimiter for the start of lines. |
| LINES TERMINATED BY | Specify the delimiter for the end of lines. |
| IGNORE number { LINES | ROWS } | Ignore the first several lines. LINES indicates several lines in the file, and ROWS indicates several rows separated by field separators. By default, each field in the input file is corresponded to each column in the table. If the input file does not contain all the columns, the missing columns are filled with default values, which are described in the following example:
|
| column_name_var | The name of the column to be imported. |
Examples
Example 1: Import data from a file on the server (OBServer node)
Set the global secure path.
Notice
For security reasons, when you set the system variable
secure_file_priv, you can connect to the database only locally to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.obclient> SET GLOBAL secure_file_priv = "/";Log out.
Note
Since
secure_file_privis aGLOBALvariable, you must execute\qto make the modification take effect.obclinet> \qThe return result is as follows:
ByeReconnect to the database and use the
LOAD DATAstatement to import data.Use the following statement to import data in the normal way.
obclient> LOAD DATA INFILE '/home/admin/test.csv' INTO TABLE t1;Use the
APPENDhint to enable direct load.LOAD DATA /*+ PARALLEL(4) APPEND */ INFILE '/home/admin/test.csv' INTO TABLE t1;
Example 2: Import data from an OSS file
Notice
When you use an object storage path, make sure that the values of all parameters in the object storage path contain only uppercase and lowercase letters, digits, and the following special characters: \/-_$+=, and wildcard characters. If you enter characters other than the aforesaid ones, the setting may fail.
Use the direct(bool, int) hint to enable direct load, where the imported file is stored in Alibaba Cloud OSS.
LOAD DATA /*+ direct(true,1024) parallel(16) */ REMOTE_OSS INFILE 'oss://antsys-oceanbasebackup/backup_rd/xiaotao.ht/lineitem2.tbl?host=***.oss-cdn.***&access_id=***&access_key=***' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
Example 3: Import data from a local file on the client
Run the following statements to import data from a local file to a table in OceanBase Database.
Open the terminal or command prompt window, and enter the following command to start the client.
obclient --local-infile -hxxx.xxx.xxx.xxx -P2881 -usys@oracle001 -p******The return result is as follows:
Welcome to the OceanBase. Commands end with ; or \g. Your OceanBase connection id is 3221548006 Server version: OceanBase 4.2.2.0 (r100000032024010510-75c47d4be18a399e13c5309de1a81da5caf4e7c0) (Built Jan 5 2024 10:17:55) Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. obclient [SYS]>Notice
To use the
LOAD DATA LOCAL INFILEfeature, you must use OBClient V2.2.4 or later.In the client, execute the
LOAD DATA LOCAL INFILEstatement to load the local data file.obclient [SYS]> LOAD DATA LOCAL INFILE '/home/admin/test_data/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';The return result is as follows:
Query OK, 3 rows affected Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
References
- For more information about how to use the
LOAD DATAstatement, see Import data by using the LOAD DATA statement. - For more information about direct load by using the
LOAD DATAstatement, see Import data by using the LOAD DATA statement.