Purpose
This statement is used to import data from an external file.
Notice
- Tables with triggers cannot use the
LOAD DATAstatement. - To import data from an external file, you must have the
FILEprivilege and the following settings are required:- When loading a server-side file, you must set the system variable secure_file_priv in advance. This variable specifies the path for accessing files during import or export.
- When loading a client-side file, you must add the
--local-infile[=1]option when starting the MySQL/OBClient client to enable loading data from the local file system.
OceanBase Database supports the following input file types for the LOAD DATA statement:
Server-side (OBServer node) files: These files are stored on the OBServer node of OceanBase Database. You can use the
LOAD DATA INFILEstatement to load data from these files into a database table.Client-side (local) files: These files are stored on the local file system of the client. You can use the
LOAD DATA LOCAL INFILEstatement to load data from these files into a database table.Note
Starting from V4.2.2 of OceanBase Database's Oracle mode, you can use the
LOAD DATA LOCAL INFILEstatement to load local data files.OSS files: These files are stored in the OSS file system. You can use the
LOAD DATA REMOTE_OSS INFILEstatement to load data from these files into a database table.
The LOAD DATA statement can currently import data from CSV text files. The entire import process can be divided into the following steps:
Parse the file: OceanBase Database reads the data from the file based on the file name provided by the user. It then determines whether to parse the data in parallel or sequentially based on the specified parallelism.
Distribute the data: Since OceanBase Database is a distributed database, data in different partitions may be stored on different OBServer nodes. The
LOAD DATAstatement calculates the parsed data and determines which OBServer node the data should be sent to.Insert the data: Once the target OBServer node receives the data, it executes the
INSERTstatement to insert the data into the corresponding partition.
To import data from an external file, you must have the FILE privilege. You can use the GRANT FILE ON *.* TO $user_name; statement to grant this privilege, where $user_name is the user who needs to execute the LOAD DATA statement.
Syntax
LOAD DATA
[ [/*+ PARALLEL(N) load_batch_size(M) APPEND */]
| [/*+ PARALLEL(N) direct(bool, int) */] ]
[REMOTE_OSS | LOCAL] INFILE 'file_name'
INTO TABLE table_name
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
[IGNORE number {LINES | ROWS}]
[(column_name_var
[, column_name_var] ...)]
Parameters
| Parameter | Description |
|---|---|
| parallel(N) | The degree of parallelism for data loading. The default value is 4. |
| load_batch_size(M) | The size of the batch to be inserted each time. The default value is 100. We recommend that you set the value to a number in the range of [100,1000]. |
| APPEND | A hint for enabling direct load, which allows you to directly allocate space in the data file and write data to the file. By default, the APPEND hint is equivalent to the direct(true, 0) hint. In addition, the APPEND hint enables online statistics collection (the GATHER_OPTIMIZER_STATISTICS hint). |
| direct | A hint for enabling direct load. In the direct(bool, int) hint, the bool parameter indicates whether the specified CSV file needs to be sorted. If the value is true, the file needs to be sorted. The int parameter indicates the maximum number of error rows that can be tolerated. |
| REMOTE_OSS | LOCAL | Optional.
|
| file_name | The path and name of the input file. file_name can be in the following formats:
NoteWhen you import a file from OSS, make sure that the following conditions are met:
|
| table_name | The name of the table to which the data is imported. You can import data to a partitioned table or a non-partitioned table. |
| FIELDS | COLUMNS | The format of the fields.
|
| LINES STARTING BY | The starting delimiter of a line. |
| LINES TERMINATED BY | The ending delimiter of a line. |
| IGNORE number { LINES | ROWS } | The number of lines to be ignored. LINES specifies the number of lines at the beginning of the file, and ROWS specifies the number of rows at the beginning of the file. By default, the input file is matched with the columns of the table. If the input file does not contain all the columns, the missing columns are filled with default values as follows:
|
| column_name_var | The name of the column to be imported. |
Examples
Example 1: Import data from a file on the server (OBServer node)
Set the global secure file path.
Notice
Due to security reasons, when you set the
secure_file_privsystem variable, you can only modify the global variable by using a local socket connection. For more information, see secure_file_priv.obclient> SET GLOBAL secure_file_priv = "/";Log out.
Note
Since
secure_file_privis aGLOBALvariable, you need to execute the\qcommand to make the change take effect.obclinet> \qThe return result is as follows:
ByeAfter you reconnect to the database, use the
LOAD DATAstatement to import data.Normal import.
obclient> LOAD DATA INFILE '/home/admin/test.csv' INTO TABLE t1;Enable direct load by using the
APPENDhint.LOAD DATA /*+ PARALLEL(4) APPEND */ INFILE '/home/admin/test.csv' INTO TABLE t1;
Example 2: Import data from an OSS file
Notice
When you use an OSS path, the parameters in the path are separated by the & symbol. Make sure that the parameter values you enter contain only uppercase and lowercase letters, digits, \/-_$+=, and wildcards. If you enter other characters, the settings may fail.
Enable direct load by using the direct(bool, int) hint. The imported file is stored on OSS.
LOAD DATA /*+ direct(true,1024) parallel(16) */ REMOTE_OSS INFILE 'oss://antsys-oceanbasebackup/backup_rd/xiaotao.ht/lineitem2.tbl?host=***.oss-cdn.***&access_id=***&access_key=***' INTO TABLE tbl1 FIELDS TERMINATED BY ',';
Example 3: Import data from a local file (client)
Run the following statement to import data from a local file to a table in OceanBase Database.
Open the terminal or command prompt window and enter the following command to start the client.
obclient --local-infile -hxxx.xxx.xxx.xxx -P2881 -usys@oracle001 -p******The return result is as follows:
Welcome to the OceanBase. Commands end with ; or \g. Your OceanBase connection id is 3221548006 Server version: OceanBase 4.2.2.0 (r100000032024010510-75c47d4be18a399e13c5309de1a81da5caf4e7c0) (Built Jan 5 2024 10:17:55) Copyright (c) 2000, 2018, OceanBase and/or its affiliates. All rights reserved. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. obclient [SYS]>Notice
To use the
LOAD DATA LOCAL INFILEfeature, use OBClient V2.2.4 or later.In the client, execute the
LOAD DATA LOCAL INFILEstatement to load the local data file.obclient [SYS]> LOAD DATA LOCAL INFILE '/home/admin/test_data/tbl1.csv' INTO TABLE tbl1 FIELDS TERMINATED BY ',';The return result is as follows:
Query OK, 3 rows affected Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
References
- For more information about how to use the
LOAD DATAstatement, see Import data by using the LOAD DATA statement. - For more information about how to use the
LOAD DATAstatement for direct load, see Import data by using the LOAD DATA statement for direct load.