LOAD DATA

2024-10-23 05:57:51  Updated

Purpose

You can use this statement to import data from an external file.

In OceanBase Database, the LOAD DATA statement can load only local input files on OBServer nodes. Therefore, you must copy a file to an OBServer node before you import it.

Note

  • The current version of OceanBase Database does not support the LOCAL INFILE statement.
  • Do not use the LOAD DATA statement on tables with triggers.

You can use the LOAD DATA statement to import a CSV text file in the following process:

  1. Parse the file: OceanBase Database reads data from a file based on the file name that you enter and determines whether to perform parallel or serial parsing of data from the input file based on the specified degree of parallelism (DOP).

  2. Distribute the data: OceanBase Database is a distributed database. Data of each partition may be distributed across different OBServer nodes. The LOAD DATA statement is used to process the parsed data and determine to which OBServer node the data is sent.

  3. Insert the data: After the destination OBServer node receives the data, it executes the INSERT statement to insert the data into the corresponding partition.

To import data from an external file, you must have the FILE privilege.

Syntax

LOAD DATA
    [/*+ PARALLEL(N) load_batch_size(M) APPEND */]
   |[/*+ PARALLEL(N) direct(bool, int) */]
    INFILE 'file_name'
    INTO TABLE table_name
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(column_name_var
        [, column_name_var] ...)]

Parameters

Parameter Description
parallel(N) The DOP for loading data. The default value of N is 4.
load_batch_size(M) The batch size of each insertion. The default value of M is 100. Recommended value range: [100, 1000].
APPEND A hint for enabling the bypass import feature. This feature allows you to allocate space and write data to data files. By default, this hint is equivalent to direct(false, 0) and can collect statistics online like the GATHER_OPTIMIZER_STATISTICS hint does.
direct A hint for enabling the bypass import feature. The bool parameter in direct(bool, int) specifies whether the given CSV file needs to be ordered. The value true indicates that the file must be ordered. The int parameter specifies the maximum number of error rows allowed.
file_name The path and name of the input file. Currently, you can load only local files from OBServer nodes.
table_name The name of the table from which data is imported. Partitioned and non-partitioned tables are supported.
FIELDS | COLUMNS The format of the field.
  • ENCLOSED BY: specifies the modifier of the exported value.
  • TERMINATED BY: specifies the end character of the exported column.
  • ESCAPED BY specifies the characters ignored for the exported value.
LINES STARTING BY The start character of the line.
LINES TERMINATED BY The end character of the line.
IGNORE number { LINES | ROWS } Specifies to ignore the first few lines. LINES indicates the first few lines of the file. ROWS indicates the first few rows of data specified by the field delimiter. By default, fields in the input file are mapped to columns in the destination table one by one. If the input file does not contain all the columns, the missing columns are filled based on the following mappings:
  • Character data type: null string
  • Numeric data type: 0
  • Date data type: 0000-00-00
column_name_var The name of the imported column.

Examples

Example 1: Import data from an external file.

  1. Set a global security path.

    obclient> SET GLOBAL secure_file_priv = ""
    Query OK, 0 rows affected
    obclient> \q
    Bye
    
  2. Import data from an external file.

    obclient> LOAD DATA INFILE 'test.sql' INTO TABLE t1;
    Query OK, 0 rows affected
    

Example 2: Use the APPEND hint to enable the bypass import feature.

LOAD DATA /*+ PARALLEL(4) APPEND */
   INFILE '/home/admin/a.csv'
   INTO TABLE t;

Example 3: Use the direct(bool, int) hint to enable the bypass import feature. The file to be imported in bypass mode can be stored in OSS.

load data /*+ parallel(1) direct(false,0)*/ remote_oss infile 'oss://antsys-oceanbasebackup/backup_rd/xiaotao.ht/lineitem2.tbl?host=***.oss-cdn.***&access_id=***&access_key=***' into table lineitem fields terminated by '|' enclosed by '' lines starting by '' terminated by '\n';

Contact Us