Import data by using the LOAD DATA statement

2024-12-02 03:48:29  Updated

OceanBase Database allows you to use the LOAD DATA statement to import data from external files to database tables.

Limitations

Do not use the LOAD DATA statement on tables with triggers.

Considerations

OceanBase Database uses parallel processing to optimize the data import rate of the LOAD DATA statement. In parallel processing, data is split into multiple subtasks for parallel execution. Each subtask is considered an independent transaction. The execution sequence of the subtasks is not fixed. Therefore:

  • Global atomicity cannot be ensured during the data import.
  • For a table without a primary key, data may be written in a sequence different from that in the original file.

Scenarios

You can use the LOAD DATA statement to import a CSV file as follows:

Note

OceanBase Database supports loading data files from Alibaba Cloud Object Storage Service (OSS), the server (an OBServer node), or the client (local).

  1. Parse the file.

    OceanBase Database reads data from a file based on the file name that you entered and determines whether to perform parallel or serial parsing of data from the input file based on the degree of parallelism (DOP) that you specified.

  2. Distribute data.

    OceanBase Database is a distributed database system. Data of each partition may be distributed across different OBServer nodes. The LOAD DATA statement is used to process the parsed data and determine the OBServer node to which the data is to be sent.

  3. Insert data.

    After the destination OBServer node receives the data, it executes the INSERT statement to insert the data into the corresponding partition.

Syntax

For more information about the LOAD DATA statement, see LOAD DATA (MySQL mode) or LOAD DATA (Oracle mode).

Obtain the privileges to execute the LOAD DATA statement

Before you execute the LOAD DATA statement, you must obtain the required privileges. The procedure for granting execution privileges is as follows:

  1. Grant the FILE privilege.

    Here is an example:

    Use the following syntax to grant the FILE privilege:

    GRANT FILE ON *.* TO user_name;
    

    Here, user_name is the user who needs to execute the LOAD DATA statement.

  2. Grant other necessary privileges.

    • The INSERT privilege on the corresponding table is required in MySQL mode.

      Here is an example:

      Use the following syntax to grant the INSERT privilege:

      GRANT INSERT ON database_name.tbl_name TO user_name;
      

      Here, database_name specifies the database name, tbl_name specifies the table name, and user_name specifies the user who needs to execute the LOAD DATA statement.

    • The CREATE SESSION privilege is required in Oracle mode.

      Here is an example:

      Use the following syntax to grant the CREATE SESSION privilege:

      GRANT CREATE SESSION TO user_name;
      

      Here, user_name is the username of the user to which the privilege is to be granted.

Examples

Note

OceanBase Database supports two modes: MySQL and Oracle. The following example shows how to use the LOAD DATA statement in MySQL mode.

Import data from a file on the server

  1. Log in to the server where the OBServer node to connect to resides.

    Here is an example:

    ssh admin@10.10.10.1
    
  2. Create test data in the /home/admin/test_data directory.

    Here is an example:

    Run the following command to write a script named student.sql:

    vi student.sql
    
  3. Enter the editing mode and add test data.

    Here is an example:

    Press the i or Insert key to enter the insert mode of the vi editor and add the following content:

    1,"lin",98
    2,"hei",90
    3,"ali",95
    
  4. Set the path where the file to be imported is located.

    Notice

    For security reasons, when you set the system variable secure_file_priv, you can connect to the database only through a local socket to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.

    Here is an example:

    1. Log in to the server where the OBServer node to connect to resides.

      ssh admin@10.10.10.1
      
    2. Execute the following statement to connect to the mysql001 tenant through a local Unix Socket:

      Here is an example:

      obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******
      
    3. Set the file directory to /, which indicates that any path can be accessed.

      SET GLOBAL SECURE_FILE_PRIV = "/";
      
  5. Reconnect to the database.

    Here is an example:

    obclient -h127.0.0.1 -P2881 -utest_user001@mysql001 -p****** -A
    
  6. Create a test table.

    Here is an example:

    Execute the following statement to create a test table named student:

    obclient [test]> CREATE TABLE student (id INT, name VARCHAR(50), score INT);
    
  7. Use the LOAD DATA statement to import data.

    Here is an example:

    Execute the following LOAD DATA statement to load data from the specified file to a data table. In this statement:

    • Set the path and name of the file to be loaded to /home/admin/test_data/student.sql.
    • Set the name of the destination table to which the data is to be loaded to student.
    • Specify comma (,) as the delimiter for fields in the data file.
    • Specify to enclose fields (of the character type) in the data file by using double quotation marks (").
    • Specify to use line breaks as the end characters of lines in the data file.
    • Specify the mappings between columns in the source data file and those in the destination table. The first column in the data file will be mapped to the id column in the destination table, the second column to the name column, and the third column to the score column.
    obclient [test]> LOAD DATA INFILE '/home/admin/test_data/student.sql'
    INTO TABLE student
    FIELDS TERMINATED BY ','
    ENCLOSED BY '"'  
    LINES TERMINATED BY '\n'
    (id,name,score);
    

    The result is as follows:

    Query OK, 3 rows affected
    Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
    
  8. View data in the destination table.

    Here is an example:

    obclient [test]> SELECT * FROM student;
    

    The result is as follows:

    +------+------+-------+
    | id   | name | score |
    +------+------+-------+
    |    1 | lin  |    98 |
    |    2 | hei  |    90 |
    |    3 | ali  |    95 |
    +------+------+-------+
    3 rows in set
    

Import data from a local file on the client

Perform the following steps to import data from a local file to a table in OceanBase Database:

  1. Create test data in the local directory /home/admin/test_data.

    Here is an example:

    Run the following command to write a script named test_tbl1.csv:

    vi test_tbl1.csv
    
  2. Enter the editing mode and add test data.

    Here is an example:

    Press the i or Insert key to enter the insert mode of the vi editor and add the following content:

    1,11
    2,22
    3,33
    
  3. Start the client.

    Here is an example:

    Run the following command to use OceanBase Client (OBClient) to connect to OceanBase Database. Add the --local-infile parameter in the command to enable the feature for loading data from local files.

    obclient --local-infile -hxxx.xxx.xxx.xxx -P2881 -uroot@mysql001 -p****** -Dtest
    

    Notice

    To use the LOAD DATA LOCAL INFILE feature, use OBClient of V2.2.4 or later. If you do not have OBClient of the required version, you can also use a MySQL client to connect to OceanBase Database.

  4. Create a test table.

    Here is an example:

    CREATE TABLE test_tbl1(col1 INT,col2 INT);
    
  5. Execute the LOAD DATA LOCAL INFILE statement on the client to load data from a local file.

    Here is an example:

    obclient [test]> LOAD DATA LOCAL INFILE '/home/admin/test_data/test_tbl1.csv' INTO TABLE test_tbl1 FIELDS TERMINATED BY ',';
    

    The result is as follows:

    Query OK, 3 rows affected
    Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
    
  6. View data in the destination table.

    Here is an example:

    obclient [test]> SELECT * FROM test_tbl1;
    

    The result is as follows:

    +------+------+
    | col1 | col2 |
    +------+------+
    |    1 |   11 |
    |    2 |   22 |
    |    3 |   33 |
    +------+------+
    3 rows in set
    

Exception handling

Log files

If an error occurs during the import, the INSERT statement is rolled back, and the LOAD DATA statement generates a log file named obloaddata.log <XXXXXX> in the log subdirectory of the installation directory of the observer process. The following example shows the content of a log file. The log contains the basic information about the task generated by the LOAD DATA statement, including the tenant name, source file name, destination table name, DOP, LOAD DATA statement used, and error details in lines.

Tenant name:    mysql
File name:  /home/admin/a.csv
Into table: `test`.`t`
Parallel:   1
Batch size: 1000
SQL trace:  YD7A20BA65670-0005AADAAA3C****
Start time: 2020-07-29 21:08:13.073741
Load query:
load data infile '/home/admin/test.csv' into table t fields terminated by ',' lines terminated by '\n'
Row ErrCode ErrMsg
1   1062    Duplicated primary key
2   1062    Duplicated primary key

References

  • For more information about how to use the LOAD DATA statement to import data through direct load, see Direct load.
  • For more information about how to connect to a database, see Overview.
  • For more information about how to drop a table, see Drop a table.

Contact Us