Purpose
This statement stores query results in a variable or file. Specifically:

- SELECT ... INTO OUTFILE stores the result set in an external file and allows you to specify the output format.

  Note: When using SELECT ... INTO OUTFILE, you can use \N to represent NULL.

- SELECT ... INTO DUMPFILE writes an unformatted single line to an external file.
- SELECT ... INTO var_list stores the result set in a variable.
Privilege requirements
To execute the SELECT INTO statement, you must have the FILE privilege and the SELECT privilege on the corresponding table. For more information about the privileges of OceanBase Database, see Privilege classification in MySQL-compatible mode.
Here is an example:
To grant the FILE privilege to a user, you can use the following command:
GRANT FILE ON *.* TO user_name;
where user_name is the user who will execute the SELECT INTO statement.
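For instance, both required privileges can be granted together. In this sketch, the user name tom and the database test are placeholders; adapt them to your environment:

```sql
-- Grant the global FILE privilege (placeholder user name "tom").
GRANT FILE ON *.* TO tom;
-- Grant the SELECT privilege on the tables to be exported (placeholder database "test").
GRANT SELECT ON test.* TO tom;
```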
Syntax
select_stmt
INTO {OUTFILE 'file_name' [PARTITION BY part_expr] [{CHARSET | CHARACTER SET} charset_name] [field_opt] [line_opt] [file_opt]
| OUTFILE 'file_name' [PARTITION BY part_expr] [FORMAT = (external_file_format_list)] [file_opt]
| DUMPFILE 'file_name'
| into_var_list};
field_opt:
{COLUMNS | FIELDS} field_term_list
field_term_list:
field_term [, field_term ...]
field_term:
{[OPTIONALLY] ENCLOSED | TERMINATED | ESCAPED} BY string
line_opt:
LINES line_term_list
line_term_list:
line_term [line_term]
line_term:
{STARTING | TERMINATED} BY string
file_opt:
file_option [, file_option ...]
file_option:
SINGLE [=] {TRUE | FALSE}
| MAX_FILE_SIZE [=] {int | string}
| BUFFER_SIZE [=] {int | string}
external_file_format_list:
type_csv_option
| type_parquet_option
| type_orc_option
type_csv_option:
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
FIELD_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
COMPRESSION = [NONE, GZIP, DEFLATE, ZSTD] | '<string>'
FILE_EXTENSION = ['<string>']
type_parquet_option:
TYPE = 'PARQUET',
COMPRESSION = '<string>'
ROW_GROUP_SIZE = '<string>' | <int>
type_orc_option:
TYPE = 'ORC'
COMPRESSION = '<string>'
COMPRESSION_BLOCK_SIZE = '<string>' | <int>
STRIPE_SIZE = '<string>' | <int>
ROW_INDEX_STRIDE = <int>
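As a minimal illustration of the INTO var_list branch of the syntax (the table name test_tbl1 and column col1 are assumptions), a single-row result can be stored in user variables and read back:

```sql
-- Store a single-row result in user variables, then read them back.
SELECT COUNT(*), MAX(col1) INTO @cnt, @max_val FROM test_tbl1;
SELECT @cnt, @max_val;
```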
Parameters
| Parameter | Description |
|---|---|
| select_stmt | Specifies the query statement to execute. select_stmt must return a result set. For more information about the structure and options of the query statement, see SELECT. |
| file_name | Specifies the path and name of the exported file. For more information, see file_name. |
| PARTITION BY part_expr | Optional. Specifies the partitioning method for the exported data. The value of part_expr is used as part of the export path. For each row, the value of part_expr is calculated; rows with the same value belong to the same partition and are exported to the same directory. |
| FORMAT = (external_file_format_list) | Optional. The FORMAT clause specifies the file format attributes for the exported file. Use TYPE to specify the CSV, PARQUET, or ORC file format. TYPE cannot be empty. For more information, see external_file_format_list. |
| {CHARSET \| CHARACTER SET} charset_name | Optional. Specifies the character set for the exported file. charset_name is the name of the character set. |
| field_opt | Optional. Specifies the format options for the exported fields. Use the FIELDS or COLUMNS clause to specify the format of each field in the output file. For more information, see field_term. |
| line_opt | Optional. Specifies the start and end delimiters for the exported data rows. Use the LINES clause to set the start and end characters for each row in the output file. For more information, see line_term. |
| file_opt | Optional. Specifies whether to export to multiple files and the size of each file when exporting to multiple files. For more information, see file_option. |
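To make the optional clauses concrete, the following sketch combines a character set with field and line options. The table name test_tbl1 and the output path are assumptions, and secure_file_priv must already permit the path:

```sql
SELECT * FROM test_tbl1
INTO OUTFILE '/home/admin/test_data/test_tbl1_gbk.csv'
CHARACTER SET gbk                          -- charset_name
FIELDS TERMINATED BY ',' ENCLOSED BY '"'   -- field_opt
LINES TERMINATED BY '\n';                  -- line_opt
```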
file_name
file_name has the following format:
When you save the exported file to an OBServer node, the format is:
/\$PATH/\$FILENAME, where the parameters are described as follows:

- \$PATH: specifies the path to save the exported file, that is, the path of the exported file on the OBServer node.
- \$FILENAME: specifies the name of the exported file. When SINGLE = FALSE, it specifies the prefix of the exported files. If you do not specify a prefix, the system uses the default prefix data. The system automatically generates the suffix.
When you save the exported file to an Alibaba Cloud OSS bucket, the format is:
oss://\$PATH/\$FILENAME/?host=\$HOST&access_id=\$ACCESS_ID&access_key=\$ACCESSKEY, where the parameters are described as follows:

- \$PATH: specifies the path to save the exported file, that is, the path of the file in the bucket.
- \$FILENAME: specifies the name of the exported file. When SINGLE = FALSE, it specifies the prefix of the exported files. If you do not specify a prefix, the system uses the default prefix data. The system automatically generates the suffix.
- \$HOST: specifies the host name, or the domain name of the CDN-accelerated service, for accessing the OSS service.
- \$ACCESS_ID: specifies the AccessKey ID used for authentication with the OSS service.
- \$ACCESSKEY: specifies the AccessKey secret used for authentication with the OSS service.
Note
- OceanBase Database supports exporting data to S3 and object storage services that are compatible with the S3 protocol.
- Alibaba Cloud OSS has a file size limit. Therefore, when you export a file larger than 5 GB to OSS, the system splits the file into multiple files, each of which is smaller than 5 GB.
When you save the exported file to an HDFS cluster, the format is:
Notice
When you use the write capability of an HDFS external table, OceanBase Database must be deployed in an environment that provides the Java SDK. For more information about how to deploy an environment that provides the Java SDK, see Deploy an environment that provides the Java SDK.
- The export format for basic (SIMPLE) authentication, without Kerberos:

  hdfs://\$NAMENODE:\$PORT/\$PATH/\$FILENAME[?username=\$USERNAME]

  Description:

  - When you use basic (SIMPLE) authentication, you can specify the HDFS username by using the username parameter.
  - If you do not specify the username parameter, the system uses the HDFS username defined in the configuration of OceanBase Database.

- The export format for Kerberos authentication:

  hdfs://\$NAMENODE:\$PORT/\$PATH/\$FILENAME?principal=\$PRINCIPAL&keytab=\$KEYTAB&krb5conf=\$KRB5CONF&configs=\$CONFIGS

  Description:

  - For Kerberos authentication, you must provide the Kerberos principal (principal), keytab file (keytab), and Kerberos configuration file (krb5conf).
  - In this scenario, the username parameter is not applicable because Kerberos authentication identifies the user by the principal.
The parameters are described as follows:
Common parameters:
- \$NAMENODE: specifies the IP address of the HDFS NameNode.
- \$PORT: specifies the port of the HDFS NameNode.
- \$PATH: specifies the storage path of the file in HDFS.
- \$FILENAME: specifies the name of the exported file.
Parameters for SIMPLE authentication:
- ?username=\$USERNAME: optional. Specifies the HDFS username. If you do not specify the username parameter, the system uses the HDFS username defined in the configuration of OceanBase Database.
Parameters for Kerberos authentication:
- \$PRINCIPAL: specifies the name of the Kerberos principal. Example: principal=hdfs/hadoop@EXAMPLE.COM.
- \$KEYTAB: specifies the path of the Kerberos keytab file. Example: keytab=/path/to/hdfs.keytab.
- \$KRB5CONF: specifies the path of the Kerberos configuration file. Example: krb5conf=/path/to/krb5.conf.
- \$CONFIGS: specifies additional HDFS configuration parameters in key-value pair format. Example: configs=dfs.data.transfer.protection=authentication,privacy.
The following examples show the values of file_name when you save the exported file to an HDFS cluster:
SIMPLE authentication (basic authentication):
Applies to an HDFS cluster that does not support Kerberos authentication.
Example 1 (default username):

hdfs://10.10.10.1:9000/data/export/sales_data.csv

Example 2 (specified username):

hdfs://10.10.10.1:9000/data/export/sales_data.csv?username=hiveuser
Kerberos authentication:
Applies to an HDFS cluster that supports Kerberos authentication.
You must specify the query parameters as needed. Multiple configuration items (such as configs) are supported.
Example:
hdfs://10.10.10.1:9000/data/export/sales_data.csv?principal=hdfs/hadoop@EXAMPLE.COM&keytab=/etc/security/keytabs/hdfs.keytab&krb5conf=/etc/krb5.conf&configs=dfs.data.transfer.protection=authentication,privacy
field_term
- [OPTIONALLY] ENCLOSED BY string: specifies the character that wraps field values. By default, no quotation marks are used. For example, ENCLOSED BY '"' indicates that values are enclosed in double quotation marks. If you use the OPTIONALLY keyword, the specified character wraps only string values.
- TERMINATED BY string: specifies the character that separates field values. For example, TERMINATED BY ',' separates field values with commas.
- ESCAPED BY string: specifies the escape character used to handle special characters or parse data in special formats. The default escape character is the backslash (\).
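The three field options can be combined in one statement, as in this sketch (the table name and output path are assumptions):

```sql
-- Comma-separated fields, string values quoted with ", backslash as the escape character.
SELECT * FROM test_tbl1
INTO OUTFILE '/home/admin/test_data/fields_demo.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\';
```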
line_term
- STARTING BY string: specifies the starting character of each line.
- TERMINATED BY string: specifies the ending character of each line. By default, a newline character is used. For example, ... LINES TERMINATED BY '\n' ... indicates that a newline character marks the end of a line.
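A sketch showing both line options together (the table name and output path are assumptions); each output line begins with a fixed marker and ends with a newline:

```sql
SELECT * FROM test_tbl1
INTO OUTFILE '/home/admin/test_data/lines_demo.txt'
LINES STARTING BY '>> ' TERMINATED BY '\n';
```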
file_option
- SINGLE [=] {TRUE | FALSE}: specifies whether to export data to a single file or multiple files.

  - SINGLE [=] TRUE: exports data to a single file. This is the default value.
  - SINGLE [=] FALSE: exports data to multiple files.

  Notice: When the parallelism is greater than 1 and SINGLE = FALSE, data can be exported to multiple files in parallel, which improves the export speed.

- MAX_FILE_SIZE [=] {int | string}: specifies the maximum size of a single file during export. This option takes effect only when SINGLE = FALSE.
- BUFFER_SIZE [=] {int | string}: specifies the memory allocated to each thread for each partition during export (the whole export is treated as a single partition if no partitioning is specified). The default value is 1 MB.

  Note: BUFFER_SIZE is used for export performance tuning. When the machine has sufficient memory and you want to improve export efficiency, set a larger value (such as 4 MB). When memory is insufficient, set a smaller value (such as 4 KB). If you set it to 0, all partitions in a single thread share one memory block.
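The file options can be combined as in the following sketch (the table name, path, and size values are assumptions):

```sql
-- Parallel export to multiple files of at most 256 MB each,
-- with a 4 MB per-thread buffer.
SELECT /*+parallel(4)*/ * FROM test_tbl1
INTO OUTFILE '/home/admin/test_data/part_'
FIELDS TERMINATED BY ','
SINGLE = FALSE MAX_FILE_SIZE = '256MB' BUFFER_SIZE = '4MB';
```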
external_file_format_list
TYPE = 'CSV': contains the following fields:
- LINE_DELIMITER: specifies the line delimiter for CSV files. The default value is LINE_DELIMITER = '\n'.
- FIELD_DELIMITER: optional. Specifies the column delimiter for CSV files. The default value is FIELD_DELIMITER = '\t'.
- ESCAPE: specifies the escape character for CSV files. It must be a single byte. The default value is ESCAPE = '\'.
- FIELD_OPTIONALLY_ENCLOSED_BY: optional. Specifies the character used to enclose field values in CSV files. The default value is empty. This option applies only to certain field types (such as CHAR, VARCHAR, TEXT, and JSON).
- FIELD_ENCLOSED_BY: optional. Specifies the character used to enclose field values in CSV files. The default value is empty, indicating that no enclosure is used. This option applies to all field types.
- ENCODING: specifies the character set encoding of the file. If not specified, the default value is UTF8MB4.
- COMPRESSION: optional. Specifies the compression format of the file. Supported values:

  - NONE: no compression. This is the default value.
  - GZIP/DEFLATE: GZIP compression.
  - ZSTD: ZSTD compression.

- FILE_EXTENSION: optional. Specifies a user-defined file extension. This option is used only for multi-file exports and is supported only for the CSV format. If not specified, the file extension is determined by the format type; for CSV, the default file extension is .csv.
TYPE = 'PARQUET': contains the following fields:
- COMPRESSION: specifies the compression format of the PARQUET file. The default compressed file extension is .parquet. Supported values are UNCOMPRESSED (no compression), SNAPPY, GZIP, BROTLI, ZSTD, LZ4, and LZ4_HADOOP. If not specified, the default value is UNCOMPRESSED.
- ROW_GROUP_SIZE: specifies the size of a row group in the PARQUET file, in bytes. You can specify a number or a string such as '64MB'. If not specified, the default value is 256MB. We recommend that you use the default value.
TYPE = 'ORC': contains the following fields:
- COMPRESSION: specifies the compression format of the ORC file. The default compressed file extension is .orc. Supported values are UNCOMPRESSED (no compression), SNAPPY, ZLIB, LZ4, and ZSTD. If not specified, the default value is UNCOMPRESSED.
- COMPRESSION_BLOCK_SIZE: specifies the block size for data compression, in bytes. You can specify a number or a string such as '64KB'. If not specified, the default value is 256KB. We recommend that you use the default value.
- STRIPE_SIZE: specifies the stripe size of the ORC file, in bytes. You can specify a number or a string such as '64MB'. If not specified, the default value is 64MB. We recommend that you use the default value.
- ROW_INDEX_STRIDE: specifies how often an index entry is recorded, in rows. If not specified, the default value is 10000. We recommend that you use the default value.
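As an illustration of the FORMAT clause, the following sketch exports a table to a single ZSTD-compressed Parquet file (the table name and output path are assumptions):

```sql
SELECT * FROM test_tbl1
INTO OUTFILE '/home/admin/test_data/test_tbl1.parquet'
FORMAT = (
    TYPE = 'PARQUET',
    COMPRESSION = 'ZSTD'
)
SINGLE = TRUE;
```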
Examples
Export data files to the local server
Set the file path for export.

To export files, you must set the system variable secure_file_priv to specify the path where exported files can be stored.

Notice: For security reasons, you can modify the global variable secure_file_priv only through a local socket connection. For more information, see secure_file_priv.

- Log in to the OBServer node to be connected to OceanBase Database.

  ssh admin@xxx.xxx.xxx.xxx

- Connect to the mysql001 tenant by using a local Unix socket.

  obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******

- Set the export path to /home/admin/test_data.

  SET GLOBAL secure_file_priv = "/home/admin/test_data";

- Log out.

After you reconnect to the database, use the SELECT INTO OUTFILE statement to export data. Use a comma as the delimiter between field values, enclose string values with the " character, and use a newline character as the line end marker.

- Export data to a single file named test_tbl1.csv.

  SELECT /*+parallel(2)*/ * FROM test_tbl1 INTO OUTFILE '/home/admin/test_data/test_tbl1.csv' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n';

  The return result is as follows:

  Query OK, 9 rows affected

- Export data to multiple files in parallel. Do not specify a file name (the default prefix is data), and ensure that each file does not exceed 4 MB.

  SELECT /*+parallel(2)*/ * FROM test_tbl1 INTO OUTFILE '/home/admin/test_data/' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' SINGLE = FALSE MAX_FILE_SIZE = '4MB';

  The return result is as follows:

  Query OK, 9 rows affected

- Export data to multiple files in parallel. Specify dd2024 as the file name prefix, and ensure that each file does not exceed 4 MB.

  SELECT /*+parallel(2)*/ * FROM test_tbl1 INTO OUTFILE '/home/admin/test_data/dd2024' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n' SINGLE = FALSE MAX_FILE_SIZE = '4MB';

  The return result is as follows:

  Query OK, 9 rows affected
Note
- When multiple export tasks write to the same path, errors or incomplete data exports may occur. You can avoid these issues by setting the export paths properly. For example, when the statements SELECT /*+parallel(2)*/ * FROM t1 INTO OUTFILE 'test/data' SINGLE = FALSE; and SELECT /*+parallel(2)*/ * FROM t2 INTO OUTFILE 'test/data' SINGLE = FALSE; are executed at the same time, an error may occur because the exported files have the same name. We recommend that you set the export paths to test/data1 and test/data2, respectively.
- If SINGLE = FALSE and the export fails, for example because a target file already exists, delete all files in the export directory that share the export target's prefix, or delete and recreate the export directory, and then retry the export. For example, if the statement SELECT /*+parallel(2)*/ * FROM t1 INTO OUTFILE 'test/data' SINGLE = FALSE; fails, delete all files with the data prefix in the test directory, or delete and recreate the test directory, and then retry.
Log in to the server and view the exported files in the /home/admin/test_data directory on the OBServer node.
[xxx@xxx /home/admin/test_data]# ls
The return result is as follows:
data_0_0_0 data_0_1_0 dd2024_0_0_0 dd2024_0_1_0 test_tbl1.csv
In the result, test_tbl1.csv is the file name of the data exported in the single-file export example. data_0_0_0 and data_0_1_0 are the file names of the data exported in the multiple-file export example without file name specification. dd2024_0_0_0 and dd2024_0_1_0 are the file names of the data exported in the multiple-file export example with the file name prefix specified as dd2024.
Export data files to OSS
Use the SELECT INTO OUTFILE statement to export data from the test_tbl2 table to a specified OSS storage location by partition. The partitioning is based on the combination of the col1 and col2 columns. Rows with the same values in these columns belong to the same partition and are exported to the same directory.
SELECT /*+parallel(3)*/ * FROM test_tbl2
INTO OUTFILE 'oss://$DATA_FOLDER_NAME/?host=$OSS_HOST&access_id=$OSS_ACCESS_ID&access_key=$OSS_ACCESS_KEY'
PARTITION BY CONCAT(col1,'/',col2)
SINGLE = FALSE BUFFER_SIZE = '2MB';
The storage location is specified by the $DATA_FOLDER_NAME variable. You also need to provide the host address, Access ID, and Access Key of the OSS service.
Export data files to HDFS
Notice
To use the write capability of HDFS external tables, OceanBase Database must be deployed in an environment that supports the Java SDK. For more information about how to deploy an environment that supports the Java SDK, see Deploy an environment that supports the Java SDK.
Export data by using the SIMPLE authentication mode
Export data from the lineitem_external table in OceanBase Database to HDFS and save the data as CSV files.
SELECT * FROM lineitem_external
INTO OUTFILE 'hdfs://${namenode}:${port}/path/to/lineitem.csv'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '\n'
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
SINGLE = TRUE;
Export data by using the KERBEROS authentication mode
Unlike the SIMPLE authentication mode, the KERBEROS authentication mode requires additional configuration files and key files. Export data from the lineitem_external table in OceanBase Database to HDFS and save the data as CSV files.
SELECT * FROM lineitem_external
INTO OUTFILE 'hdfs://${namenode}:${port}/path/to/lineitem.csv?principal=hdfs/hadoop@EXAMPLE.COM&keytab=/path/to/hdfs.keytab&krb5conf=/path/to/krb5.conf&configs=dfs.data.transfer.protection=authentication,privacy'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '\n'
FIELD_DELIMITER = ','
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
SINGLE = TRUE;