Create an external table

2025-07-11 09:43:08  Updated

You can create an external table by using the CREATE EXTERNAL TABLE statement. When you create an external table, you must specify the path of data files and the format of data files to read data from external files.

Privilege requirements

To create an external table, the current user must have the CREATE TABLE privilege. For more information about how to view user privileges, see View user privileges. If you do not have the CREATE TABLE privilege, contact the administrator to grant it to you. For more information about how to grant privileges to users, see Directly grant privileges.

Create an external table

The following example shows the syntax for creating an external table:

CREATE EXTERNAL TABLE table_name
( col_name col_type [AS (metadata$filecol{N})] 
  [ , col_name col_type [AS (metadata$filecol{N})] ]
  [ , ... ] )
LOCATION = '<string>'
FORMAT = (
  TYPE = 'CSV'
  LINE_DELIMITER = '<string>' | <expr>
  FIELD_DELIMITER = '<string>' | <expr>
  ESCAPE = '<character>' | <expr>
  FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
  ENCODING = 'charset'
  NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
  SKIP_HEADER = <int>
  SKIP_BLANK_LINES = { TRUE | FALSE }
  TRIM_SPACE = { TRUE | FALSE }
  EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]

The parameters in the syntax are described as follows:

  • col_name col_type [AS (metadata$filecol{N})]: specifies the columns of the external table. You can use AS (metadata$filecol{N}) to manually define column mapping.

    The column types supported by an external table are the same as those supported by a regular table. For more information about the data types supported in Oracle mode of OceanBase Database and descriptions of the data types, see Overview.

    By default, columns in external files are automatically mapped to columns in an external table in sequence. That is, the first column in an external table is automatically mapped to the first column in the external file, and so on.

    For example, in the following example, the C1 column in the ext_t1 external table is automatically mapped to the first column in the external file, and the C2 column is automatically mapped to the second column in the external file.

    CREATE EXTERNAL TABLE ext_t1 (
      C1 int,
      C2 int
      )
      LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/'
      FORMAT = (
      TYPE = 'CSV'
      FIELD_DELIMITER = '|'
      );
    

    If the columns in the external file are in a different order from those in the external table, you can use a pseudo-column in the format of metadata$filecol{N} to specify that a column in the external table be mapped to the Nth column in the external file. Note that the columns in the file are numbered starting from 1.

    For example, in the following example, C1 int AS (metadata$filecol2) specifies that the C1 column in the ext_t2 external table be mapped to the second column in the file, and C2 int AS (metadata$filecol4) specifies that the C2 column in the ext_t2 external table be mapped to the fourth column in the external file.

    CREATE EXTERNAL TABLE ext_t2 (
                C1 int AS (metadata$filecol2),
                C2 int AS (metadata$filecol4)
      )
      LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/'
      FORMAT = (
      TYPE = 'CSV'
      FIELD_DELIMITER = '|'
      );
    

    Notice

    If you want to manually define column mapping, automatic column mapping will fail and all columns must be manually mapped.

  • LOCATION = '<string>': specifies the path where the external file is stored. Generally, the data files of an external table are stored in a dedicated directory, which can contain subdirectories. When you create an external table, the system automatically collects all files in the directory.

    Two formats are supported:

    • Local location format: LOCATION = '[file://] local_file_path'

      • local_file_path: can be a relative path or an absolute path. If you enter a relative path, the current directory must be the installation directory of OceanBase Database.

        Notice

        The local_file_path parameter must be a directory, not a file. If you want to specify a single file, you can specify the upper directory of the file in the LOCATION parameter and specify the file itself by setting the PATTERN parameter.

      • For scenarios that use the local location format, if you use the system variable secure_file_priv to specify a path that OceanBase Database can access, secure_file_priv must be the upper directory of local_file_path, namely, local_file_path must be a subdirectory of secure_file_priv.

        The tenant-level system variable secure_file_priv specifies the path that OceanBase Database can access when you import data to a file or export data from a file. For more information about secure_file_priv, see secure_file_priv.

    • Remote location format: LOCATION = '{oss|S3}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'

      $ACCESS_ID, $ACCESS_KEY, and $HOST are required access information for accessing Alibaba Cloud OSS, AWS S3, or object storage services that support the S3 protocol. These sensitive access information is stored in the system tables of the database in encrypted form.

      Notice

      When you use object storage service, separate the parameters of the object storage service path with the & character. Make sure that the values of the parameters contain only uppercase and lowercase English letters, numbers, and the characters /-_$+= and wildcard. If the values contain other characters, the setting may fail.

  • FORMAT = ( TYPE = 'CSV'...): specifies the external file format as CSV.

    • TYPE: specifies the type of the external file.
    • LINE_DELIMITER: specifies the line delimiter of the file. If you do not specify this parameter, the default value LINE_DELIMITER='\n' takes effect.
    • FIELD_DELIMITER: specifies the field delimiter of the file. If you do not specify this parameter, the default value FIELD_DELIMITER='\t' takes effect.
    • ESCAPE: specifies the escape character of the file. For example, ESCAPE ='*' specifies to use an asterisk (*) as the escape character instead of the default escape character (). If you do not specify this parameter, the default value ESCAPE ='\' takes effect.
    • FIELD_OPTIONALLY_ENCLOSED_BY: specifies the characters that enclose the field values in the file. For example, ESCAPE = '"' specifies to enclose the field values in double quotation marks. If you do not specify this parameter, the default value takes effect.
    • ENCODING: specifies the character set encoding used by the file. For more information about the character sets supported in Oracle mode of OceanBase Database, see Character sets. If you do not specify this parameter, the default value UTF8MB4 takes effect.
    • NULL_IF: specifies the strings that are to be treated as NULL values. If you do not specify this parameter, the default value takes effect.
    • SKIP_HEADER: specifies the number of lines to skip in the file header. If you do not specify this parameter, the default value specifies not to skip the file header.
    • SKIP_BLANK_LINES: specifies whether to skip blank lines. If you do not specify this parameter, the default value FALSE takes effect.
    • TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields in the file. If you do not specify this parameter, the default value FALSE takes effect.
    • EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings as NULL values. If you do not specify this parameter, the default value FALSE takes effect.
  • FORMAT = ( TYPE = 'PARQUET'...): specifies the external file format as PARQUET.

  • PATTERN: specifies the regular pattern string to filter files in the directory specified by LOCATION. For each file in the directory specified by LOCATION, if the file matches the pattern string, the external table can access the file. Otherwise, the external table skips the file. If you do not specify this parameter, the external table can access all files in the directory specified by LOCATION by default.

Assume that a data.csv file exists in the /home/admin/oceanbase/ directory on the local server, and the file contains the following data:

1,"lin",98
2,"hei",90
3,"ali",95
  1. On the OBServer node, the tenant administrator connects to the Oracle tenant of the database cluster by using a local Unix socket.

    Here is an example of the connection:

    obclient -S /home/admin/oceanbase/run/sql.sock -uroot@sys -p********
    

    To connect to OceanBase Database by using a local Unix socket, perform the specific operations and follow the instructions in secure_file_priv.

  2. Configure the path /home/admin/oceanbase/ that OceanBase Database can access.

    SET GLOBAL secure_file_priv = "/home/admin/oceanbase/";
    

    After the command is executed, you need to restart the session for the modification to take effect.

  3. Reconnect to the database and create an external table named ext_t3.

    CREATE EXTERNAL TABLE ext_t3(ID NUMBER(32), NAME VARCHAR2(30),SCORE NUMBER(32))
    LOCATION = '/home/admin/oceanbase/'
    FORMAT = (
    TYPE = 'CSV'
    FIELD_DELIMITER = ','
    FIELD_OPTIONALLY_ENCLOSED_BY ='"'
    )
    PATTERN = 'data.csv';
    

After the external table is created, you can use the SHOW CREATE TABLE statement to view the table definition, just like you do with a regular table.

SHOW CREATE TABLE ext_t3;

The query result is as follows:

+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE  | CREATE TABLE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXT_T3 | CREATE EXTERNAL TABLE "EXT_T3" (
  "ID" NUMBER(32) GENERATED ALWAYS AS (METADATA$FILECOL1),
  "NAME" VARCHAR2(30) GENERATED ALWAYS AS (METADATA$FILECOL2),
  "SCORE" NUMBER(32) GENERATED ALWAYS AS (METADATA$FILECOL3)
)
LOCATION='file:///home/admin/oceanbase/'
PATTERN='data.csv'
FORMAT (
  TYPE = 'CSV',
  FIELD_DELIMITER = ',',
  FIELD_OPTIONALLY_ENCLOSED_BY = '"',
  ENCODING = 'utf8mb4'
)COMPRESS FOR ARCHIVE REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set

You can also access the external table just like a regular table. When you query an external table, the system reads the external file by using the driver of the external table, parses the file according to the file format, converts the data into internal data types of OceanBase Database, and returns the data rows. Here is an example of querying the external table lineitem that is created just now.

SELECT * FROM ext_t3;

The query result is as follows:

+----+------+-------+
| ID | NAME | SCORE |
+----+------+-------+
|  1 | lin  |    98 |
|  2 | hei  |    90 |
|  3 | ali  |    95 |
+----+------+-------+
3 rows in set

In addition, you can combine an external table with a regular table for query operations. Assume that a regular table named info exists in the current database, and the table contains the following data:

+------+--------+------+
| NAME | SEX    | AGE  |
+------+--------+------+
| lin  | male   |    8 |
| hei  | male   |    9 |
| li   | female |    8 |
+------+--------+------+
3 rows in set

Here is an example of combining the external table ext_t3 and the regular table info for query operations.

SELECT info.* FROM info, ext_t3 WHERE info.name = ext_t3.name AND ext_t3.score > 90;

The query result is as follows:

+------+--------+------+
| NAME | SEX    | AGE  |
+------+--------+------+
| lin  | male   |    8 |
| li   | female |    8 |
+------+--------+------+
2 rows in set

For more information about queries, see Read data.

Considerations

  • An external table can only be queried, and you cannot perform DML operations on it.

  • When you query an external table, if the external file accessed by the table is deleted, the system does not return an error, but instead returns an empty result set.

  • The external file system manages the files accessed by an external table. If the external storage system is unavailable, an error is returned when you query the external table.

What to do next

When you create an external table, the system saves the list of files that match the PATTERN in the specified LOCATION in the system table of OceanBase Database. During a scan, the system accesses the external files based on this list. If you add new files to the external directory, you must perform an operation to add the new files to the file list of the external table. For more information, see Manage external files.

After you create an external table, you can drop it. The statement for dropping an external table is the same as that for a normal table. You can use the DROP TABLE statement to drop an external table. For more information, see Drop a table.

References

Contact Us