Description
This statement creates an external table in a database.
External tables are a key feature in a database management system. Generally, the data of a table in a database is stored in the storage space of the database, while an external table has data stored in an external storage service.
When you create an external table, you need to specify the file path and file format of the data. Then, you can read the data of the external table from the external storage service. An external table is read-only. You can use it in a query statement, but you cannot perform Data Manipulation Language (DML) operations on it. You cannot define constraints or create indexes on external tables.
Syntax
CREATE EXTERNAL TABLE <table_name>
( [ <column_name> <column_type> [AS <expr>] ]
[ , <column_name> <column_type> [AS <expr>] ]
[ , ... ] )
LOCATION = '<string>'
formatTypeOptions
[ PARTITION BY ( <column_name> [, <column_name> ... ] ) ]
[ PARTITION_TYPE = USER_SPECIFIED ]
[ PATTERN = '<regex_pattern>' ]
[AUTO_REFRESH = 'xx']
formatTypeOptions:
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
| FORMAT = (TYPE = 'PARQUET')
Parameters
| Parameter | Description |
|---|---|
| table_name | The name of the external table to be created. |
| column_name | The name of a column in the external table. By default, the data columns in the file are automatically mapped to the columns defined in the external table. |
| column_type | The type of a column in the external table. You cannot define constraints, such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, or FOREIGN KEY, for the column. |
| AS |
Manually specifies the mappings between columns. When the order of columns in the file does not match that in the external table, you can use the pseudo-column metadata$filecol{N} to specify the mapping between the Nth column in the file and a column in the external table. For example, c2 INT AS (metadata$filecol4) indicates that the c2 column in the external table corresponds to the fourth column in the file. Note that if you manually specify column mappings, automatic mappings will become invalid, and you will need to manually define mappings for all columns. |
| LOCATION | The path where the external table is stored. Generally, the data file of an external table is stored in a separate directory, which can contain subdirectories. When you create an external table, the system automatically collects all files in the specified directory.
NoticeWhen you use object storage parameters in the path, make sure that the values of these parameters only contain uppercase and lowercase English letters, digits, and the following special characters: |
| FORMAT = ( TYPE = 'CSV'... ) | Specifies the CSV format for external files. LINE_DELIMITER: specifies the line delimiter for a CSV file. The default value is LINE_DELIMITER='\n'.FIELD_DELIMITER: specifies the field delimiter for a CSV file. The default value is FIELD_DELIMITER='\t'.ESCAPE: specifies the escape character for a CSV file. It can only be 1 byte in length. The default value is ESCAPE ='\'.FIELD_OPTIONALLY_ENCLOSED_BY: specifies the characters that enclose field values in a CSV file. The default value is an empty string.ENCODING: specifies the character set encoding format of the file. For more information about character sets supported in MySQL mode, see Character set. If no character set is specified, the default value UTF8MB4 takes effect.NULL_IF: specifies the strings to be treated as NULL values. The default value is an empty string.SKIP_HEADER: specifies to skip the file header, and specifies the number of lines to skip.SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value is FALSE, which specifies not to skip blank lines.TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields in the file. The default value is FALSE, which specifies not to remove leading and trailing spaces from fields in the file.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings as NULL values. The default value is FALSE, which specifies not to treat empty strings as NULL values. |
| FORMAT = ( TYPE = 'PARQUET'... ) | Specifies the PARQUET format for external files. |
| PATTERN | Specifies a regular pattern string to filter files in the directory specified by LOCATION. For each file in the directory specified by LOCATION, if the file path matches the pattern string, the external table accesses the file. Otherwise, the external table skips the file. If this parameter is not specified, the external table, by default, accesses all files in the directory specified by LOCATION. The external table stores the list of files that match the LOCATION parameter-specified path and are specified by PATTERN in the system tables of the database. During a scan, the external table accesses external files based on this list. |
| PARTITION_TYPE = USER_SPECIFIED | You must specify the PARTITION_TYPE = USER_SPECIFIED parameter if you want to manually add and drop partitions instead of letting the external table automatically manage partitions. |
| AUTO_REFRESH = 'xx' | Specifies whether to automatically refresh the external table. Valid values: OFF, INTERVAL, and IMMEDIATE. OFF: the default value, which specifies not to automatically refresh the external table.INTERVAL: allows you to manage the automatic refresh rules for the external table by using dbms_external_table.IMMEDIATE: specifies to refresh the related external table META each time the external table is accessed by an SQL statement. |
Usage notes
- If an external file is deleted, the file no longer exists in the file list when you access the file list. In this case, the external table ignores the missing file.
- If an external file is modified, the external table accesses the latest data of the external file. If the modification of the external file is concurrent with the query of the external table, the query result may be unexpected. Therefore, avoid modifying the external file while querying the external table.
- If a new file is added to an external directory, the external table does not access the new file. If you want to add the new file to the file list of the external table, you need to update the file list of the external table.
Examples
Set the path of
secure_file_privto/home/admin/, and place the CSV fileextdata.csvto be imported in the/home/admin/testdirectory on the local OBServer node that is connected.Here is an example of setting the global secure path.
obclient> SET GLOBAL secure_file_priv = "" Query OK, 0 rows affected obclinet> \q ByeNote
After you set the
secure_file_privvariable, you must execute\qto quit the current mode for the setting to take effect.The CSV file contains the following data:
1,'Dave','Smith','dsmith@outlook.com','friend',32 2,'Xena','Johnson','xjonson@outlook.com','contact',45 3,'Fred','Jackon','fjackson@outlook.com','co-worker',19 4,'Alma','Tyler','atyler@outlook.com','friend',53Log in to the database as a user of the tenant and create an external table named
contacts.obclient> CREATE EXTERNAL TABLE contacts ( id INT, firstname VARCHAR(100), lastname VARCHAR(100), email VARCHAR(255), category CHAR(30), age NUMBER ) LOCATION = '/home/admin/test' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='\'' ) PATTERN = 'extdata.csv';Query data from the
contactsexternal table.obclient> SELECT * FROM contacts; +----+-----------+----------+----------------------+-----------+------+ | id | firstname | lastname | email | category | age | +----+-----------+----------+----------------------+-----------+------+ | 1 | Dave | Smith | dsmith@outlook.com | friend | 32 | | 2 | Xena | Johnson | xjonson@outlook.com | contact | 45 | | 3 | Fred | Jackon | fjackson@outlook.com | co-worker | 19 | | 4 | Alma | Tyler | atyler@outlook.com | friend | 53 | +----+-----------+----------+----------------------+-----------+------+ 4 rows in set