Purpose
You can use this statement to create an external table in the database.
External tables are a key feature in a database management system. Generally, the data of a table in a database is stored in the storage space of the database, while an external table has data stored in an external storage service.
When you create an external table, you must define the file path and file format of its data, so that users can read the data from the external storage service. External tables are read-only. You can query an external table, but cannot perform DML operations on it. You cannot define constraints or create indexes on external tables.
Syntax
CREATE EXTERNAL TABLE <table_name>
( [ <column_name> <column_type> [AS <expr>] ]
[ , <column_name> <column_type> [AS <expr>] ]
[ , ... ] )
LOCATION = '<string>'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]
Parameters
| Parameter | Description |
|---|---|
| table_name | The name of the external table to be created. |
| column_name | The name of a column in the external table. By default, the data columns in the files accessed by an external table automatically map the columns in the external table in sequence. |
| column_type | The type of a column in the external table. You cannot define constraints, such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY, for an external table. |
| AS |
Used to manually specify column mappings between files and the external table. When the column order in files is different from that in the external table, you can use the pseudo column metadata$filecol{N} to map a column in the external table to the Nth column in files. For example, c2 INT AS (metadata$filecol4) maps the c2 column in the external table to the fourth column in files. If you manually map one column, all automatic mappings become invalid, which means that you must then manually map all other columns. |
| LOCATION | The path where the files of the external table are stored. Generally, the data files of an external table are stored in a separate directory. The folder can contain subdirectories. When you create an external table, the table automatically collects all files in the specified directory.
|
| FORMAT | The format of external files.
|
| PATTERN | The regular pattern string for filtering files in the LOCATION directory. For each file in the LOCATION directory, if the file path matches the pattern string, the external table accesses the file. Otherwise, the external table skips the file. If this parameter is not specified, all files in the LOCATION directory are accessible by default. The external table stores the list of the files that match PATTERN in the system table of the database. During the scan, the external table accesses external files based on this list. |
Considerations
- When an external file is dropped, the file no longer exists in the file list of the external table. The external table then ignores the file that does not exist.
- When an external file is modified, the external table accesses the latest data of the file. When an external file is modified during a query to the external table, unexpected results may be returned. Therefore, do not modify an external file while querying the external table.
- When a new file is added to the file directory of the external table, you need to update the file list so that the external table can access the new file.
Examples
Prepare data. Set
secure_file_privto/home/admin/, and place the CSV fileextdata.csvin the/home/admin/testdirectory on the currently connected OBServer node.The following example shows how to set a global security path.
obclient> SET GLOBAL secure_file_priv = "" Query OK, 0 rows affected obclient> \q ByeNote
Because
secure_file_privis aGLOBALvariable, you need to run\qto exit for the settings to take effect.The content of the CSV file is as follows:
1,'Dave','Smith','dsmith@outlook.com','friend',32 2,'Xena','Johnson','xjonson@outlook.com','contact',45 3,'Fred','Jackon','fjackson@outlook.com','co-worker',19 4,'Alma','Tyler','atyler@outlook.com','friend',53Log on to a user tenant of the database and create an external table named
contacts.obclient> CREATE EXTERNAL TABLE contacts ( id INT, firstname VARCHAR(100), lastname VARCHAR(100), email VARCHAR(255), category CHAR(30), age NUMBER ) LOCATION = '/home/admin/test' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='\'' ) PATTERN = 'extdata.csv';Query data in the external table
contacts.obclient> SELECT * FROM contacts; +----+-----------+----------+----------------------+-----------+------+ | id | firstname | lastname | email | category | age | +----+-----------+----------+----------------------+-----------+------+ | 1 | Dave | Smith | dsmith@outlook.com | friend | 32 | | 2 | Xena | Johnson | xjonson@outlook.com | contact | 45 | | 3 | Fred | Jackon | fjackson@outlook.com | co-worker | 19 | | 4 | Alma | Tyler | atyler@outlook.com | friend | 53 | +----+-----------+----------+----------------------+-----------+------+ 4 rows in set