Description
This statement is used to create a new external table in the database.
External tables are a key feature in database management systems. Typically, tables in a database store data in the database's storage space, while external tables store data in external storage services.
When you create an external table, you need to specify the file path and file format of the data. After that, you can use the external table to read data from the external storage service. External tables are read-only. You can use them in query statements, but you cannot perform DML operations on them. You cannot define constraints or create indexes on external tables.
Syntax
CREATE EXTERNAL TABLE <table_name>
( [ <column_name> <column_type> [AS <expr>] ]
[ , <column_name> <column_type> [AS <expr>] ]
[ , ... ] )
LOCATION = '<string>'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]
Parameter description
| Parameter | Description |
|---|---|
| table_name | The name of the external table to be created. |
| column_name | The name of a column in the external table. By default, the data columns in the file are automatically matched with the definition columns of the external table in sequence. |
| column_type | The type of a column in the external table. You cannot define constraints such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, or FOREIGN KEY in the column type. |
| AS |
Manually specifies the mappings between columns. When the order of columns in a file is different from that in the external table, you can specify the correspondence between a column in the external table and the Nth column in the file by using the pseudo-column metadata$filecol{N}. For example, c2 INT AS (metadata$filecol4) indicates that the c2 column in the external table corresponds to the fourth column in the file. Note that if you manually specify the column mappings, the automatic mappings become invalid, and you must manually specify mappings for all columns. |
| LOCATION | The path where the external table is stored. Generally, the data file of an external table is stored in a separate directory, which can contain subdirectories. When you create an external table, the system automatically collects all files in the specified directory.
NoticeWhen you use an object storage path, separate the parameters in the object storage path with |
| FORMAT | The format of the external file.
|
| PATTERN | A regular pattern string that is used to filter files in the LOCATION directory. For each file in the directory specified by LOCATION, if the file path matches the pattern string, the external table accesses the file. Otherwise, the external table skips the file. If you do not specify this parameter, the external table, by default, accesses all files in the directory specified by LOCATION. The external table stores the list of files that match the LOCATION-specified path and the PATTERN-specified pattern in the system tables of the database. When the external table scans files, it accesses external files based on this list. The file list can be automatically updated or manually updated. |
Considerations
- If an external file is deleted, the file no longer exists in the file list when you access the file list. In this case, the external table ignores the non-existent file.
- If an external file is modified, the external table accesses the latest data of the external file. If the modification of the external file is concurrent with a query on the external table, the result of the query may be unexpected. Therefore, avoid modifying external files while they are being queried.
- If a new file is added to an external directory, the external table accesses only the files in the file list. To add the new file to the file list of the external table, you must update the file list of the external table.
Examples
Set the
secure_file_privpath to/home/admin/and place the CSV fileextdata.csv, which contains the data to be imported, in the/home/admin/testdirectory on the local OBServer node that is connected.Here is an example of setting the global secure path.
obclient> SET GLOBAL secure_file_priv = "" Query OK, 0 rows affected obclinet> \q ByeNote
After you set the
secure_file_privvariable to a global variable, you must execute the\qstatement to make the setting take effect.The CSV file contains the following data:
1,'Dave','Smith','d******h@outlook.com','friend',32 2,'Xena','Johnson','x******n@outlook.com','contact',45 3,'Fred','Jackon','f******n@outlook.com','co-worker',19 4,'Alma','Tyler','a******r@outlook.com','friend',53Log in to the database as a user in the tenant and create an external table named
contacts.obclient> CREATE EXTERNAL TABLE contacts ( id INT, firstname VARCHAR(100), lastname VARCHAR(100), email VARCHAR(255), category CHAR(30), age NUMBER ) LOCATION = '/home/admin/test/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='''' )PATTERN ='extdata.csv';Query data from the
contactstable.obclient> SELECT * FROM contacts; +------+-----------+----------+----------------------+--------------------------------+------+ | ID | FIRSTNAME | LASTNAME | EMAIL | CATEGORY | AGE | +------+-----------+----------+----------------------+--------------------------------+------+ | 1 | Dave | Smith | d******h@outlook.com | friend | 32 | | 2 | Xena | Johnson | x******n@outlook.com | contact | 45 | | 3 | Fred | Jackon | f******n@outlook.com | co-worker | 19 | | 4 | Alma | Tyler | a******r@outlook.com | friend | 53 | +------+-----------+----------+----------------------+--------------------------------+------+ 4 rows in set