You can execute the CREATE EXTERNAL TABLE statement to create an external table. When you create an external table, you must specify the path and format of the data file to read data from an external file.
Privilege requirements
To create an external table, the current user must have the CREATE TABLE privilege. To view the privileges of the current user, see View user privileges. If you do not have the CREATE TABLE privilege, contact the administrator to add it for you. For more information about how to add user privileges, see Modify user privileges.
Create an external table
The SQL statement for creating an external table is as follows:
CREATE EXTERNAL TABLE table_name
( col_name col_type [AS (metadata$filecol{N})]
[ , col_name col_type [AS (metadata$filecol{N})] ]
[ , ... ] )
LOCATION = '<string>'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]
The parameters are described as follows:
col_name col_type [AS (metadata$filecol{N})]: specifies the columns. You can useAS (metadata$filecol{N})to manually specify column mapping.The column types supported by external tables are the same as those supported by regular tables. For more information about the data types supported in Oracle mode of OceanBase Database, see Overview.
By default, data columns in an external file are automatically mapped to columns of an external table in order. That is, the first column of an external file is mapped to the first column of an external table.
For example, in the following example, column
C1of the external tableext_t1is automatically mapped to the first column of the external file; and columnC2is automatically mapped to the second column of the external file.CREATE EXTERNAL TABLE ext_t1 ( C1 int, C2 int ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );If the columns in an external file are in a different order from those in an external table, you can use a pseudocolumn in the format of
metadata$filecol{N}to specify that the Nth column of an external table corresponds to the Nth column of an external file. Note that the columns in the file are numbered starting from 1.For example, in the following example,
C1 int AS (metadata$filecol2)indicates that columnC1of the external tableext_t2corresponds to the second column of the file; andC2 int AS (metadata$filecol4)indicates that columnC2of the external tableext_t2corresponds to the fourth column of the external file.CREATE EXTERNAL TABLE ext_t2 ( C1 int AS (metadata$filecol2), C2 int AS (metadata$filecol4) ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );Notice
If you need to manually specify column mapping, the automatic column mapping function will fail and you must manually specify the mapping for all columns.
LOCATION = '<string>': specifies the path where the external file is stored. Typically, the data files of an external table are stored in a dedicated directory, which can contain subdirectories. When you create an external table, the system automatically collects all files in the directory.Two formats are supported:
Local location format:
LOCATION = '[file://] local_file_path'local_file_path: can be a relative or absolute path. If a relative path is used, the current directory must be the installation directory of OceanBase Database. In specifying a relative path, note that the current directory must be the installation directory of OceanBase Database.Notice
In the local location format, the
secure_file_privsystem variable must be set to the directory that containslocal_file_path. In other words,local_file_pathmust be a subdirectory ofsecure_file_priv.For the local location format, if you specify the path to a file that OceanBase Database does not have permission to access by using the
secure_file_privsystem variable, the specified file will be encrypted and stored in a system table of OceanBase Database. The sensitive access information required for accessing the file, such as$ACCESS_ID,$ACCESS_KEY, and$HOST, is also stored in the system table.Tenant-level system variables, such as
secure_file_priv, control the paths that OceanBase Database can access when importing or exporting files. For more information aboutsecure_file_priv, see secure_file_priv.
Remote location format:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'$ACCESS_ID,$ACCESS_KEY, and$HOSTare the access information required for accessing Alibaba Cloud OSS or Tencent Cloud COS. These sensitive access information is stored in the system table of OceanBase Database in an encrypted form.Notice
When you use an object storage path, separate the parameters in the object storage path with the
&sign. Make sure that the values of the parameters you entered contain only uppercase and lowercase letters, digits,/-_$+=, and wildcard characters. If you enter characters other than the preceding ones, the setting may fail.
FORMAT: specifies the format of external files.TYPE: specifies the type of external files. Only CSV files are supported.LINE_DELIMITER: specifies the line delimiter of a file. If this parameter is not specified, the default valueLINE_DELIMITER='\n'is used.FIELD_DELIMITER: specifies the column delimiter of a file. If this parameter is not specified, the default valueFIELD_DELIMITER='\t'is used.ESCAPE: specifies the escape character of a file. For example,ESCAPE ='*'indicates that the asterisk (*) is the escape character, which replaces the default escape character (). If this parameter is not specified, the default valueESCAPE ='\'is used.FIELD_OPTIONALLY_ENCLOSED_BY: specifies the characters that enclose field values in a file. For example,ESCAPE = '"'indicates that values are enclosed in double quotation marks. If this parameter is not specified, the default value is an empty string.ENCODING: specifies the character set encoding format of a file. For more information about the character sets supported in Oracle mode of OceanBase Database, see Character sets. If this parameter is not specified, the default value UTF8MB4 is used.NULL_IF: specifies the strings that are treated asNULL. If this parameter is not specified, the default value is an empty string.SKIP_HEADER: specifies the number of lines to skip in the header. If this parameter is not specified, the default is to skip no lines.SKIP_BLANK_LINES: specifies whether to skip blank lines. If this parameter is not specified, the default isFALSE.TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields. If this parameter is not specified, the default isFALSE.EMPTY_FIELD_AS_NULL: specifies whether an empty string is treated asNULL. If this parameter is not specified, the default isFALSE.
PATTERN: specifies a regular pattern string to filter files in the directory specified inLOCATION. For each file in the directory specified inLOCATION, if the file matches the pattern string, the external table can access the file. Otherwise, the external table will skip the file. If this parameter is not specified, the external table can access all files in the directory specified inLOCATIONby default.
Assume that a data.csv file exists in the /home/admin/oceanbase/ directory on the local server, and the file contains the following data.
1,"lin",98
2,"hei",90
3,"ali",95
On the OBServer node, the tenant administrator connects to the Oracle tenant of the cluster through the local Unix socket.
Here is an example of how to connect:
obclient -S /home/admin/oceanbase/run/sql.sock -uroot@sys -p********For more information about how to connect to OceanBase Database through a local Unix socket, see secure_file_priv.
Configure the path
/home/admin/oceanbase/that OceanBase Database can access.SET GLOBAL secure_file_priv = "/home/admin/oceanbase/";After the command is executed, you need to restart the session for the modification to take effect.
Reconnect to the database and create an external table named
ext_t3.CREATE EXTERNAL TABLE ext_t3(ID NUMBER(32), NAME VARCHAR2(30),SCORE NUMBER(32)) LOCATION = '/home/admin/oceanbase/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='"' ) PATTERN = 'data.csv';
After the external table is created, you can use the SHOW CREATE TABLE statement to view the table definition, just like a regular table.
SHOW CREATE TABLE ext_t3;
The query result is as follows:
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXT_T3 | CREATE EXTERNAL TABLE "EXT_T3" (
"ID" NUMBER(32) GENERATED ALWAYS AS (METADATA$FILECOL1),
"NAME" VARCHAR2(30) GENERATED ALWAYS AS (METADATA$FILECOL2),
"SCORE" NUMBER(32) GENERATED ALWAYS AS (METADATA$FILECOL3)
)
LOCATION='file:///home/admin/oceanbase/'
PATTERN='data.csv'
FORMAT (
TYPE = 'CSV',
FIELD_DELIMITER = ',',
FIELD_OPTIONALLY_ENCLOSED_BY = '"',
ENCODING = 'utf8mb4'
)COMPRESS FOR ARCHIVE REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
You can also access the external table, just like a regular table. When you query an external table, the system reads the external file through the driver of the external table and parses the file according to its format. Then, the system converts the data into internal data types of OceanBase Database and returns the data rows. Here is an example of how to query an external table named lineitem that has been created.
SELECT * FROM ext_t3;
The query result is as follows:
+----+------+-------+
| ID | NAME | SCORE |
+----+------+-------+
| 1 | lin | 98 |
| 2 | hei | 90 |
| 3 | ali | 95 |
+----+------+-------+
3 rows in set
In addition, you can also combine an external table and a regular table for query. Assume that the current database contains a regular table named info, and the data in the table is as follows:
+------+--------+------+
| NAME | SEX | AGE |
+------+--------+------+
| lin | male | 8 |
| hei | male | 9 |
| li | female | 8 |
+------+--------+------+
3 rows in set
Here is an example of how to combine the external table ext_t3 and the regular table info for query.
SELECT info.* FROM info, ext_t3 WHERE info.name = ext_t3.name AND ext_t3.score > 90;
The query result is as follows:
+------+--------+------+
| NAME | SEX | AGE |
+------+--------+------+
| lin | male | 8 |
| li | female | 8 |
+------+--------+------+
2 rows in set
For more information about queries, see Read data.
Considerations
An external table can only be used for query operations and cannot be used for DML operations.
If the external file accessed by the external table is deleted, the system does not return an error but returns an empty result set.
If the external storage system fails to manage the file accessed by the external table, an error is returned when you query the external table.
Next steps
When you create an external table, the system saves the file list that matches the specified PATTERN and is located at the path specified in LOCATION to a system table of OceanBase Database. The system uses the file list to locate external files when the system scans the external table. If you add new files to the external directory, you must update the external table files and add the new files to the file list of the external table. For more information, see Manage external files.
You can use the DROP TABLE statement to drop an external table, just like you drop a regular table. For more information, see Drop a table.