You can create an external table by using the CREATE EXTERNAL TABLE statement. When you create an external table, you must specify the path of the data file and the format of the data file to read data from the external file.
Privilege requirements
To create an external table, the current user must have the CREATE TABLE privilege. For more information about how to view user privileges, see View user privileges. If you do not have the CREATE TABLE privilege, contact the administrator to grant it to you. For more information about how to grant privileges to users, see Modify user privileges.
Create an external table
The following example shows the statement for creating an external table:
CREATE EXTERNAL TABLE table_name
( col_name col_type [AS (metadata$filecol{N})]
[ , col_name col_type [AS (metadata$filecol{N})] ]
[ , ... ] )
LOCATION = '<string>'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]
The following table describes the parameters in the statement.
col_name col_type [AS (metadata$filecol{N})]: specifies the columns. You can useAS (metadata$filecol{N})to manually define column mapping.The columns types supported by external tables are the same as those supported by regular tables. For more information about the data types supported in Oracle mode of OceanBase Database and descriptions of the data types, see Overview.
By default, the columns in the external file are automatically mapped to the columns of the external table in order. That is, the first column of the external table is automatically mapped to the first column of the external file, the second column of the external table is automatically mapped to the second column of the external file, and so on.
For example, in the following example, column
C1of the external tableext_t1is automatically mapped to the first column of the external file; columnC2is automatically mapped to the second column of the external file.CREATE EXTERNAL TABLE ext_t1 ( C1 int, C2 int ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );If the order of columns in the external file is different from that in the external table, you can use a pseudo-column in the format of
metadata$filecol{N}to specify that a column of the external table be mapped to the Nth column of the external file. Note that the columns in the file are numbered starting from 1.For example, in the following example,
C1 int AS (metadata$filecol2)specifies that columnC1of the external tableext_t2be mapped to the second column of the file;C2 int AS (metadata$filecol4)specifies that columnC2of the external tableext_t2be mapped to the fourth column of the external file.CREATE EXTERNAL TABLE ext_t2 ( C1 int AS (metadata$filecol2), C2 int AS (metadata$filecol4) ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );Notice
If you want to manually define column mapping, automatic column mapping will fail and all columns must be manually mapped.
LOCATION = '<string>': specifies the path where the external file is stored. Typically, the data files of an external table are stored in a dedicated directory, which can contain subdirectories. When you create an external table, the database automatically collects all files in the directory.The following two formats are supported:
Local location format:
LOCATION = '[file://] local_file_path'local_file_path: can be a relative path or an absolute path. If you enter a relative path, the current directory must be the installation directory of OceanBase Database.Notice
In the
LOCATIONparameter, the value oflocal_file_pathmust be a directory rather than a file. If you want to specify a single file, you can specify the upper directory of the file in theLOCATIONparameter and specify the file itself by setting thePATTERNparameter.For scenarios that use the local location format, if you configure the value of the
secure_file_privvariable to a path, make sure that the path is the upper directory oflocal_file_path, namely,local_file_pathis a subdirectory of the path specified bysecure_file_priv.The tenant-level system variable
secure_file_privspecifies the path that OceanBase Database can access when you import data to or export data from a file. For more information aboutsecure_file_priv, see secure_file_priv.
Remote location format:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'$ACCESS_ID,$ACCESS_KEY, and$HOSTare the access information required to access Alibaba Cloud OSS or Tencent Cloud COS. These sensitive access information will be stored in the system tables of the database in an encrypted form.Notice
When you use an object storage path, separate the parameters in the path with the
&character. Make sure that the value of each parameter contains only uppercase and lowercase English letters, digits, and the following characters:/-_$+=. If the value contains other characters, the setting may fail.
FORMAT: specifies the format of the external file.TYPE: specifies the type of the external file. Only CSV files are supported.LINE_DELIMITER: specifies the line delimiter of the file. If you do not specify a line delimiter, the default value'\n'is used.FIELD_DELIMITER: specifies the field delimiter of the file. If you do not specify a field delimiter, the default value'\t'is used.ESCAPE: specifies the escape character of the file. For example,ESCAPE ='*'specifies to use an asterisk (*) as the escape character, which replaces the default escape character (). If you do not specify an escape character, the default value'\'is used.FIELD_OPTIONALLY_ENCLOSED_BY: specifies the characters that enclose the field values in the file. For example,ESCAPE = '"'specifies to enclose the field values in double quotation marks. If you do not specify characters to enclose the field values, no characters are used by default.ENCODING: specifies the character set encoding used by the file. For more information about the character sets supported in Oracle mode of OceanBase Database, see Character sets. If you do not specify a character set, the default value UTF8MB4 is used.NULL_IF: specifies the strings that are treated asNULLvalues. If you do not specify any strings, no string is treated asNULLby default.SKIP_HEADER: specifies the number of lines to skip in the file header. If you do not specify this parameter, the file header is not skipped by default.SKIP_BLANK_LINES: specifies whether to skip blank lines. If you do not specify this parameter, the default valueFALSEis used.TRIM_SPACE: specifies whether to remove leading and trailing spaces in fields. If you do not specify this parameter, the default valueFALSEis used.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings asNULLvalues. If you do not specify this parameter, the default valueFALSEis used.
PATTERN: specifies the regular pattern string to filter files in the directory specified byLOCATION. For each file in the directory specified byLOCATION, if the file name matches the pattern string, the external table can access the file. Otherwise, the external table skips the file. If you do not specify this parameter, the external table can access all files in the directory specified byLOCATIONby default.
Assume that a data.csv file exists in the /home/admin/oceanbase/ directory on your local machine. The file contains the following data:
1,"lin",98
2,"hei",90
3,"ali",95
On the OBServer node, connect to the Oracle tenant of the database as the tenant administrator by using a local Unix socket.
Here is an example:
obclient -S /home/admin/oceanbase/run/sql.sock -uroot@sys -p********For more information about how to connect to OceanBase Database using a local Unix socket, see secure_file_priv.
Configure the path
/home/admin/oceanbase/as the directory that the database can access.SET GLOBAL secure_file_priv = "/home/admin/oceanbase/";After the command is executed, you must restart the session for the modification to take effect.
Reconnect to the database and create an external table named
ext_t3.CREATE EXTERNAL TABLE ext_t3(ID NUMBER(32), NAME VARCHAR2(30),SCORE NUMBER(32)) LOCATION = '/home/admin/oceanbase/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='"' ) PATTERN = 'data.csv';
After the external table is created, you can use the SHOW CREATE TABLE statement to view the table definition, just like that for a regular table.
SHOW CREATE TABLE ext_t3;
The query result is as follows:
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TABLE | CREATE TABLE |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXT_T3 | CREATE EXTERNAL TABLE "EXT_T3" (
"ID" NUMBER(32) GENERATED ALWAYS AS (METADATA$FILECOL1),
"NAME" VARCHAR2(30) GENERATED ALWAYS AS (METADATA$FILECOL2),
"SCORE" NUMBER(32) GENERATED ALWAYS AS (METADATA$FILECOL3)
)
LOCATION='file:///home/admin/oceanbase/'
PATTERN='data.csv'
FORMAT (
TYPE = 'CSV',
FIELD_DELIMITER = ',',
FIELD_OPTIONALLY_ENCLOSED_BY = '"',
ENCODING = 'utf8mb4'
)COMPRESS FOR ARCHIVE REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
You can access the external table just like a regular table. When you query an external table, the system reads the external file by using the driver of the external table, parses the file according to the file format, converts the parsed data into the data types of OceanBase Database, and then returns the data rows. Here is an example of querying the external table lineitem that is created above:
SELECT * FROM ext_t3;
The query result is as follows:
+----+------+-------+
| ID | NAME | SCORE |
+----+------+-------+
| 1 | lin | 98 |
| 2 | hei | 90 |
| 3 | ali | 95 |
+----+------+-------+
3 rows in set
You can also combine the external table with a regular table for queries. Assume that the current database contains a regular table named info, whose data is as follows:
+------+--------+------+
| NAME | SEX | AGE |
+------+--------+------+
| lin | male | 8 |
| hei | male | 9 |
| li | female | 8 |
+------+--------+------+
3 rows in set
Here is an example of combining the external table ext_t3 and the regular table info for queries.
SELECT info.* FROM info, ext_t3 WHERE info.name = ext_t3.name AND ext_t3.score > 90;
The query result is as follows:
+------+--------+------+
| NAME | SEX | AGE |
+------+--------+------+
| lin | male | 8 |
| li | female | 8 |
+------+--------+------+
2 rows in set
For more information about queries, see Read data.
Considerations
An external table can only be queried, and you cannot perform DML operations on it.
When you query an external table, if the external file accessed by the table has been deleted, the system does not return an error, but instead returns an empty result set.
The external file system manages the files accessed by an external table. If the external storage system is unavailable, an error is returned when you query the external table.
Next steps
When you create an external table, the system saves the file names that match the PATTERN in the specified LOCATION in the system table of OceanBase Database. During a scan, the system accesses the external files based on the saved file names. If you add new files to the external directory, you must perform an operation to add the new files to the file list of the external table. For more information, see Manage external files.
You can also drop an external table. The statement for dropping an external table is the same as that for dropping a regular table. To drop an external table, you need to specify the DROP TABLE statement. For more information, see Drop a table.