You can execute the CREATE EXTERNAL TABLE statement to create an external table. When you create an external table, you must specify the path to and format of the data files whose data you want to read.
Required privileges
You have the CREATE privilege. For more information about how to view your privileges, see View user privileges.
Create an external table
The SQL syntax for creating an external table is as follows:
CREATE EXTERNAL TABLE table_name
( col_name col_type [AS (metadata$filecol{N})]
[ , col_name col_type [AS (metadata$filecol{N})] ]
[ , ... ] )
LOCATION = '<string>'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]
The options in the command are described as follows:
col_name col_type [AS (metadata$filecol{N})]specifies a column, whereAS (metadata$filecol{N})specifies the column mapping.The column types supported by an external table are the same as those supported by a normal table. For more information about the data types supported by the MySQL mode of OceanBase Database, see Data types.
By default, the columns in an external table are sequentially mapped to the columns in the corresponding external file. Specifically, the first column in the external table is mapped to the first column in the external file.
In the following example, the
C1column in the external tableext_t1is automatically mapped to the first column in the external file, and theC2column is automatically mapped to the second column in the external file.CREATE EXTERNAL TABLE ext_t1 ( C1 int, C2 int ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );If the sequence of columns in the external table is inconsistent with that in the external file, you can specify a mapping between a column in the external table and the corresponding column N in the external file by using a pseudocolumn such as
metadata$filecol{N}. Columns in the external file are numbered from 1.In the following example,
C1 int AS (metadata$filecol2)indicates that theC1column in the external tableext_t2is mapped to the second column in the external file.C2 int AS (metadata$filecol4)indicates that theC2column in the external tableext_t2is mapped to the fourth column in the external file.CREATE EXTERNAL TABLE ext_t2 ( C1 int AS (metadata$filecol2), C2 int AS (metadata$filecol4) ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );Notice
If you manually define column mappings, the automatic column mapping feature does not take effect. In this case, you must define mappings for all columns.
LOCATION = '<string>'specifies the directory where external files are stored. Generally, the data files of an external table are stored in an independent directory, which can contain subdirectories. When you create an external table, the external table automatically collects all files in this directory.Two file location formats are supported.
Local location format:
LOCATION = '[file://] local_file_path'The value of
local_file_pathcan be a relative path or an absolute path. If you want to specify a relative path, the current directory must be the installation directory of OceanBase Database.Notice
local_file_pathmust specify a directory rather than a file. To specify a single file, specify the upper-level directory of this file inLOCATIONand then set thePATTERNattribute.If the local location format is used, when you configure the file path accessible to OceanBase Database by using the
secure_file_privsystem variable, make sure that the directory specified bysecure_file_privmust be the upper-level directory of the directory specified bylocal_file_path. That is, the value oflocal_file_pathmust be a subdirectory of the value ofsecure_file_priv.The tenant-level system variable
secure_file_privspecifies the path accessible to OceanBase Database when data is imported to or exported from a file. For more information about thesecure_file_privsystem variable, see secure_file_priv.
Remote location format:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'In the format above,
$ACCESS_ID,$ACCESS_KEY, and$HOSTare required for accessing Alibaba Cloud Object Storage Service (OSS) or Tencent Cloud Object Storage (COS). Such sensitive access information is encrypted and stored in the system table of the database.
FORMATspecifies the format of external files.TYPE: the type of external files. At present, only CSV files are supported.LINE_DELIMITER: the line delimiter for the files. The default setting isLINE_DELIMITER='\n'.FIELD_DELIMITER: the column delimiter for the files. The default setting isFIELD_DELIMITER='\t'.ESCAPE: the escape character for the files. For example,ESCAPE ='*'indicates that the asterisk (*) is used as the escape character. The default setting isESCAPE ='\'.FIELD_OPTIONALLY_ENCLOSED_BY: the characters for enclosing column values. For example,ESCAPE = '"'indicates that a column value is enclosed by double quotation marks ("). By default, this parameter is left empty.ENCODING: the character set encoding used by the file. For more information about all character sets supported by the MySQL mode of OceanBase Database, see Character sets. If this parameter is not specified, the default valueUTF8MB4takes effect.NULL_IF: the strings that are considered asNULLvalues. By default, this parameter is left empty.SKIP_HEADER: specifies whether to skip file headers and the number of lines to skip. By default, file headers are not skipped.SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value isFALSE.TRIM_SPACE: specifies whether to delete leading and trailing spaces from columns in the files. The default value isFALSE.EMPTY_FIELD_AS_NULL: specifies whether to consider empty strings asNULLvalues. The default value isFALSE.
PATTERNspecifies a regular string for filtering files in the directory specified byLOCATION. The external table can access a file in the specified directory that matches the regular string, and will skip files that do not match the regular string. By default, if this parameter is not specified, all files in theLOCATIONdirectory are accessible.
Assume that a file named data.csv is stored in the /home/admin/oceanbase/ directory on the local server. The following example shows the content of the file.
1,"lin",98
2,"hei",90
3,"ali",95
Log on as the administrator to a MySQL tenant in the cluster by using a local Unix socket on the OBServer node.
Here is an example:
obclient -S /home/admin/oceanbase/run/sql.sock -uroot@sys -p********For more information about how to connect to OceanBase Database by using a local Unix socket, see secure_file_priv.
Set the accessible path to
/home/admin/.SET GLOBAL secure_file_priv = "/home/admin/";After the statement is executed, you must restart the session for the modification to take effect.
After you reconnect to OceanBase Database, create an external table named
ext_t3.CREATE EXTERNAL TABLE ext_t3(id int, name char(10),score int) LOCATION = '/home/admin/oceanbase/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='"' ) PATTERN = 'data.csv';
After you create an external table, you can use the SHOW CREATE TABLE statement to query its table definition as a normal table.
SHOW CREATE TABLE ext_t3;
The query result is as follows:
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ext_t3 | CREATE EXTERNAL TABLE `ext_t3` (
`id` int(11) GENERATED ALWAYS AS (metadata$filecol1),
`name` char(10) GENERATED ALWAYS AS (metadata$filecol2),
`score` int(11) GENERATED ALWAYS AS (metadata$filecol3)
)
LOCATION='file:///home/admin/oceanbase/'
PATTERN='data.csv'
FORMAT (
TYPE = 'CSV',
FIELD_DELIMITER = ',',
FIELD_OPTIONALLY_ENCLOSED_BY = '"',
ENCODING = 'utf8mb4'
)DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
You can also access the external table as a normal table. When you query an external table, the system directly reads the external files through the driver layer of the external table, parses the external files based on the specific file format, converts the data into internal data types of OceanBase Database, and returns the data lines. The following example shows a query on the external table lineitem:
SELECT * FROM ext_t3;
The query result is as follows:
+----+------+-------+
| ID | name | score |
+----+------+-------+
| 1 | lin | 98 |
| 2 | hei | 90 |
| 3 | ali | 95 |
+----+------+-------+
3 rows in set
You can also use an external table and a normal table for compound queries. Assume that the current database has a normal table named info. The following code shows the data in this table:
+------+--------+------+
| name | sex | age |
+------+--------+------+
| lin | male | 8 |
| hei | male | 9 |
| li | female | 8 |
+------+--------+------+
3 rows in set
Example: Initiate a compound query on the external table ext_t3 and the normal table info.
SELECT info.* FROM info, ext_t3 WHERE info.name = ext_t3.name AND ext_t3.score > 90;
The query result is as follows:
+------+--------+------+
| name | sex | age |
+------+--------+------+
| lin | male | 8 |
| li | female | 8 |
+------+--------+------+
2 rows in set
For more information about queries, see Read data.
Considerations
External tables support only query operations, but not DML operations.
When you query an external table, if the external file accessed by the table is deleted, the system does not generate an error but returns empty rows.
Files accessed by an external table are managed by an external storage system. When the external storage system is unavailable, an error occurs when you query the external table.
What to do next
When you create an external table, the system saves the list of files in the directory specified by LOCATION matching PATTERN in a system table of OceanBase Database. The system will access external files based on this list during a scan on the external table. If a file is added to the specified directory, you must perform an update operation to add this file to the file list of the external table. For more information, see Manage external files.
After an external table is created, you can use the DROP TABLE statement to drop it as a normal table. For more information, see Drop a table.