You can create an external table by using the CREATE EXTERNAL TABLE statement. When you create an external table, you must specify the path of data files and the format of data files to read data from external files.
Privilege requirements
To create an external table, the current user must have the CREATE privilege. For more information about how to view the privileges of the current user, see View user privileges.
Create an external table
The following example shows the syntax for creating an external table:
CREATE EXTERNAL TABLE table_name
( col_name col_type [AS (metadata$filecol{N})]
[ , col_name col_type [AS (metadata$filecol{N})] ]
[ , ... ] )
LOCATION = '<string>'
FORMAT = (
TYPE = 'CSV'
LINE_DELIMITER = '<string>' | <expr>
FIELD_DELIMITER = '<string>' | <expr>
ESCAPE = '<character>' | <expr>
FIELD_OPTIONALLY_ENCLOSED_BY = '<character>' | <expr>
ENCODING = 'charset'
NULL_IF = ('<string>' | <expr>, '<string>' | <expr> ...)
SKIP_HEADER = <int>
SKIP_BLANK_LINES = { TRUE | FALSE }
TRIM_SPACE = { TRUE | FALSE }
EMPTY_FIELD_AS_NULL = { TRUE | FALSE }
)
[ PATTERN = '<regex_pattern>' ]
The following table describes the parameters in the statement.
col_name col_type [AS (metadata$filecol{N})]: specifies the columns of the external table. You can useAS (metadata$filecol{N})to manually specify column mapping.The column types of an external table are the same as those of a regular table. For more information about the data types supported in MySQL mode of OceanBase Database, see Data types.
By default, the columns in an external file are automatically mapped to the columns of the external table in sequence. That is, the first column of the external file is mapped to the first column of the external table.
For example, the
C1column of the external tableext_t1is automatically mapped to the first column of the external file; theC2column is automatically mapped to the second column of the external file.CREATE EXTERNAL TABLE ext_t1 ( C1 int, C2 int ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );If the columns in the external file are in a different order from those in the external table, you can use a pseudo-column in the format of
metadata$filecol{N}to specify that the Nth column of the external table is mapped to the Nth column of the external file. Note that the columns in the file are numbered starting from 1.For example, the
C1 int AS (metadata$filecol2)statement specifies that theC1column of the external tableext_t2is mapped to the second column of the external file; theC2 int AS (metadata$filecol4)statement specifies that theC2column of the external tableext_t2is mapped to the fourth column of the external file.CREATE EXTERNAL TABLE ext_t2 ( C1 int AS (metadata$filecol2), C2 int AS (metadata$filecol4) ) LOCATION = 'oss://$ACCESS_ID:$ACCESS_KEY@$HOST/tpch_1g_data/lineitem/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = '|' );Notice
If you want to manually specify column mapping, automatic column mapping will fail. You need to manually specify column mapping for all columns.
LOCATION = '<string>': specifies the path where the external file is stored. Typically, the data files of an external table are stored in a dedicated directory, which can contain subdirectories. When you create an external table, the database automatically collects all files in the directory.The following two formats are supported:
Local location format:
LOCATION = '[file://] local_file_path'local_file_path: can be a relative path or an absolute path. If you enter a relative path, the current directory must be the installation directory of OceanBase Database.Notice
The
local_file_pathparameter must be a directory rather than a file. If you want to specify a single file, you need to set theLOCATIONparameter to the directory that contains the file and set thePATTERNparameter to specify the file.For scenarios that use the local location format, if you configure the file path that OceanBase Database can access by using the
secure_file_privsystem variable, thesecure_file_privvariable must be the parent directory oflocal_file_path, namely,local_file_pathmust be a subdirectory ofsecure_file_priv.The
secure_file_privvariable controls the path that OceanBase Database can access when it imports data to or exports data from a file. For more information about thesecure_file_privvariable, see secure_file_priv.
Remote location format:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'$ACCESS_ID,$ACCESS_KEY, and$HOSTare the access information required to access Alibaba Cloud OSS or Tencent Cloud COS. These sensitive access information is stored in the system tables of the database in an encrypted form.Notice
When you use an object storage path, separate the parameters of the path with
&. Make sure that the values of the parameters you entered contain only uppercase and lowercase English letters, digits, and the characters/-_$+=. If you entered characters other than the aforesaid ones, the setting may fail.
FORMAT: specifies the format of the external file.TYPE: specifies the type of the external file. Only CSV files are supported.LINE_DELIMITER: specifies the line delimiter of the file. If this parameter is not specified, the default valueLINE_DELIMITER='\n'takes effect.FIELD_DELIMITER: specifies the field delimiter of the file. If this parameter is not specified, the default valueFIELD_DELIMITER='\t'takes effect.ESCAPE: specifies the escape character of the file. For example,ESCAPE ='*'indicates that the asterisk (*) is the escape character, which replaces the default escape character (). If this parameter is not specified, the default valueESCAPE ='\'takes effect.FIELD_OPTIONALLY_ENCLOSED_BY: specifies the characters that enclose the field values in the file. For example,ESCAPE = '"'indicates that the values are enclosed in double quotation marks. If this parameter is not specified, the default value takes effect.ENCODING: specifies the character set encoding format of the file. For more information about the character sets supported in MySQL mode of OceanBase Database, see Character sets. If this parameter is not specified, the default value UTF8MB4 takes effect.NULL_IF: specifies the strings that are treated asNULLvalues. If this parameter is not specified, the default value takes effect.SKIP_HEADER: specifies the number of lines to skip in the file header. If this parameter is not specified, the file header is not skipped by default.SKIP_BLANK_LINES: specifies whether to skip blank lines. If this parameter is not specified, the default valueFALSEtakes effect.TRIM_SPACE: specifies whether to remove leading and trailing spaces in the fields of the file. If this parameter is not specified, the default valueFALSEtakes effect.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings asNULLvalues. If this parameter is not specified, the default valueFALSEtakes effect.
PATTERN: specifies the regular pattern string to filter files in the directory specified byLOCATION. For each file in the directory specified byLOCATION, if the file name matches the pattern string, the external table can access the file. Otherwise, the external table skips the file. If this parameter is not specified, the external table can access all files in the directory specified byLOCATIONby default.
Assume that a file named data.csv is stored in the /home/admin/oceanbase/ directory on the local server. The file contains the following data.
1,"lin",98
2,"hei",90
3,"ali",95
On the OBServer node, the tenant administrator connects to the MySQL tenant of the cluster by using a local Unix socket.
Here is an example of the connection:
obclient -S /home/admin/oceanbase/run/sql.sock -uroot@sys -p********To connect to OceanBase Database by using a local Unix socket, follow the specific procedures and guidelines described in secure_file_priv.
Configure the path
/home/admin/oceanbase/that the database can access.SET GLOBAL secure_file_priv = "/home/admin/oceanbase/";After the command is executed, you need to restart the session for the modification to take effect.
Reconnect to the database and create an external table named
ext_t3.CREATE EXTERNAL TABLE ext_t3(id int, name char(10),score int) LOCATION = '/home/admin/oceanbase/' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='"' ) PATTERN = 'data.csv';
After the external table is created, you can use the SHOW CREATE TABLE statement to view the table definition, just like that for a regular table.
SHOW CREATE TABLE ext_t3;
The query result is as follows:
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ext_t3 | CREATE EXTERNAL TABLE `ext_t3` (
`id` int(11) GENERATED ALWAYS AS (metadata$filecol1),
`name` char(10) GENERATED ALWAYS AS (metadata$filecol2),
`score` int(11) GENERATED ALWAYS AS (metadata$filecol3)
)
LOCATION='file:///home/admin/oceanbase/'
PATTERN='data.csv'
FORMAT (
TYPE = 'CSV',
FIELD_DELIMITER = ',',
FIELD_OPTIONALLY_ENCLOSED_BY = '"',
ENCODING = 'utf8mb4'
)DEFAULT CHARSET = utf8mb4 ROW_FORMAT = DYNAMIC COMPRESSION = 'zstd_1.3.8' REPLICA_NUM = 1 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 0 |
+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set
You can also access the external table like a regular table. When you query an external table, the system reads the external file by using the driver of the external table and parses the file data according to the file format. Then the parsed data is converted into internal data types of OceanBase Database, and data rows are returned. Here is an example of querying the external table lineitem that is created just now.
SELECT * FROM ext_t3;
The query result is as follows:
+----+------+-------+
| id | name | score |
+----+------+-------+
| 1 | lin | 98 |
| 2 | hei | 90 |
| 3 | ali | 95 |
+----+------+-------+
3 rows in set
You can also combine an external table with a regular table for queries. Assume that a regular table named info exists in the current database, and the table contains the following data:
+------+--------+------+
| name | sex | age |
+------+--------+------+
| lin | male | 8 |
| hei | male | 9 |
| li | female | 8 |
+------+--------+------+
3 rows in set
Here is an example of combining the external table ext_t3 and the regular table info for queries.
SELECT info.* FROM info, ext_t3 WHERE info.name = ext_t3.name AND ext_t3.score > 90;
The query result is as follows:
+------+--------+------+
| name | sex | age |
+------+--------+------+
| lin | male | 8 |
| li | female | 8 |
+------+--------+------+
2 rows in set
For more information about queries, see Read data.
Considerations
An external table can only be queried, and you cannot perform DML operations on it.
When you query an external table, if the external file accessed by the table has been deleted, the system does not return an error, but instead returns an empty result set.
The external file system manages the files accessed by an external table. If the external storage system is unavailable, an error is returned when you query the external table.
Next steps
When you create an external table, the system saves the list of files that match the PATTERN in the specified LOCATION in the system table of OceanBase Database. When the external table is scanned, the system accesses the external files based on the list. If you add new files to the external directory, you must perform an operation to add the new files to the file list of the external table. For more information, see Manage external files.
After you create an external table, you can drop it. The statement for dropping an external table is the same as that for dropping a normal table. You can use the DROP TABLE statement to drop an external table. For more information, see Drop a table.