This topic describes how to create an external table by using SQL statements.
Overview
An external table is a logical table object. Its actual data is stored in an external storage system, rather than in the database.
For more information about external tables, see Overview.
Prerequisites
Before you create an external table, make sure that:
You have connected to OceanBase Database. For more information about how to connect to the database, see Overview of connection methods.
Note
The tenant mode to which the logged-in tenant belongs can be queried through the
oceanbase.DBA_OB_TENANTSview in thesystenant.The current user has the
CREATE TABLEprivilege. To view the privileges of the current user, see View user privileges.
Considerations
An external table can only be used for query operations, and does not support DML operations.
When you query an external table, if the external file accessed by the external table has been deleted, the system does not return an error, but returns an empty result set.
If the external storage system that manages the file accessed by the external table becomes unavailable, an error is returned when you query the external table.
External tables store data in external data sources. Therefore, query operations involve network factors and file systems, which may affect query performance. Make sure to select an appropriate data source and optimize strategies when you create external tables to improve query efficiency.
Syntax for creating an external table
The CREATE EXTERNAL TABLE statement typically takes the following form:
CREATE EXTERNAL TABLE table_name (column_options [AS (metadata$filecol{N})])
LOCATION = '<string>'
FORMAT = (format_options)
[PATTERN = '<regex_pattern>'];
;
For more information about the CREATE EXTERNAL TABLE statement, see CREATE EXTERNAL TABLE.
Parameter description:
table_name: The external table name.
column_options: Column definitions of the external table, including column names and data types.
[AS (metadata$filecol{N})]: This is an optional parameter that allows you to manually define column mappings.Note
By default, data columns in an external file are automatically mapped to the columns defined in the external table in the order in which they appear. For example, the first column in the external table corresponds to the first column of data in the external file. If the order of columns in the external file differs from that in the external table, you can use the `metadata$filecol{N}` pseudo-column to specify that the Nth column in the external file corresponds to the column in the external table. Note that the columns in the file are numbered starting from 1.
LOCATION: specifies the path where the external file is stored.FORMAT: specifies the format of the external file.PATTERN: specifies a regular pattern string to filter files in theLOCATIONdirectory.
Define an external table name
When you create an external table, you must name it. To avoid confusion and ambiguity, we recommend that you use specific naming rules or prefixes to distinguish external tables from ordinary tables when naming external tables. For example, you can add a suffix such as _csv to the name of an external table.
Here is an example:
When you create an external table that contains student information, you can name it students_csv.
CREATE EXTERNAL TABLE students_csv external_options
Notice
The preceding SQL statement cannot be executed because no other attributes of the external table are specified.
Define columns
You cannot define constraints, such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY, for columns of an external table.
The column types supported by external tables are the same as those supported by ordinary tables. For more information about the data types supported in Oracle mode of OceanBase Database, see Overview.
Define LOCATION
The LOCATION option specifies the path where the files of the external table are stored. Generally, the data files of an external table are stored in a dedicated directory. The directory can contain subdirectories. When you create an external table, the system automatically collects all files in the directory that you specified.
OceanBase Database supports the following two path formats:
Local path:
LOCATION = '[file://] local_file_path'Notice
In a scenario that uses the local path format, you must set the system variable
secure_file_privto specify an accessible path. For more information, see secure_file_priv.Remote path:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'$ACCESS_ID,$ACCESS_KEY, and$HOSTare the access information required to access Alibaba Cloud OSS or Tencent Cloud COS. These sensitive access information is stored in encrypted form in system tables of the database.
Notice
When you use an object storage path, separate the parameters of the path with the & character. Make sure that the values of the parameters you entered contain only uppercase and lowercase letters, digits, and the characters /-_$+=. If you enter characters other than the aforesaid ones, the setting may fail.
Define FORMAT
The FORMAT option specifies the format of external files. The following parameters are supported:
TYPE: specifies the type of external files. Only the CSV file type is supported.LINE_DELIMITER: specifies the line delimiter for CSV files. The default value isLINE_DELIMITER='\n'.FIELD_DELIMITER: specifies the field delimiter for CSV files. The default value isFIELD_DELIMITER='\t'.ESCAPE: specifies the escape character for CSV files, which can be only 1 byte in length. The default value isESCAPE ='\'.FIELD_OPTIONALLY_ENCLOSED_BY: specifies the characters that enclose field values in CSV files. The default value is an empty string.ENCODING: specifies the character set encoding format of files. For more information about the character sets supported in the Oracle mode, see Character sets. If this parameter is not specified, the default valueUTF8MB4takes effect.NULL_IF: specifies the strings to be treated asNULL. The default value is an empty string.SKIP_HEADER: specifies the number of lines to be skipped.SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value isFALSE, which means that blank lines are not skipped.TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields. The default value isFALSE, which means that leading and trailing spaces are not removed.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings asNULL. The default value isFALSE, which means that empty strings are not treated asNULL.
(Optional) Define PATTERN
The PATTERN option specifies a regular pattern string for filtering files in the LOCATION directory. For each file in the LOCATION directory, the file is accessed if its path matches the pattern string, and is ignored if it does not. If this parameter is not specified, the external table accesses all files in the LOCATION directory by default. The external table stores the list of files that match the PATTERN in the system table of the database. During a scan, the external table accesses external files based on this list. The file list can be updated automatically or manually.
Example
Notice
The commands in the following example mask the involved IP addresses. You must replace the IP addresses with your real ones when you verify the example.
The following example describes how to create an external table when the external file is located locally and in the Oracle mode of OceanBase Database. The procedure is as follows:
Create an external file.
Execute the following command to create a file named
test_tbl1.csvin the/home/admin/external_csvdirectory of the server where the OBServer node to log in to is located.[admin@xxx /home/admin/external_csv]# vi test_tbl1.csvThe content of the file is as follows:
1,'Emma' 2,'William' 3,'Olivia'Set the path of the imported file.
Notice
For security reasons, when you set the system variable
secure_file_priv, you can connect to the database only by using a local socket to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.Execute the following command to log in to the server where the OBServer node to log in to is located.
ssh admin@10.10.10.1Execute the following command to connect to the
oracle001tenant by using a local Unix socket.obclient -S /home/admin/oceanbase/run/sql.sock -usys@oracle001 -p******Execute the following SQL command to set the import path to
/home/admin/external_csv.SET GLOBAL secure_file_priv = "/home/admin/external_csv";
Reconnect to the
oracle001tenant.Here is the sample code:
obclient -h10.10.10.1 -P2881 -usys@oracle001 -p****** -AExecute the following SQL command to create an external table named
test_tbl1_csv.CREATE EXTERNAL TABLE test_tbl1_csv ( id INT, name VARCHAR(50) ) LOCATION = '/home/admin/external_csv' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='''' ) PATTERN = 'test_tbl1.csv';Execute the following SQL command to view the data in the external table named
test_tbl1_csv.SELECT * FROM test_tbl1_csv;The return result is as follows:
+------+---------+ | ID | NAME | +------+---------+ | 1 | Emma | | 2 | William | | 3 | Olivia | +------+---------+ 3 rows in set