This topic describes how to create an external table with some examples.
Overview
An external table is a logical table object. Its data is stored in an external storage system instead of the database.
For more information about external tables, see Overview.
Prerequisites
Before you create an external table, make sure that:
You have deployed an OceanBase cluster and created an Oracle tenant. For more information about how to deploy an OceanBase cluster, see Deployment overview.
You have connected to the Oracle tenant of OceanBase Database. For more information about how to connect to OceanBase Database, see Overview of connection methods.
You have the
CREATE TABLEprivilege. For more information about how to view your privileges, see View user privileges. If you do not have the required privileges, contact the administrator to obtain the privileges. For more information, see Grant direct privileges.
Considerations
External tables support only query operations, but not DML operations.
When you query an external table, if the external file accessed by the table is deleted, the system does not generate an error but returns empty rows.
Files accessed by an external table are managed by an external storage system. When the external storage system is unavailable, an error occurs when you query the external table.
Because data of an external table is stored in an external data source, queries may need to be performed across networks or file systems, which may affect query performance. Therefore, you must select an appropriate data source and optimization strategy when you create an external table to improve query efficiency.
Create an external table by using a statement
You can use the CREATE EXTERNAL TABLE statement to create an external table.
Name an external table
When you create an external table, you must name it. To distinguish between external tables and regular tables, we recommend that you follow specific naming rules or use a prefix or suffix when you name an external table. For example, you can use the suffix _csv when you name an external table.
Here is an example:
Create an external table of student information and name it as students_csv.
CREATE EXTERNAL TABLE students_csv external_options
Notice
The preceding SQL statement cannot be executed because other attributes are not added for the external table.
Define columns
Constraints such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY cannot be defined for columns of external tables.
The column types supported by external tables are the same as those supported by regular tables. For more information about the data types that are supported in Oracle mode, see Overview.
Define LOCATION
The LOCATION option specifies the path for storing the files of an external table. Generally, the data files of an external table are stored in a separate directory. The folder can contain subdirectories. When you create an external table, the table automatically collects all files in the specified directory.
OceanBase Database supports the following two path formats:
Local location format:
LOCATION = '[file://] local_file_path'Notice
When you use the local location format, you must set the system variable
secure_file_privto specify an accessible path. For more information, see secure_file_priv.Remote location format:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'In this format,
$ACCESS_ID,$ACCESS_KEY, and$HOSTare required for accessing Alibaba Cloud Object Storage Service (OSS) or Tencent Cloud Object Storage (COS). These sensitive access information is encrypted and stored in the system table of the database.
Define FORMAT
The FORMAT option specifies the format of an external file. The option contains the following parameters:
TYPE: the type of the external file. Only the CSV type is supported.LINE_DELIMITER: the line delimiter for the CSV file. The default value is'\n'.FIELD_DELIMITER: the column delimiter for the CSV file. The default value is'\t'.ESCAPE: the escape character for the CSV file, which can be only 1 byte in length. The default value is'\'.FIELD_OPTIONALLY_ENCLOSED_BY: the characters that enclose the field values in the CSV file. The default value is none.ENCODING: the character set encoding used by the files. For more information about all character sets supported in Oracle mode, see Character sets. If this parameter is not specified, the default value isUTF8MB4.NULL_IF: the strings to be treated asNULLvalues. The default value is none.SKIP_HEADER: specifies to skip the file header, and specifies the number of lines to skip.SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value isFALSE, which indicates not to skip blank lines.TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields in the file. The default value isFALSE, which indicates not to remove leading and trailing spaces from fields in the file.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings asNULLvalues. The default value isFALSE, which indicates not to treat empty strings asNULLvalues.
(Optional) Define PATTERN
The PATTERN option specifies a regular pattern string for filtering files in the LOCATION directory. For each file in the LOCATION directory, if the file path matches the pattern string, the external table accesses the file. Otherwise, the external table skips the file. By default, if this parameter is not specified, all files in the LOCATION directory are accessible. The external table stores the list of the files that match PATTERN in the system table of the database. During the scan, the external table accesses external files based on this list.
(Optional) Define partitions of the external table
Automatically define partitions
The external table calculates and adds partitions based on the expression defined by the partitioning key. You can specify the value or range of the partitioning key in the query statement. In this case, partitions are pruned, and the external table reads only the files in the specified partition.
Manually define partitions
If you want to manually add and delete partitions, you can set PARTITION_TYPE to USER_SPECIFIED so that the external table does not automatically manage partitions.
Examples
Notice
IP addresses used in these examples are desensitized. Enter your actual server IP addresses during verification.
The following example shows how to create an external table whose data files are stored in a local directory in the Oracle mode of OceanBase Database.
Create an external file.
Create a file named
test_tbl1.csvin the/home/admin/external_csvdirectory of the server where the OBServer node you want to log on to resides.[admin@xxx /home/admin/external_csv]# vi test_tbl1.csvThe file content is as follows:
1,'Emma' 2,'William' 3,'Olivia'Set the path where the file to be imported is located.
Notice
For security reasons, when you set the system variable
secure_file_priv, you can connect to the database only through a local socket to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.Log on to the server where the OBServer node resides.
ssh admin@10.10.10.1Connect to the tenant
oracle001through a local Unix socket.obclient -S /home/admin/oceanbase/run/sql.sock -usys@oracle001 -p******Set the import path to
/home/admin/external_csv.SET GLOBAL secure_file_priv = "/home/admin/external_csv";
Reconnect to the tenant
oracle001.obclient -h10.10.10.1 -P2881 -usys@oracle001 -p****** -ACreate an external table named
test_tbl1_csv.CREATE EXTERNAL TABLE test_tbl1_csv ( id INT, name VARCHAR(50) ) LOCATION = '/home/admin/external_csv' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='''' ) PATTERN = 'test_tbl1.csv';View data of the external table
test_tbl1_csv.SELECT * FROM test_tbl1_csv;The result is as follows:
+------+---------+ | ID | NAME | +------+---------+ | 1 | Emma | | 2 | William | | 3 | Olivia | +------+---------+ 3 rows in set
References
For more information about how to view and update access to external files for external tables, see Manage external files.