This topic describes how to create an external table by using SQL statements. It also covers the prerequisites, overview, and considerations for creating an external table, and provides examples.
Overview
An external table is a logical table object. Its data is stored in an external storage system rather than in the database.
For more information about external tables, see Overview.
Prerequisites
Before you create an external table, make sure that you have completed the following steps:
You have deployed an OceanBase cluster and created a MySQL tenant. For more information about how to deploy an OceanBase cluster, see Deployment overview.
You have connected to the MySQL tenant of OceanBase Database. For more information about how to connect to the database, see Overview of connection methods.
You have created a database. For more information about how to create a database, see Create a database.
You have the
CREATEprivilege. To view the privileges of the current user, see View user privileges. If you do not have this privilege, contact the administrator to request it. For more information about how to grant privileges to a user, see Modify user privileges.
Considerations
An external table can only be used for query operations, and does not support DML operations.
When you query an external table, if the external file accessed by the external table has been deleted, the system does not return an error, but returns an empty result set.
If the external storage system that manages the file accessed by the external table becomes unavailable, an error is returned when you query the external table.
The data of an external table is stored in an external data source. Therefore, factors such as network and file system affect the query performance. When you create an external table, you must specify an appropriate data source and optimization strategy to improve query efficiency.
Create an external table
Use the CREATE EXTERNAL TABLE statement to create an external table.
Define an external table name
When you create an external table, you must name it. To avoid confusion and ambiguity, we recommend that you use specific naming rules or prefixes to distinguish external tables from regular tables when naming external tables. For example, you can add a suffix such as _csv to the name of an external table.
Here is an example:
When you create an external table that contains student information, you can name it students_csv.
CREATE EXTERNAL TABLE students_csv external_options
Notice
The preceding SQL statement cannot be executed because no other attributes of the external table are specified.
Define columns
You cannot define constraints, such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY, for columns of an external table.
The column types supported by external tables are the same as those supported by regular tables. For more information about the data types supported in MySQL mode of OceanBase Database, see Data types.
Define LOCATION
The LOCATION option specifies the path for storing the files of the external table. Generally, the data files of an external table are stored in a dedicated directory, which can contain subdirectories. When you create an external table, the system automatically collects all files in the specified directory.
OceanBase Database supports the following two path formats:
Local path:
LOCATION = '[file://] local_file_path'Notice
In scenarios that use the local path format, you must set the system variable
secure_file_privto specify the path that can be accessed. For more information, see secure_file_priv.Remote path:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'$ACCESS_ID,$ACCESS_KEY, and$HOSTare the access information required to access Alibaba Cloud OSS or Tencent Cloud COS. These sensitive access information is stored in encrypted form in the system tables of the database.
Notice
When you use an object storage path, separate the parameters of the path with the & character. Make sure that the values of the parameters you entered contain only uppercase and lowercase letters, digits, and the characters /-_$+=. If you enter characters other than the preceding ones, the setting may fail.
Define FORMAT
The FORMAT option specifies the format of external files. The following parameters are supported:
TYPE: the type of external files. Currently, only the CSV file type is supported.LINE_DELIMITER: the line delimiter for CSV files. The default value is'\n'.FIELD_DELIMITER: the field delimiter for CSV files. The default value is'\t'.ESCAPE: the escape character for CSV files, which can be only 1 byte in length. The default value is'\'.FIELD_OPTIONALLY_ENCLOSED_BY: the characters that enclose field values in CSV files. The default value is an empty string.ENCODING: the character set encoding format of files. For more information about supported character sets in MySQL mode, see Character sets. If this parameter is not specified, the default value UTF8MB4 is used.NULL_IF: the strings that are to be treated asNULL. The default value is an empty string.SKIP_HEADER: specifies to skip the file header and indicates the number of lines to skip.SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value isFALSE, which specifies not to skip blank lines.TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields in the file. The default value isFALSE, which specifies not to remove leading and trailing spaces from fields in the file.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings asNULL. The default value isFALSE, which specifies not to treat empty strings asNULL.
(Optional) Define PATTERN
The PATTERN option specifies a regular pattern string for filtering files in the LOCATION directory. Each file path in the LOCATION directory is checked against the pattern string. If the path matches the pattern string, the external table accesses the file. Otherwise, the external table skips the file. If this parameter is not specified, the external table accesses all files in the LOCATION directory. The external table stores the list of files that match the PATTERN in a system table in the database. During a scan, the external table accesses external files based on this list. The file list can be automatically updated or manually updated.
Examples
Notice
The commands in the following examples are desensitized. You must replace the IP address with your real IP address during verification.
The following examples show how to create an external table when the external file is located locally and in MySQL mode of OceanBase Database. The steps are as follows:
Create an external file.
Execute the following command to create a file named
test_tbl1.csvin the/home/admindirectory of the server where the OBServer node resides.[admin@xxx /home/admin]# vi test_tbl1.csvThe content of the file is as follows:
1,'Emma','2021-09-01' 2,'William','2021-09-02' 3,'Olivia','2021-09-03'Set the path of the imported file.
Notice
For security reasons, when you set the system variable
secure_file_priv, you can connect to the database only by using a local socket to execute the SQL statement that modifies the global variable. For more information, see secure_file_priv.Execute the following command to log in to the server where the OBServer node resides.
ssh admin@10.10.10.1Execute the following command to connect to the
mysql001tenant by using a local Unix socket.obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******Execute the following SQL statement to set the import path to
/home/admin.SET GLOBAL secure_file_priv = "/home/admin";
Reconnect to the
mysql001tenant.Here is an example:
obclient -h10.10.10.1 -P2881 -uroot@mysql001 -p****** -A -DtestExecute the following SQL statement to create an external table named
test_tbl1_csv.CREATE EXTERNAL TABLE test_tbl1_csv ( id INT, name VARCHAR(50), c_date DATE ) LOCATION = '/home/admin' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='\'' ) PATTERN = 'test_tbl1.csv';Execute the following SQL statement to view the data in the external table named
test_tbl1_csv.SELECT * FROM test_tbl1_csv;The return result is as follows:
+------+---------+------------+ | id | name | c_date | +------+---------+------------+ | 1 | Emma | 2021-09-01 | | 2 | William | 2021-09-02 | | 3 | Olivia | 2021-09-03 | +------+---------+------------+ 3 rows in set
References
- You can delete an external table in the same way as a regular table. For more information about how to delete a table, see Delete a table.
- For more information about how to view and update the access path of an external table, see Manage external files.