This topic describes how to create an external table by using SQL statements. It also describes the prerequisites, overview, and considerations for creating an external table, and provides examples.
Overview
An external table is a logical table object. Its data is stored in an external storage system instead of the database.
For more information about external tables, see Overview.
Prerequisites
Before you create an external table, make sure that the following conditions are met:
You have deployed an OceanBase cluster and created a MySQL tenant. For more information about how to deploy an OceanBase cluster, see Deployment overview.
You have connected to a MySQL tenant of OceanBase Database. For more information about how to connect to the database, see Overview of connection methods.
You have created a database. For more information, see Create a database.
You have the
CREATEprivilege. To view the privileges of the current user, perform the relevant operation described in View user privileges. If you do not have the required privilege, contact the administrator to grant you the necessary privileges. For more information, see Modify user privileges.
Considerations
An external table can only be queried, and DML operations are not supported.
When you query an external table, if the external file accessed by the table is deleted, the system does not return an error, but instead returns empty rows.
When the external storage system fails, an error is returned when you query the external table, because the external storage system manages the file accessed by the external table.
The data of an external table is stored in an external data source. Therefore, factors such as cross-network and file system factors affect the query performance. When you create an external table, select an appropriate data source and optimization strategy to improve the query efficiency.
Create an external table by using the command line
You can execute the CREATE EXTERNAL TABLE statement to create an external table.
Define External Name
When creating an external table, you need to name the external table first. To avoid confusion and ambiguity, it is recommended to use a specific naming convention or prefix when naming external tables to distinguish them from regular tables. For example, the suffix of the external table name can be _csv.
Here is an example:
Create an external table named students_csv to store student information.
CREATE EXTERNAL TABLE students_csv external_options
Notice
Since no other attributes are specified for the external table in the following statement, it cannot be executed.
Define columns
You cannot define constraints such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY for columns of an external table.
The column types supported for an external table are the same as those supported for a regular table. For more information about the data types supported in the MySQL mode of OceanBase Database, see Overview of data types.
Define LOCATION
The LOCATION option specifies the path where the files of the external table are stored. Generally, the data files of an external table are stored in a dedicated directory, which can contain subdirectories. When you create an external table, the system automatically collects all files in the directory you specified.
OceanBase Database supports the following two path formats:
Local path:
LOCATION = '[file://] local_file_path'Notice
In scenarios that use the local path format, you must set the system variable
secure_file_privto specify an accessible path. For more information, see secure_file_priv.Remote path:
LOCATION = '{oss|cos}://$ACCESS_ID:$ACCESS_KEY@$HOST/remote_file_path'$ACCESS_ID,$ACCESS_KEY, and$HOSTare required for accessing Alibaba Cloud OSS or Tencent Cloud COS. These sensitive access information is encrypted and stored in system tables of the database.
Notice
When you use object storage service paths, separate the parameters with &. Make sure that the values of the parameters you entered contain only uppercase and lowercase letters, numbers, and the following special characters: /-_$+= and wildcard characters. If you enter characters other than the preceding ones, the setting may fail.
Define FORMAT
The FORMAT option specifies the format of external files. The following parameters are supported:
TYPE: the type of external files. Set the value to CSV.LINE_DELIMITER: the line delimiter for CSV files. The default value isLINE_DELIMITER='\n'.FIELD_DELIMITER: the field delimiter for CSV files. The default value isFIELD_DELIMITER='\t'.ESCAPE: the escape character for CSV files, which can be only 1 byte in length. The default value isESCAPE ='\'.FIELD_OPTIONALLY_ENCLOSED_BY: the characters that enclose the field values in CSV files. The default value is an empty string.ENCODING: the character set encoding format of files. For more information about the character sets supported in MySQL mode, see Character set and collation. If this parameter is not specified, the default value UTF8MB4 is used.NULL_IF: the strings to be treated asNULLvalues. The default value is an empty string.SKIP_HEADER: specifies to skip the file header, and specifies the number of lines to skip.SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value isFALSE, which specifies not to skip blank lines.TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields in the file. The default value isFALSE, which specifies not to remove leading and trailing spaces from fields in the file.EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings asNULLvalues. The default value isFALSE, which specifies not to treat empty strings asNULLvalues.
(Optional) Define PATTERN
The PATTERN option specifies a regular pattern string to filter files in the LOCATION directory. For each file under the LOCATION directory, if the file path matches the pattern string, the external table accesses the file. Otherwise, the external table skips the file. If this parameter is not specified, the external table accesses all files under the LOCATION directory by default. The external table stores the list of files that match the LOCATION specified path and are filtered by the PATTERN string in the database system table. During a scan, the external table accesses external files based on this list. The file list can be automatically updated or manually updated.
Examples
Notice
IP addresses in commands for examples are desensitized. Replace them with real IP addresses during verification.
The following example describes how to create an external table in MySQL mode of OceanBase Database when the external file is located locally and remotely. The steps are as follows:
Create an external file.
Execute the following command to create a file named
test_tbl1.csvin the/home/admindirectory on the server where you want to log in to.[admin@xxx /home/admin]# vi test_tbl1.csvThe content of the file is as follows:
1,'Emma','2021-09-01' 2,'William','2021-09-02' 3,'Olivia','2021-09-03'Set the path of the imported file.
Notice
For security reasons, when you set the system variable
secure_file_priv, you can connect to the database only by using a local socket to execute the SQL statement that changes the global variable. For more information, see secure_file_priv.Execute the following command to log in to the server where the OBServer node resides.
ssh admin@10.10.10.1Execute the following command to connect to the
mysql001tenant by using a local Unix socket.obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******Execute the following SQL command to set the import path to
/home/admin.SET GLOBAL secure_file_priv = "/home/admin";
Reconnect to the
mysql001tenant.Here is an example:
obclient -h10.10.10.1 -P2881 -uroot@mysql001 -p****** -A -DtestExecute the following SQL command to create an external table named
test_tbl1_csv.CREATE EXTERNAL TABLE test_tbl1_csv ( id INT, name VARCHAR(50), c_date DATE ) LOCATION = '/home/admin' FORMAT = ( TYPE = 'CSV' FIELD_DELIMITER = ',' FIELD_OPTIONALLY_ENCLOSED_BY ='\'' ) PATTERN = 'test_tbl1.csv';Execute the following SQL command to view the data in the
test_tbl1_csvtable.SELECT * FROM test_tbl1_csv;The return result is as follows:
+------+---------+------------+ | id | name | c_date | +------+---------+------------+ | 1 | Emma | 2021-09-01 | | 2 | William | 2021-09-02 | | 3 | Olivia | 2021-09-03 | +------+---------+------------+ 3 rows in set
References
- You can use the same method to drop an external table as you would for a regular table. For more information, see Drop a table.
- For more information about how to view and update the information about an external file, see Manage external files.