OceanBase logo

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Resources

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS

OceanBase Cloud

OceanBase Database

Tools

Connectors and Middleware

QUICK START

OceanBase Cloud

OceanBase Database

BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Company

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

International - English
中国站 - 简体中文
日本 - 日本語
Sign In
Start on Cloud

OceanBase

A unified distributed database ready for your transactional, analytical, and AI workloads.

Product Overview
DEPLOY YOUR WAY

OceanBase Cloud

The best way to deploy and scale OceanBase

OceanBase Enterprise

Run and manage OceanBase on your infra

TRY OPEN SOURCE

OceanBase Community Edition

The free, open-source distributed database

OceanBase seekdb

Open source AI native search database

Customer Stories

Real-world success stories from enterprises across diverse industries.

View All
BY USE CASES

Mission-Critical Transactions

Global & Multicloud Application

Elastic Scaling for Peak Traffic

Real-time Analytics

Active Geo-redundancy

Database Consolidation

Comprehensive knowledge hub for OceanBase.

Blog

Live Demos

Training & Certification

Documentation

Official technical guides, tutorials, API references, and manuals for all OceanBase products.

View All
PRODUCTS
OceanBase CloudOceanBase Database
ToolsConnectors and Middleware
QUICK START
OceanBase CloudOceanBase Database
BEST PRACTICES

Practical guides for utilizing OceanBase more effectively and conveniently

Learn more about OceanBase – our company, partnerships, and trust and security initiatives.

About OceanBase

Partner

Trust Center

Contact Us

Start on Cloud
编组
All Products
    • Databases
    • iconOceanBase Database
    • iconOceanBase Cloud
    • iconOceanBase Tugraph
    • iconInteractive Tutorials
    • iconOceanBase Best Practices
    • Tools
    • iconOceanBase Cloud Platform
    • iconOceanBase Migration Service
    • iconOceanBase Developer Center
    • iconOceanBase Migration Assessment
    • iconOceanBase Admin Tool
    • iconOceanBase Loader and Dumper
    • iconOceanBase Deployer
    • iconKubernetes operator for OceanBase
    • iconOceanBase Diagnostic Tool
    • iconOceanBase Binlog Service
    • Connectors and Middleware
    • iconOceanBase Database Proxy
    • iconEmbedded SQL in C for OceanBase
    • iconOceanBase Call Interface
    • iconOceanBase Connector/C
    • iconOceanBase Connector/J
    • iconOceanBase Connector/ODBC
    • iconOceanBase Connector/NET
icon

OceanBase Database

SQL - V4.6.0

    Download PDF

    OceanBase logo

    The Unified Distributed Database for the AI Era.

    Follow Us
    Products
    OceanBase CloudOceanBase EnterpriseOceanBase Community EditionOceanBase seekdb
    Resources
    DocsBlogLive DemosTraining & CertificationTicket
    Company
    About OceanBaseTrust CenterLegalPartnerContact Us
    Follow Us

    © OceanBase 2026. All rights reserved

    Cloud Service AgreementPrivacy PolicySecurity
    Contact Us
    Document Feedback
    1. Documentation Center
    2. OceanBase Database
    3. SQL
    4. V4.6.0
    iconOceanBase Database
    SQL - V 4.6.0
    Databases
    • OceanBase Database
    • OceanBase Cloud
    • OceanBase Tugraph
    • Interactive Tutorials
    • OceanBase Best Practices
    Tools
    • OceanBase Cloud Platform
    • OceanBase Migration Service
    • OceanBase Developer Center
    • OceanBase Migration Assessment
    • OceanBase Admin Tool
    • OceanBase Loader and Dumper
    • OceanBase Deployer
    • Kubernetes operator for OceanBase
    • OceanBase Diagnostic Tool
    • OceanBase Binlog Service
    Connectors and Middleware
    • OceanBase Database Proxy
    • Embedded SQL in C for OceanBase
    • OceanBase Call Interface
    • OceanBase Connector/C
    • OceanBase Connector/J
    • OceanBase Connector/ODBC
    • OceanBase Connector/NET
    SQL
    KV
    • V 4.6.0
    • V 4.4.2
    • V 4.3.5
    • V 4.3.3
    • V 4.3.1
    • V 4.3.0
    • V 4.2.5
    • V 4.2.2
    • V 4.2.1
    • V 4.2.0
    • V 4.1.0
    • V 4.0.0
    • V 3.1.4 and earlier

    Create an external table

    Last Updated:2026-05-07 11:26:24  Updated
    Share
    What is on this page
    Overview
    Prerequisites
    Considerations
    Use the CLI to create an external table
    Define the name of the external table
    Define columns
    Define LOCATION
    Define the FORMAT clause
    (Optional) Define the PATTERN clause
    (Optional) Define partitions for the external table
    Example
    Example 2
    References

    folded

    Share

    This topic describes how to create an external table by using SQL statements. It also provides the prerequisites, overview, and considerations for creating an external table, and gives some examples.

    Overview

    An external table is a logical table object that stores data in an external storage service instead of in the database.

    For more information about external tables, see Overview.

    Prerequisites

    Before you create an external table, make sure that the following conditions are met:

    • You have deployed an OceanBase cluster and created a MySQL tenant. For more information, see Overview.

    • You have connected to a MySQL tenant of OceanBase Database. For more information, see Overview.

    • You have created a database. For more information, see Create a database.

    • You have the CREATE privilege. For information about how to view the privileges of the current user, see View user privileges. If you do not have the privilege, contact the administrator to grant the privilege to you. For information about how to directly grant privileges, see Directly grant privileges.

    Considerations

    • External tables support only query operations and do not support DML operations.

    • If the external file accessed by an external table is deleted, the system does not return an error but instead returns an empty row.

    • If the external storage system is unavailable, queries on external tables will return an error.

    • Since external table data is stored in an external data source, queries may involve network and file system factors, which can affect query performance. Therefore, when creating an external table, you need to select an appropriate data source and optimization strategy to improve query efficiency.

    Use the CLI to create an external table

    Use the CREATE EXTERNAL TABLE statement to create an external table.

    Define the name of the external table

    When you create an external table, you must specify its name. To avoid confusion and ambiguity, we recommend that you use specific naming rules or prefixes to distinguish external tables from regular tables. For example, you can add _csv as a suffix to the name of an external table.

    Here is an example:

    When you create an external table that stores information about students, you can name it students_csv.

    CREATE EXTERNAL TABLE students_csv external_options
    

    Notice

    Because the other attributes of the external table are not specified, the preceding SQL statement cannot be executed.

    Define columns

    You cannot define constraints for columns of an external table. Constraints such as DEFAULT, NOT NULL, UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY cannot be defined for columns of an external table.

    The column types supported by external tables are the same as those supported by regular tables. For more information about the data types supported by OceanBase Database in MySQL mode, see Data types.

    Define LOCATION

    The LOCATION option specifies the path where the external table files are stored. Usually, the data files of an external table are stored in a separate directory, which can contain subdirectories. When you create an external table, it automatically collects all files in the directory.

    OceanBase Database supports the following two path formats:

    • Local Location format: LOCATION = '[file://] local_file_path'

      Notice

      For scenarios using the local Location format, you must set the system variable secure_file_priv to configure the accessible path. For more information, see secure_file_priv.

    • Remote Location format:

      Notice

      When you use an object storage path, the parameters of the object storage path are separated by the & symbol. Make sure that the parameter values you enter contain only uppercase and lowercase letters, digits, /-_$+=, and wildcards. If you enter other characters, the configuration may fail.

      • LOCATION = '{oss|S3}://$ACCESS_ID:$ACCESS_KEY@$HOST:s3_region/remote_file_path' where $ACCESS_ID, $ACCESS_KEY, and $HOST are the access information required for accessing Alibaba Cloud OSS, AWS S3, or object storage compatible with the S3 protocol, and s3_region is the region information selected when using S3. These sensitive access information are stored in the system tables of the database in an encrypted manner.

      • When the files are stored on HDFS, the format is as follows:

        • Format for accessing the cluster based on the address of a single-node NameNode (NN): LOCATION = hdfs://localhost:port/PATH where localhost is the address of HDFS, port is the port number of HDFS, and PATH is the file path in HDFS.

          • Format with Kerberos authentication: LOCATION = 'hdfs://localhost:port/user?principal=xxx&keytab=xxx&krb5conf=xxx&configs=xxx' where:

            • principal: the user for login authentication.
            • keytab: the path of the key file for user authentication.
            • krb5conf: the path of the description file for the Kerberos environment.
            • configs: extra HDFS configurations. By default, this parameter is empty. However, in a Kerberos environment, this parameter usually has a value and needs to be configured. For example: dfs.data.transfer.protection=authentication,privacy specifies the data transmission protection level as authentication and privacy.
        • Format for accessing the cluster based on Hadoop High Availability (HA): LOCATION = hdfs://nameserviceID/PATH where nameserviceID is the logical nameservice ID of Hadoop HA, and PATH is the file path.

          Note

          Make sure that the client OBServer has the nameservice definition and failover strategy for the HA cluster.

          • Format with Kerberos authentication: LOCATION = 'hdfs://nameserviceID/PATH?principal=xxx&keytab=xxx&krb5conf=xxx&configs=dfs.data.transfer.protection=${string}#dfs.nameservices=${nameservice id}#dfs.ha.namenodes.${nameservice id}=${namenode1}, ${namenode2}#dfs.namenode.rpc-address.${nameservice id}.${namenode1}=${namenode 1 address}#dfs.namenode.rpc-address.${nameservice id}.${namenode2}=${namenode 2 address}#dfs.ha.automatic-failover.enabled.${nameservice id}=true#dfs.client.failover.proxy.provider.${nameservice id}=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider' where:

            • principal: the user for login authentication. Set this parameter to the pricipal of a non-primary NN.

            • keytab and krb5conf: same as in the single-node NN setting.

            • configs: extra HDFS configurations. If you need to set multiple configurations, only HA configurations and security configurations are related:

              • dfs.data.transfer.protection=${string}: aligns with the dfs.data.transfer.protection configuration of the cluster.
              • dfs.nameservices=${nameservice id}: specifies the namesevice (alias) of the current HA cluster.
              • dfs.ha.namenodes.${nameservice id}=${namenode1}, ${namenode2}: specifies the list of namenode backup IDs of the HA cluster.
              • dfs.namenode.rpc-address.${nameservice id}.${namenode1}=${namenode 1 address}: specifies the specific namenode of namenode1 for client routing.
              • dfs.namenode.rpc-address.${nameservice id}.${namenode2}=${namenode 2 address}: specifies the specific namenode of namenode2 for client routing.
              • dfs.ha.automatic-failover.enabled.${nameservice id}=true: enables the HA cluster to automatically obtain an available namenode to respond to requests.
              • dfs.client.failover.proxy.provider.${nameservice id}=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider: specifies the logical tool class for switching between primary and standby nodes in the HA cluster. You can also customize and upload the required logic for the HA cluster.

          Notice

          HA-related parameters are bound to namespace. For example, in the following sample three, mycluster is used. Make sure to set the relevant parameters accordingly.

    Define the FORMAT clause

    • FORMAT = ( TYPE = 'CSV'... ) specifies that the external file is of the CSV type. The following parameters are available:

      • TYPE: specifies the type of the external file.

      • LINE_DELIMITER: specifies the line delimiter of the CSV file. The default value is LINE_DELIMITER='\n'.

      • FIELD_DELIMITER: specifies the field delimiter of the CSV file. The default value is FIELD_DELIMITER='\t'.

      • ESCAPE: specifies the escape character of the CSV file. It must be one byte. The default value is ESCAPE ='\'.

      • FIELD_OPTIONALLY_ENCLOSED_BY: specifies the character used to enclose field values in the CSV file. The default value is empty.

        Notice

        If the external file contains NULL values (non-string NULL, not "NULL"), you must explicitly configure the FIELD_OPTIONALLY_ENCLOSED_BY parameter, and the value of this parameter cannot be empty.

      • ENCODING: specifies the character set encoding format of the file. For information about the character sets supported in MySQL mode, see Character sets. If this parameter is not specified, the default value is UTF8MB4.

      • NULL_IF: specifies the string to be treated as NULL. The default value is empty.

      • SKIP_HEADER: specifies the number of lines to skip from the beginning of the file.

      • SKIP_BLANK_LINES: specifies whether to skip blank lines. The default value is FALSE, indicating that blank lines are not skipped.

      • TRIM_SPACE: specifies whether to remove leading and trailing spaces from fields in the file. The default value is FALSE, indicating that leading and trailing spaces are not removed.

      • EMPTY_FIELD_AS_NULL: specifies whether to treat empty strings as NULL. The default value is FALSE, indicating that empty strings are not treated as NULL.

      • COMPRESSION: optional. Specifies the compression format for the file. Valid values are:

        • GZIP/DEFLATE: GZIP-compressed file.
        • ZSTD: ZSTD-compressed file.
        • NONE: indicates that the file is not compressed (default value).
        • SNAPPY_BLOCK: HADOOP-SNAPPY-compressed file.
      • PARALLEL_PARSE_ON_SINGLE_FILE: specifies whether to enable parallel parsing of a single CSV file. If not specified, the default value is TRUE.

      • PARALLEL_PARSE_FILE_SIZE_THRESHOLD: specifies the file size threshold for parallel parsing, in bytes. The default value is 256 MB.

      • MAX_ROW_LENGTH: specifies the maximum length of a single row of data, in bytes. The default value is 2 MB. This parameter is used during the boundary determination phase of parallel parsing of a single CSV file and does not affect the accuracy of data reading.

    • FORMAT = ( TYPE = 'PARQUET'... ) specifies that the external file is of the PARQUET type.

    • FORMAT = ( TYPE = 'ORC'... ) specifies that the external file is of the ORC type.

    Note

    Starting from OceanBase Database V4.6.0, the external table of the Parquet/ORC file format supports reading complex data types such as Array, Map, and JSON. The data types supported by the external table must be consistent with those supported by OceanBase Database. For more information, see Overview of data types.

    • OceanBase Database supports reading Bloom filters from external tables of the Parquet/ORC file format. Bloom filters filter data based on query predicates to reduce unnecessary I/O operations and accelerate queries.

    (Optional) Define the PATTERN clause

    The PATTERN option specifies a regular expression pattern to filter files in the LOCATION directory. For each file path in the LOCATION directory, if the file path matches the pattern, the external table accesses the file; otherwise, it skips the file. If this parameter is not specified, all files in the LOCATION directory are accessible by default. The external table saves the list of files in the LOCATION directory that match the PATTERN in the database system table. When scanning the external table, it accesses the external files based on this list.

    (Optional) Define partitions for the external table

    Automatically define partitions for the external table

    The external table calculates and adds partitions based on the expressions defined for the partitioning key. When querying, you can specify the value or range of the partitioning key. In this case, partition pruning is performed, and the external table only reads files in the specified partition.

    Manually define partitions for the external table

    If you need to manually add and delete partitions instead of letting the external table manage them automatically, you must specify the PARTITION_TYPE = USER_SPECIFIED field.

    Example

    Notice

    The IP address in the example is desensitized. You need to replace it with the actual IP address of your server.

    This example shows how to create an external table in the MySQL mode of OceanBase Database based on an external file stored in the local server or in the MySQL mode of OceanBase Database. The steps are as follows:

    1. Prepare an external file.

      Run the following command to create a file named test_tbl1.csv in the /home/admin directory on the server where the OBServer node is located.

      [admin@xxx /home/admin]# vi test_tbl1.csv
      

      The content of the file is as follows:

      1,'Emma','2021-09-01'
      2,'William','2021-09-02'
      3,'Olivia','2021-09-03'
      
    2. Set the import file path.

      Notice

      For security reasons, you can modify only the secure_file_priv system variable by using a local socket connection. For more information, see secure_file_priv.

      1. Run the following command to log in to the server where the OBServer node is located.

        ssh admin@10.10.10.1
        
      2. Run the following command to connect to the mysql001 tenant by using a local Unix socket connection.

        obclient -S /home/admin/oceanbase/run/sql.sock -uroot@mysql001 -p******
        
      3. Run the following SQL statement to set the import path to /home/admin.

        SET GLOBAL secure_file_priv = "/home/admin";
        
    3. Reconnect to the mysql001 tenant.

      Here is an example:

      obclient -h10.10.10.1 -P2881 -uroot@mysql001 -p****** -A -Dtest
      
    4. Run the following SQL statement to create an external table named test_tbl1_csv.

      CREATE EXTERNAL TABLE test_tbl1_csv ( 
          id INT, 
          name VARCHAR(50), 
          c_date    DATE
          )
          LOCATION = '/home/admin'
          FORMAT = (
            TYPE = 'CSV'
            FIELD_DELIMITER = ','
            FIELD_OPTIONALLY_ENCLOSED_BY ='\''
            )
          PATTERN = 'test_tbl1.csv';
      
    5. Run the following SQL statement to query the data in the test_tbl1_csv external table.

      SELECT * FROM test_tbl1_csv;
      

      The return result is as follows:

      +------+---------+------------+
      | id   | name    | c_date     |
      +------+---------+------------+
      |    1 | Emma    | 2021-09-01 |
      |    2 | William | 2021-09-02 |
      |    3 | Olivia  | 2021-09-03 |
      +------+---------+------------+
      3 rows in set
      

    Example 2

    Example of creating an HDFS external table

    With Kerberos:

    CREATE EXTERNAL TABLE ext_data ( 
      id INT, 
      name VARCHAR(50), 
      c_date    DATE
    )
    LOCATION = 'hdfs://localhost:8020/user?principal=principal_str&keytab=/path/to/keytab&krb5conf=/path/to/krb5conf_file&configs=xxx=xxx#xxx=xxx'
    FORMAT = (
    TYPE = 'CSV'
    FIELD_DELIMITER = ','
    FIELD_OPTIONALLY_ENCLOSED_BY ='"'
    )
    PATTERN = 'data.csv';
    

    Where:

    • principal: specifies the user for login authentication.
    • keytab: specifies the path of the key file for user authentication.
    • krb5conf: specifies the path of the description file for the Kerberos environment.
    • configs: specifies additional HDFS configuration items. By default, it is empty. However, in a Kerberos environment, this configuration item usually has a value and needs to be configured. For example, dfs.data.transfer.protection=authentication,privacy specifies the data transmission protection level as authentication and privacy.

    Without Kerberos:

    CREATE EXTERNAL TABLE ext_data ( 
      id INT, 
      name VARCHAR(50), 
      c_date    DATE
    )
    LOCATION = 'hdfs://localhost:8020/user'
    FORMAT = (
      TYPE = 'CSV'
      FIELD_DELIMITER = ','
      FIELD_OPTIONALLY_ENCLOSED_BY ='\''
      )
    PATTERN = 'test_tbl1.csv';
    

    References

    • You can use the same method as deleting a regular table to delete an external table. For more information about how to delete a table, see Delete a table.
    • For more information about how to view and update external files, see Manage external files.

    Previous topic

    Create an index
    Last

    Next topic

    Insert data
    Next
    What is on this page
    Overview
    Prerequisites
    Considerations
    Use the CLI to create an external table
    Define the name of the external table
    Define columns
    Define LOCATION
    Define the FORMAT clause
    (Optional) Define the PATTERN clause
    (Optional) Define partitions for the external table
    Example
    Example 2
    References