OceanBase Database

SQL - V4.4.2

    About external tables

    Last Updated: 2026-04-02 06:23:57

    Usually, table data in a database is stored within the database's storage space, whereas the data of an external table is stored in an external storage service. When creating an external table, you need to specify the path and format of the data files. After the external table is created, you can use it to read data from the external storage service.

    External tables can be queried much like regular tables: they can be joined, aggregated, sorted, and so on. External tables differ from regular tables in the following ways:

    • The data of an external table is stored in external files, while the data of a regular table is stored within the database.

    • External tables are read-only. You can use them in query statements, but you cannot perform DML operations on them.

    • External tables do not support adding constraints or creating indexes.

    In general, accessing external tables is slower than accessing regular tables.

    HDFS external tables

    Read data from HDFS external tables

    The Hadoop Distributed File System (HDFS) is a core component of the Hadoop ecosystem, designed to store and process large-scale datasets. To allow direct access to data in HDFS, OceanBase Database supports reading HDFS data through external tables.

    For more information about creating an HDFS external table (where files are stored in HDFS), see CREATE EXTERNAL TABLE.

    The HDFS SDK is written in Java, whereas OceanBase Database is built in C++, so the two are bridged through the Java Native Interface (JNI) framework. The Java SDK for ODPS likewise requires a Java environment to run. To use the HDFS external table feature, you need to configure a Java environment and control it with the following parameters, after which you can create tables that access HDFS files:

    • ob_enable_java_env
    • ob_java_home
    • ob_java_connector_path
    • ob_java_opts

    For more information about configuring the Java environment, see Deploy the Java SDK environment for OceanBase Database.
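
As a rough sketch of what setting these four parameters might look like, the Python helper below renders one statement per parameter. It assumes the parameters are changed with ALTER SYSTEM SET, the statement OceanBase Database generally uses for cluster-level parameters. The helper name render_java_env_statements and every path and option value are hypothetical placeholders, not recommended settings.

```python
def render_java_env_statements(java_home, connector_path, java_opts):
    """Render one ALTER SYSTEM SET statement per Java-related parameter.

    Assumption: these parameters are set with ALTER SYSTEM SET.
    """
    def quote(value):
        # Double any single quotes so the value stays a valid SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    settings = [
        ("ob_enable_java_env", "true"),
        ("ob_java_home", quote(java_home)),
        ("ob_java_connector_path", quote(connector_path)),
        ("ob_java_opts", quote(java_opts)),
    ]
    return [f"ALTER SYSTEM SET {name} = {value};" for name, value in settings]


# Hypothetical placeholder values, for illustration only.
statements = render_java_env_statements(
    java_home="/path/to/jdk",
    connector_path="/path/to/connector/jars",
    java_opts="-Xmx2048m",
)
for stmt in statements:
    print(stmt)
```

The deployment guide referenced above remains the authority on which values and JVM options apply to a given environment.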

    Write data to HDFS external tables

    As of V4.4.0, OceanBase Database supports writing data to HDFS external tables. For more information about this feature, see SELECT INTO.

    ODPS external tables

    MaxCompute (ODPS) provides two open APIs: the Storage API and the Tunnel API.

    • Storage API: A data service interface that offers efficient, low-latency, and secure data reading.
    • Tunnel API: A data upload and download interface, mainly used for batch operations on table data (such as full table import and export).

    By adapting the ODPS APIs, OceanBase Database can access tables in ODPS through external tables. When you create an external table for ODPS, OceanBase Database provides parameter configuration options for both the Storage API and Tunnel API. For more information, see CREATE EXTERNAL TABLE. The table below outlines the differences between the Storage API and Tunnel API.

    | Dimension | Storage API | Tunnel API | Applicable scenarios |
    |---|---|---|---|
    | Features | Supports fine-grained data access (such as partition filtering and predicate pushdown). | Focuses on efficient full-table data import and export, without support for conditional filtering. | Suitable for HTAP mixed workloads, conditional queries on partitioned tables, and deep integration with computing engines (such as Apache Spark and ODPS). |
    | Sharding strategy | Automatic sharding: dynamically splits tasks by bytes or rows to improve parallel efficiency. | Manual sharding: you need to calculate the partition size or row count yourself, and configuration is relatively complex. Performance is lower than that of the Storage API, with no special requirements for ODPS resource configuration. Compatible with all ODPS configurations. | Storage API: suitable for multi-partition tables and supports two dynamic sharding strategies. By bytes: when data record sizes vary significantly, byte-based sharding ensures balanced data volume per shard, preventing computation skew. By rows: when record sizes are relatively consistent, fixed-row sharding improves parallel processing efficiency. Tunnel API: suitable for scenarios where the Storage API service is not enabled in ODPS or compatibility is required. |
    | Performance optimization | Low resource consumption: predicate pushdown reduces the amount of data to be transmitted, and computation is pushed to the database side, resulting in faster queries. | High resource consumption: full data transmission may occupy a large amount of bandwidth and storage space. | Choose the Storage API when you need to reduce data transfer and improve HTAP efficiency. Choose the Tunnel API for simple ETL tasks or full backups. |
    | Environment requirements | OceanBase Database V4.4.0 | No special requirements. | Use the Storage API if your environment supports VPS. Otherwise, use the Tunnel API for earlier versions or simpler scenarios. |
    | Data filtering capability | Supported: data can be filtered using SQL conditions (such as WHERE), and only the required subset is transmitted. | Not supported: full data transmission is required, and filtering is performed locally. | Choose the Storage API when conditional data filtering is required (such as analyzing specific user behavior). |
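
The difference between the two Storage API sharding strategies can be illustrated with a small Python model. This is a conceptual sketch only, not the actual splitting logic of OceanBase Database or ODPS. The function names split_by_rows and split_by_bytes and the sample record sizes are made up for illustration.

```python
def split_by_rows(record_sizes, rows_per_shard):
    """Group records into shards that each hold a fixed number of rows."""
    return [
        record_sizes[i:i + rows_per_shard]
        for i in range(0, len(record_sizes), rows_per_shard)
    ]


def split_by_bytes(record_sizes, bytes_per_shard):
    """Group records into shards, starting a new shard whenever adding
    the next record would push the current shard over the byte budget."""
    shards, current, current_bytes = [], [], 0
    for size in record_sizes:
        if current and current_bytes + size > bytes_per_shard:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        shards.append(current)
    return shards


def shard_volumes(shards):
    """Return the total number of bytes held by each shard."""
    return [sum(shard) for shard in shards]


# Hypothetical record sizes in bytes: three large records, then nine small ones.
sizes = [900, 900, 900] + [100] * 9

by_rows = shard_volumes(split_by_rows(sizes, rows_per_shard=3))
by_bytes = shard_volumes(split_by_bytes(sizes, bytes_per_shard=900))

print("by rows :", by_rows)
print("by bytes:", by_bytes)
```

With these sample sizes, row-based splitting yields shard volumes of 2700, 300, 300, and 300 bytes, so one worker receives most of the data. Byte-based splitting yields four shards of 900 bytes each. When all records are the same size, the two strategies produce equally balanced shards, which is why fixed-row sharding is adequate for uniform data.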

    Catalog external tables

    OceanBase Database uses the catalog (data directory) feature to enable unified management and efficient querying of external data sources. This feature adds a Catalog-Database-Table three-layer data hierarchy, allowing direct access to table data in external data sources (such as ODPS and HMS) without manually creating mapping tables. For more information, see Catalog overview.
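
The three-layer hierarchy can be pictured as resolving a table reference against a current catalog and a current database. The Python sketch below is only a conceptual model: the resolve_table_name function, the sample names, and the dotted-name convention are assumptions for illustration, not the actual name-resolution rules of OceanBase Database.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    catalog: str
    database: str
    table: str


def resolve_table_name(name, current_catalog, current_database):
    """Expand a 1-, 2-, or 3-part dotted name into (catalog, database, table)."""
    parts = name.split(".")
    if len(parts) > 3 or any(not part for part in parts):
        raise ValueError(f"invalid table name: {name!r}")
    if len(parts) == 3:
        # Fully qualified: catalog.database.table
        return TableRef(*parts)
    if len(parts) == 2:
        # database.table, resolved within the current catalog
        return TableRef(current_catalog, *parts)
    # Bare table name, resolved within the current catalog and database
    return TableRef(current_catalog, current_database, parts[0])


# Hypothetical names, used only for illustration.
print(resolve_table_name("orders", "internal", "sales"))
print(resolve_table_name("hive_db.events", "hms_catalog", "default"))
print(resolve_table_name("odps_catalog.project_a.clicks", "internal", "sales"))
```

The point of the model is that a fully qualified name reaches a table in an external source directly, with no locally created mapping table in between.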

    The following table describes the catalog data sources supported by OceanBase Database:

    | Type | Supported version | Data source type | Table format support | Description |
    |---|---|---|---|---|
    | ODPS catalog | OceanBase Database V4.3.5 BP2 and later | ODPS | MaxCompute tables | Suitable for querying data on Alibaba Cloud MaxCompute. |
    | HMS catalog | OceanBase Database V4.4.1 and later | HMS | Hive and Iceberg tables | Manages metadata through Hive Metastore and supports the open-source data lake ecosystem. |

    References

    • Create an external table
    • Create a partition of an external table
    • Manage external files
    • Data type mapping
