Typically, table data is stored in the database's own storage space. However, an external table is a special type of table where the data is actually stored in an external storage system (such as object storage, HDFS, or ODPS). The database only stores the metadata and access path. Users can use standard SQL queries to access external tables, enabling unified access to heterogeneous data sources.
When creating an external table, you need to specify:
- Data location (such as an OSS path, HDFS, or ODPS)
- Data format (such as CSV, Parquet, or ORC)
Once created, an external table can participate in operations like JOIN, aggregation, and sorting, just like a regular table. However, there are some key differences:
Feature |
External Table |
Regular Table |
|---|---|---|
| Data storage location | External storage system (OSS/HDFS/ODPS, etc.) | OceanBase distributed storage |
| Read/write capability | Read-only | Full read/write support |
| Constraints and indexes | Does not support adding constraints or creating indexes | Fully supported |
| Query performance | Typically slower due to network and external system factors | High-performance local access |
Overview
OceanBase Database supports multiple types of external tables.
File external tables
File external tables are suitable for structured or semi-structured files stored in object storage services (such as OSS or S3-compatible object storage services), distributed file systems (such as HDFS), or local file systems.
You can directly read external files by using external tables. You can also use the select into outfile statement to export data to external files. You can also use the insert into ... from files() statement to import data from external files to OceanBase Database.
Supported formats:
- Columnar storage: Parquet (recommended), ORC (high compression and high performance) (supported in V4.3.5 and later)
- Row-based storage: CSV
ODPS external tables
OceanBase Database V4.3.5 supports reading data from and importing data to MaxCompute through external tables. By adapting the ODPS API, you can access MaxCompute tables through external tables. By creating ODPS external tables, you can transparently read and write data in Alibaba Cloud MaxCompute (formerly known as ODPS) tables without data migration. This enables cross-system joint analysis, real-time reporting, and data reflow.
For more information, see ODPS external tables.
JDBC external tables
The OceanBase Database JDBC external table plugin is implemented in Java. It transparently maps remote MySQL tables to local external tables, supporting efficient queries and cross-source federated analysis.
- Supported version: OceanBase Database V4.4.1 and later (only in MySQL mode).
- Core value: You can directly query remote MySQL databases using standard SQL without data migration.
To use the MySQL external table plugin, you must complete the following configurations:
- OceanBase Database cluster is deployed, and a MySQL mode user tenant is created.
- All OBServer nodes have installed JDK 11 or later and the
JAVA_HOMEenvironment variable is correctly configured. - The MySQL external table plugin JAR package is downloaded and deployed, and its path is configured.
For detailed installation and configuration steps, see MySQL external table plugin.
Relationship between catalog external tables and external tables
OceanBase Database provides two external data access modes:
Mode |
Scenario |
Metadata management |
Typical usage |
|---|---|---|---|
| Catalog mode (ODPS Catalog, HMS Catalog) |
Predefined data sources (ODPS/HMS) | Automatic synchronization | SET CATALOG my_hms; and SELECT * FROM db1.sales; |
| File/ODPS/JDBC external tables | Temporary files or custom data sources | Manual table creation | CREATE EXTERNAL TABLE ... LOCATION ...; |
