Usually, table data is stored in the database's own storage space. An external table is a special type of table where the data is stored in an external storage system (such as object storage, HDFS, or ODPS), and the database only stores metadata and access paths. Users can use standard SQL queries to access external tables, enabling unified access to heterogeneous data sources.
When creating an external table, you need to specify:
- Data location (such as an OSS path, HDFS, or ODPS)
- Data format (such as CSV, Parquet, or ORC)
Once created, an external table can participate in operations like JOIN, aggregation, and sorting, just like a regular table. However, there are some key differences:
| Feature | External table | Regular table |
|---|---|---|
| Data storage location | External storage system (OSS/HDFS/ODPS, etc.) | OceanBase distributed storage |
| Read/write capability | Read-only | Full read/write support |
| Constraints and indexes | Constraints and indexes are not supported | Full support |
| Query performance | Slower due to network and external system factors | High-performance local access |
Overview of external table types
OceanBase Database supports various external table types.
File external tables
File external tables are used to store structured and semi-structured files in object storage (such as OSS or S3-compatible object storage), distributed file systems (such as HDFS), or local file systems.
You can directly read external files by using external tables. You can also use the select into outfile statement to export data to external files. In addition, you can use the insert into ... from files() statement to import data from external files to OceanBase Database.
Supported formats:
- Columnar storage: Parquet (recommended), ORC (high compression and high performance) (supported in V4.3.5 and later)
- Row-based storage: CSV
ODPS external tables
Starting from OceanBase Database V4.3.5, you can use external tables to access and import data from MaxCompute. By adapting the ODPS API, you can access MaxCompute tables through external tables. Creating an ODPS external table allows transparent read and write operations on Alibaba Cloud MaxCompute (formerly known as ODPS) tables without the need for data migration. This enables cross-system joint analysis, real-time reporting, and data reflow without data migration.
For more information, see ODPS external tables.
JDBC external tables
The OceanBase Database JDBC external table plugin, implemented in Java, transparently maps remote MySQL tables to local external tables, supporting efficient queries and cross-source federated analysis.
- Supported versions: OceanBase Database V4.4.1 and later (only in MySQL-compatible mode).
- Core value: Directly query remote MySQL databases using standard SQL without data migration.
To use the MySQL external table plugin, you must complete the following configurations:
- OceanBase Database cluster is deployed, and a user tenant in MySQL-compatible mode is created.
- JDK 11 or later is installed on all OBServer nodes, and
JAVA_HOMEis correctly configured. - The MySQL external table plugin JAR package is downloaded and deployed, and its path is configured.
Relationship between catalog external tables and external tables
OceanBase Database provides two modes for accessing external data:
| Mode | Use case | Metadata management | Typical usage |
|---|---|---|---|
| Catalog mode (ODPS Catalog, HMS Catalog) |
Predefined data sources (ODPS/HMS) | Automatic synchronization | SET CATALOG my_hms; and SELECT * FROM db1.sales; |
| File/ODPS/JDBC external tables | Temporary files or custom data sources | Manual table creation | CREATE EXTERNAL TABLE ... LOCATION ...; |