Generally, the data of a normal table in a database is stored in the storage space of the database, whereas the data of an external table is stored in an external storage service. When you create an external table, you must specify the path to and format of the relevant data files. After you create an external table, you can read data from files in the external storage service by using the external table.
An external table can be joined, aggregated, and sorted with other tables in the same way as a normal table. External tables differ from normal tables in the following aspects:
- The data of external tables is stored in external files, whereas that of normal tables is stored in databases.
- External tables are read-only and can be used in query statements. However, you cannot perform Data Manipulation Language (DML) operations on external tables.
- You cannot add constraints or create indexes on external tables.
- Generally, the access speed of an external table is lower than that of a normal table.
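As a rough illustration of the points above, the following sketch joins an external table with a normal table, then aggregates and sorts the result. The table names `ext_orders` and `customers` and all column names are hypothetical and are used only for this example.

```sql
-- Assumption: ext_orders is an existing external table and customers is a
-- normal table. Both names and all columns are hypothetical.
SELECT c.customer_name,
       COUNT(*)      AS order_count,
       SUM(o.amount) AS total_amount
FROM ext_orders o
JOIN customers c ON c.customer_id = o.customer_id
GROUP BY c.customer_name
ORDER BY total_amount DESC;

-- Because external tables are read-only, a DML statement such as the
-- following is expected to be rejected:
-- INSERT INTO ext_orders VALUES (1, 100, 9.99);
```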
HDFS external table
Read data from an HDFS external table
Hadoop Distributed File System (HDFS) is a core component of the Hadoop ecosystem and is used for storing and processing large-scale datasets. Starting from V4.3.5 BP1, OceanBase Database supports reading data from HDFS external tables, which allows direct access to data stored in HDFS.
For detailed instructions on creating an HDFS external table, see CREATE EXTERNAL TABLE.
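A minimal sketch of what such a definition might look like is shown here. It assumes CSV files and follows the general shape of the `CREATE EXTERNAL TABLE` statement, with the column list followed by a location and a format. The table name, columns, NameNode address, port, and directory are all hypothetical. The CREATE EXTERNAL TABLE reference remains the authoritative source for the supported options.

```sql
-- Hypothetical table, columns, NameNode address, and path.
-- Replace them with values from your own environment.
CREATE EXTERNAL TABLE ext_orders (
  order_id    INT,
  customer_id INT,
  amount      DECIMAL(10, 2)
)
LOCATION = 'hdfs://namenode.example.com:8020/data/orders/'
FORMAT = (
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
);

-- After the table is created, the files are read at query time.
SELECT COUNT(*) FROM ext_orders;
```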
Because the HDFS SDK is written in Java and OceanBase Database is written in C++, the Java Native Interface (JNI) framework is required as a bridge between the two. Similarly, the Java SDK for ODPS requires a Java environment to run. To use the HDFS external table feature, you must configure the Java environment and control it through specific parameters. You can then create tables that access files on HDFS. The relevant parameters are as follows:
- ob_enable_java_env
- ob_java_home
- ob_java_connector_path
- ob_java_opts
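The following sketch shows one way these parameters might be set with `ALTER SYSTEM SET`. All paths and the JVM option are placeholder assumptions, not recommended values, and the required privileges and the tenant from which the statements must be run are not covered here.

```sql
-- Hypothetical JDK installation directory.
ALTER SYSTEM SET ob_java_home = '/path/to/jdk';

-- Hypothetical directory that contains the Java connector packages.
ALTER SYSTEM SET ob_java_connector_path = '/path/to/jni_packages/lib';

-- Illustrative JVM option only. -Xmx sets the maximum Java heap size.
ALTER SYSTEM SET ob_java_opts = '-Xmx2048m';

-- Enable the Java environment after the settings above are in place.
ALTER SYSTEM SET ob_enable_java_env = true;
```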
For more information on configuring the Java environment, see Deploy the Java SDK environment for OceanBase Database.