OceanBase Database supports using the Java SDK to connect to file systems, so that external tables can read from external data sources. This feature is based on the Java Native Interface (JNI) framework, so you need to set up a Java SDK environment on the server where OceanBase Database is running.
Environment preparation
Notice
If you want to use HDFS or ODPS external tables and your OceanBase Database is deployed across multiple servers (distributed deployment), you must set up the Java environment and install the required dependencies on every server node, not just one.
Set up the Java environment
Download the OpenJDK installation package from the download page.
Notice
Use the latest version of OpenJDK 8.
For example, to download OpenJDK 8u422-b05, run:
[root@xxx /home/admin/rpm]# wget https://builds.openlogic.com/downloadJDK/openlogic-openjdk/8u422-b05/openlogic-openjdk-8u422-b05-linux-x64.tar.gz
Decompress the installation package.
Here is an example:
[root@xxx /home/admin/rpm]# tar -zxvf openlogic-openjdk-8u422-b05-linux-x64.tar.gz
[root@xxx /home/admin/rpm]# ls openlogic-openjdk-8u422-b05-linux-x64
ASSEMBLY_EXCEPTION LICENSE THIRD_PARTY_README bin demo include jre lib man release sample src.zip
The installation path is as follows:
/home/admin/rpm/openlogic-openjdk-8u422-b05-linux-x64
Notice
This path is the Java home directory of the node where OBServer runs. It is the value to use for the ob_java_home parameter.
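Before using the extracted directory as the Java home, it can help to confirm that it has the layout a JNI host process expects. The following is a minimal sketch, not part of any OceanBase tooling: check_java_home is a hypothetical helper, and the jre/lib/amd64/server/libjvm.so location is assumed from the LD_LIBRARY_PATH example in Step 2 of this topic (x86_64 OpenJDK 8 layout).

```shell
# Hypothetical helper: verify that a directory looks like an x86_64 OpenJDK 8 home.
# Usage: check_java_home /path/to/jdk
check_java_home() {
  local java_home="$1"
  # The java launcher must exist and be executable.
  if [ ! -x "$java_home/bin/java" ]; then
    echo "missing executable: $java_home/bin/java" >&2
    return 1
  fi
  # libjvm.so is the library a JNI host process loads (assumed OpenJDK 8 layout).
  if [ ! -f "$java_home/jre/lib/amd64/server/libjvm.so" ]; then
    echo "missing library: $java_home/jre/lib/amd64/server/libjvm.so" >&2
    return 1
  fi
  echo "ok: $java_home"
}
```

For the example above, the call would be check_java_home /home/admin/rpm/openlogic-openjdk-8u422-b05-linux-x64, run on every node in a distributed deployment.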
Install dependencies
To use HDFS or ODPS external tables with OceanBase Database, you also need to install the following packages:
devdeps-hdfs-sdk RPM package:
This package contains the dynamic libraries needed to run HDFS/ODPS external tables. It provides the core interface for communication between the Java Virtual Machine (JVM) and JNI, acting as a bridge between the JVM and the local external table Java SDK.
devdeps-java-extensions RPM package:
This package integrates core dependency libraries (JAR files) required for HDFS external tables and other external data sources like ODPS. The extension package includes a complete Java runtime dependency chain, ensuring compatibility and performance optimization of the external table feature in distributed scenarios.
Deploy the HDFS.so dynamic library
Download the devdeps-hdfs-sdk RPM package.
- For Enterprise Edition users, please contact technical support to obtain the devdeps-hdfs-sdk RPM package.
- For Community Edition users, visit the OceanBase Image page, click the development-kit/ directory to go to the development tool resource directory, and download the devdeps-hdfs-sdk RPM package.
Run the following command to install the package.
sudo rpm -Uvh devdeps-hdfs-sdk-3.3.6-xxxxx.xxx.xxx.rpm
Here is an example:
sudo rpm -Uvh devdeps-hdfs-sdk-3.3.6-112024123116.el7.x86_64.rpm
Check whether the installation is successful.
The libhdfs.so and libhdfs.so.0.0.0 files must exist, and the soft link must be set up correctly.
Here is an example:
$ ll /usr/local/oceanbase/deps/devel/lib
total 376
lrwxrwxrwx 1 root root 16 Dec 24 19:49 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x 1 root root 384632 Dec 24 19:09 libhdfs.so.0.0.0
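The same check can be scripted so that it is easy to repeat on every node. This is a sketch under the assumption that the library directory is the one shown in the listing above; check_libhdfs is a hypothetical helper name, not a tool shipped with the RPM package.

```shell
# Hypothetical helper: confirm libhdfs.so is a symlink that resolves to a real file.
# Usage: check_libhdfs [lib_dir]
check_libhdfs() {
  local lib_dir="${1:-/usr/local/oceanbase/deps/devel/lib}"
  local link="$lib_dir/libhdfs.so"
  if [ ! -L "$link" ]; then
    echo "not a symlink: $link" >&2
    return 1
  fi
  # -e follows the link, so this test fails for a dangling symlink.
  if [ ! -e "$link" ]; then
    echo "dangling symlink: $link -> $(readlink "$link")" >&2
    return 1
  fi
  echo "ok: $link -> $(readlink "$link")"
}
```

Called without arguments, it inspects /usr/local/oceanbase/deps/devel/lib and returns a non-zero status if the link is missing or broken.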
Configure the JAR package path
Obtain the devdeps-java-extensions RPM installation package.
- For Enterprise Edition users, please contact technical support to obtain the devdeps-java-extensions RPM package.
- For Community Edition users, visit the OceanBase Image page, click the development-kit/ directory to go to the development tool resource directory, and download the devdeps-java-extensions RPM package.
Notice
- For OceanBase Database V4.3.5 BP1 and earlier: Download the devdeps-java-extensions RPM installation package of V1.0.0.
- For OceanBase Database V4.3.5 BP2 and later: Download the devdeps-java-extensions RPM installation package of V1.0.1.
- For OceanBase Database V4.4.0: Download the devdeps-java-extensions RPM installation package of V1.0.1.
Install the package by using the following command.
sudo rpm -Uvh devdeps-java-extensions-x.x.x-xxxxxxxxxxxx.xxx.xxxxxx.rpm
Here is an example:
sudo rpm -Uvh devdeps-java-extensions-1.0.0-122025032514.el7.x86_64.rpm
Check whether the installation is successful.
Check whether the oceanbase-odps-connector-jar-with-dependencies.jar file exists in the directory /home/admin/oceanbase/jni_packages/v1.0.0.
Here is an example:
$ ll /home/admin/oceanbase/jni_packages/v1.0.0
total 52756
drwxr-sr-x 4 root root 4096 Dec 24 20:25 hadoop
drwxr-xr-x 3 root root 4096 Dec 24 20:25 lib
-rw-r--r-- 1 root root 54008720 Dec 24 19:52 oceanbase-odps-connector-jar-with-dependencies.jar
Notice
This directory holds the dependency JAR packages that the JVM started by OBServer loads. It is the value to use for the ob_java_connector_path parameter.
(Optional) Restart the observer process
Note
- If you are using the Java SDK for the first time, you do not need to restart the observer process.
- OBServer currently cannot apply JNI-related configuration changes dynamically, so changes to the Java environment variables take effect only after the observer process is restarted.
To use the HDFS/ODPS external table feature, you need to configure OBServer accordingly. The configuration steps are as follows.
Notice
All the following configurations are cluster-level settings. You need to configure them only once, without the need to set them on each node individually.
Step 1: Set the relevant parameters for starting the Java environment
Notice
All the following settings are performed in the sys tenant.
Enable the Java environment for the Java SDK to access the external table.
Here is an example:
ALTER SYSTEM SET ob_enable_java_env = true;
Set the Java home directory on the node where OBServer runs.
Note
This path is the OpenJDK installation path.
Here is an example:
ALTER SYSTEM SET ob_java_home = "/home/admin/rpm/openlogic-openjdk-8u422-b05-linux-x64";
Set the path of the executable dependency JAR file that the JVM of the OBServer can load.
Note
This path is the directory where the devdeps-java-extensions RPM package installed the JAR file.
Here is an example:
ALTER SYSTEM SET ob_java_connector_path = "/home/admin/oceanbase/jni_packages/v1.0.0";
Set the relevant parameters for starting the Java environment.
Create the corresponding log folder.
mkdir -p /home/user/jvmlogs
mkdir -p /home/user/jvmlogs/heapdumps
Set the JVM startup parameters for running Java.
Here is an example:
ALTER SYSTEM SET ob_java_opts = "-Djdk.lang.processReaperUseDefaultStackSize=true -XX:+HeapDumpOnOutOfMemoryError -Xmx2048m -Xms2048m -Xloggc:/home/user/jvmlogs/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/user/jvmlogs/heapdumps/ -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -XX:+UseG1GC -XX:-CriticalJNINatives";
Notice
- After this parameter is modified, you must restart the OBServer process. In the current HDFS integration, the HDFS data stream is copied directly to the C++ memory heap, so you can reduce the values of -Xmx2048m -Xms2048m as appropriate.
- GC logs are generated only when the configured folder path exists. If the path does not exist, no GC logs are generated.
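Because the heap size and the log locations are embedded in one long string, they can easily drift from the directories created earlier. The sketch below assembles the option string from two inputs. build_java_opts is a hypothetical helper; the flags it emits are the ones from the ob_java_opts example in this step, with the repeated -XX:+HeapDumpOnOutOfMemoryError listed once, and only the heap size and log directory vary.

```shell
# Hypothetical helper: print a JVM option string suitable for ob_java_opts.
# $1 = heap size in MB (used for both -Xmx and -Xms), $2 = log directory.
build_java_opts() {
  local heap_mb="$1"
  local log_dir="$2"
  # printf repeats the '%s ' format for each argument, joining them with spaces.
  printf '%s ' \
    "-Djdk.lang.processReaperUseDefaultStackSize=true" \
    "-XX:+HeapDumpOnOutOfMemoryError" \
    "-Xmx${heap_mb}m" \
    "-Xms${heap_mb}m" \
    "-Xloggc:${log_dir}/gc.log" \
    "-XX:+PrintGCDetails" \
    "-XX:+PrintGCDateStamps" \
    "-XX:HeapDumpPath=${log_dir}/heapdumps/" \
    "-XX:+UseGCLogFileRotation" \
    "-XX:NumberOfGCLogFiles=10" \
    "-XX:GCLogFileSize=100M" \
    "-XX:+UseG1GC"
  # The last option is printed without a trailing space.
  printf '%s\n' "-XX:-CriticalJNINatives"
}

# Example: print the statement with a 1024 MB heap (an illustrative value only).
echo "ALTER SYSTEM SET ob_java_opts = \"$(build_java_opts 1024 /home/user/jvmlogs)\";"
```

The log directory passed here should be the one created with mkdir -p, since GC logs are written only if the path already exists.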
(Optional) Set the ODPS parameters for using the Java SDK.
Notice
The HDFS Java SDK does not require this parameter to be set. The other parameters are the same.
To use the Java SDK, you must set _use_odps_jni_connector to true.
ALTER SYSTEM SET _use_odps_jni_connector = true;
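The three path-related statements in Step 1 depend only on the two installation paths, so they can be generated rather than typed by hand. This is a minimal sketch: print_java_env_sql is a hypothetical helper, and it only prints the statements shown in this step. How you send them to the sys tenant depends on the client you use and is not assumed here.

```shell
# Hypothetical helper: print the sys-tenant statements from Step 1.
# $1 = OpenJDK installation path, $2 = directory of the dependency JAR packages.
print_java_env_sql() {
  printf 'ALTER SYSTEM SET ob_enable_java_env = true;\n'
  printf 'ALTER SYSTEM SET ob_java_home = "%s";\n' "$1"
  printf 'ALTER SYSTEM SET ob_java_connector_path = "%s";\n' "$2"
}

# Example with the paths used in this topic.
print_java_env_sql \
  /home/admin/rpm/openlogic-openjdk-8u422-b05-linux-x64 \
  /home/admin/oceanbase/jni_packages/v1.0.0
```

The output can be reviewed and then pasted into a session connected to the sys tenant.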
Step 2: Set the available LD_LIBRARY_PATH
The JNI SDK for OceanBase Database is implemented as an extension feature that is loaded as a dynamic library. For the feature to work, you must configure the dynamic library search path (that is, set the LD_LIBRARY_PATH environment variable) on every OBServer node.
Here is an example:
$ vim ~/.bashrc
export LD_LIBRARY_PATH=/home/admin/rpm/openlogic-openjdk-8u422-b05-linux-x64/jre/lib/amd64/server:/usr/local/oceanbase/deps/devel/lib
$ source ~/.bashrc
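The export line above overwrites any value LD_LIBRARY_PATH already has. If other entries need to be kept, a variant that appends each directory only when it is missing may be preferable. The following is a sketch of that approach; add_ld_path is a hypothetical helper, and the two directories are the ones from the example above.

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH unless it is already listed.
add_ld_path() {
  case ":${LD_LIBRARY_PATH:-}:" in
    *":$1:"*) ;;  # already present, nothing to do
    *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1" ;;
  esac
  export LD_LIBRARY_PATH
}

# Directory that contains libjvm.so, then the directory that contains libhdfs.so.
add_ld_path /home/admin/rpm/openlogic-openjdk-8u422-b05-linux-x64/jre/lib/amd64/server
add_ld_path /usr/local/oceanbase/deps/devel/lib
```

Placed in ~/.bashrc, this stays idempotent when the file is sourced more than once in the same session.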
Step 3: Restart the observer process
Stop the observer process on all servers and then restart it.
Switch to the admin user.
[root@xxx /home/admin/oceanbase/etc]# su - admin
Stop the observer process.
-bash-4.2$ kill -9 `pidof observer`
Restart the observer process.
-bash-4.2$ cd /home/admin/oceanbase && /home/admin/oceanbase/bin/observer
Note
When restarting the observer process, you do not need to specify the startup parameters again, as the parameters from the previous startup have been written to the parameter file.
References
For more information about how to create an ODPS external table, see Create an external table (MySQL-compatible mode) or Create an external table (Oracle-compatible mode).