OceanBase Database now supports connecting to a file system as an external data source by using the Java SDK. This feature is implemented using the JNI framework, so you must deploy the Java SDK in the OceanBase Database environment.
Configure environment
Notice
If you want to use the HDFS/ODPS external table feature and OceanBase Database is deployed in a distributed manner, you must configure the Java environment and install dependencies on all corresponding nodes. You cannot configure only one node.
All environment configuration and dependency installation operations must be performed as the admin user. To switch to the admin user, run the following command:
su - admin
Install and configure Java environment
Download the OpenJDK package from the Download page.
Notice
Use the latest version of OpenJDK 11.
This example uses OpenJDK 11.0.29+7. Perform the following steps in sequence:
[admin@xxx /home/admin/rpm]# wget https://builds.openlogic.com/downloadJDK/openlogic-openjdk/11.0.29+7/openlogic-openjdk-11.0.29+7-linux-x64.tar.gz
Decompress the installation package.
Here is an example:
[admin@xxx /home/admin/rpm]# sudo tar -zxvf openlogic-openjdk-11.0.29+7-linux-x64.tar.gz
The installation path is as follows:
/home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64
Notice
This path is used for setting the Java home directory of the current OBServer node, which corresponds to the value of the configuration item ob_java_home.
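Before recording this path in ob_java_home, it can help to confirm that the directory really contains a Java launcher. The following is a minimal sketch, not part of the OceanBase tooling: the function name `check_java_home` is hypothetical, and it assumes a standard JDK layout with an executable `bin/java`.

```shell
# Sketch only: check_java_home is a hypothetical helper, not an OceanBase command.
# It checks that a directory looks like a JDK home before the path is used
# as the value of ob_java_home.
check_java_home() {
    local java_home="$1"
    # Assumption: an absolute path is expected, as in the example path above.
    case "$java_home" in
        /*) ;;
        *) echo "not an absolute path: $java_home" >&2; return 1 ;;
    esac
    # Assumption: a JDK home contains an executable bin/java launcher.
    if [ ! -x "$java_home/bin/java" ]; then
        echo "no executable bin/java under $java_home" >&2
        return 1
    fi
    # Print the verified path so it can be copied into the configuration.
    echo "$java_home"
}
```

For example, `check_java_home /home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64` prints the path when the check passes and returns a non-zero status otherwise.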
Install dependencies
To use the HDFS/ODPS external tables feature of OceanBase Database, the following components are required:
devdeps-hdfs-sdk RPM package:
This package contains dynamic link libraries required for the runtime of HDFS/ODPS external tables, and provides the core interfaces for interaction between the JVM environment and JNI (Java Native Interface). This component serves as the communication bridge between the Java Virtual Machine and the native external table Java SDK, ensuring stable operation of the external tables feature.
devdeps-java-extensions RPM package:
This package integrates the core dependency libraries (JAR files) required for HDFS external tables and other external data sources (such as ODPS). The extension package includes a complete Java runtime dependency chain, ensuring the compatibility and performance optimization of external table features in distributed scenarios.
Deploy and configure HDFS.so dynamic library
Obtain the devdeps-hdfs-sdk RPM installation package.
- Enterprise Edition users: Contact Technical Support to obtain the devdeps-hdfs-sdk RPM installation package.
- Community Edition users: Go to the OceanBase Database mirror page, click the development-kit/ directory to access the development tools resource directory, and download the devdeps-hdfs-sdk RPM installation package.
Install the package by using the following command:
sudo rpm -Uvh devdeps-hdfs-sdk-3.3.6-xxxxx.xxx.xxx.rpm
Here is an example:
sudo rpm -Uvh devdeps-hdfs-sdk-3.3.6-112024123116.el7.x86_64.rpm
Check whether the package is installed correctly.
The libhdfs.so and libhdfs.so.0.0.0 files must be present, and the corresponding symbolic links must be valid.
Here is an example:
$ ll /usr/local/oceanbase/deps/devel/lib
total 376
lrwxrwxrwx 1 root root     16 Dec 24 19:49 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x 1 root root 384632 Dec 24 19:09 libhdfs.so.0.0.0
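The same check can be scripted so that it is easy to repeat on every node. This is a rough sketch under the assumption that only the two file names listed above need to be verified; `check_libhdfs` is a hypothetical helper name.

```shell
# Sketch only: check_libhdfs is a hypothetical helper, not an OceanBase command.
# It verifies that both library names exist in the given directory and that
# the symbolic link resolves to a real file.
check_libhdfs() {
    local lib_dir="$1"
    local name
    for name in libhdfs.so libhdfs.so.0.0.0; do
        # [ -e ] follows symbolic links, so a dangling link fails this test.
        if [ ! -e "$lib_dir/$name" ]; then
            echo "missing or broken: $lib_dir/$name" >&2
            return 1
        fi
    done
    echo "libhdfs libraries look complete in $lib_dir"
}
```

With the default installation path this would be called as `check_libhdfs /usr/local/oceanbase/deps/devel/lib`.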
Deploy and configure jar package path
Obtain the devdeps-java-extensions RPM installation package.
- Enterprise Edition users: Contact Technical Support to obtain the devdeps-java-extensions RPM installation package.
- Community Edition users: Go to the OceanBase Database mirror page, click the development-kit/ directory to access the development tools resource directory, and download the devdeps-java-extensions RPM installation package.
Notice
- For OceanBase Database V4.3.5 BP1 and earlier: Download the devdeps-java-extensions RPM installation package of version 1.0.0.
- For OceanBase Database V4.3.5 BP2 and later: Download the devdeps-java-extensions RPM installation package of version 1.0.1.
- For OceanBase Database V4.4.0: Download the devdeps-java-extensions RPM installation package of version 1.0.1.
- For OceanBase Database V4.4.1: Download the devdeps-java-extensions RPM installation package of version 1.0.2.
- For OceanBase Database V4.4.2 and later: Download the devdeps-java-extensions RPM installation package of version 1.0.4.
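The version mapping above can be expressed as a small lookup, which may be convenient in deployment scripts. This is only a sketch: `java_ext_version` is a hypothetical function, BP numbers are considered only for V4.3.5, and any version between V4.3.5 and V4.4.0 is assumed to follow the "V4.3.5 BP2 and later" rule.

```shell
# Sketch only: java_ext_version is a hypothetical helper.
# Usage: java_ext_version <OceanBase version> [BP number]
# Prints the devdeps-java-extensions version listed for that release.
java_ext_version() {
    local version="${1#V}"   # accept "V4.3.5" as well as "4.3.5"
    local bp="${2:-0}"       # BP number; 0 when the release has no BP
    local old_ifs n
    # Loose format check: three dot-separated numeric components.
    case "$version" in
        [0-9]*.[0-9]*.[0-9]*) ;;
        *) echo "expected a version such as 4.3.5, got: $1" >&2; return 1 ;;
    esac
    # Split the version on dots into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $version
    IFS=$old_ifs
    # Fold major.minor.patch into one integer, for example 4.3.5 -> 40305.
    n=$(( ${1:-0} * 10000 + ${2:-0} * 100 + ${3:-0} ))
    if [ "$n" -lt 40305 ]; then
        echo 1.0.0
    elif [ "$n" -eq 40305 ]; then
        if [ "$bp" -le 1 ]; then echo 1.0.0; else echo 1.0.1; fi
    elif [ "$n" -le 40400 ]; then
        # Assumption: releases after 4.3.5 up to 4.4.0 use 1.0.1.
        echo 1.0.1
    elif [ "$n" -eq 40401 ]; then
        echo 1.0.2
    else
        echo 1.0.4
    fi
}
```

For instance, `java_ext_version V4.3.5 2` prints `1.0.1` and `java_ext_version 4.4.1` prints `1.0.2`.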
Install the package by using the following command:
sudo rpm -Uvh devdeps-java-extensions-x.x.x-xxxxxxxxxxxx.xxx.xxxxxx.rpm --prefix=/user_install_directory
Here is an example:
sudo rpm -Uvh devdeps-java-extensions-1.0.0-122025032514.el7.x86_64.rpm --prefix=/home/admin/oceanbase
Check whether the package is installed correctly.
Check whether the oceanbase-odps-connector-jar-with-dependencies.jar file exists in the /home/admin/oceanbase/jni_packages/v1.0.0 directory.
Here is an example:
$ ll /home/admin/oceanbase/jni_packages/v1.0.0
total 52756
drwxr-sr-x 4 root root     4096 Dec 24 20:25 hadoop
drwxr-xr-x 3 root root     4096 Dec 24 20:25 lib
-rw-r--r-- 1 root root 54008720 Dec 24 19:52 oceanbase-odps-connector-jar-with-dependencies.jar
Notice
This path is used for setting the path of the jar packages that are to be loaded by the JVM upon OBServer startup, which corresponds to the value of the configuration item ob_java_connector_path.
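Because the directory name under jni_packages includes a version (v1.0.0 in this example), it may be safer to look the directory up than to type it by hand. The sketch below assumes that other package versions install into a similarly named subdirectory, which this document does not confirm; `find_connector_dir` is a hypothetical helper.

```shell
# Sketch only: find_connector_dir is a hypothetical helper.
# It prints every directory under <prefix>/jni_packages that contains the
# connector JAR, as a candidate value for ob_java_connector_path.
find_connector_dir() {
    local prefix="$1"
    local jar="oceanbase-odps-connector-jar-with-dependencies.jar"
    local dir found=1
    for dir in "$prefix"/jni_packages/*/; do
        # If the glob matches nothing, the literal pattern fails this test.
        if [ -f "$dir$jar" ]; then
            echo "${dir%/}"
            found=0
        fi
    done
    # Return 0 only if at least one directory contained the JAR.
    return "$found"
}
```

With the installation prefix used above, `find_connector_dir /home/admin/oceanbase` would print `/home/admin/oceanbase/jni_packages/v1.0.0` if the JAR is in place.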
(Optional) Restart observer process
Note
- If you are using the Java SDK for the first time, you do not need to restart the observer process.
- OBServer does not currently apply changes to JNI-related settings dynamically. If you change any Java environment variables, you must restart the observer process for the changes to take effect.
When using the HDFS/ODPS external table feature, you need to configure the corresponding OBServer server. The configuration steps are as follows:
Notice
All the following configuration items are cluster-level settings. You need to configure them only once, without the need to configure them on each node individually.
Step 1: Set related configurations for Java environment startup
Notice
The following settings are performed in the sys tenant.
Enable the Java environment to allow access to the SDK (Java SDK) for external tables.
Here is an example:
ALTER SYSTEM SET ob_enable_java_env = true;
For more information about this setting, see ob_enable_java_env.
Set the Java home directory on the current OBServer node.
Note
This path is the installation path of OpenJDK Java.
Here is an example:
ALTER SYSTEM SET ob_java_home = "/home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64";
For more information about this setting, see ob_java_home.
Set the path to the executable dependency JAR packages that can be loaded by the JVM when OBServer starts.
Note
This path is the installation path of the JAR package RPM.
Here is an example:
ALTER SYSTEM SET ob_java_connector_path = "/home/admin/oceanbase/jni_packages/v1.0.0";
For more information about this setting, see ob_java_connector_path.
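When the two paths differ between environments, the three statements above can be generated from variables instead of edited by hand. This is a minimal sketch: `print_java_env_sql` is a hypothetical helper that only prints SQL text, and running the output against the sys tenant is left to whichever client you already use.

```shell
# Sketch only: print_java_env_sql is a hypothetical helper.
# It prints the three ALTER SYSTEM statements from this step, filled in
# with the Java home and connector path passed as arguments.
print_java_env_sql() {
    local java_home="$1"
    local connector_path="$2"
    printf 'ALTER SYSTEM SET ob_enable_java_env = true;\n'
    printf 'ALTER SYSTEM SET ob_java_home = "%s";\n' "$java_home"
    printf 'ALTER SYSTEM SET ob_java_connector_path = "%s";\n' "$connector_path"
}
```

The output can be reviewed first and then pasted or piped into a session that is connected to the sys tenant.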
Set the related configurations for Java environment startup
Create the corresponding log folder path.
mkdir -p /home/user/jvmlogs
mkdir -p /home/user/jvmlogs/heapdumps
Set the JVM startup configuration for Java runtime.
Here is an example:
ALTER SYSTEM SET ob_java_opts="-Xmx2048m -Xms2048m -XX:-CriticalJNINatives -Djdk.lang.processReaperUseDefaultStackSize=true -Xrs -XX:+HeapDumpOnOutOfMemoryError -Xloggc:/home/admin/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/admin/heapdumps/ -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+TieredCompilation -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses ";
For more information about this setting, see ob_java_opts.
Notice
- Changing this configuration requires a restart of the observer process. Since the current HDFS integration uses memory copy, which directly copies HDFS data streams to the C++ memory heap, you can appropriately reduce the -Xmx2048m -Xms2048m setting.
- GC log files are generated only when the configured folder path exists; if the path does not exist, no GC log files will be generated.
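Since the log folders must exist before the JVM starts, one option is to derive them from the same option string that is passed to ob_java_opts, so that the created directories and the configured paths cannot drift apart. The sketch below is an illustration only: `jvm_log_dirs` is a hypothetical helper, it looks only at the `-Xloggc:` and `-XX:HeapDumpPath=` options, and it assumes that a heap dump path ending in `/` names a directory while any other value names a file.

```shell
# Sketch only: jvm_log_dirs is a hypothetical helper.
# It prints the directories that must exist for the GC log and heap dump
# paths found in a JVM option string.
jvm_log_dirs() {
    local opts="$1"
    local opt
    # Unquoted on purpose: split the option string on whitespace.
    for opt in $opts; do
        case "$opt" in
            -Xloggc:*)
                # The GC log is a file, so its parent directory is needed.
                dirname "${opt#-Xloggc:}"
                ;;
            -XX:HeapDumpPath=*/)
                # Assumption: a trailing slash marks a directory.
                opt="${opt#-XX:HeapDumpPath=}"
                echo "${opt%/}"
                ;;
            -XX:HeapDumpPath=*)
                # Assumption: otherwise the value is treated as a file path.
                dirname "${opt#-XX:HeapDumpPath=}"
                ;;
        esac
    done
}
```

A typical use would be `jvm_log_dirs "$opts" | while read -r d; do mkdir -p "$d"; done`, run as the admin user on each node, where `opts` holds the value intended for ob_java_opts.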
(Optional) Set ODPS configuration item for using the Java SDK.
Notice
The HDFS Java SDK does not require this configuration item, while other configuration items remain the same.
If you want to use the Java SDK, set _use_odps_jni_connector to true.
ALTER SYSTEM SET _use_odps_jni_connector = true;
Step 2 (Optional): Decompress hdfs-sdk package without sudo privileges
If you cannot obtain sudo privileges to execute the installation command, you can manually decompress the devdeps-hdfs-sdk RPM installation package using the rpm2cpio command. You can place the required packages in the desired path as needed.
Here is an example:
rpm2cpio devdeps-hdfs-sdk-3.3.6-xxxxx.xxx.xxx.rpm | cpio -idmv
Specify the target path.
After decompressing the installation package, you can use the mv command to move the decompressed files to a custom path accessible by the user (e.g., ~/hdfs_lib).
Here is an example:
mv ./usr/local/oceanbase/deps/devel/lib/* ~/hdfs_lib/
Obtain and confirm the absolute path.
You can use the realpath command to view the absolute path of the custom path.
Here is an example:
realpath ~/hdfs_lib
The returned result is as follows:
/home/${user_name}/hdfs_lib
Configure the OceanBase Database path variable.
Here is an example:
Log in to the sys tenant and execute the following command to register the custom path to the system configuration.
Notice
- When executing the following statement, please replace the example ${user_name} with the actual path.
- The path must be an absolute path, and multiple paths should be separated by colons (:), without spaces.
ALTER SYSTEM SET _ob_additional_lib_path = '/home/${user_name}/hdfs_lib';
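The rules in this notice are easy to check before the value is sent to the database. The following sketch is one way to do so; `validate_lib_path` is a hypothetical helper, and rejecting a leftover `${...}` placeholder is an extra safeguard assumed here rather than a rule stated by OceanBase.

```shell
# Sketch only: validate_lib_path is a hypothetical helper.
# It checks a candidate value for _ob_additional_lib_path: colon-separated
# absolute paths, no spaces, and no unreplaced ${...} placeholder.
validate_lib_path() {
    local value="$1"
    local part old_ifs
    case "$value" in
        '')       echo "value is empty" >&2; return 1 ;;
        *" "*)    echo "value contains a space" >&2; return 1 ;;
        *:)       echo "value ends with a colon" >&2; return 1 ;;
        *'${'*)   echo "placeholder was not replaced" >&2; return 1 ;;
    esac
    # Split on colons and require every component to be absolute.
    old_ifs=$IFS
    IFS=:
    set -- $value
    IFS=$old_ifs
    for part in "$@"; do
        case "$part" in
            /*) ;;
            *) echo "not an absolute path: '$part'" >&2; return 1 ;;
        esac
    done
    return 0
}
```

For example, `validate_lib_path "$(realpath ~/hdfs_lib)"` returns success for a single absolute path, while a value such as `/a/lib: /b/lib` is rejected because of the space.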
Step 3: Restart observer process
Stop the observer process on all servers and then restart the observer process.
Switch to the admin user.
[admin@xxx /home/admin/oceanbase/etc]# su - admin
Stop the observer process.
-bash-4.2$ kill -9 `pidof observer`
Restart the observer process.
-bash-4.2$ cd /home/admin/oceanbase && /home/admin/oceanbase/bin/observer
Note
You do not need to specify the startup parameters again because the previous startup parameters have been written to the parameter file.
References
For more information about how to create an ODPS external table, see Create an external table (MySQL mode) or Create an external table (Oracle mode).
