OceanBase Database now supports connecting to a file system as an external data source by using the Java SDK. This feature uses the Java Native Interface (JNI) framework, so you need to deploy the Java SDK environment for OceanBase Database.
Environment configuration
Notice
If you want to use the HDFS/ODPS external table feature and OceanBase Database is deployed in a distributed manner, you need to configure the Java environment and install the dependencies on all corresponding nodes. You cannot configure only one node.
All environment configuration and dependency installation operations must be performed as the admin user. Please switch to the admin user by using the following command:
su - admin
Configure the Java environment
Download the OpenJDK installation package from the download page.
Notice
Use the latest version of OpenJDK 11.
This example uses OpenJDK 11.0.29+7. The specific operation steps are as follows:
[admin@xxx /home/admin/rpm]# wget https://builds.openlogic.com/downloadJDK/openlogic-openjdk/11.0.29+7/openlogic-openjdk-11.0.29+7-linux-x64.tar.gz
Decompress the installation package.
Example:
[admin@xxx /home/admin/rpm]# sudo tar -zxvf openlogic-openjdk-11.0.29+7-linux-x64.tar.gz
The installation path is as follows:
/home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64
Notice
This path is used to configure the Java home directory on the node where the OBServer runs, that is, the value of the ob_java_home parameter.
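Before using the path as the value of ob_java_home, it can help to confirm that it really points at a JDK. The following is a minimal sketch, not an OceanBase tool: check_java_home is a hypothetical helper name, and the lib/server/libjvm.so location is the standard layout of a Linux JDK 11 build, which is what a JNI-based loader would need.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OceanBase): sanity-check a candidate
# ob_java_home directory on the current node.
check_java_home() {
    local java_home="$1"

    # A JDK ships an executable launcher at bin/java.
    if [ ! -x "$java_home/bin/java" ]; then
        echo "missing executable: $java_home/bin/java" >&2
        return 1
    fi

    # JDK 11 on Linux places the JVM shared library under lib/server.
    if [ ! -f "$java_home/lib/server/libjvm.so" ]; then
        echo "missing library: $java_home/lib/server/libjvm.so" >&2
        return 1
    fi

    echo "ok: $java_home"
}

# Example call (run it on every node of a distributed deployment):
#   check_java_home /home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64
```

The helper only inspects the file layout. It does not start the JVM, so it cannot detect a damaged or incompatible JDK build.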
Install dependencies
To use the HDFS/ODPS external table feature of OceanBase Database, you also need to install the following components:
devdeps-hdfs-sdk RPM package:
This package contains dynamic link library files required for the runtime of HDFS/ODPS external tables. It provides core interfaces for interaction between the Java Virtual Machine (JVM) and the Java Native Interface (JNI). This component acts as a communication bridge between the JVM and the local external table Java SDK, ensuring the stable operation of the external table feature.
devdeps-java-extensions RPM package:
This package integrates core dependency libraries (JAR files) required for HDFS external tables and other external data sources such as ODPS. The extension package contains a complete Java runtime dependency chain, ensuring compatibility and performance optimization of the external table feature in distributed scenarios.
Deploy and configure the required HDFS.so dynamic library
Obtain the devdeps-hdfs-sdk RPM installation package.
- For Enterprise Edition users, contact Technical Support to obtain the devdeps-hdfs-sdk RPM installation package.
After obtaining the installation package, install it by using the following command:
sudo rpm -Uvh devdeps-hdfs-sdk-3.3.6-xxxxx.xxx.xxx.rpm
Example:
sudo rpm -Uvh devdeps-hdfs-sdk-3.3.6-112024123116.el7.x86_64.rpm
Check whether the installation meets the expected results.
The libhdfs.so and libhdfs.so.0.0.0 files must exist, and the corresponding symbolic link must be valid.
Example:
$ll /usr/local/oceanbase/deps/devel/lib
total 376
lrwxrwxrwx 1 root root     16 Dec 24 19:49 libhdfs.so -> libhdfs.so.0.0.0
-rwxr-xr-x 1 root root 384632 Dec 24 19:09 libhdfs.so.0.0.0
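The same check can be scripted so that it is easy to repeat on each node. This is a rough sketch under the assumption that the library directory matches the listing above; check_libhdfs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that libhdfs.so is a symbolic link that
# resolves, and that libhdfs.so.0.0.0 exists in the same directory.
check_libhdfs() {
    local lib_dir="$1"
    local link="$lib_dir/libhdfs.so"

    if [ ! -L "$link" ]; then
        echo "not a symbolic link: $link" >&2
        return 1
    fi
    # -e follows the link, so this test fails for a dangling link.
    if [ ! -e "$link" ]; then
        echo "dangling symbolic link: $link" >&2
        return 1
    fi
    if [ ! -f "$lib_dir/libhdfs.so.0.0.0" ]; then
        echo "missing file: $lib_dir/libhdfs.so.0.0.0" >&2
        return 1
    fi

    # readlink prints the link target exactly as it is stored.
    echo "ok: $link -> $(readlink "$link")"
}

# Example call:
#   check_libhdfs /usr/local/oceanbase/deps/devel/lib
```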
Deploy and configure the dependency jar package path
Obtain the devdeps-java-extensions RPM installation package.
- For Enterprise Edition users, contact Technical Support to obtain the devdeps-java-extensions RPM installation package.
Notice
- For OceanBase Database V4.3.5 BP1 and earlier: Download the devdeps-java-extensions RPM installation package of V1.0.0.
- For OceanBase Database V4.3.5 BP2 and later: Download the devdeps-java-extensions RPM installation package of V1.0.1.
- For OceanBase Database V4.4.0: Download the devdeps-java-extensions RPM installation package of V1.0.1.
- For OceanBase Database V4.4.1: Download the devdeps-java-extensions RPM installation package of V1.0.2.
After obtaining the installation package, install it by using the following command:
sudo rpm -Uvh devdeps-java-extensions-x.x.x-xxxxxxxxxxxx.xxx.xxxxxx.rpm --prefix=/user_install_directory
Example:
sudo rpm -Uvh devdeps-java-extensions-1.0.0-122025032514.el7.x86_64.rpm --prefix=/home/admin/oceanbase
Check whether the installation meets the expected results.
Check whether the oceanbase-odps-connector-jar-with-dependencies.jar file exists in the /home/admin/oceanbase/jni_packages/v1.0.0 directory.
Example:
$ll /home/admin/oceanbase/jni_packages/v1.0.0
total 52756
drwxr-sr-x 4 root root     4096 Dec 24 20:25 hadoop
drwxr-xr-x 3 root root     4096 Dec 24 20:25 lib
-rw-r--r-- 1 root root 54008720 Dec 24 19:52 oceanbase-odps-connector-jar-with-dependencies.jar
Notice
This path is used to configure the path of the executable dependency JAR package that can be loaded by the JVM when the OBServer starts, that is, the value of the ob_java_connector_path parameter.
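To find the directory to use for ob_java_connector_path without hard-coding the version folder, a small lookup such as the following may help. It is only a sketch: find_connector_dirs is a hypothetical helper, and it assumes that every package version installs into jni_packages/v<version> under the chosen prefix, which the example above shows only for v1.0.0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every jni_packages/v* directory under the
# install prefix that contains the connector JAR.
find_connector_dirs() {
    local prefix="$1"
    local jar="oceanbase-odps-connector-jar-with-dependencies.jar"
    local dir
    local status=1   # stays 1 (failure) unless at least one directory matches

    for dir in "$prefix"/jni_packages/v*/; do
        if [ -f "$dir$jar" ]; then
            echo "${dir%/}"   # strip the trailing slash from the glob match
            status=0
        fi
    done
    return "$status"
}

# Example call:
#   find_connector_dirs /home/admin/oceanbase
```

If the function prints more than one directory, several package versions are installed under the same prefix, and the one that matches the OceanBase Database version should be chosen.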
(Optional) Restart the observer process
Note
- If you are using the Java SDK for the first time, you do not need to restart the observer process.
- The JNI-related parameters currently supported by OBServer do not take effect dynamically, so you need to restart the observer process for changes to the Java environment variables to take effect.
When using the HDFS/ODPS external table feature, you need to configure the corresponding OBServer node. The configuration steps are as follows:
Notice
All the following parameters are cluster-level configurations. You need to set them only once, without the need to configure them on each node individually.
Step 1: Set the parameters related to the Java environment
Notice
The following settings are performed in the sys tenant.
Enable the Java environment that the external table Java SDK requires.
Here is an example:
ALTER SYSTEM SET ob_enable_java_env = true;
For more information, see ob_enable_java_env.
Set the Java home directory on the node where the OBServer runs.
Note
The path is obtained from the OpenJDK Java installation path.
Here is an example:
ALTER SYSTEM SET ob_java_home = "/home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64";
For more information, see ob_java_home.
Set the path to the executable dependency JAR packages that can be loaded by the JVM when the OBServer is started.
Note
The path is obtained from the JAR package RPM installation path.
Here is an example:
ALTER SYSTEM SET ob_java_connector_path = "/home/admin/oceanbase/jni_packages/v1.0.0";
For more information, see ob_java_connector_path.
Set the parameters related to the Java environment startup.
Create the corresponding log folder path.
mkdir -p /home/user/jvmlogs
mkdir -p /home/user/jvmlogs/heapdumps
Set the JVM startup configuration for Java runtime.
Here is an example:
ALTER SYSTEM SET ob_java_opts="-Xmx2048m -Xmn2048m -XX:-CriticalJNINatives -Djdk.lang.processReaperUseDefaultStackSize=true -Xrs -XX:+HeapDumpOnOutOfMemoryError -Xloggc:/home/admin/gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/admin/heapdumps/ -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+TieredCompilation -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses ";
For more information, see ob_java_opts.
Notice
- After you change this parameter, you must restart the observer process. The current HDFS integration uses memory copy, which directly copies HDFS data streams to the C++ memory heap. You can reduce the -Xmx2048m -Xms2048m setting.
- GC log files are generated only when the configured folder path exists. If the path does not exist, no GC log files are generated.
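Because GC logs are written only when the target folder exists, it can be useful to list the directories that a given ob_java_opts value refers to before restarting. The sketch below uses a hypothetical helper named jvm_log_dirs. It assumes that the options are separated by whitespace and that -XX:HeapDumpPath names a directory, as in the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the directories referenced by the
# -Xloggc: and -XX:HeapDumpPath= options in a JVM option string.
jvm_log_dirs() {
    local opts="$1"
    local -a opt_list
    local opt path

    # Split the option string on whitespace without glob expansion.
    read -r -a opt_list <<< "$opts"

    for opt in "${opt_list[@]}"; do
        case "$opt" in
            -Xloggc:*)
                # The GC log option names a file, so report its parent directory.
                dirname "${opt#-Xloggc:}"
                ;;
            -XX:HeapDumpPath=*)
                # Assumed to be a directory here; drop any trailing slash.
                path="${opt#-XX:HeapDumpPath=}"
                echo "${path%/}"
                ;;
        esac
    done
}

# Example: report directories that do not exist yet.
#   jvm_log_dirs "$opts" | while read -r d; do
#       [ -d "$d" ] || echo "missing: $d"
#   done
```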
(Optional) Set the parameter for using the Java SDK of ODPS.
Notice
The HDFS Java SDK does not require this parameter. Other parameters are the same.
If you want to use the Java SDK, set _use_odps_jni_connector to true.
ALTER SYSTEM SET _use_odps_jni_connector = true;
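The path-dependent statements in this step can be generated from the two installation paths so that the values stay consistent with what was installed. This is an illustrative sketch only: java_env_sql is a hypothetical helper, it prints the statements rather than executing them, and it assumes that the paths contain no double quotation marks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ALTER SYSTEM statements for the Java
# environment, given the JDK path and the connector JAR directory.
java_env_sql() {
    local java_home="$1"
    local connector_path="$2"

    printf 'ALTER SYSTEM SET ob_enable_java_env = true;\n'
    printf 'ALTER SYSTEM SET ob_java_home = "%s";\n' "$java_home"
    printf 'ALTER SYSTEM SET ob_java_connector_path = "%s";\n' "$connector_path"
}

# Example call:
#   java_env_sql /home/admin/rpm/openlogic-openjdk-11.0.29+7-linux-x64 \
#                /home/admin/oceanbase/jni_packages/v1.0.0
```

The printed statements are meant to be reviewed and then run in a session connected to the sys tenant.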
Step 2 (Optional): Decompress the hdfs-sdk package without sudo privileges
If you cannot obtain sudo privileges to execute the installation command, you can manually decompress the devdeps-hdfs-sdk RPM installation package by using the rpm2cpio command. You can place the package in the required path as needed.
Here is an example:
rpm2cpio devdeps-hdfs-sdk-3.3.6-xxxxx.xxx.xxx.rpm | cpio -idmv
Specify the target path.
After decompressing the installation package, you can use the mv command to move the decompressed files to a custom path accessible by users (for example, ~/hdfs_lib). The target directory must already exist.
Here is an example:
mv ./usr/local/oceanbase/deps/devel/lib/* ~/hdfs_lib/
Obtain and verify the absolute path.
You can use the realpath command to view the absolute path of the custom path.
Here is an example:
realpath ~/hdfs_lib
The returned result is as follows:
/home/${user_name}/hdfs_lib
Configure the OceanBase Database path variable.
Here is an example:
Log in to the sys tenant and execute the following command to register the custom path to the system configuration.
Notice
- When you execute the following statement, replace ${user_name} in the example so that the value matches the actual path.
- The path must be an absolute path. Multiple paths are separated by colons (:) and cannot contain spaces.
ALTER SYSTEM SET _ob_additional_lib_path = '/home/${user_name}/hdfs_lib';
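The rules in the notice (absolute paths, colon-separated, no spaces) can be checked before the value is placed into the statement. The following sketch uses a hypothetical helper named validate_lib_path. It rejects any whitespace, which is slightly stricter than the rule as stated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a candidate _ob_additional_lib_path value.
validate_lib_path() {
    local value="$1"
    local -a parts
    local part

    case "$value" in
        "")
            echo "value is empty" >&2
            return 1
            ;;
        *[[:space:]]*)
            echo "value contains whitespace" >&2
            return 1
            ;;
    esac

    # Split on colons; IFS is changed only for this read command.
    IFS=':' read -r -a parts <<< "$value"

    for part in "${parts[@]}"; do
        case "$part" in
            /*) ;;   # absolute path, accepted
            *)
                echo "not an absolute path: '$part'" >&2
                return 1
                ;;
        esac
    done

    echo "ok: ${#parts[@]} path(s)"
}

# Example call, using the output of realpath as the value:
#   validate_lib_path "$(realpath ~/hdfs_lib)"
```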
Step 3: Restart the observer process
Stop the observer process on all servers and restart it.
Switch to the admin user.
[admin@xxx /home/admin/oceanbase/etc]# su - admin
Stop the observer process.
-bash-4.2$ kill -9 `pidof observer`
Restart the observer process.
-bash-4.2$ cd /home/admin/oceanbase && /home/admin/oceanbase/bin/observer
Note
No startup parameters are required when you restart the observer process because the previous startup parameters have been written to the parameter file.
References
For detailed information about how to create an ODPS external table, see Create an external table (MySQL-compatible mode) or Create an external table (Oracle-compatible mode).