Background information
OceanBase Database supports bypass import, which writes data directly into data files. Bypass import skips the interfaces in the SQL layer, allocates space directly, and inserts data into data files, which improves data import efficiency.
Scenarios
The bypass import feature applies to the following scenarios:
Data migration and synchronization: In data migration or synchronization, a large amount of data of different types must be migrated from different data sources to OceanBase Database. Conventional SQL interfaces cannot meet the requirement on timeliness.
Conventional extraction, transformation, and loading (ETL): After data is extracted and transformed in the source, a large amount of data must be loaded to the destination within a short time. The bypass import technology can improve the import performance.
Data loading from text files or other data sources to OceanBase Database: Bypass import can accelerate the data loading process.
Considerations
In bypass import, the RPC port instead of the SQL port is required to transmit data.
Data to be imported is submitted at the table level instead of the session or transaction level.
Retry or resumable transmission is not supported.
Bit data types are not supported.
Virtual generated columns are not supported.
We recommend that you do not import a small amount of data by using bypass import.
The --replace-data command-line option cannot help address unique index conflicts.
The differences between the --thread and --parallel command-line options are as follows:
- --thread specifies the thread pool for connections from the client to the server, and is maintained on the client.
- --parallel specifies the number of worker threads that can be called by the OBServer node for data writing and sorting.
We recommend that you specify consistent values for --thread and --parallel.
Command-line options for bypass import
Notice
The bypass import mode of OBLOADER supports direct connections to an OBServer node and connections through OceanBase Database Proxy (ODP). The version requirements are as follows:
- For direct connections to an OBServer node: OceanBase Database V4.2.0 or later.
- For connections through ODP: ODP V4.1.3 and OceanBase Database V4.2.1 or later.
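The version requirements above can be checked with a short comparison. The following is a minimal sketch, not part of OBLOADER: the function names are hypothetical, and it assumes that "or later" applies to both the ODP version and the OceanBase Database version in the ODP case.

```python
def parse_version(text):
    """Turn a version string such as 'V4.2.1' or '4.2' into a tuple of ints."""
    parts = tuple(int(part) for part in text.lstrip("vV").split("."))
    # Pad short versions such as '4.2' to three components so they compare as 4.2.0.
    return parts + (0,) * (3 - len(parts))


def supports_bypass_import(ob_version, odp_version=None):
    """Return True if the versions meet the requirements in the notice.

    odp_version=None stands for a direct connection to an OBServer node.
    """
    ob = parse_version(ob_version)
    if odp_version is None:
        return ob >= (4, 2, 0)
    # Assumption: both the ODP and the database version may be later than listed.
    return ob >= (4, 2, 1) and parse_version(odp_version) >= (4, 1, 3)


print(supports_bypass_import("V4.2.0"))
print(supports_bypass_import("V4.2.0", odp_version="V4.1.3"))
```

Comparing integer tuples avoids the pitfalls of comparing version strings character by character, for example "4.10.0" against "4.9.0".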
| Command-line option | Description | ApsaraDB for OceanBase & ODP | OceanBase Database & ODP | OceanBase Database & OBServer node |
|---|---|---|---|---|
| --direct | Specifies to use bypass import. | Required | Required | Required |
| --parallel | The degree of parallelism (DOP) on the server. The default value is 1. We recommend that the value be consistent with the number of CPU cores of the tenant. We recommend that you specify this option to ensure performance stability. | Optional | Optional | Optional |
| --rpc-port | The inner RPC port of the server. You can obtain the RPC port as follows: | Required | Required | Required |
| -u(--user) | The username that you use to log on to the database. | Required | Required | Required |
| -P(--port) | The SQL port number. | Required | Required | Required |
| -t(--tenant) | The tenant name of the cluster. | Optional. If this option is not specified, partition calculation may be skipped. | Required | Required |
| -c(--cluster) | The cluster name of the database. | Optional | Required | - |
| --public-cloud | Specifies that the database environment is ApsaraDB for OceanBase. | Required | - | - |
| --no-sys | Specifies that the import does not rely on the sys tenant. This option applies only to OceanBase Database of a version earlier than V4.0.0. | Optional | Optional | Optional |
| --sys-user | The user on which the import relies in the sys tenant. If this option is not specified, the default value root takes effect. This option applies only to OceanBase Database of a version earlier than V4.0.0. | Optional. This option is mutually exclusive with the --no-sys option. | Optional. This option is mutually exclusive with the --no-sys option. | Optional. This option is mutually exclusive with the --no-sys option. |
| --sys-password | The password of the user on which the import relies in the sys tenant. This option applies only to OceanBase Database of a version earlier than V4.0.0. | Optional. This option is mutually exclusive with the --no-sys option. | Optional. This option is mutually exclusive with the --no-sys option. | Optional. This option is mutually exclusive with the --no-sys option. |
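As an illustration of how the options in the table combine, the following sketch assembles them into an argument list while following the recommendation to keep --thread and --parallel consistent. The helper name is hypothetical, and the user, tenant, cluster, and port values are placeholders. A real OBLOADER invocation also needs options that this section does not cover, such as the host, database, password, and data file location.

```python
def build_bypass_args(user, sql_port, rpc_port, tenant, workers, cluster=None):
    """Assemble the bypass import options from the table as an argv fragment."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    args = [
        "--direct",                    # use bypass import
        "--rpc-port", str(rpc_port),   # data is transmitted over the RPC port
        "--parallel", str(workers),    # worker threads on the OBServer node
        "--thread", str(workers),      # client-side thread pool, kept consistent
        "-u", user,
        "-P", str(sql_port),
        "-t", tenant,
    ]
    # The cluster name is marked "-" for direct connections to an OBServer node.
    if cluster is not None:
        args += ["-c", cluster]
    return args


# Placeholder values for a connection through ODP.
args = build_bypass_args("test", 2883, 2885, "mysql", 8, cluster="cluster_a")
print(" ".join(args))
```

Building the options as a list rather than one string keeps each value a separate argument, which avoids quoting problems if the list is later passed to a process launcher.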
Bypass import parameters
You can configure bypass import parameters in the session.config.json file in the {ob-loader-dumper}/conf directory.
Here is an example:
"direct_path_load": {
    "rpc_connect_timeout": "15000",
    "rpc_execute_timeout": "20000",
    "runtime_retry_times": "5",
    "runtime_retry_intervals": "50",
    "task_timeout": "2592000000000",
    "heartbeat_timeout": "60000000"
}
- rpc_connect_timeout: the timeout period for creating an RPC connection, in ms.
- rpc_execute_timeout: the timeout period for executing an RPC request, in ms.
- runtime_retry_times: the maximum number of retries allowed. If an import operation fails, it is retried based on this parameter.
- runtime_retry_intervals: the retry interval, namely, the amount of time to wait before a retry is triggered, in ms.
- task_timeout: the timeout period of an import operation, in μs. If an import operation is not completed within the configured period, it is considered timed out. The default value is 0, which indicates that an import operation will not time out.
- heartbeat_timeout: the heartbeat timeout period, which is used to detect the active status of an import operation, in μs. The default value is 0, which specifies not to enable heartbeat detection.
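Because these parameters mix milliseconds and microseconds, it can help to convert the configured values into readable durations before relying on them. The sketch below parses the example values shown above. It wraps the fragment in braces only to make it valid JSON; where direct_path_load sits inside the real session.config.json file is an assumption, and the helper name is hypothetical.

```python
import json
from datetime import timedelta

# Units as described above: ms for the RPC timeouts and the retry interval,
# μs for the task and heartbeat timeouts.
UNITS = {
    "rpc_connect_timeout": "milliseconds",
    "rpc_execute_timeout": "milliseconds",
    "runtime_retry_intervals": "milliseconds",
    "task_timeout": "microseconds",
    "heartbeat_timeout": "microseconds",
}

SAMPLE = """
{
  "direct_path_load": {
    "rpc_connect_timeout": "15000",
    "rpc_execute_timeout": "20000",
    "runtime_retry_times": "5",
    "runtime_retry_intervals": "50",
    "task_timeout": "2592000000000",
    "heartbeat_timeout": "60000000"
  }
}
"""


def durations(config_text):
    """Map each time-based parameter to a timedelta using its documented unit."""
    section = json.loads(config_text)["direct_path_load"]
    return {
        name: timedelta(**{unit: int(section[name])})
        for name, unit in UNITS.items()
        if name in section
    }


for name, value in durations(SAMPLE).items():
    print(f"{name}: {value}")
```

With the example values, task_timeout works out to 30 days and heartbeat_timeout to 60 seconds. runtime_retry_times is a count rather than a duration, so it is left out of the conversion.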