Performance troubleshooting steps
Use the following methods to determine whether the Store component has performance bottlenecks.
Query the number of records processed per second from the CDC log.
grep NEXT_RECORD_RPS libobcdc.log
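To observe the trend rather than a single sample, you can tail the most recent matches (a minimal sketch; the fields surrounding the NEXT_RECORD_RPS keyword may vary across libobcdc versions):

# Show the 20 most recent RPS samples to check whether the rate drops over time
grep NEXT_RECORD_RPS libobcdc.log | tail -n 20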
If the CDC processing speed is slower than the rate at which business data is generated on the source, run the following command to check whether the CDC process has triggered traffic control on the OMS Community Edition server.
grep "NEED_SLOW_DOWN=1 PAUSED=1" libobcdc.logNEED_SLOW_DOWN=1indicates that the traffic control is triggered because the memory usage is high, which limits the log pulling efficiency. CDC is paused to avoid further increasing the system pressure when the traffic control is triggered due to issues such as I/O or server load.You can modify the
memory_limitparameter to adjust the throttling threshold. View the current value in the/home/ds/store/store{port}/etc/libobcdc.conffile and increase the parameter value if necessary. Here is an example:liboblog.memory_limit=20G liboblog.part_trans_task_active_count_upper_bound=500000If the traffic control is not triggered, query the logs for CLOG pulling to check the RPC latency.
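Before editing the file, you can check the values currently in effect (a minimal sketch; replace {port} with the actual Store port as in the path above):

# Inspect the current throttling-related settings in the Store configuration
grep -E "memory_limit|part_trans_task_active_count_upper_bound" /home/ds/store/store{port}/etc/libobcdc.conf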
If traffic control is not triggered, query the CLOG pulling logs to check the RPC latency.

grep do_stat libobcdc.log

[2025-04-21 16:05:13.905681] INFO [TLOG.FETCHER] do_stat (ob_log_ls_fetch_stream.cpp:309) [20155][][T0][Y0-0000000000000000-0-0] [lt=9] [STAT] [FETCH_STREAM] stream="xxx.xxx.xxx.1:2882"(0x7fa62d4131f0:HOT)({tenant_id:1028, ls_id:{id:1002}})(FETCHED_LOG:153.11GB) traffic=41.85MB/sec log_size=438879806 size/rpc=13.50MB log_cnt/rpc=946 rpc_cnt=31(3/sec) single_rpc=0(0/sec)(upper_limit=0(0/sec),max_log=0(0/sec),no_log=0(0/sec),max_result=0(0/sec)) rpc_time=312357 svr_time=(queue=41,process=224677) net_time=(l2s=1146,s2l=83859) cb_time=2632 handle_rpc_time=13739 flush_time=860 read_log_time=12870(log_entry=2600,trans=0) trans_count=0 trans_size=0.00B

In this log, rpc_time=312357 svr_time=(queue=41,process=224677) indicates that the RPC latency is 312 ms and that the server spent about 224 ms processing the RPC (the values are in microseconds). The RPC latency is normally only several tens of milliseconds, so this value is excessively high. In this case, query the OBServer logs and adjust the relevant parameters.

Keywords in the OBServer log:
fetch_log done. This log line prints the statistics of log pulling. If the value of fetch_archive_time in this line is not 0, some logs are being fetched from the archive rather than from the local CLOG disk; increase the value of log_disk_size to expand the storage space for CLOG.
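To locate the relevant entries quickly, you can filter the OBServer log for statistics lines whose fetch_archive_time is not 0 (a hedged sketch; the log file name and field layout may differ across OceanBase versions):

# List fetch_log statistics, skipping entries where fetch_archive_time=0
grep "fetch_log done" observer.log | grep -v "fetch_archive_time=0"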
After the Store component is ruled out as the cause, check the performance-related parameters of the Full-Import/Incr-Sync component.
Performance-related configurations for Full-Import/Incr-Sync components
Setting useSchemaCache to true in the Source is sufficient for most scenarios. If the required records per second (RPS) is still not met, you can also set buildRecordConcurrent to true.
Source
| Parameter | Description |
|---|---|
| useSchemaCache | Specifies whether to cache the schema. Valid values: true and false. Default value: false. If you set this parameter to true, the Store component caches the schema when reading data, which accelerates the message conversion of the Store. |
| buildRecordConcurrent | Specifies whether to convert Store messages asynchronously. Valid values: true and false. If you set this parameter to true, pulling data from the Store and converting messages are performed in parallel. The number of parallel threads equals workerNum. |
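To verify which of these switches the incremental component is actually running with, you can check the effective configuration (a minimal sketch; conf/runningConf.json is the effective-configuration file shown later in this section):

# Check the effective Source tuning parameters
grep -E "useSchemaCache|buildRecordConcurrent|workerNum" conf/runningConf.json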
Sink
The first two parameters in the following table configure the properties of the Kafka producer client.
| OMS Community Edition parameter | Corresponding Kafka client parameter | Description |
|---|---|---|
| lingerMs | ProducerConfig.LINGER_MS_CONFIG | The time the Kafka producer waits before sending a batch of data. To increase throughput, increase this value so that more data accumulates in each batch. Default value: 10. Unit: milliseconds. |
| batchSize | ProducerConfig.BATCH_SIZE_CONFIG | The maximum size of each batch sent by the Kafka client. Default value: 1048576 bytes (1 MB). |
| workerNum | N/A | The number of concurrent worker threads of the Sink. Default value: 16. |
If enablePreprocessConfig is set to true in the coordinator, lingerMs and batchSize are automatically configured based on the JVM memory. If you configure these two parameters manually, your configuration takes precedence.
# View the automatically configured parameters
grep "auto set " connector.log
# View the configurations finally used by the system
cat conf/runningConf.json
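For reference, here is how the two OMS parameters map onto standard Kafka producer client properties, using the default values from the table above (an illustrative sketch; OMS sets these on the producer for you, so you normally tune lingerMs and batchSize rather than Kafka properties directly):

# Kafka producer properties corresponding to the OMS parameters
linger.ms=10         # ProducerConfig.LINGER_MS_CONFIG, set via lingerMs
batch.size=1048576   # ProducerConfig.BATCH_SIZE_CONFIG, set via batchSize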
Coordinator
Shuffle-related configurations of OMS Community Edition
| Parameter | Description |
|---|---|
| shuffleBucketSize | The number of buckets. OMS Community Edition reads and sends one batch of data from a bucket, then reads and sends the next batch. The number of buckets determines how many records can be sent concurrently. Default value: 128. |
| shuffleFlushIntervalMs | The interval at which bucketed data is read. The smaller the interval, the lower the latency. Unit: milliseconds. Default value: 100. |
| shuffleMinBatchSize | A bucket is read and sent only after the number of records in it reaches the value of this parameter. If a bucket holds fewer records, the system waits for shuffleFlushIntervalMs and then reads and sends them anyway. Default value: 20. |
| shuffleMaxBatchSize | The maximum number of records read and sent at a time. Default value: 64. |
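With the default values, you can estimate a rough throughput ceiling: each flush cycle reads at most shuffleBucketSize × shuffleMaxBatchSize = 128 × 64 = 8,192 records, and shuffleFlushIntervalMs = 100 yields about 10 cycles per second, or roughly 82,000 records per second. This is only an estimate that assumes each bucket is read once per interval; actual throughput also depends on Sink concurrency and Kafka batching.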
Use Arthas for performance analysis
# Log in to the OMS Community Edition container
# Copy the Arthas package to the home directory of the ds user
cp /root/arthas-bin.zip /home/ds
# Switch to the ds user and unpack Arthas
su - ds
unzip arthas-bin.zip
# Replace <pid> with the process ID of the incremental component
/opt/alibaba/java/bin/java -jar arthas-boot.jar <pid>
# Start sampling (CPU profiling by default)
profiler start
# Check the number of samples collected so far
profiler getSamples
# Check the profiler status and how long it has been running
profiler status
# Enter the stop command after waiting for 1 minute. This will generate an HTML file containing flame graphs.
profiler stop --format html
# Exit Arthas.
exit
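After profiler stop completes, Arthas typically writes the flame graph HTML file to the arthas-output directory under its working directory. Open the file in a browser; the widest frames indicate the methods that consume the most CPU time.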