OceanBase Migration Service V4.2.13 (Community Edition)

Full/Incremental data migration performance optimization

Last Updated: 2026-04-16 07:09:24

What is on this page

Terms
Query metrics information
Diagnose Incr-Sync or Full-Import
workerNum
GC time is too long
View GC
Parameters related to batch accumulation
Information required for latency

Terms

  • TPS

    The number of messages retrieved per second at the source.

  • Latency

    The latency of the current task, in seconds. The latency calculation does not include the safe point.

  • ReadQ

    When data is transmitted asynchronously, the intermediate framework takes data from ReadQ and writes it to the destination. Data that the framework has not yet taken in the current process is cached in ReadQ. The default maximum size of ReadQ is 4096. A consistently small value indicates that the source may have no data or that data retrieval from the source is throttled.

  • WriteConsume

    The time taken to write one batch of {batch.size} records, in milliseconds. A lower WriteConsume value indicates better write performance at the destination.

Query metrics information

Run the ./connector_utils.sh metrics command to query metrics information.

./connector_utils.sh metrics

2022-09-22 12:49:48.876
SOURCE: [RPS:0.0, IOPS:0.0M, delay:1489ms]
SINK: [RPS:0.0, TPS:0.0, IOPS:0.0M, delay:2986440ms]
SINK_TIME: [execute_time:0.0ms/record, commit_time:0.0ms/batch]
SINK_SLOW_ROUTES:
SINK_THREAD: 4/4
DISPATCHER: wait record:0, ready batch:0, shardTime:nullms/record
forward_slot0 batchAccumulate: 0, recordAccumulate: 0
queue_slot1 batchAccumulate: 0, recordAccumulate: 0
heap:620M/1945M, noHeap:52M/488M, threadCount:18, cpu:0.054, sysCpu:51.555
ParNew(count:0, cost:0) ConcurrentMarkSweep(count:0, cost:0)

The information is described as follows:

  1. Source RPS, IOPS, and DELAY.

  2. Sink RPS, TPS (RecordBatch/s), IOPS, and DELAY.

  3. SINK_TIME: execute_time indicates the execution time per record, and commit_time indicates the execution time per recordBatch.

  4. SINK_SLOW_ROUTES: information about slowly executed SINK_ROUTES from the internal statistics. A SINK_ROUTE is a parallel writing unit, such as a partition in Kafka, a shard in DataHub, or a queue in RocketMQ.

  5. SINK_THREAD: the number of active sink threads out of the maximum number of sink threads. A small number of active sink threads indicates that the sink end is idle and has not reached a bottleneck.

  6. DISPATCHER: the state of the intermediate queue. wait record indicates the number of messages waiting to be allocated, and ready batch indicates the number of record batches waiting to be executed by the sink threads.

    If the number of wait records is large, the number of messages is large and garbage collection may be involved.

    If the number of ready batches is large, the sink end has a bottleneck. You can try to improve the sink write speed, for example, by increasing the number of threads.

  7. {Framework-Queue-Name} batchAccumulate: {number of accumulated recordBatches}, recordAccumulate: {number of accumulated records}.

    1. If batchAccumulate is empty in the first queue, no data has entered at the source end.

    2. If batchAccumulate is full in the last queue, it indicates that a bottleneck exists in RecordDispatcher (conflict matrix/hashing).

  8. Heap memory usage and maximum, non-heap memory usage and maximum, thread count, and process and system CPU usage.

  9. {Time} {youngGcName}(count:{Cumulative times}, cost:{Cumulative duration}) {fullGcName}(count:{Cumulative times}, cost:{Cumulative duration}).
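When checking snapshots repeatedly, it can help to script the extraction of a few figures. The following is a minimal sketch that pulls the heap usage and the young GC count out of a saved snapshot; the field layout is an assumption based on the sample output above, and the awk patterns are illustrative, not part of OMS.

```shell
# A saved metrics snapshot (the last two lines of the sample output above).
snapshot='heap:620M/1945M, noHeap:52M/488M, threadCount:18, cpu:0.054, sysCpu:51.555
ParNew(count:0, cost:0) ConcurrentMarkSweep(count:0, cost:0)'

# Heap line: split on ":" and ",", the second field is usage/maximum.
heap=$(printf '%s\n' "$snapshot" | awk -F'[:,]' '/^heap/ {print $2}')
echo "heap usage: $heap"

# GC line: split on parentheses, colons, commas, and spaces; the third
# field on the ParNew line is the cumulative young GC count.
young_gc=$(printf '%s\n' "$snapshot" | awk -F'[(),: ]+' '/ParNew/ {print $3}')
echo "young GC count: $young_gc"
```

Watching how the GC counts grow between two snapshots tells you whether collection pressure, rather than the source or sink, is the limiting factor.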

Diagnose Incr-Sync or Full-Import

  1. Obtain the component ID of Incr-Sync or Full-Import.

    1. Log in to the OMS Community Edition console.

    2. In the left-side navigation pane, click Data Migration.

    3. On the Migration Tasks page, click the name of the target data migration task to go to its details page.

    4. In the upper-right corner of the page, click View Component Monitoring.

    5. In the View Component Monitoring dialog box, view the Component ID of Incr-Sync Component or Full-Import Component.

  2. Go to the directory of the Incr-Sync or Full-Import component.

    1. Log in to the OMS Community Edition deployment server.

    2. Go to the Docker container.

      docker exec -it ${CONTAINER_NAME} bash
      
    3. Run the following command to enter the directory of the component:

      cd /home/ds/run/${Component ID}
      
  3. Run the ./connector_utils.sh diagnose command in the Incr-Sync or Full-Import directory.

    ./connector_utils.sh diagnose -s 'YYYY-MM-DDTHH:mm:ss' -e 'YYYY-MM-DDTHH:mm:ss'
    

    In this command, -s and -e are optional. -s specifies the start time and -e the end time of log analysis, in the format 'YYYY-MM-DDTHH:mm:ss' (for example, '2023-06-01T12:00:00').

    By default, ./connector_utils.sh diagnose analyzes the last 10 minutes of logs (the default value of -e is the current time).
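If you want an explicit window rather than the default, the timestamps can be generated with GNU date (available in typical Linux container environments). This sketch only prints the resulting command instead of running it, so the paths and timing are easy to check first:

```shell
# Build a 10-minute analysis window in the format diagnose expects.
# Assumes GNU date; run from the component directory before executing.
end=$(date '+%Y-%m-%dT%H:%M:%S')
start=$(date -d '10 minutes ago' '+%Y-%m-%dT%H:%M:%S')
echo "./connector_utils.sh diagnose -s '${start}' -e '${end}'"
```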

    The return result is as follows:

    [Metrics]
    TPS: [last:345,avg:277.28,p99:911.00]
    RPS: [last:106,avg:257.08,p99:968.00]
    IOPS: [last:2KB,avg:21.33KB]
    EXECUTE_TIME: [last:34ms,avg:220.44ms,p99:783.00ms]
    SINK_DELAY: [last:19ms,avg:260.31ms,p99:819.00ms]
    SOURCE_DELAY: [
    source_subtopic2_source_delay: [last:702ms,avg:525.00ms,p99:986.00ms]
    source_subtopic1_source_delay: [last:14ms,avg:490.69ms,p99:973.00ms]
    ]
    QUEUE_BATCH_ACCUMULATE: [
    frame_queue_slot_1.batchAccumulate: [last:420,avg:496.00,p99:975.00]
    frame_queue_slot_2.batchAccumulate: [last:310,avg:470.05,p99:975.00]
    ]
    JVM-MEM: heap:34.28M/3641M, noHeap:19.38M/0M]
    THREAD: [count:4, sink:14/16]
    CPU: [last:17,avg:27.95,p99:62.00]
    [Pref]
    sink block: true
    youngGc: true
    [Suggest]
    config[coordinator.shuffleMinBatchSize]:20 to 40
    config[coordinator.shuffleMaxBatchSize]:40 to 80
    jvm to: -Xmx4096m 
    

    The information is described as follows:

    • The metrics information is the basis for the diagnosis.

    • The pref information indicates the bottleneck points identified from the metrics information.

    • The suggest information indicates the optimization points. For example, you can update the shuffleMinBatchSize, shuffleMaxBatchSize, and connectorJvmParam parameters in the coordinator component of Incr-Sync or Full-Import.

workerNum

  • The value of workerNum has reached the upper limit, and the executeTime (execution time) and commitTime (commit time) in the sink section of the metrics logs are within the normal range.

    1. Go to the View Component Monitoring dialog box.

    2. Click Update next to the target component.

    3. In the Update Configuration dialog box, hover the pointer over the sink > workerNum parameter and click the edit icon.

      If the parameter does not exist, hover the pointer over the blank space next to the sink parameter and click the add icon.

      Note

      If you write data to the database by using the direct load mode, you can modify the serverParallel parameter to adjust the concurrency of the direct load server. The default value is 8.

    4. Increase the value of workerNum based on the machine resources.

    5. Enter the modified parameter in the text box and click the confirmation icon.

    6. In the Update Configuration dialog box, click OK.

  • The value of workerNum has not reached the upper limit, and the garbage collection (GC) time between two consecutive metrics logs is very long.

    1. Go to the View Component Monitoring dialog box.

    2. Click Update next to the target component.

    3. In the Update Configuration dialog box, hover the pointer over the source > splitThreshold parameter and click the edit icon.

      If the parameter does not exist, hover the pointer over the blank space next to the source parameter and click the add icon.

    4. The default value of the splitThreshold parameter is 128. Decrease the value.

    5. Enter the modified parameter in the text box and click the confirmation icon.

    6. In the Update Configuration dialog box, click OK.

  • The value of workerNum is only 1 or 2, and the conflictKey or deepSize keyword is printed in the connector.log file.

    1. Go to the View Component Monitoring dialog box.

    2. Click Update next to the target component.

    3. In the Update Configuration dialog box, hover the pointer over the blank space next to the coordinator parameter and click the add icon.

    4. Enter hotKeyMerge as the key name and click the checkmark icon.

    5. In the Update Configuration dialog box, find the new key name. The default value is NULL.

    6. Hover the pointer over the new parameter, click the edit icon that appears, and change the parameter value to true. Click the confirmation icon.

    7. In the Update Configuration dialog box, click OK.

GC time is too long

Note

"GC time is too long" means that the Young GC time exceeds 300 ms per second or that a Full GC occurs every second.

View GC

Run the following command in the task directory to view the details of GC per second:

/opt/alibaba/java/bin/jstat -gcutil `cat task.pid` 1s

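When scanning that output for Full GC activity, the FGC (count) and FGCT (total time) columns are the ones to watch. A sketch using a hard-coded sample line, assuming the Java 8 `jstat -gcutil` column layout (S0 S1 E O M CCS YGC YGCT FGC FGCT GCT):

```shell
# One sample line of `jstat -gcutil` output (illustrative values).
sample='  0.00  99.90  67.30  45.10  95.20  90.10   1234   12.5     3    0.8   13.3'

# FGC and FGCT are the 9th and 10th whitespace-separated columns.
fgc=$(printf '%s\n' "$sample" | awk '{print $9}')
fgct=$(printf '%s\n' "$sample" | awk '{print $10}')
echo "full GCs: $fgc, total full GC time: ${fgct}s"
```

If FGC increases on almost every 1-second sample, the task matches the "GC time is too long" condition above.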
  • Increase the JVM memory by setting the coordinator > connectorJvmParam parameter to -Xms12g -Xmx16g.

    This is only an example. You need to adjust the memory based on the current machine. If the -Xmn parameter already exists, you can remove it.

  • Reduce the coordinator > bridgeQueueSize parameter. The default value is 256, and it can be reduced to 32.

  • Send accumulated data to the sink immediately: set the sink > lingerMs parameter to 1.

  • Limit the memory by setting the coordinator > throttleMemoryBound parameter to the specified number of bytes. We recommend that you set this parameter to 1/4 of the maximum heap memory.

    For example, if the maximum heap memory is 16 GB, the value is 16 * 1024 * 1024 * 1024 / 4 = 4294967296.
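The quarter-of-heap value can be computed with shell arithmetic. For a 16 GB heap (-Xmx16g):

```shell
# throttleMemoryBound is specified in bytes; the recommendation above is
# one quarter of the maximum heap size.
max_heap_bytes=$((16 * 1024 * 1024 * 1024))
throttle_memory_bound=$((max_heap_bytes / 4))
echo "$throttle_memory_bound"   # 4294967296
```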

  • If the dispatcherClassName parameter is set to ShuffleRecordDispatcher in the conf/coordinator.json or conf_new/coordinator.json file, you can modify the following parameters of coordinator:

    • Set the maxRecordCapacity parameter, which specifies the total number of records in the dispatcher queue, to 1000. By default, it is calculated as shuffleMinBatchSize * (shuffleBucketSize * 1.5) = 3840.

    • Set the shuffleBucketSize parameter to 32. This reduces the number of batches that can be accumulated.

    • Set the shuffleFlushIntervalMs parameter to 10 to accelerate data pushing to the Sink.

  • Increase the sink > workerNum parameter. The default value is 16, and the maximum value can be adjusted to 64.
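Putting the ShuffleRecordDispatcher settings above together, a hypothetical excerpt of conf/coordinator.json might look like the following. The keys mirror the parameter names discussed above, but the structure of the real file may differ, so treat this purely as an illustration:

```shell
# Write an illustrative coordinator settings fragment (assumed key names).
cat <<'EOF' > /tmp/coordinator-shuffle-example.json
{
  "dispatcherClassName": "ShuffleRecordDispatcher",
  "maxRecordCapacity": 1000,
  "shuffleBucketSize": 32,
  "shuffleFlushIntervalMs": 10
}
EOF
cat /tmp/coordinator-shuffle-example.json
```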

Parameters related to batch accumulation

Notice

Parameters related to batch accumulation apply only to incremental synchronization links with non-database targets.

Based on the GC situation:

  1. When GC is not severe, you can increase the batch accumulation capability within the acceptable range at the destination.

  2. When GC is severe, reduce batch accumulation.

    • maxRecordCapacity: the maximum number of records in the batch accumulation queue. Default value: 16000.

    • shuffleMinBatchSize: the minimum number of records in a batch. Default value: 20.

    • shuffleMaxBatchSize: the maximum number of records in a batch. Default value: 64.

    • shuffleFlushIntervalMs: the flush interval. Default value: 100ms.

    • shuffleBucketMaxInFlightBatchSize: the maximum number of batches that can be in flight for each concurrency. For incremental synchronization, the default value is 1. For full synchronization, there is no limit.

    • shuffleBucketSize: the maximum number of concurrent batch accumulations.

Data is pushed to the Sink only when the conditions of shuffleMaxBatchSize or shuffleFlushIntervalMs are met, provided that the write thread has write capability.

Information required for latency

  • Screenshots of the task latency, Store latency, and Incr-Sync latency.

    A data migration or data synchronization task involves several latency concepts:

    1. Task latency: the Incr-Sync latency serves as the task latency for a data migration or data synchronization task. If multiple Incr-Sync latencies exist, the maximum one is used. The task latency and the component latency are calculated from different scheduling storage, so they may differ; for example, a large number of tasks or a long scheduling time can make the task latency greater than the component latency. This only reflects a difference in latency timing.

    2. Store latency: the time difference between the current time and the time when the Store component fetched change records. It is calculated by polling, typically every 10 to 30 seconds.

    3. Incr-Sync latency: the time difference between the current time and the minimum change time of the records written to the destination. It is calculated by polling; the polling interval is the time shown in parentheses.
  • The metrics logs of the Incr-Sync component. For more information, see the section about how to query the metrics information.

  • We recommend that you package the logs and conf directories of the Incr-Sync component and provide them.
