Use Canal to synchronize data from a MySQL database to OceanBase Database

2023-12-25 08:49:40  Updated

Canal is an open-source product of Alibaba. It provides subscription and consumption of incremental data based on the parsing of incremental logs in MySQL databases.

This topic describes how to use the canal.deployer and canal.adapter components of Canal to synchronize data from a MySQL database to OceanBase Database.

Architecture

  • canal.deployer: the Canal component that serves as a server and converts the binlogs to CanalEntry data.

  • canal.adapter: the client adapter of Canal. It parses the CanalEntry data and synchronizes the incremental changes to the destination database.

Procedure

Step 1: Complete MySQL database settings

  1. Modify the my.cnf configuration file of the MySQL database.

    The my.cnf file is stored in the /etc/ directory. Enable binlog write and set binlog-format to ROW. Then, restart the MySQL database for the configuration to take effect.

    For example:

    log-bin=mysql-bin # Enable the binlog.
    binlog-format=ROW # Set the binglog format to ROW.
    server_id=1 # The server ID must be defined for the configuration of MySQL replication. It must be different from the slaveId of Canal.
    
  2. Create a MySQL database user.

    Create a user for connecting to the MySQL database. Set the username to canal and password to ******. Grant the read and write privileges on all databases to the canal user.

    For example:

    MySQL [(none)]> CREATE USER 'canal'@'%' IDENTIFIED BY '******';
    Query OK, 0 rows affected
    
    MySQL [(none)]> GRANT SELECT,REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'canal'@'%';
    Query OK, 0 rows affected
    
    MySQL [(none)]> FLUSH PRIVILEGES;
    Query OK, 0 rows affected
    
  3. Create a test database.

    Create a database named test_mysql_to_ob. Then, create the tbl1 and tbl2 tables and insert data in the tables.

    MySQL [(none)]> CREATE DATABASE test_mysql_to_ob;
    Query OK, 1 row affected
    
    MySQL [(none)]> USE test_mysql_to_ob;
    Database changed
    MySQL [test_mysql_to_ob]> CREATE TABLE tbl1(col1 INT PRIMARY KEY, col2 VARCHAR(20),col3 INT);
    Query OK, 0 rows affected
    
    MySQL [test_mysql_to_ob]> INSERT INTO tbl1 VALUES(1,'China',86),(2,'Taiwan',886),(3,'Hong Kong',852),(4,'Macao',853),(5,'North Korea',850);
    Query OK, 5 rows affected
    Records: 5  Duplicates: 0  Warnings: 0
    
    MySQL [test_mysql_to_ob]> CREATE TABLE tbl2(col1 INT PRIMARY KEY,col2 VARCHAR(20));
    Query OK, 0 rows affected
    
    MySQL [test_mysql_to_ob]> INSERT INTO tbl2 VALUES(86,'+86'),(886,'+886'),(852,'+852'),(853,'+853'),(850,'+850');
    Query OK, 5 rows affected
    Records: 5  Duplicates: 0  Warnings: 0
    

Step 2: Download and install Canal

  1. Download the software package from canal.deployer-1.1.5.tar.gz.

    wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
    
  2. Decompress the package to the /Canal_Home/canal directory.

    mkdir /Canal_Home/canal && tar zxvf canal.deployer-1.1.5.tar.gz  -C /Canal_Home/canal
    
  3. Modify the configuration file.

    The default configuration files of canal.deployer are conf/canal.properties and conf/example/instance.properties. By default, an instance named example is created. You must modify the database connection address, username, and password of the example instance. The following sample code shows you how to modify the configuration file: The canal.instance.connectionCharset parameter specifies the Java encoding type corresponding to the database encoding method. Valid values include UTF-8, GBK, and ISO-8859-1.

    For example:

    vi conf/example/instance.properties
    
    # mysql serverId
    canal.instance.mysql.slaveId = 1234
    # Specify the position information of your database.
    canal.instance.master.address = xxx.xxx.xxx.xxx:3306
    canal.instance.master.journal.name =
    canal.instance.master.position =
    canal.instance.master.timestamp =
    #canal.instance.standby.address =
    #canal.instance.standby.journal.name =
    #canal.instance.standby.position =
    #canal.instance.standby.timestamp =
    # Specify the username and password of your database.
    canal.instance.dbUsername = canal  
    canal.instance.dbPassword = ******
    canal.instance.defaultDatabaseName =
    canal.instance.connectionCharset = UTF-8
    #table regex
    canal.instance.filter.regex = .*\\..*
    
  4. Start the Canal server.

    cd /Canal_Home/canal && sh bin/startup.sh
    
  5. View server logs.

    cat logs/canal/canal.log
    
  6. View instance logs.

    tail -f logs/canal/canal.log
    tail -f logs/example/example.log
    
  7. Run the following command to stop the service.

    cd /Canal_Home/canal && sh bin/stop.sh
    

Step 3: Deploy the RDB adapter

canal.adapter supports multiple containers. For OceanBase Database, the canal.adapter uses the RDB module. The destination container is an OceanBase database.

  1. Download the software package from canal.adapter-1.1.5.tar.gz.

    wget https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz
    
  2. Decompress the package to the /Canal_Home/adapter directory.

    mkdir /Canal_Home/adapter && tar zxvf canal.adapter-1.1.5.tar.gz  -C /Canal_Home/adapter
    
  3. Modify the launcher configurations.

    Modify the application.yml file of the launcher in the conf/ directory. Set mode of the adapter source to tcp. Specify the canal.tcp properties, including the IP address and port of the Canal server and the username and password for database connection. Then, specify the connection information of the adapter destination. instance indicates the name of the source instance, which is specified during the deployment of Canal. key is a custom parameter and will be used later. jdbc indicates how the destination OceanBase database is connected to the source MySQL database. You can use the MySQL driver.

    For example:

    mode: tcp #tcp kafka rocketMQ rabbitMQ
    flatMessage: true
    zookeeperHosts:
    syncBatchSize: 1000
    retries: 0
    timeout:
    accessKey:
    secretKey:
    consumerProperties:
    # canal tcp consumer
        canal.tcp.server.host: 127.0.0.1:11111
        canal.tcp.zookeeper.hosts:
        canal.tcp.batch.size: 500
        canal.tcp.username: 
        canal.tcp.password:
    canalAdapters:
    - instance: example # canal instance Name or mq topic name
      groups:
      - groupId: g1
        outerAdapters:
        - name: logger
        - name: rdb
          key: test_mysql_to_ob
          properties:
            jdbc.driverClassName: com.mysql.jdbc.Driver
            jdbc.url: jdbc:mysql://xxx.xxx.xxx.xxx:2883/test_data?useUnicode=true
            jdbc.username: root@mysql001#test4000
            jdbc.password: ******  
    
  4. Modify the RDB mapping file.

    You need to modify the mytest_user.yml file in the conf/rdb/ directory. The destination parameter indicates the name of the Canal instance. The outerAdapterKey parameter is the user-defined key. The mirrorDb parameter indicates that the database level DDL and DML operations of the source database are mirrored to the destination database.

    Two mapping modes are supported: table mapping and database mapping. The following example shows how to configure database mapping. The configurations for table mapping are provided in the comments.

    [root@obce00 adapter]# cat conf/rdb/mytest_user.yml
    #dataSourceKey: defaultDS
    #destination: example
    #groupId: g1
    #outerAdapterKey: mysql1
    #concurrent: true
    #dbMapping:
    #  database: mytest
    #  table: user
    #  targetTable: mytest
    #  targetPk:
    #    id: id
    #  mapAll: true
    #  targetColumns:
    #    id:
    #    name:
    #    role_id:
    #    c_time:
    #    test1:
    #  etlCondition: "where c_time>={}"
    #  commitBatch: 3000 # The size of batch commit.
    # Mirror schema synchronize config
    dataSourceKey: defaultDS
    destination: example
    groupId: g1
    outerAdapterKey: test_mysql_to_ob
    concurrent: true
    dbMapping:
    mirrorDb: true
    database: test_data
    commitBatch: 1000
    
  5. Start RDB.

    Note

    If you use the driver of OceanBase Database for the destination database, place the driver package in the lib folder.

    Run the following command to start the canal-adapter launcher.

    cd /Canal_Home/adapter && sh bin/startup.sh
    
  6. Run the following command to view RDB logs.

    tail -f logs/adapter/adapter.log
    
  7. Run the following command to stop the service.

    cd /Canal_Home/adapter && bin/stop.sh
    
  8. Check data synchronization.

    Write data in the source MySQL database and check data synchronization in the destination OceanBase database.

Limitations

  • The source table must have a primary key. Otherwise, if a record is deleted from the source table, the synchronized table in the destination is fully deleted.

  • The CREATE TABLE and CREATE COLUMN DDL operations can be synchronized.

Contact Us