• Canal: capturing MySQL incremental data and publishing it to RabbitMQ


    How canal works

    • canal relies on MySQL master-slave replication: it implements the MySQL slave interaction protocol, disguises itself as a MySQL slave, and sends a dump request to the MySQL master
    • the MySQL master receives the dump request and starts pushing the binary log to the slave (i.e., canal)
    • canal parses the binary log objects (originally a byte stream)

    The core logic can be found in the source code:

    the start method of com.alibaba.otter.canal.parse.inbound.AbstractEventParser.java
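
    Because canal registers itself like a replica, the connection is visible from the MySQL side. A quick check (standard MySQL commands, not from the original post):

    SHOW PROCESSLIST;
    -- canal's connection shows up with Command = 'Binlog Dump',
    -- the same thread type a real replica creates on the master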

    Enable binlog on MySQL

    Edit the my.ini file (on a Windows install, the default directory is C:\ProgramData\MySQL\MySQL Server 5.7) and add the lines below (note that MySQL 5.7 refuses to start with binlog enabled unless server-id is also set, e.g. server-id=1):

    log-bin=D:\mysql-binlog\mysql-bin
    binlog-format=ROW
    Check whether binlog is enabled:
    SHOW VARIABLES LIKE '%log_bin%';

    Check the latest binlog file and position:
    SHOW MASTER STATUS;

    Check the default retention period:
    SHOW VARIABLES LIKE 'expire_logs_days';

    If the value is 0, binlogs are never purged automatically.
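
    For example, to cap retention instead of keeping binlogs forever (a suggestion, not from the original post; expire_logs_days is a MySQL 5.7 setting):

    SET GLOBAL expire_logs_days = 7;  -- keep 7 days; also set it in my.ini so it survives restarts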

    Install and deploy canal-admin

    The canal-admin version is 1.1.8; the download URL is https://github.com/alibaba/canal/releases/download/canal-1.1.8-alpha-2/canal.admin-1.1.8-SNAPSHOT.tar.gz

    The JDK version is 11.0.24 (JDK 8 causes some problems, a pitfall I ran into before); it can be downloaded from the Oracle website

    Pick a MySQL database to act as canal-admin's backing database

    Run the canal_manager.sql script from the conf directory against that database
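
    For example, from the unpacked canal-admin directory (host and user here are placeholders):

    mysql -h127.0.0.1 -uroot -p < conf/canal_manager.sql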

    Edit the application.yml configuration file in the conf directory and fill in the database information

    Note: adminPasswd is the password for the service on port 11110; it will come up again when installing canal-deployer below
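
    For reference, a sketch of the fields to fill in, following the template that ships with canal-admin (the values below are placeholders):

    server:
      port: 8089

    spring.datasource:
      address: 127.0.0.1:3306
      database: canal_manager
      username: canal
      password: canal

    canal:
      adminUser: admin
      adminPasswd: admin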

    In the bin directory, run

    sh startup.sh

    The default port of the web service is 8089; visit ip:8089

    The default username/password is admin/123456; the password can be changed later

    At this point canal-admin is fully set up

    Install and deploy canal.deployer

    The canal-deployer version is 1.1.8; the download URL is

    https://github.com/alibaba/canal/releases/download/canal-1.1.8-alpha-2/canal.deployer-1.1.8-SNAPSHOT.tar.gz

    Edit canal.properties under the conf directory:

    #################################################
    #########               common argument         #############
    #################################################
    # tcp bind ip
    canal.ip = 172.18.10.27 // the local machine's IP
    # register ip to zookeeper
    canal.register.ip = 172.18.10.27 // the local machine's IP
    canal.port = 11111
    canal.metrics.pull.port = 11112
    # canal instance user/passwd
    # canal.user = canal
    # canal.passwd =

    # canal admin config
    canal.admin.manager = 172.18.10.27:8089 // address of the canal-admin service; canal-admin and canal-deployer are installed on the same server here
    canal.admin.port = 11110
    canal.admin.user = admin
    canal.admin.passwd = 4ACFE3202A5FF5CF467898FC58AAB1D615029441 // the adminPasswd set when installing canal-admin (here "admin"); the encrypted form is required: run select password("admin") in MySQL and drop the leading * (see the example after this file)
    # admin auto register
    #canal.admin.register.auto = true
    #canal.admin.register.cluster =
    #canal.admin.register.name =

    canal.zkServers = dubbo1.mysteel.local:2181,dubbo2.mysteel.local:2182,dubbo3.mysteel.local:2183 // the ZooKeeper servers
    # flush data to zk
    canal.zookeeper.flush.period = 1000
    canal.withoutNetty = false
    # tcp, kafka, rocketMQ, rabbitMQ, pulsarMQ
    canal.serverMode = rabbitMQ  // the MQ type to publish to; rabbitMQ is chosen here
    # flush meta cursor/parse position to file
    canal.file.data.dir = ${canal.conf.dir}
    canal.file.flush.period = 1000
    ## memory store RingBuffer size, should be Math.pow(2,n)
    canal.instance.memory.buffer.size = 16384
    ## memory store RingBuffer used memory unit size , default 1kb
    canal.instance.memory.buffer.memunit = 1024 
    ## memory store gets mode used MEMSIZE or ITEMSIZE
    canal.instance.memory.batch.mode = MEMSIZE
    canal.instance.memory.rawEntry = true

    ## detecing config
    canal.instance.detecting.enable = false
    #canal.instance.detecting.sql = insert into retl.xdual values(1,now()) on duplicate key update x=now()
    canal.instance.detecting.sql = select 1
    canal.instance.detecting.interval.time = 3
    canal.instance.detecting.retry.threshold = 3
    canal.instance.detecting.heartbeatHaEnable = false

    # support maximum transaction size, more than the size of the transaction will be cut into multiple transactions delivery
    canal.instance.transaction.size =  1024
    # mysql fallback connected to new master should fallback times
    canal.instance.fallbackIntervalInSeconds = 60

    # network config
    canal.instance.network.receiveBufferSize = 16384
    canal.instance.network.sendBufferSize = 16384
    canal.instance.network.soTimeout = 30

    # binlog filter config
    canal.instance.filter.druid.ddl = true
    canal.instance.filter.query.dcl = false
    canal.instance.filter.query.dml = false
    canal.instance.filter.query.ddl = false
    canal.instance.filter.table.error = false
    canal.instance.filter.rows = false
    canal.instance.filter.transaction.entry = false
    canal.instance.filter.dml.insert = false
    canal.instance.filter.dml.update = false
    canal.instance.filter.dml.delete = false

    # binlog format/image check
    canal.instance.binlog.format = ROW,STATEMENT,MIXED 
    canal.instance.binlog.image = FULL,MINIMAL,NOBLOB

    # binlog ddl isolation
    canal.instance.get.ddl.isolation = false

    # parallel parser config
    canal.instance.parser.parallel = true
    ## concurrent thread number, default 60% available processors, suggest not to exceed Runtime.getRuntime().availableProcessors()
    #canal.instance.parser.parallelThreadSize = 16
    ## disruptor ringbuffer size, must be power of 2
    canal.instance.parser.parallelBufferSize = 256

    # table meta tsdb info
    canal.instance.tsdb.enable = false
    canal.instance.tsdb.dir = ${canal.file.data.dir:../conf}/${canal.instance.destination:}
    canal.instance.tsdb.url = jdbc:h2:${canal.instance.tsdb.dir}/h2;CACHE_SIZE=1000;MODE=MYSQL;
    canal.instance.tsdb.dbUsername = canal
    canal.instance.tsdb.dbPassword = canal
    # dump snapshot interval, default 24 hour
    canal.instance.tsdb.snapshot.interval = 24
    # purge snapshot expire , default 360 hour(15 days)
    canal.instance.tsdb.snapshot.expire = 360

    #################################################
    #########               destinations            #############
    #################################################
    canal.destinations = example
    # conf root dir
    canal.conf.dir = ../conf
    # auto scan instance dir add/remove and start/stop instance
    canal.auto.scan = true
    canal.auto.scan.interval = 5
    # set this value to 'true' means that when binlog pos not found, skip to latest.
    # WARN: pls keep 'false' in production env, or if you know what you want.
    canal.auto.reset.latest.pos.mode = false

    canal.instance.tsdb.spring.xml = classpath:spring/tsdb/h2-tsdb.xml
    #canal.instance.tsdb.spring.xml = classpath:spring/tsdb/mysql-tsdb.xml

    canal.instance.global.mode = spring
    canal.instance.global.lazy = false
    canal.instance.global.manager.address = ${canal.admin.manager}
    #canal.instance.global.spring.xml = classpath:spring/memory-instance.xml
    canal.instance.global.spring.xml = classpath:spring/file-instance.xml
    #canal.instance.global.spring.xml = classpath:spring/default-instance.xml

    ##################################################
    #########             MQ Properties      #############
    ##################################################
    # aliyun ak/sk , support rds/mq
    canal.aliyun.accessKey =
    canal.aliyun.secretKey =
    canal.aliyun.uid=

    canal.mq.flatMessage = true
    canal.mq.canalBatchSize = 50
    canal.mq.canalGetTimeout = 100
    # Set this value to "cloud", if you want open message trace feature in aliyun.
    canal.mq.accessChannel = local

    canal.mq.database.hash = true
    canal.mq.send.thread.size = 30
    canal.mq.build.thread.size = 8

    ##################################################
    #########                    Kafka                   #############
    ##################################################
    kafka.bootstrap.servers = 127.0.0.1:9092
    kafka.acks = all
    kafka.compression.type = none
    kafka.batch.size = 16384
    kafka.linger.ms = 1
    kafka.max.request.size = 1048576
    kafka.buffer.memory = 33554432
    kafka.max.in.flight.requests.per.connection = 1
    kafka.retries = 0

    kafka.kerberos.enable = false
    kafka.kerberos.krb5.file = ../conf/kerberos/krb5.conf
    kafka.kerberos.jaas.file = ../conf/kerberos/jaas.conf

    # sasl demo
    # kafka.sasl.jaas.config = org.apache.kafka.common.security.scram.ScramLoginModule required \\n username=\"alice\" \\npassword=\"alice-secret\";
    # kafka.sasl.mechanism = SCRAM-SHA-512
    # kafka.security.protocol = SASL_PLAINTEXT

    ##################################################
    #########                   RocketMQ         #############
    ##################################################
    rocketmq.producer.group = test
    rocketmq.enable.message.trace = false
    rocketmq.customized.trace.topic =
    rocketmq.namespace =
    rocketmq.namesrv.addr = 127.0.0.1:9876
    rocketmq.retry.times.when.send.failed = 0
    rocketmq.vip.channel.enabled = false
    rocketmq.tag = 

    ##################################################
    #########                   RabbitMQ         #############
    ##################################################
    rabbitmq.host = cyhlw01.mq.mysteel.local:5672
    rabbitmq.virtual.host = /mysteel
    rabbitmq.exchange = exchange_ebc_canal
    rabbitmq.username = mysteeladmin
    rabbitmq.password = mysteeladmin0411
    rabbitmq.queue = queue_ebc_canal
    rabbitmq.routingKey = queue_ebc_canal
    rabbitmq.deliveryMode =

    // the RabbitMQ connection info used for publishing is configured in the block above
    ##################################################
    #########                     Pulsar         #############
    ##################################################
    pulsarmq.serverUrl =
    pulsarmq.roleToken =
    pulsarmq.topicTenantPrefix =

    Note: because canal.serverMode = rabbitMQ, the file plugin/connector.rabbitmq-1.1.8-SNAPSHOT-jar-with-dependencies.jar must be copied into the lib directory, otherwise problems will occur
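
    The encrypted admin password mentioned above can be generated in MySQL 5.7 (the PASSWORD() function no longer exists in MySQL 8.0):

    SELECT PASSWORD('admin');
    -- returns *4ACFE3202A5FF5CF467898FC58AAB1D615029441
    -- drop the leading * and use the rest as canal.admin.passwd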

    In the bin directory, run

    sh startup.sh

    Sometimes on failure the window flashes and closes immediately; add pause at the end of the startup script to keep it open and make troubleshooting easier

    The configuration can also be modified online (in the canal-admin web UI); changes take effect once saved

    Manually adding an instance in canal.deployer

    The instance must be configured in the web UI: create a new instance, select the corresponding canal-server, and in its config file add the data source to watch and the watch rules:

    #################################################
    ## mysql serverId , v1.0.26+ will autoGen
    # canal.instance.mysql.slaveId=0

    # enable gtid use true/false
    canal.instance.gtidon=false

    # position info
    canal.instance.master.address=app03.db.mysteel.local:3306  // the source database to watch
    canal.instance.master.journal.name=mysql-bin.000025 // the source's binlog file to start from
    canal.instance.master.position=987721127 // the starting position within that binlog file
    canal.instance.master.timestamp=
    canal.instance.master.gtid=

    # rds oss binlog
    canal.instance.rds.accesskey=
    canal.instance.rds.secretkey=
    canal.instance.rds.instanceId=

    # table meta tsdb info
    canal.instance.tsdb.enable=false // set this to false, otherwise some errors will occur
    #canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_manager
    #canal.instance.tsdb.dbUsername=canal
    #canal.instance.tsdb.dbPassword=canal

    #canal.instance.standby.address =
    #canal.instance.standby.journal.name =
    #canal.instance.standby.position =
    #canal.instance.standby.timestamp =
    #canal.instance.standby.gtid=

    # username/password
    canal.instance.dbUsername=crm_canal  // username for the watched source
    canal.instance.dbPassword=4jRwhaDR5Q9h  // password for the watched source
    canal.instance.connectionCharset = UTF-8
    # enable druid Decrypt database password
    canal.instance.enableDruid=false
    #canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==

    # table regex
    canal.instance.filter.regex=dc_crm.72crm_crm_account,dc_crm.72crm_crm_contacts,dc_crm.72crm_crm_customer,dc_crm.crm_account_target,dc_crm.crm_product_package_ebc_frame,dc_crm.crm_dictionary,dc_crm.crm_product_module,dc_crm.crm_product_package_module,dc_crm.crm_product_package_indexcategory,dc_crm.crm_product_package,dc_crm.crm_account_product

    // filter rules for the watched source; separate multiple rules with ","; e.g. to watch several tables of one database as above: dc_crm.72crm_crm_account,dc_crm.72crm_crm_contacts
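    # (annotation, not part of the original file) common filter.regex patterns:
    #   .*\\..*                 -> all tables in all schemas
    #   dc_crm\\..*             -> every table in the dc_crm schema
    #   dc_crm.crm_account_.*   -> tables in dc_crm whose names start with crm_account_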

    # table black regex
    canal.instance.filter.black.regex=
    # table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
    #canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch
    # table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
    #canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch

    # mq config
    canal.mq.topic=queue_ebc_canal // the RabbitMQ routingKey
    # dynamic topic route by schema or table regex
    #canal.mq.dynamicTopic=mytest1.user,topic2:mytest2\\..*,.*\\..*
    canal.mq.partition=0
    # hash partition config
    #canal.mq.enableDynamicQueuePartition=false
    #canal.mq.partitionsNum=3
    #canal.mq.dynamicTopicPartitionNum=test.*:4,mycanal:6
    #canal.mq.partitionHash=test.table:id^name,.*\\..*
    #
    # multi stream for polardbx
    canal.instance.multi.stream.on=false
    #################################################

    Once configuration changes are saved they take effect in real time; no restart is required.
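
    On the consuming side, since canal.mq.flatMessage = true, each RabbitMQ message body is one FlatMessage JSON (fields such as database, table, type, and data). A minimal consumer sketch using the standard com.rabbitmq:amqp-client library, with the queue and connection values taken from the config above:

    import com.rabbitmq.client.*;
    import java.nio.charset.StandardCharsets;

    public class CanalQueueConsumer {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("cyhlw01.mq.mysteel.local");
            factory.setPort(5672);
            factory.setVirtualHost("/mysteel");
            factory.setUsername("mysteeladmin");
            factory.setPassword("mysteeladmin0411");

            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();

            // queue_ebc_canal is the rabbitmq.queue configured in canal.properties
            channel.basicConsume("queue_ebc_canal", false, (consumerTag, message) -> {
                String json = new String(message.getBody(), StandardCharsets.UTF_8);
                // one FlatMessage per delivery, e.g. {"database":"dc_crm","table":"...","type":"UPDATE","data":[...]}
                System.out.println(json);
                channel.basicAck(message.getEnvelope().getDeliveryTag(), false);
            }, consumerTag -> { });
        }
    }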

    Problems encountered in practice

    1. The watched MySQL instance was reinstalled, or its binlog files were deleted: how do you reset the binlog file and position tracked by canal?

    The latest binlog position information is kept in conf/instance/meta.dat:

    {
        "clientDatas": [
            {
                "clientIdentity": {
                    "clientId": 1001,
                    "destination": "instance",
                    "filter": ""
                },
                "cursor": {
                    "identity": {
                        "slaveId": -1,
                        "sourceAddress": {
                            "address": "app03.db.mysteel.local",
                            "port": 3306
                        }
                    },
                    "postion": {
                        "gtid": "",
                        "included": false,
                        "journalName": "mysql-bin.000049",
                        "position": 973390906,
                        "serverId": 1810141,
                        "timestamp": 1722923225000
                    }
                }
            }
        ],
        "destination": "instance"
    }

    Only journalName, position, and timestamp need to be edited. (Note: the misspelled "postion" key is canal's own field name; leave it as-is.)
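
    For example (with canal stopped first), read the new coordinates off the rebuilt source:

    SHOW MASTER STATUS;
    -- e.g. File: mysql-bin.000001, Position: 154

    then set "journalName" and "position" in meta.dat to those values and restart canal.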

  • Original post: https://blog.csdn.net/beiduofen2011/article/details/140950079