• Real-Time Sync of MongoDB Data to Elasticsearch with Monstache


    The versions I installed:
    es 7.1.1
    go 1.16.6
    monstache 6.7.10
    mongo 3.6.18

    1. gcc environment

    Install gcc:

    yum -y install gcc gcc-c++
    yum -y install glibc-static
    yum -y install binutils
    

    2. Install Go

    Monstache depends on Go, so a Go environment must be set up on the ECS instance first.

    (1) Download Go and extract it

    wget https://dl.google.com/go/go1.16.6.linux-amd64.tar.gz
    tar -C /usr/local -xzf go1.16.6.linux-amd64.tar.gz
    

    (2) Configure environment variables
    Run vi /etc/profile and add the following lines to the file:

    export GOROOT=/usr/local/go
    export GOPATH=/home/go/
    export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
    export GOPROXY=https://mirrors.aliyun.com/goproxy/
    

    (3) Apply the environment variable settings

    source /etc/profile
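
To confirm that the variables took effect in the current shell, a quick check along the following lines can help. This is only a sketch: `require_cmd` is a hypothetical helper name introduced here, not part of Go or Monstache.

```shell
#!/bin/sh
# require_cmd is a hypothetical helper: it succeeds if the command is on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if require_cmd go; then
    go version               # should report go1.16.6 for the setup in this post
    go env GOPATH GOPROXY    # should echo the values exported in /etc/profile
else
    echo "go not found on PATH; try running: source /etc/profile" >&2
fi
```

If `go` is still not found, the usual cause is a new terminal session that has not yet loaded /etc/profile.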
    

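    3. Install Monstache

    # Go to the install path
    cd /usr/local/
    # Download the source
    git clone https://github.com/rwynn/monstache.git
    # Enter the monstache directory
    cd monstache
    # Switch to the release 6 branch
    git checkout rel6
    # Build and install Monstache
    go install
    # Check the Monstache version (6.7.10 expected here)
    monstache -v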
    

    4. Configure the real-time sync task

    cd /usr/local/monstache/
    

    Create a config.toml file and edit it.
    Five settings must be changed (marked 1 to 5 in the comments); adjust the rest to suit your own setup.

    # connection settings
    
    # connect to MongoDB using the following URL (1: change this to match your setup)
    mongo-url = "mongodb://192.168.xxx.xxx:27017/"
    # connect to the Elasticsearch REST API at the following node URLs (2: change this to match your setup)
    elasticsearch-urls = ["http://192.168.xxx.xxx:9200"]
    
    # frequently required settings
    
    # if you need to seed an index from a collection and not just listen and sync changes events
    # you can copy entire collections or views from MongoDB to Elasticsearch (3: change this to match your setup)
    direct-read-namespaces = ["db.col"]
    
    # if you want to use MongoDB change streams instead of legacy oplog tailing use change-stream-namespaces
    # change streams require at least MongoDB API 3.6+
    # if you have MongoDB 4+ you can listen for changes to an entire database or entire deployment
    # in this case you usually don't need regexes in your config to filter collections unless you target the deployment.
    # to listen to an entire db use only the database name.  For a deployment use an empty string. (4: change this to match your setup)
    change-stream-namespaces = ["db.col"]
    
    # additional settings
    
    # if you don't want to listen for changes to all collections in MongoDB but only a few
    # e.g. only listen for inserts, updates, deletes, and drops from mydb.mycollection
    # this setting does not initiate a copy, it is only a filter on the change event listener (5: change this to match your setup)
    namespace-regex = '^db\.col$'
    # compress requests to Elasticsearch
    gzip = true
    # generate indexing statistics
    stats = true
    # index statistics into Elasticsearch
    index-stats = true
    # use the following user name for Elasticsearch basic auth
    elasticsearch-user = "elastic"
    # use the following password for Elasticsearch basic auth
    #elasticsearch-password = "somepassword"
    # use 4 go routines concurrently pushing documents to Elasticsearch
    elasticsearch-max-conns = 4 
    # use the following PEM file for connections to Elasticsearch
    #elasticsearch-pem-file = "/path/to/elasticCert.pem"
    # validate connections to Elasticsearch
    #elastic-validate-pem-file = false
    # propagate dropped collections in MongoDB as index deletes in Elasticsearch
    dropped-collections = true
    # propagate dropped databases in MongoDB as index deletes in Elasticsearch
    dropped-databases = true
    # do not start processing at the beginning of the MongoDB oplog
    # if you set the replay to true you may see version conflict messages
    # in the log if you had synced previously. This just means that you are replaying old docs which are already
    # in Elasticsearch with a newer version. Elasticsearch is preventing the old docs from overwriting new ones.
    replay = false
    # resume processing from a timestamp saved in a previous run
    resume = true
    # do not validate that progress timestamps have been saved
    resume-write-unsafe = false
    # override the name under which resume state is saved
    resume-name = "default"
    # use a custom resume strategy (tokens) instead of the default strategy (timestamps)
    # tokens work with MongoDB API 3.6+ while timestamps work only with MongoDB API 4.0+
    resume-strategy = 1
    # exclude documents whose namespace matches the following pattern
    #namespace-exclude-regex = '^mydb\.ignorecollection$'
    # turn on indexing of GridFS file content
    index-files = true
    # turn on search result highlighting of GridFS content
    file-highlighting = true
    # index GridFS files inserted into the following collections
    file-namespaces = ["users.fs.files"]
    # print detailed information including request traces
    verbose = true
    # enable clustering mode
    #cluster-name = 'apollo'
    # do not exit after full-sync, rather continue tailing the oplog
    exit-after-direct-reads = false
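
Since only five of the settings above must be changed, it can be handy to generate a stripped-down config containing just those five and leave everything else at Monstache's defaults. The sketch below does that; `write_min_config` is a hypothetical helper name, and the argument values are placeholders to replace with your own. Options such as resume or stats are deliberately left out here, so this is a starting point rather than a replacement for the full file.

```shell
#!/bin/sh
# write_min_config is a hypothetical helper that writes only the five
# must-change settings; every other option stays at Monstache's default.
# Usage: write_min_config <mongo-url> <es-url> <db.collection> <output-file>
write_min_config() {
    mongo_url=$1
    es_url=$2
    ns=$3
    out=$4
    # Escape dots so the namespace is matched literally by the regex.
    ns_regex=$(printf '%s' "$ns" | sed 's/\./\\./g')
    cat > "$out" <<EOF
mongo-url = "${mongo_url}"
elasticsearch-urls = ["${es_url}"]
direct-read-namespaces = ["${ns}"]
change-stream-namespaces = ["${ns}"]
namespace-regex = '^${ns_regex}\$'
EOF
}
```

For example, `write_min_config "mongodb://192.168.xxx.xxx:27017/" "http://192.168.xxx.xxx:9200" "db.col" config.min.toml` writes the five lines to config.min.toml without touching the full config.toml.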
    

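    Run Monstache

    # -f points Monstache at the config file. It runs in the foreground, and with verbose = true in config.toml it prints detailed logs (including request traces to Elasticsearch).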
    monstache -f config.toml
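
Once Monstache is running, one way to check that documents are arriving is to ask Elasticsearch for the document count of the target index. The sketch below assumes Monstache's default index naming (the index is named after the MongoDB namespace, here `db.col`) and no custom index mapping; `es_count_url` and the `ES_URL` variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# es_count_url is a hypothetical helper: builds the _count URL for an index,
# dropping a trailing slash from the base URL if present.
es_count_url() {
    printf '%s/%s/_count\n' "${1%/}" "$2"
}

# Only query when ES_URL is provided, e.g. ES_URL=http://192.168.xxx.xxx:9200
ES_URL=${ES_URL:-}
if [ -n "$ES_URL" ]; then
    # Document count of the index assumed to mirror the db.col namespace
    curl -s "$(es_count_url "$ES_URL" db.col)"
    echo
    # List all indices to see what Monstache has created
    curl -s "${ES_URL%/}/_cat/indices?v"
fi
```

Comparing the returned count with the size of the MongoDB collection gives a rough sanity check. If Elasticsearch has basic auth enabled, curl would also need the `-u` option with your credentials.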
    


  • Original article: https://blog.csdn.net/weixin_45753881/article/details/126017194