• 尚硅谷 Big Data Project "Real-Time Online-Education Data Warehouse" (《在线教育之实时数仓》), Notes 002


    Video: 尚硅谷大数据项目《在线教育之实时数仓》 on Bilibili

    Contents

    Chapter 06: Data Warehouse Environment Setup

    P006

    P007

    P008

    P009

    P010

    P011

    P012

    P013

    P014

    1. User behavior logs (topic_log)

    2. Business data (topic_db)


    Chapter 06: Data Warehouse Environment Setup

    P006

    P007

    P008

    http://node001:16010/master-status

    [atguigu@node001 ~]$ start-hbase.sh

    [atguigu@node001 ~]$ start-hbase.sh
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/module/hbase/hbase-2.0.5/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/module/hadoop/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    running master, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-master-node001.out
    node002: running regionserver, logging to /opt/module/hbase/hbase-2.0.5/bin/../logs/hbase-atguigu-regionserver-node002.out
    node003: running regionserver, logging to /opt/module/hbase/hbase-2.0.5/bin/../logs/hbase-atguigu-regionserver-node003.out
    node001: running regionserver, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-regionserver-node001.out
    [atguigu@node001 ~]$ jpsall
    ================ node001 ================
    3041 DataNode
    6579 HMaster
    2869 NameNode
    3447 NodeManager
    6778 HRegionServer
    6940 Jps
    3646 JobHistoryServer
    3806 QuorumPeerMain
    ================ node002 ================
    1746 DataNode
    3289 HRegionServer
    2074 NodeManager
    2444 QuorumPeerMain
    1949 ResourceManager
    3471 Jps
    ================ node003 ================
    2240 QuorumPeerMain
    1938 SecondaryNameNode
    3138 HRegionServer
    1842 DataNode
    2070 NodeManager
    3338 Jps
    [atguigu@node001 ~]$
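    jpsall above is not a stock Hadoop or JDK command; it is a small helper script distributed with the course. A minimal sketch of what it does (the host names come from this cluster; the script body itself is an assumption, not the course's original):

```shell
# jpsall: run jps on every node over ssh (sketch of the custom helper).
jpsall() {
    local runner=${JPSALL_RUNNER:-ssh}   # indirection so the loop can be exercised without a cluster
    local host
    for host in node001 node002 node003; do
        echo "================ ${host} ================"
        "$runner" "$host" jps
    done
}
# On the cluster, simply run: jpsall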

    P009

    [atguigu@node001 conf]$ cd /opt/module/hbase/apache-phoenix-5.0.0-HBase-2.0-bin
    [atguigu@node001 apache-phoenix-5.0.0-HBase-2.0-bin]$ bin/sqlline.py node001,node002,node003:2181

    [atguigu@node001 ~]$ start-hbase.sh

    [atguigu@node001 ~]$ /opt/module/hbase/apache-phoenix-5.0.0-HBase-2.0-bin/bin/sqlline.py node001,node002,node003:2181

    [atguigu@node001 apache-phoenix-5.0.0-HBase-2.0-bin]$ start-hbase.sh
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/module/hbase/hbase-2.0.5/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/module/hadoop/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    running master, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-master-node001.out
    node002: running regionserver, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-regionserver-node002.out
    node003: running regionserver, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-regionserver-node003.out
    node001: running regionserver, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-regionserver-node001.out
    node002: running master, logging to /opt/module/hbase/hbase-2.0.5/logs/hbase-atguigu-master-node002.out
    [atguigu@node001 apache-phoenix-5.0.0-HBase-2.0-bin]$ bin/sqlline.py node001,node002,node003:2181
    Setting property: [incremental, false]
    Setting property: [isolation, TRANSACTION_READ_COMMITTED]
    issuing: !connect jdbc:phoenix:node001,node002,node003:2181 none none org.apache.phoenix.jdbc.PhoenixDriver
    Connecting to jdbc:phoenix:node001,node002,node003:2181
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/module/hbase/apache-phoenix-5.0.0-HBase-2.0-bin/phoenix-5.0.0-HBase-2.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/module/hadoop/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    23/09/12 10:52:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Connected to: Phoenix (version 5.0)
    Driver: PhoenixEmbeddedDriver (version 5.0)
    Autocommit status: true
    Transaction isolation: TRANSACTION_READ_COMMITTED
    Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
    133/133 (100%) Done
    Done
    sqlline version 1.2.0
    0: jdbc:phoenix:node001,node002,node003:2181> !table
    +------------+--------------+-------------+---------------+----------+------------+----------------------------+-----------------+--------------+--------------+
    | TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME  | TABLE_TYPE    | REMARKS  | TYPE_NAME  | SELF_REFERENCING_COL_NAME  | REF_GENERATION  | INDEX_STATE  | IMMUTABLE_RO |
    +------------+--------------+-------------+---------------+----------+------------+----------------------------+-----------------+--------------+--------------+
    |            | SYSTEM       | CATALOG     | SYSTEM TABLE  |          |            |                            |                 |              | false        |
    |            | SYSTEM       | FUNCTION    | SYSTEM TABLE  |          |            |                            |                 |              | false        |
    |            | SYSTEM       | LOG         | SYSTEM TABLE  |          |            |                            |                 |              | true         |
    |            | SYSTEM       | SEQUENCE    | SYSTEM TABLE  |          |            |                            |                 |              | false        |
    |            | SYSTEM       | STATS       | SYSTEM TABLE  |          |            |                            |                 |              | false        |
    +------------+--------------+-------------+---------------+----------+------------+----------------------------+-----------------+--------------+--------------+
    0: jdbc:phoenix:node001,node002,node003:2181>
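    Once sqlline connects as above, a quick way to verify Phoenix end to end is to create, query, and drop a throwaway table. A sketch (the table and column names are illustrative, not from the course; the sqlline path and ZooKeeper quorum are from the notes):

```shell
# Feed a short DDL/DML script to sqlline.py non-interactively via a heredoc.
phoenix_smoke_test() {
    /opt/module/hbase/apache-phoenix-5.0.0-HBase-2.0-bin/bin/sqlline.py \
        node001,node002,node003:2181 <<'SQL'
CREATE TABLE IF NOT EXISTS test_smoke (id VARCHAR PRIMARY KEY, val VARCHAR);
UPSERT INTO test_smoke VALUES ('1', 'ok');
SELECT * FROM test_smoke;
DROP TABLE test_smoke;
!quit
SQL
}
# On the cluster: phoenix_smoke_test
```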

    P010

    6.2.2 HBase Environment Setup

    P011

    6.2.3 Redis Environment Setup

    Installing Redis: if `make` fails with "server.c:2860:11: error: 'struct redisServer' has no member named 'sentinel_mode'" (and similar missing-member errors), see the CSDN post by 橙-极纪元; the usual cause is a gcc that is too old for Redis 6, and upgrading gcc fixes it.

    [atguigu@node001 redis-6.0.8]$ /usr/local/bin/redis-server
    3651:C 14 Sep 2023 10:23:44.668 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
    3651:C 14 Sep 2023 10:23:44.668 # Redis version=6.0.8, bits=64, commit=00000000, modified=0, pid=3651, just started
    3651:C 14 Sep 2023 10:23:44.668 # Warning: no config file specified, using the default config. In order to specify a config file use /usr/local/bin/redis-server /path/to/redis.conf
    (Redis ASCII-art startup banner: Redis 6.0.8 (00000000/0) 64 bit, running in standalone mode, Port: 6379, PID: 3651, http://redis.io)
    3651:M 14 Sep 2023 10:23:44.671 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    3651:M 14 Sep 2023 10:23:44.671 # Server initialized
    3651:M 14 Sep 2023 10:23:44.671 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
    3651:M 14 Sep 2023 10:23:44.671 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
    3651:M 14 Sep 2023 10:23:44.671 * Ready to accept connections
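    The three kernel warnings in the startup log (somaxconn, overcommit_memory, THP) are standard Redis deployment advice, not errors. The current values can be inspected without root; the actual fixes need root, so they are shown only as comments here:

```shell
# Inspect the kernel settings Redis warns about (readable without root).
cat /proc/sys/net/core/somaxconn                  # 128 here is below Redis' requested backlog of 511
cat /proc/sys/vm/overcommit_memory                # 0 triggers the background-save warning
cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || true   # THP mode, if the file exists

# Fixes, as root (sketch only; persist them via /etc/sysctl.conf and /etc/rc.local):
#   sysctl -w net.core.somaxconn=511
#   sysctl -w vm.overcommit_memory=1
#   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
```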

    [atguigu@node001 redis-6.0.8]$ /usr/local/bin/redis-server # start in the foreground

    [atguigu@node001 ~]$ redis-server ./my_redis.conf # start in the background (daemonized)
    [atguigu@node001 ~]$ jps
    4563 Jps
    [atguigu@node001 ~]$ ps -ef | grep redis
    atguigu    4558      1  0 10:29 ?        00:00:00 redis-server 127.0.0.1:6379
    atguigu    4579   3578  0 10:29 pts/0    00:00:00 grep --color=auto redis
    [atguigu@node001 ~]$ redis-cli
    127.0.0.1:6379> quit
    [atguigu@node001 ~]$

    P012

    Modify the Redis configuration to allow external access.
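    The change amounts to three lines in the conf file. Demonstrated here against a throwaway copy so the edits are visible (on node001, apply them to my_redis.conf instead); note that `protected-mode no` with no password set is only acceptable on a private lab network:

```shell
# Apply the three typical "allow external access" edits with sed, on a scratch copy.
conf=$(mktemp)
cat > "$conf" <<'EOF'
bind 127.0.0.1
protected-mode yes
daemonize no
EOF
sed -i 's/^bind 127.0.0.1$/bind 0.0.0.0/' "$conf"           # listen on all interfaces, not only loopback
sed -i 's/^protected-mode yes$/protected-mode no/' "$conf"  # accept connections from remote clients
sed -i 's/^daemonize no$/daemonize yes/' "$conf"            # run as a background daemon
cat "$conf"
```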

    P013

    6.2.4 ClickHouse Environment Setup

    If login hangs with "'abrt-cli status' timed out", see the CSDN post "'abrt-cli status' timed out 解决办法 (2021综合整理)" by 爱喝咖啡的程序猿.

    [atguigu@node001 ~]$ cd /
    -bash: cannot create temp file for here-document: No space left on device

    [root@node001 mapper]# docker stop $(docker ps -aq) # stop all Docker containers
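    The "No space left on device" error above is what prompted stopping the Docker containers. Before deleting anything, it is worth confirming what is actually full. A generic triage sketch (the Docker commands are commented out because they need the daemon and are destructive):

```shell
# Find out which filesystem is full and how big the usual suspects are.
df -h /                           # overall usage of the root filesystem
du -sh /tmp 2>/dev/null || true   # size of one suspect directory, as an example

# Docker-specific cleanup (sketch only; needs docker and removes data):
#   docker system df              # disk usage by images/containers/volumes
#   docker stop $(docker ps -aq)  # stop all containers, as done above
#   docker system prune           # remove stopped containers and dangling images
```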

    [atguigu@node001 clickhouse]$ sudo systemctl start clickhouse-server
    [atguigu@node001 clickhouse]$ clickhouse-client -m
    ClickHouse client version 20.4.5.36 (official build).
    Connecting to localhost:9000 as user default.
    Code: 209. DB::NetException: Timeout exceeded while reading from socket ([::1]:9000)
    [atguigu@node001 clickhouse]$

    [atguigu@node001 ~]$ sudo systemctl status clickhouse-server
    ● clickhouse-server.service - ClickHouse Server (analytic DBMS for big data)
       Loaded: loaded (/etc/systemd/system/clickhouse-server.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2023-09-19 10:18:17 CST; 1min 42s ago
     Main PID: 1033 (clickhouse-serv)
        Tasks: 56
       Memory: 230.4M
       CGroup: /system.slice/clickhouse-server.service
               └─1033 /usr/bin/clickhouse-server --config=/etc/clickhouse-server/config.xml --pid-file=/run/clickhouse-server/clickhouse-server.pid
    Sep 19 10:18:19 node001 clickhouse-server[1033]: Include not found: clickhouse_compression
    Sep 19 10:18:19 node001 clickhouse-server[1033]: Logging trace to /var/log/clickhouse-server/clickhouse-server.log
    Sep 19 10:18:19 node001 clickhouse-server[1033]: Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
    Sep 19 10:18:20 node001 clickhouse-server[1033]: Processing configuration file '/etc/clickhouse-server/users.xml'.
    Sep 19 10:18:20 node001 clickhouse-server[1033]: Include not found: networks
    Sep 19 10:18:20 node001 clickhouse-server[1033]: Saved preprocessed configuration to '/var/lib/clickhouse//preprocessed_configs/users.xml'.
    Sep 19 10:18:24 node001 clickhouse-server[1033]: Processing configuration file '/etc/clickhouse-server/config.xml'.
    Sep 19 10:18:24 node001 clickhouse-server[1033]: Include not found: clickhouse_remote_servers
    Sep 19 10:18:24 node001 clickhouse-server[1033]: Include not found: clickhouse_compression
    Sep 19 10:18:24 node001 clickhouse-server[1033]: Saved preprocessed configuration to '/var/lib/clickhouse//preprocessed_configs/config.xml'.
    [atguigu@node001 ~]$ clickhouse-client -m
    ClickHouse client version 20.4.5.36 (official build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 20.4.5 revision 54434.
    node001 :)
    node001 :) Bye.
    [atguigu@node001 ~]$


    P014

    6.3 Mock Data Preparation

    1. ZooKeeper
    2. Kafka
    3. f1.sh (the f1 collector ships logs to Kafka only; nothing is uploaded to HDFS)
    4. data_mocker (generates the log data): cd /opt/module/data_mocker/01-onlineEducation && java -jar edu2021-mock-2022-06-18.jar
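    The four startup steps above can be bundled into one sketch script. The ZooKeeper and Kafka start commands depend on how the cluster wraps them (zk.sh and kf.sh are assumed wrapper names, not from these notes; f1.sh and the jar path are from the notes):

```shell
# Start the mock-data pipeline in dependency order (sketch; wrapper names assumed).
start_mock_log_data() {
    zk.sh start   # 1. ZooKeeper (assumed wrapper script name)
    kf.sh start   # 2. Kafka (assumed wrapper script name)
    f1.sh start   # 3. f1 collector: logs go to Kafka, no HDFS sink
    cd /opt/module/data_mocker/01-onlineEducation \
        && java -jar edu2021-mock-2022-06-18.jar   # 4. generate mock log data
}
# On node001: start_mock_log_data
```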
    1. User behavior logs (topic_log)

    com.github.shyiko.mysql.binlog.network.ServerException: Could not find first log file name in binary log index file
        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:926) ~[mysql-binlog-connector-java-0.23.3.jar:0.23.3]
        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:595) ~[mysql-binlog-connector-java-0.23.3.jar:0.23.3]
        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:839) ~[mysql-binlog-connector-java-0.23.3.jar:0.23.3]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]

    Reference: "maxwell: Could not find first log file name in binary log index file" (CSDN post by 新一aaaaa).

    Fix: drop the maxwell database in MySQL.
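    Maxwell stores its binlog position in a `maxwell` database inside MySQL; when MySQL's binlog files have been rotated away past that saved position, startup fails with the exception above. Dropping the database forces Maxwell to re-initialize from the current binlog. A sketch (the MySQL credentials are an assumption; `maxwell.sh start` is the wrapper used later in these notes):

```shell
# Reset Maxwell's saved binlog position by dropping its metadata database.
reset_maxwell_position() {
    # WARNING: discards the saved offset; Maxwell restarts from the current binlog.
    mysql -uroot -p -e "DROP DATABASE IF EXISTS maxwell;"
    maxwell.sh start
}
# On node001: reset_maxwell_position
```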

    [atguigu@node001 ~]$ kafka-console-consumer.sh --bootstrap-server node001:9092 --topic topic_log # start a Kafka console consumer and leave it running on node001
    Connection established
    Last login: Thu Oct 26 14:12:50 2023 from 192.168.10.1
    [atguigu@node001 ~]$ cd /opt/module/data_mocker/
    [atguigu@node001 data_mocker]$ cd 01-onlineEducation/
    [atguigu@node001 01-onlineEducation]$ java -jar edu2021-mock-2022-06-18.jar
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/module/data_mocker/01-onlineEducation/edu2021-mock-2022-06-18.jar!/BOOT-INF/lib/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/module/data_mocker/01-onlineEducation/edu2021-mock-2022-06-18.jar!/BOOT-INF/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
    (Spring Boot ASCII-art banner)
    :: Spring Boot ::        (v2.0.7.RELEASE)

    2. Business data (topic_db)

    [atguigu@node001 ~]$ kafka-console-consumer.sh --bootstrap-server node001:9092 --topic topic_db
    [atguigu@node001 ~]$ maxwell.sh start # start Maxwell
    [atguigu@node001 ~]$ cd ~/bin
    [atguigu@node001 bin]$ mysql_to_kafka_inc_init.sh all
  • Original post: https://blog.csdn.net/weixin_44949135/article/details/132811616