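In day-to-day operations we sometimes need to check the state of a ZooKeeper cluster. This post shares some commonly used methods.
There are two known ways to collect monitoring metrics from ZooKeeper, and both return largely the same metrics.
The commonly used commands for retrieving these metrics are covered below.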
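ZooKeeper four-letter commands
ZooKeeper's four-letter commands are so called because each command is exactly four letters long. They let operators check ZooKeeper's current state quickly and conveniently, without using the ZooKeeper client.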
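To use the four-letter commands, add the parameter 4lw.commands.whitelist=* to the ZooKeeper configuration. After adding it, restart ZooKeeper and every command becomes available. Otherwise, a command that is not in the whitelist fails with the error shown below.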
# Install nc first
[root@k8s-m1 bin]# yum install nc -y
[root@k8s-m1 bin]# echo ruok |nc localhost 2181
ruok is not executed because it is not in the whitelist.
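The rejection message above is easy to detect in a script. The following is a minimal sketch, assuming the wording of the message matches the output shown above; the function name zk_4lw_rejected is hypothetical and not part of ZooKeeper.

```shell
# Hypothetical helper: reads a four-letter-command response on stdin and
# succeeds (exit 0) when the server refused the command because it is
# missing from 4lw.commands.whitelist.
zk_4lw_rejected() {
    grep -q 'is not executed because it is not in the whitelist'
}
```

One possible use is `echo ruok | nc localhost 2181 | zk_4lw_rejected && echo "ruok is not whitelisted"`.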
stat shows status and connection information, including this node's role (Mode) and the connected clients:
[root@k8s-m1 bin]# echo stat |nc localhost 2181
Zookeeper version: 3.7.1-a2fb57c55f8e59cdd76c34b357ad5181df1258d5, built on 2022-05-07 06:45 UTC
Clients:
/127.0.0.1:33254[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0.0/0
Received: 2
Sent: 1
Connections: 1
Outstanding: 0
Zxid: 0x300000008
Mode: follower
Node count: 10
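For scripted checks, individual fields can be pulled out of the stat output above. This is a rough sketch that assumes the "Name: value" layout shown in the transcript; zk_stat_field is a hypothetical name chosen for illustration.

```shell
# Hypothetical helper: reads stat (or srvr) output on stdin and prints the
# value of the field named by $1, for example "Mode" or "Node count".
zk_stat_field() {
    # Fields are "Name: value"; client lines contain no ": " and are skipped.
    awk -F': ' -v key="$1" '$1 == key { print $2; exit }'
}
```

For example, `echo stat | nc localhost 2181 | zk_stat_field Mode` would print follower on the node used in this post.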
ruok checks whether ZooKeeper is running. A reply of imok means the server is up.
[root@k8s-m1 bin]# echo ruok |nc localhost 2181
imok[root@k8s-m1 bin]#
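Because the reply is a fixed string, ruok fits naturally into a liveness check. The sketch below assumes only the imok reply shown above; zk_is_ok is a hypothetical helper name.

```shell
# Hypothetical helper: reads the ruok response on stdin and succeeds only
# if it is exactly "imok". An empty reply (server down) or a whitelist
# error both count as failure.
zk_is_ok() {
    # $(cat) drops any trailing newline, so "imok" with or without one matches.
    [ "$(cat)" = "imok" ]
}
```

A call such as `echo ruok | nc localhost 2181 | zk_is_ok || echo "ZooKeeper is not healthy"` could then be used in a cron job or a probe.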
dump lists the outstanding sessions and the ephemeral nodes:
[root@k8s-m1 bin]# echo dump |nc localhost 2181
SessionTracker dump:
Global Sessions(2):
0x200267383e50001 30000ms
0x300267355120000 30000ms
ephemeral nodes dump:
Sessions with Ephemerals (0):
Connections dump:
Connections Sets (1)/(1):
1 expire at Fri Sep 01 10:35:47 CST 2023:
ip: /127.0.0.1:34502 sessionId: 0x0
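conf prints the server's configuration, including the client port (clientPort), the data and transaction log directories (dataDir, dataLogDir), the basic time unit (tickTime), the limit on connections from a single client to a single server (maxClientCnxns), and the session timeout bounds (minSessionTimeout, maxSessionTimeout).
initLimit: while starting up, a follower synchronizes all of the latest data from the leader and then determines the state from which it can begin serving requests. The leader allows the follower initLimit ticks to finish this work.
syncLimit: at run time, the leader communicates with every machine in the ZooKeeper cluster, for example through heartbeats, to check that each one is alive. If the leader sends a heartbeat and has received no response from a follower after syncLimit ticks, it considers that follower offline. Both limits are counted in ticks, and one tick lasts tickTime milliseconds.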
[root@k8s-m1 bin]# echo conf |nc localhost 2181
clientPort=2181
secureClientPort=-1
dataDir=/zookeeperData/version-2
dataDirSize=402653232
dataLogDir=/zookeeperDataLog/version-2
dataLogSize=899
tickTime=2000
maxClientCnxns=1000
minSessionTimeout=30000
maxSessionTimeout=60000
clientPortListenBacklog=-1
serverId=1
initLimit=10
syncLimit=5
electionAlg=3
electionPort=3888
quorumPort=2888
peerType=0
membership:
server.1=192.168.2.140:2888:3888:participant
server.2=192.168.2.141:2888:3888:participant
server.3=192.168.2.142:2888:3888:participant
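Since initLimit and syncLimit are counted in ticks, the real timeouts depend on tickTime. The sketch below converts them to milliseconds from conf output; it assumes the key=value layout shown above, and zk_limits_ms is a hypothetical name.

```shell
# Hypothetical helper: reads conf output on stdin and prints initLimit and
# syncLimit converted from ticks to milliseconds.
zk_limits_ms() {
    awk -F'=' '
        $1 == "tickTime"  { tick_ms = $2 }
        $1 == "initLimit" { init_ticks = $2 }
        $1 == "syncLimit" { sync_ticks = $2 }
        END {
            # Fail if any of the three settings is missing from the input.
            if (tick_ms == "" || init_ticks == "" || sync_ticks == "") exit 1
            printf "initLimit=%dms syncLimit=%dms\n", init_ticks * tick_ms, sync_ticks * tick_ms
        }'
}
```

With the values above (tickTime=2000, initLimit=10, syncLimit=5), `echo conf | nc localhost 2181 | zk_limits_ms` prints initLimit=20000ms syncLimit=10000ms.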
cons lists the full connection/session details of every client connected to this server:
[root@k8s-m1 bin]# echo cons |nc localhost 2181
/127.0.0.1:511040
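No other programs are connected to ZooKeeper in my test environment.
envi prints the server's environment information (as opposed to conf, which prints its configuration):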
[root@k8s-m1 bin]# echo envi |nc localhost 2181
Environment:
zookeeper.version=3.7.1-a2fb57c55f8e59cdd76c34b357ad5181df1258d5, built on 2022-05-07 06:45 UTC
host.name=k8s-m1
java.version=1.8.0_65
java.vendor=Oracle Corporation
java.home=/opt/jdk1.8.0_65/jre
java.class.path=/opt/apache-zookeeper-3.7.1-bin/bin/../zookeeper-server/target/classes:/opt/apache-zookeeper-3.7.1-bin/bin/../build/classes:/opt/apache-zookeeper-3.7.1-bin/bin/../zookeeper-server/target/lib/*.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../build/lib/*.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/zookeeper-prometheus-metrics-3.7.1.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/zookeeper-jute-3.7.1.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/zookeeper-3.7.1.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/snappy-java-1.1.7.7.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/slf4j-reload4j-1.7.35.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/slf4j-api-1.7.35.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/simpleclient_servlet-0.9.0.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/simpleclient_hotspot-0.9.0.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/simpleclient_common-0.9.0.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/simpleclient-0.9.0.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/reload4j-1.2.19.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-transport-native-unix-common-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-transport-native-epoll-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-transport-classes-epoll-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-transport-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-resolver-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-handler-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-common-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-codec-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/netty-buffer-4.1.76.Final.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/metrics-core-4.1.12.1.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jline-2.14.6.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jetty-util-ajax-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/
jetty-util-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jetty-servlet-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jetty-server-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jetty-security-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jetty-io-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jetty-http-9.4.43.v20210629.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/javax.servlet-api-3.1.0.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jackson-databind-2.13.2.1.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jackson-core-2.13.2.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/jackson-annotations-2.13.2.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/commons-cli-1.4.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../lib/audience-annotations-0.12.0.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../zookeeper-*.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../zookeeper-server/src/main/resources/lib/*.jar:/opt/apache-zookeeper-3.7.1-bin/bin/../conf:
java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
java.io.tmpdir=/tmp
java.compiler=<NA>
os.name=Linux
os.arch=amd64
os.version=3.10.0-957.el7.x86_64
user.name=root
user.home=/root
user.dir=/
os.memory.free=96MB
os.memory.max=889MB
os.memory.total=119MB
mntr prints the variables used to monitor the health of ZooKeeper:
[root@k8s-m1 bin]# echo mntr |nc localhost 2181|more
zk_version 3.7.1-a2fb57c55f8e59cdd76c34b357ad5181df1258d5, built on 2022-05-07 06:45 UTC
zk_server_state follower
zk_peer_state following - broadcast
zk_ephemerals_count 0
zk_num_alive_connections 1
zk_avg_latency 0.0
zk_outstanding_requests 0
zk_znode_count 10
zk_global_sessions 1
zk_non_mtls_remote_conn_count 0
zk_last_client_response_size -1
zk_packets_sent 14
zk_packets_received 13
zk_max_client_response_size -1
zk_connection_drop_probability 0.0
zk_watch_count 0
zk_auth_failed_count
......
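The mntr output is a list of name/value pairs, which makes it convenient to feed into a monitoring script. Here is a minimal sketch that extracts a single metric; it assumes the name and value are separated by whitespace as shown above, and zk_metric is a hypothetical name.

```shell
# Hypothetical helper: reads mntr output on stdin and prints the value of
# the metric named by $1. Exits non-zero if the metric is absent or empty.
zk_metric() {
    awk -v key="$1" '
        $1 == key && NF >= 2 {
            # Remove the metric name and its separator, keep the whole value.
            sub(/^[^ \t]+[ \t]+/, "")
            print
            found = 1
            exit
        }
        END { exit !found }'
}
```

For example, `echo mntr | nc localhost 2181 | zk_metric zk_outstanding_requests` prints the current number of queued requests, which a script could compare against a threshold of its own choosing.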
wchs lists summary watch information for the server: the number of connections that hold watches, the number of watched paths, and the total number of watches.
[root@k8s-m1 bin]# echo wchs |nc localhost 2181
0 connections watching 0 paths
Total watches:0
wchc lists detailed watch information for the server by session. Its output is a list of sessions with their associated watches.
[root@k8s-m1 bin]# echo wchc |nc localhost 2181
wchp lists detailed watch information for the server by path. Its output is a list of paths with their associated sessions.
[root@k8s-m1 bin]# echo wchp |nc localhost 2181
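crst resets the connection/session statistics. It changes state (an "execute" operation) rather than only reading it (a "select" operation), and it returns a status message when done: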
[root@k8s-m1 bin]# echo crst |nc localhost 2181
Connection stats reset.
[root@k8s-m1 bin]#
srst resets the server statistics. Like crst, it changes state rather than only reading it:
[root@k8s-m1 bin]# echo srst |nc localhost 2181
Server stats reset.
[root@k8s-m1 bin]#
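srvr shows status information for the server. It overlaps with stat, but the output below does not include the client list.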
[root@k8s-m1 bin]# echo srvr |nc localhost 2181
Zookeeper version: 3.7.1-a2fb57c55f8e59cdd76c34b357ad5181df1258d5, built on 2022-05-07 06:45 UTC
Latency min/avg/max: 0/0.0/0
Received: 1
Sent: 1
Connections: 1
Outstanding: 0
Zxid: 0x300000009
Mode: follower
Node count: 10
[root@k8s-m1 bin]#
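The same srvr call can be repeated against every member of the cluster to see which node is the leader. The following is a sketch under the assumptions used throughout this post (nc is installed and the client port is 2181); the function names zk_4lw and zk_cluster_modes are hypothetical.

```shell
# Hypothetical helper: send the four-letter command $1 to host $2
# (port $3, default 2181), mirroring the echo | nc calls above.
zk_4lw() {
    printf '%s\n' "$1" | nc "$2" "${3:-2181}"
}

# Hypothetical helper: print "host mode" for every host given as an
# argument, or "host unreachable" when no Mode line comes back.
zk_cluster_modes() {
    for host in "$@"; do
        mode=$(zk_4lw srvr "$host" 2>/dev/null |
            awk -F': ' '$1 == "Mode" { print $2; exit }') || mode=''
        printf '%s %s\n' "$host" "${mode:-unreachable}"
    done
}
```

With the three servers from the conf output, this could be called as `zk_cluster_modes 192.168.2.140 192.168.2.141 192.168.2.142`.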
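These are the monitoring metrics exposed by the four-letter commands in ZooKeeper 3.7.1. They can be used to inspect the current state of a cluster.
For more posts about ZooKeeper, please visit the blog homepage. My knowledge is limited and mistakes are inevitable, so corrections are welcome.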