I was about to look at how to operate etcd from the command line when I noticed that on this host etcd was only brought up by kubeadm; there is no etcdctl binary installed.
- # sudo docker ps -a | awk '/etcd-master/{print $1}'
- c4e3a57f05d7
- 26a11608b270
- 836dabc8e254
Find the running etcd container and copy the etcdctl binary out of the pod onto the host, so that etcdctl can be used directly on the host.
# sudo docker cp c4e3a57f05d7:/usr/local/bin/etcdctl /usr/local/bin/etcdctl
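The container lookup and the copy can be folded into one helper. A hedged sketch (the `copy_etcdctl` name is mine, and it assumes a Docker runtime; on a containerd node you would reach for `crictl` or `kubectl cp` instead):

```shell
# Hypothetical helper: copy etcdctl out of the first *running* etcd-master
# container onto the host. Assumes Docker; adapt for containerd/crictl.
copy_etcdctl() {
  local cid
  cid=$(sudo docker ps | awk '/etcd-master/{print $1; exit}')
  if [ -z "$cid" ]; then
    echo "no running etcd-master container found" >&2
    return 1
  fi
  sudo docker cp "$cid:/usr/local/bin/etcdctl" /usr/local/bin/etcdctl
}
```

Note `docker ps` without `-a`, so exited containers from the earlier listing are skipped.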
Run etcdctl again; it now executes successfully:
- # etcdctl
- NAME:
- etcdctl - A simple command line client for etcd3.
-
- USAGE:
- etcdctl [flags]
-
- VERSION:
- 3.5.1
-
- API VERSION:
- 3.5
- # etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member list -w table
- +-----------------+---------+-------------------+---------------------------+---------------------------+------------+
- | ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
- +-----------------+---------+-------------------+---------------------------+---------------------------+------------+
- | dd7a929be676b37 | started | | https://192.168.1.120:2380 | https://192.168.1.120:2379 | false |
- +-----------------+---------+-------------------+---------------------------+---------------------------+------------+
etcd's certificate files:
- # ll /etc/kubernetes/pki/etcd/
- total 32
- -rw-r----- 1 root root 1086 Mar 26 16:52 ca.crt
- -rw------- 1 root root 1675 Mar 26 16:52 ca.key
- -rw-r----- 1 root root 1159 Mar 26 16:52 healthcheck-client.crt
- -rw------- 1 root root 1675 Mar 26 16:52 healthcheck-client.key
- -rw-r----- 1 root root 1220 Mar 26 16:52 peer.crt
- -rw------- 1 root root 1675 Mar 26 16:52 peer.key
- -rw-r----- 1 root root 1220 Mar 26 16:52 server.crt
- -rw------- 1 root root 1675 Mar 26 16:52 server.key
Set an alias for etcdctl so the endpoint and certificate flags do not have to be typed every time:
- # alias etcdctl='etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key'
- [root@192.168.1.120 ~]# etcdctl member list -w table
- +-----------------+---------+-------------------+---------------------------+---------------------------+------------+
- | ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
- +-----------------+---------+-------------------+---------------------------+---------------------------+------------+
- | dd7a929be676b37 | started | 192.168.1.120 | https://192.168.1.120:2380 | https://192.168.1.120:2379 | false |
- +-----------------+---------+-------------------+---------------------------+---------------------------+------------+
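An alias only lives in the current shell. etcdctl also honors `ETCDCTL_*` environment variables for the same flags, which carry over into scripts as well; a sketch using the certificate paths above:

```shell
# Equivalent to the alias: etcdctl v3 reads these ETCDCTL_* variables,
# so a plain `etcdctl member list` picks up endpoint and TLS settings.
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/healthcheck-client.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/healthcheck-client.key
# etcdctl member list -w table   # now works without extra flags
```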
View the etcd endpoint details.
IS LEADER: whether this member is currently the leader
RAFT TERM: how many election terms have passed
- # etcdctl endpoint status -w table
- +--------------------------+-----------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
- | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
- +--------------------------+-----------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
- | https://[127.0.0.1]:2379 | dd7a929be676b37 | 3.5.1 | 18 MB | true | false | 22 | 7579742 | 7579742 | |
- +--------------------------+-----------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Check whether etcd is healthy:
- # etcdctl endpoint health
- https://[127.0.0.1]:2379 is healthy: successfully committed proposal: took = 27.76824ms
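For scripting, the health line can be checked mechanically. A sketch that parses the captured output above (on a live cluster, substitute the real `etcdctl endpoint health` call for the sample variable):

```shell
# Sample line captured above; on a live cluster replace with:
#   health_line=$(etcdctl endpoint health 2>&1)
health_line='https://[127.0.0.1]:2379 is healthy: successfully committed proposal: took = 27.76824ms'

case "$health_line" in
  *"is healthy"*) status=ok ;;
  *)              status=down ;;
esac
echo "etcd health: $status"
```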
etcd stores its data under keys whose paths start from /:
- # etcdctl put /skywell/byd bus
- OK
- # etcdctl get /skywell/ --prefix=true
- /skywell/byd
- bus
List the stored keys and values.
--keys-only: print only the keys, not the values
--limit: the store holds many keys; limit how many are returned
- # etcdctl get / --prefix=true --keys-only --limit 10
- /registry/apiregistration.k8s.io/apiservices/v1.
-
- /registry/apiregistration.k8s.io/apiservices/v1.admissionregistration.k8s.io
-
- /registry/apiregistration.k8s.io/apiservices/v1.apiextensions.k8s.io
-
- /registry/apiregistration.k8s.io/apiservices/v1.apps
-
- /registry/apiregistration.k8s.io/apiservices/v1.authentication.k8s.io
-
- /registry/apiregistration.k8s.io/apiservices/v1.authorization.k8s.io
-
- /registry/apiregistration.k8s.io/apiservices/v1.autoscaling
-
- /registry/apiregistration.k8s.io/apiservices/v1.batch
-
- /registry/apiregistration.k8s.io/apiservices/v1.certificates.k8s.io
-
- /registry/apiregistration.k8s.io/apiservices/v1.coordination.k8s.io
-
- # etcdctl get /skywell --prefix=true --keys-only --limit 10
- /skywell/byd
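Kubernetes stores objects under keys shaped like `/registry/<resource>/<namespace>/<name>`, so the key list can be grouped by resource to get a quick inventory. A sketch against a captured sample (on a live cluster, pipe `etcdctl get /registry --prefix --keys-only` in instead of the heredoc):

```shell
# Count keys per resource (the path segment after /registry).
count_registry_keys() {
  awk -F/ 'NF > 2 && $2 == "registry" { n[$3]++ } END { for (r in n) print n[r], r }'
}

# The heredoc stands in for real `etcdctl get /registry --prefix --keys-only` output.
count_registry_keys <<'EOF'
/registry/apiregistration.k8s.io/apiservices/v1.apps
/registry/apiregistration.k8s.io/apiservices/v1.batch
/registry/deployments/default/nginx
EOF
```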
- # etcdctl snapshot save etcdbackup.db
- {"level":"info","ts":1716971751.0047052,"caller":"snapshot/v3_snapshot.go:68","msg":"created temporary db file","path":"etcdbackup.db.part"}
- {"level":"info","ts":1716971751.028518,"logger":"client","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
- {"level":"info","ts":1716971751.0286477,"caller":"snapshot/v3_snapshot.go:76","msg":"fetching snapshot","endpoint":"https://[127.0.0.1]:2379"}
- {"level":"info","ts":1716971751.3699682,"logger":"client","caller":"v3/maintenance.go:219","msg":"completed snapshot read; closing"}
- {"level":"info","ts":1716971751.8124714,"caller":"snapshot/v3_snapshot.go:91","msg":"fetched snapshot","endpoint":"https://[127.0.0.1]:2379","size":"18 MB","took":"now"}
- {"level":"info","ts":1716971751.8127532,"caller":"snapshot/v3_snapshot.go:100","msg":"saved","path":"etcdbackup.db"}
- Snapshot saved at etcdbackup.db
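In practice the snapshot file should carry a timestamp, and old copies should be pruned. A hedged sketch (the directory, filename pattern, and 7-copy retention are my choices, not from the original notes):

```shell
# Timestamped backup with simple retention: keep the newest 7 snapshots.
backup_dir=$(mktemp -d)          # in real use: a fixed dir, e.g. /data/etcd-backup
snap="$backup_dir/etcdbackup-$(date +%Y%m%d-%H%M%S).db"
echo "would save: $snap"
# etcdctl snapshot save "$snap"  # uncomment on a live cluster

# Prune: list by mtime newest first, delete everything past the 7th.
ls -1t "$backup_dir"/etcdbackup-*.db 2>/dev/null | tail -n +8 | xargs -r rm -f
```

Run from cron, this keeps roughly a week of daily snapshots.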
- # etcdctl --write-out=table snapshot status etcdbackup.db
- Deprecated: Use `etcdutl snapshot status` instead.
-
- +----------+----------+------------+------------+
- | HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
- +----------+----------+------------+------------+
- | b91b2b0e | 6454813 | 947 | 18 MB |
- +----------+----------+------------+------------+
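`etcdctl snapshot status` warns that it is deprecated in 3.5. A small guard can pick `etcdutl` when it is installed (only the tool name changes; the subcommand is the same as above):

```shell
# Prefer etcdutl (shipped alongside etcd >= 3.5) for offline snapshot work.
if command -v etcdutl >/dev/null 2>&1; then
  snap_tool=etcdutl
else
  snap_tool=etcdctl            # falls back to the deprecated subcommand
fi
echo "inspecting snapshot with: $snap_tool"
# "$snap_tool" snapshot status etcdbackup.db -w table
```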
Now delete the test nginx deployment:
- # kubectl get deployment -A
- NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
- default nginx 3/3 3 3 151m
- ingress-nginx nginx-deployment 1/1 1 1 15d
- ingress-nginx nginx-ingress-controller 1/1 1 1 15d
- kube-system coredns 2/2 2 2 63d
- kube-system metrics-server 1/1 1 1 56d
- kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 56d
- kubernetes-dashboard kubernetes-dashboard 1/1 1 1 56d
-
- # kubectl delete deployment -n default nginx
- deployment.apps "nginx" deleted
-
- # kubectl get deployment -A
- NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
- ingress-nginx nginx-deployment 1/1 1 1 15d
- ingress-nginx nginx-ingress-controller 1/1 1 1 15d
- kube-system coredns 2/2 2 2 63d
- kube-system metrics-server 1/1 1 1 56d
- kubernetes-dashboard dashboard-metrics-scraper 1/1 1 1 56d
- kubernetes-dashboard kubernetes-dashboard 1/1 1 1 56d
Restore the backed-up data into the directory specified by --data-dir:
- # etcdctl snapshot restore etcdbackup.db --data-dir=/data/foot/etcdtest/restore
- Deprecated: Use `etcdutl snapshot restore` instead.
-
- 2024-05-29T16:44:33+08:00 info snapshot/v3_snapshot.go:251 restoring snapshot {"path": "etcdbackup.db", "wal-dir": "/data/foot/etcdtest/restore/member/wal", "data-dir": "/data/foot/etcdtest/restore", "snap-dir": "/data/foot/etcdtest/restore/member/snap", "stack": "go.etcd.io/etcd/etcdutl/v3/snapshot.(*v3Manager).Restore\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdutl/snapshot/v3_snapshot.go:257\ngo.etcd.io/etcd/etcdutl/v3/etcdutl.SnapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdutl/etcdutl/snapshot_command.go:147\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.snapshotRestoreCommandFunc\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/ctlv3/command/snapshot_command.go:128\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/remote/sbatsche/.gvm/pkgsets/go1.16.3/global/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/ctlv3/ctl.go:111\nmain.main\n\t/tmp/etcd-release-3.5.1/etcd/release/etcd/etcdctl/main.go:59\nruntime.main\n\t/home/remote/sbatsche/.gvm/gos/go1.16.3/src/runtime/proc.go:225"}
- 2024-05-29T16:44:33+08:00 info membership/store.go:141 Trimming membership information from the backend...
- 2024-05-29T16:44:34+08:00 info membership/cluster.go:421 added member {"cluster-id": "cdf818194e3a8c32", "local-member-id": "0", "added-peer-id": "8e9e05c52164694d", "added-peer-peer-urls": ["http://localhost:2380"]}
- 2024-05-29T16:44:34+08:00 info snapshot/v3_snapshot.go:272 restored snapshot {"path": "etcdbackup.db", "wal-dir": "/data/foot/etcdtest/restore/member/wal", "data-dir": "/data/foot/etcdtest/restore", "snap-dir": "/data/foot/etcdtest/restore/member/snap"}
New etcd data has been generated at the specified location:
- # ll restore/member/
- total 8
- drwx------ 2 root root 4096 May 29 16:44 snap
- drwx------ 2 root root 4096 May 29 16:44 wal
-
- # ll /var/lib/etcd/member/
- total 0
- drwx------ 2 root root 246 May 29 15:08 snap
- drwx------ 2 root root 244 May 29 09:14 wal
Now all the Kubernetes control-plane components need to be stopped so the etcd data directory can be swapped. Move the static pod manifest files out of /etc/kubernetes/manifests/:
- # ll /etc/kubernetes/manifests/
- total 16
- -rw------- 1 root root 2260 Mar 26 16:52 etcd.yaml
- -rw------- 1 root root 3367 Mar 26 16:52 kube-apiserver.yaml
- -rw------- 1 root root 2878 Mar 26 16:52 kube-controller-manager.yaml
- -rw------- 1 root root 1464 Mar 26 16:52 kube-scheduler.yaml
-
- # mv /etc/kubernetes/manifests/* /tmp
kubelet will then delete the static pods automatically. [That is what is said, at least; when I tried it, after changing the etcd-data directory the pods no longer showed on the node and `kubectl get po -A` showed nothing either, and only after moving the removed yaml files back did the k8s environment return to normal.]
- # kubectl get po -A
- NAMESPACE NAME READY STATUS RESTARTS AGE
- ingress-nginx nginx-deployment-64d5f7665c-56cpz 1/1 Running 0 15d
- ingress-nginx nginx-ingress-controller-7cfc988f46-cszsd 1/1 Running 0 15d
- kube-flannel kube-flannel-ds-lpm9c 1/1 Running 0 64d
- kube-system coredns-6d8c4cb4d-sml87 1/1 Running 0 64d
- kube-system coredns-6d8c4cb4d-w4hgz 1/1 Running 0 64d
- kube-system etcd-master 1/1 Running 181 (18d ago) 64d
- kube-system kube-apiserver-master 1/1 Running 159 64d
- kube-system kube-controller-manager-master 1/1 Running 241 (3d7h ago) 64d
- kube-system kube-proxy-6ct9f 1/1 Running 0 64d
- kube-system kube-scheduler-master 1/1 Running 3256 (3d7h ago) 64d
- kube-system metrics-server-5d6946c85b-5585p 1/1 Running 0 56d
- kubernetes-dashboard dashboard-metrics-scraper-6f669b9c9b-hmw4b 1/1 Running 0 19d
- kubernetes-dashboard kubernetes-dashboard-57dd8bd998-ghrhd 1/1 Running 26 (18d ago) 19d
Once the components have all been removed, edit the hostPath of the etcd-data volume in /etc/kubernetes/manifests/etcd.yaml:
- volumeMounts:
- - mountPath: /var/lib/etcd # mount point inside the pod, leave as-is; change the etcd-data hostPath below instead
- name: etcd-data
- - mountPath: /etc/kubernetes/pki/etcd
- name: etcd-certs
- hostNetwork: true
- priorityClassName: system-node-critical
- securityContext:
- seccompProfile:
- type: RuntimeDefault
- volumes:
- - hostPath:
- path: /etc/kubernetes/pki/etcd
- type: DirectoryOrCreate
- name: etcd-certs
- - hostPath:
- path: /var/lib/etcd # change to the restore directory: /data/foot/etcdtest/restore
- type: DirectoryOrCreate
- name: etcd-data
- status: {}
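The whole restore procedure above can be collected into one runbook-style helper. A sketch using the paths from this walkthrough (the function name is mine, and the yaml edit still has to be done by hand):

```shell
# Runbook sketch for the restore flow described above. Defined but not run
# automatically; every path matches the walkthrough, adapt for your host.
restore_etcd() {
  local backup=etcdbackup.db
  local restore_dir=/data/foot/etcdtest/restore

  # 1. Restore the snapshot into a fresh data dir.
  etcdctl snapshot restore "$backup" --data-dir="$restore_dir" || return 1

  # 2. Stop the control plane: kubelet removes static pods whose manifests vanish.
  mv /etc/kubernetes/manifests/*.yaml /tmp/

  # 3. Hand-edit /tmp/etcd.yaml: point the etcd-data hostPath at $restore_dir.
  echo "edit /tmp/etcd.yaml hostPath -> $restore_dir, then press enter" && read -r _

  # 4. Move the manifests back; kubelet restarts the control plane.
  #    (Sketch only: assumes /tmp holds nothing but these manifests.)
  mv /tmp/*.yaml /etc/kubernetes/manifests/
}
```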
Check the k8s cluster status:
- # kubectl get cs
- Warning: v1 ComponentStatus is deprecated in v1.19+
- NAME STATUS MESSAGE ERROR
- controller-manager Healthy ok
- scheduler Healthy ok
- etcd-0 Healthy {"health":"true","reason":""}