Author: 张华  Published: 2023-10-15
Copyright: feel free to repost, but when reposting please credit the original source and author with a hyperlink, together with this copyright notice (http://blog.csdn.net/quqi99)
sunbeam is a tool for deploying OpenStack. It uses juju to define two clouds (microk8s and sunbeam): microk8s is used to deploy the OpenStack control services (in the openstack model), and sunbeam is used to deploy sunbeam-controller (in the admin/controller model):

The approach:
# Backup images. Remember to use 'microk8s.ctr' instead of 'ctr', and remove 'ctr' via 'sudo apt purge containerd docker.io -y'
microk8s.ctr --namespace k8s.io images ls -q |grep -v sha256 |xargs -I {} sh -c 'fname=$(echo "{}" | tr -s "/" "_"); microk8s.ctr --namespace k8s.io image export ${fname}.tar "{}"'
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
# or use micro.registry
#sudo microk8s.enable registry
#ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); sudo docker load -i {}; sudo docker tag ${fname} localhost:32000/${fname}; sudo docker push localhost:32000/${fname}'
microk8s.ctr --namespace k8s.io image ls
sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sudo microk8s.stop && sudo microk8s.start #no need to restart microk8s; restarting snap.microk8s.daemon-containerd above is enough
cat << EOF |tee -a /var/snap/microk8s/current/args/containerd-env
# note: the HTTPS_PROXY value should also use the http:// scheme
HTTP_PROXY=http://192.168.99.186:3129
HTTPS_PROXY=http://192.168.99.186:3129
NO_PROXY=10.0.0.0/8,127.0.0.0/16,192.168.0.0/16
EOF
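The `tr`/`sed` name mangling used by the export/import pipeline above can be sanity-checked offline. A standalone sketch with one sample image name; note the reverse mapping is lossy for refs that already contain `_` (e.g. `google_containers` paths):

```shell
# Round-trip the image-ref <-> file-name mangling: export flattens "/"
# into "_" so the ref becomes a valid file name; import reverses it.
img="registry.k8s.io/pause:3.7"
fname=$(echo "$img" | tr -s "/" "_")
back=$(echo "${fname}.tar" | sed "s/_/\//g" | sed "s/.tar$//g")
echo "$fname"   # registry.k8s.io_pause:3.7
echo "$back"    # registry.k8s.io/pause:3.7
# Caveat: a ref like .../google_containers/pause:3.7 would come back as
# .../google/containers/pause:3.7, so rename such tars by hand.
```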
juju add-model sunbeam && juju add-machine --series jammy --constraints "root-disk=100G mem=32G cores=8"
juju ssh 0
# When running into all sorts of strange problems, reset the environment first
sudo snap remove --purge openstack
sudo snap remove --purge juju
sudo snap remove --purge juju-db
sudo snap remove --purge kubectl
sudo /usr/sbin/remove-juju-services
sudo rm -rf /var/lib/juju
rm -rf ~/.local/share/juju
rm -rf ~/snap/juju/
rm -rf ~/snap/openstack
rm -rf ~/snap/openstack-hypervisor
rm -rf ~/snap/microstack/
rm -rf ~/snap/microk8s/
sudo snap remove --purge vault
sudo snap remove --purge microk8s
sudo snap remove --purge openstack-hypervisor
sudo init 6 #better to reboot, otherwise some calico NICs and network namespaces can't be removed
# 2023.2/candidate hits: Please run `sunbeam configure` first, so use 2023.1/stable
sudo snap install openstack --channel 2023.1/stable
python3 -c "import socket; print(socket.getfqdn())"
sunbeam prepare-node-script | bash -x
sudo usermod -a -G snap_daemon $USER && newgrp snap_daemon
#ERROR failed to bootstrap model: machine is already provisioned
sudo remove-juju-services
# If it hangs at 'Bootstrapping juju into machine' with no response, 'journalctl -f' shows it's an ssh problem
# That's because the ssh key has a passphrase; set up passwordless login, and also configure 'NOPASSWD:ALL' in sudoers
sunbeam cluster bootstrap --accept-defaults
mkdir -p ~/.kube && sudo chown -R $USER ~/.kube
sudo usermod -a -G snap_microk8s $USER && newgrp snap_microk8s
microk8s.kubectl get pods --all-namespaces
microk8s.ctr --namespace k8s.io image ls
#registry.k8s.io, docker.io, registry.jujucharms.com, quay.io
##echo 'HTTPS_PROXY=http://192.168.99.186:9311' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
microk8s.ctr --namespace k8s.io containers ls
alias kubectl='sudo /snap/bin/microk8s.kubectl'
source <(kubectl completion bash) && kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl
sunbeam cluster list
#Unable to complete operation for new subnet. The number of DNS nameservers exceeds the limit 5.
#The workaround is to modify /run/systemd/resolve/resolv.conf, but don't restart systemd-resolved, according to
#https://github.com/openstack-snaps/snap-openstack/commit/7b7ca702efb490f13624002093e1b0b4cefe3aab
sunbeam configure --accept-defaults --openrc demo-openrc
sunbeam openrc > admin-openrc
source admin-openrc
sunbeam launch ubuntu --name test
sudo journalctl -u snap.openstack.clusterd.service -f
source <(openstack complete)
openstack complete |sudo tee /etc/bash_completion.d/openstack
openstack hypervisor list
sudo snap get openstack-hypervisor node
sudo snap logs openstack-hypervisor.hypervisor-config-service
sudo snap logs openstack-hypervisor.ovn-controller
#juju switch opensetack && juju ssh ovn-central/0
sudo microk8s.kubectl -n openstack exec -it ovn-central-0 bash
sudo microk8s.kubectl -n openstack exec -it ovn-central-0 -c ovn-northd -- ovn-sbctl --db=ssl:ovn-central-0.ovn-central-endpoints.openstack.svc.cluster.local:16642 -c /etc/ovn/cert_host -C /etc/ovn/ovn-central.crt -p /etc/ovn/key_host list
cat /var/snap/openstack-hypervisor/common/etc/nova/nova.conf
The deployment above was done under normal network conditions; in a restricted network, more hacks may be needed.
1, First, if the OS is not fresh, reset the env
sudo snap remove --purge microk8s
sudo snap remove --purge juju
sudo snap remove --purge openstack
sudo snap remove --purge openstack-hypervisor
sudo /usr/sbin/remove-juju-services
sudo rm -rf /var/lib/juju
rm -rf ~/.local/share/juju
rm -rf ~/snap/openstack
rm -rf ~/snap/openstack-hypervisor
rm -rf ~/snap/microstack/
rm -rf ~/snap/juju/
rm -rf ~/snap/microk8s/
sudo init 6 #better to reboot, otherwise some calico NICs and network namespaces can't be removed
2, Since my OS already has an ssh key protected by a passphrase, an extra step is needed to use the passwordless .local/share/juju/ssh/juju_id_rsa
#because my default ssh key has a passphrase, one extra step is needed to avoid: Timeout before authentication for 192.168.99.179 port 56142
cat .local/share/juju/ssh/juju_id_rsa.pub |sudo tee -a ~/.ssh/authorized_keys
ssh hua@minipc.lan -i .local/share/juju/ssh/juju_id_rsa
3, DNS must use the long (FQDN) form:
echo '192.168.99.179 minipc.lan minipc' |sudo tee -a /etc/hosts
python3 -c "import socket; print(socket.getfqdn())"
4, Logs can be watched with: sudo journalctl -f
5, In a restricted network, make sure the images can be downloaded correctly
echo 'HTTP_PROXY=http://192.168.99.186:9311' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
echo 'HTTPS_PROXY=http://192.168.99.186:9311' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
echo 'NO_PROXY=10.0.0.0/8,192.168.0.0/16,127.0.0.1,172.16.0.0/12' |sudo tee -a /var/snap/microk8s/current/args/containerd-env
sudo snap restart microk8s
microk8s.kubectl get pods --all-namespaces
microk8s.ctr --namespace k8s.io image ls
A small patch was produced for the lp bug (https://bugs.launchpad.net/snap-openstack/+bug/2039403):
diff --git a/sunbeam-python/sunbeam/utils.py b/sunbeam-python/sunbeam/utils.py
index 542c1c1..cd02ee2 100644
--- a/sunbeam-python/sunbeam/utils.py
+++ b/sunbeam-python/sunbeam/utils.py
@@ -242,7 +242,7 @@ def get_free_nic() -> str:
return nic
-def get_nameservers(ipv4_only=True) -> List[str]:
+def get_nameservers(ipv4_only=True, max_count=5) -> List[str]:
"""Return a list of nameservers used by the host."""
resolve_config = Path("/run/systemd/resolve/resolv.conf")
nameservers = []
@@ -258,7 +258,7 @@ def get_nameservers(ipv4_only=True) -> List[str]:
nameservers = list(set(nameservers))
except FileNotFoundError:
nameservers = []
- return nameservers
+ return nameservers[:max_count]
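The patched behaviour can be mimicked in shell for a quick check of what trimming to 5 nameservers does; this sketch uses a synthetic copy under /tmp rather than the real /run/systemd/resolve/resolv.conf:

```shell
# Keep at most 5 nameservers, as the patched get_nameservers() does.
cat > /tmp/resolv-test.conf <<'EOF'
nameserver 1.1.1.1
nameserver 8.8.8.8
nameserver 9.9.9.9
nameserver 8.8.4.4
nameserver 1.0.0.1
nameserver 192.168.1.1
EOF
awk '/^nameserver/ {print $2}' /tmp/resolv-test.conf | head -n 5
```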
It lives in /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py, but how can it be patched for quick debugging in production? The file is read-only, which makes this painful. The steps below don't work either: after 'sudo snap remove openstack' the sunbeam command is gone, so 'sunbeam configure --accept-defaults --openrc demo-openrc' can no longer be run:
cd snap-openstack/sunbeam-python
tox -epy3
#In order to debug read-only file /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py
sudo unsquashfs -d squashfs-root /var/lib/snapd/snaps/openstack_*.snap
sudo snap try ./squashfs-root/ --devmode
cd ./squashfs-root/lib/python3.10/site-packages
sudo patch -p1 < ./diff
cd ~ && sudo vim ./squashfs-root/lib/python3.10/site-packages/sunbeam/utils.py
import rpdb;rpdb.set_trace()
sudo ./squashfs-root/bin/python3 -m pip install rpdb
sudo systemctl restart snap.openstack.clusterd.service
nc 127.0.0.1 4444
sunbeam configure --accept-defaults --openrc demo-openrc #trigger it, but there is no sunbeam now
Rebuilding the snap also requires removing the installed snap first, so it can't be used for debugging either:
git clone https://github.com/openstack-snaps/snap-openstack.git
cd snap-openstack/
patch -p1 < diff
sudo apt install build-essential -y
sudo snap install --classic snapcraft
#snapcraft clean
sudo snapcraft
The 'mount --bind' approach below does work, but since only a single file is mounted, only log statements (no new modules) can be used:
cp /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py .
sudo mount --bind utils.py /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py
mount | grep "utils.py"
#now we can modify /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py (NOTE: it's not ./utils.py)
LOG.warn("quqi {}".format(nameservers[:max_count]))
#python needs to be restarted if they are running in the daemon
sudo systemctl restart snap.openstack.clusterd.service
sunbeam configure --accept-defaults --openrc demo-openrc #trigger it
I wanted to 'mount --bind' a whole directory and then install the rpdb module, but that didn't work (to be confirmed when there's another chance):
mkdir -p ~/snap_write/ && cp -r /snap/openstack/274 ~/snap_write/
sudo mount --bind ~/snap_write/274 /snap/openstack/274
mount |grep /snap/openstack/274
vim /snap/openstack/274/lib/python3.10/site-packages/sunbeam/utils.py
LOG.warn("quqi {}".format(nameservers[:max_count]))
import rpdb;rpdb.set_trace()
/snap/openstack/274/bin/pip install rpdb
#python needs to be restarted if they are running in the daemon
sudo systemctl restart snap.openstack.clusterd.service
sunbeam configure --accept-defaults --openrc demo-openrc #trigger it
nc 127.0.0.1 4444
juju ssh -m admin/controller 0
ubuntu@juju-5d90c3-sunbeam-0:~$ juju clouds |tail -n2
Only clouds with registered credentials are shown.
There are more clouds, use --all to see them.
microk8s 1 localhost k8s 0 built-in A Kubernetes Cluster
sunbeam 1 default manual 0 local
ubuntu@juju-5d90c3-sunbeam-0:~$ juju controllers |tail -n1
sunbeam-controller* admin/controller juju-5d90c3-sunbeam-0.cloud.sts superuser 2 - - 3.2.0
ubuntu@juju-5d90c3-sunbeam-0:~$ juju models |tail -n3
Model Cloud/Region Type Status Machines Cores Units Access Last connection
admin/controller* sunbeam/default manual available 1 8 4 admin just now
openstack sunbeam-microk8s/localhost kubernetes available 0 - 24 admin 1 minute ago
ubuntu@juju-5d90c3-sunbeam-0:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
metallb-system speaker-2rspk 1/1 Running 0 108m
kube-system coredns-6f5f9b5d74-ctc9d 1/1 Running 0 109m
kube-system calico-node-m74rh 1/1 Running 0 107m
metallb-system controller-9556c586f-kqslx 1/1 Running 0 108m
kube-system calico-kube-controllers-7457875fc6-xdst9 1/1 Running 0 106m
openstack modeloperator-7f5fcd7474-w2f5p 1/1 Running 0 105m
openstack cinder-ceph-mysql-router-0 2/2 Running 0 105m
openstack ovn-relay-0 2/2 Running 0 105m
openstack certificate-authority-0 1/1 Running 0 104m
openstack horizon-mysql-router-0 2/2 Running 1 (101m ago) 105m
openstack horizon-0 2/2 Running 0 105m
openstack keystone-mysql-router-0 2/2 Running 0 104m
openstack cinder-ceph-0 2/2 Running 0 105m
openstack rabbitmq-0 2/2 Running 0 105m
openstack placement-0 2/2 Running 0 104m
openstack neutron-0 2/2 Running 0 104m
openstack keystone-0 2/2 Running 0 105m
openstack glance-0 2/2 Running 1 (91m ago) 104m
openstack traefik-0 2/2 Running 0 105m
openstack cinder-mysql-router-0 2/2 Running 2 (41m ago) 105m
openstack neutron-mysql-router-0 2/2 Running 2 (35m ago) 104m
openstack nova-api-mysql-router-0 2/2 Running 2 (10m ago) 104m
openstack cinder-0 3/3 Running 1 (8m43s ago) 104m
kube-system hostpath-provisioner-69cd9ff5b8-kdjpp 1/1 Running 5 (7m22s ago) 108m
openstack nova-mysql-router-0 2/2 Running 3 (7m19s ago) 105m
openstack nova-0 4/4 Running 2 (7m19s ago) 103m
openstack glance-mysql-router-0 2/2 Running 1 (7m19s ago) 104m
openstack ovn-central-0 4/4 Running 2 (5m51s ago) 103m
openstack nova-cell-mysql-router-0 2/2 Running 1 (4m38s ago) 105m
openstack mysql-0 2/2 Running 1 (3m21s ago) 104m
openstack placement-mysql-router-0 2/2 Running 3 (7m19s ago) 104m
ubuntu@juju-5d90c3-sunbeam-0:~$ juju switch admin/controller
sunbeam-controller:juju-5d90c3-sunbeam-0.cloud.sts/openstack -> sunbeam-controller:admin/controller
ubuntu@juju-5d90c3-sunbeam-0:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
controller sunbeam-controller sunbeam/default 3.2.0 unsupported 03:50:49Z upgrade available: 3.2.3
SAAS Status Store URL
certificate-authority active local juju-5d90c3-sunbeam-0.cloud.sts/openstack.certificate-authority
keystone waiting local juju-5d90c3-sunbeam-0.cloud.sts/openstack.keystone
ovn-relay active local juju-5d90c3-sunbeam-0.cloud.sts/openstack.ovn-relay
rabbitmq active local juju-5d90c3-sunbeam-0.cloud.sts/openstack.rabbitmq
App Version Status Scale Charm Channel Rev Exposed Message
controller active 1 juju-controller 3.2/stable 14 no
microceph unknown 0 microceph edge 9 no
microk8s active 1 microk8s legacy/stable 121 no
openstack-hypervisor active 1 openstack-hypervisor 2023.1/stable 105 no
sunbeam-machine active 1 sunbeam-machine latest/edge 1 no
Unit Workload Agent Machine Public address Ports Message
controller/0* active idle 0 10.5.1.11
microk8s/0* active idle 0 10.5.1.11 16443/tcp
openstack-hypervisor/0* active idle 0 10.5.1.11
sunbeam-machine/0* active idle 0 10.5.1.11
Machine State Address Inst id Base AZ Message
0 started 10.5.1.11 manual: ubuntu@22.04 Manually provisioned machine
Offer Application Charm Rev Connected Endpoint Interface Role
microceph microceph microceph 9 0/0 ceph ceph-client provider
ubuntu@juju-5d90c3-sunbeam-0:~$ juju switch openstack
sunbeam-controller:admin/controller -> sunbeam-controller:juju-5d90c3-sunbeam-0.cloud.sts/openstack
ubuntu@juju-5d90c3-sunbeam-0:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp
openstack sunbeam-controller sunbeam-microk8s/localhost 3.2.0 unsupported 03:56:29Z
App Version Status Scale Charm Channel Rev Address Exposed Message
certificate-authority active 1 tls-certificates-operator latest/stable 22 10.152.183.253 no
cinder waiting 1 cinder-k8s 2023.1/stable 47 10.152.183.47 no installing agent
cinder-ceph waiting 1 cinder-ceph-k8s 2023.1/stable 38 10.152.183.65 no installing agent
cinder-ceph-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.165 no
cinder-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.124 no
glance active 1 glance-k8s 2023.1/stable 59 10.152.183.202 no
glance-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.77 no
horizon active 1 horizon-k8s 2023.1/stable 56 10.152.183.234 no http://10.20.21.10/openstack-horizon
horizon-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.218 no
keystone waiting 1 keystone-k8s 2023.1/stable 125 10.152.183.123 no installing agent
keystone-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.78 no
mysql 8.0.34-0ubuntu0.22.04.1 active 1 mysql-k8s 8.0/candidate 99 10.152.183.183 no
neutron waiting 1 neutron-k8s 2023.1/stable 53 10.152.183.187 no installing agent
neutron-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.45 no
nova waiting 1 nova-k8s 2023.1/stable 48 10.152.183.59 no installing agent
nova-api-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.46 no
nova-cell-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.194 no
nova-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.110 no
ovn-central active 1 ovn-central-k8s 23.03/stable 61 10.152.183.195 no
ovn-relay active 1 ovn-relay-k8s 23.03/stable 49 10.20.21.11 no
placement active 1 placement-k8s 2023.1/stable 43 10.152.183.90 no
placement-mysql-router 8.0.34-0ubuntu0.22.04.1 active 1 mysql-router-k8s 8.0/candidate 64 10.152.183.210 no
rabbitmq 3.9.13 active 1 rabbitmq-k8s 3.9/stable 30 10.20.21.12 no
traefik 2.10.4 maintenance 1 traefik-k8s 1.0/candidate 148 10.20.21.10 no updating ingress configuration for 'ingress:48'
Unit Workload Agent Address Ports Message
certificate-authority/0* active idle 10.1.105.20
cinder-ceph-mysql-router/0* active idle 10.1.105.9
cinder-ceph/0* blocked idle 10.1.105.12 (ceph) integration missing
cinder-mysql-router/0* active idle 10.1.105.7
cinder/0* waiting idle 10.1.105.30 (workload) Not all relations are ready
glance-mysql-router/0* active idle 10.1.105.19
glance/0* active idle 10.1.105.35
horizon-mysql-router/0* active idle 10.1.105.11
horizon/0* active idle 10.1.105.13
keystone-mysql-router/0* active idle 10.1.105.25
keystone/0* waiting idle 10.1.105.22 (workload) Not all relations are ready
mysql/0* active idle 10.1.105.36 Primary
neutron-mysql-router/0* active idle 10.1.105.26
neutron/0* waiting idle 10.1.105.29 (workload) Not all relations are ready
nova-api-mysql-router/0* active idle 10.1.105.21
nova-cell-mysql-router/0* active idle 10.1.105.18
nova-mysql-router/0* active idle 10.1.105.8
nova/0* waiting idle 10.1.105.31 (workload) Not all relations are ready
ovn-central/0* active idle 10.1.105.37
ovn-relay/0* active idle 10.1.105.10
placement-mysql-router/0* active idle 10.1.105.28
placement/0* active idle 10.1.105.27
rabbitmq/0* active idle 10.1.105.23
traefik/0* maintenance idle 10.1.105.24 updating ingress configuration for 'ingress:48'
Offer Application Charm Rev Connected Endpoint Interface Role
certificate-authority certificate-authority tls-certificates-operator 22 1/1 certificates tls-certificates provider
keystone keystone keystone-k8s 125 1/1 identity-credentials keystone-credentials provider
ovn-relay ovn-relay ovn-relay-k8s 49 1/1 ovsdb-cms-relay ovsdb-cms provider
rabbitmq rabbitmq rabbitmq-k8s 30 1/1 amqp rabbitmq provider
Normally, it would be installed like this:
sudo update-alternatives --install "$(which editor)" editor "$(which vim)" 15
sudo update-alternatives --config editor
# this only affects snap downloads, so it's useless here
#sudo systemctl edit --full snapd.service
#[Service]
#Environment="HTTP_PROXY=http://192.168.99.186:9311"
#Environment="HTTPS_PROXY=http://192.168.99.186:9311"
#Environment="NO_PROXY=localhost,127.0.0.1,192.168.0.0/24,10.0.0.0/8,172.16.0.0/16,*.lan"
#sudo systemctl restart snapd.service
#sudo snap set system proxy.http="http://localhost:8081"
#sudo snap set system proxy.https="http://localhost:8081"
snap info microk8s
sudo snap install microk8s --classic
mkdir -p ~/.kube && sudo chown -R $USER ~/.kube
sudo usermod -a -G microk8s $USER && newgrp microk8s #switch to the group microk8s
sudo journalctl -f -u snap.microk8s.daemon-kubelite
microk8s.kubectl get pods --all-namespaces
microk8s.ctr --namespace k8s.io image ls
microk8s.ctr --namespace k8s.io containers ls
alias kubectl='sudo /snap/bin/microk8s.kubectl'
source <(kubectl completion bash) && kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl
kubectl describe pod -n kube-system coredns-864597b5fd-pj27h
But in a restricted network this is bound to fail; both of the attempts below, modifying containerd-env and containerd-template.toml, failed:
curl -x http://192.168.99.186:3129 https://registry.k8s.io
openssl s_client -connect registry.k8s.io:443
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull registry.k8s.io/pause:3.7
sudo ctr --namespace=k8s.io images pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7
#echo -e 'HTTP_PROXY=http://192.168.99.186:3129\nHTTPS_PROXY=http://192.168.99.186:3129' |tee -a /var/snap/microk8s/current/args/containerd-env
sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sudo microk8s.stop && sudo microk8s.start #no need to restart microk8s; restarting snap.microk8s.daemon-containerd above is enough
Trying to download all the images one by one also failed:
kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{.metadata.namespace}:{.metadata.name}{"\t"}{range .spec.containers[*]}{.image}{"\n"}{end}{end}'
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull registry.k8s.io/pause:3.7
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull docker.io/calico/kube-controllers:v3.25.1
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace=k8s.io images pull docker.io/calico/node:v3.25.1
sudo microk8s.ctr --namespace=k8s.io images pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.10.1
Using L3-layer tools still failed.
So switch to the following method to download all the images:
git clone https://github.com/ubuntu/microk8s.git
cd microk8s
# it's 1.28, according to 'snap info microk8s'
git checkout -b 1.28 v1.28
# grep -ir 'image:' * | awk '{print $3 $4}'
# reference - https://soulteary.com/2019/09/08/build-your-k8s-environment-with-microk8s.html
images=(
nginx:latest
rocks.canonical.com/cdk/diverdane/nginxdualstack:1.0.0
nginx:1.14.2
cdkbot/microbot-amd64
docker.io/calico/cni:v3.23.4
docker.io/calico/cni:v3.23.4
docker.io/calico/pod2daemon-flexvol:v3.23.4
docker.io/calico/node:v3.23.4
docker.io/calico/kube-controllers:v3.23.4
docker.io/calico/cni:v3.21.1
docker.io/calico/cni:v3.21.1
docker.io/calico/pod2daemon-flexvol:v3.21.1
docker.io/calico/node:v3.21.1
docker.io/calico/kube-controllers:v3.17.3
docker.io/calico/cni:v3.25.1
docker.io/calico/cni:v3.25.1
docker.io/calico/node:v3.25.1
docker.io/calico/kube-controllers:v3.25.1
)
for image in ${images[@]};do
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace k8s.io images pull $image
done
# remember to handle pause as well
#sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace k8s.io images pull registry.k8s.io/pause:3.7
sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
microk8s.kubectl get pods --all-namespaces -A
sudo usermod -a -G microk8s hua && sudo chown -R hua ~/.kube
newgrp microk8s
microk8s.inspect && microk8s status
microk8s.kubectl get pods --all-namespaces -A
# A handy way to find the error reason:
microk8s.kubectl describe pod --all-namespaces > tmp/tmp #error reason
grep -r failed tmp/tmp |tail -n1
# Then restart microk8s (microk8s stop && microk8s start)
# After that, 'kubectl get pods --all-namespaces -A' showed one calico-node failing to start because a landscape service occupied port 9099
kubectl get pods --all-namespaces -A
These images can be backed up and re-imported with the commands below:
# Note: when importing, remember to use microk8s.ctr instead of ctr, otherwise the images won't show up in 'microk8s.ctr --namespace k8s.io image ls'
sudo ctr --namespace k8s.io images ls -q |grep -v sha256 |xargs -I {} sh -c 'fname=$(echo "{}" | tr -s "/" "_"); sudo ctr --namespace k8s.io image export ${fname}.tar "{}"'
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
microk8s.ctr --namespace k8s.io image ls
sudo microk8s.stop && sudo microk8s.start
sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sunbeam-specific images
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 ctr --namespace k8s.io images pull quay.io/metallb/speaker:v0.13.3
kubectl delete -n metallb-system pod speaker-52bqz --grace-period=0 --force
Continue by deploying cos:
# https://charmhub.io/topics/canonical-observability-stack/tutorials/install-microk8s
juju add-model cos sunbeam
juju switch cos
juju deploy cos-lite --trust
watch --color juju status --color --relations
# Backup images, remember to use 'mcirok8s.ctr' instead of 'ctr', and remove 'ctr' by 'sudo apt purge containerd docker.io -y'
microk8s.ctr --namespace k8s.io images ls -q |grep -v sha256 |xargs -I {} sh -c 'fname=$(echo "{}" | tr -s "/" "_"); microk8s.ctr --namespace k8s.io image export ${fname}.tar "{}"'
# ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
# microk8s.ctr --namespace k8s.io image ls
# Reset the env
sudo snap remove --purge openstack
sudo snap remove --purge juju
sudo snap remove --purge juju-db
sudo snap remove --purge kubectl
sudo /usr/sbin/remove-juju-services
sudo rm -rf /var/lib/juju
rm -rf ~/.local/share/juju
rm -rf ~/snap/juju/
rm -rf ~/snap/openstack
rm -rf ~/snap/openstack-hypervisor
rm -rf ~/snap/microstack/
rm -rf ~/snap/microk8s/
sudo snap remove --purge vault
sudo snap remove --purge microk8s
sudo snap remove --purge openstack-hypervisor
sudo init 6 #最好重启,否则会有一些calico的网卡和namespace去不了
# Create an ssh key without a passphrase, and use 'NOPASSWD:ALL'
ssh-keygen
echo 'hua ALL=(ALL) NOPASSWD:ALL' |sudo tee -a /etc/sudoers
# Install sunbeam
sudo snap install openstack --channel 2023.1/stable
sunbeam prepare-node-script | bash -x
sudo usermod -a -G snap_daemon $USER && newgrp snap_daemon
sunbeam cluster bootstrap --accept-defaults
journalctl -f
# None of the following is needed any more; the mirror approach is enough
# Monitor the status; once the microk8s snap shows up in 'snap list |grep k8s', or the log prints 'Adding MicroK8S unit to machine ...', load the images
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g");echo $fname; microk8s.ctr --namespace k8s.io image import {} ${fname}'
# or use micro.registry
#sudo microk8s.enable registry
#ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); sudo docker load -i {}; sudo docker tag ${fname} localhost:32000/${fname}; sudo docker push localhost:32000/${fname}'
microk8s.ctr --namespace k8s.io image ls
sed -i "s#registry.k8s.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /var/snap/microk8s/current/args/containerd-template.toml
sudo systemctl restart snap.microk8s.daemon-containerd.service
# sudo microk8s.stop && sudo microk8s.start #no need to restart microk8s; restarting snap.microk8s.daemon-containerd above is enough
microk8s.kubectl get pods --all-namespaces
cat << EOF |tee -a /var/snap/microk8s/current/args/containerd-env
# note: the HTTPS_PROXY value should also use the http:// scheme
HTTP_PROXY=http://192.168.99.186:3129
HTTPS_PROXY=http://192.168.99.186:3129
NO_PROXY=10.0.0.0/8,127.0.0.0/16,192.168.0.0/16
EOF
# If there is a pause during the installation of pod in microk8s, we can restart microk8s by: sudo microk8s.stop && sudo microk8s.start
# Use sunbeam
alias kubectl='sudo /snap/bin/microk8s.kubectl'
source <(kubectl completion bash) && kubectl completion bash |sudo tee /etc/bash_completion.d/kubectl
source <(openstack complete) && openstack complete |sudo tee /etc/bash_completion.d/openstack
sunbeam configure --accept-defaults --openrc demo-openrc
sunbeam openrc > admin-openrc
source admin-openrc
sunbeam launch ubuntu --name test
sudo journalctl -u snap.openstack.clusterd.service -f
sudo snap logs openstack-hypervisor.ovn-controller
openstack hypervisor list
sunbeam cluster list
# Debug hacks
microk8s.kubectl describe pod --all-namespaces > tmp && grep -r failed tmp |tail -n1
microk8s.kubectl logs -n kube-system pod xxx
$ microk8s.ctr --namespace k8s.io image ls -q |grep -v sha256
docker.io/calico/cni:v3.21.1
docker.io/calico/cni:v3.23.4
docker.io/calico/cni:v3.23.5
docker.io/calico/cni:v3.25.1
docker.io/calico/kube-controllers:v3.17.3
docker.io/calico/kube-controllers:v3.23.4
docker.io/calico/kube-controllers:v3.23.5
docker.io/calico/kube-controllers:v3.25.1
docker.io/calico/node:v3.21.1
docker.io/calico/node:v3.23.4
docker.io/calico/node:v3.23.5
docker.io/calico/node:v3.25.1
docker.io/calico/pod2daemon-flexvol:v3.21.1
docker.io/calico/pod2daemon-flexvol:v3.23.4
docker.io/cdkbot/hostpath-provisioner:1.4.2
docker.io/coredns/coredns:1.9.3
docker.io/jujusolutions/charm-base:ubuntu-20.04
docker.io/jujusolutions/charm-base:ubuntu-22.04
docker.io/jujusolutions/jujud-operator:3.2.0
quay.io/metallb/controller:v0.13.3
quay.io/metallb/speaker:v0.13.3
registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.7
registry.k8s.io/pause:3.7
rocks.canonical.com/cdk/diverdane/nginxdualstack:1.0.0
During this process, if 'juju clouds' reports the problem below, it's because it was run inside 'newgrp snap_daemon': starting with juju 3.x the snap is strictly confined (see 'snap debug sandbox-features'), so running as the snap_daemon group is bound to lack permissions.
update.go:85: cannot change mount namespace according to change mount (/run/user/1000/doc/by-app/snap.juju /run/user/1000/doc none bind,rw,x-snapd.ignore-missing 0 0): cannot inspect "/run/user/1000/doc": lstat /run/user/1000/doc: permission denied
If you hit the problem below, search for 'restrict' in /etc/ntp.conf and disable ipv6:
11月 29 13:43:28 minipc ntpd[1910]: bind(30) AF_INET6 fe80::ecee:eeff:feee:eeee%14#123 flags 0x11 failed: Cannot assign requested address
11月 29 13:43:28 minipc ntpd[1910]: unable to create socket on cali93e42ce2874 (14) for fe80::ecee:eeff:feee:eeee%14#123
Also, if snaps cannot be downloaded, the network on the gw is at fault. Pods in microk8s that never finish deploying may be related to this too, since they pull the openstack charm images from registry.jujucharms.com (which is currently reachable):
grep -r 'PullImage from image service failed' /var/log/syslog | awk -F'image="' '{split($2, a, "@sha256:"); print a[1]}'
Nov 29 13:50:15 minipc microk8s.daemon-kubelite[38665]: E1129 13:50:15.323748 38665 remote_image.go:171] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a\": failed to resolve reference \"registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a\": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed" image="registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a"
sudo HTTP_PROXY=http://192.168.99.186:3129 HTTPS_PROXY=http://192.168.99.186:3129 microk8s.ctr --namespace=k8s.io images pull registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image
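The awk filter above can be dry-run against a single sample log line; a standalone sketch (the line below is an abbreviated copy of the syslog entry):

```shell
# Extract the failing image ref: take everything between image=" and
# @sha256: in the kubelite log line.
line='remote_image.go:171] "PullImage from image service failed" image="registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image@sha256:d132bf917fde0e48743ace9f0bceb0ae3ba17a7cc41c0a76c4160a1fb606940a"'
echo "$line" | awk -F'image="' '{split($2, a, "@sha256:"); print a[1]}'
# -> registry.jujucharms.com/charm/6a0rnzywlucfo4rvn7y2aylcc19uaarnwsrge/ovn-sb-db-server-image
```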
How the registry is installed:
sudo microk8s.enable registry
At first I didn't know the port; with the commands below I found the port is 5000, exported externally as 32000:
journalctl -u snap.microk8s.daemon-containerd -u snap.microk8s.daemon-registry
cat /var/snap/microk8s/current/args/containerd.toml |grep '\.registry' -A1
cat /var/snap/microk8s/6100/args/certs.d/localhost\:32000/hosts.toml
hua@minipc:~$ sudo microk8s.kubectl logs -n container-registry registry-77c7575667-q9qr2 |tail -n1
time="2023-11-29T06:08:29.221059247Z" level=info msg="listening on [::]:5000" go.version=go1.16.15 instance.id=5e67e401-e456-49bc-acaf-d6406f012e7f service=registry version="v2.8.1+unknown"
$ microk8s.kubectl get svc -A |grep reg
container-registry registry NodePort 10.152.183.20 5000:32000/TCP 33m
curl http://192.168.99.179:32000/v2/_catalog
sudo microk8s.kubectl proxy
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
kubectl run -it --rm --image=alpine:latest test-container-registry -- sh
# apk add curl
Import the images:
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); microk8s.ctr image import {} localhost:32000/${fname}'
But it's unclear how to list the images stored in localhost:32000. "microk8s.ctr images ls localhost:32000" certainly won't work, because ctr talks to containerd rather than to a registry. Docker might help (sudo docker image ls localhost:32000/k8s.io), but that was empty too; do the images need to be loaded from ctr into docker, or re-tagged with docker, first? (The registry's HTTP API used above, curl http://localhost:32000/v2/_catalog, also lists its repositories.)
ls *.tar | xargs -I {} sh -c 'fname=$(echo "{}" | sed "s/_/\//g" | sed "s/.tar$//g"); sudo docker load -i {}; sudo docker tag ${fname} localhost:32000/${fname}; sudo docker push localhost:32000/${fname}'
#sudo docker image ls localhost:32000
# 'docker image ls localhost:32000' above still shows nothing; use the commands below instead. docker doesn't care where an image is stored, it only looks at the tag (note the tag contains localhost)
sudo docker image ls
sudo docker image ls localhost:32000/docker.io/calico/node
sudo docker pull registry.k8s.io/pause:3.7
sudo docker pull k8s.m.daocloud.io/pause:3.7
# https://github.com/DaoCloud/public-image-mirror
# https://microk8s.io/docs/registry-private
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/registry.k8s.io
echo '
server = "registry.k8s.io"
[host."https://registry.aliyuncs.com/v2/google_containers"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/registry.k8s.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/rocks.canonical.com
echo '
server = "rocks.canonical.com"
[host."https://rocks-canonical.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/rocks.canonical.com/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/registry.jujucharms.com
echo '
server = "registry.jujucharms.com"
[host."https://jujucharms.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/registry.jujucharms.com/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/quay.io
echo '
server = "quay.io"
[host."https://quay.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/quay.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/gcr.io
echo '
server = "gcr.io"
[host."https://gcr.m.daocloud.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/gcr.io/hosts.toml
sudo mkdir -p /var/snap/microk8s/current/args/certs.d/docker.io
echo '
server = "docker.io"
[host."https://m.daocloud.io/docker.io"]
capabilities = ["pull", "resolve"]
override_path = true
' | sudo tee /var/snap/microk8s/current/args/certs.d/docker.io/hosts.toml
# If this doesn't take effect, it's probably the permission issue below
sudo chown -R hua:hua /var/snap/microk8s/current/args/certs.d
sudo snap restart microk8s
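The six hosts.toml files above all follow the same template, so they can also be generated in one loop. A sketch assuming the same registry/mirror pairs as above; CERTS_D defaults to a scratch directory here, while on a real node it would be /var/snap/microk8s/current/args/certs.d (written with sudo):

```shell
# Generate one containerd hosts.toml per registry from a mirror list.
CERTS_D=${CERTS_D:-/tmp/certs.d}
while read -r reg mirror; do
  mkdir -p "$CERTS_D/$reg"
  cat > "$CERTS_D/$reg/hosts.toml" <<EOF
server = "$reg"
[host."$mirror"]
capabilities = ["pull", "resolve"]
override_path = true
EOF
done <<'LIST'
registry.k8s.io https://registry.aliyuncs.com/v2/google_containers
rocks.canonical.com https://rocks-canonical.m.daocloud.io
registry.jujucharms.com https://jujucharms.m.daocloud.io
quay.io https://quay.m.daocloud.io
gcr.io https://gcr.m.daocloud.io
docker.io https://m.daocloud.io/docker.io
LIST
ls "$CERTS_D"
```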
With the mirror approach above, sunbeam installed successfully, but launching a test VM reported the error below:
$ sudo dmesg | grep 'apparmor="DENIED"' |tail -n1
[ 8189.359936] audit: type=1400 audit(1701263375.834:2157): apparmor="DENIED" operation="open" profile="snap.openstack-hypervisor.nova-api-metadata" name="/etc/nova/api-paste.ini" pid=1440903 comm="python3" requested_mask="r" denied_mask="r" fsuid=0 ouid=1000
It can be fixed like this:
sudo vim /var/lib/snapd/apparmor/profiles/snap.openstack-hypervisor.nova-api-metadata
/etc/nova/api-paste.ini r,
/etc/nova/** r,
sudo apparmor_parser -r /var/lib/snapd/apparmor/profiles/snap.openstack-hypervisor.nova-api-metadata
But after that, 'openstack hypervisor list' was still empty, with the error below. There are simply too many bugs at this point, so I'm stopping here:
hua@minipc:~$ journalctl -f -u snap.openstack-hypervisor.nova-compute.service
11月 29 21:31:06 minipc nova-compute[1655283]: 2023-11-29 21:31:06.227 1655283 INFO nova.virt.libvirt.driver [None req-646001d6-e3ff-414d-8be4-6bf0c1882b2b - - - - - -] Connection event '0' reason 'Failed to connect to libvirt: Failed to connect socket to '/var/snap/openstack-hypervisor/common/run/libvirt/virtqemud-sock': No such file or directory'
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-EOF
{
"registry-mirrors": [
"https://dockerproxy.com",
"https://mirror.baidubce.com",
"https://docker.m.daocloud.io",
"https://docker.nju.edu.cn",
"https://docker.mirrors.sjtug.sjtu.edu.cn"
]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
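Since dockerd refuses to start when daemon.json doesn't parse, it's worth validating the JSON before that restart. A sketch against a scratch copy (point it at /etc/docker/daemon.json on the real host); `python3 -m json.tool` exits non-zero on invalid JSON:

```shell
# Validate the mirror-list JSON before restarting docker.
cfg=/tmp/daemon-test.json
cat > "$cfg" <<'EOF'
{
  "registry-mirrors": [
    "https://docker.m.daocloud.io",
    "https://docker.nju.edu.cn"
  ]
}
EOF
python3 -m json.tool "$cfg" >/dev/null && echo "daemon.json OK"
```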
[1] Sunbeam underlying projects - https://discourse.ubuntu.com/t/sunbeam-underlying-projects/37526
[2] https://github.com/canonical/snap-openstack.git