Node network (nodeIP): the physical NIC IP of each node, used for node-to-node communication
Pod network (podIP): Pods can reach each other directly via their Pod IPs
Service network (clusterIP): inside the K8s cluster, the clusterIP of a Service resource proxies and forwards traffic to the backend Pods
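All three address layers can be observed directly with kubectl (run on the master; column names follow standard kubectl output):
kubectl get nodes -o wide    # INTERNAL-IP column = nodeIP
kubectl get pods -o wide     # IP column = podIP
kubectl get svc              # CLUSTER-IP column = clusterIP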
Differences between VLAN and VXLAN:
1) Different purposes: VLAN is mainly used on switches to logically partition broadcast domains; combined with the STP spanning-tree protocol it can block redundant ports to prevent loops and broadcast storms.
VXLAN encapsulates Ethernet frames into UDP packets and carries them across the layer-3 network to other networks, building a virtual large layer-2 network.
2) VXLAN supports far more layer-2 segments: VXLAN supports up to 2^24; VLAN supports at most 2^12 (4094 usable, since two IDs are reserved).
3) VXLAN avoids exhausting the MAC tables of physical switches: with VLAN, MAC addresses must be learned into the switch's MAC table; VXLAN uses a tunneling mechanism, so the inner MAC addresses never need to be recorded on the physical switches.
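As an aside, a VXLAN segment can be created by hand on any Linux host to see these parameters; vxlan100, VNI 100, and eth0 below are illustrative values only, not part of the flannel setup:
ip link add vxlan100 type vxlan id 100 dstport 4789 dev eth0    # the VNI is a 24-bit field, hence 2^24 segments
ip -d link show vxlan100    # the details line shows the vxlan id and UDP dstport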
flannel supports three backend modes:
UDP: the earliest mode, with the worst performance; packets are encapsulated/decapsulated in userspace by the flanneld process
VXLAN: flannel's default and recommended mode; better performance than UDP because frames are encapsulated/decapsulated in the kernel, and it is simple to configure and easy to use
HOST-GW: the best-performing mode, but its configuration is more complex and it cannot cross subnets
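The mode is selected by the Backend.Type field of flannel's net-conf.json (shown in full inside the kube-flannel.yml ConfigMap later in this section); a minimal sketch, where "vxlan" can be swapped for "host-gw" or "udp":
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}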
How flannel's UDP mode works:
1) The original packet leaves the Pod container on the source host, reaches the cni0 bridge interface, and is forwarded from cni0 to the flannel0 virtual interface
2) The flanneld process listens for data arriving on the flannel0 interface and wraps the original packet in a UDP message
3) flanneld looks up the target Pod's nodeIP in the routing table maintained in etcd, adds the nodeIP header and MAC header outside the UDP message, and sends it out through the physical NIC to the target node
4) The UDP message arrives on port 8285 at the target node's flanneld process, which decapsulates it; the packet is then sent via the flannel0 interface to the cni0 bridge according to local routing rules, and cni0 delivers it to the target Pod container
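On a node running the UDP mode, the userspace data path can be confirmed with standard iproute2/ss tools (hypothetical check commands):
ip -d link show flannel0    # flannel0 is a tun device, handled by flanneld in userspace
ss -nulp | grep 8285        # flanneld listening on UDP port 8285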
Upload cni-plugins-linux-amd64-v1.3.0.tgz, flannel-cni-plugin.tar, and flannel.tar to the /opt directory, then load the images and install the CNI plugins:
docker load -i flannel-cni-plugin.tar
docker load -i flannel.tar
mkdir -p /opt/cni/bin
tar xf cni-plugins-linux-amd64-v1.3.0.tgz -C /opt/cni/bin
Copy the kube-flannel.yml file to the /opt/k8s directory and deploy the CNI network:
scp kube-flannel.yml 192.168.233.10:/opt/k8s/
On the master, apply the manifest and check the rollout:
kubectl apply -f kube-flannel.yml
kubectl get pods -n kube-flannel
kubectl get pods -n kube-flannel -o wide
The wide output also shows the node IP each flannel Pod is running on.
The network mode in use can be read from the manifest itself (kube-flannel.yml):
apiVersion: v1
kind: Namespace
metadata:
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
  name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-ds
  namespace: kube-flannel
spec:
  selector:
    matchLabels:
      app: flannel
      k8s-app: flannel
  template:
    metadata:
      labels:
        app: flannel
        k8s-app: flannel
        tier: node
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      containers:
      - args:
        - --ip-masq
        - --kube-subnet-mgr
        command:
        - /opt/bin/flanneld
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        image: docker.io/flannel/flannel:v0.21.5
        name: kube-flannel
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          privileged: false
        volumeMounts:
        - mountPath: /run/flannel
          name: run
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
        - mountPath: /run/xtables.lock
          name: xtables-lock
      hostNetwork: true
      initContainers:
      - args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        command:
        - cp
        image: docker.io/flannel/flannel-cni-plugin:v1.1.2
        name: install-cni-plugin
        volumeMounts:
        - mountPath: /opt/cni/bin
          name: cni-plugin
      - args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        command:
        - cp
        image: docker.io/flannel/flannel:v0.21.5
        name: install-cni
        volumeMounts:
        - mountPath: /etc/cni/net.d
          name: cni
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
      priorityClassName: system-node-critical
      serviceAccountName: flannel
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /run/flannel
        name: run
      - hostPath:
          path: /opt/cni/bin
        name: cni-plugin
      - hostPath:
          path: /etc/cni/net.d
        name: cni
      - configMap:
          name: kube-flannel-cfg
        name: flannel-cfg
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
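Note the net-conf.json in the ConfigMap above: "Network": "10.244.0.0/16" matches flannel's default Pod network, and "Backend": {"Type": "vxlan"} confirms this deployment runs in the VXLAN mode described next.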
How flannel's VXLAN mode works:
1) The original frame leaves the Pod container on the source host, reaches the cni0 bridge interface, and is forwarded from cni0 to the flannel.1 virtual interface
2) After receiving the frame, the flannel.1 interface adds a VXLAN header, and the kernel wraps the original frame in a UDP message
3) The target Pod's nodeIP is looked up in the routing table maintained in etcd; the nodeIP header and MAC header are added outside the UDP message, and it is sent through the physical NIC to the target node
4) The UDP message arrives on port 8472 at the target node's flannel.1 interface and is decapsulated in the kernel; the frame is then sent to the cni0 bridge according to local routing rules, and cni0 delivers it to the target Pod container
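On a node running the VXLAN mode, the kernel data path can be inspected with standard iproute2 tools:
ip -d link show flannel.1      # details show "vxlan id 1 ... dstport 8472": a kernel VTEP device
bridge fdb show dev flannel.1  # forwarding entries mapping peer VTEP MACs to remote node IPs
ip route | grep flannel.1      # each remote node's Pod subnet is routed via flannel.1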
How calico's IPIP mode works:
1) The original packet leaves the Pod container on the source host through a veth pair device, reaches the tunl0 interface, and is encapsulated by the kernel's IPIP driver into an IP packet of the node network
2) It is sent through the physical NIC to the target node according to the routing rules maintained by Felix
3) When the IP packet arrives at the target node's tunl0 interface, the kernel's IPIP driver decapsulates it to recover the original packet, which is then delivered to the target Pod container through a veth pair device according to local routing rules
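On a calico node in IPIP mode, the tunnel can be inspected in the same way (assuming standard iproute2 tools):
ip -d link show tunl0    # an ipip tunnel device
ip route | grep tunl0    # routes to remote Pod subnets via tunl0, installed by Felix (shown as "proto bird")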
How calico's BGP mode works (essentially, Pod-to-Pod communication is implemented purely with routing rules):
Every Pod container has a veth pair device: one end is attached to the container and the other to the host network namespace, with a matching route installed.
These routes are maintained and configured by Felix, and the BIRD component distributes the routing information to other nodes via the BGP dynamic routing protocol.
1) The original packet leaves the Pod container on the source host and reaches the host network namespace through the veth pair device
2) It is sent through the physical NIC to the target node according to the routing rules maintained by Felix
3) After receiving the packet, the target node delivers it to the target Pod container through a veth pair device according to local routing rules
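On a calico node in BGP mode, the routing-only data path is visible as plain routes and veth devices (the last command assumes calicoctl is installed):
ip route | grep bird     # per-subnet routes learned from other nodes over BGP
ip link | grep cali      # one host-side veth device (cali...) per local Pod
calicoctl node status    # state of the BGP sessions maintained by BIRD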
flannel: UDP, VXLAN, HOST-GW
Default network: 10.244.0.0/16
VXLAN mode is the usual choice; it is an overlay network that transports data through an IP tunnel, which costs some performance.
Flannel is mature, has few dependencies, is easy to install, functionally simple, convenient to configure, and easy to manage, but it cannot express complex network policies.
calico: IPIP, BGP, and a hybrid mode (CrossSubnet)
Default network: 192.168.0.0/16
IPIP mode can cross subnets, but the extra encapsulation and decapsulation on the data path costs some performance.
BGP mode treats each node as a router; Felix and BIRD maintain and distribute the routing rules, so forwarding happens directly via BGP routes with no extra encapsulation or decapsulation, which gives better performance, but it only works within a single subnet and cannot cross subnets.
calico does not use a cni0 bridge; it sends packets straight to the target host via routing rules, so its performance is higher, and it also offers much richer network-policy configuration and management, making it more full-featured but more complex to maintain.
In short: for a smaller K8s cluster with simple network requirements, flannel is a good CNI choice; for a large cluster that needs more network-policy configuration, consider the better-performing and more full-featured calico or cilium.
CoreDNS is the default implementation of in-cluster DNS for K8s, providing DNS resolution for Pods inside the cluster:
it resolves a Service resource name to the corresponding clusterIP
it resolves the name of a Pod created by a StatefulSet controller to the corresponding podIP
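The name forms CoreDNS resolves (cluster.local is the default cluster domain, configured in the Corefile below):
<service>.<namespace>.svc.cluster.local                       -> the Service's clusterIP
<pod>.<headless-service>.<namespace>.svc.cluster.local        -> the Pod's podIP (StatefulSet Pods)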
Upload coredns.tar to the /opt directory and load the image:
docker load -i coredns.tar
Upload the coredns.yaml file to the /opt/k8s directory and deploy CoreDNS:
# __MACHINE_GENERATED_WARNING__

apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: Reconcile
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
    addonmanager.kubernetes.io/mode: EnsureExists
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values: ["kube-dns"]
              topologyKey: kubernetes.io/hostname
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      containers:
      - name: coredns
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.7.0
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.0.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
kubectl apply -f coredns.yaml
kubectl get pods -n kube-system
DNS resolution can be tested from a temporary Pod:
kubectl run -it --rm dns-test --image=busybox:1.28.4 -- sh
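Inside the busybox shell, resolution can be verified against well-known Service names (assuming the default cluster.local domain):
nslookup kubernetes.default.svc.cluster.local
nslookup kube-dns.kube-system.svc.cluster.local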
kubectl get svc