cordon节点drain驱逐节点delete节点详解

本站所有内容来自互联网收集,仅供学习和交流,请勿用于商业用途。如有侵权、不妥之处,请第一时间联系我们删除!Q群:迪思分享

免费资源网 – https://freexyz.cn/
目录一.系统环境二.前言三.cordon节点3.1 cordon节点概览3.2 cordon节点3.3 uncordon节点四.drain节点4.1 drain节点概览4.2 drain 节点4.3 uncordon节点五.delete 节点5.1 delete节点概览5.2 delete节点

一.系统环境

服务器版本docker软件版本Kubernetes(k8s)集群版本CPU架构CentOS Linux release 7.4.1708 (Core)Docker version 20.10.12v1.21.9x86_64

Kubernetes集群架构:k8scloude1作为master节点,k8scloude2,k8scloude3作为worker节点

服务器操作系统版本CPU架构进程功能描述k8scloude1/192.168.110.130CentOS Linux release 7.4.1708 (Core)x86_64docker,kube-apiserver,etcd,kube-scheduler,kube-controller-manager,kubelet,kube-proxy,coredns,calicok8s master节点k8scloude2/192.168.110.129CentOS Linux release 7.4.1708 (Core)x86_64docker,kubelet,kube-proxy,calicok8s worker节点k8scloude3/192.168.110.128CentOS Linux release 7.4.1708 (Core)x86_64docker,kubelet,kube-proxy,calicok8s worker节点

二.前言

本文介绍cordon节点,drain驱逐节点,delete 节点,在对k8s集群节点执行维护(例如内核升级、硬件维护等)时候会用到。

cordon节点,drain驱逐节点,delete 节点的前提是已经有一套可以正常运行的Kubernetes集群,关于Kubernetes(k8s)集群的安装部署,可以查看博客《Centos7 安装部署Kubernetes(k8s)集群》

三.cordon节点

3.1 cordon节点概览

cordon 节点会使其停止调度,会将node状态调为SchedulingDisabled,之后再创建新pod,新pod不会被调度到该节点,原有的pod不会受到影响,仍正常对外提供服务。

3.2 cordon节点

创建目录存放yaml文件

[root@k8scloude1 ~]# mkdir deploy [root@k8scloude1 ~]# cd deploy/

使用–dry-run生成deploy配置文件

[root@k8scloude1 deploy]# kubectl create deploy nginx –image=nginx –dry-run=client -o yaml >nginx.yaml [root@k8scloude1 deploy]# cat nginx.yaml apiVersion: apps/v1 kind: Deployment metadata: creationTimestamp: null labels: app: nginx name: nginx spec: replicas: 1 selector: matchLabels: app: nginx strategy: {} template: metadata: creationTimestamp: null labels: app: nginx spec: containers: – image: nginx name: nginx resources: {} status: {}

修改deploy配置文件,replicas: 5表示副本数为 5,deploy将创建5个pod

[root@k8scloude1 deploy]# vim nginx.yaml #修改配置文件: # replicas: 5 副本数修改为5 #terminationGracePeriodSeconds: 0 宽限期修改为0 # imagePullPolicy: IfNotPresent 镜像下载策略为存在镜像就不下载 [root@k8scloude1 deploy]# cat nginx.yaml apiVersion: apps/v1 kind: Deployment metadata: creationTimestamp: null labels: app: nginx name: nginx spec: replicas: 5 selector: matchLabels: app: nginx strategy: {} template: metadata: creationTimestamp: null labels: app: nginx spec: terminationGracePeriodSeconds: 0 containers: – image: nginx name: nginx imagePullPolicy: IfNotPresent resources: {} status: {}

创建deploy和使用pod yaml文件创建pod

[root@k8scloude1 deploy]# cat pod.yaml apiVersion: v1 kind: Pod metadata: creationTimestamp: null labels: run: pod1 name: pod1 spec: terminationGracePeriodSeconds: 0 containers: – image: nginx imagePullPolicy: IfNotPresent name: n1 resources: {} dnsPolicy: ClusterFirst restartPolicy: Always status: {} [root@k8scloude1 deploy]# kubectl apply -f pod.yaml pod/pod1 created [root@k8scloude1 deploy]# kubectl apply -f nginx.yaml deployment.apps/nginx created

查看pod,可以看到deploy生成5个pod(nginx-6cf858f6cf-XXXXXXX),还有一个pod1。

[root@k8scloude1 deploy]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-6cf858f6cf-fwhmh 1/1 Running 0 52s 10.244.251.217 k8scloude3 <none> <none> nginx-6cf858f6cf-hr6bn 1/1 Running 0 52s 10.244.251.218 k8scloude3 <none> <none> nginx-6cf858f6cf-j2ccs 1/1 Running 0 52s 10.244.112.161 k8scloude2 <none> <none> nginx-6cf858f6cf-l7n4w 1/1 Running 0 52s 10.244.112.162 k8scloude2 <none> <none> nginx-6cf858f6cf-t6qxq 1/1 Running 0 52s 10.244.112.163 k8scloude2 <none> <none> pod1 1/1 Running 0 60s 10.244.251.216 k8scloude3 <none> <none>

假设某天要对k8scloude2进行维护测试,不希望k8scloude2节点上被分配新的pod,可以对某个节点执行cordon之后,此节点就不会再调度新的pod了

cordon k8scloude2节点,k8scloude2节点变为SchedulingDisabled状态

[root@k8scloude1 deploy]# kubectl cordon k8scloude2 node/k8scloude2 cordoned [root@k8scloude1 deploy]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 8d v1.21.0 k8scloude2 Ready,SchedulingDisabled <none> 7d23h v1.21.0 k8scloude3 Ready <none> 7d23h v1.21.0

kubectl scale deploy命令使nginx deploy的副本数扩展为10个

[root@k8scloude1 deploy]# kubectl scale deploy nginx –replicas=10 deployment.apps/nginx scaled

查看pod,可以发现新生成的pod都被调度到到k8scloude3上,某个节点被cordon之后,新的pod将不被调度到该节点,原先的pod不变

[root@k8scloude1 deploy]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-6cf858f6cf-7fdnr 1/1 Running 0 4s 10.244.251.221 k8scloude3 <none> <none> nginx-6cf858f6cf-fwhmh 1/1 Running 0 9m9s 10.244.251.217 k8scloude3 <none> <none> nginx-6cf858f6cf-g92ls 1/1 Running 0 4s 10.244.251.219 k8scloude3 <none> <none> nginx-6cf858f6cf-hr6bn 1/1 Running 0 9m9s 10.244.251.218 k8scloude3 <none> <none> nginx-6cf858f6cf-j2ccs 1/1 Running 0 9m9s 10.244.112.161 k8scloude2 <none> <none> nginx-6cf858f6cf-l7n4w 1/1 Running 0 9m9s 10.244.112.162 k8scloude2 <none> <none> nginx-6cf858f6cf-lsvsg 1/1 Running 0 4s 10.244.251.223 k8scloude3 <none> <none> nginx-6cf858f6cf-mpwjl 1/1 Running 0 4s 10.244.251.222 k8scloude3 <none> <none> nginx-6cf858f6cf-s8x6b 1/1 Running 0 4s 10.244.251.220 k8scloude3 <none> <none> nginx-6cf858f6cf-t6qxq 1/1 Running 0 9m9s 10.244.112.163 k8scloude2 <none> <none> pod1 1/1 Running 0 9m17s 10.244.251.216 k8scloude3 <none> <none>

来个极端的例子,先把deploy的副本数变为0,再变为10,此时所有的pod都运行在k8scloude3节点了。

[root@k8scloude1 deploy]# kubectl scale deploy nginx –replicas=0 deployment.apps/nginx scaled [root@k8scloude1 deploy]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod1 1/1 Running 0 10m 10.244.251.216 k8scloude3 <none> <none> [root@k8scloude1 deploy]# kubectl scale deploy nginx –replicas=10 deployment.apps/nginx scaled [root@k8scloude1 deploy]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-6cf858f6cf-5cx9s 1/1 Running 0 8s 10.244.251.231 k8scloude3 <none> <none> nginx-6cf858f6cf-6cblj 1/1 Running 0 8s 10.244.251.228 k8scloude3 <none> <none> nginx-6cf858f6cf-827cz 1/1 Running 0 8s 10.244.251.233 k8scloude3 <none> <none> nginx-6cf858f6cf-b989n 1/1 Running 0 8s 10.244.251.229 k8scloude3 <none> <none> nginx-6cf858f6cf-kwxhn 1/1 Running 0 8s 10.244.251.224 k8scloude3 <none> <none> nginx-6cf858f6cf-ljjxz 1/1 Running 0 8s 10.244.251.225 k8scloude3 <none> <none> nginx-6cf858f6cf-ltrpr 1/1 Running 0 8s 10.244.251.227 k8scloude3 <none> <none> nginx-6cf858f6cf-lwf7g 1/1 Running 0 8s 10.244.251.230 k8scloude3 <none> <none> nginx-6cf858f6cf-xw84l 1/1 Running 0 8s 10.244.251.226 k8scloude3 <none> <none> nginx-6cf858f6cf-zpwhq 1/1 Running 0 8s 10.244.251.232 k8scloude3 <none> <none> pod1 1/1 Running 0 11m 10.244.251.216 k8scloude3 <none> <none>

3.3 uncordon节点

要让节点恢复调度pod,uncordon即可。

uncordon k8scloude2节点,k8scloude2节点状态变为Ready,恢复调度。

#需要uncordon [root@k8scloude1 deploy]# kubectl uncordon k8scloude2 node/k8scloude2 uncordoned [root@k8scloude1 deploy]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 8d v1.21.0 k8scloude2 Ready <none> 8d v1.21.0 k8scloude3 Ready <none> 8d v1.21.0

四.drain节点

4.1 drain节点概览

在对节点执行维护(例如内核升级、硬件维护等)之前, 可以使用 kubectl drain 从节点安全地逐出所有 Pods。 安全的驱逐过程允许 Pod 的容器 体面地终止, 并确保满足指定的 PodDisruptionBudgets,PodDisruptionBudget 是一个对象,用于定义可能对一组 Pod 造成的最大干扰。。

说明: 默认情况下, kubectl drain 将忽略节点上不能杀死的特定系统 Pod;

drain 驱逐或删除除镜像 pod 之外的所有 pod(不能通过 API 服务器删除)。如果有 daemon set-managed pods,drain 不会在没有 –ignore-daemonsets 的情况下继续进行,并且无论如何它都不会删除任何 daemon set-managed pods,因为这些 pods 将立即被 daemon set 控制器替换,它会忽略不可调度的标记。如果有任何 pod 既不是镜像 pod,也不是由复制控制器、副本集、守护程序集、有状态集或作业管理的,那么除非您使用 –force,否则 drain 不会删除任何 pod。如果一个或多个 pod 的管理资源丢失, –force 也将允许继续删除。

kubectl drain 的成功返回,表明所有的 Pods(除了上一段中描述的被排除的那些), 已经被安全地逐出(考虑到期望的终止宽限期和你定义的 PodDisruptionBudget)。 然后就可以安全地关闭节点, 比如关闭物理机器的电源,如果它运行在云平台上,则删除它的虚拟机。

4.2 drain 节点

查看node状态和pod

[root@k8scloude1 deploy]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 8d v1.21.0 k8scloude2 Ready <none> 8d v1.21.0 k8scloude3 Ready <none> 8d v1.21.0 [root@k8scloude1 deploy]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-6cf858f6cf-58wnd 1/1 Running 0 65s 10.244.112.167 k8scloude2 <none> <none> nginx-6cf858f6cf-5rrk4 1/1 Running 0 65s 10.244.112.164 k8scloude2 <none> <none> nginx-6cf858f6cf-86wxr 1/1 Running 0 65s 10.244.251.237 k8scloude3 <none> <none> nginx-6cf858f6cf-89wj9 1/1 Running 0 65s 10.244.112.168 k8scloude2 <none> <none> nginx-6cf858f6cf-9njrj 1/1 Running 0 65s 10.244.251.236 k8scloude3 <none> <none> nginx-6cf858f6cf-hchtb 1/1 Running 0 65s 10.244.251.234 k8scloude3 <none> <none> nginx-6cf858f6cf-mb2ft 1/1 Running 0 65s 10.244.112.166 k8scloude2 <none> <none> nginx-6cf858f6cf-nq6zv 1/1 Running 0 65s 10.244.112.169 k8scloude2 <none> <none> nginx-6cf858f6cf-pl7ww 1/1 Running 0 65s 10.244.251.235 k8scloude3 <none> <none> nginx-6cf858f6cf-sf2w6 1/1 Running 0 65s 10.244.112.165 k8scloude2 <none> <none> pod1 1/1 Running 0 36m 10.244.251.216 k8scloude3 <none> <none>

drain驱逐节点:drain=cordon+evicted

drain k8scloude2节点,–delete-emptydir-data删除数据,–ignore-daemonsets忽略daemonsets

[root@k8scloude1 deploy]# kubectl drain k8scloude2 node/k8scloude2 cordoned error: unable to drain node “k8scloude2”, aborting command… There are pending nodes to be drained: k8scloude2 cannot delete Pods with local storage (use –delete-emptydir-data to override): kube-system/metrics-server-bcfb98c76-k5dmj cannot delete DaemonSet-managed Pods (use –ignore-daemonsets to ignore): kube-system/calico-node-nsbfs, kube-system/kube-proxy-lpj8z [root@k8scloude1 deploy]# kubectl get node NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 8d v1.21.0 k8scloude2 Ready,SchedulingDisabled <none> 8d v1.21.0 k8scloude3 Ready <none> 8d v1.21.0 [root@k8scloude1 deploy]# kubectl drain k8scloude2 –ignore-daemonsets node/k8scloude2 already cordoned error: unable to drain node “k8scloude2”, aborting command… There are pending nodes to be drained: k8scloude2 error: cannot delete Pods with local storage (use –delete-emptydir-data to override): kube-system/metrics-server-bcfb98c76-k5dmj [root@k8scloude1 deploy]# kubectl drain k8scloude2 –ignore-daemonsets –force –delete-emptydir-data node/k8scloude2 already cordoned WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-nsbfs, kube-system/kube-proxy-lpj8z evicting pod pod/nginx-6cf858f6cf-sf2w6 evicting pod pod/nginx-6cf858f6cf-5rrk4 evicting pod kube-system/metrics-server-bcfb98c76-k5dmj evicting pod pod/nginx-6cf858f6cf-58wnd evicting pod pod/nginx-6cf858f6cf-mb2ft evicting pod pod/nginx-6cf858f6cf-89wj9 evicting pod pod/nginx-6cf858f6cf-nq6zv pod/nginx-6cf858f6cf-5rrk4 evicted pod/nginx-6cf858f6cf-mb2ft evicted pod/nginx-6cf858f6cf-sf2w6 evicted pod/nginx-6cf858f6cf-58wnd evicted pod/nginx-6cf858f6cf-nq6zv evicted pod/nginx-6cf858f6cf-89wj9 evicted pod/metrics-server-bcfb98c76-k5dmj evicted node/k8scloude2 evicted

查看pod,k8scloude2节点被drain之后,pod都调度到了k8scloude3节点。

节点被drain驱逐的本质就是删除节点上的pod,k8scloude2节点被drain驱逐之后,k8scloude2上运行的pod会被删除。

deploy是一个控制器,会监控pod的副本数,当k8scloude2上的pod被驱逐之后,副本数少于10,于是在可调度的节点创建pod,补足副本数。

单独的pod不具备再生性,删除之后就真删除了,如果k8scloude3被驱逐,则pod pod1会被删除,其他可调度节点也不会再生一个pod1。

[root@k8scloude1 deploy]# kubectl get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-6cf858f6cf-7gh4z 1/1 Running 0 84s 10.244.251.240 k8scloude3 <none> <none> nginx-6cf858f6cf-7lmfd 1/1 Running 0 85s 10.244.251.238 k8scloude3 <none> <none> nginx-6cf858f6cf-86wxr 1/1 Running 0 6m14s 10.244.251.237 k8scloude3 <none> <none> nginx-6cf858f6cf-9bn2b 1/1 Running 0 85s 10.244.251.243 k8scloude3 <none> <none> nginx-6cf858f6cf-9njrj 1/1 Running 0 6m14s 10.244.251.236 k8scloude3 <none> <none> nginx-6cf858f6cf-bqk2w 1/1 Running 0 84s 10.244.251.241 k8scloude3 <none> <none> nginx-6cf858f6cf-hchtb 1/1 Running 0 6m14s 10.244.251.234 k8scloude3 <none> <none> nginx-6cf858f6cf-hjddp 1/1 Running 0 84s 10.244.251.244 k8scloude3 <none> <none> nginx-6cf858f6cf-pl7ww 1/1 Running 0 6m14s 10.244.251.235 k8scloude3 <none> <none> nginx-6cf858f6cf-sgxfg 1/1 Running 0 84s 10.244.251.242 k8scloude3 <none> <none> pod1 1/1 Running 0 41m 10.244.251.216 k8scloude3 <none> <none>

查看node节点状态

[root@k8scloude1 deploy]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 8d v1.21.0 k8scloude2 Ready,SchedulingDisabled <none> 8d v1.21.0 k8scloude3 Ready <none> 8d v1.21.0

4.3 uncordon节点

要取消drain某个节点,直接uncordon即可,没有undrain操作。

[root@k8scloude1 deploy]# kubectl undrain k8scloude2 Error: unknown command “undrain” for “kubectl” Did you mean this? drain Run kubectl –help for usage.

uncordon k8scloude2节点,节点恢复调度

[root@k8scloude1 deploy]# kubectl uncordon k8scloude2 node/k8scloude2 uncordoned [root@k8scloude1 deploy]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 8d v1.21.0 k8scloude2 Ready <none> 8d v1.21.0 k8scloude3 Ready <none> 8d v1.21.0

把deploy副本数变为0,再变为10,再观察pod分布

[root@k8scloude1 deploy]# kubectl scale deploy nginx –replicas=0 deployment.apps/nginx scaled [root@k8scloude1 deploy]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES pod1 1/1 Running 0 52m 10.244.251.216 k8scloude3 <none> <none> [root@k8scloude1 deploy]# kubectl scale deploy nginx –replicas=10 deployment.apps/nginx scaled

k8scloude2节点恢复可调度pod状态

[root@k8scloude1 deploy]# kubectl get pods -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES nginx-6cf858f6cf-4sqj8 1/1 Running 0 6s 10.244.112.172 k8scloude2 <none> <none> nginx-6cf858f6cf-cjqxv 1/1 Running 0 6s 10.244.112.176 k8scloude2 <none> <none> nginx-6cf858f6cf-fk69r 1/1 Running 0 6s 10.244.112.175 k8scloude2 <none> <none> nginx-6cf858f6cf-ghznd 1/1 Running 0 6s 10.244.112.173 k8scloude2 <none> <none> nginx-6cf858f6cf-hnxzs 1/1 Running 0 6s 10.244.251.246 k8scloude3 <none> <none> nginx-6cf858f6cf-hshnm 1/1 Running 0 6s 10.244.112.171 k8scloude2 <none> <none> nginx-6cf858f6cf-jb5sh 1/1 Running 0 6s 10.244.112.170 k8scloude2 <none> <none> nginx-6cf858f6cf-l9xlm 1/1 Running 0 6s 10.244.112.174 k8scloude2 <none> <none> nginx-6cf858f6cf-pgjlb 1/1 Running 0 6s 10.244.251.247 k8scloude3 <none> <none> nginx-6cf858f6cf-rlnh6 1/1 Running 0 6s 10.244.251.245 k8scloude3 <none> <none> pod1 1/1 Running 0 52m 10.244.251.216 k8scloude3 <none> <none>

删除deploy,删除pod。

[root@k8scloude1 deploy]# kubectl delete -f nginx.yaml deployment.apps “nginx” deleted [root@k8scloude1 deploy]# kubectl delete pod pod1 –force warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely. pod “pod1” force deleted [root@k8scloude1 deploy]# kubectl get pods -o wide No resources found in pod namespace.

五.delete 节点

5.1 delete节点概览

delete 删除节点就直接把一个节点就k8s集群中删除了,delete 节点之前需要先drain 节点。

关于delete节点以及重装节点的详细内容,请查看博客《模拟重装Kubernetes(k8s)集群:删除k8s集群然后重装》

5.2 delete节点

kubectl drain 安全驱逐节点上面所有的 pod,–ignore-daemonsets往往需要指定的,这是因为deamonset会忽略SchedulingDisabled标签(使用kubectl drain时会自动给节点打上不可调度SchedulingDisabled标签),因此deamonset控制器控制的pod被删除后,可能马上又在此节点上启动起来,这样就会成为死循环。因此这里忽略daemonset。

[root@k8scloude1 ~]# kubectl drain k8scloude3 –ignore-daemonsets node/k8scloude3 cordoned WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-wmz4r, kube-system/kube-proxy-84gcx evicting pod kube-system/calico-kube-controllers-6b9fbfff44-rl2mh pod/calico-kube-controllers-6b9fbfff44-rl2mh evicted node/k8scloude3 evicted

k8scloude3变为SchedulingDisabled

[root@k8scloude1 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 64m v1.21.0 k8scloude2 Ready <none> 56m v1.21.0 k8scloude3 Ready,SchedulingDisabled <none> 56m v1.21.0

删除节点k8scloude3

[root@k8scloude1 ~]# kubectl delete nodes k8scloude3 node “k8scloude3” deleted [root@k8scloude1 ~]# kubectl get nodes NAME STATUS ROLES AGE VERSION k8scloude1 Ready control-plane,master 65m v1.21.0 k8scloude2 Ready <none> 57m v1.21.0

以上就是cordon节点drain驱逐节点delete节点的详细内容,更多关于cordon drain delete节点详解的资料请关注其它相关文章!

免费资源网 – https://freexyz.cn/


© 版权声明
THE END
★喜欢这篇文章吗?喜欢的话,麻烦动动手指支持一下!★
点赞8 分享
评论 抢沙发

请登录后发表评论

    暂无评论内容