2019-07-25 Kubernetes Setup
1. Kubernetes
1.1 Kubernetes overview:
Kubernetes is an open-source, Docker-based container cluster management system initiated and maintained by a team at Google. Its goal is to manage containers across multiple hosts, and it supports both common cloud platforms and on-premises data centers.
The core concept is the Pod: a Pod (a collection of containers) consists of a group of containers running on the same worker node. These containers share the same network namespace, IP address, and storage quota, and each Pod can have its ports mapped as needed.
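As a minimal sketch of that shared-namespace idea (the Pod and container names here are hypothetical, not from the original), the two containers below share one network namespace and IP, so the sidecar reaches nginx over localhost:

```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: shared-net-demo      # hypothetical name
spec:
  containers:
  - name: web
    image: nginx:alpine
    ports:
    - containerPort: 80
  - name: sidecar
    image: busybox:1.28
    # same network namespace: nginx is reachable on 127.0.0.1
    command: ["sh", "-c", "while true; do wget -qO- http://127.0.0.1 >/dev/null && echo ok; sleep 10; done"]
EOF
```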
1.2 Kubernetes components:
1.2.1 Master components:
kube-apiserver: the Kubernetes API server, the unified entry point to the cluster and the coordinator between components. It exposes its interface as an HTTP API; all create/read/update/delete and watch operations on resource objects are handled by the API server and then persisted to etcd.
kube-scheduler: the scheduler. It handles resource scheduling, binding not-yet-scheduled Pods to available Nodes and recording the result in etcd.
kube-controller-manager: manages the controllers, one controller per resource type. The controller manager monitors Pod state and, based on the information in etcd, calls the kubelet on a Node to create Pods.
1.2.2 Node components:
kubelet: the Master's agent on each Node. It handles the concrete container lifecycle: it manages containers according to the information fetched from the API server, reports Pod status, downloads Secrets, and reports container and node state. The kubelet turns each Pod into a set of containers.
kube-proxy: implements the Pod network proxy on each Node.
docker / rkt (Rocket): runs the containers.
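On a kubeadm-built cluster you can observe most of these components directly; the master components run as static Pods in kube-system, while the kubelet itself is a systemd service (a quick look, assuming a kubeadm layout):

```bash
kubectl get pods -n kube-system -o wide   # kube-apiserver, kube-scheduler, kube-controller-manager, etcd, kube-proxy
systemctl status kubelet                  # the node agent runs outside the cluster as a service
```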
1.2.3 Container runtimes:

```text
┌──────────────────────────────────────┐
│ kubelet (the k8s node agent)         │
└───────────────┬──────────────────────┘
                │ CRI interface (gRPC)
                ▼
┌──────────────────────────────────────┐
│ CRI implementation (high-level       │ ← containerd / CRI-O / Docker (deprecated)
│ runtime): pulls images, manages      │
│ Pods and container lifecycles        │
└───────────────┬──────────────────────┘
                │ OCI interface
                ▼
┌──────────────────────────────────────┐
│ Low-level runtime: actually creates  │ ← runc / crun / kata / gVisor
│ and runs containers                  │
└──────────────────────────────────────┘
```
Two key standards:
CRI (Container Runtime Interface): the interface defined by Kubernetes; the kubelet uses it to call the runtime.
OCI (Open Container Initiative): the container industry standard, defining the image format and runtime specification.
kubelet → containerd (CRI plugin) → runc → container
Two runtime layers:
High-level runtimes (CRI implementations): talk to the kubelet and manage Pods, images, networking, and so on. Examples: containerd, CRI-O.
| Project | Developer | Status | Performance | Ecosystem | k8s support | Recommendation |
| --- | --- | --- | --- | --- | --- | --- |
| containerd | CNCF | Graduated | High | Mature | Native | ⭐⭐⭐⭐⭐ |
| CRI-O | Red Hat/CNCF | Incubating | High | Medium | Native | ⭐⭐⭐⭐ |
| Docker + cri-dockerd | Docker/Mirantis | Transitional | Medium (extra layer) | Mature | Extra install needed | ⭐⭐ (not recommended for new clusters) |
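To see which high-level runtime each of your nodes actually uses, the node status reports it; a quick sketch:

```bash
kubectl get nodes -o wide   # the CONTAINER-RUNTIME column, e.g. containerd://1.6.x
# Or pull just that field:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}'
```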
Low-level runtimes (OCI implementations): actually start the processes and set up namespaces/cgroups. Examples: runc, crun, kata.
| Project | Isolation | Startup speed | Performance | Resource overhead | Compatibility | Use case |
| --- | --- | --- | --- | --- | --- | --- |
| runc | Kernel-level | Very fast | High | Low | Best | General purpose |
| crun | Kernel-level | Even faster | Slightly above runc | Lowest | Good | Performance-focused |
| Kata | VM-level | Slow (seconds) | Medium | High | OCI-compatible | Strong isolation |
| gVisor | User-space kernel | Fast | Medium-low | Medium | Partially compatible | Sandboxing |
| Firecracker | VM-level | Fast (~125 ms) | High | Medium | Linux only | Serverless |
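Since the kubelet talks to the high-level runtime over CRI, crictl can query the same endpoint directly; a small check, assuming containerd's default socket path:

```bash
crictl --runtime-endpoint unix:///run/containerd/containerd.sock version
crictl info | head -n 20   # runtime name, version, and status as reported over CRI
```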
1.2.4 Third-party services (master && node):
etcd: the key/value store that holds all cluster state. Every Kubernetes component goes through the API server to read or persist resource state in etcd, such as Pod and Service object information.
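To see that cluster state really lives in etcd, you can list the registry keys directly; a sketch assuming etcdctl is installed on the master and uses the kubeadm-generated etcd certificates:

```bash
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry --prefix --keys-only | head   # /registry/pods/..., /registry/services/...
```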
1.2.5 kubectl:
kubectl is the client that ships with Kubernetes; you can use it to operate the cluster directly.

| Command | Meaning |
| --- | --- |
| get | Display one or more resources |
| describe | Show detailed information about a specific resource |
| create | Create a resource from a filename or stdin |
| update | Update a resource from a filename or stdin |
| delete | Delete resources |
| namespace | Set and view the current Kubernetes namespace |
| log | Print the logs of a container |
| rolling-update | Perform a rolling update of a given ReplicationController |
| resize | Set a new size for a ReplicationController |
| exec | Execute a command in a container |
| port-forward | Forward one or more local ports to a container |
| proxy | Run a proxy to the Kubernetes API server |
| run-container | Run a particular image on the cluster |
| stop | Gracefully shut down a resource by id or filename |
| expose | Take a replicated application and expose it as a Kubernetes Service |
| label | Update the labels on a resource |
| config | Modify kubeconfig-related configuration files |
| api-versions | Print the available API versions |
| cluster-info | Display cluster information |
| version | Print the client and server version information |
| help | Get help about a command |
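A few everyday combinations of the commands above (placeholder names such as <pod-name> are yours to substitute):

```bash
kubectl get pods -o wide                  # list Pods with node and IP info
kubectl describe pod <pod-name>           # status details and recent events
kubectl logs <pod-name>                   # container logs
kubectl exec -it <pod-name> -- sh         # open a shell inside the container
kubectl port-forward <pod-name> 8080:80   # local :8080 -> Pod :80
```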
1.2.6 Kubernetes CNI network comparison: Flannel, Calico, Canal, and Weave
1.3 Setup
1.3.1 Official option 1: kubeadm
Pros: simple, officially recommended, easy to upgrade, supports high availability.
Cons: harder to maintain.
Reference article 2
1.3.2 Official option 2: binaries
Pros: easy to maintain, flexible, easy to upgrade.
Cons: the setup is quite complex and documentation is scarce.
1.4 Kubeadm:
1.4.1 Environment preparation:
1.4.1.1 Hardware, minimum spec 2 CPUs / 4 GB RAM per node:
HA layout (3 masters + 2 workers):

| OS | IP | Role | Hostname |
| --- | --- | --- | --- |
| CentOS7.2 | 192.168.56.129 | master | master01 |
| CentOS7.2 | 192.168.56.130 | master | master02 |
| CentOS7.2 | 192.168.56.131 | master | master03 |
| CentOS7.2 | 192.168.56.132 | worker | node01 |
| CentOS7.2 | 192.168.56.133 | worker | node02 |

Single-master layout (the one used in the rest of this guide):

| OS | IP | Role | Hostname |
| --- | --- | --- | --- |
| CentOS7.2 | 192.168.56.129 | master | master01 |
| CentOS7.2 | 192.168.56.130 | worker | node01 |
| CentOS7.2 | 192.168.56.131 | worker | node02 |
1.4.1.2 Set the hostname and configure /etc/hosts name resolution on every node:
```bash
# Set each node's hostname (run with the matching name on each machine)
hostnamectl set-hostname master01   # node01 / node02 on the workers

cat <<EOF >>/etc/hosts
192.168.56.129 master01
192.168.56.130 node01
192.168.56.131 node02
EOF

# Time synchronization
yum install ntpdate -y
ntpdate -s ntp.aliyun.com
crontab -e
# add this line to the crontab:
* * * * * /usr/sbin/ntpdate ntp.aliyun.com
```
1.4.1.3 Disable the firewall, SELinux, and swap:
```bash
systemctl stop firewalld
systemctl disable firewalld
iptables -F
iptables -X && iptables -F -t nat
iptables -X -t nat && iptables -P FORWARD ACCEPT

# SELinux off now and after reboot
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

# Swap off now and after reboot (comment out swap entries in fstab)
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
```
1.4.1.4 Load kernel modules and set network parameters
```bash
cat > /etc/modules-load.d/k8s.conf <<EOF
overlay
br_netfilter
EOF

modprobe overlay
modprobe br_netfilter

cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF

sysctl --system
sysctl -p /etc/sysctl.d/k8s.conf
```
```bash
sudo tee /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF

sudo modprobe ip_vs
sudo modprobe ip_vs_rr
sudo modprobe ip_vs_wrr
sudo modprobe ip_vs_sh
sudo modprobe nf_conntrack

lsmod | grep ip_vs
sudo yum install -y ipvsadm ipset
```
1.4.1.5 Install and configure containerd:
```bash
yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y containerd.io
```
```bash
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sed -i 's#registry.k8s.io/pause:3.8#registry.aliyuncs.com/google_containers/pause:3.9#' /etc/containerd/config.toml

cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
EOF

[root@master01 ~]# tree /etc/containerd/
/etc/containerd/
├── certs.d
│   ├── docker.io
│   │   └── hosts.toml
│   └── registry.k8s.io
│       └── hosts.toml
└── config.toml

systemctl daemon-reload
systemctl enable --now containerd
systemctl status containerd
```
```toml
# /etc/containerd/config.toml: point the CRI registry config at certs.d
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"

# /etc/containerd/certs.d/docker.io/hosts.toml
server = "https://docker.io"

[host."https://docker.m.daocloud.io"]
  capabilities = ["pull", "resolve"]

[host."https://*.mirror.aliyuncs.com"]
  capabilities = ["pull", "resolve"]

# /etc/containerd/certs.d/registry.k8s.io/hosts.toml
server = "https://registry.k8s.io"

[host."https://registry.aliyuncs.com/google_containers"]
  capabilities = ["pull", "resolve"]
  override_path = true
```
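To confirm the mirror configuration is picked up, a quick pull through the configured hosts should succeed (busybox here is just an example image):

```bash
crictl pull docker.io/library/busybox:latest
crictl images | grep busybox
```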
1.4.1.6 Configure China-mirror yum, Docker, and Kubernetes repositories:

```bash
yum install -y wget
mkdir /etc/yum.repos.d/bak && mv /etc/yum.repos.d/*.repo /etc/yum.repos.d/bak
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.cloud.tencent.com/repo/centos7_base.repo
wget http://mirrors.cloud.tencent.com/repo/epel-7.repo -O /etc/yum.repos.d/epel.repo
yum clean all && yum makecache

wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```
1.4.2 Software installation:
1.4.2.1 Install docker, kubeadm, kubelet, and kubectl on all nodes
kubeadm: the command used to bootstrap the cluster
kubelet: the component that runs on every machine in the cluster and manages the lifecycle of Pods and containers
kubectl: the cluster management tool (optional; only needed on the nodes from which you control the cluster)
```bash
yum install -y device-mapper-persistent-data lvm2 wget net-tools nfs-utils lrzsz gcc gcc-c++ make cmake libxml2-devel openssl-devel curl curl-devel unzip sudo ntp libaio-devel vim ncurses-devel autoconf automake zlib-devel python-devel epel-release openssh-server socat ipvsadm conntrack telnet

yum remove -y docker* container-selinux
yum install -y docker-ce
systemctl enable docker && systemctl start docker
docker --version

# "graph": put Docker's data on a filesystem with enough space (check df -h);
#          the default is /var/lib/docker.
# "exec-opts": switch the cgroup driver from the default cgroupfs to systemd.
# JSON does not allow inline comments, so daemon.json itself must stay clean:
cat <<EOF > /etc/docker/daemon.json
{
  "graph": "/docker/data/path",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://docker.m.daocloud.io","https://*.mirror.aliyuncs.com"]
}
EOF

yum list kubeadm --showduplicates | sort -r
yum install -y kubelet-1.28.* kubeadm-1.28.* kubectl-1.28.* --disableexcludes=kubernetes
systemctl enable kubelet
```
1.4.3 Deployment configuration:
1.4.3.1 Pull the component images
```bash
#!/bin/bash
# Pull the k8s images from a mirror, retag them as k8s.gcr.io, then drop the mirror tag
images=(kube-proxy-amd64:v1.13.3
        kube-apiserver-amd64:v1.13.3
        kube-controller-manager-amd64:v1.13.3
        kube-scheduler-amd64:v1.13.3
        pause:3.1
        etcd-amd64:3.2.18
        coredns:1.1.3)

for imageName in "${images[@]}"; do
    docker pull anjia0532/google-containers.$imageName
    docker tag anjia0532/google-containers.$imageName k8s.gcr.io/$imageName
    docker rmi anjia0532/google-containers.$imageName
done
```

```bash
./k8s-docker-images.sh
```
```bash
[root@master01 Package]# kubeadm config images list --kubernetes-version v1.28.15
I0421 15:21:04.144804    8405 version.go:256] remote version is much newer: v1.35.4; falling back to: stable-1.28
registry.k8s.io/kube-apiserver:v1.28.15
registry.k8s.io/kube-controller-manager:v1.28.15
registry.k8s.io/kube-scheduler:v1.28.15
registry.k8s.io/kube-proxy:v1.28.15
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.15-0
registry.k8s.io/coredns/coredns:v1.10.1

kubeadm config images pull --config kubeadm-config.yaml

kubeadm config images pull \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.28.15
```
1.4.3.2 Initialize the master node

```bash
kubeadm config print init-defaults --component-configs KubeletConfiguration,KubeProxyConfiguration

kubeadm config print init-defaults > kubeadm-config.yaml

kubeadm init --config kubeadm-config.yaml --dry-run
```
kubeadm-config.yaml:
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.56.129
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  name: master01
  taints:
    - effect: NoSchedule
      key: node-role.kubernetes.io/control-plane
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.28.15
imageRepository: registry.aliyuncs.com/google_containers
etcd:
  local:
    dataDir: /var/lib/etcd
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
apiServer:
  certSANs:
    - "master01"
    - "192.168.56.129"
    - "127.0.0.1"
controllerManager: {}
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
  scheduler: "rr"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
```
```bash
kubeadm init --config kubeadm-config.yaml --upload-certs

kubeadm init \
  --apiserver-advertise-address=192.168.56.129 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.28.15 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16 \
  --cri-socket=unix:///run/containerd/containerd.sock

kubeadm join 192.168.56.129:6443 --token kekvgu.nw1n76h84f4camj6 --discovery-token-ca-cert-hash sha256:4ee74205227c78ca62f2d641635afa4d50e6634acfaa8291f28582c7e3b0e30e

ls /etc/kubernetes/pki

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

kubectl label node node01 node-role.kubernetes.io/worker=

[root@master01 yum.repos.d]# kubectl get node
NAME       STATUS     ROLES           AGE   VERSION
master01   NotReady   control-plane   20h   v1.28.15
node01     NotReady   worker          20h   v1.28.15
node02     NotReady   worker          20h   v1.28.15
```
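Because the KubeProxyConfiguration above sets mode: ipvs with the rr scheduler, Service rules should appear as IPVS virtual servers once the cluster is up; a quick check:

```bash
ipvsadm -Ln | head   # ClusterIPs as rr-scheduled virtual servers
kubectl logs -n kube-system -l k8s-app=kube-proxy | grep -i ipvs
```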
1.4.3.3 Join the worker nodes
```bash
kubeadm token create --print-join-command

kubeadm join 192.168.56.129:6443 --token kekvgu.nw1n76h84f4camj6 --discovery-token-ca-cert-hash sha256:4ee74205227c78ca62f2d641635afa4d50e6634acfaa8291f28582c7e3b0e30e
```
1.4.3.4 Deploy a network plugin (Flannel or Calico)
```bash
# Option A: Flannel
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
kubectl get pods -n kube-system
kubectl delete pods kube-flannel-ds-amd64-6jf7t -n kube-system   # restart a stuck Pod if needed

# Option B: Calico via the Tigera operator
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/tigera-operator.yaml
kubectl get pods -n tigera-operator

wget https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/custom-resources.yaml
# Match the Pod CIDR chosen at kubeadm init (10.244.0.0/16)
sed -i 's#192.168.0.0/16#10.244.0.0/16#' custom-resources.yaml
kubectl apply -f custom-resources.yaml
kubectl get pods -n calico-system

# Option C: Calico via the plain manifest
wget https://raw.githubusercontent.com/projectcalico/calico/v3.26.4/manifests/calico.yaml
sed -i 's@# - name: CALICO_IPV4POOL_CIDR@- name: CALICO_IPV4POOL_CIDR@; s@#   value: "192.168.0.0/16"@  value: "10.244.0.0/16"@' calico.yaml
kubectl apply -f calico.yaml
[root@master01 ~]# kubectl get pods -n kube-system | grep calico
```
1.4.3.5 Create a Pod to verify the cluster works
```bash
cat > cluster-test.yaml <<EOF
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test
  labels:
    app: nginx-test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-test
  template:
    metadata:
      labels:
        app: nginx-test
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "32Mi"
              cpu: "50m"
            limits:
              memory: "128Mi"
              cpu: "200m"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-test-svc
spec:
  type: NodePort
  selector:
    app: nginx-test
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080
EOF
```
```bash
kubectl apply -f cluster-test.yaml

[root@master01 yaml]# kubectl get pods -o wide -l app=nginx-test
NAME                          READY   STATUS              RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
nginx-test-6595dd4466-rwp5b   0/1     ContainerCreating   0          9s    <none>   node01   <none>           <none>
nginx-test-6595dd4466-t9stt   0/1     ContainerCreating   0          9s    <none>   node01   <none>           <none>
nginx-test-6595dd4466-wc45z   0/1     ContainerCreating   0          9s    <none>   node02   <none>           <none>

[root@master01 yaml]# kubectl get svc nginx-test-svc
NAME             TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
nginx-test-svc   NodePort   10.110.210.87   <none>        80:30080/TCP   21s

kubectl run test-client --rm -it --image=busybox:1.28 --restart=Never -- sh
wget -qO- http://nginx-test-svc
nslookup nginx-test-svc
exit

curl http://192.168.56.129:30080
curl http://192.168.56.130:30080
curl http://192.168.56.131:30080

alias k=kubectl
alias kgp='kubectl get pods'
alias kgpa='kubectl get pods -A'
alias kgn='kubectl get nodes'
alias kgs='kubectl get svc'
alias kd='kubectl describe'
alias kl='kubectl logs'
alias ke='kubectl exec -it'
source <(kubectl completion bash)
complete -F __start_kubectl k
source ~/.bashrc
```
1.5 Install the Dashboard (master node)
1.5.1 Dashboard v2 vs v3 comparison
| Dimension | Dashboard v2.x | Dashboard v3.x |
| --- | --- | --- |
| Deployment | Single recommended.yaml | Helm chart |
| Architecture | Monolith | Split into microservices |
| Dependencies | None extra | cert-manager, nginx, etc. |
| Complexity | Low | Medium |
| Maintenance status | Bug fixes only | Active development |
| k8s 1.28 compatibility | ✅ | ✅ |
| Item | v2.7.0 | v7.x (v3 architecture) |
| --- | --- | --- |
| Deployment | Single YAML | Mainly Helm |
| Component count | 2 (dashboard + metrics-scraper) | 5 + the Kong gateway |
| Resource usage | Small | Large (with 4 GB VMs, consider giving the master 6 GB) |
| Install complexity | One command | Moderate |
| Access path | Direct to the Dashboard Pod | Routed through the Kong gateway |
| Architecture | Monolith | Microservices |
| Maintenance status | Bug fixes only | Active development |
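For reference, the v7.x (v3 architecture) line is installed via Helm; a sketch based on the project's published chart repository, so check the chart values for your exact version before relying on these names:

```bash
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
  --create-namespace --namespace kubernetes-dashboard
```

This guide sticks with v2.7.0 below, which needs none of those extra components.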
1.5.2 Install Dashboard v2.x
```bash
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml -O dashboard.yaml

# Turn the Service into a NodePort on 30443
sed -i '/name: kubernetes-dashboard$/,/selector:$/ {
  /ports:/a\  type: NodePort
  /targetPort: 8443/a\      nodePort: 30443
}' dashboard.yaml

kubectl apply -f dashboard.yaml
[root@master01 dashboard]# kubectl get pods -n kubernetes-dashboard

# If the images fail to pull, switch to a mirror:
kubectl set image -n kubernetes-dashboard deployment/kubernetes-dashboard kubernetes-dashboard=docker.m.daocloud.io/kubernetesui/dashboard:v2.7.0
kubectl set image -n kubernetes-dashboard deployment/dashboard-metrics-scraper dashboard-metrics-scraper=docker.m.daocloud.io/kubernetesui/metrics-scraper:v1.0.8
```
```yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30443
  selector:
    k8s-app: kubernetes-dashboard
```
```bash
cat > dashboard-admin.yaml <<EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
EOF

kubectl apply -f dashboard-admin.yaml
kubectl create token -n kubernetes-dashboard admin-user --duration=8760h
```
```bash
TOKEN=$(kubectl -n kubernetes-dashboard create token admin-user --duration=8760h)
CA_DATA=$(kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')
SERVER=$(kubectl config view --raw -o jsonpath='{.clusters[0].cluster.server}')

cat > dashboard-kubeconfig.yaml <<EOF
apiVersion: v1
kind: Config
clusters:
- name: kubernetes
  cluster:
    certificate-authority-data: ${CA_DATA}
    server: ${SERVER}
users:
- name: admin-user
  user:
    token: ${TOKEN}
contexts:
- name: admin-user@kubernetes
  context:
    cluster: kubernetes
    user: admin-user
current-context: admin-user@kubernetes
EOF
```
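Before handing the file to the Dashboard login page, it is worth confirming the generated kubeconfig actually authenticates:

```bash
kubectl --kubeconfig dashboard-kubeconfig.yaml get pods -n kubernetes-dashboard
```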
1.5.3 One-step Dashboard deployment script
```bash
#!/bin/bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml

kubectl patch svc kubernetes-dashboard -n kubernetes-dashboard \
  -p '{"spec":{"type":"NodePort","ports":[{"port":443,"targetPort":8443,"nodePort":30443}]}}'

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

echo "Waiting for the Dashboard to start..."
kubectl wait --for=condition=ready pod -l k8s-app=kubernetes-dashboard -n kubernetes-dashboard --timeout=300s

echo "===== Access URL ====="
echo "https://$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'):30443"
echo ""
echo "===== Login Token ====="
kubectl -n kubernetes-dashboard create token admin-user --duration=8760h
```
1.5.4 One-step Dashboard kubeconfig generation script
```bash
#!/bin/bash
set -e

NAMESPACE="kubernetes-dashboard"
SA_NAME="admin-user"
OUTPUT_FILE="dashboard-kubeconfig.yaml"
TOKEN_DURATION="8760h"

echo "==> 1. Ensure the ServiceAccount exists"
kubectl get sa -n ${NAMESPACE} ${SA_NAME} >/dev/null 2>&1 || {
  echo "Creating ServiceAccount..."
  cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${SA_NAME}
  namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ${SA_NAME}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: ${SA_NAME}
  namespace: ${NAMESPACE}
EOF
}

echo "==> 2. Generate a token (${TOKEN_DURATION})"
TOKEN=$(kubectl -n ${NAMESPACE} create token ${SA_NAME} --duration=${TOKEN_DURATION})

echo "==> 3. Read the cluster info"
CA_DATA=$(kubectl config view --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}')
SERVER=$(kubectl config view --raw -o jsonpath='{.clusters[0].cluster.server}')

echo "==> 4. Write the kubeconfig"
cat > ${OUTPUT_FILE} <<EOF
apiVersion: v1
kind: Config
clusters:
- name: kubernetes
  cluster:
    certificate-authority-data: ${CA_DATA}
    server: ${SERVER}
users:
- name: ${SA_NAME}
  user:
    token: ${TOKEN}
contexts:
- name: ${SA_NAME}@kubernetes
  context:
    cluster: kubernetes
    user: ${SA_NAME}
    namespace: ${NAMESPACE}
current-context: ${SA_NAME}@kubernetes
EOF

echo ""
echo "✅ Done!"
echo "File: $(pwd)/${OUTPUT_FILE}"
echo ""
echo "Download it to your local machine:"
echo "  scp $(whoami)@$(hostname -I | awk '{print $1}'):$(pwd)/${OUTPUT_FILE} ~/"
echo ""
echo "Then choose the Kubeconfig method on the Dashboard login page and upload this file."
```
1.5.5 Clean up dangling images
Differences between nerdctl -n k8s.io images and docker images
| Dimension | docker images | nerdctl -n k8s.io images |
| --- | --- | --- |
| Operates on | The Docker daemon's image store | containerd's k8s.io namespace |
| Image source | Pulled via docker pull | Pulled by k8s plus nerdctl pull |
| Shows intermediate layers | No (hidden by default) | Yes, hence the many <none> entries |
| Data location | /var/lib/docker/ | /var/lib/containerd/ |
| Used by which containers | Docker containers | k8s Pods |
| Namespace isolation | None | Yes; one containerd can host multiple namespaces |
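You can put the two views side by side on a node (the docker command only applies if Docker is installed alongside containerd):

```bash
docker images | head                   # the Docker daemon's own store, if present
sudo nerdctl -n k8s.io images | head   # containerd's k8s.io namespace, <none> layers included
```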
Positioning of the four tools
| Tool | Layer | Role | User-friendliness | Maintained by |
| --- | --- | --- | --- | --- |
| ctr | Lowest | containerd's native debugging tool | ⭐⭐ (unfriendly) | containerd project |
| nerdctl | Middle | Docker-style CLI | ⭐⭐⭐⭐⭐ | containerd community |
| crictl | CRI layer | k8s-specific debugging tool | ⭐⭐⭐ | Kubernetes project |
| kubectl | Cluster layer | Manages the whole k8s cluster | ⭐⭐⭐⭐ | Kubernetes project |
Call-chain comparison

```text
kubectl ─┐
         │ (HTTP API)
         ▼
     API Server ──┐
                  │ (k8s scheduling → kubelet)
                  ▼
              kubelet ──┐
                        │ (CRI gRPC)
                        ▼  ←──────── crictl (operates at this layer)
                  containerd ──┐
                               │ (containerd gRPC)
                               ▼  ←── ctr / nerdctl (operate at this layer)
                             runc ──┐
                                    ▼
                                 container
```
Feature coverage of the three tools
| Feature | ctr | nerdctl | crictl |
| --- | --- | --- | --- |
| List images | ✅ | ✅ | ✅ |
| Pull/push images | ✅ | ✅ | ✅ (pull only) |
| Remove images | ✅ (risky) | ✅ (safe) | ✅ (safe) |
| Export/import images | ✅ | ✅ | ❌ |
| Run containers | ✅ (cumbersome) | ✅ (easy) | ❌ |
| List containers | ✅ | ✅ | ✅ (Pod view) |
| Exec into containers | ✅ (cumbersome) | ✅ | ✅ |
| View logs | ❌ | ✅ | ✅ |
| Build images | ❌ | ✅ (needs buildkit) | ❌ |
| Compose orchestration | ❌ | ✅ | ❌ |
| List Pods (a k8s concept) | ❌ | ❌ | ✅ |
| Manage CRI resources | ❌ | ❌ | ✅ |
| Read mirror config | ❌ | ✅ | ✅ (via containerd) |
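The same task looks quite different in each tool; listing images against the same containerd instance, for example:

```bash
sudo ctr -n k8s.io images ls | head   # raw containerd view, most verbose
sudo nerdctl -n k8s.io images         # Docker-style view
sudo crictl images                    # CRI / k8s view
```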
```bash
# Remove dangling images in containerd's k8s.io namespace
sudo nerdctl -n k8s.io image prune

# Remove all unused images, not just dangling ones
sudo nerdctl -n k8s.io image prune -a

# Or with crictl: remove every image not referenced by a running Pod
sudo crictl rmi --prune
```
1.6 K8s certificate validity and renewal guide
1.6.1 Default kubeadm certificate lifetimes
1.6.1.1 Two classes of certificates
The certificates kubeadm issues fall into two classes with very different lifetimes:
| Certificate type | Validity | Purpose |
| --- | --- | --- |
| CA root certificates | 10 years | The "root" that signs all other certificates; rarely needs renewal |
| Component certificates | 1 year | apiserver, etcd, scheduler, etc.; need periodic renewal |
1.6.1.2 The full certificate list
kubeadm generates these one-year certificates:
```text
/etc/kubernetes/pki/
├── apiserver.crt                 ← API Server serving certificate
├── apiserver-etcd-client.crt     ← used by the API Server to reach etcd
├── apiserver-kubelet-client.crt  ← used by the API Server to reach kubelets
├── front-proxy-client.crt        ← front proxy
├── etcd/
│   ├── server.crt                ← etcd server
│   ├── peer.crt                  ← etcd peer-to-peer traffic
│   ├── healthcheck-client.crt    ← etcd health checks
│   └── ca.crt                    ← etcd CA (10 years)
└── ca.crt                        ← cluster root CA (10 years)
```
There are also certificates embedded in the kubeconfig files (also one year):

```text
/etc/kubernetes/
├── admin.conf              ← used by the admin
├── controller-manager.conf
├── scheduler.conf
└── kubelet.conf            ← used only at first start; rotated automatically afterwards
```
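The expiry of a certificate embedded in one of these kubeconfig files can be checked by decoding it and handing it to openssl:

```bash
grep 'client-certificate-data' /etc/kubernetes/admin.conf \
  | awk '{print $2}' | base64 -d | openssl x509 -noout -enddate
```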
1.6.2 Check certificate expiry
1.6.2.1 Method 1: the kubeadm command (most convenient)

```bash
sudo kubeadm certs check-expiration
```
1.6.2.2 Method 2: inspect directly with openssl

```bash
sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -dates

notBefore=Apr 23 10:23:00 2026 GMT
notAfter=Apr 23 10:23:00 2027 GMT
```
1.6.2.3 Method 3: check in bulk

```bash
for cert in /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt; do
  echo "=== $cert ==="
  sudo openssl x509 -in "$cert" -noout -enddate 2>/dev/null
done
```
1.6.3 How to renew before expiry (normal procedure)
1.6.3.1 Renew all certificates

```bash
# 1. Back up first
sudo cp -r /etc/kubernetes /etc/kubernetes.bak.$(date +%Y%m%d)

# 2. Renew all certificates
sudo kubeadm certs renew all

# 3. Restart the kubelet
sudo systemctl restart kubelet

# 4. Restart the static kube-apiserver Pod by moving its manifest away and back
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 20
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/

# 5. Refresh the admin kubeconfig
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# 6. Verify
kubeadm certs check-expiration
```