
[K8s] K8s Installation and Deployment

0 Preface

  • Once you are familiar with the basic concepts and inner workings of K8s, you can try deploying a K8s cluster yourself.
  • K8s Overview - 博客园/千千寰宇

1 K8s Installation and Deployment

Environment Configuration and Prerequisites

  • CentOS 7 servers x N (N ≥ 3)
  • 2 CPU cores and 2 GB RAM per server
  • Update the YUM mirror source
yum update
yum upgrade

Step 1: Install and Run Docker

yum -y install wget
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
# check the installed version
docker version
systemctl enable docker
systemctl start docker

Step 2: Install kubeadm

  • Add the Aliyun Kubernetes YUM repo
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
#baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
#gpgcheck=0
repo_gpgcheck=1
#repo_gpgcheck=0
#gpgkey=https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
#exclude=kubelet kubeadm kubectl
EOF
# refresh the YUM repo metadata
yum update
  • Disable SELinux

Setting SELinux to permissive mode effectively disables it. Running setenforce 0 and the sed ... command below puts SELinux into permissive mode. This is required to allow containers to access the host filesystem, which in turn is needed for pod networking to work properly.

setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
  • Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
  • Disable swap

You must disable swap for the kubelet to work properly.

For example, sudo swapoff -a disables swap temporarily. To make the change persist across reboots, disable swap in configuration such as /etc/fstab or systemd.swap, depending on how your system is configured.

swapoff -a
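swapoff -a only lasts until the next reboot. One common way to make it permanent is to comment out the swap entry in /etc/fstab. The sketch below runs against a throwaway copy under /tmp (the path and sample entries are illustrative), so it is safe to try before touching the real file:

```shell
# Sample fstab contents standing in for the real /etc/fstab:
cat > /tmp/fstab.demo <<'EOF'
/dev/mapper/centos-root /       xfs     defaults 0 0
/dev/mapper/centos-swap swap    swap    defaults 0 0
EOF

# Comment out every active line that mounts a swap area:
sed -ri 's/^([^#].*\sswap\s.*)$/#\1/' /tmp/fstab.demo

# Confirm the swap line is now commented out:
grep swap /tmp/fstab.demo
```

On a real node, back up /etc/fstab first, run the same sed against it, and verify after the next reboot with swapon --show that no swap is active.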
  • Install and enable the kubelet
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable --now kubelet

Step 3: Deploy the Control-Plane (Master) Node

Print the kubeadm default init configuration

# kubeadm config print init-defaults
...
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
...

The apiVersion and kubernetesVersion fields here must match the kubeadm.yaml written below.

Write kubeadm.yaml

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
imageRepository: registry.aliyuncs.com/google_containers
#imageRepository: k8s.gcr.io
controllerManager: {}
dns:
  type: CoreDNS
apiServer:
  extraArgs:
    runtime-config: "api/all=true"
etcd:
  local:
    dataDir: /data/k8s/etcd
scheduler: {}

Enable kubelet.service

systemctl enable kubelet.service

Start the container runtime: containerd

# 1) remove the stock config, then regenerate the default config
rm -rf /etc/containerd/config.toml
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml

# 2) edit /etc/containerd/config.toml to add mirror acceleration:
...
[plugins."io.containerd.grpc.v1.cri"]
  # change this 1 line:
  # sandbox_image = "registry.k8s.io/pause:3.6"
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
[plugins."io.containerd.grpc.v1.cri".registry]
...
  # add these 3+2 lines (not counting comment or blank lines):
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      # Aliyun mirror accelerator address, from https://cr.console.aliyun.com/cn-hangzhou/instances/mirrors
      endpoint = ["https://xxx.mirror.aliyuncs.com", "https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"]
      endpoint = ["https://registry.aliyuncs.com/google_containers"]
...

# 3) restart and verify containerd
systemctl restart containerd
systemctl status containerd

Make bridged traffic visible to iptables (IPv4 and IPv6)

cd /etc/sysctl.d/
vi k8s-sysctl.conf
# add the following lines:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
ls /etc/sysctl.d/k8s-sysctl.conf
# apply the settings
sysctl --system
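The two net.bridge.* sysctls above only exist while the br_netfilter kernel module is loaded, a step this guide does not show explicitly. A common companion configuration (a sketch; the file name k8s.conf is a convention, not a requirement):

```shell
# Load the required modules on every boot:
cat <<'EOF' | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

# And load them immediately, without rebooting:
sudo modprobe overlay
sudo modprobe br_netfilter
```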

Initialize the control-plane node: kubeadm / kubelet

  • Control-plane node initialization

This step generates important files such as /var/lib/kubelet/config.yaml.

# kubeadm init --config /root/kubeadm.yaml
[init] Using Kubernetes version: v1.28.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local vm-a] and IPs [10.96.0.1 192.168.xx.211]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost vm-a] and IPs [192.168.xx.211 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost vm-a] and IPs [192.168.xx.211 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 7.507281 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node vm-a as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node vm-a as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: qi82d7.glltv3hltpe4aq08
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.xx.211:6443 --token qi82d7.glltv3hltpe4aq08 \
    --discovery-token-ca-cert-hash sha256:6054b8402053e9eb8f6cb134c066f3e28ae80aa5fd28cec002af1f4199383890
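If the join command above is lost, both of its pieces can be regenerated later: kubeadm token create --print-join-command prints a fresh one, and the --discovery-token-ca-cert-hash can be recomputed from the cluster CA certificate. The sketch below demonstrates the hash pipeline on a throwaway self-signed certificate (/tmp/demo-ca.crt is illustrative; on the real control plane you would feed it /etc/kubernetes/pki/ca.crt):

```shell
# Generate a throwaway CA certificate just to demonstrate the pipeline:
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
  -out /tmp/demo-ca.crt -days 1 -subj "/CN=demo-ca" 2>/dev/null

# Hash the DER-encoded public key of the CA cert, as kubeadm does:
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:${hash}"
```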

Step 4: Verify the Deployment

  • Check kubelet status and logs
sudo systemctl status kubelet --no-pager
sudo journalctl -xeu kubelet -n 50 --no-pager | tail -30
  • Check that the images were pulled successfully
# sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock images | grep pause
registry.aliyuncs.com/google_containers/pause                     3.9                 e6f1816883972       322kB
  • Check the running state of the containers
# sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a
CONTAINER           IMAGE               CREATED             STATE               NAME                      ATTEMPT             POD ID              POD
1f57bb4c9ab9d       ea1030da44aa1       6 minutes ago       Running             kube-proxy                0                   4d286784c3ddb       kube-proxy-7crkv
280d71fdbe867       f6f496300a2ae       6 minutes ago       Running             kube-scheduler            1                   9c57afb350343       kube-scheduler-vm-a
0b32789371828       4be79c38a4bab       6 minutes ago       Running             kube-controller-manager   1                   14ab1f95b380a       kube-controller-manager-vm-a
2e821a289e1ca       73deb9a3f7025       6 minutes ago       Running             etcd                      1                   b06ec8f19d648       etcd-vm-a
62288683995c3       bb5e0dde9054c       6 minutes ago       Running             kube-apiserver            1                   e4b6ad50d3758       kube-apiserver-vm-a

Also check docker:

docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

Note: this confirms that the container runtime has indeed been switched to containerd.

Z FAQ for K8s Installation and Deployment

Q: How do I reset a K8s node?

# 1. reset kubeadm (removes all cluster configuration and containers)
sudo kubeadm reset -f

# 2. delete leftover config files and directories
sudo rm -rf /etc/kubernetes/
sudo rm -rf /var/lib/kubelet/
sudo rm -rf /var/lib/etcd/
sudo rm -rf ~/.kube/

# 3. clean up CNI network configuration
sudo rm -rf /etc/cni/net.d/
sudo rm -rf /var/lib/cni/

# 4. flush iptables rules (optional, but recommended)
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X

# 5. restart the kubelet (or the whole system)
sudo systemctl restart kubelet
# or reboot for a clean slate
sudo reboot

Check whether the kubelet is running or failing:

sudo systemctl status kubelet
sudo journalctl -xeu kubelet -n 100 --no-pager | tail -50

Q: kubeadm init fails with a timeout waiting for control-plane components?

Root-Cause Analysis

This class of error means kubeadm init timed out while waiting for the control-plane components to come up. The most common causes are the kubelet not running properly or a misconfigured container runtime.

1. Check the kubelet process status

sudo systemctl status kubelet

If it shows inactive (dead) or a failed state, try starting it:

sudo systemctl start kubelet
sudo systemctl enable kubelet

2. Inspect the detailed kubelet logs

sudo journalctl -xeu kubelet -n 200 --no-pager

Watch for errors such as:

  • failed to run Kubelet
  • node "xxx" not found
  • cannot find cgroup
  • container runtime is down

3. Check the container runtime (containerd)

# check whether containerd is running
sudo systemctl status containerd

# if it is not running, start it
sudo systemctl start containerd
sudo systemctl enable containerd

# verify that containerd responds
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock version

4. Check the control-plane container status

# list all Kubernetes-related containers
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause

# if any container failed, inspect its logs (replace CONTAINERID)
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID

Common Causes and Fixes

Cause A: kubelet misconfiguration

Check the kubelet configuration file:

cat /var/lib/kubelet/config.yaml

Common issues:

  • cgroupDriver does not match containerd's (it should be systemd)
  • node-name resolution problems

Fix a cgroupDriver mismatch:

# check containerd's cgroup driver
sudo cat /etc/containerd/config.toml | grep SystemdCgroup

# make sure the kubelet uses the same driver:
# edit /var/lib/kubelet/config.yaml and set:
#   cgroupDriver: systemd

# restart both services
sudo systemctl restart containerd
sudo systemctl restart kubelet
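The mismatch check above can also be scripted. A minimal sketch; the two files under /tmp are illustrative excerpts standing in for the real /etc/containerd/config.toml and /var/lib/kubelet/config.yaml:

```shell
# Illustrative excerpts of the two real config files:
cat > /tmp/containerd-excerpt.toml <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF
cat > /tmp/kubelet-excerpt.yaml <<'EOF'
cgroupDriver: systemd
EOF

# SystemdCgroup = true in containerd must pair with cgroupDriver: systemd in kubelet:
containerd_driver=$(grep -q 'SystemdCgroup = true' /tmp/containerd-excerpt.toml && echo systemd || echo cgroupfs)
kubelet_driver=$(awk '/^cgroupDriver:/ {print $2}' /tmp/kubelet-excerpt.yaml)

if [ "$containerd_driver" = "$kubelet_driver" ]; then
  echo "cgroup drivers match: $kubelet_driver"
else
  echo "MISMATCH: containerd=$containerd_driver kubelet=$kubelet_driver"
fi
```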

Cause B: image pulls fail (very common in mainland China)

The control-plane images cannot be pulled from registry.k8s.io.

  • Fix: use a domestic mirror source
  • Automatic: see the section above on configuring /etc/containerd/config.toml (verified by the author)
  • Manual (not verified by the author):
# check whether kubeadm.yaml specifies an imageRepository
grep imageRepository /root/kubeadm.yaml

# if it does not, add this line to kubeadm.yaml:
#   imageRepository: registry.aliyuncs.com/google_containers

# or pull the images manually
sudo kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers

# then re-run the initialization
sudo kubeadm init --config /root/kubeadm.yaml

Cause C: an earlier reset was incomplete

If you previously ran kubeadm reset but etcd or network configuration was left behind:

# thorough cleanup
sudo kubeadm reset -f
sudo rm -rf /etc/kubernetes/ /var/lib/kubelet/ /var/lib/etcd/ /var/lib/cni/ /etc/cni/
sudo rm -rf ~/.kube/

# remove leftover network interfaces
sudo ip link delete cni0 2>/dev/null || true
sudo ip link delete flannel.1 2>/dev/null || true

# reboot
sudo reboot

Recommended Troubleshooting Flow

# 1. look at the actual error first
sudo journalctl -xeu kubelet -n 100 --no-pager | tail -50

# 2. act on the error type; common cases:

# case 1: "connection refused" when talking to containerd
sudo systemctl restart containerd
sudo systemctl restart kubelet

# case 2: "ImagePullBackOff" or other image-related errors
sudo kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers

# case 3: cgroup-related errors
# edit /var/lib/kubelet/config.yaml and make sure cgroupDriver: systemd
sudo systemctl restart kubelet

# 3. re-run the init
sudo kubeadm init --config /root/kubeadm.yaml

This way you see the concrete error message rather than just a timeout.

