From Scratch: Manually Deploying a Highly Available Kubernetes (k8s) v1.34.0 Cluster

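1. Environment Preparation and System Configuration

1.1 Host Planning and Network Topology

Before deploying a highly available Kubernetes cluster, plan the host roles and the network layout. A typical highly available cluster contains the following node types:

  • Master nodes: run the control plane components (API Server, Controller Manager, Scheduler, and so on); at least 3 are recommended for high availability
  • Worker nodes: run the actual workloads
  • Load balancer nodes: optional, used to expose the API Server

This deployment uses 5 hosts:

  • 3 master nodes (k8s-master01/02/03)
  • 2 worker nodes (k8s-node01/02)
  • 1 virtual IP (172.16.1.36), kept highly available by keepalived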

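The network plan is dual-stack (IPv4 + IPv6):

  • Physical network: 172.16.1.0/24
  • Service CIDR: 10.96.0.0/12
  • Pod CIDR: 172.16.0.0/12
  • IPv6 physical network: fc00::/8
  • IPv6 Service CIDR: fd00:1111::/112
  • IPv6 Pod CIDR: fc00:2222::/112

Note that, as listed, the Pod CIDR 172.16.0.0/12 contains the physical network 172.16.1.0/24, and the IPv6 Pod CIDR fc00:2222::/112 lies inside fc00::/8. Overlapping Pod and node ranges can cause routing conflicts, so choose non-overlapping Pod ranges for a real deployment.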
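
An address plan like this can be sanity-checked for overlapping ranges before anything is installed. The following is a minimal sketch in pure bash, IPv4 only; the function names (ip_to_int, cidr_range, cidrs_overlap) are made up for this example and are not part of any tool used in this guide.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  local -a o
  read -r -a o <<< "$1"
  # 10# forces base 10 so an octet such as "08" is not read as octal.
  echo $(( (10#${o[0]} << 24) | (10#${o[1]} << 16) | (10#${o[2]} << 8) | 10#${o[3]} ))
}

# Print the first and last address (as integers) covered by an IPv4 CIDR.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  local start=$(( $(ip_to_int "$ip") & mask ))
  echo "$start $(( start | (~mask & 0xFFFFFFFF) ))"
}

# Return 0 (true) when the two IPv4 CIDRs share at least one address.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}
```

For example, `cidrs_overlap 172.16.1.0/24 172.16.0.0/12 && echo overlap` reports an overlap between the physical network and the Pod range, while the same check against the Service range 10.96.0.0/12 does not.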

1.2 Base System Configuration

Run the following base configuration on every node:

# Disable the firewall
systemctl disable --now firewalld

# Disable SELinux
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config

# Disable swap
sed -ri 's/.*swap.*/#&/' /etc/fstab
swapoff -a && sysctl -w vm.swappiness=0

# Configure time synchronization
yum install -y chrony
cat > /etc/chrony.conf << EOF
pool ntp.aliyun.com iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
allow 172.16.1.0/24
local stratum 10
keyfile /etc/chrony.keys
leapsectz right/UTC
logdir /var/log/chrony
EOF
systemctl restart chronyd && systemctl enable chronyd
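
The swap sed above prefixes every line that contains "swap", including lines that are already commented out, so repeated runs stack extra `#` characters. A sketch of an idempotent variant follows, assuming GNU sed; the function name and the optional path argument are illustrative only.

```shell
#!/usr/bin/env bash
# Comment out active swap entries in an fstab-style file.
# $1: path to the file (defaults to /etc/fstab; the parameter exists so the
#     function can be tried on a copy first).
disable_swap_entries() {
  local fstab=${1:-/etc/fstab}
  # Skip lines that are already comments; prefix with "#" only those lines
  # that carry "swap" as a whitespace-delimited field.
  sed -E -i '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab"
}
```

Running `disable_swap_entries /etc/fstab` twice leaves the file in the same state as running it once.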

1.3 Kernel Parameter Tuning

Kubernetes has specific requirements for Linux kernel parameters. Adjust the following:

# Load the modules first so that the net.bridge.* and net.netfilter.* keys exist
modprobe br_netfilter
modprobe nf_conntrack

cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory = 1
vm.panic_on_oom = 0
fs.inotify.max_user_watches = 89100
fs.file-max = 52706963
fs.nr_open = 52706963
net.netfilter.nf_conntrack_max = 2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
EOF
sysctl --system
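
Some of these keys exist only when the matching kernel module is loaded (the net.bridge.* keys need br_netfilter, for instance), and `sysctl --system` only reports keys it cannot set rather than stopping. The sketch below compares a sysctl file with the running kernel so that skipped keys stand out; parse_sysctl_conf and check_sysctl_conf are hypothetical helper names, and the comparison assumes single-valued keys like the ones in this file.

```shell
#!/usr/bin/env bash
# Print "key value" pairs from a sysctl-style file, dropping comments and
# blank lines and normalising the spacing around "=".
parse_sysctl_conf() {
  sed -E \
    -e '/^[[:space:]]*(#|;|$)/d' \
    -e 's/^[[:space:]]*([^=[:space:]]+)[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\1 \2/' \
    "$1"
}

# Report every key whose live value differs from the file, or which the
# running kernel does not expose at all. Returns 1 if anything is reported.
check_sysctl_conf() {
  local key want have rc=0
  while read -r key want; do
    if ! have=$(sysctl -n "$key" 2>/dev/null); then
      echo "MISSING  $key"
      rc=1
    elif [ "$have" != "$want" ]; then
      echo "MISMATCH $key: want=$want have=$have"
      rc=1
    fi
  done < <(parse_sysctl_conf "$1")
  return "$rc"
}
```

Running `check_sysctl_conf /etc/sysctl.d/k8s.conf` after `sysctl --system` lists anything that did not take effect; a MISSING line usually points at a module that is not loaded or a key the kernel version does not provide.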

2. Installing the Container Runtime

Kubernetes supports several container runtimes; this guide installs and configures containerd.

2.1 Installing containerd

This tarball ships containerd only; runc and the CNI plugins are separate downloads and must also be present on each node.

# Download the containerd binary package
wget https://github.com/containerd/containerd/releases/download/v2.0.5/containerd-2.0.5-linux-amd64.tar.gz
tar xf containerd-*-linux-amd64.tar.gz -C /usr/local/

# Create the systemd service
cat > /etc/systemd/system/containerd.service << EOF
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
EOF

# Load the kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
systemctl restart systemd-modules-load.service

# Generate the default configuration
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml

# Point the sandbox image at a mirror registry in China
sed -i "s#registry.k8s.io#registry.aliyuncs.com/chenby#g" /etc/containerd/config.toml

# Start the service
systemctl daemon-reload
systemctl enable --now containerd

2.2 Configuring the crictl Client

# Download crictl
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.34.0/crictl-v1.34.0-linux-amd64.tar.gz
tar xf crictl-v*-linux-amd64.tar.gz -C /usr/bin/

# Create the configuration file
cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

3. Installing the Kubernetes Components

3.1 Downloading the Kubernetes Binaries

# Download etcd and the Kubernetes components
wget https://github.com/etcd-io/etcd/releases/download/v3.5.21/etcd-v3.5.21-linux-amd64.tar.gz
wget https://cdn.dl.k8s.io/release/v1.34.0/kubernetes-server-linux-amd64.tar.gz

# Extract and install
tar -xf kubernetes-server-linux-amd64.tar.gz --strip-components=3 -C /usr/local/bin kubernetes/server/bin/kube{let,ctl,-apiserver,-controller-manager,-scheduler,-proxy}
tar -xf etcd*.tar.gz && mv etcd-*/etcd /usr/local/bin/ && mv etcd-*/etcdctl /usr/local/bin/
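
Because these binaries are copied straight into /usr/local/bin, it can be worth checking each download against the checksum its project publishes before extracting it. A minimal sketch, assuming GNU coreutils and bash 4 or later; verify_sha256 is a hypothetical helper, and the expected digest has to be supplied by hand from the release's published checksum.

```shell
#!/usr/bin/env bash
# Compare a file's SHA-256 digest with an expected hex string.
# Returns 0 on match, non-zero on mismatch or when the file cannot be read.
verify_sha256() {
  local file=$1 expected=$2 actual
  actual=$(sha256sum "$file" 2>/dev/null | awk '{ print $1 }')
  # ${expected,,} lowercases the expected digest so the comparison is
  # case-insensitive.
  [ -n "$actual" ] && [ "$actual" = "${expected,,}" ]
}
```

A typical call would look like `verify_sha256 etcd-v3.5.21-linux-amd64.tar.gz "<published digest>" || exit 1`, placed between the wget and tar steps.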

3.2 Generating Certificates

A Kubernetes cluster needs many certificates for authentication between components. This guide generates them with the cfssl tools.

3.2.1 Installing cfssl

wget "https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssl_1.6.5_linux_amd64" -O /usr/local/bin/cfssl
wget "https://github.com/cloudflare/cfssl/releases/download/v1.6.5/cfssljson_1.6.5_linux_amd64" -O /usr/local/bin/cfssljson
chmod +x /usr/local/bin/cfssl /usr/local/bin/cfssljson

3.2.2 Generating the CA Certificate

# cfssljson does not create the output directory, so create it first
mkdir -p /etc/kubernetes/pki

cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "876000h"
    },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "876000h"
      }
    }
  }
}
EOF

cat > ca-csr.json << EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "Kubernetes",
      "OU": "Kubernetes-manual"
    }
  ],
  "ca": {
    "expiry": "876000h"
  }
}
EOF

cfssl gencert -initca ca-csr.json | cfssljson -bare /etc/kubernetes/pki/ca

3.2.3 Generating the API Server Certificate

cat > apiserver-csr.json << EOF
{
  "CN": "kube-apiserver",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "Kubernetes",
      "OU": "Kubernetes-manual"
    }
  ]
}
EOF

cfssl gencert \
  -ca=/etc/kubernetes/pki/ca.pem \
  -ca-key=/etc/kubernetes/pki/ca-key.pem \
  -config=ca-config.json \
  -hostname=10.96.0.1,172.16.1.36,127.0.0.1,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,172.16.1.31,172.16.1.32,172.16.1.33,172.16.1.34,172.16.1.35 \
  -profile=kubernetes \
  apiserver-csr.json | cfssljson -bare /etc/kubernetes/pki/apiserver
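
The -hostname list is easy to get wrong when typed by hand. It has to contain every name and address clients use to reach the API server: the first address of the Service CIDR (10.96.0.1, the ClusterIP of the built-in kubernetes Service), the virtual IP, the loopback address, the in-cluster DNS names, and the host addresses. A sketch that assembles the list from its parts, with made-up function names:

```shell
#!/usr/bin/env bash
# Join all arguments with commas.
join_by_comma() {
  local IFS=,
  echo "$*"
}

# Build the SAN list for the API server certificate.
# $1: first IP of the Service CIDR, $2: virtual IP, remaining args: host IPs.
build_apiserver_sans() {
  local svc_ip=$1 vip=$2
  shift 2
  local dns=(
    kubernetes
    kubernetes.default
    kubernetes.default.svc
    kubernetes.default.svc.cluster
    kubernetes.default.svc.cluster.local
  )
  join_by_comma "$svc_ip" "$vip" 127.0.0.1 "${dns[@]}" "$@"
}
```

The result can be passed straight to cfssl, for example `-hostname="$(build_apiserver_sans 10.96.0.1 172.16.1.36 172.16.1.31 172.16.1.32 172.16.1.33)"`. Adding spare addresses for future masters at this point avoids re-issuing the certificate later.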

4. Deploying the High-Availability Layer

4.1 High Availability with keepalived + haproxy

4.1.1 Installing haproxy

yum install -y haproxy

cat > /etc/haproxy/haproxy.cfg << EOF
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s

defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s

frontend k8s-master
  bind 0.0.0.0:9443
  bind 127.0.0.1:9443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master

backend k8s-master
  mode tcp
  option tcplog
  option tcp-check
  balance roundrobin
  server k8s-master01 172.16.1.31:6443 check
  server k8s-master02 172.16.1.32:6443 check
  server k8s-master03 172.16.1.33:6443 check
EOF

systemctl enable --now haproxy
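
The backend section needs one server line per master, and the list changes whenever a control plane node is added or replaced. As a rough sketch, the lines can be rendered from a list of addresses instead of being edited by hand; render_backend_servers is a hypothetical helper, and the k8s-masterNN naming simply follows the host names used in this guide.

```shell
#!/usr/bin/env bash
# Print one haproxy "server" line per master address.
# $1: API server port, remaining args: master IP addresses in order.
render_backend_servers() {
  local port=$1
  shift
  local i=1 ip
  for ip in "$@"; do
    printf '  server k8s-master%02d %s:%s check\n' "$i" "$ip" "$port"
    i=$((i + 1))
  done
}
```

After any change to the file, `haproxy -c -f /etc/haproxy/haproxy.cfg` checks the syntax before the service is restarted.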
4.1.2 Installing keepalived

yum install -y keepalived

# Master node configuration
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        172.16.1.36
    }
    track_script {
        chk_apiserver
    }
}
EOF

# Health check script
cat > /etc/keepalived/check_apiserver.sh << EOF
#!/bin/bash
err=0
for k in \$(seq 1 3)
do
    check_code=\$(pgrep haproxy)
    if [[ \$check_code == "" ]]; then
        err=\$(expr \$err + 1)
        sleep 1
        continue
    else
        err=0
        break
    fi
done

if [[ \$err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi
EOF

chmod +x /etc/keepalived/check_apiserver.sh
systemctl enable --now keepalived
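
The configuration above is the one for the node that initially holds the virtual IP. The other two masters normally run the same file with state BACKUP and a lower priority, which this guide does not show. The sketch below derives those two values from the node's position; the function name and the step of 10 between priorities are assumptions made for illustration, not values from the original setup.

```shell
#!/usr/bin/env bash
# Print "<state> <priority>" for the Nth master (1-based).
# Node 1 starts as MASTER with priority 100; every later node is a BACKUP
# whose priority drops by 10 per position (assumed spacing).
keepalived_role() {
  local index=$1
  if [ "$index" -eq 1 ]; then
    echo "MASTER 100"
  else
    echo "BACKUP $(( 100 - (index - 1) * 10 ))"
  fi
}
```

On k8s-master02, for instance, `read -r state priority <<< "$(keepalived_role 2)"` yields BACKUP and 90, which can then replace the state and priority lines in keepalived.conf. The interface name (ens160 above) also has to match each host.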

5. Deploying the Control Plane Components

5.1 Deploying the etcd Cluster

# Create the etcd configuration file
cat > /etc/etcd/etcd.config.yml << EOF
name: 'k8s-master01'
# ... (the remainder of the etcd configuration is truncated in the source)
EOF

# kube-apiserver systemd unit (values shown for k8s-master01)
cat > /usr/lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
      --v=2 \\
      --allow-privileged=true \\
      --bind-address=0.0.0.0 \\
      --secure-port=6443 \\
      --advertise-address=172.16.1.31 \\
      --service-cluster-ip-range=10.96.0.0/12,fd00:1111::/112 \\
      --service-node-port-range=30000-32767 \\
      --etcd-servers=https://172.16.1.31:2379,https://172.16.1.32:2379,https://172.16.1.33:2379 \\
      --etcd-cafile=/etc/kubernetes/pki/etcd/etcd-ca.pem \\
      --etcd-certfile=/etc/kubernetes/pki/etcd/etcd.pem \\
      --etcd-keyfile=/etc/kubernetes/pki/etcd/etcd-key.pem \\
      --client-ca-file=/etc/kubernetes/pki/ca.pem \\
      --tls-cert-file=/etc/kubernetes/pki/apiserver.pem \\
      --tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem \\
      --kubelet-client-certificate=/etc/kubernetes/pki/apiserver.pem \\
      --kubelet-client-key=/etc/kubernetes/pki/apiserver-key.pem \\
      --service-account-key-file=/etc/kubernetes/pki/sa.pub \\
      --service-account-signing-key-file=/etc/kubernetes/pki/sa.key \\
      --service-account-issuer=https://kubernetes.default.svc.cluster.local \\
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \\
      --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \\
      --authorization-mode=Node,RBAC \\
      --enable-bootstrap-token-auth=true \\
      --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem \\
      --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.pem \\
      --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client-key.pem \\
      --requestheader-allowed-names=aggregator \\
      --requestheader-group-headers=X-Remote-Group \\
      --requestheader-extra-headers-prefix=X-Remote-Extra- \\
      --requestheader-username-headers=X-Remote-User \\
      --enable-aggregator-routing=true
Restart=on-failure
RestartSec=10s
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable --now kube-apiserver
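
The unit references /etc/kubernetes/pki/sa.key and sa.pub, the key pair used to sign and verify ServiceAccount tokens, but the certificate section of this guide does not show how they are created. One common way is plain openssl, sketched below; generate_sa_keypair is a hypothetical helper name, and the 2048-bit size mirrors the RSA size used for the other keys here.

```shell
#!/usr/bin/env bash
# Generate an RSA key pair for ServiceAccount token signing.
# $1: target directory (the unit above expects /etc/kubernetes/pki).
generate_sa_keypair() {
  local dir=$1
  mkdir -p "$dir" || return 1
  # Never overwrite an existing key: every API server must use the same
  # pair, otherwise tokens issued by one are rejected by the others.
  [ -e "$dir/sa.key" ] && return 0
  openssl genrsa -out "$dir/sa.key" 2048 2>/dev/null &&
    openssl rsa -in "$dir/sa.key" -pubout -out "$dir/sa.pub" 2>/dev/null &&
    chmod 600 "$dir/sa.key"
}
```

Run `generate_sa_keypair /etc/kubernetes/pki` on one master only, then copy both files to the other masters before starting kube-apiserver there.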

6. Cluster Networking and Add-ons

6.1 Installing the Calico Network Plugin

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/tigera-operator.yaml
curl https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/custom-resources.yaml -O

# Replace the contents of custom-resources.yaml
cat > custom-resources.yaml << EOF
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - name: default-ipv4-ippool
      blockSize: 26
      cidr: 172.16.0.0/12
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
EOF

kubectl create -f custom-resources.yaml

6.2 Installing CoreDNS

helm repo add coredns https://coredns.github.io/helm
helm install coredns coredns/coredns -n kube-system --set service.clusterIP=10.96.0.10

7. Joining Nodes and Verifying the Cluster

7.1 Joining Worker Nodes to the Cluster

Run the following on each worker node:

# Install the kubelet service
cat > /usr/lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=containerd.service

[Service]
ExecStart=/usr/local/bin/kubelet \\
    --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig \\
    --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
    --config=/etc/kubernetes/kubelet-conf.yml \\
    --container-runtime-endpoint=unix:///run/containerd/containerd.sock \\
    --node-labels=node.kubernetes.io/node=
Restart=always
RestartSec=10s

[Install]
WantedBy=multi-user.target
EOF

# Start the kubelet
systemctl daemon-reload
systemctl enable --now kubelet
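
With --bootstrap-kubeconfig, the kubelet requests its client certificate through a CertificateSigningRequest when it first starts. Whether that request is approved automatically depends on RBAC bindings that this guide does not show, so a new node can sit unregistered while its request stays Pending. The sketch below picks pending requests out of the `kubectl get csr --no-headers` output; pending_csrs is a made-up helper name, and it assumes CONDITION is the last column of that output.

```shell
#!/usr/bin/env bash
# Read `kubectl get csr --no-headers` output on stdin and print the names
# of requests whose CONDITION (last column) is exactly "Pending".
pending_csrs() {
  awk '$NF == "Pending" { print $1 }'
}
```

A typical use from a machine with admin access is `kubectl get csr --no-headers | pending_csrs`, followed by `kubectl certificate approve <name>` for each request that belongs to a node being added on purpose.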

7.2 Verifying the Cluster

# Check node status
kubectl get nodes

# Deploy a test Pod
cat<<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox:1.28
    command: ["sleep", "3600"]
EOF

# Check that the Pod is running
kubectl get pod -o wide
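
For repeated checks while nodes are joining, the node list can be reduced to the nodes that still need attention. A small sketch, assuming the default column order of `kubectl get nodes` (NAME, STATUS, ...); not_ready_nodes is a hypothetical helper, and it deliberately also flags cordoned nodes, whose status reads "Ready,SchedulingDisabled".

```shell
#!/usr/bin/env bash
# Read `kubectl get nodes --no-headers` output on stdin and print the names
# of nodes whose STATUS column is anything other than exactly "Ready".
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Usage: `kubectl get nodes --no-headers | not_ready_nodes`. Empty output means every node reports Ready.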

8. Cluster Extension Components

8.1 Installing Metrics Server

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server.yaml

# Edit metrics-server.yaml and add these flags to the container's args:
#   - --kubelet-insecure-tls
#   - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.pem
# The CA file must also be mounted into the container for the second flag to work.

kubectl apply -f metrics-server.yaml

# Verify
kubectl top node

8.2 Installing the Dashboard

helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard \
  --namespace kube-system \
  --set service.type=NodePort

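9. Recommendations for Production

  1. Log collection: deploy EFK (Elasticsearch + Fluentd + Kibana) or Loki + Promtail + Grafana
  2. Monitoring and alerting: deploy Prometheus + Alertmanager + Grafana
  3. Backup and recovery: back up etcd regularly, for example with etcdctl snapshot save
  4. Security hardening
    • Enforce the Pod Security Standards through Pod Security Admission (PodSecurityPolicy was removed in Kubernetes v1.25)
    • Use NetworkPolicy to restrict Pod-to-Pod traffic
    • Rotate certificates regularly
  5. Autoscaling: configure Cluster Autoscaler and HPA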
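
The etcd backup in item 3 can be sketched as a pair of shell functions: one takes a timestamped snapshot, the other keeps only the newest few files. The certificate paths and the endpoint are the ones the kube-apiserver unit in this guide uses for etcd; the function names, the file naming pattern and the retention count are assumptions, and the pruning step relies on GNU head and xargs.

```shell
#!/usr/bin/env bash
# Take a timestamped etcd snapshot into the given directory.
backup_etcd() {
  local dir=$1
  local file
  file="$dir/etcd-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$dir" || return 1
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://172.16.1.31:2379 \
    --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem \
    --cert=/etc/kubernetes/pki/etcd/etcd.pem \
    --key=/etc/kubernetes/pki/etcd/etcd-key.pem \
    snapshot save "$file"
}

# Delete all but the newest $2 snapshots in directory $1.
# The timestamp in the file name makes lexical order equal to age order.
prune_snapshots() {
  local dir=$1 keep=$2
  ls -1 "$dir"/etcd-*.db 2>/dev/null | sort | head -n -"$keep" | xargs -r rm -f --
}
```

A daily cron job could call `backup_etcd /var/backups/etcd && prune_snapshots /var/backups/etcd 7`; the directory is only an example. Copying snapshots off the master nodes matters as much as taking them.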

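10. Troubleshooting Common Problems

  1. Node is NotReady

    • Check the kubelet log: journalctl -u kubelet -f
    • Verify that the network plugin is running
    • Check that the node has enough resources
  2. Pod creation fails

    • Use kubectl describe pod <pod-name> to inspect the events
    • Check resource quota limits
    • Verify that the image can be pulled
  3. Network problems

    • Use kubectl exec to enter a Pod and test connectivity
    • Check that CoreDNS is running
    • Verify the network plugin configuration
  4. Expired certificates

    • Check certificate validity regularly: openssl x509 -in /etc/kubernetes/pki/apiserver.pem -noout -text | grep Not
    • On kubeadm-deployed clusters, renew with kubeadm certs renew (this replaces the older kubeadm alpha certs renew form); in a manual deployment like this one, re-issue the certificates with cfssl instead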
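
The manual openssl check can be turned into a small report that lists every certificate close to expiry. The sketch below uses openssl's -checkend option; the function names are made up for this example, and it assumes the cfssl naming used in this guide, where certificates end in .pem and private keys in -key.pem.

```shell
#!/usr/bin/env bash
# Return 0 when the certificate expires within the given number of days,
# 1 when it stays valid longer, 2 when the file cannot be read.
cert_expires_within() {
  local cert=$1 days=$2
  [ -r "$cert" ] || return 2
  # -checkend exits 0 if the certificate is still valid after N seconds,
  # so the result is negated here.
  ! openssl x509 -in "$cert" -noout -checkend $(( days * 86400 )) >/dev/null
}

# Print every certificate in a directory that expires within N days.
report_expiring_certs() {
  local dir=$1 days=$2 f
  for f in "$dir"/*.pem; do
    [ -e "$f" ] || continue
    case $f in
      *-key.pem) continue ;;   # private keys, not certificates
    esac
    if cert_expires_within "$f" "$days"; then
      echo "$f"
    fi
  done
}
```

For example, `report_expiring_certs /etc/kubernetes/pki 30` lists the certificates that need to be re-issued within the next month; running it from cron with a mail or alert hook avoids finding out from a failed API call.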