
Using a Kubernetes Readiness Probe: make sure a Java application only receives traffic after the database has recovered

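In a Kubernetes environment there are several clean ways to handle application reconnection after a MySQL outage. The full set of approaches is described below.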

I. Kubernetes-native solutions

1. Readiness Probe + Liveness Probe configuration

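apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: springboot-app
  template:
    metadata:
      labels:
        app: springboot-app
    spec:
      containers:
        - name: app
          image: your-springboot-app:latest
          ports:
            - containerPort: 8080
          env:
            - name: SPRING_DATASOURCE_URL
              value: "jdbc:mysql://mysql-service:3306/db?autoReconnect=true&useSSL=false"
            # Spring Boot maps environment variables to properties by removing dashes,
            # so connection-test-query becomes CONNECTIONTESTQUERY
            - name: SPRING_DATASOURCE_HIKARI_CONNECTIONTESTQUERY
              value: "SELECT 1"
            - name: SPRING_DATASOURCE_HIKARI_VALIDATIONTIMEOUT
              value: "5000"
          # Liveness probe: checks whether the application is alive.
          # Note: /actuator/health aggregates all indicators, including the database,
          # so a database outage also fails this probe and restarts the container.
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          # Readiness probe: checks whether the application is ready for traffic.
          # Note: by default the readiness group does not include the database check;
          # add it with management.endpoint.health.group.readiness.include=readinessState,db
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 1
          # Startup probe: checks whether the application has finished starting
          startupProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            failureThreshold: 30
            periodSeconds: 10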
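To make the probe settings concrete, the following minimal sketch works out roughly how long each probe has to keep failing before Kubernetes acts on it, using the `periodSeconds` and `failureThreshold` values from the manifest. The class and method names (`ProbeTiming`, `failureWindowSeconds`) are hypothetical, and the figures are approximations that ignore `timeoutSeconds` and probe scheduling jitter.

```java
// Sketch: approximate reaction times derived from the probe settings in the manifest.
final class ProbeTiming {

    /** Seconds of consecutive failed checks before the probe counts as failed. */
    static int failureWindowSeconds(int periodSeconds, int failureThreshold) {
        return periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        // readinessProbe: periodSeconds=5, failureThreshold=1
        System.out.println("readiness window: " + failureWindowSeconds(5, 1) + "s");   // 5s
        // livenessProbe: periodSeconds=10, failureThreshold=3
        System.out.println("liveness window:  " + failureWindowSeconds(10, 3) + "s");  // 30s
        // startupProbe: periodSeconds=10, failureThreshold=30
        System.out.println("startup budget:   " + failureWindowSeconds(10, 30) + "s"); // 300s
    }
}
```

With `failureThreshold: 1` on the readiness probe, a single failed or slow health response is enough to take the pod out of the Service endpoints. That reacts quickly to a database outage, at the cost of being sensitive to one-off hiccups.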

2. Custom readiness endpoint

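import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.sql.DataSource;

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// To influence /actuator/health/readiness, this indicator must be added to the
// readiness health group (management.endpoint.health.group.readiness.include).
@Component
public class DatabaseReadinessIndicator implements HealthIndicator {

    private final DataSource dataSource;
    private final AtomicBoolean databaseAvailable = new AtomicBoolean(true);

    public DatabaseReadinessIndicator(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Health health() {
        // Check the database connection
        boolean dbHealthy = checkDatabase();
        databaseAvailable.set(dbHealthy);
        if (dbHealthy) {
            return Health.up().withDetail("database", "available").build();
        } else {
            return Health.down().withDetail("database", "unavailable").build();
        }
    }

    private boolean checkDatabase() {
        // isValid(3) waits at most 3 seconds for the driver to validate the connection
        try (Connection conn = dataSource.getConnection()) {
            return conn.isValid(3);
        } catch (SQLException e) {
            return false;
        }
    }

    public boolean isDatabaseAvailable() {
        return databaseAvailable.get();
    }
}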
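The indicator above borrows and validates a connection on every call, and the readiness probe calls it every 5 seconds on each replica. One optional refinement, not part of the original article, is to cache the last result for a short time. The sketch below uses only the JDK; `CachedCheck` is a hypothetical helper, and the `check` supplier stands in for `checkDatabase()`. The injectable clock exists only to make the behavior easy to verify.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Sketch: remember the last health result for ttlMillis before running the check again.
final class CachedCheck {
    private final BooleanSupplier check;    // the expensive check, e.g. a database ping
    private final long ttlMillis;           // how long a result stays valid
    private final LongSupplier clockMillis; // time source, e.g. System::currentTimeMillis

    private boolean hasValue;
    private boolean lastResult;
    private long lastCheckedAt;

    CachedCheck(BooleanSupplier check, long ttlMillis, LongSupplier clockMillis) {
        this.check = check;
        this.ttlMillis = ttlMillis;
        this.clockMillis = clockMillis;
    }

    /** Returns the cached result, refreshing it once the TTL has elapsed. */
    synchronized boolean isHealthy() {
        long now = clockMillis.getAsLong();
        if (!hasValue || now - lastCheckedAt >= ttlMillis) {
            lastResult = check.getAsBoolean();
            lastCheckedAt = now;
            hasValue = true;
        }
        return lastResult;
    }
}
```

Inside `health()` this could be used as `new CachedCheck(this::checkDatabase, 2000, System::currentTimeMillis)`, created once in the constructor. The TTL should stay below the probe period plus the outage detection time you can tolerate, since a cached "up" delays detection by up to one TTL.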

II. Restart script solutions

1. Monitor-and-restart script (monitor-restart.sh)

#!/bin/bash
# Restarts the application after a MySQL outage on Kubernetes
# Usage: ./monitor-restart.sh <namespace> <deployment-name>
set -e

NAMESPACE=${1:-default}
DEPLOYMENT=${2:-springboot-app}
MYSQL_SERVICE="mysql-service"
CHECK_INTERVAL=30   # check interval (seconds)
MAX_RETRIES=3       # maximum number of retries
LOG_FILE="/tmp/k8s-mysql-monitor.log"

# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Logging helpers
log() {
    echo -e "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "${LOG_FILE}"
}
log_info()  { log "${GREEN}[INFO]${NC} $1"; }
log_warn()  { log "${YELLOW}[WARN]${NC} $1"; }
log_error() { log "${RED}[ERROR]${NC} $1"; }

# Check that kubectl is available and the namespace exists
check_kubectl() {
    if ! command -v kubectl &>/dev/null; then
        log_error "kubectl not found"
        exit 1
    fi
    if ! kubectl get ns "${NAMESPACE}" &>/dev/null; then
        log_error "Namespace ${NAMESPACE} not found"
        exit 1
    fi
}

# Check MySQL status
check_mysql_status() {
    local mysql_pod
    mysql_pod=$(kubectl get pod -n "${NAMESPACE}" -l app=mysql -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
    if [ -z "${mysql_pod}" ]; then
        log_warn "MySQL pod not found"
        return 1
    fi

    # Check whether the MySQL pod is Ready
    local ready_status
    ready_status=$(kubectl get pod "${mysql_pod}" -n "${NAMESPACE}" -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
    if [ "${ready_status}" != "True" ]; then
        log_warn "MySQL pod is not ready"
        return 1
    fi

    # Try to connect to MySQL
    if kubectl exec -n "${NAMESPACE}" "${mysql_pod}" -- mysqladmin ping -h localhost -u root -p"${MYSQL_ROOT_PASSWORD}" &>/dev/null; then
        return 0
    else
        return 1
    fi
}

# Get the MySQL pod start time (changes when the pod is recreated)
get_mysql_restart_time() {
    local mysql_pod
    mysql_pod=$(kubectl get pod -n "${NAMESPACE}" -l app=mysql -o jsonpath='{.items[0].metadata.name}')
    if [ -n "${mysql_pod}" ]; then
        kubectl get pod "${mysql_pod}" -n "${NAMESPACE}" -o jsonpath='{.status.startTime}'
    fi
}

# Restart the Deployment
restart_deployment() {
    local reason=$1
    log_warn "Restarting deployment ${DEPLOYMENT} due to: ${reason}"

    # Record the pod list before the restart
    local old_pods
    old_pods=$(kubectl get pods -n "${NAMESPACE}" -l "app=${DEPLOYMENT}" -o name)

    # Restart the deployment
    kubectl rollout restart "deployment/${DEPLOYMENT}" -n "${NAMESPACE}"

    # Wait for the new pods to become ready
    log_info "Waiting for new pods to be ready..."
    if kubectl rollout status "deployment/${DEPLOYMENT}" -n "${NAMESPACE}" --timeout=300s; then
        log_info "Deployment restarted successfully"
        # Clean up old pods. The rollout normally removes them already, so a
        # "not found" error is expected and must not abort the script (set -e).
        for pod in ${old_pods}; do
            kubectl delete "${pod}" -n "${NAMESPACE}" --grace-period=30 &>/dev/null || true
        done
        return 0
    else
        log_error "Deployment restart failed"
        return 1
    fi
}

# Rolling restart (one pod at a time)
rolling_restart() {
    local reason=$1
    log_warn "Performing rolling restart due to: ${reason}"

    # Deployment pods have generated name suffixes, so list the real pod names
    # instead of assuming <deployment>-0, <deployment>-1, ...
    local pod
    for pod in $(kubectl get pods -n "${NAMESPACE}" -l "app=${DEPLOYMENT}" -o name); do
        log_info "Restarting ${pod}..."
        kubectl delete "${pod}" -n "${NAMESPACE}" --wait=true --timeout=120s
        # Wait for the replacement pod to become ready
        sleep 10
    done
    log_info "Rolling restart completed"
}

# Main monitoring loop
monitor_loop() {
    local mysql_was_down=false
    local last_mysql_restart_time=""
    local current_restart_time

    while true; do
        if check_mysql_status; then
            # MySQL is healthy
            current_restart_time=$(get_mysql_restart_time)
            if [ "${mysql_was_down}" = true ]; then
                log_info "MySQL has recovered, restarting application..."
                restart_deployment "MySQL recovery detected"
                mysql_was_down=false
            elif [ -n "${last_mysql_restart_time}" ] && [ "${current_restart_time}" != "${last_mysql_restart_time}" ]; then
                log_info "MySQL was restarted, restarting application..."
                restart_deployment "MySQL pod restart detected"
            fi
            last_mysql_restart_time="${current_restart_time}"
        else
            # MySQL is unhealthy
            if [ "${mysql_was_down}" = false ]; then
                log_warn "MySQL is down, waiting for recovery..."
                mysql_was_down=true
            fi
        fi
        sleep ${CHECK_INTERVAL}
    done
}

# Main function
main() {
    log_info "Starting MySQL monitor for deployment ${DEPLOYMENT} in namespace ${NAMESPACE}"
    log_info "Check interval: ${CHECK_INTERVAL}s"

    check_kubectl

    # Read the MySQL root password from the secret
    MYSQL_ROOT_PASSWORD=$(kubectl get secret mysql-secret -n "${NAMESPACE}" -o jsonpath='{.data.root-password}' 2>/dev/null | base64 --decode 2>/dev/null)
    if [ -z "${MYSQL_ROOT_PASSWORD}" ]; then
        log_warn "MySQL password not found in secret, using default"
        MYSQL_ROOT_PASSWORD="root"
    fi

    # Handle interrupt signals
    trap 'log_info "Monitor stopped"; exit 0' INT TERM

    # Start monitoring
    monitor_loop
}

# Run the main function
main
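The decision logic inside `monitor_loop` is a small state machine, and it can help to look at it in isolation from the `kubectl` calls. The sketch below restates it in plain Java so that it can be exercised without a cluster. `RecoveryDecision`, `Action`, and `observe` are hypothetical names; the start-time string stands in for the value returned by `get_mysql_restart_time`.

```java
// Sketch: the restart decision of monitor_loop, separated from the kubectl calls.
final class RecoveryDecision {

    enum Action { NONE, RESTART_AFTER_RECOVERY, RESTART_AFTER_MYSQL_RESTART }

    private boolean mysqlWasDown = false;
    private String lastStartTime = "";

    /**
     * Feed one health observation and get the action the monitor would take.
     * currentStartTime is only read when mysqlUp is true.
     */
    Action observe(boolean mysqlUp, String currentStartTime) {
        if (!mysqlUp) {
            // Remember the outage; the application is restarted only after recovery.
            mysqlWasDown = true;
            return Action.NONE;
        }
        Action action = Action.NONE;
        if (mysqlWasDown) {
            action = Action.RESTART_AFTER_RECOVERY;
            mysqlWasDown = false;
        } else if (!lastStartTime.isEmpty() && !lastStartTime.equals(currentStartTime)) {
            // MySQL looks healthy, but the pod was recreated between two checks.
            action = Action.RESTART_AFTER_MYSQL_RESTART;
        }
        lastStartTime = currentStartTime;
        return action;
    }

    public static void main(String[] args) {
        RecoveryDecision decision = new RecoveryDecision();
        System.out.println(decision.observe(true, "t1"));  // NONE
        System.out.println(decision.observe(false, ""));   // NONE
        System.out.println(decision.observe(true, "t1"));  // RESTART_AFTER_RECOVERY
        System.out.println(decision.observe(true, "t2"));  // RESTART_AFTER_MYSQL_RESTART
    }
}
```

The second branch covers outages shorter than the check interval: the monitor never sees MySQL down, but a changed pod start time still shows that existing connections were dropped.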

2. Startup wait script (wait-for-mysql.sh)

#!/bin/bash
# Wait for MySQL to be ready before starting the Spring Boot application
set -e

MYSQL_HOST=${MYSQL_HOST:-mysql-service}
MYSQL_PORT=${MYSQL_PORT:-3306}
MYSQL_USER=${MYSQL_USER:-root}
MYSQL_PASSWORD=${MYSQL_PASSWORD:-root}
MYSQL_DATABASE=${MYSQL_DATABASE:-db}
TIMEOUT=${TIMEOUT:-300}
INTERVAL=${INTERVAL:-5}

echo "Waiting for MySQL to be ready at ${MYSQL_HOST}:${MYSQL_PORT}..."
start_time=$(date +%s)

while true; do
    current_time=$(date +%s)
    elapsed=$((current_time - start_time))
    if [ ${elapsed} -gt ${TIMEOUT} ]; then
        echo "Timeout after ${TIMEOUT} seconds waiting for MySQL"
        exit 1
    fi

    # Try to connect to MySQL
    if mysqladmin ping -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" --silent &>/dev/null; then
        echo "MySQL is ready!"
        break
    fi
    echo "MySQL not ready yet... (${elapsed}s/${TIMEOUT}s)"
    sleep ${INTERVAL}
done

# Verify that the database exists
if ! mysql -h "${MYSQL_HOST}" -P "${MYSQL_PORT}" -u "${MYSQL_USER}" -p"${MYSQL_PASSWORD}" -e "USE ${MYSQL_DATABASE}" 2>/dev/null; then
    echo "Database ${MYSQL_DATABASE} does not exist"
    exit 1
fi
echo "Database ${MYSQL_DATABASE} is ready"

# Start the Spring Boot application
exec java -jar /app.jar
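If installing a MySQL client in the image is undesirable, a similar wait loop can be written with the JDK alone. The following is a minimal sketch under that assumption: it only checks that the TCP port accepts connections, which is a weaker signal than `mysqladmin ping` because it does not verify credentials or that the server is ready for queries. `WaitForPort` and its parameters are hypothetical names.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: poll a TCP port until it accepts a connection or the timeout expires.
final class WaitForPort {

    /** Returns true as soon as host:port accepts a connection, false on timeout. */
    static boolean waitForPort(String host, int port, long timeoutMillis, long intervalMillis) {
        long start = System.nanoTime();
        while (true) {
            try (Socket socket = new Socket()) {
                // Per-attempt connect timeout of 1 second
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException e) {
                // Port not reachable yet; fall through and retry.
            }
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            if (elapsedMillis >= timeoutMillis) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With the defaults from the script, the call would be `WaitForPort.waitForPort("mysql-service", 3306, 300_000, 5_000)`.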

3. Dockerfile integration

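FROM openjdk:11-jre-slim

# Install the MySQL client (on Debian-based images the package is default-mysql-client)
RUN apt-get update && apt-get install -y default-mysql-client && rm -rf /var/lib/apt/lists/*

# Copy the wait script
COPY wait-for-mysql.sh /wait-for-mysql.sh
RUN chmod +x /wait-for-mysql.sh

# Copy the application
COPY target/*.jar /app.jar

# Use the script as the entrypoint.
# The script itself ends with "exec java -jar /app.jar"; the CMD below is passed
# to it as arguments but is not used by the script as written.
ENTRYPOINT ["/wait-for-mysql.sh"]
CMD ["java", "-jar", "/app.jar"]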

4. Kubernetes CronJob for recovery

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-recovery-check
  namespace: default
spec:
  schedule: "*/5 * * * *"   # run every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: mysql-recovery-sa
          containers:
            - name: recovery-check
              image: bitnami/kubectl:latest
              command:
                - /bin/bash
                - -c
                - |
                  # Check MySQL status
                  MYSQL_POD=$(kubectl get pod -l app=mysql -o jsonpath='{.items[0].metadata.name}')
                  MYSQL_READY=$(kubectl get pod $MYSQL_POD -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
                  if [ "$MYSQL_READY" != "True" ]; then
                    echo "MySQL is not ready, skipping..."
                    exit 0
                  fi

                  # Check the application's connection state
                  APP_PODS=$(kubectl get pods -l app=springboot-app -o name)
                  for pod in $APP_PODS; do
                    # Look for database connection errors in the application logs
                    if kubectl logs $pod --tail=50 | grep -q "CommunicationsException\|Connection refused"; then
                      echo "Found connection errors in $pod, restarting deployment"
                      kubectl rollout restart deployment/springboot-app
                      break
                    fi
                  done
          restartPolicy: OnFailure
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mysql-recovery-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mysql-recovery-role
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list"]
  # "restart" is not a Kubernetes API verb; kubectl rollout restart patches the Deployment
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mysql-recovery-binding
subjects:
  - kind: ServiceAccount
    name: mysql-recovery-sa
roleRef:
  kind: Role
  name: mysql-recovery-role
  apiGroup: rbac.authorization.k8s.io
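The `grep` in the CronJob treats a pod as broken when any of its last 50 log lines mentions a connection error. For readers who want to reuse the same check inside a Java tool, here is a minimal sketch of the equivalent matching logic. `ConnectionErrorScan` and `hasConnectionError` are hypothetical names, and the two patterns are taken directly from the `grep` expression.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch: the same match as grep -q "CommunicationsException\|Connection refused".
final class ConnectionErrorScan {

    private static final Pattern DB_ERROR =
            Pattern.compile("CommunicationsException|Connection refused");

    /** Returns true if any log line contains one of the known connection errors. */
    static boolean hasConnectionError(List<String> logLines) {
        return logLines.stream().anyMatch(line -> DB_ERROR.matcher(line).find());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "INFO  Started Application in 4.2 seconds",
                "ERROR com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure");
        System.out.println(hasConnectionError(lines)); // true
    }
}
```

One limitation applies to both versions: an old error that is still within the last 50 lines keeps matching after the application has reconnected, which can trigger a restart that is no longer needed.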

III. Using the sidecar container pattern

1. Sidecar configuration

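apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: springboot-app
  template:
    metadata:
      labels:
        app: springboot-app
    spec:
      containers:
        - name: app
          image: springboot-app:latest
          env:
            - name: SPRING_DATASOURCE_URL
              value: "jdbc:mysql://localhost:3306/db"   # goes through the sidecar proxy
        - name: mysql-proxy
          image: alpine:latest
          env:
            # Password used by mysqladmin below, read from the same secret as the monitor script
            - name: MYSQL_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: root-password
          command:
            - /bin/sh
            - -c
            - |
              apk add --no-cache mysql-client socat
              PROXY_PID=""
              # Monitor the MySQL connection
              while true; do
                if mysqladmin ping -h mysql-service -u root -p"$MYSQL_PASSWORD" >/dev/null 2>&1; then
                  # MySQL is healthy: forward connections (start the forwarder only once)
                  if [ -z "$PROXY_PID" ] || ! kill -0 "$PROXY_PID" 2>/dev/null; then
                    socat TCP-LISTEN:3306,fork,reuseaddr TCP:mysql-service:3306 &
                    PROXY_PID=$!
                  fi
                else
                  # MySQL is down: stop forwarding so new connections fail fast
                  echo "MySQL is down"
                  if [ -n "$PROXY_PID" ]; then
                    kill "$PROXY_PID" 2>/dev/null || true
                    PROXY_PID=""
                  fi
                fi
                sleep 10
              done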

IV. Using the Operator pattern

1. Custom controller example

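// Custom controller that watches MySQL state and restarts the application.
// Pseudocode: Informed, MySQL, SpringBootApp and getDependentApps are placeholders
// for whatever operator framework is used, not a specific SDK's API.
@Controller
public class MySQLRecoveryController implements Informed<MySQL> {

    // Last observed readiness, so a restart happens only on a not-ready -> ready transition
    private volatile boolean wasReady = true;

    @Override
    public void onUpdate(MySQL mysql) {
        boolean ready = mysql.isReady();
        if (ready && !wasReady) {
            // MySQL has recovered: restart the associated applications
            restartAssociatedApplications(mysql);
        }
        wasReady = ready;
    }

    private void restartAssociatedApplications(MySQL mysql) {
        // Find the applications that depend on this MySQL instance
        List<SpringBootApp> apps = getDependentApps(mysql);
        for (SpringBootApp app : apps) {
            // Rolling restart
            app.rollingRestart();
        }
    }
}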

V. Usage

1. Deploying the monitoring script

# Run as a Kubernetes CronJob
kubectl create configmap mysql-monitor \
  --from-file=monitor-restart.sh

kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-monitor
spec:
  schedule: "*/1 * * * *"
  # monitor-restart.sh loops forever, so do not start a second copy while one is running
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: monitor-sa
          containers:
            - name: monitor
              image: bitnami/kubectl:latest
              command:
                - /bin/bash
                - /scripts/monitor-restart.sh
              volumeMounts:
                - name: scripts
                  mountPath: /scripts
          volumes:
            - name: scripts
              configMap:
                name: mysql-monitor
          restartPolicy: OnFailure
EOF

2. RBAC configuration

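apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mysql-monitor-role
rules:
  # "restart" is not a Kubernetes API verb; kubectl rollout restart patches the Deployment
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "update", "patch"]
  # "delete" is needed because monitor-restart.sh deletes pods
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch", "delete"]
  # Also used by monitor-restart.sh: kubectl exec, kubectl get secret, kubectl get ns
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["secrets", "namespaces"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mysql-monitor-binding
subjects:
  - kind: ServiceAccount
    name: monitor-sa
    namespace: default
roleRef:
  kind: ClusterRole
  name: mysql-monitor-role
  apiGroup: rbac.authorization.k8s.io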

VI. Best-practice recommendations

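  1. Use a Readiness Probe: make sure the application only receives traffic after the database has recovered
  2. Configure retries: set a sensible retry policy at the application layer
  3. Run multiple replicas: avoid a single point of failure
  4. Monitoring and alerting: detect and handle problems promptly
  5. Automated recovery: use an Operator or a CronJob to handle recovery automatically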
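For the retry recommendation, the following minimal sketch shows one common policy, exponential backoff, using only the JDK. It is an illustration rather than the article's own implementation: `Retry` and `withBackoff` are hypothetical names, and in a real Spring Boot application a library such as Spring Retry or Resilience4j would usually fill this role.

```java
import java.util.function.Supplier;

// Sketch: retry a task, doubling the wait between attempts.
final class Retry {

    /** Runs task up to maxAttempts times; rethrows the last failure if all attempts fail. */
    static <T> T withBackoff(Supplier<T> task, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
                delay *= 2; // exponential backoff
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated query that fails twice (database still down) and then succeeds
        String result = withBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("database unavailable");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```

Retries of this kind cover short interruptions. For longer outages the readiness probe is still needed, so that traffic is routed away from the pod instead of queuing behind retries.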

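With this in place, when MySQL goes down and later recovers, the readiness probe keeps traffic away from pods that cannot reach the database, and the monitoring script, CronJob, or Operator restarts the application automatically so that service is restored quickly.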
