
117. How to Test Alertmanager in Rancher Monitoring

Procedure

This guide demonstrates how to test Alertmanager and PrometheusRule configuration, to validate that alerts are sent successfully by Alertmanager.

To keep this test self-contained, a webhook receiver is configured in Alertmanager. A webhook-receiver pod is deployed to receive the webhook alert requests and print them to stdout, so that they are visible in the pod logs for verification. All of these resources are created in the cattle-monitoring-system namespace.
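The receiver pod described above relies on a netcat listener. As a rough illustration of the same idea, the following Python sketch accepts webhook POSTs and prints them to stdout. It is not the script used by the pod: the names `WebhookHandler` and `start_receiver`, the loopback address, and the sample payload are all assumptions made for this example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class WebhookHandler(BaseHTTPRequestHandler):
    """Print each POSTed request to stdout, similar in spirit to the receiver pod."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8", errors="replace")
        print("--- RECEIVED FULL REQUEST ---", flush=True)
        print(f"{self.command} {self.path} {self.request_version}", flush=True)
        print(body, flush=True)
        # Keep a copy so the caller can inspect what arrived.
        self.server.received.append(body)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        # Suppress the default per-request access log on stderr.
        pass


def start_receiver(host="127.0.0.1", port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), WebhookHandler)
    server.received = []
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Demo: send one hypothetical alert payload to the local receiver.
server = start_receiver()
url = f"http://127.0.0.1:{server.server_address[1]}/"
payload = json.dumps(
    {"status": "firing", "alerts": [{"labels": {"severity": "critical"}}]}
)
request = urllib.request.Request(
    url,
    data=payload.encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# An empty ProxyHandler keeps the loopback request away from any configured proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(request, timeout=5) as response:
    status_code = response.status
server.shutdown()
server.server_close()
```

In the cluster, Alertmanager plays the role of the client in this demo, and the printed request is what you later look for in the pod log.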

  1. Navigate to a Rancher-managed cluster with rancher-monitoring installed.
  2. Apply the following YAMLs:
    1. ConfigMap:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: webhook-receiver-configmap-script
        namespace: cattle-monitoring-system
      data:
        [truncated in source]
    2. Pod:
      apiVersion: v1
      kind: Pod
      metadata:
        name: webhook-receiver
        namespace: cattle-monitoring-system
        labels:
          app: webhook-receiver
      spec:
        containers:
        - name: receiver-container
          image: rancherlabs/swiss-army-knife:latest
          command: ["/bin/bash", "/script/[truncated in source]
    3. Service:
      apiVersion: v1
      kind: Service
      metadata:
        name: webhook-receiver-service
        namespace: cattle-monitoring-system
      spec:
        selector:
          app: webhook-receiver
        ports:
        - protocol: TCP
          port: 80
          targetPort: 8080
        type: ClusterIP
  3. Ensure that the pod is up and tail its log. You should see a couple of lines stating that the netcat listener is ready and waiting for a connection. The Alertmanager alert configured below will appear in this log.
  4. Apply the following AlertmanagerConfig to configure Alertmanager to send any alert with the label "severity=critical" to the webhook-receiver pod (see the Alertmanager configuration documentation for the available options). Note that the URL used is that of the service created above:
    apiVersion: [truncated in source]
  5. Create a PrometheusRule with an alert expression. This example uses vector(1) as the expression, so its value is always "1" and the alert is triggered continuously:
    apiVersion: [truncated in source]

  6. Wait for the alert to appear in the Alertmanager Alerts UI.
  7. Check the log of the webhook-receiver pod and observe that the test-rule alert is received, similar to the following:
    Starting nc listener with 1 second timeout...
    Waiting for connection...
    --- RECEIVED FULL REQUEST ---
    POST / HTTP/1.1
    Host: webhook-receiver-service
    User-Agent: Alertmanager/0.28.1
    Content-Length: 1214
    Content-Type: application/json

    {"receiver":"cattle-monitoring-system/webhook-receiver-am-config/webhook-receiver-pod","status":"firing","alerts":[{"status":"firing","labels":{"alertname":"test-alert","namespace":"cattle-monitoring-system","prometheus":"cattle-monitoring-system/rancher-monitoring-prometheus","severity":"critical"},"annotations":{},"startsAt":"2025-10-14T09:04:11.437Z","endsAt":"0001-01-01T00:00:00Z","generatorURL":"[truncated in source]

Following this method, it is possible to test Alertmanager and PrometheusRule configurations without a third-party app or an external receiver. This shows whether alerts arrive as expected or are not being sent at all. If you are struggling to apply an AlertmanagerConfig correctly, check the rancher-monitoring-operator pod logs to confirm that the syntax is correct and the configuration was accepted, check the Alertmanager pod logs, and check the value of the PrometheusRule expression in the Prometheus Query UI to confirm whether the alert should currently be firing.
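When the receiver log becomes long, it can help to check the JSON body programmatically instead of reading it by eye. The sketch below is one possible way to do that. The helper name `firing_alerts` is hypothetical, and `SAMPLE_PAYLOAD` is an abbreviated copy of the request body shown in the procedure, with the generatorURL field left out because it is truncated there.

```python
import json

# Abbreviated from the log in the procedure; not a complete Alertmanager payload.
SAMPLE_PAYLOAD = json.dumps({
    "receiver": "cattle-monitoring-system/webhook-receiver-am-config/webhook-receiver-pod",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "test-alert",
                "namespace": "cattle-monitoring-system",
                "prometheus": "cattle-monitoring-system/rancher-monitoring-prometheus",
                "severity": "critical",
            },
            "annotations": {},
            "startsAt": "2025-10-14T09:04:11.437Z",
            "endsAt": "0001-01-01T00:00:00Z",
        }
    ],
})


def firing_alerts(payload_text, alertname, severity="critical"):
    """Return the firing alerts in a webhook body that match name and severity."""
    payload = json.loads(payload_text)
    matches = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        if (
            alert.get("status") == "firing"
            and labels.get("alertname") == alertname
            and labels.get("severity") == severity
        ):
            matches.append(alert)
    return matches


matches = firing_alerts(SAMPLE_PAYLOAD, "test-alert")
print(f"{len(matches)} matching alert(s) found")
```

An empty result for a body that did arrive suggests checking the labels on the PrometheusRule; no body at all points back to the AlertmanagerConfig route or receiver.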

Environment

A Kubernetes cluster managed by Rancher v2.6+ with rancher-monitoring installed

Visit the Rancher-K8S solutions blogger and enterprise partner:
https://blog.csdn.net/lidw2009
