
Traffic Diagram 9 - 小镇


Below is a production-grade flow diagram for Pod → host port access. It covers the complete path plus br_netfilter, iptables, conntrack, ARP, and TRACE points.


🧠 Kubernetes Pod → host port: complete network flow diagram (production-grade)


📌 ① Complete data path (skb lifecycle)

Pod
 │
 │  (1) socket write
 ▼
veth (Pod netns)
 │
 │  L2 (Linux bridge domain)
 ▼
cni0 (Linux bridge)
 │
 │  🔥🔥🔥 [Key turning point: br_netfilter]
 │
 │  net.bridge.bridge-nf-call-iptables = 1
 │  net.bridge.bridge-nf-call-ip6tables = 1
 │  net.bridge.bridge-nf-call-arptables = 1
 │
 │  👉 Effect: bridged frames are passed to the netfilter (iptables) hooks
 │
 │  === L2 → L3 transition (where traces are most often lost) ===
 ▼
════════════════ L2 → L3 transition ════════════════
 │
 │  ❗ ARP resolution (ip neigh)
 │  ❗ a frame addressed to cni0's MAC is passed up to the IP layer
 ▼
═══════════════ netfilter PREROUTING ═══════════════
 │
 │  raw PREROUTING
 │     │
 │     └── 🔍 iptables TRACE (earliest observation point)
 │
 │  conntrack (NEW / ESTABLISHED / RELATED)
 │
 │  mangle PREROUTING
 │
 │  nat PREROUTING (first packet of a connection only)
 ▼
routing decision (FIB lookup)
 │
 ├── local process ?
 ▼
INPUT chain
 │
 │  filter table
 ▼
socket (host process listening on port 2483)

📌 ② Complete ARP / L2 path (the core of the problem here)

Pod → cni0
 │
 │  ARP Request
 ▼
ip neigh table
 │
 ├── REACHABLE ✔ (normal)
 ├── STALE     ⚠ (usable, but not recently verified)
 ├── FAILED    ❌ (key failure point)
 ▼
bridge fdb (MAC learning)
 │
 ├── MAC exists ✔
 ├── MAC missing ❌
 ▼
cni0 forwarding decision
 │
 ▼
skb enters netfilter
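The neighbor-state check above can be scripted. The following is a minimal sketch, not a definitive tool: the function name `failed_neighbors` is hypothetical, and it assumes the usual `ip neigh show` line layout in which the NUD state is the last field. It also reports INCOMPLETE entries, which is an addition to the states listed above.

```shell
# Reads `ip neigh show` output on stdin and prints every entry whose
# state is FAILED or INCOMPLETE as "<ip> <dev> <state>".
failed_neighbors() {
    awk '
        {
            state = $NF                       # NUD state is the last field
            dev = ""
            for (i = 1; i < NF; i++)          # find the interface name after "dev"
                if ($i == "dev") dev = $(i + 1)
            if (state == "FAILED" || state == "INCOMPLETE")
                print $1, dev, state
        }
    '
}

# Usage on a node:
#   ip neigh show | failed_neighbors
```

An empty result means no neighbor entry is currently in a failed state, so the L2 resolution step is unlikely to be the break.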

📌 ③ Where br_netfilter acts (key point to understand)

        cni0 (bridge)
             │
             ▼
┌────────────────────────────────────┐
│ br_netfilter hook point            │
│                                    │
│  → intercept bridged frames        │
│  → pass the skb to netfilter hooks │
└────────────────────────────────────┘
             │
             ▼
iptables / conntrack PREROUTING

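👉 Key takeaway:

br_netfilter = the switch that exposes bridged traffic to iptables

This matters mainly for frames bridged between ports of cni0 (Pod ↔ Pod on the same node). A frame addressed to cni0's own MAC is handed to the host IP stack and traverses iptables PREROUTING even without br_netfilter.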
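To confirm the switch is actually on, the sysctl can be read straight from procfs. This is a sketch under stated assumptions: `bridge_nf_status` is a hypothetical helper name, and the optional root-directory argument exists only so the function can be exercised without touching a real node. It relies on the fact that the `net/bridge` sysctl directory only exists while the br_netfilter module is loaded.

```shell
# Reports whether bridged IPv4 traffic is passed to iptables.
# $1: sysctl root directory (defaults to /proc/sys; overridable for testing).
bridge_nf_status() {
    root=${1:-/proc/sys}
    f="$root/net/bridge/bridge-nf-call-iptables"
    if [ ! -r "$f" ]; then
        echo "missing"       # sysctl absent: br_netfilter is not loaded
    elif [ "$(cat "$f")" = "1" ]; then
        echo "enabled"       # bridged frames traverse iptables
    else
        echo "disabled"      # module loaded, but the switch is off
    fi
}

# Usage on a node:
#   bridge_nf_status
```

A result of `missing` usually points to `modprobe br_netfilter` not having been run, while `disabled` points to the sysctl value itself.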

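📌 ④ iptables TRACE points (full path)

✔ Where TRACE can be placed

raw     PREROUTING   ✔ (earliest point for incoming packets, recommended)
raw     OUTPUT       ✔ (locally generated packets)
mangle / filter      ✘ (the TRACE target is only valid in the raw table)

❗ Correct way to trace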

iptables -t raw -A PREROUTING -p tcp --dport 2483 -j TRACE

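Observe (with the legacy backend the TRACE lines go to the kernel log; with iptables-nft, use xtables-monitor --trace instead):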

dmesg -w

📌 TRACE output format (standard)

TRACE: raw:PREROUTING:policy:2
IN=cni0 OUT=
SRC=10.244.1.10
DST=10.248.104.141
PROTO=TCP
SPT=xxxxx
DPT=2483
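TRACE lines are long, so it can help to reduce each one to the fields shown above. The sketch below is an illustration only: `trace_summary` is a hypothetical name, and it assumes the `KEY=value` layout of the sample output, with the hook position following the `TRACE:` token.

```shell
# Reads kernel log lines on stdin and prints, for each TRACE line:
#   <table:chain:type:rulenum> <in-interface> <src> -> <dst>:<dport>
trace_summary() {
    awk '
        /TRACE: / {
            hook = ""; in_if = ""; src = ""; dst = ""; dpt = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "TRACE:")      hook  = $(i + 1)
                else if ($i ~ /^IN=/)    in_if = substr($i, 4)
                else if ($i ~ /^SRC=/)   src   = substr($i, 5)
                else if ($i ~ /^DST=/)   dst   = substr($i, 5)
                else if ($i ~ /^DPT=/)   dpt   = substr($i, 5)
            }
            print hook, in_if, src " -> " dst ":" dpt
        }
    '
}

# Usage on a node:
#   dmesg | trace_summary
```

Reading the summaries in order shows the last chain a packet reached, which is where the search for the drop should start.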

📌 ⑤ Where conntrack happens

PREROUTING
 │
 ├── conntrack NEW
 ├── conntrack ESTABLISHED
 ├── conntrack RELATED
 │
 ▼
routing decision

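📌 Meaning of the conntrack -S counters

new            → new connections (older kernels only)
found          → lookups that matched an existing entry
insert_failed  → insertion failed, typically because the same entry already exists (a race)
drop           → packets dropped because conntrack failed, e.g. the table is full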
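`conntrack -S` prints one line per CPU, so the counters have to be summed before they mean much. A minimal sketch, assuming the `key=value` per-CPU layout; `conntrack_totals` is a hypothetical name, and `early_drop` is included as an extra counter beyond the ones listed above.

```shell
# Reads `conntrack -S` output on stdin and prints the failure-related
# counters summed over all CPUs.
conntrack_totals() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                n = split($i, kv, "=")
                # skip the cpu=N label, accumulate every other counter
                if (n == 2 && kv[1] != "cpu") total[kv[1]] += kv[2]
            }
        }
        END {
            printf "insert_failed=%d drop=%d early_drop=%d\n", total["insert_failed"], total["drop"], total["early_drop"]
        }
    '
}

# Usage on a node:
#   conntrack -S | conntrack_totals
```

Running it twice a few seconds apart and comparing the totals shows whether the counters are still growing while the failing request is being retried.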

📌 ⑥ tcpdump observation points (capture at every layer)

Pod side:
tcpdump -i veth

Node bridge:
tcpdump -i cni0

Node NIC:
tcpdump -i eth0

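📌 Decision rules

seen on veth → not on cni0 → bridge problem
seen on cni0 → not on eth0 → routing / rp_filter
seen on eth0 → no response → application-layer problem

Traffic addressed to the host's own IP is delivered locally and never leaves through eth0, so for that case skip the eth0 rule and check INPUT and the socket instead.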
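The three rules can be written down as a small decision function. This is only a sketch that encodes the rules as stated: `locate_drop` is a hypothetical name, and the first branch (packet never seen on the veth) is an added case that the rules do not cover.

```shell
# Arguments: "yes" or "no" for whether the packet was seen on the
# veth, on cni0, and on eth0, in that order.
locate_drop() {
    seen_veth=$1; seen_cni0=$2; seen_eth0=$3
    if [ "$seen_veth" != yes ]; then
        echo "pod"                 # added case: packet never left the Pod
    elif [ "$seen_cni0" != yes ]; then
        echo "bridge"
    elif [ "$seen_eth0" != yes ]; then
        echo "routing/rp_filter"
    else
        echo "application"
    fi
}

# Usage, after checking the three captures by hand:
#   locate_drop yes yes no
```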

📌 ⑦ sysctl control points (they determine the traffic path)

net.bridge.bridge-nf-call-iptables = 1   ✔ required
net.bridge.bridge-nf-call-ip6tables = 1  ✔ IPv6
net.ipv4.conf.all.arp_filter = 0         ✔ allows answering ARP across interfaces
net.ipv4.ip_forward = 1                  ✔ required for NodePort
net.ipv4.conf.all.rp_filter              ⚠ the most common pitfall
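Part of why rp_filter is such a common pitfall is that the value in effect on an interface is the maximum of `conf.all.rp_filter` and `conf.<iface>.rp_filter` (0 = off, 1 = strict, 2 = loose), so reading only one of them can mislead. A minimal sketch, with `effective_rp_filter` as a hypothetical name and the root-directory argument added purely for testing:

```shell
# Prints the rp_filter value the kernel applies on an interface:
# the maximum of the "all" setting and the per-interface setting.
# $1: interface name, $2: sysctl root (defaults to /proc/sys).
effective_rp_filter() {
    iface=$1
    root=${2:-/proc/sys}
    all=$(cat "$root/net/ipv4/conf/all/rp_filter")
    per=$(cat "$root/net/ipv4/conf/$iface/rp_filter")
    if [ "$all" -gt "$per" ]; then
        echo "$all"
    else
        echo "$per"
    fi
}

# Usage on a node:
#   effective_rp_filter cni0
```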

📌 ⑧ Full trace troubleshooting path (production standard)

STEP 1: ARP
ip neigh show nud failed

STEP 2: bridge
bridge fdb show

STEP 3: does the skb reach the bridge
tcpdump -i cni0

STEP 4: does it enter netfilter
iptables TRACE + dmesg

STEP 5: conntrack
conntrack -S

STEP 6: routing
ip route get 10.248.104.141 from 10.244.1.10 iif cni0

STEP 7: socket
ss -lntp | grep 2483
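For the last step, a plain `grep 2483` shows that something is listening but makes it easy to overlook which address it is bound to. A process bound only to 127.0.0.1 cannot be reached from a Pod through the host IP. The sketch below assumes the usual `ss -lnt` column order, with the local address in the fourth column; `listen_addrs` is a hypothetical name.

```shell
# Reads `ss -lnt` output on stdin and prints the local address of every
# listening socket bound to the given port.
# $1: port number
listen_addrs() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")          # the port follows the last colon
            if (parts[n] == port)
                print substr($4, 1, length($4) - length(parts[n]) - 1)
        }
    '
}

# Usage on a node:
#   ss -lnt | listen_addrs 2483
```

Output such as `0.0.0.0`, `*`, or the host's own IP means the port is reachable from the Pod network as far as the socket is concerned.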

📌 ⑨ Final overview diagram (engineering summary)

Pod
 │
 │ socket write
 ▼
veth
 │
 ▼
cni0 (bridge)
 │
 │ ARP / fdb / neigh
 │
 │ br_netfilter
 ▼
raw PREROUTING
 │
 │ conntrack
 ▼
mangle PREROUTING
 ▼
routing decision
 │
 ▼
INPUT
 │
 ▼
socket (2483)

🧠 One-sentence engineering summary

When a Pod cannot reach the host, the break is in one of only three places:
1) ARP / bridge (L2)
2) br_netfilter + iptables (does the skb enter netfilter)
3) conntrack / routing / rp_filter (L3/L4)
