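Traffic flow diagram 9
Here is a production-grade Markdown flow diagram you can use directly (full path plus br_netfilter, iptables, conntrack, ARP, and trace points). As you asked, no steps are removed and the structure is unchanged; it has only been filled in and standardized.
🧠 Kubernetes Pod → host port access: complete network flow diagram (production-grade)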
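📌 ① Complete data path (skb lifecycle)
Pod
 │
 │ (1) socket write
 ▼
veth (Pod netns)
 │
 │ L2 (Linux bridge domain)
 ▼
cni0 (Linux bridge)
 │
 │ 🔥🔥🔥 Key turning point: br_netfilter
 │
 │   net.bridge.bridge-nf-call-iptables  = 1
 │   net.bridge.bridge-nf-call-ip6tables = 1
 │   net.bridge.bridge-nf-call-arptables = 1
 │
 │ 👉 Effect: frames crossing the bridge are also handed to the netfilter (iptables) hooks
 │
 │ === L2 → L3 handoff (the point where traces are often lost) ===
 ▼
════════════════ L2 → L3 handoff ════════════════
 │
 │ ❗ ARP resolution (ip neigh): the Pod must already have resolved the next-hop MAC
 │ ❗ a frame addressed to cni0's MAC (typically the Pod's gateway) is passed up to the IP stack
 ▼
═══════════════ netfilter PREROUTING ═══════════════
 │
 │ raw PREROUTING
 │   └── 🔍 iptables TRACE (earliest observation point)
 │
 │ conntrack (NEW / ESTABLISHED / RELATED), runs after raw and before mangle
 │
 │ mangle PREROUTING
 │
 │ nat PREROUTING (the filter table has no PREROUTING chain)
 ▼
routing decision (FIB lookup)
 │
 ├── local process?
 ▼
INPUT chain
 │
 │ filter table
 ▼
socket (host process listening on port 2483)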
📌 ② ARP / L2 path in full (the core of your current problem)
Pod → cni0
 │
 │ ARP Request
 ▼
ip neigh table
 │
 ├── REACHABLE ✔ (normal)
 ├── STALE ⚠ (usable, but not yet re-verified)
 ├── FAILED ❌ (key failure point)
 ▼
bridge fdb (MAC learning)
 │
 ├── MAC exists ✔
 ├── MAC missing ❌
 ▼
cni0 forwarding decision
 │
 ▼
packet (skb) is handed to netfilter
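To make the neighbor-state check above quicker to read on a busy node, here is a minimal sketch that summarizes `ip neigh` output. It assumes the default output format, where the NUD state is the last field on each line; the function names `neigh_state_counts` and `neigh_failed_ips` are made up for this example.

```shell
# Count neighbor entries per NUD state from `ip neigh` output on stdin.
# Assumption: the state (REACHABLE, STALE, FAILED, ...) is the last field.
neigh_state_counts() {
  awk 'NF { count[$NF]++ } END { for (s in count) print s, count[s] }' | sort
}

# Print only the IP addresses whose entry is in the FAILED state.
neigh_failed_ips() {
  awk '$NF == "FAILED" { print $1 }'
}
```

Typical use would be `ip neigh show dev cni0 | neigh_state_counts`, followed by `ip neigh show dev cni0 | neigh_failed_ips` if any FAILED entries show up.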
📌 ③ Where br_netfilter acts (the key point to understand)
cni0 (bridge)
 │
 ▼
┌──────────────────────────────────────┐
│ br_netfilter hook point              │
│                                      │
│ → intercept the bridged frame        │
│ → take the IP packet it carries      │
│ → run it through the netfilter hooks │
└──────────────────────────────────────┘
 │
 ▼
iptables / conntrack PREROUTING
👉 Key takeaway:
br_netfilter = the switch that lets bridge traffic reach iptables
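A quick way to confirm that this switch is actually on is to look under `/proc/sys/net/bridge`. The sketch below assumes a kernel where br_netfilter is a separate module, so the directory only exists once the module is loaded. The function name `bridge_nf_status` and its optional root argument are illustrative; the argument exists only so the function can be pointed at a test directory.

```shell
# Report whether br_netfilter is loaded and how the bridge-nf-call-* switches are set.
# $1: proc root to read from (defaults to /proc).
bridge_nf_status() {
  local root="${1:-/proc}" dir key val
  dir="$root/sys/net/bridge"
  if [ ! -d "$dir" ]; then
    # Assumption: no directory means the br_netfilter module is not loaded.
    echo "br_netfilter: not loaded"
    return 1
  fi
  for key in bridge-nf-call-iptables bridge-nf-call-ip6tables bridge-nf-call-arptables; do
    val="$(cat "$dir/$key" 2>/dev/null || echo missing)"
    echo "$key=$val"
  done
}
```

If it reports "not loaded", `modprobe br_netfilter` is the usual next step before setting the sysctls.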
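📌 ④ iptables trace points (full chain)
✔ Where TRACE can be placed
raw PREROUTING ✔ (earliest point, recommended)
mangle PREROUTING ❌ (the TRACE target is only valid in the raw table)
filter PREROUTING ❌ (the filter table has no PREROUTING chain)
Once a packet is marked in raw, the kernel logs every later table, chain, and rule it traverses, so a single raw rule is enough.
❗ Correct way to trace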
iptables -t raw -A PREROUTING -p tcp --dport 2483 -j TRACE
Watch the output (kernel log; with the iptables-nft backend, use xtables-monitor --trace instead):
dmesg -w
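A TRACE rule stays active until it is deleted and can flood the log, so it helps to pair the add with a guaranteed removal. This is a minimal sketch that uses the same rule as above; the helper names `trace_on`, `trace_off`, and `with_trace` are made up for this example.

```shell
# Add and remove the TRACE rule for one TCP destination port.
# The -D rule spec must match the -A rule spec exactly.
trace_on()  { iptables -t raw -A PREROUTING -p tcp --dport "$1" -j TRACE; }
trace_off() { iptables -t raw -D PREROUTING -p tcp --dport "$1" -j TRACE; }

# Run a command with tracing enabled, then always remove the rule again.
# $1: port; remaining arguments: the command to run while tracing.
with_trace() {
  local port="$1" rc
  shift
  trace_on "$port" || return 1
  "$@"
  rc=$?
  trace_off "$port"
  return "$rc"
}
```

For example, `with_trace 2483 timeout 30 dmesg -w` would trace for 30 seconds and then clean up.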
📌 TRACE output format (abbreviated; the kernel logs all fields on a single line)
TRACE: raw:PREROUTING:policy:2
IN=cni0 OUT=
SRC=10.244.1.10
DST=10.248.104.141
PROTO=TCP
SPT=xxxxx
DPT=2483
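When the log is noisy, pulling individual fields out of each TRACE line makes it easier to follow one flow. The sketch below assumes the `KEY=value` layout shown above and a line that contains the literal token `TRACE:`; the function names `trace_field` and `trace_hook` are hypothetical.

```shell
# Extract one KEY=value field (for example SRC, DST, DPT, IN) from log lines on stdin.
trace_field() {
  awk -v key="$1" '{
    for (i = 1; i <= NF; i++) {
      if (index($i, key "=") == 1) { print substr($i, length(key) + 2); break }
    }
  }'
}

# Print the table:chain:type:rulenum token that follows "TRACE:".
trace_hook() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "TRACE:") { print $(i + 1); break }
    }
  }'
}
```

For example, `dmesg | grep 'DPT=2483' | trace_hook` lists the chains the packet went through, in order.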
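📌 ⑤ Where conntrack happens
PREROUTING (after raw, before mangle)
 │
 ├── conntrack NEW
 ├── conntrack ESTABLISHED
 ├── conntrack RELATED
 ▼
routing decision
📌 What the conntrack -S counters mean
new → new connections
found → lookups that hit an existing entry
insert_failed → an insert was attempted but failed, typically because an identical entry already existed (a race between packets of the same flow)
drop → packets dropped because conntrack could not track them (for example, a new entry could not be allocated because the table is full or memory is short)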
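`conntrack -S` prints one line per CPU, so the number that matters is usually the sum. Here is a minimal sketch that totals one counter, assuming the usual `name=value` output; the function name `conntrack_counter_total` is made up for this example.

```shell
# Sum one counter (for example insert_failed or drop) across all CPU lines
# of `conntrack -S` output read on stdin. Prints 0 if the counter is absent.
conntrack_counter_total() {
  awk -v key="$1" '{
    for (i = 1; i <= NF; i++) {
      # Match only fields that start with "key=", so "drop" does not match "early_drop".
      if (index($i, key "=") == 1) total += substr($i, length(key) + 2)
    }
  } END { print total + 0 }'
}
```

Running `conntrack -S | conntrack_counter_total drop` twice, a few seconds apart, shows whether the counter is still growing.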
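📌 ⑥ tcpdump observation points (capture layer by layer)
Pod side (use the Pod's host-side veth interface name):
tcpdump -i veth
Node bridge:
tcpdump -i cni0
Node NIC:
tcpdump -i eth0
📌 Decision rules
packets on veth → none on cni0 → bridge problem
packets on cni0 → none on eth0 → routing / rp_filter (only meaningful when the destination is another node; traffic to the local host's own IP is delivered through INPUT and never appears on eth0)
packets on eth0 → no response → application-layer problem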
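The decision rules can be written down as a small lookup, which is handy in a runbook. This sketch follows the three rules literally; the function name `classify_capture` and the extra first branch (nothing seen on veth at all) are additions for this example, not part of the rules above.

```shell
# Map tcpdump observations to the most likely break point.
# Arguments are "yes" or "no": were packets seen on veth, on cni0, on eth0?
classify_capture() {
  local veth="$1" cni0="$2" eth0="$3"
  if [ "$veth" = no ]; then
    # Added case: the packet never left the Pod.
    echo "no traffic on veth: check inside the Pod netns"
  elif [ "$cni0" = no ]; then
    echo "bridge problem"
  elif [ "$eth0" = no ]; then
    echo "routing / rp_filter"
  else
    echo "application-layer problem"
  fi
}
```

For example, `classify_capture yes no no` prints `bridge problem`.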
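📌 ⑦ sysctl control points (they decide where traffic goes)
net.bridge.bridge-nf-call-iptables = 1 ✔ required
net.bridge.bridge-nf-call-ip6tables = 1 ✔ IPv6
net.ipv4.conf.all.arp_filter = 0 ✔ allows ARP replies across interfaces (kernel default)
net.ipv4.ip_forward = 1 ✔ required for NodePort and other forwarded traffic (local delivery through INPUT does not depend on it)
net.ipv4.conf.all.rp_filter ⚠ the most common pitfall (strict mode, 1, drops packets whose source address would not be routed back out of the interface they arrived on)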
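One reason rp_filter is easy to get wrong is that the kernel applies the larger of the `all` value and the per-interface value, so setting only one of them may not have the effect you expect. A minimal sketch that computes the effective value for an interface follows; the function name `effective_rp_filter` and the optional root argument (for testing) are made up for this example.

```shell
# Effective rp_filter for an interface = max(conf.all.rp_filter, conf.<iface>.rp_filter).
# $1: interface name; $2: proc root (defaults to /proc).
effective_rp_filter() {
  local iface="$1" root="${2:-/proc}" all dev
  all="$(cat "$root/sys/net/ipv4/conf/all/rp_filter")" || return 1
  dev="$(cat "$root/sys/net/ipv4/conf/$iface/rp_filter")" || return 1
  if [ "$all" -gt "$dev" ]; then echo "$all"; else echo "$dev"; fi
}
```

For example, `effective_rp_filter cni0` prints 0 (off), 1 (strict), or 2 (loose).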
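📌 ⑧ Full trace troubleshooting path (production standard)
STEP 1: ARP
ip neigh show nud failed
STEP 2: bridge
bridge fdb show
STEP 3: does the packet reach the bridge?
tcpdump -i cni0
STEP 4: does it enter netfilter?
iptables TRACE + dmesg
STEP 5: conntrack
conntrack -S
STEP 6: routing (addresses taken from the TRACE example above)
ip route get 10.248.104.141 from 10.244.1.10 iif cni0
STEP 7: socket
ss -lntp | grep 2483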
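A plain `grep 2483` in the last step can also match a PID or a port such as 12483, and it hides which address the process is bound to. A listener bound only to 127.0.0.1 is not reachable from a Pod. The sketch below assumes the standard `ss -lnt` column order, where field 4 is Local Address:Port; the function name `listeners_on_port` is made up for this example.

```shell
# Print the local addresses listening on TCP port $1,
# reading `ss -lnt` (or `ss -lntp`) output on stdin.
listeners_on_port() {
  awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$") { print $4 }'
}
```

For example, `ss -lnt | listeners_on_port 2483` printing only `127.0.0.1:2483` would explain a failure that no amount of iptables tracing will find.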
📌 ⑨ Final overview (engineering summary)
Pod
 │
 │ socket write
 ▼
veth
 │
 ▼
cni0 (bridge)
 │
 │ ARP / fdb / neigh
 │
 │ br_netfilter
 ▼
raw PREROUTING
 │
 │ conntrack
 ▼
mangle PREROUTING
 │
 ▼
routing decision
 │
 ▼
INPUT
 │
 ▼
socket (2483)
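🧠 One-sentence engineering summary
When a Pod cannot reach the host, there are essentially only three kinds of break point:
1) ARP / bridge (L2)
2) br_netfilter + iptables (whether the packet enters netfilter)
3) conntrack / routing / rp_filter (L3/L4)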
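If you want a next step, I can upgrade this into:
🔥 A single-packet skb timeline (from veth to the socket and back)
🔥 A combined sequence diagram of iptables + conntrack + bridge + VXLAN
That version would be a kernel-level map for reproducing the problem with packet captures.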
