
Analysis of the TCP close process - liyan

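In some scenarios, observing the calls made to and from a service is very valuable. I recently experimented with using tcp_close to observe a service's caller and callee information, and I am recording the results here.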

1. The general TCP close process

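First, let's look at how a TCP connection is closed.
The most authoritative source for analyzing TCP operations is naturally the RFC. According to RFC 793, the state transitions during a TCP close are as follows: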

[Figure: tcp state]
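
As a compact restatement of the close-related part of that diagram, here is a minimal Python sketch. The state names follow RFC 793; the event names (`app_close`, `recv_fin`, `recv_ack`, `timeout_2msl`) are labels made up for this sketch and are not part of the RFC or of any kernel API.

```python
# Close-related TCP state transitions from RFC 793, keyed by (state, event).
# Event names are hypothetical labels chosen for this sketch.
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "app_close"): "FIN_WAIT_1",   # active close: send FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # passive close: ACK the FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",        # both sides closed at once
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("CLOSE_WAIT", "app_close"): "LAST_ACK",      # passive side sends its FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def walk(state, events):
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]
        path.append(state)
    return path


# The side that calls close() first (active close).
active = walk("ESTABLISHED", ["app_close", "recv_ack", "recv_fin", "timeout_2msl"])
# The side that receives the first FIN (passive close).
passive = walk("ESTABLISHED", ["recv_fin", "app_close", "recv_ack"])

print(" -> ".join(active))
print(" -> ".join(passive))
```

The two walks correspond to the two roles discussed in the rest of this post: the socket that actively triggers the close and the socket that is closed passively.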

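However, documentation that analyzes the TCP close process specifically on Linux is much scarcer. I found one document about TCP operations on Linux, Analysis_TCP_in_Linux. It describes the function calls involved on both sides, the socket that actively triggers the close and the socket that is closed passively, which provided a starting point for the verification below.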

2. Observing the TCP close process with BPF

Following the description in Analysis_TCP_in_Linux, I built the following verification demo in Python:

# coding=UTF-8
import socket
import time
import getopt
import sys

srv_ip = ""
srv_port = 0


def server(srv_ip, srv_port):
    conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    conn.bind((srv_ip, srv_port))
    conn.listen(1024)
    conn.setblocking(1)
    index = 0
    while True:
        connection, address = conn.accept()
        try:
            dst = connection.getpeername()
            while True:
                request = connection.recv(1024)
                req_str = str(request.decode())
                if req_str == 'end':
                    # The client sends a special message to signal the end.
                    # The close() calls of a TCP server and client are not
                    # inherently linked, so a closing condition has to be
                    # agreed on. At this point the server cannot tell whether
                    # the client has started to disconnect.
                    print("rcv end, close...")
                    connection.close()
                    time.sleep(2)
                    break
                print("conn: %s:%d received: %s" % (dst[0], dst[1], req_str))
                response = ("client, msg index: %d" % index).encode()
                connection.send(response)
                index += 1
            print("conn: %s:%d closed" % (dst[0], dst[1]))
        except Exception as e:
            print("handle exception during dst. %s ..." % e)


def client(srv_ip, srv_port):
    try:
        server_addr = (srv_ip, srv_port)
        conn = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        conn.connect(server_addr)
        msg = ("server, msg index: 0").encode()
        conn.send(msg)
        data = conn.recv(1024)
        print("rcv from server: %s" % str(data.decode()))
        conn.send("end".encode())
        print("end. close ...")
        time.sleep(2)
        conn.close()
        time.sleep(2)
    except Exception as e:
        print("connection with server with error, %s" % e)
    return


if __name__ == "__main__":
    work_mode = "s"
    try:
        # -s and -c are flags without a value, so they take no ':' suffix.
        opts, args = getopt.getopt(sys.argv[1:], "i:p:sc",
                                   ["srv_ip=", "port=", "server", "client"])
        if len(opts) == 0:
            print("unknown opts")
            sys.exit(0)
        for opt, arg in opts:
            if opt in ("-i", "--srv_ip"):
                srv_ip = arg
            if opt in ("-p", "--port"):
                srv_port = int(arg)
            if opt in ("-s", "--server"):
                work_mode = "s"
            if opt in ("-c", "--client"):
                work_mode = "c"
    except Exception as e:
        print("unknown args")
        sys.exit(0)
    if work_mode == "s":
        server(srv_ip, srv_port)
    else:
        client(srv_ip, srv_port)

As the demo shows, in my test code the server initiates the close first, and the client then closes its own side.
I also wrote the following bpftrace script to observe the process:

#include <net/sock.h>

/*
TCP_ESTABLISHED = 1,
TCP_SYN_SENT = 2,
TCP_SYN_RECV = 3,
TCP_FIN_WAIT1 = 4,
TCP_FIN_WAIT2 = 5,
TCP_TIME_WAIT = 6,
TCP_CLOSE = 7,
TCP_CLOSE_WAIT = 8,
TCP_LAST_ACK = 9,
TCP_LISTEN = 10,
TCP_CLOSING = 11,
TCP_NEW_SYN_RECV = 12,
TCP_MAX_STATES = 13

Each hook point reports the process pid and the socket's sk_state.
*/

kprobe:tcp_close
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_close] pid: %d, state: %d, sock: %d, sk_max_ack_backlog: %d\n",
        pid, $sk->__sk_common.skc_state, $sk, $sk->sk_max_ack_backlog);
}

kprobe:tcp_set_state
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    $ns = arg1;
    printf("[tcp_set_state] pid: %d, state: %d, ns: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $ns, $sk);
}

kprobe:tcp_rcv_established
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_rcv_established] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

kprobe:tcp_fin
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_fin] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

kprobe:tcp_send_fin
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_send_fin] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

kprobe:tcp_timewait_state_process
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_timewait_state_process] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

kprobe:tcp_rcv_state_process
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_rcv_state_process] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

kprobe:tcp_v4_do_rcv
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_v4_do_rcv] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

// NOTE: this probe attaches to tcp_timewait_state_process a second time;
// only the printed label says tcp_stream_wait_close.
kprobe:tcp_timewait_state_process
/ comm == "python" /
{
    $sk = (struct sock*)arg0;
    printf("[tcp_stream_wait_close] pid: %d, state: %d, sk: %d\n",
        pid, $sk->__sk_common.skc_state, $sk);
}

Start bpftrace first, then start the server, and then run the client to communicate with it. bpftrace then prints the following:

[tcp_close] pid: 2828708, state: 1, sock: -1907214080, sk_max_ack_backlog: 1024
[tcp_set_state] pid: 2828708, state: 1, ns: 4, sk: -1907214080
[tcp_send_fin] pid: 2828708, state: 4, sk: -1907214080
[tcp_v4_do_rcv] pid: 2828708, state: 1, sk: -1907216512
[tcp_rcv_established] pid: 2828708, state: 1, sk: -1907216512
[tcp_fin] pid: 2828708, state: 1, sk: -1907216512
[tcp_set_state] pid: 2828708, state: 1, ns: 8, sk: -1907216512
[tcp_v4_do_rcv] pid: 2855763, state: 1, sk: -1907214080
[tcp_rcv_established] pid: 2855763, state: 1, sk: -1907214080
[tcp_close] pid: 2855763, state: 8, sock: -1907216512, sk_max_ack_backlog: 0
[tcp_set_state] pid: 2855763, state: 8, ns: 9, sk: -1907216512
[tcp_send_fin] pid: 2855763, state: 9, sk: -1907216512
[tcp_timewait_state_process] pid: 2855763, state: 6, sk: -2077492080
[tcp_stream_wait_close] pid: 2855763, state: 6, sk: -2077492080
[tcp_v4_do_rcv] pid: 2855763, state: 9, sk: -1907216512
[tcp_rcv_state_process] pid: 2855763, state: 9, sk: -1907216512
[tcp_set_state] pid: 2855763, state: 9, ns: 7, sk: -1907216512
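
The numeric states in this output are easier to follow once they are mapped back to names. The sketch below is one possible way to do that in Python, using the state numbers listed in the comment at the top of the bpftrace script. The helper name `decode` and the regular expression are assumptions made for this illustration; they only understand the line format printed above.

```python
import re

# State numbers as listed in the comment at the top of the bpftrace script.
TCP_STATES = {
    1: "TCP_ESTABLISHED", 2: "TCP_SYN_SENT", 3: "TCP_SYN_RECV",
    4: "TCP_FIN_WAIT1", 5: "TCP_FIN_WAIT2", 6: "TCP_TIME_WAIT",
    7: "TCP_CLOSE", 8: "TCP_CLOSE_WAIT", 9: "TCP_LAST_ACK",
    10: "TCP_LISTEN", 11: "TCP_CLOSING", 12: "TCP_NEW_SYN_RECV",
}

# Matches "[probe] pid: N, state: N" with an optional ", ns: N" (new state).
LINE_RE = re.compile(
    r"\[(?P<probe>\w+)\] pid: (?P<pid>\d+), state: (?P<state>\d+)"
    r"(?:, ns: (?P<ns>\d+))?"
)


def decode(line):
    """Return a readable form of one output line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    text = "%s pid=%s %s" % (m["probe"], m["pid"], TCP_STATES[int(m["state"])])
    if m["ns"] is not None:
        # tcp_set_state lines also carry the state being switched to.
        text += " -> " + TCP_STATES[int(m["ns"])]
    return text


# Two lines copied from the output above.
sample = [
    "[tcp_close] pid: 2828708, state: 1, sock: -1907214080, sk_max_ack_backlog: 1024",
    "[tcp_set_state] pid: 2828708, state: 1, ns: 4, sk: -1907214080",
]
for line in sample:
    print(decode(line))
```

Read this way, the first tcp_set_state line is the actively closing socket moving from TCP_ESTABLISHED to TCP_FIN_WAIT1.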

3. Open questions

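What I observed here differs from Analysis_TCP_in_Linux. On the side that actively initiates the close, the third step of the four-way teardown is not handled by tcp_rcv_state_process. Instead, it is the passively closed socket that triggers this function, at the fourth step. In addition, at the second step of the teardown for the actively closing socket, the socket that responds appears to have changed, and its state is TCP_ESTABLISHED. These points need further investigation.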
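
One way to start looking into this is to line up the trace events per socket rather than per process. The Python sketch below groups output lines by socket value. It assumes that the negative numbers printed with `%d` are the low 32 bits of the kernel pointer shown as a signed integer, so it masks them to get an unsigned hex key; that reading is an assumption based only on the value range in the output. The function name `group_by_socket` is hypothetical.

```python
import re

# Captures probe name, pid, state, and the value printed after "sk:" or "sock:".
EVENT_RE = re.compile(r"\[(\w+)\] pid: (\d+), state: (\d+).*?(?:sk|sock): (-?\d+)")


def group_by_socket(lines):
    """Group (probe, pid, state) tuples by socket, keeping first-seen order."""
    timeline = {}
    for line in lines:
        m = EVENT_RE.match(line)
        if m is None:
            continue
        probe, pid, state, sk = m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4))
        # Assumption: the printed value is a truncated, signed 32-bit view of
        # the pointer, so mask it to an unsigned hex string for readability.
        key = "0x%08x" % (sk & 0xFFFFFFFF)
        timeline.setdefault(key, []).append((probe, pid, state))
    return timeline


# Four lines copied from the bpftrace output in this post.
sample = [
    "[tcp_close] pid: 2828708, state: 1, sock: -1907214080, sk_max_ack_backlog: 1024",
    "[tcp_set_state] pid: 2828708, state: 1, ns: 4, sk: -1907214080",
    "[tcp_v4_do_rcv] pid: 2828708, state: 1, sk: -1907216512",
    "[tcp_close] pid: 2855763, state: 8, sock: -1907216512, sk_max_ack_backlog: 0",
]
timeline = group_by_socket(sample)
for sk, events in timeline.items():
    print(sk, events)
```

Grouped like this, the full output shows each of the two socket values appearing under both pids, which is the pattern behind the "socket appears to have changed" observation.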

That is all for now; this post serves as a record and a partial summary.
