To explain the causes of TIME_WAIT and the ways to deal with it thoroughly, this series consists of four posts:
Thoroughly Understanding and Solving Large Numbers of TIME_WAIT on a Server - Part 1 (YZF_Kevin's blog, CSDN)
Thoroughly Understanding and Solving Large Numbers of TIME_WAIT on a Server - Part 2 (YZF_Kevin's blog, CSDN)
Thoroughly Understanding and Solving Large Numbers of TIME_WAIT on a Server - Part 3 (YZF_Kevin's blog, CSDN)
Thoroughly Understanding and Solving Large Numbers of TIME_WAIT on a Server - Part 4 (YZF_Kevin's blog, CSDN)
In the first post we covered how TIME_WAIT arises, the problems it causes, and the available remedies, listed below:
Solutions
1. Change the code to use persistent (long-lived) connections instead of short-lived ones; effective, but the rework cost is high.
2. Enlarge ip_local_port_range to widen the range of usable local ports, e.g. 1024 ~ 65535.
3. Set the SO_LINGER option on the client's socket (see the sketch just after this list).
4. Enable tcp_tw_recycle together with tcp_timestamps; this carries some risk, and tcp_tw_recycle was removed in Linux 4.12.
5. Enable tcp_tw_reuse together with tcp_timestamps.
6. Set tcp_max_tw_buckets to a relatively small value.
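As a concrete illustration of option 3, here is a minimal sketch of an abortive close on the client side (the server address and port below are hypothetical). With l_onoff = 1 and l_linger = 0, close() sends an RST instead of going through the normal FIN handshake, so the client socket never enters TIME_WAIT; the trade-off is that any unsent or in-flight data is discarded.

#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    /* l_onoff = 1, l_linger = 0: close() aborts the connection with RST,
     * so this side skips TIME_WAIT entirely. */
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg)) < 0) {
        perror("setsockopt(SO_LINGER)");
        close(fd);
        return 1;
    }

    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port   = htons(8080),                    /* hypothetical port */
    };
    inet_pton(AF_INET, "192.0.2.1", &addr.sin_addr);  /* example address */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        /* ... send requests, read responses ... */
    }

    close(fd); /* sends RST; no TIME_WAIT is left behind on the client */
    return 0;
}

Note that an RST close also throws away data the peer has not yet read, so this is only appropriate when the application protocol itself confirms delivery.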
Below, we continue with a detailed look at each option.
Option 5: enable tcp_tw_reuse and tcp_timestamps
The official documentation describes it as follows:
tcp_tw_reuse option: Allow to reuse TIME-WAIT sockets for new connections when it is
safe from protocol viewpoint. Default value is 0.
The key question is what "safe from protocol viewpoint" actually means. Lacking an environment in which to verify this directly, I did a quick analysis of the source code:
=====linux-2.6.37 net/ipv4/tcp_ipv4.c 114=====
int tcp_twsk_unique(struct sock *sk, struct sock *sktw, void *twp)
{
    const struct tcp_timewait_sock *tcptw = tcp_twsk(sktw);
    struct tcp_sock *tp = tcp_sk(sk);

    /* With PAWS, it is safe from the viewpoint
       of data integrity. Even without PAWS it is safe provided sequence
       spaces do not overlap i.e. at data rates <= 80Mbit/sec.
       Actually, the idea is close to VJ's one, only timestamp cache is
       held not per host, but per port pair and TW bucket is used as state
       holder.
       If TW bucket has been already destroyed we fall back to VJ's scheme
       and use initial timestamp retrieved from peer table.
     */

    // Judging from this code, tcp_tw_reuse and tcp_timestamps must both be
    // enabled; otherwise tcp_tw_reuse has no effect (tw_ts_recent_stamp is
    // only set when timestamps are in use).
    // Also, "safe from protocol viewpoint" apparently means that more than
    // 1 second has passed since the last packet was received.
    if (tcptw->tw_ts_recent_stamp &&
        (twp == NULL || (sysctl_tcp_tw_reuse &&
                         get_seconds() - tcptw->tw_ts_recent_stamp > 1))) {
        tp->write_seq = tcptw->tw_snd_nxt + 65535 + 2;
        if (tp->write_seq == 0)
            tp->write_seq = 1;
        tp->rx_opt.ts_recent = tcptw->tw_ts_recent;
        tp->rx_opt.ts_recent_stamp = tcptw->tw_ts_recent_stamp;
        sock_hold(sktw);
        return 1;
    }

    return 0;
}
To summarize:
1. tcp_tw_reuse and tcp_timestamps must both be enabled; otherwise tcp_tw_reuse has no effect.
2. A TIME_WAIT socket is reused only if more than 1 second has passed since the last packet was received on it (a sketch of enabling both options follows).
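As a sketch of what enabling option 5 looks like programmatically, the two sysctls can be written through procfs. This is equivalent to running sysctl -w net.ipv4.tcp_timestamps=1 and sysctl -w net.ipv4.tcp_tw_reuse=1, and requires root:

#include <stdio.h>

static int write_sysctl(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (!f) {
        perror(path);
        return -1;
    }
    fputs(value, f);
    fclose(f);
    return 0;
}

int main(void)
{
    /* As the summary above notes, both options must be on for the
     * TIME_WAIT reuse path in tcp_twsk_unique() to be taken. */
    int rc = 0;
    rc |= write_sysctl("/proc/sys/net/ipv4/tcp_timestamps", "1");
    rc |= write_sysctl("/proc/sys/net/ipv4/tcp_tw_reuse", "1");
    return rc ? 1 : 0;
}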
The official manual carries a warning:
It should not be changed without advice/request of technical experts.
For most LAN or corporate-intranet applications, the conditions above are easily satisfied, so the warning is less alarming than it may sound.
Option 6: set tcp_max_tw_buckets to a relatively small value, below the usable port range; for example, if roughly 60,000 ports are usable, a value of about 55,000 is reasonable.
tcp_max_tw_buckets - INTEGER
The official documentation explains it as follows:
Maximal number of timewait sockets held by system simultaneously. If this number is exceeded time-wait socket is immediately destroyed and warning is printed.
Translated: this is the maximum number of TIME_WAIT sockets the kernel holds simultaneously. Once this limit is exceeded, a socket that would enter TIME_WAIT is destroyed immediately instead, and a warning is printed.
The official documentation does not state the default value; a quick check across a few systems suggests it is 180000.
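To check the value on a particular system, the sysctl can simply be read back from procfs (equivalent to running sysctl net.ipv4.tcp_max_tw_buckets); a minimal sketch:

#include <stdio.h>

int main(void)
{
    /* Standard Linux sysctl path for this tunable. */
    FILE *f = fopen("/proc/sys/net/ipv4/tcp_max_tw_buckets", "r");
    long buckets;

    if (f && fscanf(f, "%ld", &buckets) == 1)
        printf("tcp_max_tw_buckets = %ld\n", buckets);
    if (f)
        fclose(f);
    return 0;
}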
The relevant kernel source is as follows:
void tcp_time_wait(struct sock *sk, int state, int timeo)
{
    struct inet_timewait_sock *tw = NULL;
    const struct inet_connection_sock *icsk = inet_csk(sk);
    const struct tcp_sock *tp = tcp_sk(sk);
    int recycle_ok = 0;

    if (tcp_death_row.sysctl_tw_recycle && tp->rx_opt.ts_recent_stamp)
        recycle_ok = icsk->icsk_af_ops->remember_stamp(sk);

    // Check whether the number of TIME_WAIT sockets is still below the limit
    if (tcp_death_row.tw_count < tcp_death_row.sysctl_max_tw_buckets)
        tw = inet_twsk_alloc(sk, state);

    if (tw != NULL) {
        // Allocation succeeded: perform the normal TIME_WAIT processing
        // (a large amount of code omitted here)
    } else {
        // Allocation failed: the socket gets no TIME_WAIT handling at all;
        // the kernel only logs "TCP: time wait bucket table overflow"
        /* Sorry, if we're out of memory, just CLOSE this
         * socket up. We've got bigger problems than
         * non-graceful socket closings.
         */
        NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPTIMEWAITOVERFLOW);
    }

    tcp_update_metrics(sk);
    tcp_done(sk);
}
The official manual also carries a warning:
This limit exists only to prevent simple DoS attacks, you _must_ not lower the limit artificially,
but rather increase it (probably, after increasing installed memory), if network conditions require more than default value.
In short: this limit exists only to prevent simple DoS attacks. You must not lower it artificially; if network conditions require more than the default, you should raise it instead (probably after adding memory).
In practice, for LAN or corporate-intranet applications, this risk is small.
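If you do lower tcp_max_tw_buckets, it is worth watching how close the system gets to the limit before the overflow warning fires. Below is a minimal sketch that counts IPv4 sockets currently in TIME_WAIT by scanning /proc/net/tcp, where a state field of 06 means TIME_WAIT (/proc/net/tcp6 can be scanned the same way for IPv6):

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/net/tcp", "r");
    char line[512];
    long count = 0;

    if (!f) {
        perror("/proc/net/tcp");
        return 1;
    }
    fgets(line, sizeof(line), f); /* skip the header row */
    while (fgets(line, sizeof(line), f)) {
        unsigned int state;
        /* Fields: sl local_address rem_address st ...; st is hex,
         * and 0x06 is TCP_TIME_WAIT. */
        if (sscanf(line, "%*s %*s %*s %x", &state) == 1 && state == 0x06)
            count++;
    }
    fclose(f);
    printf("TIME_WAIT sockets: %ld\n", count);
    return 0;
}

The same number can also be obtained from the command line with ss -tan state time-wait | wc -l, minus the header line.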