tcp: abort orphan sockets stalling on zero window probes

Currently we have two different policies for orphan sockets that repeatedly stall on zero window ACKs. If a socket gets a zero window ACK when it is transmitting data, the RTO is used to probe the window. The socket is aborted after roughly tcp_orphan_retries() retries (as in tcp_write_timeout()).

But if the socket was idle when it received the zero window ACK, and later wants to send more data, we use the probe timer to probe the window. If the receiver always returns zero window ACKs, icsk_probes keeps getting reset in tcp_ack() and the orphan socket can stall forever until the system reaches the orphan limit (as commented in tcp_probe_timer()). This opens up a simple attack that creates lots of hanging orphan sockets to burn memory and CPU, as demonstrated in the recent netdev post "TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised."
http://www.spinics.net/lists/netdev/msg296539.html

This patch follows the design of the RTO-based probe: we abort an orphan socket stalling on zero window when the probe timer reaches both the maximum backoff and the maximum RTO. For example, a 100ms RTT connection will time out after roughly 153 seconds (0.3 + 0.6 + ... + 76.8) if the receiver keeps the window shut. If the orphan socket passes this check, but the system already has too many orphans (as in tcp_out_of_resources()), we still abort it, but we also send an RST packet as the connection may still be active.

In addition, we change TCP_USER_TIMEOUT to cover (live or dead) sockets stalled on zero-window probes. This changes the semantics of TCP_USER_TIMEOUT slightly, because it previously applied only when the socket had pending transmission.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
author: Yuchung Cheng <ycheng@google.com> 2014-09-29 13:20:38 -0700
committer: David S. Miller <davem@davemloft.net> 2014-10-01 16:27:52 -0400
commit: b248230c34970a6c1c17c591d63b464e8d2cfc33 (patch)
tree: 1b87913e6b3dc3574cbe78f7d1736ae4074ebf93 /net/ipv4/tcp.c
parent: cipso: add __init to cipso_v4_cache_init (diff)
download: linux-dev-b248230c34970a6c1c17c591d63b464e8d2cfc33.tar.xz
linux-dev-b248230c34970a6c1c17c591d63b464e8d2cfc33.zip
1 files changed, 1 insertions, 1 deletions
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5c170340f684..26a6f113f00c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2693,7 +2693,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		break;
 #endif
 	case TCP_USER_TIMEOUT:
-		/* Cap the max timeout in ms TCP will retry/retrans
+		/* Cap the max time in ms TCP will retry or probe the window
 		 * before giving up and aborting (ETIMEDOUT) a connection.
 		 */
 		if (val < 0)