net/tcp_fastopen: Disable active side TFO in certain scenarios

Middlebox firewall issues can potentially cause server's data being blackholed after a successful 3WHS using TFO. Following are the related reports from Apple: https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf Slide 31 identifies an issue where the client ACK to the server's data sent during a TFO'd handshake is dropped. C ---> syn-data ---> S C <--- syn/ack ----- S C (accept & write) C <---- data ------- S C ----- ACK -> X S [retry and timeout] https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf Slide 5 shows a similar situation that the server's data gets dropped after 3WHS. C ---- syn-data ---> S C <--- syn/ack ----- S C ---- ack --------> S S (accept & write) C? X <- data ------ S [retry and timeout] This is the worst failure b/c the client can not detect such behavior to mitigate the situation (such as disabling TFO). Failing to proceed, the application (e.g., SSL library) may simply timeout and retry with TFO again, and the process repeats indefinitely. The proposed solution is to disable active TFO globally under the following circumstances: 1. client side TFO socket detects out of order FIN 2. client side TFO socket receives out of order RST We disable active side TFO globally for 1hr at first. Then if it happens again, we disable it for 2h, then 4h, 8h, ... And we reset the timeout to 1hr if a client side TFO sockets not opened on loopback has successfully received data segs from server. And we examine this condition during close(). The rational behind it is that when such firewall issue happens, application running on the client should eventually close the socket as it is not able to get the data it is expecting. Or application running on the server should close the socket as it is not able to receive any response from client. In both cases, out of order FIN or RST will get received on the client given that the firewall will not block them as no data are in those frames. And we want to disable active TFO globally as it helps if the middle box is very close to the client and most of the connections are likely to fail. Also, add a debug sysctl: tcp_fastopen_blackhole_detect_timeout_sec: the initial timeout to use when firewall blackhole issue happens. This can be set and read. When setting it to 0, it means to disable the active disable logic. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
author: Wei Wang <weiwan@google.com> 2017-04-20 14:45:46 -0700
committer: David S. Miller <davem@davemloft.net> 2017-04-24 14:27:17 -0400
commit: cf1ef3f0719b4dcb74810ed507e2a2540f9811b4 (patch)
tree: 1b070be7a31eb3557e7f0701a6aca667b1eb13dd /net/ipv4/sysctl_net_ipv4.c
parent: Merge tag 'mlx5-updates-2017-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux (diff)
download: linux-dev-cf1ef3f0719b4dcb74810ed507e2a2540f9811b4.tar.xz
linux-dev-cf1ef3f0719b4dcb74810ed507e2a2540f9811b4.zip
1 files changed, 21 insertions, 0 deletions
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index ddac9e64b702..86957e9cd6c6 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -350,6 +350,19 @@ static int proc_udp_early_demux(struct ctl_table *table, int write,
 	return ret;
 }
 
+static int proc_tfo_blackhole_detect_timeout(struct ctl_table *table,
+					     int write,
+					     void __user *buffer,
+					     size_t *lenp, loff_t *ppos)
+{
+	int ret;
+
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (write && ret == 0)
+		tcp_fastopen_active_timeout_reset();
+	return ret;
+}
+
 static struct ctl_table ipv4_table[] = {
 	{
 		.procname	= "tcp_timestamps",
@@ -400,6 +413,14 @@ static struct ctl_table ipv4_table[] = {
 		.proc_handler	= proc_tcp_fastopen_key,
 	},
 	{
+		.procname	= "tcp_fastopen_blackhole_timeout_sec",
+		.data		= &sysctl_tcp_fastopen_blackhole_timeout,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_tfo_blackhole_detect_timeout,
+		.extra1		= &zero,
+	},
+	{
 		.procname	= "tcp_abort_on_overflow",
 		.data		= &sysctl_tcp_abort_on_overflow,
 		.maxlen		= sizeof(int),
author	Wei Wang <weiwan@google.com>	2017-04-20 14:45:46 -0700
committer	David S. Miller <davem@davemloft.net>	2017-04-24 14:27:17 -0400
commit	cf1ef3f0719b4dcb74810ed507e2a2540f9811b4 (patch)
tree	1b070be7a31eb3557e7f0701a6aca667b1eb13dd /net/ipv4/sysctl_net_ipv4.c
parent	Merge tag 'mlx5-updates-2017-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux (diff)
download	linux-dev-cf1ef3f0719b4dcb74810ed507e2a2540f9811b4.tar.xz linux-dev-cf1ef3f0719b4dcb74810ed507e2a2540f9811b4.zip