tcp: fix false undo corner cases

The undo code assumes that, upon entering loss recovery, TCP
 1) always retransmits something
 2) never has the retransmission fail locally (e.g., qdisc drop)

so undo_marker is set in tcp_enter_recovery() and undo_retrans is
incremented only when tcp_retransmit_skb() is successful.

When the assumption is broken because TCP's cwnd is too small to
retransmit or the retransmit fails locally, the next (DUP)ACK
would incorrectly revert the cwnd and the congestion state in
tcp_try_undo_dsack() or tcp_may_undo(). Subsequent (DUP)ACKs
may enter the recovery state again. The sender repeatedly enters and
(incorrectly) exits recovery states if the retransmits continue to
fail locally while it receives (DUP)ACKs.

The fix is to initialize undo_retrans to -1 and start counting on
the first retransmission. Always increment undo_retrans even if the
retransmission fails locally, because a retransmission that fails
locally cannot cause a DSACK that would undo the cwnd reduction.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
author: Yuchung Cheng <ycheng@google.com> 2014-07-02 12:07:16 -0700
committer: David S. Miller <davem@davemloft.net> 2014-07-07 21:40:48 -0700
commit: 6e08d5e3c8236e7484229e46fdf92006e1dd4c49 (patch)
tree: 0cef9beb502c504b884cccc127e34a50974d0f07 /net/ipv4/tcp_output.c
parent: net/mlx4_en: Don't configure the HW vxlan parser when vxlan offloading isn't set (diff)
1 files changed, 4 insertions, 2 deletions
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d92bce0ea24e..179b51e6bda3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2525,8 +2525,6 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 		if (!tp->retrans_stamp)
 			tp->retrans_stamp = TCP_SKB_CB(skb)->when;
 
-		tp->undo_retrans += tcp_skb_pcount(skb);
-
 		/* snd_nxt is stored to detect loss of retransmitted segment,
 		 * see tcp_input.c tcp_sacktag_write_queue().
 		 */
@@ -2534,6 +2532,10 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	} else if (err != -EBUSY) {
 		NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
 	}
+
+	if (tp->undo_retrans < 0)
+		tp->undo_retrans = 0;
+	tp->undo_retrans += tcp_skb_pcount(skb);
 	return err;
 }
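
The counting rule described in the commit message can be sketched as a small standalone model. This is not kernel code: struct undo_state and the helpers enter_recovery(), count_retransmit(), on_dsack() and may_undo() are hypothetical names made up for illustration, and the undo test is reduced to "marker set and counter exactly zero", which is an assumption about how the real checks in tcp_input.c behave.

```c
#include <assert.h>

/* Hypothetical stand-in for the two tcp_sock fields the message discusses. */
struct undo_state {
	int undo_marker;	/* non-zero while an undo is still possible */
	int undo_retrans;	/* -1 until the first retransmission attempt */
};

/* Model of entering loss recovery: arm the marker, nothing counted yet. */
static void enter_recovery(struct undo_state *s)
{
	s->undo_marker = 1;
	s->undo_retrans = -1;
}

/*
 * Model of the patched tail of tcp_retransmit_skb(): count every attempt,
 * including ones that failed locally, and leave the -1 sentinel on the
 * first attempt.
 */
static void count_retransmit(struct undo_state *s, int pcount)
{
	assert(pcount > 0);
	if (s->undo_retrans < 0)
		s->undo_retrans = 0;
	s->undo_retrans += pcount;
}

/* Assumed DSACK handling: each DSACK cancels one counted retransmission. */
static void on_dsack(struct undo_state *s)
{
	if (s->undo_marker && s->undo_retrans > 0)
		s->undo_retrans--;
}

/* Simplified undo test: -1 (nothing retransmitted) must not look like 0. */
static int may_undo(const struct undo_state *s)
{
	return s->undo_marker && s->undo_retrans == 0;
}
```

In this model, a (DUP)ACK arriving right after enter_recovery() sees undo_retrans == -1, so may_undo() returns 0 and the cwnd reduction is kept. Only after every counted retransmission has been matched by a DSACK does the counter reach 0 and allow an undo.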