Merge branch 'big-tcp'

Eric Dumazet says:

====================
tcp: BIG TCP implementation

This series implements BIG TCP as presented in netdev 0x15:

https://netdevconf.info/0x15/session.html?BIG-TCP

Jonathan Corbet made a nice summary: https://lwn.net/Articles/884104/

Standard TSO/GRO packet limit is 64KB

With BIG TCP, we allow bigger TSO/GRO packet sizes for IPv6 traffic.

Note that this feature is by default not enabled, because it might
break some eBPF programs assuming TCP header immediately follows IPv6 header.

While tcpdump recognizes the HBH/Jumbo header, standard pcap filters
are unable to skip over IPv6 extension headers.

Reducing number of packets traversing networking stack usually improves
performance, as shown on this experiment using a 100Gbit NIC, and 4K MTU.

'Standard' performance with current (64KB) limits.
for i in {1..10}; do ./netperf -t TCP_RR -H iroa23 -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done
77 138 183 8542.19
79 143 178 8215.28
70 117 164 9543.39
80 144 176 8183.71
78 126 155 9108.47
80 146 184 8115.19
71 113 165 9510.96
74 113 164 9518.74
79 137 178 8575.04
73 111 171 9561.73

Now enable BIG TCP on both hosts.

ip link set dev eth0 gro_max_size 185000 gso_max_size 185000
for i in {1..10}; do ./netperf -t TCP_RR -H iroa23 -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done
57 83 117 13871.38
64 118 155 11432.94
65 116 148 11507.62
60 105 136 12645.15
60 103 135 12760.34
60 102 134 12832.64
62 109 132 10877.68
58 82 115 14052.93
57 83 124 14212.58
57 82 119 14196.01

We see an increase of transactions per second, and lower latencies as well.

v7: adopt unsafe_memcpy() in mlx5 to avoid FORTIFY warnings.

v6: fix a compilation error for CONFIG_IPV6=n in
    "net: allow gso_max_size to exceed 65536", reported by kernel bots.

v5: Replaced two patches (that were adding new attributes) with patches
    from Alexander Duyck. Idea is to reuse existing gso_max_size/gro_max_size

v4: Rebased on top of Jakub series (Merge branch 'tso-gso-limit-split')
    max_tso_size is now family independent.

v3: Fixed a typo in RFC number (Alexander)
    Added Reviewed-by: tags from Tariq on mlx4/mlx5 parts.

v2: Removed the MAX_SKB_FRAGS change, this belongs to a different series.
    Addressed feedback, for Alexander and nvidia folks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
author: David S. Miller <davem@davemloft.net> 2022-05-16 10:18:56 +0100
committer: David S. Miller <davem@davemloft.net> 2022-05-16 10:18:56 +0100
commit: 7fa2e481ff2fee20e0338d98489eb9f513ada45f (patch)
tree: f677e03a56c15c8f30fa099bc7032298fd114b89 /include/net
parent: Merge branch 'Renesas-RSZ-V2M-support' (diff)
parent: mlx5: support BIG TCP packets (diff)
download: linux-dev-7fa2e481ff2fee20e0338d98489eb9f513ada45f.tar.xz
linux-dev-7fa2e481ff2fee20e0338d98489eb9f513ada45f.zip
1 files changed, 44 insertions, 0 deletions
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 213612f1680c..5b38bf1a586b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -151,6 +151,17 @@ struct frag_hdr {
 	__be32	identification;
 };
 
+/*
+ * Jumbo payload option, as described in RFC 2675 2.
+ */
+struct hop_jumbo_hdr {
+	u8	nexthdr;
+	u8	hdrlen;
+	u8	tlv_type;	/* IPV6_TLV_JUMBO, 0xC2 */
+	u8	tlv_len;	/* 4 */
+	__be32	jumbo_payload_len;
+};
+
 #define	IP6_MF		0x0001
 #define	IP6_OFFSET	0xFFF8
 
@@ -456,6 +467,39 @@ bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb,
 struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
 					   struct ipv6_txoptions *opt);
 
+/* This helper is specialized for BIG TCP needs.
+ * It assumes the hop_jumbo_hdr will immediately follow the IPV6 header.
+ * It assumes headers are already in skb->head.
+ * Returns 0, or IPPROTO_TCP if a BIG TCP packet is there.
+ */
+static inline int ipv6_has_hopopt_jumbo(const struct sk_buff *skb)
+{
+	const struct hop_jumbo_hdr *jhdr;
+	const struct ipv6hdr *nhdr;
+
+	if (likely(skb->len <= GRO_LEGACY_MAX_SIZE))
+		return 0;
+
+	if (skb->protocol != htons(ETH_P_IPV6))
+		return 0;
+
+	if (skb_network_offset(skb) +
+	    sizeof(struct ipv6hdr) +
+	    sizeof(struct hop_jumbo_hdr) > skb_headlen(skb))
+		return 0;
+
+	nhdr = ipv6_hdr(skb);
+
+	if (nhdr->nexthdr != NEXTHDR_HOP)
+		return 0;
+
+	jhdr = (const struct hop_jumbo_hdr *) (nhdr + 1);
+	if (jhdr->tlv_type != IPV6_TLV_JUMBO || jhdr->hdrlen != 0 ||
+	    jhdr->nexthdr != IPPROTO_TCP)
+		return 0;
+	return jhdr->nexthdr;
+}
+
 static inline bool ipv6_accept_ra(struct inet6_dev *idev)
 {
 	/* If forwarding is enabled, RA are not accepted unless the special
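
The commit message warns that BIG TCP may break programs that assume the TCP header immediately follows the IPv6 header. The following is a minimal user-space sketch, not kernel code, of how a parser could locate the TCP header whether or not the 8-byte Hop-by-Hop/Jumbo header from the patch is present. The names tcp_offset_ipv6() and build_big_tcp_headers() and the local constants are hypothetical and introduced only for illustration; the byte layout mirrors struct hop_jumbo_hdr above. Unlike ipv6_has_hopopt_jumbo(), this sketch also checks tlv_len, which is an assumption about what a stricter parser might want.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local illustrative constants (not the kernel's own macros). */
#define IPV6_HDR_LEN   40    /* fixed IPv6 header size */
#define HOP_JUMBO_LEN  8     /* size of struct hop_jumbo_hdr in the patch */
#define NH_HOP         0     /* Hop-by-Hop Options next-header value */
#define NH_TCP         6     /* TCP next-header value */
#define TLV_JUMBO      0xC2  /* Jumbo Payload option type */

/*
 * Hypothetical helper: given a buffer starting at the IPv6 header,
 * return the offset of the TCP header, or 0 if the packet is truncated
 * or is not TCP (directly or behind a single Jumbo hop-by-hop header).
 * *jumbo_len receives the Jumbo Payload Length, or 0 if no option is present.
 */
static size_t tcp_offset_ipv6(const uint8_t *pkt, size_t len,
                              uint32_t *jumbo_len)
{
    const uint8_t *hbh;

    *jumbo_len = 0;
    if (len < IPV6_HDR_LEN)
        return 0;

    /* Byte 6 of the IPv6 header is the Next Header field. */
    if (pkt[6] == NH_TCP)
        return IPV6_HDR_LEN;            /* legacy layout: TCP follows directly */

    if (pkt[6] != NH_HOP || len < IPV6_HDR_LEN + HOP_JUMBO_LEN)
        return 0;

    /* Layout: nexthdr, hdrlen, tlv_type, tlv_len, 32-bit big-endian length. */
    hbh = pkt + IPV6_HDR_LEN;
    if (hbh[0] != NH_TCP || hbh[1] != 0 ||
        hbh[2] != TLV_JUMBO || hbh[3] != 4)
        return 0;

    *jumbo_len = ((uint32_t)hbh[4] << 24) | ((uint32_t)hbh[5] << 16) |
                 ((uint32_t)hbh[6] << 8)  |  (uint32_t)hbh[7];
    return IPV6_HDR_LEN + HOP_JUMBO_LEN;
}

/* Hypothetical helper: write a zeroed IPv6 header plus Jumbo option into buf. */
static void build_big_tcp_headers(uint8_t *buf, size_t buf_len,
                                  uint32_t jumbo_len)
{
    uint8_t *hbh = buf + IPV6_HDR_LEN;

    assert(buf_len >= IPV6_HDR_LEN + HOP_JUMBO_LEN);
    memset(buf, 0, IPV6_HDR_LEN + HOP_JUMBO_LEN);

    buf[0] = 0x60;          /* IP version 6 in the top nibble */
    buf[6] = NH_HOP;        /* next header: Hop-by-Hop Options */

    hbh[0] = NH_TCP;        /* nexthdr */
    hbh[1] = 0;             /* hdrlen: 0 means 8 bytes total */
    hbh[2] = TLV_JUMBO;     /* tlv_type */
    hbh[3] = 4;             /* tlv_len */
    hbh[4] = (uint8_t)(jumbo_len >> 24);
    hbh[5] = (uint8_t)(jumbo_len >> 16);
    hbh[6] = (uint8_t)(jumbo_len >> 8);
    hbh[7] = (uint8_t)jumbo_len;
}
```

A caller would pass a pointer to the start of the IPv6 header and use the returned offset in place of a hard-coded 40 when reading TCP fields; a return value of 0 means the packet should be handled by a more general extension-header walker.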