linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-09-13	gso_segment: Reset skb->mac_len after modifying network header	Toke Høiland-Jørgensen	2	-0/+2
	When splitting a GSO segment that consists of encapsulated packets, the skb->mac_len of the segments can end up being set wrong, causing packet drops in particular when using act_mirred and ifb interfaces in combination with a qdisc that splits GSO packets. This happens because at the time skb_segment() is called, network_header will point to the inner header, throwing off the calculation in skb_reset_mac_len(). The network_header is subsequently adjust by the outer IP gso_segment handlers, but they don't set the mac_len. Fix this by adding skb_reset_mac_len() calls to both the IPv4 and IPv6 gso_segment handlers, after they modify the network_header. Many thanks to Eric Dumazet for his help in identifying the cause of the bug. Acked-by: Dave Taht <dave.taht@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	vxlan: Remove duplicated include from vxlan.h	YueHaibing	1	-1/+0
	Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	net: dsa: b53: Do not fail when IRQ are not initialized	Florian Fainelli	1	-1/+7
	When the Device Tree is not providing the per-port interrupts, do not fail during b53_srab_irq_enable() but instead bail out gracefully. The SRAB driver is used on the BCM5301X (Northstar) platforms which do not yet have the SRAB interrupts wired up. Fixes: 16994374a6fc ("net: dsa: b53: Make SRAB driver manage port interrupts") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	vhost_net: batch submitting XDP buffers to underlayer sockets	Jason Wang	1	-13/+161
	This patch implements XDP batching for vhost_net. The idea is first to try to do userspace copy and build XDP buff directly in vhost. Instead of submitting the packet immediately, vhost_net will batch them in an array and submit every 64 (VHOST_NET_BATCH) packets to the under layer sockets through msg_control of sendmsg(). When XDP is enabled on the TUN/TAP, TUN/TAP can process XDP inside a loop without caring GUP thus it can do batch map flushing. When XDP is not enabled or not supported, the underlayer socket need to build skb and pass it to network core. The batched packet submission allows us to do batching like netif_receive_skb_list() in the future. This saves lots of indirect calls for better cache utilization. For the case that we can't so batching e.g when sndbuf is limited or packet size is too large, we will go for usual one packet per sendmsg() way. Doing testpmd on various setups gives us: Test /+pps% XDP_DROP on TAP /+44.8% XDP_REDIRECT on TAP /+29% macvtap (skb) /+26% Netperf tests shows obvious improvements for small packet transmission: size/session/+thu%/+normalize% 64/ 1/ +2%/ 0% 64/ 2/ +3%/ +1% 64/ 4/ +7%/ +5% 64/ 8/ +8%/ +6% 256/ 1/ +3%/ 0% 256/ 2/ +10%/ +7% 256/ 4/ +26%/ +22% 256/ 8/ +27%/ +23% 512/ 1/ +3%/ +2% 512/ 2/ +19%/ +14% 512/ 4/ +43%/ +40% 512/ 8/ +45%/ +41% 1024/ 1/ +4%/ 0% 1024/ 2/ +27%/ +21% 1024/ 4/ +38%/ +73% 1024/ 8/ +15%/ +24% 2048/ 1/ +10%/ +7% 2048/ 2/ +16%/ +12% 2048/ 4/ 0%/ +2% 2048/ 8/ 0%/ +2% 4096/ 1/ +36%/ +60% 4096/ 2/ -11%/ -26% 4096/ 4/ 0%/ +14% 4096/ 8/ 0%/ +4% 16384/ 1/ -1%/ +5% 16384/ 2/ 0%/ +2% 16384/ 4/ 0%/ -3% 16384/ 8/ 0%/ +4% 65535/ 1/ 0%/ +10% 65535/ 2/ 0%/ +8% 65535/ 4/ 0%/ +1% 65535/ 8/ 0%/ +3% Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tap: accept an array of XDP buffs through sendmsg()	Jason Wang	1	-2/+72
	This patch implement TUN_MSG_PTR msg_control type. This type allows the caller to pass an array of XDP buffs to tuntap through ptr field of the tun_msg_control. Tap will build skb through those XDP buffers. This will avoid lots of indirect calls thus improves the icache utilization and allows to do XDP batched flushing when doing XDP redirection. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: accept an array of XDP buffs through sendmsg()	Jason Wang	1	-3/+114
	This patch implement TUN_MSG_PTR msg_control type. This type allows the caller to pass an array of XDP buffs to tuntap through ptr field of the tun_msg_control. If an XDP program is attached, tuntap can run XDP program directly. If not, tuntap will build skb and do a fast receiving since part of the work has been done by vhost_net. This will avoid lots of indirect calls thus improves the icache utilization and allows to do XDP batched flushing when doing XDP redirection. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tun: switch to new type of msg_control	Jason Wang	4	-9/+36
	This patch introduces to a new tun/tap specific msg_control: #define TUN_MSG_UBUF 1 #define TUN_MSG_PTR 2 struct tun_msg_ctl { int type; void *ptr; }; This allows us to pass different kinds of msg_control through sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF) which will be used by the existed vhost_net zerocopy code. The second is XDP buff, which allows vhost_net to pass XDP buff to TUN. This could be used to implement accepting an array of XDP buffs from vhost_net in the following patches. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: move XDP flushing out of tun_do_xdp()	Jason Wang	1	-1/+2
	This will allow adding batch flushing on top. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: split out XDP logic	Jason Wang	1	-37/+51
	This patch split out XDP logic into a single function. This make it to be reused by XDP batching path in the following patch. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: tweak on the path of skb XDP case in tun_build_skb()	Jason Wang	1	-15/+24
	If we're sure not to go native XDP, there's no need for several things like bh and rcu stuffs. So this patch introduces a helper to build skb and hold page refcnt. When we found we will go through skb path, build skb directly. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: simplify error handling in tun_build_skb()	Jason Wang	1	-7/+3
	There's no need to duplicate page get logic in each action. So this patch tries to get page and calculate the offset before processing XDP actions (except for XDP_DROP), and undo them when meet errors (we don't care the performance on errors). This will be used for factoring out XDP logic. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: enable bh early during processing XDP	Jason Wang	1	-7/+3
	This patch move the bh enabling a little bit earlier, this will be used for factoring out the core XDP logic of tuntap. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	tuntap: switch to use XDP_PACKET_HEADROOM	Jason Wang	1	-2/+1
	Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	net: sock: introduce SOCK_XDP	Jason Wang	2	-0/+20
	This patch introduces a new sock flag - SOCK_XDP. This will be used for notifying the upper layer that XDP program is attached on the lower socket, and requires for extra headroom. TUN will be the first user. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	llc: avoid blocking in llc_sap_close()	Cong Wang	2	-3/+2
	llc_sap_close() is called by llc_sap_put() which could be called in BH context in llc_rcv(). We can't block in BH. There is no reason to block it here, kfree_rcu() should be sufficient. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	ipv6: Add sockopt IPV6_MULTICAST_ALL analogue to IP_MULTICAST_ALL	Andre Naujoks	5	-2/+16
	The socket option will be enabled by default to ensure current behaviour is not changed. This is the same for the IPv4 version. A socket bound to in6addr_any and a specific port will receive all traffic on that port. Analogue to IP_MULTICAST_ALL, disable this behaviour, if one or more multicast groups were joined (using said socket) and only pass on multicast traffic from groups, which were explicitly joined via this socket. Without this option disabled a socket (system even) joined to multiple multicast groups is very hard to get right. Filtering by destination address has to take place in user space to avoid receiving multicast traffic from other multicast groups, which might have traffic on the same port. The extension of the IP_MULTICAST_ALL socketoption to just apply to ipv6, too, is not done to avoid changing the behaviour of current applications. Signed-off-by: Andre Naujoks <nautsch2@gmail.com> Acked-By: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-13	net: dsa: Add Lantiq / Intel DSA driver for vrx200	Hauke Mehrtens	6	-4/+1337
	This adds the DSA driver for the GSWIP Switch found in the VRX200 SoC. This switch is integrated in the DSL SoC, this SoC uses a GSWIP version 2.1, there are other SoCs using different versions of this IP block, but this driver was only tested with the version found in the VRX200. Currently only the basic features are implemented which will forward all packages to the CPU and let the CPU do the forwarding. The hardware also support Layer 2 offloading which is not yet implemented in this driver. The GPHY FW loaded is now done by this driver and not any more by the separate driver in drivers/soc/lantiq/gphy.c, I will remove this driver is a separate patch. to make use of the GPHY this switch driver is needed anyway. Other SoCs have more embedded GPHYs so this driver should support a variable number of GPHYs. After the firmware was loaded the GPHY can be probed on the MDIO bus and it behaves like an external GPHY, without the firmware it can not be probed on the MDIO bus. The clock names in the sysctrl.c file have to be changed because the clocks are now used by a different driver. This should be cleaned up and a real common clock driver should provide the clocks instead. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>