wireguard-linux - WireGuard for the Linux kernel

Age	Commit message	Author	Files	Lines
2016-05-20	net: suppress warnings on dev_alloc_skb	Neil Horman	1	-2/+2
	Noticed an allocation failure in a network driver the other day on a 32-bit system: DMA-API: debugging out of memory - disabling bnx2fc: adapter_lookup: hba NULL lldpad: page allocation failure. order:0, mode:0x4120 Pid: 4556, comm: lldpad Not tainted 2.6.32-639.el6.i686.debug #1 Call Trace: [<c08a4086>] ? printk+0x19/0x23 [<c05166a4>] ? __alloc_pages_nodemask+0x664/0x830 [<c0649d02>] ? free_object+0x82/0xa0 [<fb4e2c9b>] ? ixgbe_alloc_rx_buffers+0x10b/0x1d0 [ixgbe] [<fb4e2fff>] ? ixgbe_configure_rx_ring+0x29f/0x420 [ixgbe] [<fb4e228c>] ? ixgbe_configure_tx_ring+0x15c/0x220 [ixgbe] [<fb4e3709>] ? ixgbe_configure+0x589/0xc00 [ixgbe] [<fb4e7be7>] ? ixgbe_open+0xa7/0x5c0 [ixgbe] [<fb503ce6>] ? ixgbe_init_interrupt_scheme+0x5b6/0x970 [ixgbe] [<fb4e8e54>] ? ixgbe_setup_tc+0x1a4/0x260 [ixgbe] [<fb505a9f>] ? ixgbe_dcbnl_set_state+0x7f/0x90 [ixgbe] [<c088d80d>] ? dcb_doit+0x10ed/0x16d0 ... Thought that perhaps the big splat in the logs wasn't really necessary, as all call sites for dev_alloc_skb: a) check the return code for the function and b) either print their own error message or have a recovery path that makes the warning moot. Fix it by modifying dev_alloc_pages to pass __GFP_NOWARN as a gfp flag to suppress the warning. Applies to the net tree. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Alexander Duyck <alexander.duyck@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	uapi glibc compat: fix compilation when !__USE_MISC in glibc	Nicolas Dichtel	1	-1/+1
	These structures are defined only if __USE_MISC is set in glibc net/if.h headers, i.e. when _BSD_SOURCE or _SVID_SOURCE is defined. CC: Jan Engelhardt <jengelh@inai.de> CC: Josh Boyer <jwboyer@fedoraproject.org> CC: Stephen Hemminger <shemming@brocade.com> CC: Waldemar Brodkorb <mail@waldemar-brodkorb.de> CC: Gabriel Laskar <gabriel@lse.epita.fr> CC: Mikko Rapeli <mikko.rapeli@iki.fi> Fixes: 4a91cb61bb99 ("uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	udp: prevent skbs lingering in tunnel socket queues	Hannes Frederic Sowa	4	-11/+7
	In case we find a socket with encapsulation enabled we should call the encap_recv function even if just a udp header without payload is available. The callbacks are responsible for correctly verifying and dropping the packets. Also, in case the header validation fails for geneve and vxlan we shouldn't put the skb back into the socket queue, no one will pick them up there. Instead we can simply discard them in the respective encap_recv functions. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	bpf: teach verifier to recognize imm += ptr pattern	Alexei Starovoitov	1	-1/+17
	Humans don't write C code like: u8 *ptr = skb->data; int imm = 4; imm += ptr; but from llvm backend point of view 'imm' and 'ptr' are registers and imm += ptr may be preferred vs ptr += imm depending which register value will be used further in the code, while verifier can only recognize ptr += imm. That caused small unrelated changes in the C code of the bpf program to trigger rejection by the verifier. Therefore teach the verifier to recognize both ptr += imm and imm += ptr. For example: when R6=pkt(id=0,off=0,r=62) R7=imm22 after r7 += r6 instruction will be R6=pkt(id=0,off=0,r=62) R7=pkt(id=0,off=22,r=62) Fixes: 969bf05eb3ce ("bpf: direct packet access") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
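The symmetric handling described above can be sketched in userspace C. This is only a simplified model, assuming a made-up `struct reg` and `reg_add()` helper that are not the verifier's real data structures; it shows that `imm += ptr` yields the same tracked state as `ptr += imm`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified register state; not the kernel's own type. */
enum reg_type { REG_IMM, REG_PKT };

struct reg {
	enum reg_type type;
	int off;    /* constant value for REG_IMM, packet offset for REG_PKT */
	int range;  /* verified readable bytes, meaningful for REG_PKT only */
};

/* Model "dst += src". Returns false when the result cannot be tracked. */
static bool reg_add(struct reg *dst, const struct reg *src)
{
	assert(dst != NULL && src != NULL);

	if (dst->type == REG_PKT && src->type == REG_IMM) {
		/* ptr += imm: the form that was already recognized */
		dst->off += src->off;
		return true;
	}
	if (dst->type == REG_IMM && src->type == REG_PKT) {
		/* imm += ptr: dst becomes a packet pointer at src.off + imm */
		int imm = dst->off;

		*dst = *src;
		dst->off += imm;
		return true;
	}
	if (dst->type == REG_IMM && src->type == REG_IMM) {
		dst->off += src->off;
		return true;
	}
	return false; /* ptr += ptr is not a trackable packet pointer */
}
```

With R6=pkt(off=0,r=62) and R7=imm22, calling `reg_add(&r7, &r6)` leaves R6 unchanged and turns R7 into pkt(off=22,r=62), matching the example in the commit message.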
2016-05-20	bpf: support decreasing order in direct packet access	Alexei Starovoitov	1	-8/+4
	When packet headers are accessed in 'decreasing' order (like TCP port may be fetched before the program reads IP src) llvm may generate the following code: [...] // R7=pkt(id=0,off=22,r=70) r2 = *(u32 *)(r7 +0) // good access [...] r7 += 40 // R7=pkt(id=0,off=62,r=70) r8 = *(u32 *)(r7 +0) // good access [...] r1 = *(u32 *)(r7 -20) // this one will fail though it's within a safe range // it's doing *(u32*)(skb->data + 42) Fix the verifier to recognize such code patterns. Also, it turned out that the 'off > range' condition is not a verifier bug. It's a buggy program that may do something like: if (ptr + 50 > data_end) return 0; ptr += 60; *(u32*)ptr; In such a case emit the "invalid access to packet, off=0 size=4, R1(id=0,off=60,r=50)" error message, so all information is available for the program author to fix the program. Fixes: 969bf05eb3ce ("bpf: direct packet access") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
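The bounds reasoning in this message can be illustrated with a small userspace sketch. The helper name `pkt_access_ok()` and its parameters are assumptions made for illustration, not the verifier's actual code: an access is accepted when the register offset plus the instruction offset, together with the access size, stays inside the range already proven against data_end.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check for a packet load at (reg + insn_off),
 * where reg = pkt(off=reg_off, r=range) and size is the access width.
 */
static bool pkt_access_ok(int reg_off, int insn_off, int size, int range)
{
	int start = reg_off + insn_off;

	assert(size > 0);
	/* The whole access must lie inside the range proven safe. */
	return start >= 0 && start + size <= range;
}
```

Under this model the "decreasing order" load from the message, `pkt_access_ok(62, -20, 4, 70)`, is accepted because bytes 42..45 are below 70, while the buggy program's access, `pkt_access_ok(60, 0, 4, 50)`, is rejected.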
2016-05-20	net: usb: ch9200: use kmemdup	Muhammad Falak R Wani	1	-2/+1
	Use kmemdup when some other buffer is immediately copied into allocated region. It replaces call to allocation followed by memcpy, by a single call to kmemdup. Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
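For readers unfamiliar with the pattern, here is a minimal userspace sketch of what a duplicate-memory helper does. `memdup_sketch()` is a hypothetical stand-in built on `malloc()`; the kernel's `kmemdup()` additionally takes a gfp flags argument, which is omitted here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for an allocate-then-copy helper; NULL on failure. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p;

	assert(src != NULL);
	p = malloc(len);
	if (p)
		memcpy(p, src, len);  /* copy only after a successful allocation */
	return p;
}
```

Callers then replace an allocation followed by `memcpy()` with one call and a single NULL check.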
2016-05-20	ps3_gelic: use kmemdup	Muhammad Falak R Wani	1	-2/+2
	Use kmemdup when some other buffer is immediately copied into allocated region. It replaces call to allocation followed by memcpy, by a single call to kmemdup. Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	net:liquidio: use kmemdup	Muhammad Falak R Wani	1	-3/+1
	Use kmemdup when some other buffer is immediately copied into allocated region. It replaces call to allocation followed by memcpy, by a single call to kmemdup. Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	bpf: Use mount_nodev not mount_ns to mount the bpf filesystem	Eric W. Biederman	1	-1/+1
	While reviewing the filesystems that set FS_USERNS_MOUNT I spotted the bpf filesystem. Looking at the code I saw a broken usage of mount_ns with current->nsproxy->mnt_ns. As the code does not acquire a reference to the mount namespace it can not possibly be correct to store the mount namespace on the superblock as it does. Replace mount_ns with mount_nodev so that each mount of the bpf filesystem returns a distinct instance, and the code is not buggy. In discussion with Hannes Frederic Sowa it was reported that the use of mount_ns was an attempt to have one bpf instance per mount namespace, in an attempt to keep resources that pin resources from hiding. That intent simply does not work, the vfs is not built to allow that kind of behavior. Which means that the bpf filesystem really is buggy both semantically and in its implementation as it does not nor can it implement the original intent. This change is userspace visible, but my experience with similar filesystems leads me to believe nothing will break with a model where each mount of the bpf filesystem is distinct from all others. Fixes: b2197755b263 ("bpf: add support for persistent maps/progs") Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	net: cdc_ncm: update datagram size after changing mtu	Rafal Redzimski	1	-2/+4
	The current implementation updates the mtu size and notifies the cdc_ncm device about the datagram size change using the USB_CDC_SET_MAX_DATAGRAM_SIZE request instead of changing rx_urb_size. Whenever the mtu is changed, the datagram size should also be updated. Also update the maxmtu formula so it takes max_datagram_size from cdc_ncm_max_dgram_size() and not from ctx. Signed-off-by: Robert Dobrowolski <robert.dobrowolski@linux.intel.com> Signed-off-by: Rafal Redzimski <rafal.f.redzimski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	tuntap: correctly wake up process during uninit	Jason Wang	1	-3/+3
	We used to check dev->reg_state against NETREG_REGISTERED each time we were woken up. But after commit 9e641bdcfa4e ("net-tun: restructure tun_do_read for better sleep/wakeup efficiency"), the read path uses skb_recv_datagram(), which does not check dev->reg_state. As a result, if we delete a tun/tap device while a process is blocked in a read, the device waits forever for the reference count held by that process. Fix this by setting RCV_SHUTDOWN, which is checked during skb_recv_datagram(), before trying to wake up the process during uninit. Fixes: 9e641bdcfa4e ("net-tun: restructure tun_do_read for better sleep/wakeup efficiency") Cc: Eric Dumazet <edumazet@google.com> Cc: Xi Wang <xii@google.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	intel: Add support for IPv6 IP-in-IP offload	Alexander Duyck	8	-0/+8
	This patch adds support for offloading IPXIP6 type packets that represent either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header. In addition with this change we should also be able to support FOU encapsulated traffic with outer IPv6 headers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ip6_gre: Do not allow segmentation offloads GRE_CSUM is enabled with FOU/GUE	Alexander Duyck	1	-4/+8
	This patch addresses the same issue we had for IPv4 where enabling GRE with an inner checksum cannot be supported with FOU/GUE due to the fact that they will jump past the GRE header as it is treated like a tunnel header. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	RDS: TCP: Avoid rds connection churn from rogue SYNs	Sowmini Varadhan	1	-4/+6
	When a rogue SYN is received after the connection arbitration algorithm has converged, the incoming SYN should not needlessly quiesce the transmit path, and it should not result in needless TCP connection resets due to re-execution of the connection arbitration logic. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp	Sowmini Varadhan	1	-0/+3
	There are two instances where we want to terminate RDS-TCP: when exiting the netns or during module unload. In either case, the termination sequence is to stop the listen socket, mark the rtn->rds_tcp_listen_sock as null, and flush any accept workqs. Thus any workqs that get flushed at this point will encounter a null rds_tcp_listen_sock, and must exit gracefully to allow the RDS-TCP termination to complete successfully. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	net: sock: move ->sk_shutdown out of bitfields.	Andrey Ryabinin	1	-1/+8
	->sk_shutdown bits share one bitfield with some other bits in sock struct, such as ->sk_no_check_[r,t]x, ->sk_userlocks ... sock_setsockopt() may write to these bits, while holding the socket lock. In case of AF_UNIX sockets, we change ->sk_shutdown bits while holding only unix_state_lock(). So concurrent setsockopt() and shutdown() may lead to corrupting these bits. Fix this by moving ->sk_shutdown bits out of bitfield into a separate byte. This will not change the 'struct sock' size since ->sk_shutdown moved into previously unused 16-bit hole. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
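The layout change can be pictured with a miniature struct. The sketch below is an illustration only: the field selection and the helper `set_shutdown()` are hypothetical, and the size claim assumes a common ABI where `uint32_t` is 4-byte aligned. It shows the shutdown bits moving out of the shared bitfield word into their own byte, placed in padding that already existed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of the layout before the change. */
struct sock_before {
	unsigned int shutdown:2,   /* changed under unix_state_lock() */
		     userlocks:4,  /* changed under the socket lock */
		     protocol:8,
		     type:16;
	uint16_t gso_max_segs;     /* followed by a 2-byte padding hole */
	uint32_t rcvlowat;
};

/* After: shutdown is a separate byte that fills part of the hole. */
struct sock_after {
	unsigned int userlocks:4,
		     protocol:8,
		     type:16;
	uint16_t gso_max_segs;
	uint8_t  shutdown;
	uint32_t rcvlowat;
};

static void set_shutdown(struct sock_after *sk, uint8_t how)
{
	assert(how <= 3);
	/* Plain byte update; the bitfield word is not read or rewritten. */
	sk->shutdown |= how;
}
```

Because a bitfield update is a read-modify-write of the whole storage unit, two writers holding different locks can overwrite each other's bits; a separate byte avoids sharing that unit.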
2016-05-20	ipv6: Don't reset inner headers in ip6_tnl_xmit	Tom Herbert	1	-5/+0
	Since iptunnel_handle_offloads() is called in all paths we can probably drop the block in ip6_tnl_xmit that was checking for skb->encapsulation and resetting the inner headers. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ip4ip6: Support for GSO/GRO	Tom Herbert	4	-6/+49
	Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ip6ip6: Support for GSO/GRO	Tom Herbert	2	-3/+26
	Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ipv6: Set features for IPv6 tunnels	Tom Herbert	1	-0/+9
	Need to set dev features, use same values that are used in GREv6. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ip6_tunnel: Add support for fou/gue encapsulation	Tom Herbert	1	-0/+72
	Add netlink and setup for encapsulation Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ip6_gre: Add support for fou/gue encapsulation	Tom Herbert	1	-4/+75
	Add netlink and setup for encapsulation Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	fou: Add encap ops for IPv6 tunnels	Tom Herbert	3	-1/+142
	This patch adds a new fou6 module that provides encapsulation operations for IPv6. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ip6_tun: Add infrastructure for doing encapsulation	Tom Herbert	3	-13/+144
	Add encap_hlen and ip_tunnel_encap structure to ip6_tnl. Add functions for getting encap hlen, setting up encap on a tunnel, performing encapsulation operation. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	fou: Support IPv6 in fou	Tom Herbert	1	-12/+35
	This patch adds receive path support for IPv6 with fou. - Add address family to fou structure for open sockets. This supports AF_INET and AF_INET6. Lookups for fou ports are performed on both the port number and family. - In fou and gue receive adjust tot_len in IPv4 header or payload_len based on address family. - Allow AF_INET6 in FOU_ATTR_AF netlink attribute. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	fou: Split out {fou,gue}_build_header	Tom Herbert	2	-14/+41
	Create __fou_build_header and __gue_build_header. These implement the protocol generic parts of building the fou and gue header. fou_build_header and gue_build_header implement the IPv4 specific functions and call the __*_build_header functions. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	fou: Call setup_udp_tunnel_sock	Tom Herbert	1	-34/+16
	Use helper function to set up UDP tunnel related information for a fou socket. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	net: Cleanup encap items in ip_tunnels.h	Tom Herbert	3	-63/+62
	Consolidate all the ip_tunnel_encap definitions in one spot in the header file. Also, move ip_encap_hlen and ip_tunnel_encap from ip_tunnel.c to ip_tunnels.h so they can be called without a dependency on ip_tunnel module. Similarly, move iptun_encaps to ip_tunnel_core.c. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	ipv6: Change "final" protocol processing for encapsulation	Tom Herbert	1	-1/+14
	When performing foo-over-UDP, UDP packets are processed by the encapsulation handler which returns another protocol to process. This may result in processing two (or more) protocols in the loop that are marked as INET6_PROTO_FINAL. The actions taken for hitting a final protocol, in particular the skb_postpull_rcsum, can only be performed once. This patch set adds a check of whether a final protocol has been seen. The rules are: - If the final protocol has not been seen any protocol is processed (final and non-final). In the case of a final protocol, the final actions are taken (like the skb_postpull_rcsum) - If a final protocol has been seen (e.g. an encapsulating UDP header) then no further non-final protocols are allowed (e.g. extension headers). For more final protocols the final actions are not taken (e.g. skb_postpull_rcsum). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
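The two rules above can be sketched as a small state machine in userspace C. `struct proto` and `walk_chain()` are hypothetical names used for illustration; the real logic lives in the IPv6 input path and operates on skbs rather than on an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each handler in the chain is final or non-final. */
struct proto { bool final; };

/*
 * Walk a chain of protocol handlers and apply the rules. Returns false if
 * the chain is rejected; *final_actions counts how often the one-time
 * "final" work (e.g. skb_postpull_rcsum) would have been performed.
 */
static bool walk_chain(const struct proto *chain, size_t n, int *final_actions)
{
	bool have_final = false;

	assert(final_actions != NULL);
	*final_actions = 0;
	for (size_t i = 0; i < n; i++) {
		if (have_final) {
			if (!chain[i].final)
				return false;  /* non-final after final: reject */
			continue;              /* later finals: no final actions */
		}
		if (chain[i].final) {
			have_final = true;
			(*final_actions)++;    /* performed exactly once */
		}
	}
	return true;
}
```

A chain such as extension header, encapsulating UDP, inner TCP is accepted with the final actions run once, while UDP followed by an extension header is rejected.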
2016-05-20	ipv6: Fix nexthdr for reinjection	Tom Herbert	1	-3/+15
	In ip6_input_finish the nexthdr protocol is retrieved from the next header offset that is returned in the cb of the skb. This method does not work for UDP encapsulation that may not even have a concept of a nexthdr field (e.g. FOU). This patch checks for a final protocol (INET6_PROTO_FINAL) when a protocol handler returns > 0. If the protocol is not final then resubmission is performed on nhoff value. If the protocol is final then the nexthdr is taken to be the return value. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
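The selection rule for the next protocol can be sketched as follows. `next_proto()` and its parameters are hypothetical: `pkt` and `nhoff` stand in for the skb data and the next-header offset stored in the skb control block, and `ret` is the positive value returned by the protocol handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the protocol to resubmit to after a handler
 * returned ret > 0.
 */
static uint8_t next_proto(bool handler_is_final, int ret,
			  const uint8_t *pkt, unsigned int nhoff)
{
	assert(ret > 0 && ret <= 255);
	if (handler_is_final)
		return (uint8_t)ret;  /* e.g. FOU: there is no nexthdr field */
	return pkt[nhoff];            /* non-final: read nexthdr at nhoff */
}
```

So a final handler such as UDP encapsulation names the inner protocol directly through its return value, whereas an extension header still points at its nexthdr byte.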
2016-05-20	net: define gso types for IPx over IPv4 and IPv6	Tom Herbert	19	-50/+37
	This patch defines two new GSO definitions SKB_GSO_IPXIP4 and SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and NETIF_F_GSO_IPXIP6. These are used to describe an IP-in-IP tunnel and what the outer protocol is. The inner protocol can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT are removed (these are both instances of SKB_GSO_IPXIP4). SKB_GSO_IPXIP6 will be used when support for GSO with IP encapsulation over IPv6 is added. Signed-off-by: Tom Herbert <tom@herbertland.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	gso: Remove arbitrary checks for unsupported GSO	Tom Herbert	7	-102/+1
	In several gso_segment functions there are checks of gso_type against a seemingly arbitrary list of SKB_GSO_* flags. This seems like an attempt to identify unsupported GSO types, but since the stack is the one that set these GSO types in the first place this seems unnecessary to do. If a combination isn't valid in the first place the stack should not allow setting it. This is a code simplification, especially for adding new GSO types. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	Revert "phy: add support for a reset-gpio specification"	Fabio Estevam	2	-11/+0
	Commit da47b4572056 ("phy: add support for a reset-gpio specification") causes the following xtensa qemu crash according to Guenter Roeck: [ 9.366256] libphy: ethoc-mdio: probed [ 9.367389] (null): could not attach to PHY [ 9.368555] (null): failed to probe MDIO bus [ 9.371540] Unable to handle kernel paging request at virtual address 0000001c [ 9.371540] pc = d0320926, ra = 903209d1 [ 9.375358] Oops: sig: 11 [#1] This reverts commit da47b4572056487fd7941c26f73b3e8815ff712a. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	xen-netback: only deinitialized hash if it was initialized	Paul Durrant	1	-2/+1
	A domain with a frontend that does not implement a control ring has been seen to cause a crash during domain save. This was apparently because the call to xenvif_deinit_hash() in xenvif_disconnect_ctrl() is made regardless of whether a control ring was connected, and hence whether xenvif_init_hash() was called. This patch brings the call to xenvif_deinit_hash() in xenvif_disconnect_ctrl() inside the if clause that checks whether the control ring event channel was connected. This is sufficient to ensure it is only called if xenvif_init_hash() was called previously. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	mlx5: avoid unused variable warning	Arnd Bergmann	1	-1/+1
	When CONFIG_NET_CLS_ACT is disabled, we get a new warning in the mlx5 ethernet driver because the tc_for_each_action() loop never references the iterator: mellanox/mlx5/core/en_tc.c: In function 'mlx5e_stats_flower': mellanox/mlx5/core/en_tc.c:431:20: error: unused variable 'a' [-Werror=unused-variable] struct tc_action *a; This changes the dummy tc_for_each_action() macro by adding a cast to void, letting the compiler know that the variable is intentionally declared but not used here. I could not come up with a nicer workaround, but this seems to do the trick. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: aad7e08d39bd ("net/mlx5e: Hardware offloaded flower filter statistics support") Fixes: 00175aec941e ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef") Acked-By: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>
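The workaround can be sketched in plain C. The macro and function names below are hypothetical stand-ins, not the kernel's definitions: the `(void)` cast counts as a reference to the iterator, and the comma expression evaluates to 0, so the loop body never runs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stub for an iterator macro when the feature is compiled
 * out. Without the (void)(a), a variable declared only for the loop would
 * trigger -Wunused-variable.
 */
#define for_each_action_stub(a, list) while ((void)(a), 0)

struct action { int id; };

static int count_actions(const void *list)
{
	struct action *a = NULL;
	int n = 0;

	assert(list != NULL);
	for_each_action_stub(a, list)
		n++;  /* never reached in the stubbed configuration */
	return n;
}
```

Calling `count_actions()` on any list returns 0 in this configuration, and the compiler no longer reports `a` as unused.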
2016-05-20	bpf: rather use get_random_int for randomizations	Daniel Borkmann	1	-2/+2
	Start address randomization and blinding in BPF currently use prandom_u32(). prandom_u32() values are not exposed to unprivileged user space to my knowledge, but given other kernel facilities such as ASLR, stack canaries, etc make use of stronger get_random_int(), we better make use of it here as well given blinding requests successively new random values. get_random_int() has minimal entropy pool depletion, is not cryptographically secure, but doesn't need to be for our use cases here. Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	qede: Fix DMA address APIs usage	Manish Chopra	1	-4/+3
	The driver incorrectly uses dma_unmap_addr_set() to set a variable which is in truth a dma_addr_t [i.e. not defined using DEFINE_DMA_UNMAP_ADDR()] and is being used by driver flows other than unmapping physical addresses. This patch fixes the driver fastpath where CONFIG_NEED_DMA_MAP_STATE is not set. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	macsec: fix netlink attribute for key id	Sabrina Dubroca	1	-2/+2
	In my last commit I replaced MACSEC_SA_ATTR_KEYID by MACSEC_SA_ATTR_KEY. Fixes: 8acca6acebd0 ("macsec: key identifier is 128 bits, not 64") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	xen-netback: correct length checks on hash copy_ops	Paul Durrant	1	-2/+2
	The length checks on the grant table copy_ops for setting hash key and hash mapping are checking the local 'len' value which is correct in the case of the former but not the latter. This was picked up by static analysis checks. This patch replaces checks of 'len' with 'copy_op.len' in both cases to correct the incorrect check, keep the two checks consistent, and to make it clear what the checks are for. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	phy: fix crash in fixed_phy_add()	Rabin Vincent	1	-1/+5
	Since e7f4dc3536a ("mdio: Move allocation of interrupts into core"), platforms which call fixed_phy_add() before fixed_mdio_bus_init() is called (for example, because the platform code and the fixed_phy driver use the same initcall level) crash in fixed_phy_add() since the ->mii_bus is not allocated. Also since e7f4dc3536a, these interrupts are initialized to polling by default. The few (old) platforms which directly use fixed_phy_add() from their platform code all pass PHY_POLL for the irq argument, so we can keep these platforms not crashing by simply not attempting to set the irq if PHY_POLL is passed. Also, even if problems have not been reported on more modern platforms which used fixed_phy_register() from drivers' probe functions, we return -EPROBE_DEFER if the MDIO bus is not yet registered so that the probe is retried later. Fixes: e7f4dc3536a400 ("mdio: Move allocation of interrupts into core") Signed-off-by: Rabin Vincent <rabinv@axis.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-20	irqchip: nps: add 64BIT dependency	Arnd Bergmann	1	-0/+1
	The newly added nps irqchip driver causes build warnings on ARM64. include/soc/nps/common.h: In function 'nps_host_reg_non_cl': include/soc/nps/common.h:148:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] As the driver is only used on ARC, we don't need to see it without COMPILE_TEST elsewhere, and we can avoid the warnings by only building on 32-bit architectures even with CONFIG_COMPILE_TEST. Acked-by: Marc Zyngier <narc.zyngier@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19	Revert "net: pegasus: remove dead coding"	David S. Miller	1	-0/+53
	This reverts commit e00be9e4d0ffcc0121606229f0aa4b246d6881d7. It causes warnings and has several problems. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-19	CIFS: Remove some obsolete comments	Steve French	1	-6/+1
	Remove some obsolete comments in the cifs inode_operations structs that were pointed out by Stephen Rothwell. CC: Stephen Rothwell <sfr@canb.auug.org.au> CC: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Steve French <steve.french@primarydata.com>
2016-05-19	cifs: Create dedicated keyring for spnego operations	Sachin Prabhu	3	-2/+71
	The session key is the default keyring set for request_key operations. This session key is revoked when the user owning the session logs out. Any long running daemon processes started by this session ends up with revoked session keyring which prevents these processes from using the request_key mechanism from obtaining the krb5 keys. The problem has been reported by a large number of autofs users. The problem is also seen with multiuser mounts where the share may be used by processes run by a user who has since logged out. A reproducer using automount is available on the Red Hat bz. The patch creates a new keyring which is used to cache cifs spnego upcalls. Red Hat bz: 1267754 Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reported-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <smfrench@gmail.com>
2016-05-19	mm, page_alloc: restore the original nodemask if the fast path allocation failed	Mel Gorman	1	-0/+6
	The page allocator fast path uses either the requested nodemask or cpuset_current_mems_allowed if cpusets are enabled. If the allocation context allows watermarks to be ignored then it can also ignore memory policies. However, on entering the allocator slowpath the nodemask may still be cpuset_current_mems_allowed and the policies are enforced. This patch resets the nodemask appropriately before entering the slowpath. Link: http://lkml.kernel.org/r/20160504143628.GU2858@techsingularity.net Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19	mm, page_alloc: uninline the bad page part of check_new_page()	Vlastimil Babka	1	-16/+17
	Bad pages should be rare so the code handling them doesn't need to be inline for performance reasons. Put it to separate function which returns void. This also assumes that the initial page_expected_state() result will match the result of the thorough check, i.e. the page doesn't become "good" in the meanwhile. This matches the same expectations already in place in free_pages_check().
	!DEBUG_VM bloat-o-meter:
	add/remove: 1/0 grow/shrink: 0/1 up/down: 134/-274 (-140)
	function                     old     new   delta
	check_new_page_bad             -     134    +134
	get_page_from_freelist      3468    3194    -274
	Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19	mm, page_alloc: don't duplicate code in free_pcp_prepare	Mel Gorman	1	-78/+55
	The new free_pcp_prepare() function shares a lot of code with free_pages_prepare(), which makes this a maintenance risk when some future patch modifies only one of them. We should be able to achieve the same effect (skipping free_pages_check() from !DEBUG_VM configs) by adding a parameter to free_pages_prepare() and making it inline, so the checks (and the order != 0 parts) are eliminated from the call from free_pcp_prepare(). !DEBUG_VM: bloat-o-meter reports no difference, as my gcc was already inlining free_pages_prepare() and the elimination seems to work as expected
	DEBUG_VM bloat-o-meter:
	add/remove: 0/1 grow/shrink: 2/0 up/down: 1035/-778 (257)
	function                 old     new   delta
	__free_pages_ok          297    1060    +763
	free_hot_cold_page       480     752    +272
	free_pages_prepare       778       -    -778
	Here inlining didn't occur before, and added some code, but it's ok for a debug option. [akpm@linux-foundation.org: fix build] Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19	mm, page_alloc: defer debugging checks of pages allocated from the PCP	Mel Gorman	1	-28/+64
	Every page allocated checks a number of page fields for validity. This catches corruption bugs of pages that are already freed but it is expensive. This patch weakens the debugging check by checking PCP pages only when the PCP lists are being refilled. All compound pages are checked. This potentially avoids debugging checks entirely if the PCP lists are never emptied and refilled so some corruption issues may be missed. Full checking requires DEBUG_VM.

	With the two deferred debugging patches applied, the impact to a page allocator microbenchmark is

	                              4.6.0-rc3            4.6.0-rc3
	                            inline-v3r6      deferalloc-v3r7
	Min alloc-odr0-1       344.00 (  0.00%)     317.00 (  7.85%)
	Min alloc-odr0-2       248.00 (  0.00%)     231.00 (  6.85%)
	Min alloc-odr0-4       209.00 (  0.00%)     192.00 (  8.13%)
	Min alloc-odr0-8       181.00 (  0.00%)     166.00 (  8.29%)
	Min alloc-odr0-16      168.00 (  0.00%)     154.00 (  8.33%)
	Min alloc-odr0-32      161.00 (  0.00%)     148.00 (  8.07%)
	Min alloc-odr0-64      158.00 (  0.00%)     145.00 (  8.23%)
	Min alloc-odr0-128     156.00 (  0.00%)     143.00 (  8.33%)
	Min alloc-odr0-256     168.00 (  0.00%)     154.00 (  8.33%)
	Min alloc-odr0-512     178.00 (  0.00%)     167.00 (  6.18%)
	Min alloc-odr0-1024    186.00 (  0.00%)     174.00 (  6.45%)
	Min alloc-odr0-2048    192.00 (  0.00%)     180.00 (  6.25%)
	Min alloc-odr0-4096    198.00 (  0.00%)     184.00 (  7.07%)
	Min alloc-odr0-8192    200.00 (  0.00%)     188.00 (  6.00%)
	Min alloc-odr0-16384   201.00 (  0.00%)     188.00 (  6.47%)
	Min free-odr0-1        189.00 (  0.00%)     180.00 (  4.76%)
	Min free-odr0-2        132.00 (  0.00%)     126.00 (  4.55%)
	Min free-odr0-4        104.00 (  0.00%)      99.00 (  4.81%)
	Min free-odr0-8         90.00 (  0.00%)      85.00 (  5.56%)
	Min free-odr0-16        84.00 (  0.00%)      80.00 (  4.76%)
	Min free-odr0-32        80.00 (  0.00%)      76.00 (  5.00%)
	Min free-odr0-64        78.00 (  0.00%)      74.00 (  5.13%)
	Min free-odr0-128       77.00 (  0.00%)      73.00 (  5.19%)
	Min free-odr0-256       94.00 (  0.00%)      91.00 (  3.19%)
	Min free-odr0-512      108.00 (  0.00%)     112.00 ( -3.70%)
	Min free-odr0-1024     115.00 (  0.00%)     118.00 ( -2.61%)
	Min free-odr0-2048     120.00 (  0.00%)     125.00 ( -4.17%)
	Min free-odr0-4096     123.00 (  0.00%)     129.00 ( -4.88%)
	Min free-odr0-8192     126.00 (  0.00%)     130.00 ( -3.17%)
	Min free-odr0-16384    126.00 (  0.00%)     131.00 ( -3.97%)

	Note that the free paths for large numbers of pages are impacted as the debugging cost gets shifted into that path when the page data is no longer necessarily cache-hot.

	Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
	Acked-by: Vlastimil Babka <vbabka@suse.cz>
	Cc: Jesper Dangaard Brouer <brouer@redhat.com>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
	Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
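
	The trade-off described in the message above (fewer checks, at the price of possibly missing corruption on pages that stay on the PCP list) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel implementation: `Page`, `ToyAllocator`, `recycle` and the `debug_vm` flag are hypothetical names invented for illustration, a single Python list stands in for the per-CPU (PCP) list, and a boolean `corrupt` field stands in for the page fields the real check inspects.

```python
from dataclasses import dataclass


@dataclass
class Page:
    pfn: int
    corrupt: bool = False  # stand-in for a bad flags/refcount/mapping field


class ToyAllocator:
    """Toy model: one backing free list feeding one per-CPU (PCP) list."""

    def __init__(self, pages, batch=4, debug_vm=False):
        self.backing = list(pages)  # models the pages held outside the PCP list
        self.pcp = []               # models the per-CPU page list
        self.batch = batch          # pages moved per refill
        self.debug_vm = debug_vm    # models a DEBUG_VM build (full checking)
        self.checks = 0             # number of validity checks performed

    def _check(self, page):
        self.checks += 1
        if page.corrupt:
            raise RuntimeError(f"bad page state, pfn {page.pfn}")

    def _refill(self):
        for _ in range(min(self.batch, len(self.backing))):
            page = self.backing.pop(0)
            if not self.debug_vm:
                self._check(page)   # deferred mode: check only at refill time
            self.pcp.append(page)

    def alloc(self):
        if not self.pcp:
            self._refill()
        if not self.pcp:
            raise MemoryError("out of pages")
        page = self.pcp.pop()
        if self.debug_vm:
            self._check(page)       # full mode: check on every allocation
        return page

    def free(self, page):
        self.pcp.append(page)       # freed page goes straight back to the PCP list


def recycle(allocator, rounds):
    """Allocate and immediately free one page, `rounds` times."""
    for _ in range(rounds):
        allocator.free(allocator.alloc())
    return allocator.checks
```

	In this model, recycling a single page 100 times with `batch=4` performs 4 checks in deferred mode (one refill) and 100 checks with `debug_vm=True`. A page that is corrupted while it sits on the PCP list is handed out again unnoticed in deferred mode, which mirrors the "some corruption issues may be missed" caveat in the message.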
2016-05-19	mm, page_alloc: defer debugging checks of freed pages until a PCP drain	Mel Gorman	1	-51/+101
	Every page free checks a number of page fields for validity. This catches premature frees and corruptions but it is also expensive. This patch weakens the debugging check by checking PCP pages at the time they are drained from the PCP list. This will trigger the bug but the site that freed the corrupt page will be lost. To get the full context, a kernel rebuild with DEBUG_VM is necessary. [akpm@linux-foundation.org: fix build] Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19	cpuset: use static key better and convert to new API	Vlastimil Babka	3	-22/+36
	An important function for cpusets is cpuset_node_allowed(), which optimizes on the fact if there's a single root CPU set, it must be trivially allowed. But the check "nr_cpusets() <= 1" doesn't use the cpusets_enabled_key static key the right way where static keys eliminate branching overhead with jump labels.

	This patch converts it so that static key is used properly. It's also switched to the new static key API and the checking functions are converted to return bool instead of int. We also provide a new variant __cpuset_zone_allowed() which expects that the static key check was already done and the key was enabled. This is needed for get_page_from_freelist() where we want to also avoid the relatively slower check when ALLOC_CPUSET is not set in alloc_flags.

	The impact on the page allocator microbenchmark is less than expected but the cleanup in itself is worthwhile.

	                              4.6.0-rc2            4.6.0-rc2
	                        multcheck-v1r20         cpuset-v1r20
	Min alloc-odr0-1       348.00 (  0.00%)     348.00 (  0.00%)
	Min alloc-odr0-2       254.00 (  0.00%)     254.00 (  0.00%)
	Min alloc-odr0-4       213.00 (  0.00%)     213.00 (  0.00%)
	Min alloc-odr0-8       186.00 (  0.00%)     183.00 (  1.61%)
	Min alloc-odr0-16      173.00 (  0.00%)     171.00 (  1.16%)
	Min alloc-odr0-32      166.00 (  0.00%)     163.00 (  1.81%)
	Min alloc-odr0-64      162.00 (  0.00%)     159.00 (  1.85%)
	Min alloc-odr0-128     160.00 (  0.00%)     157.00 (  1.88%)
	Min alloc-odr0-256     169.00 (  0.00%)     166.00 (  1.78%)
	Min alloc-odr0-512     180.00 (  0.00%)     180.00 (  0.00%)
	Min alloc-odr0-1024    188.00 (  0.00%)     187.00 (  0.53%)
	Min alloc-odr0-2048    194.00 (  0.00%)     193.00 (  0.52%)
	Min alloc-odr0-4096    199.00 (  0.00%)     198.00 (  0.50%)
	Min alloc-odr0-8192    202.00 (  0.00%)     201.00 (  0.50%)
	Min alloc-odr0-16384   203.00 (  0.00%)     202.00 (  0.49%)

	Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
	Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
	Acked-by: Zefan Li <lizefan@huawei.com>
	Cc: Jesper Dangaard Brouer <brouer@redhat.com>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
	Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>