aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-07-31tcp: syncookies: create mptcp request socket for ACK cookies with MPTCP optionFlorian Westphal1-0/+2
If SYN packet contains MP_CAPABLE option, keep it enabled. Syncokie validation and cookie-based socket creation is changed to instantiate an mptcp request sockets if the ACK contains an MPTCP connection request. Rather than extend both cookie_v4/6_check, add a common helper to create the (mp)tcp request socket. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31mptcp: subflow: add mptcp_subflow_init_cookie_req helperFlorian Westphal1-0/+10
Will be used to initialize the mptcp request socket when a MP_CAPABLE request was handled in syncookie mode, i.e. when a TCP ACK containing a MP_CAPABLE option is a valid syncookie value. Normally (non-cookie case), MPTCP will generate a unique 32 bit connection ID and stores it in the MPTCP token storage to be able to retrieve the mptcp socket for subflow joining. In syncookie case, we do not want to store any state, so just generate the unique ID and use it in the reply. This means there is a small window where another connection could generate the same token. When Cookie ACK comes back, we check that the token has not been registered in the mean time. If it was, the connection needs to fall back to TCP. Changes in v2: - use req->syncookie instead of passing 'want_cookie' arg to ->init_req() (Eric Dumazet) Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31mptcp: rename and export mptcp_subflow_request_sock_opsFlorian Westphal1-0/+1
syncookie code path needs to create an mptcp request sock. Prepare for this and add mptcp prefix plus needed export of ops struct. Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31tcp: rename request_sock cookie_ts bit to syncookieFlorian Westphal1-1/+1
Nowadays output function has a 'synack_type' argument that tells us when the syn/ack is emitted via syncookies. The request already tells us when timestamps are supported, so check both to detect special timestamp for tcp option encoding is needed. We could remove cookie_ts altogether, but a followup patch would otherwise need to adjust function signatures to pass 'want_cookie' to mptcp core. This way, the 'existing' bit can be used. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-31Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextDavid S. Miller6-15/+233
Johan Hedberg says: ==================== pull request: bluetooth-next 2020-07-31 Here's the main bluetooth-next pull request for 5.9: - Fix firmware filenames for Marvell chipsets - Several suspend-related fixes - Addedd mgmt commands for runtime configuration - Multiple fixes for Qualcomm-based controllers - Add new monitoring feature for mgmt - Fix handling of legacy cipher (E4) together with security level 4 - Add support for Realtek 8822CE controller - Fix issues with Chinese controllers using fake VID/PID values - Multiple other smaller fixes & improvements ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller2-1/+5
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-07-30 Please note that I did the first time now --no-ff merges of my testing branch into the master branch to include the [PATCH 0/n] message of a patchset. Please let me know if this is desirable, or if I should do it any different. 1) Introduce a oseq-may-wrap flag to disable anti-replay protection for manually distributed ICVs as suggested in RFC 4303. From Petr Vaněk. 2) Patchset to fully support IPCOMP for vti4, vti6 and xfrm interfaces. From Xin Long. 3) Switch from a linear list to a hash list for xfrm interface lookups. From Eyal Birger. 4) Fixes to not register one xfrm(6)_tunnel object twice. From Xin Long. 5) Fix two compile errors that were introduced with the IPCOMP support for vti and xfrm interfaces. Also from Xin Long. 6) Make the policy hold queue work with VTI. This was forgotten when VTI was implemented. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-30Bluetooth: Enable controller RPA resolution using Experimental featureSathish Narasimman1-0/+1
This patch adds support to enable the use of RPA Address resolution using expermental feature mgmt command. Signed-off-by: Sathish Narasimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-07-30Bluetooth: Enable RPA TimeoutSathish Narasimman1-0/+2
Enable RPA timeout during bluetooth initialization. The RPA timeout value is used from hdev, which initialized from debug_fs Signed-off-by: Sathish Narasimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-07-30Bluetooth: Configure controller address resolution if availableMarcel Holtmann1-0/+3
When the LL Privacy support is available, then as part of enabling or disabling passive background scanning, it is required to set up the controller based address resolution as well. Since only passive background scanning is utilizing the whitelist, the address resolution is now bound to the whitelist and passive background scanning. All other resolution can be easily done by the host stack. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sathish Narsimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-07-30Bluetooth: Translate additional address type correctlyMarcel Holtmann1-2/+4
When using controller based address resolution, then the new address types 0x02 and 0x03 are used. These types need to be converted back into either public address or random address types. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sathish Narsimman <sathish.narasimman@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-07-28fib: use indirect call wrappers in the most common fib_rules_opsBrian Vazquez1-0/+18
This avoids another inderect call per RX packet which save us around 20-40 ns. Changelog: v1 -> v2: - Move declaraions to fib_rules.h to remove warnings Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28Add pldmfw library for PLDM firmware updateJacob Keller1-0/+165
The pldmfw library is used to implement common logic needed to flash devices based on firmware files using the format described by the PLDM for Firmware Update standard. This library consists of logic to parse the PLDM file format from a firmware file object, as well as common logic for sending the relevant PLDM header data to the device firmware. A simple ops table is provided so that device drivers can implement device specific hardware interactions while keeping the common logic to the pldmfw library. This library will be used by the Intel ice networking driver as part of implementing device flash update via devlink. The library aims to be vendor and device agnostic. For this reason, it has been placed in lib/pldmfw, in the hopes that other devices which use the PLDM firmware file format may benefit from it in the future. However, do note that not all features defined in the PLDM standard have been implemented. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: improve the user pointer check in init_user_sockptrChristoph Hellwig1-12/+6
Make sure not just the pointer itself but the whole range lies in the user address space. For that pass the length and then use the access_ok helper to do the check. Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces") Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: remove sockptr_advanceChristoph Hellwig1-14/+13
sockptr_advance never properly worked. Replace it with _offset variants of copy_from_sockptr and copy_to_sockptr. Fixes: ba423fdaa589 ("net: add a new sockptr_t type") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net: make sockptr_is_null strict aliasing safeChristoph Hellwig1-1/+3
While the kernel in general is not strict aliasing safe we can trivially do that in sockptr_is_null without affecting code generation, so always check the actually assigned union member. Reported-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-28net/mlx5: Hold pages RB tree per VFEran Ben Elisha1-1/+1
Per page request event, FW request to allocated or release pages for a single function. Driver maintains FW pages object per function, so there is no need to hold one global page data-base. Instead, have a page data-base per function, which will improve performance release flow in all cases, especially for "release all pages". As the range of function IDs is large and not sequential, use xarray to store a per function ID page data-base, where the function ID is the key. Upon first allocation of a page to a function ID, create the page data-base per function. This data-base will be released only at pagealloc mechanism cleanup. NIC: ConnectX-4 Lx CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz Test case: 32 VFs, measure release pages on one VF as part of FLR Before: 0.021 Sec After: 0.014 Sec The improvement depends on amount of VFs and memory utilization by them. Time measurements above were taken from idle system. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-28Bluetooth: btusb: Fix and detect most of the Chinese Bluetooth controllersIsmael Ferreras Morezuelas2-0/+13
For some reason they tend to squat on the very first CSR/ Cambridge Silicon Radio VID/PID instead of paying fees. This is an extremely common problem; the issue goes as back as 2013 and these devices are only getting more popular, even rebranded by reputable vendors and sold by retailers everywhere. So, at this point in time there are hundreds of modern dongles reusing the ID of what originally was an early Bluetooth 1.1 controller. Linux is the only place where they don't work due to spotty checks in our detection code. It only covered a minimum subset. So what's the big idea? Take advantage of the fact that all CSR chips report the same internal version as both the LMP sub-version and HCI revision number. It always matches, couple that with the manufacturer code, that rarely lies, and we now have a good idea of who is who. Additionally, by compiling a list of user-reported HCI/lsusb dumps, and searching around for legit CSR dongles in similar product ranges we can find what CSR BlueCore firmware supported which Bluetooth versions. That way we can narrow down ranges of fakes for each of them. e.g. Real CSR dongles with LMP subversion 0x73 are old enough that support BT 1.1 only; so it's a dead giveaway when some third-party BT 4.0 dongle reuses it. So, to sum things up; there are multiple classes of fake controllers reusing the same 0A12:0001 VID/PID. This has been broken for a while. Known 'fake' bcdDevices: 0x0100, 0x0134, 0x1915, 0x2520, 0x7558, 0x8891 IC markings on 0x7558: FR3191AHAL 749H15143 (???) https://bugzilla.kernel.org/show_bug.cgi?id=60824 Fixes: 81cac64ba258ae (Deal with USB devices that are faking CSR vendor) Reported-by: Michał Wiśniewski <brylozketrzyn@gmail.com> Tested-by: Mike Johnson <yuyuyak@gmail.com> Tested-by: Ricardo Rodrigues <ekatonb@gmail.com> Tested-by: M.Hanny Sabbagh <mhsabbagh@outlook.com> Tested-by: Oussama BEN BRAHIM <b.brahim.oussama@gmail.com> Tested-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-07-27hsr: enhance netlink socket interface to support PRPMurali Karicheri2-2/+12
Parallel Redundancy Protocol (PRP) is another redundancy protocol introduced by IEC 63439 standard. It is similar to HSR in many aspects:- - Use a pair of Ethernet interfaces to created the PRP device - Use a 6 byte redundancy protocol part (RCT, Redundancy Check Trailer) similar to HSR Tag. - Has Link Redundancy Entity (LRE) that works with RCT to implement redundancy. Key difference is that the protocol unit is a trailer instead of a prefix as in HSR. That makes it inter-operable with tradition network components such as bridges/switches which treat it as pad bytes, whereas HSR nodes requires some kind of translators (Called redbox) to talk to regular network devices. This features allows regular linux box to be converted to a DAN-P box. DAN-P stands for Dual Attached Node - PRP similar to DAN-H (Dual Attached Node - HSR). Add a comment at the header/source code to explicitly state that the driver files also handles PRP protocol as well. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller25-24/+157
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25Merge tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into masterLinus Torvalds1-1/+4
Pull x86 fixes from Ingo Molnar: "Misc fixes: - Fix a section end page alignment assumption that was causing crashes - Fix ORC unwinding on freshly forked tasks which haven't executed yet and which have empty user task stacks - Fix the debug.exception-trace=1 sysctl dumping of user stacks, which was broken by recent maccess changes" * tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/dumpstack: Dump user space code correctly again x86/stacktrace: Fix reliable check for empty user task stacks x86/unwind/orc: Fix ORC for newly forked tasks x86, vmlinux.lds: Page-align end of ..page_aligned sections
2020-07-25Merge tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into masterLinus Torvalds1-0/+1
Pull EFI fixes from Ingo Molnar: "Various EFI fixes: - Fix the layering violation in the use of the EFI runtime services availability mask in users of the 'efivars' abstraction - Revert build fix for GCC v4.8 which is no longer supported - Clean up some x86 EFI stub details, some of which are borderline bugs that copy around garbage into padding fields - let's fix these out of caution. - Fix build issues while working on RISC-V support - Avoid --whole-archive when linking the stub on arm64" * tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Revert "efi/x86: Fix build with gcc 4" efi/efivars: Expose RT service availability via efivars abstraction efi/libstub: Move the function prototypes to header file efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds efi/libstub/arm64: link stub lib.a conditionally efi/x86: Only copy upto the end of setup_header efi/x86: Remove unused variables
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into masterLinus Torvalds3-4/+5
Pull networking fixes from David Miller: 1) Fix RCU locaking in iwlwifi, from Johannes Berg. 2) mt76 can access uninitialized NAPI struct, from Felix Fietkau. 3) Fix race in updating pause settings in bnxt_en, from Vasundhara Volam. 4) Propagate error return properly during unbind failures in ax88172a, from George Kennedy. 5) Fix memleak in adf7242_probe, from Liu Jian. 6) smc_drv_probe() can leak, from Wang Hai. 7) Don't muck with the carrier state if register_netdevice() fails in the bonding driver, from Taehee Yoo. 8) Fix memleak in dpaa_eth_probe, from Liu Jian. 9) Need to check skb_put_padto() return value in hsr_fill_tag(), from Murali Karicheri. 10) Don't lose ionic RSS hash settings across FW update, from Shannon Nelson. 11) Fix clobbered SKB control block in act_ct, from Wen Xu. 12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang. 13) IS_UDPLITE cleanup a long time ago, incorrectly handled transformations involving UDPLITE_RECV_CC. From Miaohe Lin. 14) Unbalanced locking in netdevsim, from Taehee Yoo. 15) Suppress false-positive error messages in qed driver, from Alexander Lobakin. 16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye. 17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost. 18) Uninitialized value in geneve_changelink(), from Cong Wang. 19) Fix deadlock in xen-netfront, from Andera Righi. 19) flush_backlog() frees skbs with IRQs disabled, so should use dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov Kasiviswanathan. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits) drivers/net/wan: lapb: Corrected the usage of skb_cow dev: Defer free of skbs in flush_backlog qrtr: orphan socket in qrtr_release() xen-netfront: fix potential deadlock in xennet_remove() flow_offload: Move rhashtable inclusion to the source file geneve: fix an uninitialized value in geneve_changelink() bonding: check return value of register_netdevice() in bond_newlink() tcp: allow at most one TLP probe per flight AX.25: Prevent integer overflows in connect and sendmsg cxgb4: add missing release on skb in uld_send() net: atlantic: fix PTP on AQC10X AX.25: Prevent out-of-bounds read in ax25_sendmsg() sctp: shrink stream outq when fails to do addstream reconf sctp: shrink stream outq only when new outcnt < old outcnt AX.25: Fix out-of-bounds read in ax25_connect() enetc: Remove the mdio bus on PF probe bailout net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload net: ethernet: ave: Fix error returns in ave_init drivers/net/wan/x25_asy: Fix to make it work ipvs: fix the connection sync failed in some cases ...
2020-07-24icmp6: support rfc 4884Willem de Bruijn3-0/+3
Extend the rfc 4884 read interface introduced for ipv4 in commit eba75c587e81 ("icmp: support rfc 4884") to ipv6. Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884. Changes v1->v2: - make ipv6_icmp_error_rfc4884 static (file scope) Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24icmp: prepare rfc 4884 for ipv6Willem de Bruijn1-1/+2
The RFC 4884 spec is largely the same between IPv4 and IPv6. Factor out the IPv4 specific parts in preparation for IPv6 support: - icmp types supported - icmp header size, and thus offset to original datagram start - datagram length field offset in icmp(6)hdr. - datagram length field word size: 4B for IPv4, 8B for IPv6. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: optimize the sockptr_t for unified kernel/user address spacesChristoph Hellwig1-2/+30
For architectures like x86 and arm64 we don't need the separate bit to indicate that a pointer is a kernel pointer as the address spaces are unified. That way the sockptr_t can be reduced to a union of two pointers, which leads to nicer calling conventions. The only caveat is that we need to check that users don't pass in kernel address and thus gain access to kernel memory. Thus the USER_SOCKPTR helper is replaced with a init_user_sockptr function that does this check and returns an error if it fails. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: pass a sockptr_t into ->setsockoptChristoph Hellwig7-10/+13
Rework the remaining setsockopt code to pass a sockptr_t instead of a plain user pointer. This removes the last remaining set_fs(KERNEL_DS) outside of architecture specific code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154] Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/tcp: switch ->md5_parse to sockptr_tChristoph Hellwig1-1/+1
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/udp: switch udp_lib_setsockopt to sockptr_tChristoph Hellwig1-1/+1
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/ipv6: switch ipv6_flowlabel_opt to sockptr_tChristoph Hellwig1-1/+1
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Note that the get case is pretty weird in that it actually copies data back to userspace from setsockopt. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/ipv6: switch ip6_mroute_setsockopt to sockptr_tChristoph Hellwig1-4/+4
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/ipv4: merge ip_options_get and ip_options_get_from_userChristoph Hellwig1-3/+2
Use the sockptr_t type to merge the versions. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/ipv4: switch ip_mroute_setsockopt to sockptr_tChristoph Hellwig1-2/+3
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24bpfilter: switch bpfilter_ip_set_sockopt to sockptr_tChristoph Hellwig1-3/+3
This is mostly to prepare for cleaning up the callers, as bpfilter by design can't handle kernel pointers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24netfilter: switch nf_setsockopt to sockptr_tChristoph Hellwig1-2/+4
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24netfilter: switch xt_copy_counters to sockptr_tChristoph Hellwig1-2/+2
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/xfrm: switch xfrm_user_policy to sockptr_tChristoph Hellwig1-3/+5
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: switch sock_set_timeout to sockptr_tChristoph Hellwig1-1/+2
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: switch copy_bpf_fprog_from_user to sockptr_tChristoph Hellwig1-1/+2
Pass a sockptr_t to prepare for set_fs-less handling of the kernel pointer from bpf-cgroup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: add a new sockptr_t typeChristoph Hellwig1-0/+104
Add a uptr_t type that can hold a pointer to either a user or kernel memory region, and simply helpers to copy to and from it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/sched: cls_flower: Add hash info to flow classificationAriel Levkovich1-0/+3
Adding new cls flower keys for hash value and hash mask and dissect the hash info from the skb into the flow key towards flow classication. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net/flow_dissector: add packet hash dissectionAriel Levkovich2-0/+13
Retreive a hash value from the SKB and store it in the dissector key for future matching. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24flow_offload: Move rhashtable inclusion to the source fileHerbert Xu1-1/+0
I noticed that touching linux/rhashtable.h causes lib/vsprintf.c to be rebuilt. This dependency came through a bogus inclusion in the file net/flow_offload.h. This patch moves it to the right place. This patch also removes a lingering rhashtable inclusion in cls_api created by the same commit. Fixes: 4e481908c51b ("flow_offload: move tc indirect block to...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24Merge branch 'akpm' into master (patches from Andrew)Linus Torvalds2-2/+6
Merge misc fixes from Andrew Morton: "Subsystems affected by this patch series: mm/pagemap, mm/shmem, mm/hotfixes, mm/memcg, mm/hugetlb, mailmap, squashfs, scripts, io-mapping, MAINTAINERS, and gdb" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: scripts/gdb: fix lx-symbols 'gdb.error' while loading modules MAINTAINERS: add KCOV section io-mapping: indicate mapping failure scripts/decode_stacktrace: strip basepath from all paths squashfs: fix length field overlap check in metadata reading mailmap: add entry for Mike Rapoport khugepaged: fix null-pointer dereference due to race mm/hugetlb: avoid hardcoding while checking if cma is enabled mm: memcg/slab: fix memory leak at non-root kmem_cache destroy mm/memcg: fix refcount error while moving and swapping mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages() mm: initialize return of vm_insert_pages vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
2020-07-24Merge tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm into masterLinus Torvalds1-0/+1
Pull device mapper fix from Mike Snitzer: "A stable fix for DM integrity target's integrity recalculation that gets skipped when resuming a device. This is a fix for a previous stable@ fix" * tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm integrity: fix integrity recalculation that is improperly skipped
2020-07-24Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into masterLinus Torvalds1-1/+1
Pull i2c fixes from Wolfram Sang: "Again some driver bugfixes and some documentation fixes" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i2c-qcom-geni: Fix DMA transfer race i2c: rcar: always clear ICSAR to avoid side effects MAINTAINERS: i2c: at91: handover maintenance to Codrin Ciubotariu i2c: drop duplicated word in the header file i2c: cadence: Clear HOLD bit at correct time in Rx path Revert "i2c: cadence: Fix the hold bit setting"
2020-07-24io-mapping: indicate mapping failureMichael J. Ruhl1-1/+4
The !ATOMIC_IOMAP version of io_maping_init_wc will always return success, even when the ioremap fails. Since the ATOMIC_IOMAP version returns NULL when the init fails, and callers check for a NULL return on error this is unexpected. During a device probe, where the ioremap failed, a crash can look like this: BUG: unable to handle page fault for address: 0000000000210000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP CPU: 0 PID: 177 Comm: RIP: 0010:fill_page_dma [i915] gen8_ppgtt_create [i915] i915_ppgtt_create [i915] intel_gt_init [i915] i915_gem_init [i915] i915_driver_probe [i915] pci_device_probe really_probe driver_probe_device The remap failure occurred much earlier in the probe. If it had been propagated, the driver would have exited with an error. Return NULL on ioremap failure. [akpm@linux-foundation.org: detect ioremap_wc() errors earlier] Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping") Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-24vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right wayChengguang Xu1-1/+2
After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc"), simple xattr entry is allocated with kvmalloc() instead of kmalloc(), so we should release it with kvfree() instead of kfree(). Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc") Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Chris Down <chris@chrisdown.name> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [5.7] Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-23net: dsa: stop overriding master's ndo_get_phys_port_nameVladimir Oltean1-23/+0
The purpose of this override is to give the user an indication of what the number of the CPU port is (in DSA, the CPU port is a hardware implementation detail and not a network interface capable of traffic). However, it has always failed (by design) at providing this information to the user in a reliable fashion. Prior to commit 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers"), the behavior was to only override this callback if it was not provided by the DSA master. That was its first failure: if the DSA master itself was a DSA port or a switchdev, then the user would not see the number of the CPU port in /sys/class/net/eth0/phys_port_name, but the number of the DSA master port within its respective physical switch. But that was actually ok in a way. The commit mentioned above changed that behavior, and now overrides the master's ndo_get_phys_port_name unconditionally. That comes with problems of its own, which are worse in a way. The idea is that it's typical for switchdev users to have udev rules for consistent interface naming. These are based, among other things, on the phys_port_name attribute. If we let the DSA switch at the bottom to start randomly overriding ndo_get_phys_port_name with its own CPU port, we basically lose any predictability in interface naming, or even uniqueness, for that matter. So, there are reasons to let DSA override the master's callback (to provide a consistent interface, a number which has a clear meaning and must not be interpreted according to context), and there are reasons to not let DSA override it (it breaks udev matching for the DSA master). But, there is an alternative method for users to retrieve the number of the CPU port of each DSA switch in the system: $ devlink port pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0 pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2 pci/0000:00:00.5/4: type notset flavour cpu port 4 spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0 spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1 spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2 spi/spi2.0/4: type notset flavour cpu port 4 spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0 spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1 spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2 spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3 spi/spi2.1/4: type notset flavour cpu port 4 So remove this duplicated, unreliable and troublesome method. From this patch on, the phys_port_name attribute of the DSA master will only contain information about itself (if at all). If the users need reliable information about the CPU port they're probably using devlink anyway. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23tcp: allow at most one TLP probe per flightYuchung Cheng1-2/+4
Previously TLP may send multiple probes of new data in one flight. This happens when the sender is cwnd limited. After the initial TLP containing new data is sent, the sender receives another ACK that acks partial inflight. It may re-arm another TLP timer to send more, if no further ACK returns before the next TLP timeout (PTO) expires. The sender may send in theory a large amount of TLP until send queue is depleted. This only happens if the sender sees such irregular uncommon ACK pattern. But it is generally undesirable behavior during congestion especially. The original TLP design restrict only one TLP probe per inflight as published in "Reducing Web Latency: the Virtue of Gentle Aggression", SIGCOMM 2013. This patch changes TLP to send at most one probe per inflight. Note that if the sender is app-limited, TLP retransmits old data and did not have this issue. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23dm integrity: fix integrity recalculation that is improperly skippedMikulas Patocka1-0/+1
Commit adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 ("dm: report suspended device during destroy") broke integrity recalculation. The problem is dm_suspended() returns true not only during suspend, but also during resume. So this race condition could occur: 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work) 2. integrity_recalc (&ic->recalc_work) preempts the current thread 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret; 4. integrity_recalc exits and no recalculating is done. To fix this race condition, add a function dm_post_suspending that is only true during the postsuspend phase and use it instead of dm_suspended(). Signed-off-by: Mikulas Patocka <mpatocka redhat com> Fixes: adc0daad366b ("dm: report suspended device during destroy") Cc: stable vger kernel org # v4.18+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>