aboutsummaryrefslogtreecommitdiffstats
path: root/include (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-11-07udp: add support for UDP_GRO cmsgPaolo Abeni1-0/+11
When UDP GRO is enabled, the UDP_GRO cmsg will carry the ingress datagram size. User-space can use such info to compute the original packets layout. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07udp: implement GRO for plain UDP sockets.Paolo Abeni2-1/+3
This is the RX counterpart of commit bec1f6f69736 ("udp: generate gso with UDP_SEGMENT"). When UDP_GRO is enabled, such socket is also eligible for GRO in the rx path: UDP segments directed to such socket are assembled into a larger GSO_UDP_L4 packet. The core UDP GRO support is enabled with setsockopt(UDP_GRO). Initial benchmark numbers: Before: udp rx: 1079 MB/s 769065 calls/s After: udp rx: 1466 MB/s 24877 calls/s This change introduces a side effect in respect to UDP tunnels: after a UDP tunnel creation, now the kernel performs a lookup per ingress UDP packet, while before such lookup happened only if the ingress packet carried a valid internal header csum. rfc v2 -> rfc v3: - fixed typos in macro name and comments - really enforce UDP_GRO_CNT_MAX, instead of UDP_GRO_CNT_MAX + 1 - acquire socket lock in UDP_GRO setsockopt rfc v1 -> rfc v2: - use a new option to enable UDP GRO - use static keys to protect the UDP GRO socket lookup Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07udp: implement complete book-keeping for encap_neededPaolo Abeni2-1/+12
The *encap_needed static keys are enabled by UDP tunnels and several UDP encapsulations type, but they are never turned off. This can cause unneeded overall performance degradation for systems where such features are used transiently. This patch introduces complete book-keeping for such keys, decreasing the usage at socket destruction time, if needed, and avoiding that the same socket could increase the key usage multiple times. rfc v3 -> v1: - add socket lock around udp_tunnel_encap_enable() rfc v2 -> rfc v3: - use udp_tunnel_encap_enable() in setsockopt() Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: fix raw socket lookup device bind matching with VRFsDuncan Eastoe1-1/+12
When there exist a pair of raw sockets one unbound and one bound to a VRF but equal in all other respects, when a packet is received in the VRF context, __raw_v4_lookup() matches on both sockets. This results in the packet being delivered over both sockets, instead of only the raw socket bound to the VRF. The bound device checks in __raw_v4_lookup() are replaced with a call to raw_sk_bound_dev_eq() which correctly handles whether the packet should be delivered over the unbound socket in such cases. In __raw_v6_lookup() the match on the device binding of the socket is similarly updated to use raw_sk_bound_dev_eq() which matches the handling in __raw_v4_lookup(). Importantly raw_sk_bound_dev_eq() takes the raw_l3mdev_accept sysctl into account. Signed-off-by: Duncan Eastoe <deastoe@vyatta.att-mail.com> Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFsMike Manning2-0/+4
Add a sysctl raw_l3mdev_accept to control raw socket lookup in a manner similar to use of tcp_l3mdev_accept for stream and of udp_l3mdev_accept for datagram sockets. Have this default to enabled for reasons of backwards compatibility. This is so as to specify the output device with cmsg and IP_PKTINFO, but using a socket not bound to the corresponding VRF. This allows e.g. older ping implementations to be run with specifying the device but without executing it in the VRF. If the option is disabled, packets received in a VRF context are only handled by a raw socket bound to the VRF, and correspondingly packets in the default VRF are only handled by a socket not bound to any VRF. Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: ensure unbound datagram socket to be chosen when not in a VRFMike Manning1-0/+11
Ensure an unbound datagram skt is chosen when not in a VRF. The check for a device match in compute_score() for UDP must be performed when there is no device match. For this, a failure is returned when there is no device match. This ensures that bound sockets are never selected, even if there is no unbound socket. Allow IPv6 packets to be sent over a datagram skt bound to a VRF. These packets are currently blocked, as flowi6_oif was set to that of the master vrf device, and the ipi6_ifindex is that of the slave device. Allow these packets to be sent by checking the device with ipi6_ifindex has the same L3 scope as that of the bound device of the skt, which is the master vrf device. Note that this check always succeeds if the skt is unbound. Even though the right datagram skt is now selected by compute_score(), a different skt is being returned that is bound to the wrong vrf. The difference between these and stream sockets is the handling of the skt option for SO_REUSEPORT. While the handling when adding a skt for reuse correctly checks that the bound device of the skt is a match, the skts in the hashslot are already incorrect. So for the same hash, a skt for the wrong vrf may be selected for the required port. The root cause is that the skt is immediately placed into a slot when it is created, but when the skt is then bound using SO_BINDTODEVICE, it remains in the same slot. The solution is to move the skt to the correct slot by forcing a rehash. Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: ensure unbound stream socket to be chosen when not in a VRFMike Manning2-0/+19
The commit a04a480d4392 ("net: Require exact match for TCP socket lookups if dif is l3mdev") only ensures that the correct socket is selected for packets in a VRF. However, there is no guarantee that the unbound socket will be selected for packets when not in a VRF. By checking for a device match in compute_score() also for the case when there is no bound device and attaching a score to this, the unbound socket is selected. And if a failure is returned when there is no device match, this ensures that bound sockets are never selected, even if there is no unbound socket. Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-07net: allow binding socket in a VRF when there's an unbound socketRobert Shearman3-10/+21
Change the inet socket lookup to avoid packets arriving on a device enslaved to an l3mdev from matching unbound sockets by removing the wildcard for non sk_bound_dev_if and instead relying on check against the secondary device index, which will be 0 when the input device is not enslaved to an l3mdev and so match against an unbound socket and not match when the input device is enslaved. Change the socket binding to take the l3mdev into account to allow an unbound socket to not conflict sockets bound to an l3mdev given the datapath isolation now guaranteed. Signed-off-by: Robert Shearman <rshearma@vyatta.att-mail.com> Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: Add extack argument to ip_fib_metrics_initDavid Ahern1-1/+2
Add extack argument to ip_fib_metrics_init and add messages for invalid metrics. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: Add extack argument to rtnl_create_linkDavid Ahern1-1/+2
Add extack arg to rtnl_create_link and add messages for invalid number of Tx or Rx queues. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06net: skbuff.h: remove unnecessary unlikely()Yangtao Li1-3/+1
WARN_ON() already contains an unlikely(), so it's not necessary to use unlikely. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds9-5/+75
Pull networking fixes from David Miller: 1) Handle errors mid-stream of an all dump, from Alexey Kodanev. 2) Fix build of openvswitch with certain combinations of netfilter options, from Arnd Bergmann. 3) Fix interactions between GSO and BQL, from Eric Dumazet. 4) Don't put a '/' in RTL8201F's sysfs file name, from Holger Hoffstätte. 5) S390 qeth driver fixes from Julian Wiedmann. 6) Allow ipv6 link local addresses for netconsole when both source and destination are link local, from Matwey V. Kornilov. 7) Fix the BPF program address seen in /proc/kallsyms, from Song Liu. 8) Initialize mutex before use in dsa microchip driver, from Tristram Ha. 9) Out-of-bounds access in hns3, from Yunsheng Lin. 10) Various netfilter fixes from Stefano Brivio, Jozsef Kadlecsik, Jiri Slaby, Florian Westphal, Eric Westbrook, Andrey Ryabinin, and Pablo Neira Ayuso. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (50 commits) net: alx: make alx_drv_name static net: bpfilter: fix iptables failure if bpfilter_umh is disabled sock_diag: fix autoloading of the raw_diag module net: core: netpoll: Enable netconsole IPv6 link local address ipv6: properly check return value in inet6_dump_all() rtnetlink: restore handling of dumpit return value in rtnl_dump_all() net/ipv6: Move anycast init/cleanup functions out of CONFIG_PROC_FS bonding/802.3ad: fix link_failure_count tracking net: phy: realtek: fix RTL8201F sysfs name sctp: define SCTP_SS_DEFAULT for Stream schedulers sctp: fix strchange_flags name for Stream Change Event mlxsw: spectrum: Fix IP2ME CPU policer configuration openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS qed: fix link config error handling net: hns3: Fix for out-of-bounds access when setting pfc back pressure net/mlx4_en: use __netdev_tx_sent_queue() net: do not abort bulk send on BQL status net: bql: add __netdev_tx_sent_queue() s390/qeth: report 25Gbit link speed s390/qeth: sanitize ARP requests ...
2018-11-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller5-5/+48
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains the first batch of Netfilter fixes for your net tree: 1) Fix splat with IPv6 defragmenting locally generated fragments, from Florian Westphal. 2) Fix Incorrect check for missing attribute in nft_osf. 3) Missing INT_MIN & INT_MAX definition for netfilter bridge uapi header, from Jiri Slaby. 4) Revert map lookup in nft_numgen, this is already possible with the existing infrastructure without this extension. 5) Fix wrong listing of set reference counter, make counter synchronous again, from Stefano Brivio. 6) Fix CIDR 0 in hash:net,port,net, from Eric Westbrook. 7) Fix allocation failure with large set, use kvcalloc(). From Andrey Ryabinin. 8) No need to disable BH when fetch ip set comment, patch from Jozsef Kadlecsik. 9) Sanity check for valid sysfs entry in xt_IDLETIMER, from Taehee Yoo. 10) Fix suspicious rcu usage via ip_set() macro at netlink dump, from Jozsef Kadlecsik. 11) Fix setting default timeout via nfnetlink_cttimeout, this comes with preparation patch to add nf_{tcp,udp,...}_pernet() helper. 12) Allow ebtables table nat to be of filter type via nft_compat. From Florian Westphal. 13) Incorrect calculation of next bucket in early_drop, do no bump hash value, update bucket counter instead. From Vasily Khoruzhick. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-05compiler: remove __no_sanitize_address_or_inline againMartin Schwidefsky2-13/+1
The __no_sanitize_address_or_inline and __no_kasan_or_inline defines are almost identical. The only difference is that __no_kasan_or_inline does not have the 'notrace' attribute. To be able to replace __no_sanitize_address_or_inline with the older definition, add 'notrace' to __no_kasan_or_inline and change to two users of __no_sanitize_address_or_inline in the s390 code. The 'notrace' option is necessary for e.g. the __load_psw_mask function in arch/s390/include/asm/processor.h. Without the option it is possible to trace __load_psw_mask which leads to kernel stack overflow. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Pointed-out-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-04Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-5/+2
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfix: - Fix build issues on architectures that don't provide 64-bit cmpxchg Cleanups: - Fix a spelling mistake" * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fix spelling mistake, EACCESS -> EACCES SUNRPC: Use atomic(64)_t for seq_send(64)
2018-11-04Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+1
Pull more timer updates from Thomas Gleixner: "A set of commits for the new C-SKY architecture timers" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dt-bindings: timer: gx6605s SOC timer clocksource/drivers/c-sky: Add gx6605s SOC system timer dt-bindings: timer: C-SKY Multi-processor timer clocksource/drivers/c-sky: Add C-SKY SMP timer
2018-11-03sctp: define SCTP_SS_DEFAULT for Stream schedulersXin Long1-0/+1
According to rfc8260#section-4.3.2, SCTP_SS_DEFAULT is required to defined as SCTP_SS_FCFS or SCTP_SS_RR. SCTP_SS_FCFS is used for SCTP_SS_DEFAULT's value in this patch. Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03sctp: fix strchange_flags name for Stream Change EventXin Long1-0/+2
As defined in rfc6525#section-6.1.3, SCTP_STREAM_CHANGE_DENIED and SCTP_STREAM_CHANGE_FAILED should be used instead of SCTP_ASSOC_CHANGE_DENIED and SCTP_ASSOC_CHANGE_FAILED. To keep the compatibility, fix it by adding two macros. Fixes: b444153fb5a6 ("sctp: add support for generating add stream change event notification") Reported-by: Jianwen Ji <jiji@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-2/+2
Pull x86 fixes from Ingo Molnar: "A number of fixes and some late updates: - make in_compat_syscall() behavior on x86-32 similar to other platforms, this touches a number of generic files but is not intended to impact non-x86 platforms. - objtool fixes - PAT preemption fix - paravirt fixes/cleanups - cpufeatures updates for new instructions - earlyprintk quirk - make microcode version in sysfs world-readable (it is already world-readable in procfs) - minor cleanups and fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compat: Cleanup in_compat_syscall() callers x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT objtool: Support GCC 9 cold subfunction naming scheme x86/numa_emulation: Fix uniform-split numa emulation x86/paravirt: Remove unused _paravirt_ident_32 x86/mm/pat: Disable preemption around __flush_tlb_all() x86/paravirt: Remove GPL from pv_ops export x86/traps: Use format string with panic() call x86: Clean up 'sizeof x' => 'sizeof(x)' x86/cpufeatures: Enumerate MOVDIR64B instruction x86/cpufeatures: Enumerate MOVDIRI instruction x86/earlyprintk: Add a force option for pciserial device objtool: Support per-function rodata sections x86/microcode: Make revision and processor flags world-readable
2018-11-03Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+2
Pull perf updates and fixes from Ingo Molnar: "These are almost all tooling updates: 'perf top', 'perf trace' and 'perf script' fixes and updates, an UAPI header sync with the merge window versions, license marker updates, much improved Sparc support from David Miller, and a number of fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits) perf intel-pt/bts: Calculate cpumode for synthesized samples perf intel-pt: Insert callchain context into synthesized callchains perf tools: Don't clone maps from parent when synthesizing forks perf top: Start display thread earlier tools headers uapi: Update linux/if_link.h header copy tools headers uapi: Update linux/netlink.h header copy tools headers: Sync the various kvm.h header copies tools include uapi: Update linux/mmap.h copy perf trace beauty: Use the mmap flags table generated from headers perf beauty: Wire up the mmap flags table generator to the Makefile perf beauty: Add a generator for MAP_ mmap's flag constants tools include uapi: Update asound.h copy tools arch uapi: Update asm-generic/unistd.h and arm64 unistd.h copies tools include uapi: Update linux/fs.h copy perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc} perf cs-etm: Correct CPU mode for samples perf unwind: Take pgoff into account when reporting elf to libdwfl perf top: Do not use overwrite mode by default perf top: Allow disabling the overwrite mode perf trace: Beautify mount's first pathname arg ...
2018-11-03Merge branch 'core/urgent' into x86/urgent, to pick up objtool fixIngo Molnar190-2483/+7357
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-03net: bql: add __netdev_tx_sent_queue()Eric Dumazet1-0/+20
When qdisc_run() tries to use BQL budget to bulk-dequeue a batch of packets, GSO can later transform this list in another list of skbs, and each skb is sent to device ndo_start_xmit(), one at a time, with skb->xmit_more being set to one but for last skb. Problem is that very often, BQL limit is hit in the middle of the packet train, forcing dev_hard_start_xmit() to stop the bulk send and requeue the end of the list. BQL role is to avoid head of line blocking, making sure a qdisc can deliver high priority packets before low priority ones. But there is no way requeued packets can be bypassed by fresh packets in the qdisc. Aborting the bulk send increases TX softirqs, and hot cache lines (after skb_segment()) are wasted. Note that for TSO packets, we never split a packet in the middle because of BQL limit being hit. Drivers should be able to update BQL counters without flipping/caring about BQL status, if the current skb has xmit_more set. Upper layers are ultimately responsible to stop sending another packet train when BQL limit is hit. Code template in a driver might look like the following : send_doorbell = __netdev_tx_sent_queue(tx_queue, nr_bytes, skb->xmit_more); Note that __netdev_tx_sent_queue() use is not mandatory, since following patch will change dev_hard_start_xmit() to not care about BQL status. But it is highly recommended so that xmit_more full benefits can be reached (less doorbells sent, and less atomic operations as well) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-03mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmaskMichal Hocko2-8/+6
THP allocation mode is quite complex and it depends on the defrag mode. This complexity is hidden in alloc_hugepage_direct_gfpmask from a large part currently. The NUMA special casing (namely __GFP_THISNODE) is however independent and placed in alloc_pages_vma currently. This both adds an unnecessary branch to all vma based page allocation requests and it makes the code more complex unnecessarily as well. Not to mention that e.g. shmem THP used to do the node reclaiming unconditionally regardless of the defrag mode until recently. This was not only unexpected behavior but it was also hardly a good default behavior and I strongly suspect it was just a side effect of the code sharing more than a deliberate decision which suggests that such a layering is wrong. Get rid of the thp special casing from alloc_pages_vma and move the logic to alloc_hugepage_direct_gfpmask. __GFP_THISNODE is applied to the resulting gfp mask only when the direct reclaim is not requested and when there is no explicit numa binding to preserve the current logic. Please note that there's also a slight difference wrt MPOL_BIND now. The previous code would avoid using __GFP_THISNODE if the local node was outside of policy_nodemask(). After this patch __GFP_THISNODE is avoided for all MPOL_BIND policies. So there's a difference that if local node is actually allowed by the bind policy's nodemask, previously __GFP_THISNODE would be added, but now it won't be. From the behavior POV this is still correct because the policy nodemask is used. Link: http://lkml.kernel.org/r/20180925120326.24392-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03include/linux/notifier.h: SRCU: fix ctagsSam Protsenko1-2/+1
ctags indexing ("make tags" command) throws this warning: ctags: Warning: include/linux/notifier.h:125: null expansion of name pattern "\1" This is the result of DEFINE_PER_CPU() macro expansion. Fix that by getting rid of line break. Similar fix was already done in commit 25528213fe9f ("tags: Fix DEFINE_PER_CPU expansions"), but this one probably wasn't noticed. Link: http://lkml.kernel.org/r/20181030202808.28027-1-semen.protsenko@linaro.org Fixes: 9c80172b902d ("kernel/SRCU: provide a static initializer") Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-11-03netfilter: conntrack: add nf_{tcp,udp,sctp,icmp,dccp,icmpv6,generic}_pernet()Pablo Neira Ayuso1-0/+39
Expose these functions to access conntrack protocol tracker netns area, nfnetlink_cttimeout needs this. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-03netfilter: ipset: Correct rcu_dereference() call in ip_set_put_comment()Jozsef Kadlecsik1-2/+2
The function is called when rcu_read_lock() is held and not when rcu_read_lock_bh() is held. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-02net/ipv6: Add anycast addresses to a global hashtableJeff Barnhill2-0/+4
icmp6_send() function is expensive on systems with a large number of interfaces. Every time it’s called, it has to verify that the source address does not correspond to an existing anycast address by looping through every device and every anycast address on the device. This can result in significant delays for a CPU when there are a large number of neighbors and ND timers are frequently timing out and calling neigh_invalidate(). Add anycast addresses to a global hashtable to allow quick searching for matching anycast addresses. This is based on inet6_addr_lst in addrconf.c. Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-02clocksource/drivers/c-sky: Add C-SKY SMP timerGuo Ren1-0/+1
The driver is for C-SKY SMP timer. It only supports oneshot event and 32bit overflow for clocksource. Per cpu core has one timer and all timers share one clock-counter-input from the same clocksource. This use mfcr&mtcr instructions to access the regs. Signed-off-by: Guo Ren <ren_guo@c-sky.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2018-11-02Merge tag 'for-linus-20181102' of git://git.kernel.dk/linux-blockLinus Torvalds5-122/+57
Pull block layer fixes from Jens Axboe: "The biggest part of this pull request is the revert of the blkcg cleanup series. It had one fix earlier for a stacked device issue, but another one was reported. Rather than play whack-a-mole with this, revert the entire series and try again for the next kernel release. Apart from that, only small fixes/changes. Summary: - Indentation fixup for mtip32xx (Colin Ian King) - The blkcg cleanup series revert (Dennis Zhou) - Two NVMe fixes. One fixing a regression in the nvme request initialization in this merge window, causing nvme-fc to not work. The other is a suspend/resume p2p resource issue (James, Keith) - Fix sg discard merge, allowing us to merge in cases where we didn't before (Jianchao Wang) - Call rq_qos_exit() after the queue is frozen, preventing a hang (Ming) - Fix brd queue setup, fixing an oops if we fail setting up all devices (Ming)" * tag 'for-linus-20181102' of git://git.kernel.dk/linux-block: nvme-pci: fix conflicting p2p resource adds nvme-fc: fix request private initialization blkcg: revert blkcg cleanups series block: brd: associate with queue until adding disk block: call rq_qos_exit() after queue is frozen mtip32xx: clean an indentation issue, remove extraneous tabs block: fix the DISCARD request merge
2018-11-02Merge tag 'edac_for_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bpLinus Torvalds1-0/+5
Pull more EDAC updates from Borislav Petkov: "The second part of the EDAC pile which contains the ADXL user and a build fix which addresses a not-so-sensical .config but fixes randconfig builds people do: - skx_edac: Address translation for NVDIMMs (Tony Luck and Qiuxu Zhuo) - ACPI_ADXL build fix" [ I don't think "sensical" is a word, particularly when used in the context of actually meaning "nonsensical", but I like it - Linus ] * tag 'edac_for_4.20_2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: EDAC, skx: Fix randconfig builds EDAC, skx_edac: Add address translation for non-volatile DIMMs
2018-11-02Merge tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-2/+69
Pull drm fixes from Dave Airlie: "Pretty much a normal fixes pull pre-rc1, mostly amdgpu fixes, one i915 link training regression fix, and a couple of minor panel/bridge fixes and a panel quirk" * tag 'drm-next-2018-11-02' of git://anongit.freedesktop.org/drm/drm: (37 commits) drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default" drm/amd/pp: Print warning if od_sclk/mclk out of range drm/amd/pp: Fix pp_sclk/mclk_od not work on Vega10 drm/amd/pp: Fix pp_sclk/mclk_od not work on smu7 drm/amd/powerplay: no MGPU fan boost enablement on DPM disabled drm/amdgpu: Fix skipping hangged job reset during gpu recover. drm/amd/powerplay: revise Vega20 pptable version check drm/amd/display: set backlight level limit to 1 drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1 dt-bindings: drm/panel: simple: Innolux TV123WAM is actually P120ZDG-BF1 drm/bridge: ti-sn65dsi86: Remove the mystery delay drm/panel: simple: Add "no-hpd" delay for Innolux TV123WAM drm/panel: simple: Support panels with HPD where HPD isn't connected dt-bindings: drm/panel: simple: Add no-hpd property drm/edid: Add 6 bpc quirk for BOE panel. drm/amdgpu: fix reporting of failed msg sent to SMU (v2) drm/amdgpu: Fix compute ring 1.0.0 failure after reset drm/amdgpu: fix VM leaf walking drm/amdgpu: fix amdgpu_vm_fini drm/amd/powerplay: commonize the API for retrieving current clocks ...
2018-11-02Merge tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-17/+38
Pull vfs dedup fixes from Dave Chinner: "This reworks the vfs data cloning infrastructure. We discovered many issues with these interfaces late in the 4.19 cycle - the worst of them (data corruption, setuid stripping) were fixed for XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all the problems was needed. That rework is the contents of this pull request. Rework the vfs_clone_file_range and vfs_dedupe_file_range infrastructure to use a common .remap_file_range method and supply generic bounds and sanity checking functions that are shared with the data write path. The current VFS infrastructure has problems with rlimit, LFS file sizes, file time stamps, maximum filesystem file sizes, stripping setuid bits, etc and so they are addressed in these commits. We also introduce the ability for the ->remap_file_range methods to return short clones so that clones for vfs_copy_file_range() don't get rejected if the entire range can't be cloned. It also allows filesystems to sliently skip deduplication of partial EOF blocks if they are not capable of doing so without requiring errors to be thrown to userspace. Existing filesystems are converted to user the new remap_file_range method, and both XFS and ocfs2 are modified to make use of the new generic checking infrastructure" * tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits) xfs: remove [cm]time update from reflink calls xfs: remove xfs_reflink_remap_range xfs: remove redundant remap partial EOF block checks xfs: support returning partial reflink results xfs: clean up xfs_reflink_remap_blocks call site xfs: fix pagecache truncation prior to reflink ocfs2: remove ocfs2_reflink_remap_range ocfs2: support partial clone range and dedupe range ocfs2: fix pagecache truncation prior to reflink ocfs2: truncate page cache for clone destination file before remapping vfs: clean up generic_remap_file_range_prep return value vfs: hide file range comparison function vfs: enable remap callers that can handle short operations vfs: plumb remap flags through the vfs dedupe functions vfs: plumb remap flags through the vfs clone functions vfs: make remap_file_range functions take and return bytes completed vfs: remap helper should update destination inode metadata vfs: pass remap flags to generic_remap_checks vfs: pass remap flags to generic_remap_file_range_prep vfs: combine the clone and dedupe into a single remap_file_range ...
2018-11-01Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2-38/+240
Pull AFS updates from Al Viro: "AFS series, with some iov_iter bits included" * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) missing bits of "iov_iter: Separate type from direction and use accessor functions" afs: Probe multiple fileservers simultaneously afs: Fix callback handling afs: Eliminate the address pointer from the address list cursor afs: Allow dumping of server cursor on operation failure afs: Implement YFS support in the fs client afs: Expand data structure fields to support YFS afs: Get the target vnode in afs_rmdir() and get a callback on it afs: Calc callback expiry in op reply delivery afs: Fix FS.FetchStatus delivery from updating wrong vnode afs: Implement the YFS cache manager service afs: Remove callback details from afs_callback_break struct afs: Commit the status on a new file/dir/symlink afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS afs: Don't invoke the server to read data beyond EOF afs: Add a couple of tracepoints to log I/O errors afs: Handle EIO from delivery function afs: Fix TTL on VL server and address lists afs: Implement VL server rotation afs: Improve FS server rotation error handling ...
2018-11-01blkcg: revert blkcg cleanups seriesDennis Zhou5-122/+57
This reverts a series committed earlier due to null pointer exception bug report in [1]. It seems there are edge case interactions that I did not consider and will need some time to understand what causes the adverse interactions. The original series can be found in [2] with a follow up series in [3]. [1] https://www.spinics.net/lists/cgroups/msg20719.html [2] https://lore.kernel.org/lkml/20180911184137.35897-1-dennisszhou@gmail.com/ [3] https://lore.kernel.org/lkml/20181020185612.51587-1-dennis@kernel.org/ This reverts the following commits: d459d853c2ed, b2c3fa546705, 101246ec02b5, b3b9f24f5fcc, e2b0989954ae, f0fcb3ec89f3, c839e7a03f92, bdc2491708c4, 74b7c02a9bc1, 5bf9a1f3b4ef, a7b39b4e961c, 07b05bcc3213, 49f4c2dc2b50, 27e6fa996c53 Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-01Merge tag 'compiler-attributes-for-linus-4.20-rc1' of https://github.com/ojeda/linuxLinus Torvalds6-178/+293
Pull compiler attribute updates from Miguel Ojeda: "This is an effort to disentangle the include/linux/compiler*.h headers and bring them up to date. The main idea behind the series is to use feature checking macros (i.e. __has_attribute) instead of compiler version checks (e.g. GCC_VERSION), which are compiler-agnostic (so they can be shared, reducing the size of compiler-specific headers) and version-agnostic. Other related improvements have been performed in the headers as well, which on top of the use of __has_attribute it has amounted to a significant simplification of these headers (e.g. GCC_VERSION is now only guarding a few non-attribute macros). This series should also help the efforts to support compiling the kernel with clang and icc. A fair amount of documentation and comments have also been added, clarified or removed; and the headers are now more readable, which should help kernel developers in general. The series was triggered due to the move to gcc >= 4.6. In turn, this series has also triggered Sparse to gain the ability to recognize __has_attribute on its own. Finally, the __nonstring variable attribute series has been also applied on top; plus two related patches from Nick Desaulniers for unreachable() that came a bit afterwards" * tag 'compiler-attributes-for-linus-4.20-rc1' of https://github.com/ojeda/linux: compiler-gcc: remove comment about gcc 4.5 from unreachable() compiler.h: update definition of unreachable() Compiler Attributes: ext4: remove local __nonstring definition Compiler Attributes: auxdisplay: panel: use __nonstring Compiler Attributes: enable -Wstringop-truncation on W=1 (gcc >= 8) Compiler Attributes: add support for __nonstring (gcc >= 8) Compiler Attributes: add MAINTAINERS entry Compiler Attributes: add Doc/process/programming-language.rst Compiler Attributes: remove uses of __attribute__ from compiler.h Compiler Attributes: KENTRY used twice the "used" attribute Compiler Attributes: use feature checks instead of version checks Compiler Attributes: add missing SPDX ID in compiler_types.h Compiler Attributes: remove unneeded sparse (__CHECKER__) tests Compiler Attributes: homogenize __must_be_array Compiler Attributes: remove unneeded tests Compiler Attributes: always use the extra-underscores syntax Compiler Attributes: remove unused attributes
2018-11-01Merge branch 'next-keys2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds7-2/+263
Pull keys updates from James Morris: "Provide five new operations in the key_type struct that can be used to provide access to asymmetric key operations. These will be implemented for the asymmetric key type in a later patch and may refer to a key retained in RAM by the kernel or a key retained in crypto hardware. int (*asym_query)(const struct kernel_pkey_params *params, struct kernel_pkey_query *info); int (*asym_eds_op)(struct kernel_pkey_params *params, const void *in, void *out); int (*asym_verify_signature)(struct kernel_pkey_params *params, const void *in, const void *in2); Since encrypt, decrypt and sign are identical in their interfaces, they're rolled together in the asym_eds_op() operation and there's an operation ID in the params argument to distinguish them. Verify is different in that we supply the data and the signature instead and get an error value (or 0) as the only result on the expectation that this may well be how a hardware crypto device may work" * 'next-keys2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (22 commits) KEYS: asym_tpm: Add support for the sign operation [ver #2] KEYS: asym_tpm: Implement tpm_sign [ver #2] KEYS: asym_tpm: Implement signature verification [ver #2] KEYS: asym_tpm: Implement the decrypt operation [ver #2] KEYS: asym_tpm: Implement tpm_unbind [ver #2] KEYS: asym_tpm: Add loadkey2 and flushspecific [ver #2] KEYS: Move trusted.h to include/keys [ver #2] KEYS: trusted: Expose common functionality [ver #2] KEYS: asym_tpm: Implement encryption operation [ver #2] KEYS: asym_tpm: Implement pkey_query [ver #2] KEYS: Add parser for TPM-based keys [ver #2] KEYS: asym_tpm: extract key size & public key [ver #2] KEYS: asym_tpm: add skeleton for asym_tpm [ver #2] crypto: rsa-pkcs1pad: Allow hash to be optional [ver #2] KEYS: Implement PKCS#8 RSA Private Key parser [ver #2] KEYS: Implement encrypt, decrypt and sign for software asymmetric key [ver #2] KEYS: Allow the public_key struct to hold a private key [ver #2] KEYS: Provide software public key query function [ver #2] KEYS: Make the X.509 and PKCS7 parsers supply the sig encoding type [ver #2] KEYS: Provide missing asymmetric key subops for new key type ops [ver #2] ...
2018-11-01Merge tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsAl Viro14-88/+116
backmerge to do fixup of iov_iter_kvec() conflict
2018-11-01Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-0/+8
Pull virtio/vhost updates from Michael Tsirkin: "Fixes and tweaks: - virtio balloon page hinting support - vhost scsi control queue - misc fixes" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: MAINTAINERS: remove reference to bogus vsock file vhost/scsi: Use common handling code in request queue handler vhost/scsi: Extract common handling code from control queue handler vhost/scsi: Respond to control queue operations vhost/scsi: truncate T10 PI iov_iter to prot_bytes virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON mm/page_poison: expose page_poisoning_enabled to kernel modules virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT kvm_config: add CONFIG_VIRTIO_MENU
2018-11-01Merge tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds2-0/+40
Pull stackleak gcc plugin from Kees Cook: "Please pull this new GCC plugin, stackleak, for v4.20-rc1. This plugin was ported from grsecurity by Alexander Popov. It provides efficient stack content poisoning at syscall exit. This creates a defense against at least two classes of flaws: - Uninitialized stack usage. (We continue to work on improving the compiler to do this in other ways: e.g. unconditional zero init was proposed to GCC and Clang, and more plugin work has started too). - Stack content exposure. By greatly reducing the lifetime of valid stack contents, exposures via either direct read bugs or unknown cache side-channels become much more difficult to exploit. This complements the existing buddy and heap poisoning options, but provides the coverage for stacks. The x86 hooks are included in this series (which have been reviewed by Ingo, Dave Hansen, and Thomas Gleixner). The arm64 hooks have already been merged through the arm64 tree (written by Laura Abbott and reviewed by Mark Rutland and Will Deacon). With VLAs having been removed this release, there is no need for alloca() protection, so it has been removed from the plugin" * tag 'stackleak-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: Drop unneeded stackleak_check_alloca() stackleak: Allow runtime disabling of kernel stack erasing doc: self-protection: Add information about STACKLEAK feature fs/proc: Show STACKLEAK metrics in the /proc file system lkdtm: Add a test for STACKLEAK gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
2018-11-01SUNRPC: Use atomic(64)_t for seq_send(64)Paul Burton1-5/+2
The seq_send & seq_send64 fields in struct krb5_ctx are used as atomically incrementing counters. This is implemented using cmpxchg() & cmpxchg64() to implement what amount to custom versions of atomic_fetch_inc() & atomic64_fetch_inc(). Besides the duplication, using cmpxchg64() has another major drawback in that some 32 bit architectures don't provide it. As such commit 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") resulted in build failures for some architectures. Change seq_send to be an atomic_t and seq_send64 to be an atomic64_t, then use atomic(64)_* functions to manipulate the values. The atomic64_t type & associated functions are provided even on architectures which lack real 64 bit atomic memory access via CONFIG_GENERIC_ATOMIC64 which uses spinlocks to serialize access. This fixes the build failures for architectures lacking cmpxchg64(). A potential alternative that was raised would be to provide cmpxchg64() on the 32 bit architectures that currently lack it, using spinlocks. However this would provide a version of cmpxchg64() with semantics a little different to the implementations on architectures with real 64 bit atomics - the spinlock-based implementation would only work if all access to the memory used with cmpxchg64() is *always* performed using cmpxchg64(). That is not currently a requirement for users of cmpxchg64(), and making it one seems questionable. As such avoiding cmpxchg64() outside of architecture-specific code seems best, particularly in cases where atomic64_t seems like a better fit anyway. The CONFIG_GENERIC_ATOMIC64 implementation of atomic64_* functions will use spinlocks & so faces the same issue, but with the key difference that the memory backing an atomic64_t ought to always be accessed via the atomic64_* functions anyway making the issue moot. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: linux-nfs@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-11-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds4-6/+17
Pull networking fixes from David Miller: 1) BPF verifier fixes from Daniel Borkmann. 2) HNS driver fixes from Huazhong Tan. 3) FDB only works for ethernet devices, reject attempts to install FDB rules for others. From Ido Schimmel. 4) Fix spectre V1 in vhost, from Jason Wang. 5) Don't pass on-stack object to irq_set_affinity_hint() in mvpp2 driver, from Marc Zyngier. 6) Fix mlx5e checksum handling when RXFCS is enabled, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits) openvswitch: Fix push/pop ethernet validation net: stmmac: Fix stmmac_mdio_reset() when building stmmac as modules bpf: test make sure to run unpriv test cases in test_verifier bpf: add various test cases to test_verifier bpf: don't set id on after map lookup with ptr_to_map_val return bpf: fix partial copy of map_ptr when dst is scalar libbpf: Fix compile error in libbpf_attach_type_by_name kselftests/bpf: use ping6 as the default ipv6 ping binary if it exists selftests: mlxsw: qos_mc_aware: Add a test for UC awareness selftests: mlxsw: qos_mc_aware: Tweak for min shaper mlxsw: spectrum: Set minimum shaper on MC TCs mlxsw: reg: QEEC: Add minimum shaper fields net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset() net: hns3: bugfix for rtnl_lock's range in the hclge_reset() net: hns3: bugfix for handling mailbox while the command queue reinitialized net: hns3: fix incorrect return value/type of some functions net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read net: hns3: bugfix for is_valid_csq_clean_head() net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring() net: hns3: bugfix for the initialization of command queue's spin lock ...
2018-11-01Merge tag 'platform-drivers-x86-v4.20-1' of git://git.infradead.org/linux-platform-drivers-x86Linus Torvalds1-0/+101
Pull x86 platform driver updates from Darren Hart: - Move the Dell dcdbas and dell_rbu drivers into platform/drivers/x86 as they are closely coupled with other drivers in this location. - Improve _init* usage for acerhdf and fix some usage issues with messages and module parameters. - Simplify asus-wmi by calling ACPI/WMI methods directly, eliminating workqueue overhead, eliminate double reporting of keyboard backlight. - Fix wake from USB failure on Bay Trail devices (intel_int0002_vgpio). - Notify intel_telemetry users when IPC1 device is not enabled. - Update various drivers with new laptop model IDs. - Update several intel drivers to use SPDX identifers and order headers alphabetically. * tag 'platform-drivers-x86-v4.20-1' of git://git.infradead.org/linux-platform-drivers-x86: (64 commits) HID: asus: only support backlight when it's not driven by WMI platform/x86: asus-wmi: export function for evaluating WMI methods platform/x86: asus-wmi: Only notify kbd LED hw_change by fn-key pressed platform/x86: wmi: declare device_type structure as constant platform/x86: ideapad: Add Y530-15ICH to no_hw_rfkill platform/x86: Add Intel AtomISP2 dummy / power-management driver platform/x86: touchscreen_dmi: Add min-x and min-y settings for various models platform/x86: touchscreen_dmi: Add info for the Onda V80 Plus v3 tablet platform/x86: touchscreen_dmi: Add info for the Trekstor Primetab T13B tablet platform/x86: intel_telemetry: Get rid of custom macro platform/x86: intel_telemetry: report debugfs failure MAINTAINERS: intel_telemetry: Update maintainers info platform/x86: Add LG Gram laptop special features driver platform/x86: asus-wmi: Simplify the keyboard brightness updating process platform/x86: touchscreen_dmi: Add info for the Trekstor Primebook C11 convertible platform/x86: mlx-platform: Properly use mlxplat_mlxcpld_msn201x_items MAINTAINERS: intel_pmc_core: Update MAINTAINERS firmware: dcdbas: include linux/io.h platform/x86: intel-wmi-thunderbolt: Add dynamic debugging platform/x86: intel-wmi-thunderbolt: Convert to use SPDX identifier ...
2018-11-01x86/compat: Adjust in_compat_syscall() to generic code under !COMPATDmitry Safonov1-2/+2
The result of in_compat_syscall() can be pictured as: x86 platform: --------------------------------------------------- | Arch\syscall | 64-bit | ia32 | x32 | |-------------------------------------------------| | x86_64 | false | true | true | |-------------------------------------------------| | i686 | | <true> | | --------------------------------------------------- Other platforms: ------------------------------------------- | Arch\syscall | 64-bit | compat | |-----------------------------------------| | 64-bit | false | true | |-----------------------------------------| | 32-bit(?) | | <false> | ------------------------------------------- As seen, the result of in_compat_syscall() on generic 32-bit platform differs from i686. There is no reason for in_compat_syscall() == true on native i686. It also easy to misread code if the result on native 32-bit platform differs between arches. Because of that non arch-specific code has many places with: if (IS_ENABLED(CONFIG_COMPAT) && in_compat_syscall()) in different variations. It looks-like the only non-x86 code which uses in_compat_syscall() not under CONFIG_COMPAT guard is in amd/amdkfd. But according to the commit a18069c132cb ("amdkfd: Disable support for 32-bit user processes"), it actually should be disabled on native i686. Rename in_compat_syscall() to in_32bit_syscall() for x86-specific code and make in_compat_syscall() false under !CONFIG_COMPAT. A follow on patch will clean up generic users which were forced to check IS_ENABLED(CONFIG_COMPAT) with in_compat_syscall(). Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Stultz <john.stultz@linaro.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-efi@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/r/20181012134253.23266-2-dima@arista.com
2018-10-31Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queueDavid S. Miller1-3/+9
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-10-31 This series contains a various collection of fixes. Miroslav Lichvar from Red Hat or should I say IBM now? Updates the PHC timecounter interval for igb so that it gets updated at least once every 550 seconds. Ngai-Mint provides a fix for fm10k to prevent a soft lockup or system crash by adding a new condition to determine if the SM mailbox is in the correct state before proceeding. Jake provides several fm10k fixes, first one marks complier aborts as non-fatal since on some platforms trigger machine check errors when the compile aborts. Added missing device ids to the in-kernel driver. Due to the recent fixes, bumped the driver version. I (Jeff Kirsher) fixed a XFRM_ALGO dependency for both ixgbe and ixgbevf. This fix was based on the original work from Arnd Bergmann, which only fixed ixgbe. Mitch provides a fix for i40e/avf to update the status codes, which resolves an issue between a mis-match between i40e and the iavf driver, which also supports the ice LAN driver. Radoslaw fixes the ixgbe where the driver is logging a message about spoofed packets detected when the VF is re-started with a different MAC address. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+3
Daniel Borkmann says: ==================== pull-request: bpf 2018-11-01 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix tcp_bpf_recvmsg() to return -EAGAIN instead of 0 in non-blocking case when no data is available yet, from John. 2) Fix a compilation error in libbpf_attach_type_by_name() when compiled with clang 3.8, from Andrey. 3) Fix a partial copy of map pointer on scalar alu and remove id generation for RET_PTR_TO_MAP_VALUE return types, from Daniel. 4) Add unlimited memlock limit for kernel selftest's flow_dissector_load program, from Yonghong. 5) Fix ping for some BPF shell based kselftests where distro does not ship "ping -6" anymore, from Li. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-31bpf: fix partial copy of map_ptr when dst is scalarDaniel Borkmann1-0/+3
ALU operations on pointers such as scalar_reg += map_value_ptr are handled in adjust_ptr_min_max_vals(). Problem is however that map_ptr and range in the register state share a union, so transferring state through dst_reg->range = ptr_reg->range is just buggy as any new map_ptr in the dst_reg is then truncated (or null) for subsequent checks. Fix this by adding a raw member and use it for copying state over to dst_reg. Fixes: f1174f77b50c ("bpf/verifier: rework value tracking") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-10-31Merge tag 'tag-chrome-platform-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platformLinus Torvalds4-357/+303
Pull chrome-platform updates from Benson Leung: - Move mfd/cros_ec_lpc* includes to drivers/platform from mfd - Adding a new interrupt path for cros_ec_lpc * tag 'tag-chrome-platform-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: platform/chrome: chromeos_tbmc - Remove unneeded const platform/chrome: Add a new interrupt path for cros_ec_lpc mfd: cros_ec: Fix and improve kerneldoc comments. platform/chrome: Move mfd/cros_ec_lpc* includes to drivers/platform.
2018-11-01netfilter: ipset: list:set: Decrease refcount synchronously on deletion and replaceStefano Brivio1-1/+1
Commit 45040978c899 ("netfilter: ipset: Fix set:list type crash when flush/dump set in parallel") postponed decreasing set reference counters to the RCU callback. An 'ipset del' command can terminate before the RCU grace period is elapsed, and if sets are listed before then, the reference counter shown in userspace will be wrong: # ipset create h hash:ip; ipset create l list:set; ipset add l # ipset del l h; ipset list h Name: h Type: hash:ip Revision: 4 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 88 References: 1 Number of entries: 0 Members: # sleep 1; ipset list h Name: h Type: hash:ip Revision: 4 Header: family inet hashsize 1024 maxelem 65536 Size in memory: 88 References: 0 Number of entries: 0 Members: Fix this by making the reference count update synchronous again. As a result, when sets are listed, ip_set_name_byindex() might now fetch a set whose reference count is already zero. Instead of relying on the reference count to protect against concurrent set renaming, grab ip_set_ref_lock as reader and copy the name, while holding the same lock in ip_set_rename() as writer instead. Reported-by: Li Shuang <shuali@redhat.com> Fixes: 45040978c899 ("netfilter: ipset: Fix set:list type crash when flush/dump set in parallel") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-31Merge tag 'riscv-for-linus-4.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linuxLinus Torvalds1-0/+1
Pull more RISC-V updates from Palmer Dabbelt: "This contains the follow-on patches I'd like to target for the 4.20 merge window. I'm being somewhat conservative here, as while there are a few patches on the mailing list that were posted early in the merge window I'd like to let those bake for another round -- this was a fairly big release as far as RISC-V is concerened, and we need to walk before we can run. As far as the patches that made it go: - A patch to ignore offline CPUs when calculating AT_HWCAP. This should fix GDB on the HiFive unleashed, which has an embedded core for hart 0 which is exposed to Linux as an offline CPU. - A move of EM_RISCV to elf-em.h, which is where it should have been to begin with. - I've also removed the 64-bit divide routines. I know I'm not really playing by my own rules here because I posted the patches this morning, but since they shouldn't be in the kernel I think it's better to err on the side of going too fast here. I don't anticipate any more patch sets for the merge window" * tag 'riscv-for-linus-4.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: Move EM_RISCV into elf-em.h RISC-V: properly determine hardware caps Revert "lib: Add umoddi3 and udivmoddi4 of GCC library routines" Revert "RISC-V: Select GENERIC_LIB_UMODDI3 on RV32"
2018-10-31Merge tag 'fuse-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuseLinus Torvalds2-62/+87
Pull fuse updates from Miklos Szeredi: "As well as the usual bug fixes, this adds the following new features: - cached readdir and readlink - max I/O size increased from 128k to 1M - improved performance and scalability of request queues - copy_file_range support The only non-fuse bits are trivial cleanups of macros in <linux/bitops.h>" * tag 'fuse-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (31 commits) fuse: enable caching of symlinks fuse: only invalidate atime in direct read fuse: don't need GETATTR after every READ fuse: allow fine grained attr cache invaldation bitops: protect variables in bit_clear_unless() macro bitops: protect variables in set_mask_bits() macro fuse: realloc page array fuse: add max_pages to init_out fuse: allocate page array more efficiently fuse: reduce size of struct fuse_inode fuse: use iversion for readdir cache verification fuse: use mtime for readdir cache verification fuse: add readdir cache version fuse: allow using readdir cache fuse: allow caching readdir fuse: extract fuse_emit() helper fuse: add FOPEN_CACHE_DIR fuse: split out readdir.c fuse: Use hash table to link processing request fuse: kill req->intr_unique ...