aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include/uapi (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2020-06-12dt-bindings: Remove redundant 'maxItems'Rob Herring7-9/+0
There's no need to specify 'maxItems' with the same value as the number of entries in 'items'. A meta-schema update will catch future cases. Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Anson Huang <Anson.Huang@nxp.com> Cc: linux-clk@vger.kernel.org Cc: linux-pwm@vger.kernel.org Cc: linux-usb@vger.kernel.org Reviewed-by: Stephen Boyd <sboyd@kernel.org> # clk Acked-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org>
2020-06-12ima: fix mprotect checkingMimi Zohar1-1/+2
Make sure IMA is enabled before checking mprotect change. Addresses report of a 3.7% regression of boot-time.dhcp. Fixes: 8eb613c0b8f1 ("ima: verify mprotect change is consistent with mmap policy") Reported-by: kernel test robot <rong.a.chen@intel.com> Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-06-12nios2: signal: Mark expected switch fall-throughLey Foon Tan1-0/+1
Mark switch cases where we are expecting to fall through. Fix the following warning through the use of the new the new pseudo-keyword fallthrough; arch/nios2/kernel/signal.c:254:12: warning: this statement may fall through [-Wimplicit-fallthrough=] 254 | restart = -2; | ~~~~~~~~^~~~ arch/nios2/kernel/signal.c:255:3: note: here 255 | case ERESTARTNOHAND: | ^~~~ Reported-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
2020-06-11amdgpu: a NULL ->mm does not mean a thread is a kthreadChristoph Hellwig1-1/+1
Use the proper API instead. Fixes: 70539bd795002 ("drm/amd: Update MEC HQD loading code for KFD") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Felipe Balbi <balbi@kernel.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de Link: http://lkml.kernel.org/r/20200404094101.672954-2-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11lib/lzo: fix ambiguous encoding bug in lzo-rleDave Rodgman2-2/+19
In some rare cases, for input data over 32 KB, lzo-rle could encode two different inputs to the same compressed representation, so that decompression is then ambiguous (i.e. data may be corrupted - although zram is not affected because it operates over 4 KB pages). This modifies the compressor without changing the decompressor or the bitstream format, such that: - there is no change to how data produced by the old compressor is decompressed - an old decompressor will correctly decode data from the updated compressor - performance and compression ratio are not affected - we avoid introducing a new bitstream format In testing over 12.8M real-world files totalling 903 GB, three files were affected by this bug. I also constructed 37M semi-random 64 KB files totalling 2.27 TB, and saw no affected files. Finally I tested over files constructed to contain each of the ~1024 possible bad input sequences; for all of these cases, updated lzo-rle worked correctly. There is no significant impact to performance or compression ratio. Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Dave Rodgman <dave.rodgman@arm.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Chao Yu <yuchao0@huawei.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11ocfs2: fix build failure when TCP/IP is disabledTom Seewald1-1/+1
After commit 12abc5ee7873 ("tcp: add tcp_sock_set_nodelay") and commit c488aeadcbd0 ("tcp: add tcp_sock_set_user_timeout"), building the kernel with OCFS2_FS=y but without INET=y causes it to fail with: ld: fs/ocfs2/cluster/tcp.o: in function `o2net_accept_many': tcp.c:(.text+0x21b1): undefined reference to `tcp_sock_set_nodelay' ld: tcp.c:(.text+0x21c1): undefined reference to `tcp_sock_set_user_timeout' ld: fs/ocfs2/cluster/tcp.o: in function `o2net_start_connect': tcp.c:(.text+0x2633): undefined reference to `tcp_sock_set_nodelay' ld: tcp.c:(.text+0x2643): undefined reference to `tcp_sock_set_user_timeout' This is due to tcp_sock_set_nodelay() and tcp_sock_set_user_timeout() being declared in linux/tcp.h and defined in net/ipv4/tcp.c, which depend on TCP/IP being enabled. To fix this, make OCFS2_FS depend on INET=y which already requires NET=y. Fixes: 12abc5ee7873 ("tcp: add tcp_sock_set_nodelay") Fixes: c488aeadcbd0 ("tcp: add tcp_sock_set_user_timeout") Signed-off-by: Tom Seewald <tseewald@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: David S. Miller <davem@davemloft.net> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200606190827.23954-1-tseewald@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current threadNaoya Horiguchi1-7/+16
Action Required memory error should happen only when a processor is about to access to a corrupted memory, so it's synchronous and only affects current process/thread. Recently commit 872e9a205c84 ("mm, memory_failure: don't send BUS_MCEERR_AO for action required error") fixed the issue that Action Required memory could unnecessarily send SIGBUS to the processes which share the error memory. But we still have another issue that we could send SIGBUS to a wrong thread. This is because collect_procs() and task_early_kill() fails to add the current process to "to-kill" list. So this patch is suggesting to fix it. With this fix, SIGBUS(BUS_MCEERR_AR) is never sent to non-current process/thread. Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tony Luck <tony.luck@intel.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/1591321039-22141-3-git-send-email-naoya.horiguchi@nec.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_killNaoya Horiguchi1-10/+10
Patch series "hwpoison: fixes signaling on memory error" This is a small patchset to solve issues in memory error handler to send SIGBUS to proper process/thread as expected in configuration. Please see descriptions in individual patches for more details. This patch (of 2): Early-kill policy is controlled from two types of settings, one is per-process setting prctl(PR_MCE_KILL) and the other is system-wide setting vm.memory_failure_early_kill. Users expect per-process setting to override system-wide setting as many other settings do, but early-kill setting doesn't work as such. For example, if a system configures vm.memory_failure_early_kill to 1 (enabled), a process receives SIGBUS even if it's configured to explicitly disable PF_MCE_KILL by prctl(). That's not desirable for applications with their own policies. This patch is suggesting to change the priority of these two types of settings, by checking sysctl_memory_failure_early_kill only when a given process has the default kill policy. Note that this patch is solving a thread choice issue too. Originally, collect_procs() always chooses the main thread when vm.memory_failure_early_kill is 1, even if the process has a dedicated thread for memory error handling. SIGBUS should be sent to the dedicated thread if early-kill is enabled via vm.memory_failure_early_kill as we are doing for PR_MCE_KILL_EARLY processes. Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/1591321039-22141-1-git-send-email-naoya.horiguchi@nec.com Link: http://lkml.kernel.org/r/1591321039-22141-2-git-send-email-naoya.horiguchi@nec.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11afs: Fix afs_store_data() to set mtime in new operation descriptorDavid Howells1-0/+1
Fix afs_store_data() so that it sets the mtime in the new operation descriptor otherwise the mtime on the server gets set to 0 when a write is stored to the server. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Dave Botsch <botsch@cnf.cornell.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11dt-bindings: Fix more incorrect 'reg' property sizes in examplesRob Herring10-13/+13
The examples template is a 'simple-bus' with a size of 1 cell for had between 2 and 4 cells which really only errors on I2C or SPI type devices with a single cell. The easiest fix in most cases is to change the 'reg' property to 1 cell for address and size. Cc: "Heiko Stübner" <heiko@sntech.de> Cc: Ezequiel Garcia <ezequiel@collabora.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Richard Weinberger <richard@nod.at> Cc: "David S. Miller" <davem@davemloft.net> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: linux-rockchip@lists.infradead.org Cc: linux-media@vger.kernel.org Cc: linux-mtd@lists.infradead.org Cc: netdev@vger.kernel.org Cc: alsa-devel@alsa-project.org Acked-by: Mark Brown <broonie@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org>
2020-06-11alpha: Fix build around srm_sysrq_reboot_opJoerg Roedel1-1/+6
The patch introducing the struct was probably never compile tested, because it sets a handler with a wrong function signature. Wrap the handler into a functions with the correct signature to fix the build. Fixes: 0f1c9688a194 ("tty/sysrq: alpha: export and use __sysrq_get_key_op()") Cc: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11dt-bindings: phy: qcom: Fix missing 'ranges' and example addressesRob Herring2-25/+33
The QCom QMP PHY bindings have child nodes with translatable (MMIO) addresses, so a 'ranges' property is required in the parent node. Additionally, the examples default to 1 address and size cell, so let's fix that, too. Fixes: ccf51c1cedfd ("dt-bindings: phy: qcom,qmp: Convert QMP PHY bindings to yaml") Cc: Andy Gross <agross@kernel.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Vinod Koul <vkoul@kernel.org> Cc: Manu Gautam <mgautam@codeaurora.org> Cc: linux-arm-msm@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2020-06-11dt-bindings: Remove more cases of 'allOf' containing a '$ref'Rob Herring33-316/+224
Another round of 'allOf' removals that came in this cycle. json-schema versions draft7 and earlier have a weird behavior in that any keywords combined with a '$ref' are ignored (silently). The correct form was to put a '$ref' under an 'allOf'. This behavior is now changed in the 2019-09 json-schema spec and '$ref' can be mixed with other keywords. The json-schema library doesn't yet support this, but the tooling now does a fixup for this and either way works. This has been a constant source of review comments, so let's change this treewide so everyone copies the simpler syntax. Signed-off-by: Rob Herring <robh@kernel.org>
2020-06-11compiler_types.h, kasan: Use __SANITIZE_ADDRESS__ instead of CONFIG_KASAN to decide inliningMarco Elver1-5/+8
Use __always_inline in compilation units that have instrumentation disabled (KASAN_SANITIZE_foo.o := n) for KASAN, like it is done for KCSAN. Also, add common documentation for KASAN and KCSAN explaining the attribute. [ bp: Massage commit message. ] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200521142047.169334-12-elver@google.com
2020-06-11compiler.h: Move function attributes to compiler_types.hMarco Elver2-29/+29
Cleanup and move the KASAN and KCSAN related function attributes to compiler_types.h, where the rest of the same kind live. No functional change intended. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200521142047.169334-11-elver@google.com
2020-06-11compiler.h: Avoid nested statement expression in data_race()Marco Elver1-5/+5
It appears that compilers have trouble with nested statement expressions. Therefore, remove one level of statement expression nesting from the data_race() macro. This will help avoiding potential problems in the future as its usage increases. Reported-by: Borislav Petkov <bp@suse.de> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/20200520221712.GA21166@zn.tnic Link: https://lkml.kernel.org/r/20200521142047.169334-10-elver@google.com
2020-06-11compiler.h: Remove data_race() and unnecessary checks from {READ,WRITE}_ONCE()Marco Elver1-11/+2
The volatile accesses no longer need to be wrapped in data_race() because compilers that emit instrumentation distinguishing volatile accesses are required for KCSAN. Consequently, the explicit kcsan_check_atomic*() are no longer required either since the compiler emits instrumentation distinguishing the volatile accesses. Finally, simplify __READ_ONCE_SCALAR() and remove __WRITE_ONCE_SCALAR(). [ bp: Convert commit message to passive voice. ] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200521142047.169334-9-elver@google.com
2020-06-11kcsan: Update Documentation to change supported compilersMarco Elver1-8/+1
Document change in required compiler version for KCSAN, and remove the now redundant note about __no_kcsan and inlining problems with older compilers. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200521142047.169334-8-elver@google.com
2020-06-11kcsan: Remove 'noinline' from __no_kcsan_or_inlineMarco Elver1-4/+2
Some compilers incorrectly inline small __no_kcsan functions, which then results in instrumenting the accesses. For this reason, the 'noinline' attribute was added to __no_kcsan_or_inline. All known versions of GCC are affected by this. Supported versions of Clang are unaffected, and never inline a no_sanitize function. However, the attribute 'noinline' in __no_kcsan_or_inline causes unexpected code generation in functions that are __no_kcsan and call a __no_kcsan_or_inline function. In certain situations it is expected that the __no_kcsan_or_inline function is actually inlined by the __no_kcsan function, and *no* calls are emitted. By removing the 'noinline' attribute, give the compiler the ability to inline and generate the expected code in __no_kcsan functions. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/CANpmjNNOpJk0tprXKB_deiNAv_UmmORf1-2uajLhnLWQQ1hvoA@mail.gmail.com Link: https://lkml.kernel.org/r/20200521142047.169334-6-elver@google.com
2020-06-11kcsan: Pass option tsan-instrument-read-before-write to ClangMarco Elver1-0/+1
Clang (unlike GCC) removes reads before writes with matching addresses in the same basic block. This is an optimization for TSAN, since writes will always cause conflict if the preceding read would have. However, for KCSAN we cannot rely on this option, because we apply several special rules to writes, in particular when the KCSAN_ASSUME_PLAIN_WRITES_ATOMIC option is selected. To avoid missing potential data races, pass the -tsan-instrument-read-before-write option to Clang if it is available [1]. [1] https://github.com/llvm/llvm-project/commit/151ed6aa38a3ec6c01973b35f684586b6e1c0f7e Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200521142047.169334-5-elver@google.com
2020-06-11kcsan: Support distinguishing volatile accessesMarco Elver2-1/+47
In the kernel, the "volatile" keyword is used in various concurrent contexts, whether in low-level synchronization primitives or for legacy reasons. If supported by the compiler, it will be assumed that aligned volatile accesses up to sizeof(long long) (matching compiletime_assert_rwonce_type()) are atomic. Recent versions of Clang [1] (GCC tentative [2]) can instrument volatile accesses differently. Add the option (required) to enable the instrumentation, and provide the necessary runtime functions. None of the updated compilers are widely available yet (Clang 11 will be the first release to support the feature). [1] https://github.com/llvm/llvm-project/commit/5a2c31116f412c3b6888be361137efd705e05814 [2] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html This change allows removing of any explicit checks in primitives such as READ_ONCE() and WRITE_ONCE(). [ bp: Massage commit message a bit. ] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200521142047.169334-4-elver@google.com
2020-06-11kcsan: Restrict supported compilersMarco Elver1-1/+8
The first version of Clang that supports -tsan-distinguish-volatile will be able to support KCSAN. The first Clang release to do so, will be Clang 11. This is due to satisfying all the following requirements: 1. Never emit calls to __tsan_func_{entry,exit}. 2. __no_kcsan functions should not call anything, not even kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE => Requires leaving them plain! 3. Support atomic_{read,set}*() with KCSAN, which rely on arch_atomic_{read,set}*() using __{READ,WRITE}_ONCE() => Because of #2, rely on Clang 11's -tsan-distinguish-volatile support. We will double-instrument atomic_{read,set}*(), but that's reasonable given it's still lower cost than the data_race() variant due to avoiding 2 extra calls (kcsan_{en,dis}able_current() calls). 4. __always_inline functions inlined into __no_kcsan functions are never instrumented. 5. __always_inline functions inlined into instrumented functions are instrumented. 6. __no_kcsan_or_inline functions may be inlined into __no_kcsan functions => Implies leaving 'noinline' off of __no_kcsan_or_inline. 7. Because of #6, __no_kcsan and __no_kcsan_or_inline functions should never be spuriously inlined into instrumented functions, causing the accesses of the __no_kcsan function to be instrumented. Older versions of Clang do not satisfy #3. The latest GCC currently doesn't support at least #1, #3, and #7. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com Link: https://lkml.kernel.org/r/20200521142047.169334-7-elver@google.com
2020-06-11kcsan: Avoid inserting __tsan_func_entry/exit if possibleMarco Elver1-1/+10
To avoid inserting __tsan_func_{entry,exit}, add option if supported by compiler. Currently only Clang can be told to not emit calls to these functions. It is safe to not emit these, since KCSAN does not rely on them. Note that, if we disable __tsan_func_{entry,exit}(), we need to disable tail-call optimization in sanitized compilation units, as otherwise we may skip frames in the stack trace; in particular when the tail called function is one of the KCSAN's runtime functions, and a report is generated, we might miss the function where the actual access occurred. Since __tsan_func_{entry,exit}() insertion effectively disabled tail-call optimization, there should be no observable change. This was caught and confirmed with kcsan-test & UNWINDER_ORC. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200521142047.169334-3-elver@google.com
2020-06-11ubsan, kcsan: Don't combine sanitizer with kcov on clangArnd Bergmann2-0/+22
Clang does not allow -fsanitize-coverage=trace-{pc,cmp} together with -fsanitize=bounds or with ubsan: clang: error: argument unused during compilation: '-fsanitize-coverage=trace-pc' [-Werror,-Wunused-command-line-argument] clang: error: argument unused during compilation: '-fsanitize-coverage=trace-cmp' [-Werror,-Wunused-command-line-argument] To avoid the warning, check whether clang can handle this correctly or disallow ubsan and kcsan when kcov is enabled. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marco Elver <elver@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://bugs.llvm.org/show_bug.cgi?id=45831 Link: https://lore.kernel.org/lkml/20200505142341.1096942-1-arnd@arndb.de Link: https://lkml.kernel.org/r/20200521142047.169334-2-elver@google.com
2020-06-11KVM: x86: do not pass poisoned hva to __kvm_set_memory_regionPaolo Bonzini1-6/+1
__kvm_set_memory_region does not use the hva at all, so trying to catch use-after-delete is pointless and, worse, it fails access_ok now that we apply it to all memslots including private kernel ones. This fixes an AVIC regression. Fixes: 09d952c971a5 ("KVM: check userspace_addr for all memslots") Reported-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-11NFS: Fix direct WRITE throughput regressionChuck Lever1-0/+2
I measured a 50% throughput regression for large direct writes. The observed on-the-wire behavior is that the client sends every NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once as a FILE_SYNC WRITE. This is because the nfs_write_match_verf() check in nfs_direct_commit_complete() fails for every WRITE. Buffered writes use nfs_write_completion(), which sets req->wb_verf correctly. Direct writes use nfs_direct_write_completion(), which does not set req->wb_verf at all. This leaves req->wb_verf set to all zeroes for every direct WRITE, and thus nfs_direct_commit_completion() always sets NFS_ODIRECT_RESCHED_WRITES. This fix appears to restore nearly all of the lost performance. Fixes: 1f28476dcb98 ("NFS: Fix O_DIRECT commit verifier handling") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: rpc_xprt lifetime events should record xprt->stateChuck Lever1-1/+29
Help troubleshoot the logic that uses these flags. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11xprtrdma: Make xprt_rdma_slot_table_entries staticZou Wei1-1/+1
Fix the following sparse warning: net/sunrpc/xprtrdma/transport.c:71:14: warning: symbol 'xprt_rdma_slot_table_entries' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11nfs: set invalid blocks after NFSv4 writesZheng Bin2-3/+12
Use the following command to test nfsv4(size of file1M is 1MB): mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1/dir1 /mnt cp file1M /mnt du -h /mnt/file1M -->0 within 60s, then 1M When write is done(cp file1M /mnt), will call this: nfs_writeback_done nfs4_write_done nfs4_write_done_cb nfs_writeback_update_inode nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime nfs_post_op_update_inode_force_wcc_locked nfs_set_cache_invalid nfs_refresh_inode_locked nfs_update_inode nfsd write response contains change, ctime, mtime, the flag will be clear after nfs_update_inode. Howerver, write response does not contain space_used, previous open response contains space_used whose value is 0, so inode->i_blocks is still 0. nfs_getattr -->called by "du -h" do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s cache_validity = READ_ONCE(NFS_I(inode)->cache_validity) do_update |= cache_validity & (NFS_INO_INVALID_ATTR -->false if (do_update) { __nfs_revalidate_inode } Within 60s, does not send getattr request to nfsd, thus "du -h /mnt/file1M" is 0. Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done. Fixes: 16e143751727 ("NFS: More fine grained attribute tracking") Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11NFS: remove redundant initialization of variable resultColin Ian King1-1/+1
The variable result is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfsXiongfeng Wang1-1/+1
When I cat parameter '/sys/module/sunrpc/parameters/auth_hashtable_size', it displays as follows. It is better to add a newline for easy reading. [root@hulk-202 ~]# cat /sys/module/sunrpc/parameters/auth_hashtable_size 16[root@hulk-202 ~]# Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11NFS: Add a tracepoint in nfs_set_pgio_error()Chuck Lever2-0/+46
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11NFS: Trace short NFS READsChuck Lever2-0/+49
A short read can generate an -EIO error without there being an error on the wire. This tracepoint acts as an eyecatcher when there is no obvious I/O error. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11NFS: nfs_xdr_status should record the procedure nameChuck Lever1-2/+13
When sunrpc trace points are not enabled, the recorded task ID information alone is not helpful. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Set SOFTCONN when destroying GSS contextsChuck Lever1-5/+4
Move the RPC_TASK_SOFTCONN flag into rpc_call_null_helper(). The only minor behavior change is that it is now also set when destroying GSS contexts. This gives a better guarantee that gss_send_destroy_context() will not hang for long if a connection cannot be established. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFTChuck Lever2-5/+4
Clean up. All of rpc_call_null_helper() call sites assert RPC_TASK_SOFT, so move that setting into rpc_call_null_helper() itself. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDSChuck Lever1-2/+2
Clean up. Commit a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.") made rpc_call_null_helper() set RPC_TASK_NULLCREDS unconditionally. Therefore there's no need for rpc_call_null_helper()'s call sites to set RPC_TASK_NULLCREDS. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: trace RPC client lifetime eventsChuck Lever2-24/+126
The "create" tracepoint records parts of the rpc_create arguments, and the shutdown tracepoint records when the rpc_clnt is about to signal pending tasks and destroy auths. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Trace transport lifetime eventsChuck Lever5-23/+46
Refactor: Hoist create/destroy/disconnect tracepoints out of xprtrdma and into the generic RPC client. Some benefits include: - Enable tracing of xprt lifetime events for the socket transport types - Expose the different types of disconnect to help run down issues with lingering connections Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Split the xdr_buf event classChuck Lever4-52/+71
To help tie the recorded xdr_buf to a particular RPC transaction, the client side version of this class should display task ID information and the server side one should show the request's XID. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Add tracepoint to rpc_call_rpcerror()Chuck Lever2-0/+29
Add a tracepoint in another common exit point for failing RPCs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Update the RPC_SHOW_SOCKET() macroChuck Lever1-3/+3
Clean up: remove unnecessary commas, and fix a white-space nit. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Update the rpc_show_task_flags() macroChuck Lever1-1/+7
Recent additions to the RPC_TASK flags neglected to update the tracepoint ENUM definitions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: Trace GSS context lifetimesChuck Lever3-8/+58
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11SUNRPC: receive buffer size estimation values almost never changeChuck Lever4-18/+69
Avoid unnecessary cache sloshing by placing the buffer size estimation update logic behind an atomic bit flag. The size of GSS information included in each wrapped Reply does not change during the lifetime of a GSS context. Therefore, the au_rslack and au_ralign fields need to be updated only once after establishing a fresh GSS credential. Thus a slack size update must occur after a cred is created, duplicated, renewed, or expires. I'm not sure I have this exactly right. A trace point is introduced to track updates to these variables to enable troubleshooting the problem if I missed a spot. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11KVM: selftests: fix sync_with_host() in smm_testVitaly Kuznetsov1-2/+2
It was reported that older GCCs compile smm_test in a way that breaks it completely: kvm_exit: reason EXIT_CPUID rip 0x4014db info 0 0 func 7ffffffd idx 830 rax 0 rbx 0 rcx 0 rdx 0, cpuid entry not found ... kvm_exit: reason EXIT_MSR rip 0x40abd9 info 0 0 kvm_msr: msr_read 487 = 0x0 (#GP) ... Note, '7ffffffd' was supposed to be '80000001' as we're checking for SVM. Dropping '-O2' from compiler flags help. Turns out, asm block in sync_with_host() is wrong. We us 'in 0xe, %%al' instruction to sync with the host and in 'AL' register we actually pass the parameter (stage) but after sync 'AL' gets written to but GCC thinks the value is still there and uses it to compute 'EAX' for 'cpuid'. smm_test can't fully use standard ucall() framework as we need to write a very simple SMI handler there. Fix the immediate issue by making RAX input/output operand. While on it, make sync_with_host() static inline. Reported-by: Marcelo Bandeira Condotta <mcondotta@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610164116.770811-1-vkuznets@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-11KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injectedVitaly Kuznetsov6-6/+12
'Page not present' event may or may not get injected depending on guest's state. If the event wasn't injected, there is no need to inject the corresponding 'page ready' event as the guest may get confused. E.g. Linux thinks that the corresponding 'page not present' event wasn't delivered *yet* and allocates a 'dummy entry' for it. This entry is never freed. Note, 'wakeup all' events have no corresponding 'page not present' event and always get injected. s390 seems to always be able to inject 'page not present', the change is effectively a nop. Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610175532.779793-2-vkuznets@redhat.com> Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208081 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-11KVM: async_pf: Cleanup kvm_setup_async_pf()Vitaly Kuznetsov1-13/+6
schedule_work() returns 'false' only when the work is already on the queue and this can't happen as kvm_setup_async_pf() always allocates a new one. Also, to avoid potential race, it makes sense to to schedule_work() at the very end after we've added it to the queue. While on it, do some minor cleanup. gfn_to_pfn_async() mentioned in a comment does not currently exist and, moreover, we can check kvm_is_error_hva() at the very beginning, before we try to allocate work so 'retry_sync' label can go away completely. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20200610175532.779793-1-vkuznets@redhat.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-11kvm: i8254: remove redundant assignment to pointer sColin Ian King1-1/+0
The pointer s is being assigned a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Message-Id: <20200609233121.1118683-1-colin.king@canonical.com> Fixes: 7837699fa6d7 ("KVM: In kernel PIT model") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-11KVM: x86: respect singlestep when emulating instructionFelipe Franciosi1-1/+1
When userspace configures KVM_GUESTDBG_SINGLESTEP, KVM will manage the presence of X86_EFLAGS_TF via kvm_set/get_rflags on vcpus. The actual rflag bit is therefore hidden from callers. That includes init_emulate_ctxt() which uses the value returned from kvm_get_flags() to set ctxt->tf. As a result, x86_emulate_instruction() will skip a single step, leaving singlestep_rip stale and not returning to userspace. This resolves the issue by observing the vcpu guest_debug configuration alongside ctxt->tf in x86_emulate_instruction(), performing the single step if set. Cc: stable@vger.kernel.org Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Message-Id: <20200519081048.8204-1-felipe@nutanix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>