aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-01-31kcov: ignore fault-inject and stacktraceDmitry Vyukov3-0/+3
Don't instrument 3 more files that contain debugging facilities and produce large amounts of uninteresting coverage for every syscall. The following snippets are sprinkled all over the place in kcov traces in a debugging kernel. We already try to disable instrumentation of stack unwinding code and of most debug facilities. I guess we did not use fault-inject.c at the time, and stacktrace.c was somehow missed (or something has changed in kernel/configs). This change both speeds up kcov (kernel doesn't need to store these PCs, user-space doesn't need to process them) and frees trace buffer capacity for more useful coverage. should_fail lib/fault-inject.c:149 fail_dump lib/fault-inject.c:45 stack_trace_save kernel/stacktrace.c:124 stack_trace_consume_entry kernel/stacktrace.c:86 stack_trace_consume_entry kernel/stacktrace.c:89 ... a hundred frames skipped ... stack_trace_consume_entry kernel/stacktrace.c:93 stack_trace_consume_entry kernel/stacktrace.c:86 Link: http://lkml.kernel.org/r/20200116111449.217744-1-dvyukov@gmail.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31include/linux/io-mapping.h-mapping: use PHYS_PFN() macro in io_mapping_map_atomic_wc()Andy Shevchenko1-3/+2
Use PHYS_PFN() macro in io_mapping_map_atomic_wc() instead of open coded variant. Link: http://lkml.kernel.org/r/20191209165624.56351-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31execve: warn if process starts with executable stackAlexey Dobriyan1-0/+5
There were few episodes of silent downgrade to an executable stack over years: 1) linking innocent looking assembly file will silently add executable stack if proper linker options is not given as well: $ cat f.S .intel_syntax noprefix .text .globl f f: ret $ cat main.c void f(void); int main(void) { f(); return 0; } $ gcc main.c f.S $ readelf -l ./a.out GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RWE 0x10 ^^^ 2) converting C99 nested function into a closure https://nullprogram.com/blog/2019/11/15/ void intsort2(int *base, size_t nmemb, _Bool invert) { int cmp(const void *a, const void *b) { int r = *(int *)a - *(int *)b; return invert ? -r : r; } qsort(base, nmemb, sizeof(*base), cmp); } will silently require stack trampolines while non-closure version will not. Without doubt this behaviour is documented somewhere, add a warning so that developers and users can at least notice. After so many years of x86_64 having proper executable stack support it should not cause too many problems. Link: http://lkml.kernel.org/r/20191208171918.GC19716@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Will Deacon <will@kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31reiserfs: prevent NULL pointer dereference in reiserfs_insert_item()Yunfeng Ye1-1/+2
The variable inode may be NULL in reiserfs_insert_item(), but there is no check before accessing the member of inode. Fix this by adding NULL pointer check before calling reiserfs_debug(). Link: http://lkml.kernel.org/r/79c5135d-ff25-1cc9-4e99-9f572b88cc00@huawei.com Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Cc: zhengbin <zhengbin13@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31init/main.c: fix misleading "This architecture does not have kernel memory protection" messageChristophe Leroy1-0/+5
This message leads to thinking that memory protection is not implemented for the said architecture, whereas absence of CONFIG_STRICT_KERNEL_RWX only means that memory protection has not been selected at compile time. Don't print this message when CONFIG_ARCH_HAS_STRICT_KERNEL_RWX is selected by the architecture. Instead, print "Kernel memory protection not selected by kernel config." Link: http://lkml.kernel.org/r/62477e446d9685459d4f27d193af6ff1bd69d55f.1578557581.git.christophe.leroy@c-s.fr Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Kees Cook <keescook@chromium.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31init/main.c: fix quoted value handling in unknown_bootoptionArvind Sankar1-3/+4
Patch series "init/main.c: minor cleanup/bugfix of envvar handling", v2. unknown_bootoption passes unrecognized command line arguments to init as either environment variables or arguments. Some of the logic in the function is broken for quoted command line arguments. When an argument of the form param="value" is processed by parse_args and passed to unknown_bootoption, the command line has param\0"value\0 with val pointing to the beginning of value. The helper function repair_env_string is then used to restore the '=' character that was removed by parse_args, and strip the quotes off fully. This results in param=value\0\0 and val ends up pointing to the 'a' instead of the 'v' in value. This bug was introduced when repair_env_string was refactored into a separate function, and the decrement of val in repair_env_string became dead code. This causes two problems in unknown_bootoption in the two places where the val pointer is used as a substitute for the length of param: 1. An argument of the form param=".value" is misinterpreted as a potential module parameter, with the result that it will not be placed in init's environment. 2. An argument of the form param="value" is checked to see if param is an existing environment variable that should be overwritten, but the comparison is off-by-one and compares 'param=v' instead of 'param=' against the existing environment. So passing, for example, TERM="vt100" on the command line results in init being passed both TERM=linux and TERM=vt100 in its environment. Patch 1 adds logging for the arguments and environment passed to init and is independent of the rest: it can be dropped if this is unnecessarily verbose. Patch 2 removes repair_env_string from initcall parameter parsing in do_initcall_level, as that uses a separate copy of the command line now and the repairing is no longer necessary. Patch 3 fixes the bug in unknown_bootoption by recording the length of param explicitly instead of implying it from val-param. This patch (of 3): Commit a99cd1125189 ("init: fix bug where environment vars can't be passed via boot args") introduced two minor bugs in unknown_bootoption by factoring out the quoted value handling into a separate function. When value is quoted, repair_env_string will move the value up 1 byte to strip the quotes, so val in unknown_bootoption no longer points to the actual location of the value. The result is that an argument of the form param=".value" is mistakenly treated as a potential module parameter and is not placed in init's environment, and an argument of the form param="value" can result in a duplicate environment variable: eg TERM="vt100" on the command line will result in both TERM=linux and TERM=vt100 being placed into init's environment. Fix this by recording the length of the param before calling repair_env_string instead of relying on val. Link: http://lkml.kernel.org/r/20191212180023.24339-4-nivedita@alum.mit.edu Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31init/main.c: remove unnecessary repair_env_string in do_initcall_levelArvind Sankar1-6/+10
Since commit 08746a65c296 ("init: fix in-place parameter modification regression"), parse_args in do_initcall_level is called on a copy of saved_command_line. It is unnecessary to call repair_env_string during this parsing, as this copy is not used for anything later. Remove the now unnecessary arguments from repair_env_string as well. Link: http://lkml.kernel.org/r/20191212180023.24339-3-nivedita@alum.mit.edu Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Cc: Krzysztof Mazur <krzysiek@podlesie.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31init/main.c: log arguments and environment passed to initArvind Sankar1-0/+8
Extend logging in `run_init_process` to also show the arguments and environment that we are passing to init. Link: http://lkml.kernel.org/r/20191212180023.24339-2-nivedita@alum.mit.edu Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: coredump: allow process with empty address space to coredumpAlexey Dobriyan1-1/+9
Unmapping whole address space at once with munmap(0, (1ULL<<47) - 4096) or equivalent will create empty coredump. It is silly way to exit, however registers content may still be useful. The right to coredump is fundamental right of a process! Link: http://lkml.kernel.org/r/20191222150137.GA1277@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: coredump: delete duplicated overflow checkAlexey Dobriyan1-2/+0
array_size() macro will do overflow check anyway. Link: http://lkml.kernel.org/r/20191222144009.GB24341@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: coredump: allocate core ELF header on stackAlexey Dobriyan1-11/+5
Comment says ELF header is "too large to be on stack". 64 bytes on 64-bit is not large by any means. Link: http://lkml.kernel.org/r/20191222143850.GA24341@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: make BAD_ADDR() unlikelyAlexey Dobriyan1-1/+1
If some mapping goes past TASK_SIZE it will be rejected by kernel which means no such userspace binaries exist. Mark every such check as unlikely. Link: http://lkml.kernel.org/r/20191215124355.GA21124@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: better codegen around current->mmAlexey Dobriyan1-24/+28
"current->mm" pointer is stable in general except few cases one of which execve(2). Compiler can't treat is as stable but it _is_ stable most of the time. During ELF loading process ->mm becomes stable right after flush_old_exec(). Help compiler by caching current->mm, otherwise it continues to refetch it. add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-141 (-141) Function old new delta elf_core_dump 5062 5039 -23 load_elf_binary 5426 5308 -118 Note: other cases are left as is because it is either pessimisation or no change in binary size. Link: http://lkml.kernel.org/r/20191215124755.GB21124@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: don't copy ELF header aroundAlexey Dobriyan1-28/+27
ELF header is read into bprm->buf[] by generic execve code. Save a memcpy and allocate just one header for the interpreter instead of two headers (64 bytes instead of 128 on 64-bit). Link: http://lkml.kernel.org/r/20191208171242.GA19716@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: fix ->start_code calculationAlexey Dobriyan1-1/+1
Only executable segments should be accounted to ->start_code just like they do to ->end_code (correctly). Link: http://lkml.kernel.org/r/20191208171410.GB19716@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31fs/binfmt_elf.c: smaller code generation around auxv vector fillAlexey Dobriyan1-7/+8
Filling auxv vector as array with index (auxv[i++] = ...) generates terrible code. "saved_auxv" should be reworked because it is the worst member of mm_struct by size/usefullness ratio but do it later. Meanwhile help gcc a little with *auxv++ idiom. Space savings on x86_64: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-127 (-127) Function old new delta load_elf_binary 5470 5343 -127 Link: http://lkml.kernel.org/r/20191208172301.GD19716@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31lib/find_bit.c: uninline helper _find_next_bit()Yury Norov1-1/+1
It saves 25% of .text for arm64, and more for BE architectures. Before: $ size lib/find_bit.o text data bss dec hex filename 1012 56 0 1068 42c lib/find_bit.o After: $ size lib/find_bit.o text data bss dec hex filename 776 56 0 832 340 lib/find_bit.o Link: http://lkml.kernel.org/r/20200103202846.21616-3-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Allison Randal <allison@lohutok.net> Cc: William Breathitt Gray <vilhelm.gray@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31lib/find_bit.c: join _find_next_bit{_le}Yury Norov1-45/+19
_find_next_bit and _find_next_bit_le are very similar functions. It's possible to join them by adding 1 parameter and a couple of simple checks. It's simplify maintenance and make possible to shrink the size of .text by un-inlining the unified function (in the following patch). Link: http://lkml.kernel.org/r/20200103202846.21616-2-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Cc: Allison Randal <allison@lohutok.net> Cc: Joe Perches <joe@perches.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31uapi: rename ext2_swab() to swab() and share globally in swab.hYury Norov3-14/+13
ext2_swab() is defined locally in lib/find_bit.c However it is not specific to ext2, neither to bitmaps. There are many potential users of it, so rename it to just swab() and move to include/uapi/linux/swab.h ABI guarantees that size of unsigned long corresponds to BITS_PER_LONG, therefore drop unneeded cast. Link: http://lkml.kernel.org/r/20200103202846.21616-1-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Cc: Allison Randal <allison@lohutok.net> Cc: Joe Perches <joe@perches.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31lib/scatterlist.c: adjust indentation in __sg_alloc_tableNathan Chancellor1-1/+1
Clang warns: ../lib/scatterlist.c:314:5: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] return -ENOMEM; ^ ../lib/scatterlist.c:311:4: note: previous statement is here if (prv) ^ 1 warning generated. This warning occurs because there is a space before the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Link: http://lkml.kernel.org/r/20191218033606.11942-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/830 Fixes: edce6820a9fd ("scatterlist: prevent invalid free when alloc fails") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31btrfs: use larger zlib buffer for s390 hardware compressionMikhail Zaslonko2-36/+101
In order to benefit from s390 zlib hardware compression support, increase the btrfs zlib workspace buffer size from 1 to 4 pages (if s390 zlib hardware support is enabled on the machine). This brings up to 60% better performance in hardware on s390 compared to the PAGE_SIZE buffer and much more compared to the software zlib processing in btrfs. In case of memory pressure, fall back to a single page buffer during workspace allocation. The data compressed with larger input buffers will still conform to zlib standard and thus can be decompressed also on a systems that uses only PAGE_SIZE buffer for btrfs zlib. Link: http://lkml.kernel.org/r/20200108105103.29028-1-zaslonko@linux.ibm.com Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Reviewed-by: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: David Sterba <dsterba@suse.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Eduard Shishkin <edward6@linux.ibm.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31lib/zlib: add zlib_deflate_dfltcc_enabled() functionMikhail Zaslonko5-9/+24
Add a new function to zlib.h checking if s390 Deflate-Conversion facility is installed and enabled. Link: http://lkml.kernel.org/r/20200103223334.20669-6-zaslonko@linux.ibm.com Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Sterba <dsterba@suse.com> Cc: Eduard Shishkin <edward6@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31s390/boot: add dfltcc= kernel command line parameterMikhail Zaslonko9-2/+55
Add the new kernel command line parameter 'dfltcc=' to configure s390 zlib hardware support. Format: { on | off | def_only | inf_only | always } on: s390 zlib hardware support for compression on level 1 and decompression (default) off: No s390 zlib hardware support def_only: s390 zlib hardware support for deflate only (compression on level 1) inf_only: s390 zlib hardware support for inflate only (decompression) always: Same as 'on' but ignores the selected compression level always using hardware support (used for debugging) Link: http://lkml.kernel.org/r/20200103223334.20669-5-zaslonko@linux.ibm.com Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Sterba <dsterba@suse.com> Cc: Eduard Shishkin <edward6@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31lib/zlib: add s390 hardware support for kernel zlib_inflateMikhail Zaslonko7-11/+233
Add decompression functions to zlib_dfltcc library. Update zlib_inflate functions with the hooks for s390 hardware support and adjust workspace structures with extra parameter lists required for hardware inflate decompression. Link: http://lkml.kernel.org/r/20200103223334.20669-4-zaslonko@linux.ibm.com Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Sterba <dsterba@suse.com> Cc: Eduard Shishkin <edward6@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31s390/boot: rename HEAP_SIZE due to name collisionMikhail Zaslonko1-4/+4
Change the conflicting macro name in preparation for zlib_inflate hardware support. Link: http://lkml.kernel.org/r/20200103223334.20669-3-zaslonko@linux.ibm.com Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31lib/zlib: add s390 hardware support for kernel zlib_deflateMikhail Zaslonko11-102/+751
Patch series "S390 hardware support for kernel zlib", v3. With IBM z15 mainframe the new DFLTCC instruction is available. It implements deflate algorithm in hardware (Nest Acceleration Unit - NXU) with estimated compression and decompression performance orders of magnitude faster than the current zlib. This patchset adds s390 hardware compression support to kernel zlib. The code is based on the userspace zlib implementation: https://github.com/madler/zlib/pull/410 The coding style is also preserved for future maintainability. There is only limited set of userspace zlib functions represented in kernel. Apart from that, all the memory allocation should be performed in advance. Thus, the workarea structures are extended with the parameter lists required for the DEFLATE CONVENTION CALL instruction. Since kernel zlib itself does not support gzip headers, only Adler-32 checksum is processed (also can be produced by DFLTCC facility). Like it was implemented for userspace, kernel zlib will compress in hardware on level 1, and in software on all other levels. Decompression will always happen in hardware (when enabled). Two DFLTCC compression calls produce the same results only when they both are made on machines of the same generation, and when the respective buffers have the same offset relative to the start of the page. Therefore care should be taken when using hardware compression when reproducible results are desired. However it does always produce the standard conform output which can be inflated anyway. The new kernel command line parameter 'dfltcc' is introduced to configure s390 zlib hardware support: Format: { on | off | def_only | inf_only | always } on: s390 zlib hardware support for compression on level 1 and decompression (default) off: No s390 zlib hardware support def_only: s390 zlib hardware support for deflate only (compression on level 1) inf_only: s390 zlib hardware support for inflate only (decompression) always: Same as 'on' but ignores the selected compression level always using hardware support (used for debugging) The main purpose of the integration of the NXU support into the kernel zlib is the use of hardware deflate in btrfs filesystem with on-the-fly compression enabled. Apart from that, hardware support can also be used during boot for decompressing the kernel or the ramdisk image With the patch for btrfs expanding zlib buffer from 1 to 4 pages (patch 6) the following performance results have been achieved using the ramdisk with btrfs. These are relative numbers based on throughput rate and compression ratio for zlib level 1: Input data Deflate rate Inflate rate Compression ratio NXU/Software NXU/Software NXU/Software stream of zeroes 1.46 1.02 1.00 random ASCII data 10.44 3.00 0.96 ASCII text (dickens) 6,21 3.33 0.94 binary data (vmlinux) 8,37 3.90 1.02 This means that s390 hardware deflate can provide up to 10 times faster compression (on level 1) and up to 4 times faster decompression (refers to all compression levels) for btrfs zlib. Disclaimer: Performance results are based on IBM internal tests using DD command-line utility on btrfs on a Fedora 30 based internal driver in native LPAR on a z15 system. Results may vary based on individual workload, configuration and software levels. This patch (of 9): Create zlib_dfltcc library with the s390 DEFLATE CONVERSION CALL implementation and related compression functions. Update zlib_deflate functions with the hooks for s390 hardware support and adjust workspace structures with extra parameter lists required for hardware deflate. Link: http://lkml.kernel.org/r/20200103223334.20669-2-zaslonko@linux.ibm.com Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Chris Mason <clm@fb.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Sterba <dsterba@suse.com> Cc: Eduard Shishkin <edward6@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31iio: adc: qcom-vadc-common: use <linux/units.h> helpersAkinobu Mita2-4/+3
This switches the qcom-vadc-common to use milli_kelvin_to_millicelsius() in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-13-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Andy Shevchenko <andy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Jean Delvare <jdelvare@suse.com> Cc: Jens Axboe <axboe@fb.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Keith Busch <kbusch@kernel.org> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31thermal: armada: remove unused TO_MCELSIUS macroAkinobu Mita1-2/+0
This removes unused TO_MCELSIUS() macro. Link: http://lkml.kernel.org/r/1576386975-7941-12-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Andy Shevchenko <andy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Darren Hart <dvhart@infradead.org> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Jean Delvare <jdelvare@suse.com> Cc: Jens Axboe <axboe@fb.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Keith Busch <kbusch@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31iwlwifi: use <linux/units.h> helpersAkinobu Mita2-7/+4
This switches the iwlwifi driver to use celsius_to_kelvin() and kelvin_to_celsius() in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-11-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Luca Coelho <luciano.coelho@intel.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Andy Shevchenko <andy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Jean Delvare <jdelvare@suse.com> Cc: Jens Axboe <axboe@fb.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31iwlegacy: use <linux/units.h> helpersAkinobu Mita3-12/+11
This switches the iwlegacy driver to use celsius_to_kelvin() and kelvin_to_celsius() in <linux/units.h>. [akinobu.mita@gmail.com: fix build warnings with format string] Link: http://lkml.kernel.org/r/1579014483-9226-1-git-send-email-akinobu.mita@gmail.com Link: https://lore.kernel.org/r/20200106171452.201c3b4c@canb.auug.org.au Link: http://lkml.kernel.org/r/1576386975-7941-10-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Kalle Valo <kvalo@codeaurora.org> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Andy Shevchenko <andy@infradead.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Jean Delvare <jdelvare@suse.com> Cc: Jens Axboe <axboe@fb.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Keith Busch <kbusch@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31thermal: remove kelvin to/from Celsius conversion helpers from <linux/thermal.h>Akinobu Mita1-11/+0
This removes the kelvin to/from Celsius conversion helper macros in <linux/thermal.h> which were switched to the inline helper functions in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-9-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31nvme: hwmon: switch to use <linux/units.h> helpersAkinobu Mita1-8/+5
This switches the nvme driver to use kelvin_to_millicelsius() and millicelsius_to_kelvin() in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-8-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31thermal: intel_pch: switch to use <linux/units.h> helpersAkinobu Mita1-1/+2
This switches the intel pch thermal driver to use deci_kelvin_to_millicelsius() in <linux/units.h> instead of helpers in <linux/thermal.h>. This is preparation for centralizing the kelvin to/from Celsius conversion helpers in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-7-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31thermal: int340x: switch to use <linux/units.h> helpersAkinobu Mita1-3/+4
This switches the int340x thermal zone driver to use deci_kelvin_to_millicelsius() and millicelsius_to_deci_kelvin() in <linux/units.h> instead of helpers in <linux/thermal.h>. This is preparation for centralizing the kelvin to/from Celsius conversion helpers in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-6-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31platform/x86: intel_menlow: switch to use <linux/units.h> helpersAkinobu Mita1-3/+6
This switches the intel_menlow driver to use deci_kelvin_to_celsius() and celsius_to_deci_kelvin() in <linux/units.h> instead of helpers in <linux/thermal.h>. This is preparation for centralizing the kelvin to/from Celsius conversion helpers in <linux/units.h>. This also removes a trailing space, while we're at it. Link: http://lkml.kernel.org/r/1576386975-7941-5-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31platform/x86: asus-wmi: switch to use <linux/units.h> helpersAkinobu Mita1-4/+3
The asus-wmi driver doesn't implement the thermal device functionality directly, so including <linux/thermal.h> just for DECI_KELVIN_TO_CELSIUS() is a bit odd. This switches the asus-wmi driver to use deci_kelvin_to_millicelsius() in <linux/units.h>. The format string is changed from %d to %ld due to function returned type. Link: http://lkml.kernel.org/r/1576386975-7941-4-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31ACPI: thermal: switch to use <linux/units.h> helpersAkinobu Mita1-16/+18
This switches the ACPI thermal zone driver to use celsius_to_deci_kelvin(), deci_kelvin_to_celsius(), and deci_kelvin_to_millicelsius_with_offset() in <linux/units.h> instead of helpers in <linux/thermal.h>. This is preparation for centralizing the kelvin to/from Celsius conversion helpers in <linux/units.h>. Link: http://lkml.kernel.org/r/1576386975-7941-3-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31include/linux/units.h: add helpers for kelvin to/from Celsius conversionAkinobu Mita1-0/+84
Patch series "add header file for kelvin to/from Celsius conversion helpers", v4. There are several helper macros to convert kelvin to/from Celsius in <linux/thermal.h> for thermal drivers. These are useful for any other drivers or subsystems, but it's odd to include <linux/thermal.h> just for the helpers. This adds a new <linux/units.h> that provides the equivalent inline functions for any drivers or subsystems, and switches all the users of conversion helpers in <linux/thermal.h> to use <linux/units.h> helpers. This patch (of 12): There are several helper macros to convert kelvin to/from Celsius in <linux/thermal.h> for thermal drivers. These are useful for any other drivers or subsystems, but it's odd to include <linux/thermal.h> just for the helpers. This adds a new <linux/units.h> that provides the equivalent inline functions for any drivers or subsystems. It is intended to replace the helpers in <linux/thermal.h>. Link: http://lkml.kernel.org/r/1576386975-7941-2-git-send-email-akinobu.mita@gmail.com Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Sujith Thomas <sujith.thomas@intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Amit Kucheria <amit.kucheria@verdurent.com> Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Hartmut Knaack <knaack.h@gmx.de> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Cc: Andy Shevchenko <andy@infradead.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_storeColin Ian King1-2/+1
Currently when an error code -EIO or -ENOSPC in the for-loop of writeback_store the error code is being overwritten by a ret = len assignment at the end of the function and the error codes are being lost. Fix this by assigning ret = len at the start of the function and remove the assignment from the end, hence allowing ret to be preserved when error codes are assigned to it. Addresses Coverity ("Unused value") Link: http://lkml.kernel.org/r/20191128122958.178290-1-colin.king@canonical.com Fixes: a939888ec38b ("zram: support idle/huge page writeback") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31zram: try to avoid worst-case scenario on same element pagesTaejoon Song1-2/+5
The worst-case scenario on finding same element pages is that almost all elements are same at the first glance but only last few elements are different. Since the same element tends to be grouped from the beginning of the pages, if we check the first element with the last element before looping through all elements, we might have some chances to quickly detect non-same element pages. 1. Test is done under LG webOS TV (64-bit arch) 2. Dump the swap-out pages (~819200 pages) 3. Analyze the pages with simple test script which counts the iteration number and measures the speed at off-line Under 64-bit arch, the worst iteration count is PAGE_SIZE / 8 bytes = 512. The speed is based on the time to consume page_same_filled() function only. The result, on average, is listed as below: Num of Iter Speed(MB/s) Looping-Forward (Orig) 38 99265 Looping-Backward 36 102725 Last-element-check (This Patch) 33 125072 The result shows that the average iteration count decreases by 13% and the speed increases by 25% with this patch. This patch does not increase the overall time complexity, though. I also ran simpler version which uses backward loop. Just looping backward also makes some improvement, but less than this patch. [taejoon.song@lge.com: fix off-by-one] Link: http://lkml.kernel.org/r/1578642001-11765-1-git-send-email-taejoon.song@lge.com Link: http://lkml.kernel.org/r/1575424418-16119-1-git-send-email-taejoon.song@lge.com Signed-off-by: Taejoon Song <taejoon.song@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31mm: fix comments related to node reclaimHao Lee2-2/+2
As zone reclaim has been replaced by node reclaim, this patch fixes related comments. Link: http://lkml.kernel.org/r/20191126141346.GA22665@haolee.github.io Signed-off-by: Hao Lee <haolee.swjtu@gmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31include/linux/memory.h: drop fields 'hw' and 'phys_callback' from struct memory_blockAnshuman Khandual1-2/+0
memory_block structure elements 'hw' and 'phys_callback' are not getting used. This was originally added with commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") but never seem to have been used. Just drop them now. Link: http://lkml.kernel.org/r/1576728650-13867-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31include/linux/mm.h: remove dead code totalram_pages_set()Wei Yang1-5/+0
totalram_pages_set() was introduced in commit ca79b0c211af ("mm: convert totalram_pages and totalhigh_pages variables to atomic"), but no one uses it. Link: http://lkml.kernel.org/r/20191218005543.24146-1-richardw.yang@linux.intel.com Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31include/linux/mm.h: clean up obsolete check on space in page->flagsYu Zhao1-4/+0
The check was intended to make sure we don't overrun page flags. But it's obsolete because it doesn't include LAST_CPUPID_WIDTH nor KASAN_TAG_WIDTH. Just remove check since we already have it covered in linux/page-flags-layout.h (near the end of the file). Link: http://lkml.kernel.org/r/20191208183508.89177-1-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31zswap: potential NULL dereference on error in init_zswap()Dan Carpenter1-1/+2
The "pool" pointer can be NULL at the end of the init_zswap(). (We would allocate a new pool later in that situation) So in the error handling then we need to make sure pool is a valid pointer before calling "zswap_pool_destroy(pool);" because that function dereferences the argument. Link: http://lkml.kernel.org/r/20200114050902.og32fkllkod5ycf5@kili.mountain Fixes: 93d4dfa9fbd0 ("mm/zswap.c: add allocation hysteresis if pool limit is hit") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31mm/zswap.c: add allocation hysteresis if pool limit is hitVitaly Wool2-31/+67
zswap will always try to shrink pool when zswap is full. If there is a high pressure on zswap it will result in flipping pages in and out zswap pool without any real benefit, and the overall system performance will drop. The previous discussion on this subject [1] ended up with a suggestion to implement a sort of hysteresis to refuse taking pages into zswap pool until it has sufficient space if the limit has been hit. This is my take on this. Hysteresis is controlled with a sysfs-configurable parameter (namely, /sys/kernel/debug/zswap/accept_threhsold_percent). It specifies the threshold at which zswap would start accepting pages again after it became full. Setting this parameter to 100 disables the hysteresis and sets the zswap behavior to pre-hysteresis state. [1] https://lkml.org/lkml/2019/11/8/949 Link: http://lkml.kernel.org/r/20200108200118.15563-1-vitaly.wool@konsulko.com Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31mm/page_isolation: fix potential warning from userQian Cai2-14/+15
It makes sense to call the WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE) from start_isolate_page_range(), but should avoid triggering it from userspace, i.e, from is_mem_section_removable() because it could crash the system by a non-root user if warn_on_panic is set. While at it, simplify the code a bit by removing an unnecessary jump label. Link: http://lkml.kernel.org/r/20200120163915.1469-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Suggested-by: Michal Hocko <mhocko@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31mm/hotplug: silence a lockdep splat with printk()Qian Cai4-17/+31
It is not that hard to trigger lockdep splats by calling printk from under zone->lock. Most of them are false positives caused by lock chains introduced early in the boot process and they do not cause any real problems (although most of the early boot lock dependencies could happen after boot as well). There are some console drivers which do allocate from the printk context as well and those should be fixed. In any case, false positives are not that trivial to workaround and it is far from optimal to lose lockdep functionality for something that is a non-issue. So change has_unmovable_pages() so that it no longer calls dump_page() itself - instead it returns a "struct page *" of the unmovable page back to the caller so that in the case of a has_unmovable_pages() failure, the caller can call dump_page() after releasing zone->lock. Also, make dump_page() is able to report a CMA page as well, so the reason string from has_unmovable_pages() can be removed. Even though has_unmovable_pages doesn't hold any reference to the returned page this should be reasonably safe for the purpose of reporting the page (dump_page) because it cannot be hotremoved in the context of memory unplug. The state of the page might change but that is the case even with the existing code as zone->lock only plays role for free pages. While at it, remove a similar but unnecessary debug-only printk() as well. A sample of one of those lockdep splats is, WARNING: possible circular locking dependency detected ------------------------------------------------------ test.sh/8653 is trying to acquire lock: ffffffff865a4460 (console_owner){-.-.}, at: console_unlock+0x207/0x750 but task is already holding lock: ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at: __offline_isolated_pages+0x179/0x3e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&(&zone->lock)->rlock){-.-.}: __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 _raw_spin_lock+0x2f/0x40 rmqueue_bulk.constprop.21+0xb6/0x1160 get_page_from_freelist+0x898/0x22c0 __alloc_pages_nodemask+0x2f3/0x1cd0 alloc_pages_current+0x9c/0x110 allocate_slab+0x4c6/0x19c0 new_slab+0x46/0x70 ___slab_alloc+0x58b/0x960 __slab_alloc+0x43/0x70 __kmalloc+0x3ad/0x4b0 __tty_buffer_request_room+0x100/0x250 tty_insert_flip_string_fixed_flag+0x67/0x110 pty_write+0xa2/0xf0 n_tty_write+0x36b/0x7b0 tty_write+0x284/0x4c0 __vfs_write+0x50/0xa0 vfs_write+0x105/0x290 redirected_tty_write+0x6a/0xc0 do_iter_write+0x248/0x2a0 vfs_writev+0x106/0x1e0 do_writev+0xd4/0x180 __x64_sys_writev+0x45/0x50 do_syscall_64+0xcc/0x76c entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #2 (&(&port->lock)->rlock){-.-.}: __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 _raw_spin_lock_irqsave+0x3a/0x50 tty_port_tty_get+0x20/0x60 tty_port_default_wakeup+0xf/0x30 tty_port_tty_wakeup+0x39/0x40 uart_write_wakeup+0x2a/0x40 serial8250_tx_chars+0x22e/0x440 serial8250_handle_irq.part.8+0x14a/0x170 serial8250_default_handle_irq+0x5c/0x90 serial8250_interrupt+0xa6/0x130 __handle_irq_event_percpu+0x78/0x4f0 handle_irq_event_percpu+0x70/0x100 handle_irq_event+0x5a/0x8b handle_edge_irq+0x117/0x370 do_IRQ+0x9e/0x1e0 ret_from_intr+0x0/0x2a cpuidle_enter_state+0x156/0x8e0 cpuidle_enter+0x41/0x70 call_cpuidle+0x5e/0x90 do_idle+0x333/0x370 cpu_startup_entry+0x1d/0x1f start_secondary+0x290/0x330 secondary_startup_64+0xb6/0xc0 -> #1 (&port_lock_key){-.-.}: __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 _raw_spin_lock_irqsave+0x3a/0x50 serial8250_console_write+0x3e4/0x450 univ8250_console_write+0x4b/0x60 console_unlock+0x501/0x750 vprintk_emit+0x10d/0x340 vprintk_default+0x1f/0x30 vprintk_func+0x44/0xd4 printk+0x9f/0xc5 -> #0 (console_owner){-.-.}: check_prev_add+0x107/0xea0 validate_chain+0x8fc/0x1200 __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 console_unlock+0x269/0x750 vprintk_emit+0x10d/0x340 vprintk_default+0x1f/0x30 vprintk_func+0x44/0xd4 printk+0x9f/0xc5 __offline_isolated_pages.cold.52+0x2f/0x30a offline_isolated_pages_cb+0x17/0x30 walk_system_ram_range+0xda/0x160 __offline_pages+0x79c/0xa10 offline_pages+0x11/0x20 memory_subsys_offline+0x7e/0xc0 device_offline+0xd5/0x110 state_store+0xc6/0xe0 dev_attr_store+0x3f/0x60 sysfs_kf_write+0x89/0xb0 kernfs_fop_write+0x188/0x240 __vfs_write+0x50/0xa0 vfs_write+0x105/0x290 ksys_write+0xc6/0x160 __x64_sys_write+0x43/0x50 do_syscall_64+0xcc/0x76c entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: console_owner --> &(&port->lock)->rlock --> &(&zone->lock)->rlock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&zone->lock)->rlock); lock(&(&port->lock)->rlock); lock(&(&zone->lock)->rlock); lock(console_owner); *** DEADLOCK *** 9 locks held by test.sh/8653: #0: ffff88839ba7d408 (sb_writers#4){.+.+}, at: vfs_write+0x25f/0x290 #1: ffff888277618880 (&of->mutex){+.+.}, at: kernfs_fop_write+0x128/0x240 #2: ffff8898131fc218 (kn->count#115){.+.+}, at: kernfs_fop_write+0x138/0x240 #3: ffffffff86962a80 (device_hotplug_lock){+.+.}, at: lock_device_hotplug_sysfs+0x16/0x50 #4: ffff8884374f4990 (&dev->mutex){....}, at: device_offline+0x70/0x110 #5: ffffffff86515250 (cpu_hotplug_lock.rw_sem){++++}, at: __offline_pages+0xbf/0xa10 #6: ffffffff867405f0 (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x87/0x2f0 #7: ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at: __offline_isolated_pages+0x179/0x3e0 #8: ffffffff865a4920 (console_lock){+.+.}, at: vprintk_emit+0x100/0x340 stack backtrace: Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10, BIOS U34 05/21/2019 Call Trace: dump_stack+0x86/0xca print_circular_bug.cold.31+0x243/0x26e check_noncircular+0x29e/0x2e0 check_prev_add+0x107/0xea0 validate_chain+0x8fc/0x1200 __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 console_unlock+0x269/0x750 vprintk_emit+0x10d/0x340 vprintk_default+0x1f/0x30 vprintk_func+0x44/0xd4 printk+0x9f/0xc5 __offline_isolated_pages.cold.52+0x2f/0x30a offline_isolated_pages_cb+0x17/0x30 walk_system_ram_range+0xda/0x160 __offline_pages+0x79c/0xa10 offline_pages+0x11/0x20 memory_subsys_offline+0x7e/0xc0 device_offline+0xd5/0x110 state_store+0xc6/0xe0 dev_attr_store+0x3f/0x60 sysfs_kf_write+0x89/0xb0 kernfs_fop_write+0x188/0x240 __vfs_write+0x50/0xa0 vfs_write+0x105/0x290 ksys_write+0xc6/0x160 __x64_sys_write+0x43/0x50 do_syscall_64+0xcc/0x76c entry_SYSCALL_64_after_hwframe+0x49/0xbe Link: http://lkml.kernel.org/r/20200117181200.20299-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31mm/memory_hotplug: pass in nid to online_pages()David Hildenbrand3-15/+7
Patch series "mm/memory_hotplug: pass in nid to online_pages()". Simplify onlining code and get rid of find_memory_block(). Pass in the nid from the memory block we are trying to online directly, instead of manually looking it up. This patch (of 2): No need to lookup the memory block, we can directly pass in the nid. Link: http://lkml.kernel.org/r/20200113113354.6341-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31mm/mmap.c: get rid of odd jump labels in find_mergeable_anon_vma()Miaohe Lin1-20/+16
The jump labels try_prev and none are not really needed in find_mergeable_anon_vma(), eliminate them to improve readability. Link: http://lkml.kernel.org/r/1574079844-17493-1-git-send-email-linmiaohe@huawei.com Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>