aboutsummaryrefslogtreecommitdiffstats
path: root/Kconfig (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-10-02kbuild: use obj-y instead extra-y for objects placed at the headMasahiro Yamada29-113/+91
The objects placed at the head of vmlinux need special treatments: - arch/$(SRCARCH)/Makefile adds them to head-y in order to place them before other archives in the linker command line. - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of obj-y to avoid them going into built-in.a. This commit gets rid of the latter. Create vmlinux.a to collect all the objects that are unconditionally linked to vmlinux. The objects listed in head-y are moved to the head of vmlinux.a by using 'ar m'. With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y for builtin objects. There is no *.o that is directly linked to vmlinux. Drop unneeded code in scripts/clang-tools/gen_compile_commands.py. $(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested by Nathan Chancellor [1]. [1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29kbuild: hide error checker logs for V=1 buildsMasahiro Yamada1-2/+2
V=1 (verbose build) shows commands executed by Make, but it may cause misunderstanding. For example, the following command shows the outstanding error message. $ make V=1 INSTALL_PATH=/tmp install test -e include/generated/autoconf.h -a -e include/config/auto.conf || ( \ echo >&2; \ echo >&2 " ERROR: Kernel configuration is invalid."; \ echo >&2 " include/generated/autoconf.h or include/config/auto.conf are missing.";\ echo >&2 " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ echo >&2 ; \ /bin/false) unset sub_make_done; ./scripts/install.sh It is not an error. Make just showed the recipe lines it has executed, but people may think that 'make install' has failed. Likewise, the combination of V=1 and O= shows confusing "*** The source tree is not clean, please run 'make mrproper'". Suppress such misleading logs. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29kbuild: re-run modpost when it is updatedMasahiro Yamada1-6/+8
Modpost generates .vmlinux.export.c and *.mod.c, which are prerequisites of vmlinux and modules, respectively. The modpost stage should be re-run when the modpost code is updated. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: unify two modpost invocationsMasahiro Yamada4-76/+54
Currently, modpost is executed twice; first for vmlinux, second for modules. This commit merges them. Current build flow ================== 1) build obj-y and obj-m objects 2) link vmlinux.o 3) modpost for vmlinux 4) link vmlinux 5) modpost for modules 6) link modules (*.ko) The build steps 1) through 6) are serialized, that is, modules are built after vmlinux. You do not get benefits of parallel builds when scripts/link-vmlinux.sh is being run. New build flow ============== 1) build obj-y and obj-m objects 2) link vmlinux.o 3) modpost for vmlinux and modules 4a) link vmlinux 4b) link modules (*.ko) In the new build flow, modpost is invoked just once. vmlinux and modules are built in parallel. One exception is CONFIG_DEBUG_INFO_BTF_MODULES=y, where modules depend on vmlinux. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29kbuild: move vmlinux.o rule to the top MakefileMasahiro Yamada2-7/+5
Move the build rules of vmlinux.o out of scripts/link-vmlinux.sh to clearly separate 1) pre-modpost, 2) modpost, 3) post-modpost stages. This will make further refactoring possible. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29kbuild: move .vmlinux.objs rule to Makefile.modpostMasahiro Yamada3-21/+29
.vmlinux.objs is used by modpost, so scripts/Makefile.modpost is a better place to generate it. It is used only when CONFIG_MODVERSIONS=y. It should be guarded by "ifdef CONFIG_MODVERSIONS". Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29kbuild: list sub-directories in ./KbuildMasahiro Yamada3-43/+46
Use the ordinary obj-y syntax to list subdirectories. Note1: Previously, the link order of lib-y depended on CONFIG_MODULES; lib-y was linked before drivers-y when CONFIG_MODULES=y, otherwise after drivers-y. This was a bug of commit 7273ad2b08f8 ("kbuild: link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y"), but it was not a big deal after all. Now, all objects listed in lib-y are linked last, irrespective of CONFIG_MODULES. Note2: Finally, the single target build in arch/*/lib/ works correctly. There was a bug report about this. [1] $ make ARCH=arm arch/arm/lib/findbit.o CALL scripts/checksyscalls.sh AS arch/arm/lib/findbit.o [1]: https://lore.kernel.org/linux-kbuild/YvUQOwL6lD4%2F5%2FU6@shell.armlinux.org.uk/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29Makefile.compiler: replace cc-ifversion with compiler-specific macrosNick Desaulniers5-22/+29
cc-ifversion is GCC specific. Replace it with compiler specific variants. Update the users of cc-ifversion to use these new macros. Link: https://github.com/ClangBuiltLinux/linux/issues/350 Link: https://lore.kernel.org/llvm/CAGG=3QWSAUakO42kubrCap8fp-gm1ERJJAYXTnP1iHk_wrH=BQ@mail.gmail.com/ Suggested-by: Bill Wendling <morbo@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: rpm-pkg: fix breakage when V=1 is usedJanis Schoetterl-Glausch1-2/+2
Doing make V=1 binrpm-pkg results in: Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.EgV6qJ + umask 022 + cd . + /bin/rm -rf /home/scgl/rpmbuild/BUILDROOT/kernel-6.0.0_rc5+-1.s390x + /bin/mkdir -p /home/scgl/rpmbuild/BUILDROOT + /bin/mkdir /home/scgl/rpmbuild/BUILDROOT/kernel-6.0.0_rc5+-1.s390x + mkdir -p /home/scgl/rpmbuild/BUILDROOT/kernel-6.0.0_rc5+-1.s390x/boot + make -f ./Makefile image_name + cp test -e include/generated/autoconf.h -a -e include/config/auto.conf || ( \ echo >&2; \ echo >&2 " ERROR: Kernel configuration is invalid."; \ echo >&2 " include/generated/autoconf.h or include/config/auto.conf are missing.";\ echo >&2 " Run 'make oldconfig && make prepare' on kernel src to fix it."; \ echo >&2 ; \ /bin/false) arch/s390/boot/bzImage /home/scgl/rpmbuild/BUILDROOT/kernel-6.0.0_rc5+-1.s390x/boot/vmlinuz-6.0.0-rc5+ cp: invalid option -- 'e' Try 'cp --help' for more information. error: Bad exit status from /var/tmp/rpm-tmp.EgV6qJ (%install) Because the make call to get the image name is verbose and prints additional information. Fixes: 993bdde94547 ("kbuild: add image_name to no-sync-config-targets") Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29scripts: remove unused argument 'type'Zeng Heng1-3/+3
Remove unused function argument, and there is no logic changes. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29Kconfig: remove sym_set_choice_valueZeng Heng2-6/+1
sym_set_choice_value could be removed and directly call sym_set_tristate_value instead. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29linux/export: use inline assembler to populate symbol CRCsMasahiro Yamada1-2/+4
Since commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS"), the module versioning on the (non-upstreamed-yet) kvx Linux port is broken due to unexpected padding for __crc_* symbols. The kvx GCC adds padding so u32 gets 8-byte alignment instead of 4. I do not know if this happens for upstream architectures in general, but any compiler has the freedom to insert padding for faster access. Use the inline assembler to directly specify the wanted data layout. This is how we previously did before the breakage. Link: https://lore.kernel.org/lkml/20220817161438.32039-1-ysionneau@kalray.eu/ Link: https://lore.kernel.org/linux-kbuild/31ce5305-a76b-13d7-ea55-afca82c46cf2@kalray.eu/ Fixes: 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS") Reported-by: Yann Sionneau <ysionneau@kalray.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Yann Sionneau <ysionneau@kalray.eu>
2022-09-29kbuild: use objtool-args-y to clean up objtool argumentsMasahiro Yamada2-26/+20
Based on Linus' patch. Refactor scripts/Makefile.vmlinux_o as well. Link: https://lore.kernel.org/lkml/CAHk-=wgjTMQgiKzBZTmb=uWGDEQxDdyF1+qxBkODYciuNsmwnw@mail.gmail.com/ Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-09-29kbuild: fix and refactor single target buildMasahiro Yamada2-43/+20
The single target build has a subtle bug for the combination for an individual file and a subdirectory. [1] 'make kernel/fork.i' builds only kernel/fork.i $ make kernel/fork.i CALL scripts/checksyscalls.sh DESCEND objtool CPP kernel/fork.i [2] 'make kernel/' builds only under the kernel/ directory. $ make kernel/ CALL scripts/checksyscalls.sh DESCEND objtool CC kernel/fork.o CC kernel/exec_domain.o [snip] CC kernel/rseq.o AR kernel/built-in.a But, if you try to do [1] and [2] in a single command, you will get only [1] with a weird log: $ make kernel/fork.i kernel/ CALL scripts/checksyscalls.sh DESCEND objtool CPP kernel/fork.i make[2]: Nothing to be done for 'kernel/'. With 'make kernel/fork.i kernel/', you should get both [1] and [2]. Rewrite the single target build. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: rewrite check-local-export in sh/awkOwen Rafferty1-49/+48
Remove the bash build dependency for those who otherwise do not have it installed. This also provides a significant speedup: $ make defconfig $ make yes2modconfig ... $ find . -name "*.o" | grep -v vmlinux | wc 3169 3169 89615 $ export NM=nm $ time sh -c 'find . -name "*.o" | grep -v vmlinux | xargs -n1 ./scripts/check-local-export' Without patch: 0m15.90s real 0m12.17s user 0m05.28s system With patch: dash + nawk 0m02.16s real 0m02.92s user 0m00.34s system dash + busybox awk 0m02.36s real 0m03.36s user 0m00.34s system dash + gawk 0m02.07s real 0m03.26s user 0m00.32s system bash + gawk 0m03.55s real 0m05.00s user 0m00.54s system Signed-off-by: Owen Rafferty <owen@owenrafferty.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29Revert "kbuild: Make scripts/compile.h when sh != bash"Masahiro Yamada1-3/+0
This reverts commit [1] in the pre-git era. I do not know what problem happened in the script when sh != bash because there is no commit message. Now that this script is much simpler than it used to be, let's revert it, and let' see. (If this turns out to be problematic, fix the code with proper commit description.) [1]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=11acbbbb8a50f4de7dbe4bc1b5acc440dfe81810 Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29scripts/mkcompile_h: move LC_ALL=C to '$LD -v'Masahiro Yamada1-6/+2
Minimize the scope of LC_ALL=C like before commit 87c94bfb8ad3 ("kbuild: override build timestamp & version"). Give LC_ALL=C to '$LD -v' to get the consistent version output, as commit bcbcf50f5218 ("kbuild: fix ld-version.sh to not be affected by locale") mentioned the LD version is affected by locale. While I was here, I merged two sed invocations. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: generate include/generated/compile.h in top MakefileMasahiro Yamada2-8/+8
Now that UTS_VERSION was separated out, this header can be generated much earlier, and probably the top Makefile is a better place to do it than init/Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: build init/built-in.a just onceMasahiro Yamada12-131/+120
Kbuild builds init/built-in.a twice; first during the ordinary directory descending, second from scripts/link-vmlinux.sh. We do this because UTS_VERSION contains the build version and the timestamp. We cannot update it during the normal directory traversal since we do not yet know if we need to update vmlinux. UTS_VERSION is temporarily calculated, but omitted from the update check. Otherwise, vmlinux would be rebuilt every time. When Kbuild results in running link-vmlinux.sh, it increments the version number in the .version file and takes the timestamp at that time to really fix UTS_VERSION. However, updating the same file twice is a footgun. To avoid nasty timestamp issues, all build artifacts that depend on init/built-in.a are atomically generated in link-vmlinux.sh, where some of them do not need rebuilding. To fix this issue, this commit changes as follows: [1] Split UTS_VERSION out to include/generated/utsversion.h from include/generated/compile.h include/generated/utsversion.h is generated just before the vmlinux link. It is generated under include/generated/ because some decompressors (s390, x86) use UTS_VERSION. [2] Split init_uts_ns and linux_banner out to init/version-timestamp.c from init/version.c init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary directory descending, they are compiled with __weak and used to determine if vmlinux needs relinking. Just before the vmlinux link, they are compiled without __weak to embed the real version and timestamp. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29init/version.c: remove #include <linux/version.h>Masahiro Yamada1-1/+0
This is unneeded since commit 073a9ecb3a73 ("init/version.c: remove Version_<LINUX_VERSION_CODE> symbol"). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: move 'PHONY += modules_prepare' to the common partMasahiro Yamada1-5/+1
Unify the code between in-tree builds and external module builds. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: refactor single builds of *.koMasahiro Yamada1-12/+4
Remove the potentially invalid modules.order instead of using the temporary file. Also, KBUILD_MODULES is don't care for single builds. No need to cancel it. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: remove duplicated dependency between modules and modules_checkMasahiro Yamada1-2/+1
The dependency, "modules: modules_check" is specified twice. Commit 1a998be620a1 ("kbuild: check module name conflict for external modules as well") missed to clean it up. 'PHONY += modules' also appears twice. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29nios2: move core-y in arch/nios2/Makefile to arch/nios2/KbuildMasahiro Yamada2-4/+2
Use obj-y to clean up Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: do not deduplicate modules.orderMasahiro Yamada2-5/+2
The AWK code was added to deduplicate modules.order in case $(obj-m) contains the same module multiple times, but it is actually unneeded since commit b2c885549122 ("kbuild: update modules.order only when contained modules are updated"). The list is already deduplicated before being processed by AWK because $^ is the deduplicated list of prerequisites. (Please note the real-prereqs macro uses $^) Yet, modules.order will contain duplication if two different Makefiles build the same module: foo/Makefile: obj-m += bar/baz.o foo/bar/Makefile: obj-m += baz.o However, the parallel builds cannot properly handle this case in the first place. So, it is better to let it fail (as already done by scripts/modules-check.sh). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: check sha1sum just once for each atomic headerMasahiro Yamada3-43/+25
It is unneeded to check the sha1sum every time. Create the timestamp files to manage it. Add '.' to clean-dirs because 'make clean' must visit ./Kbuild to clean up the timestamp files. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: hard-code KBUILD_ALLDIRS in scripts/Makefile.packageMasahiro Yamada2-7/+6
My future plan is to list subdirectories in ./Kbuild. When it occurs, $(vmlinux-alldirs) will not contain all subdirectories. Let's hard-code the directory list until I get around to implementing a more sophisticated way for generating a source tarball. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-29kbuild: add phony targets to ./KbuildMasahiro Yamada2-14/+14
missing-syscalls and old-atomics are meant to be phony targets. Adding them to always-y is odd. (always-y should generate something). Add a new phony target 'prepare', which depends on all the other. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29kbuild: remove the target in signal traps when interruptedMasahiro Yamada1-1/+22
When receiving some signal, GNU Make automatically deletes the target if it has already been changed by the interrupted recipe. If the target is possibly incomplete due to interruption, it must be deleted so that it will be remade from scratch on the next run of make. Otherwise, the target would remain corrupted permanently because its timestamp had already been updated. Thanks to this behavior of Make, you can stop the build any time by pressing Ctrl-C, and just run 'make' to resume it. Kbuild also relies on this feature, but it is equivalently important for any build systems that make decisions based on timestamps (if you want to support Ctrl-C reliably). However, this does not always work as claimed; Make immediately dies with Ctrl-C if its stderr goes into a pipe. [Test Makefile] foo: echo hello > $@ sleep 3 echo world >> $@ [Test Result] $ make # hit Ctrl-C echo hello > foo sleep 3 ^Cmake: *** Deleting file 'foo' make: *** [Makefile:3: foo] Interrupt $ make 2>&1 | cat # hit Ctrl-C echo hello > foo sleep 3 ^C$ # 'foo' is often left-over The reason is because SIGINT is sent to the entire process group. In this example, SIGINT kills 'cat', and 'make' writes the message to the closed pipe, then dies with SIGPIPE before cleaning the target. A typical bad scenario (as reported by [1], [2]) is to save build log by using the 'tee' command: $ make 2>&1 | tee log This can be problematic for any build systems based on Make, so I hope it will be fixed in GNU Make. The maintainer of GNU Make stated this is a long-standing issue and difficult to fix [3]. It has not been fixed yet as of writing. So, we cannot rely on Make cleaning the target. We can do it by ourselves, in signal traps. As far as I understand, Make takes care of SIGHUP, SIGINT, SIGQUIT, and SITERM for the target removal. I added the traps for them, and also for SIGPIPE just in case cmd_* rule prints something to stdout or stderr (but I did not observe an actual case where SIGPIPE was triggered). [Note 1] The trap handler might be worth explaining. rm -f $@; trap - $(sig); kill -s $(sig) $$ This lets the shell kill itself by the signal it caught, so the parent process can tell the child has exited on the signal. Generally, this is a proper manner for handling signals, in case the calling program (like Bash) may monitor WIFSIGNALED() and WTERMSIG() for WCE although this may not be a big deal here because GNU Make handles SIGHUP, SIGINT, SIGQUIT in WUE and SIGTERM in IUE. IUE - Immediate Unconditional Exit WUE - Wait and Unconditional Exit WCE - Wait and Cooperative Exit For details, see "Proper handling of SIGINT/SIGQUIT" [4]. [Note 2] Reverting 392885ee82d3 ("kbuild: let fixdep directly write to .*.cmd files") would directly address [1], but it only saves if_changed_dep. As reported in [2], all commands that use redirection can potentially leave an empty (i.e. broken) target. [Note 3] Another (even safer) approach might be to always write to a temporary file, and rename it to $@ at the end of the recipe. <command> > $(tmp-target) mv $(tmp-target) $@ It would require a lot of Makefile changes, and result in ugly code, so I did not take it. [Note 4] A little more thoughts about a pattern rule with multiple targets (or a grouped target). %.x %.y: %.z <recipe> When interrupted, GNU Make deletes both %.x and %.y, while this solution only deletes $@. Probably, this is not a big deal. The next run of make will execute the rule again to create $@ along with the other files. [1]: https://lore.kernel.org/all/YLeot94yAaM4xbMY@gmail.com/ [2]: https://lore.kernel.org/all/20220510221333.2770571-1-robh@kernel.org/ [3]: https://lists.gnu.org/archive/html/help-make/2021-06/msg00001.html [4]: https://www.cons.org/cracauer/sigint.html Fixes: 392885ee82d3 ("kbuild: let fixdep directly write to .*.cmd files") Reported-by: Ingo Molnar <mingo@kernel.org> Reported-by: Rob Herring <robh@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-09-25Linux 6.0-rc7Linus Torvalds1-1/+1
2022-09-24devdax: Fix soft-reservation memory descriptionDan Williams1-0/+1
The "hmem" platform-devices that are created to represent the platform-advertised "Soft Reserved" memory ranges end up inserting a resource that causes the iomem_resource tree to look like this: 340000000-43fffffff : hmem.0 340000000-43fffffff : Soft Reserved 340000000-43fffffff : dax0.0 This is because insert_resource() reparents ranges when they completely intersect an existing range. This matters because code that uses region_intersects() to scan for a given IORES_DESC will only check that top-level 'hmem.0' resource and not the 'Soft Reserved' descendant. So, to support EINJ (via einj_error_inject()) to inject errors into memory hosted by a dax-device, be sure to describe the memory as IORES_DESC_SOFT_RESERVED. This is a follow-on to: commit b13a3e5fd40b ("ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP") ...that fixed EINJ support for "Soft Reserved" ranges in the first instance. Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration") Reported-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com> Tested-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com> Cc: <stable@vger.kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Omar Avelar <omar.avelar@intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mark Gross <markgross@kernel.org> Link: https://lore.kernel.org/r/166397075670.389916.7435722208896316387.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-09-24Makefile.debug: re-enable debug info for .S filesNick Desaulniers2-11/+14
Alexey reported that the fraction of unknown filename instances in kallsyms grew from ~0.3% to ~10% recently; Bill and Greg tracked it down to assembler defined symbols, which regressed as a result of: commit b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1") In that commit, I allude to restoring debug info for assembler defined symbols in a follow up patch, but it seems I forgot to do so in commit a66049e2cf0e ("Kbuild: make DWARF version a choice") Link: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=31bf18645d98b4d3d7357353be840e320649a67d Fixes: b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1") Reported-by: Alexey Alexandrov <aalexand@google.com> Reported-by: Bill Wendling <morbo@google.com> Reported-by: Greg Thelen <gthelen@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-24Makefile.debug: set -g unconditional on CONFIG_DEBUG_INFO_SPLITNick Desaulniers1-3/+1
Dmitrii, Fangrui, and Mashahiro note: Before GCC 11 and Clang 12 -gsplit-dwarf implicitly uses -g2. Fix CONFIG_DEBUG_INFO_SPLIT for gcc-11+ & clang-12+ which now need -g specified in order for -gsplit-dwarf to work at all. -gsplit-dwarf has been mutually exclusive with -g since support for CONFIG_DEBUG_INFO_SPLIT was introduced in commit 866ced950bcd ("kbuild: Support split debug info v4") I don't think it ever needed to be. Link: https://lore.kernel.org/lkml/20220815013317.26121-1-dmitrii.bundin.a@gmail.com/ Link: https://lore.kernel.org/lkml/CAK7LNARPAmsJD5XKAw7m_X2g7Fi-CAAsWDQiP7+ANBjkg7R7ng@mail.gmail.com/ Link: https://reviews.llvm.org/D80391 Cc: Andi Kleen <ak@linux.intel.com> Reported-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com> Reported-by: Fangrui Song <maskray@google.com> Reported-by: Masahiro Yamada <masahiroy@kernel.org> Suggested-by: Dmitrii Bundin <dmitrii.bundin.a@gmail.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-23io_uring: ensure that cached task references are always put on exitJens Axboe1-0/+3
io_uring caches task references to avoid doing atomics for each of them per request. If a request is put from the same task that allocated it, then we can maintain a per-ctx cache of them. This obviously relies on io_uring always pruning caches in a reliable way, and there's currently a case off io_uring fd release where we can miss that. One example is a ring setup with IOPOLL, which relies on the task polling for completions, which will free them. However, if such a task submits a request and then exits or closes the ring without reaping the completion, then ring release will reap and put. If release happens from that very same task, the completed request task refs will get put back into the cache pool. This is problematic, as we're now beyond the point of pruning caches. Manually drop these caches after doing an IOPOLL reap. This releases references from the current task, which is enough. If another task happens to be doing the release, then the caching will not be triggered and there's no issue. Cc: stable@vger.kernel.org Fixes: e98e49b2bbf7 ("io_uring: extend task put optimisations") Reported-by: Homin Rhee <hominlab@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-24certs: make system keyring depend on built-in x509 parserMasahiro Yamada1-1/+1
Commit e90886291c7c ("certs: make system keyring depend on x509 parser") is not the right fix because x509_load_certificate_list() can be modular. The combination of CONFIG_SYSTEM_TRUSTED_KEYRING=y and CONFIG_X509_CERTIFICATE_PARSER=m still results in the following error: LD .tmp_vmlinux.kallsyms1 ld: certs/system_keyring.o: in function `load_system_certificate_list': system_keyring.c:(.init.text+0x8c): undefined reference to `x509_load_certificate_list' make: *** [Makefile:1169: vmlinux] Error 1 Fixes: e90886291c7c ("certs: make system keyring depend on x509 parser") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Adam Borowski <kilobyte@angband.pl>
2022-09-24Kconfig: remove unused function 'menu_get_root_menu'Zeng Heng2-6/+0
There is nowhere calling `menu_get_root_menu` function, so remove it. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-24scripts/clang-tools: remove unused moduleyangxingwu1-1/+0
Remove unused imported 'os' module. Signed-off-by: yangxingwu <xingwu.yang@gmail.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-23cgroup: cgroup_get_from_id() must check the looked-up kn is a directoryMing Lei1-1/+4
cgroup has to be one kernfs dir, otherwise kernel panic is caused, especially cgroup id is provide from userspace. Reported-by: Marco Patalano <mpatalan@redhat.com> Fixes: 6b658c4863c1 ("scsi: cgroup: Add cgroup_get_from_id()") Cc: Muneendra <muneendra.kumar@broadcom.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Acked-by: Mukesh Ojha <quic_mojha@quicinc.com> Cc: stable@vger.kernel.org # v5.14+ Signed-off-by: Tejun Heo <tj@kernel.org>
2022-09-23vmlinux.lds.h: CFI: Reduce alignment of jump-table to function alignmentWill Deacon1-2/+1
Due to undocumented, hysterical raisins on x86, the CFI jump-table sections in .text are needlessly aligned to PMD_SIZE in the vmlinux linker script. When compiling a CFI-enabled arm64 kernel with a 64KiB page-size, a PMD maps 512MiB of virtual memory and so the .text section increases to a whopping 940MiB and blows the final Image up to 960MiB. Others report a link failure. Since the CFI jump-table requires only instruction alignment, reduce the alignment directives to function alignment for parity with other parts of the .text section. This reduces the size of the .text section for the aforementioned 64KiB page size arm64 kernel to 19MiB for a much more reasonable total Image size of 39MiB. Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Mohan Rao .vanimina" <mailtoc.mohanrao@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/all/CAL_GTzigiNOMYkOPX1KDnagPhJtFNqSK=1USNbS0wUL4PW6-Uw@mail.gmail.com/ Fixes: cf68fffb66d6 ("add support for Clang CFI") Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220922215715.13345-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-09-23MAINTAINERS: switch graphics to airlied other addressesDave Airlie1-3/+4
My linux.ie address is in a bad place. also add dri-devel for agpgart. Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-09-22KVM: x86: Inject #UD on emulated XSETBV if XSAVES isn't enabledSean Christopherson2-0/+4
Inject #UD when emulating XSETBV if CR4.OSXSAVE is not set. This also covers the "XSAVE not supported" check, as setting CR4.OSXSAVE=1 #GPs if XSAVE is not supported (and userspace gets to keep the pieces if it forces incoherent vCPU state). Add a comment to kvm_emulate_xsetbv() to call out that the CPU checks CR4.OSXSAVE before checking for intercepts. AMD'S APM implies that #UD has priority (says that intercepts are checked before #GP exceptions), while Intel's SDM says nothing about interception priority. However, testing on hardware shows that both AMD and Intel CPUs prioritize the #UD over interception. Fixes: 02d4160fbd76 ("x86: KVM: add xsetbv to the emulator") Cc: stable@vger.kernel.org Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220824033057.3576315-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-22KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURESDr. David Alan Gilbert1-1/+7
Allow FP and SSE state to be saved and restored via KVM_{G,SET}_XSAVE on XSAVE-capable hosts even if their bits are not exposed to the guest via XCR0. Failing to allow FP+SSE first showed up as a QEMU live migration failure, where migrating a VM from a pre-XSAVE host, e.g. Nehalem, to an XSAVE host failed due to KVM rejecting KVM_SET_XSAVE. However, the bug also causes problems even when migrating between XSAVE-capable hosts as KVM_GET_SAVE won't set any bits in user_xfeatures if XSAVE isn't exposed to the guest, i.e. KVM will fail to actually migrate FP+SSE. Because KVM_{G,S}ET_XSAVE are designed to allowing migrating between hosts with and without XSAVE, KVM_GET_XSAVE on a non-XSAVE (by way of fpu_copy_guest_fpstate_to_uabi()) always sets the FP+SSE bits in the header so that KVM_SET_XSAVE will work even if the new host supports XSAVE. Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") bz: https://bugzilla.redhat.com/show_bug.cgi?id=2079311 Cc: stable@vger.kernel.org Cc: Leonardo Bras <leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> [sean: add comment, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220824033057.3576315-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-22KVM: x86: Reinstate kvm_vcpu_arch.guest_supported_xcr0Sean Christopherson3-10/+5
Reinstate the per-vCPU guest_supported_xcr0 by partially reverting commit 988896bb6182; the implicit assessment that guest_supported_xcr0 is always the same as guest_fpu.fpstate->user_xfeatures was incorrect. kvm_vcpu_after_set_cpuid() isn't the only place that sets user_xfeatures, as user_xfeatures is set to fpu_user_cfg.default_features when guest_fpu is allocated via fpu_alloc_guest_fpstate() => __fpstate_reset(). guest_supported_xcr0 on the other hand is zero-allocated. If userspace never invokes KVM_SET_CPUID2, supported XCR0 will be '0', whereas the allowed user XFEATURES will be non-zero. Practically speaking, the edge case likely doesn't matter as no sane userspace will live migrate a VM without ever doing KVM_SET_CPUID2. The primary motivation is to prepare for KVM intentionally and explicitly setting bits in user_xfeatures that are not set in guest_supported_xcr0. Because KVM_{G,S}ET_XSAVE can be used to svae/restore FP+SSE state even if the host doesn't support XSAVE, KVM needs to set the FP+SSE bits in user_xfeatures even if they're not allowed in XCR0, e.g. because XCR0 isn't exposed to the guest. At that point, the simplest fix is to track the two things separately (allowed save/restore vs. allowed XCR0). Fixes: 988896bb6182 ("x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0") Cc: stable@vger.kernel.org Cc: Leonardo Bras <leobras@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220824033057.3576315-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-22KVM: x86/mmu: add missing update to max_mmu_rmap_sizeMiaohe Lin1-0/+2
The update to statistic max_mmu_rmap_size is unintentionally removed by commit 4293ddb788c1 ("KVM: x86/mmu: Remove redundant spte present check in mmu_set_spte"). Add missing update to it or max_mmu_rmap_size will always be nonsensical 0. Fixes: 4293ddb788c1 ("KVM: x86/mmu: Remove redundant spte present check in mmu_set_spte") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Message-Id: <20220907080657.42898-1-linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-22selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.cJinrong Liang1-1/+1
The following warning appears when executing: make -C tools/testing/selftests/kvm rseq_test.c: In function ‘main’: rseq_test.c:237:33: warning: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Wimplicit-function-declaration] (void *)(unsigned long)gettid()); ^~~~~~ getgid /usr/bin/ld: /tmp/ccr5mMko.o: in function `main': ../kvm/tools/testing/selftests/kvm/rseq_test.c:237: undefined reference to `gettid' collect2: error: ld returned 1 exit status make: *** [../lib.mk:173: ../kvm/tools/testing/selftests/kvm/rseq_test] Error 1 Use the more compatible syscall(SYS_gettid) instead of gettid() to fix it. More subsequent reuse may cause it to be wrapped in a lib file. Signed-off-by: Jinrong Liang <cloudliang@tencent.com> Message-Id: <20220802071240.84626-1-cloudliang@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-22mm: slub: fix flush_cpu_slab()/__free_slab() invocations in task context.Maurizio Lombardi1-1/+8
Commit 5a836bf6b09f ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context") moved all flush_cpu_slab() invocations to the global workqueue to avoid a problem related with deactivate_slab()/__free_slab() being called from an IRQ context on PREEMPT_RT kernels. When the flush_all_cpu_locked() function is called from a task context it may happen that a workqueue with WQ_MEM_RECLAIM bit set ends up flushing the global workqueue, this will cause a dependency issue. workqueue: WQ_MEM_RECLAIM nvme-delete-wq:nvme_delete_ctrl_work [nvme_core] is flushing !WQ_MEM_RECLAIM events:flush_cpu_slab WARNING: CPU: 37 PID: 410 at kernel/workqueue.c:2637 check_flush_dependency+0x10a/0x120 Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core] RIP: 0010:check_flush_dependency+0x10a/0x120[ 453.262125] Call Trace: __flush_work.isra.0+0xbf/0x220 ? __queue_work+0x1dc/0x420 flush_all_cpus_locked+0xfb/0x120 __kmem_cache_shutdown+0x2b/0x320 kmem_cache_destroy+0x49/0x100 bioset_exit+0x143/0x190 blk_release_queue+0xb9/0x100 kobject_cleanup+0x37/0x130 nvme_fc_ctrl_free+0xc6/0x150 [nvme_fc] nvme_free_ctrl+0x1ac/0x2b0 [nvme_core] Fix this bug by creating a workqueue for the flush operation with the WQ_MEM_RECLAIM bit set. Fixes: 5a836bf6b09f ("mm: slub: move flush_cpu_slab() invocations __free_slab() invocations out of IRQ context") Cc: <stable@vger.kernel.org> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-09-22ext4: limit the number of retries after discarding preallocations blocksTheodore Ts'o1-1/+3
This patch avoids threads live-locking for hours when a large number threads are competing over the last few free extents as they blocks getting added and removed from preallocation pools. From our bug reporter: A reliable way for triggering this has multiple writers continuously write() to files when the filesystem is full, while small amounts of space are freed (e.g. by truncating a large file -1MiB at a time). In the local filesystem, this can be done by simply not checking the return code of write (0) and/or the error (ENOSPACE) that is set. Over NFS with an async mount, even clients with proper error checking will behave this way since the linux NFS client implementation will not propagate the server errors [the write syscalls immediately return success] until the file handle is closed. This leads to a situation where NFS clients send a continuous stream of WRITE rpcs which result in ERRNOSPACE -- but since the client isn't seeing this, the stream of writes continues at maximum network speed. When some space does appear, multiple writers will all attempt to claim it for their current write. For NFS, we may see dozens to hundreds of threads that do this. The real-world scenario of this is database backup tooling (in particular, github.com/mdkent/percona-xtrabackup) which may write large files (>1TiB) to NFS for safe keeping. Some temporary files are written, rewound, and read back -- all before closing the file handle (the temp file is actually unlinked, to trigger automatic deletion on close/crash.) An application like this operating on an async NFS mount will not see an error code until TiB have been written/read. The lockup was observed when running this database backup on large filesystems (64 TiB in this case) with a high number of block groups and no free space. Fragmentation is generally not a factor in this filesystem (~thousands of large files, mostly contiguous except for the parts written while the filesystem is at capacity.) Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2022-09-22ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0Luís Henriques1-0/+4
When walking through an inode extents, the ext4_ext_binsearch_idx() function assumes that the extent header has been previously validated. However, there are no checks that verify that the number of entries (eh->eh_entries) is non-zero when depth is > 0. And this will lead to problems because the EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this: [ 135.245946] ------------[ cut here ]------------ [ 135.247579] kernel BUG at fs/ext4/extents.c:2258! [ 135.249045] invalid opcode: 0000 [#1] PREEMPT SMP [ 135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4 [ 135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014 [ 135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0 [ 135.256475] Code: [ 135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246 [ 135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023 [ 135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c [ 135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c [ 135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024 [ 135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000 [ 135.272394] FS: 00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 135.274510] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0 [ 135.277952] Call Trace: [ 135.278635] <TASK> [ 135.279247] ? preempt_count_add+0x6d/0xa0 [ 135.280358] ? percpu_counter_add_batch+0x55/0xb0 [ 135.281612] ? _raw_read_unlock+0x18/0x30 [ 135.282704] ext4_map_blocks+0x294/0x5a0 [ 135.283745] ? xa_load+0x6f/0xa0 [ 135.284562] ext4_mpage_readpages+0x3d6/0x770 [ 135.285646] read_pages+0x67/0x1d0 [ 135.286492] ? folio_add_lru+0x51/0x80 [ 135.287441] page_cache_ra_unbounded+0x124/0x170 [ 135.288510] filemap_get_pages+0x23d/0x5a0 [ 135.289457] ? path_openat+0xa72/0xdd0 [ 135.290332] filemap_read+0xbf/0x300 [ 135.291158] ? _raw_spin_lock_irqsave+0x17/0x40 [ 135.292192] new_sync_read+0x103/0x170 [ 135.293014] vfs_read+0x15d/0x180 [ 135.293745] ksys_read+0xa1/0xe0 [ 135.294461] do_syscall_64+0x3c/0x80 [ 135.295284] entry_SYSCALL_64_after_hwframe+0x46/0xb0 This patch simply adds an extra check in __ext4_ext_check(), verifying that eh_entries is not 0 when eh_depth is > 0. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283 Cc: Baokun Li <libaokun1@huawei.com> Cc: stable@kernel.org Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-22serial: sifive: enable clocks for UART when probedOlof Johansson1-1/+1
When the PWM driver was changed to disable clocks if no PWMs are enabled, it ended up also disabling the shared parent with the UART, since the UART doesn't do any clock enablement on its own. To avoid these surprises, switch to clk_get_enabled(). Fixes: ace41d7564e655 ("pwm: sifive: Ensure the clk is enabled exactly once per running PWM") Cc: stable <stable@kernel.org> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Emil Renner Berthing <emil.renner.berthing@canonical.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Olof Johansson <olof@lixom.net> Link: https://lore.kernel.org/r/20220920160017.7315-1-olof@lixom.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-22serial: 8250: omap: Use serial8250_em485_supportedMatthias Schiffer1-0/+1
8250_omap uses em485, fill in rs485_supported accordingly. This makes RS485 work with 8250_omap again, which was broken with the introduction of the RS485 config sanitization. Fixes: be2e2cb1d2819 ("serial: Sanitize rs485_struct") Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Link: https://lore.kernel.org/r/20220916110955.161099-1-matthias.schiffer@ew.tq-group.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>