linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-11-01	drm/imx: Kconfig: Remove duplicated 'select DRM_KMS_HELPER' line	Liu Ying	1	-1/+0
	A duplicated line 'select DRM_KMS_HELPER' was introduced in Kconfig file by commit 09717af7d13d ("drm: Remove CONFIG_DRM_KMS_CMA_HELPER option"), so remove it. Fixes: 09717af7d13d ("drm: Remove CONFIG_DRM_KMS_CMA_HELPER option") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221009023527.3669647-1-victor.liu@nxp.com
2022-10-31	drm/format-helper: Only advertise supported formats for conversion	Hector Martin	1	-19/+47
	drm_fb_build_fourcc_list() currently returns all emulated formats unconditionally as long as the native format is among them, even though not all combinations have conversion helpers. Although the list is arguably provided to userspace in precedence order, userspace can pick something out-of-order (and thus break when it shouldn't), or simply only support a format that is unsupported (and thus think it can work, which results in the appearance of a hang as FB blits fail later on, instead of the initialization error you'd expect in this case). Add checks to filter the list of emulated formats to only those supported for conversion to the native format. This presumes that there is a single native format (only the first is checked, if there are multiple). Refactoring this API to drop the native list or support it properly (by returning the appropriate emulated->native mapping table) is left for a future patch. The simpledrm driver is left as-is with a full table of emulated formats. This keeps all currently working conversions available and drops all the broken ones (i.e. this is a strict bugfix patch, adding no new supported formats nor removing any actually working ones). In order to avoid proliferation of emulated formats, future drivers should advertise only XRGB8888 as the sole emulated format (since some userspace assumes its presence). This fixes a real user regression where the ?RGB2101010 support commit started advertising it unconditionally where not supported, and KWin decided to start to use it over the native format and broke, but also fixes the spurious RGB565/RGB888 formats which have been wrongly unconditionally advertised since the dawn of simpledrm.
Fixes: 6ea966fca084 ("drm/simpledrm: Add [AX]RGB2101010 formats") Fixes: 11e8f5fd223b ("drm: Add simpledrm driver") Cc: stable@vger.kernel.org Signed-off-by: Hector Martin <marcan@marcan.st> Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221027135711.24425-1-marcan@marcan.st
2022-10-29	drm/rockchip: vop2: disable planes when disabling the crtc	Michael Tretter	1	-0/+4
	The vop2 driver needs to explicitly disable the planes if the crtc is disabled. Unless the planes are explicitly disabled, the address of the last framebuffer is kept in the registers of the VOP2. When re-enabling the encoder after it has been disabled by the driver, the VOP2 will start and read the framebuffer that has been freed but is still pointed to by the register. The iommu will catch these read accesses and print errors. Explicitly disable the planes when the crtc is disabled to reset the registers. Signed-off-by: Michael Tretter <m.tretter@pengutronix.de> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221028095206.2136601-3-m.tretter@pengutronix.de
2022-10-29	drm/rockchip: vop2: fix null pointer in plane_atomic_disable	Michael Tretter	1	-2/+4
	If the vop2_plane_atomic_disable function is called with NULL as a state, accessing the old_pstate runs into a null pointer exception. However, the drm_atomic_helper_disable_planes_on_crtc function calls the atomic_disable callback with state NULL. Allow to disable a plane without passing a plane state by checking the old_pstate only if a state is passed. Signed-off-by: Michael Tretter <m.tretter@pengutronix.de> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221028095206.2136601-2-m.tretter@pengutronix.de
2022-10-29	drm/rockchip: dsi: Fix VOP selection on SoCs that support it	Ondrej Jirman	1	-3/+1
	lcdsel_grf_reg is defined as u32, so "< 0" comparison is always false, which breaks VOP selection on e.g. RK3399. Compare against 0. Fixes: f3aaa6125b6f ("drm/rockchip: dsi: add rk3568 support") Signed-off-by: Ondrej Jirman <megi@xff.cz> Tested-by: Chris Morgan <macromorgan@hotmail.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221023160747.607943-1-megi@xff.cz
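	The pitfall described above can be sketched in a few lines of userspace C. This is only an illustration, not the driver code: struct chip_data and has_vop_select() are hypothetical names, and treating 0 as "no VOP select register" is an assumption drawn from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-SoC data; only the field named in the
 * commit message is modelled. */
struct chip_data {
	uint32_t lcdsel_grf_reg; /* assumed: 0 means "no VOP select register" */
};

/*
 * The broken test had the shape "reg < 0". Because the field is unsigned,
 * that expression can never be true, so the "no register" case was never
 * detected. Comparing against 0 works for an unsigned field.
 */
static bool has_vop_select(const struct chip_data *cdata)
{
	return cdata->lcdsel_grf_reg != 0;
}
```

	Compilers can usually flag the always-false form (GCC reports it under -Wtype-limits), which is one way to catch this class of bug early.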
2022-10-29	drm/rockchip: fix fbdev on non-IOMMU devices	John Keeping	1	-1/+4
	When switching to the generic fbdev infrastructure, it was missed that framebuffers were created with the alloc_kmap parameter to rockchip_gem_create_object() set to true. The generic infrastructure calls this via the .dumb_create() driver operation and thus creates a buffer without an associated kmap. alloc_kmap only makes a difference on devices without an IOMMU, but when it is missing rockchip_gem_prime_vmap() fails and the framebuffer cannot be used. Detect the case where a buffer is being allocated for the framebuffer and ensure a kernel mapping is created in this case. Fixes: 24af7c34b290 ("drm/rockchip: use generic fbdev setup") Reported-by: Johan Jonker <jbx6244@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: John Keeping <john@metanate.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221020181248.2497065-1-john@metanate.com
2022-10-29	drm/rockchip: dsi: Force synchronous probe	Brian Norris	1	-0/+6
	We can't safely probe a dual-DSI display asynchronously (driver_async_probe='*' or driver_async_probe='dw-mipi-dsi-rockchip' cmdline), because dw_mipi_dsi_rockchip_find_second() pokes one DSI device's drvdata from the other device without any locking. Request synchronous probe, at least until this driver learns some appropriate locking for dual-DSI initialization. Cc: <stable@vger.kernel.org> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221019170255.2.I6b985b0ca372b7e35c6d9ea970b24bcb262d4fc1@changeid
2022-10-29	drm/rockchip: dsi: Clean up 'usage_mode' when failing to attach	Brian Norris	1	-4/+12
	If we fail to attach the first time (especially: EPROBE_DEFER), we fail to clean up 'usage_mode', and thus will fail to attach on any subsequent attempts, with "dsi controller already in use". Re-set to DW_DSI_USAGE_IDLE on attach failure. This is especially common to hit when enabling asynchronous probe on a dual-DSI system (such as RK3399 Gru/Scarlet), such that we're more likely to fail dw_mipi_dsi_rockchip_find_second() the first time. Fixes: 71f68fe7f121 ("drm/rockchip: dsi: add ability to work as a phy instead of full dsi") Cc: <stable@vger.kernel.org> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20221019170255.1.Ia68dfb27b835d31d22bfe23812baf366ee1c6eac@changeid
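	A minimal sketch of the state-reset pattern this message describes, written as plain userspace C. All names here (struct dsi_dev, dsi_attach(), the setup callbacks, EPROBE_DEFER_SKETCH) are hypothetical stand-ins rather than the driver's real identifiers; the value 517 mirrors the kernel-internal EPROBE_DEFER, which is not exposed in userspace headers.

```c
#include <errno.h>

#define EPROBE_DEFER_SKETCH 517 /* stand-in for the kernel-internal EPROBE_DEFER */

enum usage_mode { USAGE_IDLE, USAGE_DSI, USAGE_PHY };

struct dsi_dev {
	enum usage_mode usage_mode;
};

/* Hypothetical setup steps: one that defers, one that succeeds. */
static int setup_defer(void) { return -EPROBE_DEFER_SKETCH; }
static int setup_ok(void) { return 0; }

static int dsi_attach(struct dsi_dev *dsi, int (*setup)(void))
{
	int ret;

	if (dsi->usage_mode != USAGE_IDLE)
		return -EBUSY; /* "dsi controller already in use" */

	dsi->usage_mode = USAGE_DSI;

	ret = setup();
	if (ret) {
		/* Without this reset, every later attach attempt hits -EBUSY. */
		dsi->usage_mode = USAGE_IDLE;
		return ret;
	}
	return 0;
}
```

	With the reset in place, a deferred first attempt leaves the device idle, so a retry can succeed.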
2022-10-29	drm/rockchip: dw_hdmi: filter regulator -EPROBE_DEFER error messages	Aurelien Jarno	1	-1/+2
	When the avdd-0v9 or avdd-1v8 supply is not yet available, EPROBE_DEFER is returned by rockchip_hdmi_parse_dt(). This causes the following error message to be printed multiple times: dwhdmi-rockchip fe0a0000.hdmi: [drm:dw_hdmi_rockchip_bind [rockchipdrm]] ERROR Unable to parse OF data Fix that by not printing the message when rockchip_hdmi_parse_dt() returns -EPROBE_DEFER. Fixes: ca80c4eb4b01 ("drm/rockchip: dw_hdmi: add regulator support") Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20220926203752.5430-1-aurelien@aurel32.net
2022-10-27	fbdev/core: Avoid uninitialized read in aperture_remove_conflicting_pci_device()	Michał Mirosław	1	-4/+1
	Return on error directly from the BAR-iterating loop instead of break+return. This is actually a cosmetic fix, since it would be highly unusual to have this called for a PCI device without any memory BARs. Fixes: 9d69ef183815 ("fbdev/core: Remove remove_conflicting_pci_framebuffers()") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/e75323732bedc46d613d72ecb40f97e3bc75eea8.1666829073.git.mirq-linux@rere.qmqm.pl
2022-10-25	drm/scheduler: fix fence ref counting	Christian König	1	-1/+5
	We leaked dependency fences when processes were being killed. In addition, grab a reference to the last scheduled fence. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220929180151.139751-1-christian.koenig@amd.com
2022-10-23	Linux 6.1-rc2	Linus Torvalds	1	-1/+1

2022-10-23	Revert "mfd: syscon: Remove repetition of the regmap_get_val_endian()"	Jason A. Donenfeld	1	-0/+8
	This reverts commit 72a95859728a7866522e6633818bebc1c2519b17. It broke reboots on big-endian MIPS and MIPS64 malta QEMU instances, which use the syscon driver. Little-endian is not affected, which likely means it's important to handle regmap_get_val_endian() in this function after all. Fixes: 72a95859728a ("mfd: syscon: Remove repetition of the regmap_get_val_endian()") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Lee Jones <lee@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-23	kernel/utsname_sysctl.c: Fix hostname polling	Linus Torvalds	2	-0/+2
	Commit bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") added a new entry to the uts_kern_table[] array, but didn't update the UTS_PROC_xyz enumerators of older entries, breaking anything that used them. Which is admittedly not many cases: it's really just the two uses of uts_proc_notify() in kernel/sys.c. But apparently journald-systemd actually uses this to detect hostname changes. Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Fixes: bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") Link: https://lore.kernel.org/lkml/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Link: https://linux-regtracking.leemhuis.info/regzbot/regression/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Cc: Petr Vorel <pvorel@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
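	The failure mode described above, an index enum drifting out of sync with its table, can be illustrated as follows. This is a simplified sketch, not the kernel source: the table contents, the enum names, and the assumption that the new "arch" entry sits at the front of the table are all illustrative.

```c
#include <string.h>

/* Hypothetical table after the new "arch" entry was added at the front. */
static const char *const uts_table[] = {
	"arch", "ostype", "osrelease", "version", "hostname", "domainname",
};

/* Stale indices: written before "arch" existed, so every value is off by one. */
enum uts_index_stale {
	STALE_OSTYPE, STALE_OSRELEASE, STALE_VERSION,
	STALE_HOSTNAME, STALE_DOMAINNAME,
};

/* Updated indices: one enumerator per table entry, in table order. */
enum uts_index {
	IDX_ARCH, IDX_OSTYPE, IDX_OSRELEASE, IDX_VERSION,
	IDX_HOSTNAME, IDX_DOMAINNAME,
};

static const char *uts_entry_name(int idx)
{
	return uts_table[idx];
}
```

	With the stale enum, a notification meant for the hostname entry lands on the neighbouring entry instead, which matches the symptom that hostname polling stopped working.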
2022-10-22	io_uring/net: fail zc sendmsg when unsupported by socket	Pavel Begunkov	1	-0/+2
	The previous patch fails zerocopy send requests for protocols that don't support it; do the same for zerocopy sendmsg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22	io_uring/net: fail zc send when unsupported by socket	Pavel Begunkov	1	-0/+2
	If a protocol doesn't support zerocopy it will silently fall back to copying. This type of behaviour has always been a source of troubles so it's better to fail such requests instead. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22	net: flag sockets supporting msghdr originated zerocopy	Pavel Begunkov	3	-0/+3
	We need an efficient way in io_uring to check whether a socket supports zerocopy with msghdr provided ubuf_info. Add a new flag into the struct socket flags fields. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/3dafafab822b1c66308bb58a0ac738b1e3f53f74.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22	hwmon: (corsair-psu) Add USB id of the new HX1500i psu	Wilken Gottwalt	2	-0/+3
	Also update the documentation accordingly. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/Y0FghqQCHG/cX5Jz@monster.localdomain Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-10-22	tools: include: sync include/api/linux/kvm.h	Paolo Bonzini	1	-0/+1
	Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Fixes: 17601bfed909 ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option") Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22	KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER	Alexander Graf	1	-0/+56
	The KVM_X86_SET_MSR_FILTER ioctl contains a pointer in the passed-in struct, which means it has a different struct size depending on whether it gets called from 32bit or 64bit code. This patch introduces compat code that converts from the 32bit struct to its 64bit counterpart which then gets used going forward internally. With this applied, 32bit QEMU can successfully set MSR bitmaps when running on 64bit kernels. Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com> Fixes: 1a155254ff937 ("KVM: x86: Introduce MSR filtering") Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-4-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
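	To make the size mismatch concrete, here is a hedged userspace sketch of a 32-bit layout being widened to a native one. The struct and function names (msr_range, msr_range_compat, msr_range_from_compat) are hypothetical and the field set is simplified; the real KVM structures and the kernel's compat_ptr() helper are not reproduced here.

```c
#include <stdint.h>

/* Native layout: the bitmap pointer is as wide as a native pointer. */
struct msr_range {
	uint32_t flags;
	uint32_t nmsrs;
	uint32_t base;
	uint8_t *bitmap;
};

/* Layout as seen by a 32-bit caller: the pointer is a 32-bit value. */
struct msr_range_compat {
	uint32_t flags;
	uint32_t nmsrs;
	uint32_t base;
	uint32_t bitmap; /* 32-bit user address */
};

/* Widen the 32-bit layout into the native one, field by field. */
static void msr_range_from_compat(struct msr_range *dst,
				  const struct msr_range_compat *src)
{
	dst->flags = src->flags;
	dst->nmsrs = src->nmsrs;
	dst->base = src->base;
	dst->bitmap = (uint8_t *)(uintptr_t)src->bitmap;
}
```

	On a 64-bit build the two structs differ in size, so a handler that copies the native size from a 32-bit caller would read the wrong bytes; converting first avoids that.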
2022-10-22	KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()	Alexander Graf	1	-14/+17
	In the next patch we want to introduce a second caller to set_msr_filter() which constructs its own filter list on the stack. Refactor the original function so it takes it as argument instead of reading it through copy_from_user(). Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-3-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22	kvm: Add support for arch compat vm ioctls	Alexander Graf	2	-0/+13
	We will introduce the first architecture specific compat vm ioctl in the next patch. Add all necessary boilerplate to allow architectures to override compat vm ioctls when necessary. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-2-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-21	x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctly	Chang S. Bae	1	-0/+9
	When an extended state component is not present in fpstate, but in init state, the function copies from init_fpstate via copy_feature(). But, dynamic states are not present in init_fpstate because of all-zeros init states. Then retrieving them from init_fpstate will explode like this: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:memcpy_erms+0x6/0x10 ? __copy_xstate_to_uabi_buf+0x381/0x870 fpu_copy_guest_fpstate_to_uabi+0x28/0x80 kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm] ? __this_cpu_preempt_check+0x13/0x20 ? vmx_vcpu_put+0x2e/0x260 [kvm_intel] kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? __fget_light+0xd4/0x130 __x64_sys_ioctl+0xe3/0x910 ? debug_smp_processor_id+0x17/0x20 ? fpregs_assert_state_consistent+0x27/0x50 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Adjust the 'mask' to zero out the userspace buffer for the features that are not available both from fpstate and from init_fpstate. The dynamic features depend on the compacted XSAVE format. Ensure it is enabled before reading XCOMP_BV in init_fpstate. Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: Yuan Yao <yuan.yao@intel.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/ Link: https://lkml.kernel.org/r/20221021185844.13472-1-chang.seok.bae@intel.com
2022-10-21	drm/bridge: ps8640: Add back the 50 ms mystery delay after HPD	Douglas Anderson	1	-2/+23
	Back in commit 826cff3f7ebb ("drm/bridge: parade-ps8640: Enable runtime power management") we removed a mysterious 50 ms delay because "Parade's support [couldn't] explain what the delay [was] for". While I'm always a fan of removing mysterious delays, I suspect that we need this mysterious delay to avoid some problems. Specifically, what I found recently is that on sc7180-trogdor-homestar sometimes the AUX backlight wasn't initializing properly. Some debugging showed that the drm_dp_dpcd_read() function that the AUX backlight driver was calling was returning bogus data about 1% of the time when I booted up. This confused drm_panel_dp_aux_backlight(). From continued debugging: - If I retried the read then the read worked just fine. - If I added a loop to perform the same read that drm_panel_dp_aux_backlight() was doing 30 times at bootup I could see that some percentage of the time the first read would give bogus data but all 29 additional reads would always be fine. - If I added a large delay _after_ powering on the panel but before powering on PS8640 I could still reproduce the problem. - If I added a delay after PS8640 powered on then I couldn't reproduce the problem. - I couldn't reproduce the problem on a board with the same panel but the ti-sn65dsi86 bridge chip. To me, the above indicated that there was a problem with PS8640 and not the panel. I don't really have any insight into what's going on in the MCU, but my best guess is that when the MCU itself sees the HPD go high that it does some AUX transfers itself and this is confusing things. Let's go back and add back in the mysterious 50 ms delay. We only want to do this the first time we see HPD go high after booting the MCU, not every time we double-check HPD. With this, the backlight initializes reliably on homestar. 
Fixes: 826cff3f7ebb ("drm/bridge: parade-ps8640: Enable runtime power management") Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20221017121813.1.I59700c745fbc31559a5d5c8e2a960279c751dbd5@changeid
2022-10-21	x86/unwind/orc: Fix unreliable stack dump with gcov	Chen Zhongjin	1	-1/+1
	When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. 
That original issue was later fixed in a different way, with the following commit: f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-10-21	efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0	Ard Biesheuvel	3	-6/+6
	The generic EFI stub can be instructed to avoid SetVirtualAddressMap(), and simply run with the firmware's 1:1 mapping. In this case, it populates the virtual address fields of the runtime regions in the memory map with the physical address of each region, so that the mapping code has to be none the wiser. Only if SetVirtualAddressMap() fails, the virtual addresses are wiped and the kernel code knows that the regions cannot be mapped. However, wiping amounts to setting it to zero, and if a runtime region happens to live at physical address 0, its valid 1:1 mapped virtual address could be mistaken for a wiped field, resulting in loss of access to the EFI services at runtime. So let's only assume that VA == 0 means 'no runtime services' if the region in question does not live at PA 0x0. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
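	The rule stated in the last sentence can be written as a small predicate. This is an illustrative sketch only: struct mem_region and region_is_unmapped() are hypothetical names, not the EFI memory descriptor type or the function touched by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a runtime region in the memory map. */
struct mem_region {
	uint64_t phys_addr;
	uint64_t virt_addr;
};

/*
 * A zero virtual address only signals "wiped, cannot be mapped" when the
 * region does not itself live at physical address 0. A region at PA 0 with
 * a 1:1 mapping legitimately has VA 0.
 */
static bool region_is_unmapped(const struct mem_region *md)
{
	return md->virt_addr == 0 && md->phys_addr != 0;
}
```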
2022-10-21	efi: libstub: Fix incorrect payload size in zboot header	Ard Biesheuvel	1	-1/+2
	The linker script symbol definition that captures the size of the compressed payload inside the zboot decompressor (which is exposed via the image header) refers to '.' for the end of the region, which does not give the correct result as the expression is not placed at the end of the payload. So use the symbol name explicitly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	efi: libstub: Give efi_main() asmlinkage qualification	Ard Biesheuvel	1	-3/+3
	To stop the bots from sending sparse warnings to me and the list about efi_main() not having a prototype, decorate it with asmlinkage so that it is clear that it is called from assembly, and therefore needs to remain external, even if it is never declared in a header file. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	efi: efivars: Fix variable writes without query_variable_store()	Ard Biesheuvel	3	-24/+5
	Commit bbc6d2c6ef22 ("efi: vars: Switch to new wrapper layer") refactored the efivars layer so that the 'business logic' related to which UEFI variables affect the boot flow in which way could be moved out of it, and into the efivarfs driver. This inadvertently broke setting variables on firmware implementations that lack the QueryVariableInfo() boot service, because we no longer tolerate a EFI_UNSUPPORTED result from check_var_size() when calling efivar_entry_set_get_size(), which now ends up calling check_var_size() a second time inadvertently. If QueryVariableInfo() is missing, we support writes of up to 64k - let's move that logic into check_var_size(), and drop the redundant call. Cc: <stable@vger.kernel.org> # v6.0 Fixes: bbc6d2c6ef22 ("efi: vars: Switch to new wrapper layer") Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	efi: ssdt: Don't free memory if ACPI table was loaded successfully	Ard Biesheuvel	1	-0/+2
	Amadeusz reports KASAN use-after-free errors introduced by commit 3881ee0b1edc ("efi: avoid efivars layer when loading SSDTs from variables"). The problem appears to be that the memory that holds the new ACPI table is now freed unconditionally, instead of only when the ACPI core reported a failure to load the table. So let's fix this, by omitting the kfree() on success. Cc: <stable@vger.kernel.org> # v6.0 Link: https://lore.kernel.org/all/a101a10a-4fbb-5fae-2e3c-76cf96ed8fbd@linux.intel.com/ Fixes: 3881ee0b1edc ("efi: avoid efivars layer when loading SSDTs from variables") Reported-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
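	The ownership rule behind this fix (free the buffer only when the consumer rejected it) can be sketched in userspace C. Everything here is hypothetical: load_table(), the load_fn callback type, and the two callbacks stand in for the ACPI table loading path and are not the kernel's API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical consumer: returns 0 if it accepted (and now uses) the buffer. */
typedef int (*load_fn)(void *table);

static int load_ok(void *table) { (void)table; return 0; }
static int load_fail(void *table) { (void)table; return -1; }

/*
 * Allocate a buffer and hand it to the consumer. The buffer is released only
 * on failure; on success the consumer keeps referencing it, so freeing it
 * here would leave a dangling pointer (the use-after-free in the report).
 */
static void *load_table(size_t size, load_fn load)
{
	void *data = malloc(size);

	if (!data)
		return NULL;

	if (load(data) != 0) {
		free(data);
		return NULL;
	}
	return data;
}
```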
2022-10-21	efi: libstub: Remove zboot signing from build options	Ard Biesheuvel	2	-47/+4
	The zboot decompressor series introduced a feature to sign the PE/COFF kernel image for secure boot as part of the kernel build. This was necessary because there are actually two images that need to be signed: the kernel with the EFI stub attached, and the decompressor application. This is a bit of a burden, because it means that the images must be signed on the same system that performs the build, and this is not realistic for distros. During the next cycle, we will introduce changes to the zboot code so that the inner image no longer needs to be signed. This means that the outer PE/COFF image can be handled as usual, and be signed later in the release process. Let's remove the associated Kconfig options now so that they don't end up in a LTS release while already being deprecated. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21	iommu/vt-d: Clean up si_domain in the init_dmars() error path	Jerry Snitselaar	1	-0/+5
	A splat from kmem_cache_destroy() was seen with a kernel prior to commit ee2653bbe89d ("iommu/vt-d: Remove domain and devinfo mempool") when there was a failure in init_dmars(), because the iommu_domain cache still had objects. While the mempool code is now gone, there still is a leak of the si_domain memory if init_dmars() fails. So clean up si_domain in the init_dmars() error path. Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Fixes: 86080ccc223a ("iommu/vt-d: Allocate si_domain in init_dmars()") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/20221010144842.308890-1-jsnitsel@redhat.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-10-21	iommu/vt-d: Allow NVS regions in arch_rmrr_sanity_check()	Charlotte Tan	1	-1/+3
	arch_rmrr_sanity_check() warns if the RMRR is not covered by an ACPI Reserved region, but it seems like it should accept an NVS region as well. The ACPI spec https://uefi.org/specs/ACPI/6.5/15_System_Address_Map_Interfaces.html uses similar wording for "Reserved" and "NVS" region types; for NVS regions it says "This range of addresses is in use or reserved by the system and must not be used by the operating system." There is an old comment on this mailing list that also suggests NVS regions should pass the arch_rmrr_sanity_check() test: The warnings come from arch_rmrr_sanity_check() since it checks whether the region is E820_TYPE_RESERVED. However, if the purpose of the check is to detect RMRR has regions that may be used by OS as free memory, isn't E820_TYPE_NVS safe, too? This patch overlaps with another proposed patch that would add the region type to the log since sometimes the bug reporter sees this log on the console but doesn't know to include the kernel log: https://lore.kernel.org/lkml/20220611204859.234975-3-atomlin@redhat.com/ Here's an example of the "Firmware Bug" apparent false positive (wrapped for line length): DMAR: [Firmware Bug]: No firmware reserved region can cover this RMRR [0x000000006f760000-0x000000006f762fff], contact BIOS vendor for fixes DMAR: [Firmware Bug]: Your BIOS is broken; bad RMRR [0x000000006f760000-0x000000006f762fff] This is the snippet from the e820 table: BIOS-e820: [mem 0x0000000068bff000-0x000000006ebfefff] reserved BIOS-e820: [mem 0x000000006ebff000-0x000000006f9fefff] ACPI NVS BIOS-e820: [mem 0x000000006f9ff000-0x000000006fffefff] ACPI data Fixes: f036c7fa0ab6 ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved") Cc: Will Mortensen <will@extrahop.com> Link: https://lore.kernel.org/linux-iommu/64a5843d-850d-e58c-4fc2-0a0eeeb656dc@nec.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=216443 Signed-off-by: Charlotte Tan <charlotte@extrahop.com> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Link: 
https://lore.kernel.org/r/20220929044449.32515-1-charlotte@extrahop.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
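	The effect of accepting NVS regions can be checked against the e820 snippet quoted in the message. The sketch below is a simplification under stated assumptions: the types and function names are hypothetical, and coverage is tested against a single map entry, whereas a real check may need to handle ranges that span several entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum region_type { REGION_RESERVED, REGION_ACPI_NVS, REGION_ACPI_DATA };

struct region {
	uint64_t start;
	uint64_t end; /* inclusive */
	enum region_type type;
};

/* The three entries quoted in the commit message. */
static const struct region e820_map[] = {
	{ 0x68bff000ULL, 0x6ebfefffULL, REGION_RESERVED },
	{ 0x6ebff000ULL, 0x6f9fefffULL, REGION_ACPI_NVS },
	{ 0x6f9ff000ULL, 0x6fffefffULL, REGION_ACPI_DATA },
};

/* True if one entry of the given type fully contains [start, end]. */
static bool covered_by_type(uint64_t start, uint64_t end, enum region_type type)
{
	for (size_t i = 0; i < sizeof(e820_map) / sizeof(e820_map[0]); i++) {
		const struct region *r = &e820_map[i];

		if (r->type == type && r->start <= start && end <= r->end)
			return true;
	}
	return false;
}

/* Old behaviour: only a reserved region satisfies the check. */
static bool rmrr_check_reserved_only(uint64_t start, uint64_t end)
{
	return covered_by_type(start, end, REGION_RESERVED);
}

/* New behaviour: a reserved or an ACPI NVS region satisfies the check. */
static bool rmrr_check_reserved_or_nvs(uint64_t start, uint64_t end)
{
	return covered_by_type(start, end, REGION_RESERVED) ||
	       covered_by_type(start, end, REGION_ACPI_NVS);
}
```

	For the RMRR in the example log, [0x6f760000-0x6f762fff], the range falls inside the ACPI NVS entry, so the reserved-only check fails while the relaxed check passes.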
2022-10-21	iommu/vt-d: Use rcu_lock in get_resv_regions	Lu Baolu	1	-3/+3
	Commit 5f64ce5411b46 ("iommu/vt-d: Duplicate iommu_resv_region objects per device list") converted rcu_lock in get_resv_regions to dmar_global_lock to allow sleeping in iommu_alloc_resv_region(). This introduced possible recursive locking if get_resv_regions is called from within a section where intel_iommu_init() already holds dmar_global_lock. Especially, after commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration"), below lockdep splats could always be seen. ============================================ WARNING: possible recursive locking detected 6.0.0-rc4+ #325 Tainted: G I -------------------------------------------- swapper/0/1 is trying to acquire lock: ffffffffa8a18c90 (dmar_global_lock){++++}-{3:3}, at: intel_iommu_get_resv_regions+0x25/0x270 but task is already holding lock: ffffffffa8a18c90 (dmar_global_lock){++++}-{3:3}, at: intel_iommu_init+0x36d/0x6ea ... Call Trace: <TASK> dump_stack_lvl+0x48/0x5f __lock_acquire.cold.73+0xad/0x2bb lock_acquire+0xc2/0x2e0 ? intel_iommu_get_resv_regions+0x25/0x270 ? lock_is_held_type+0x9d/0x110 down_read+0x42/0x150 ? intel_iommu_get_resv_regions+0x25/0x270 intel_iommu_get_resv_regions+0x25/0x270 iommu_create_device_direct_mappings.isra.28+0x8d/0x1c0 ? iommu_get_dma_cookie+0x6d/0x90 bus_iommu_probe+0x19f/0x2e0 iommu_device_register+0xd4/0x130 intel_iommu_init+0x3e1/0x6ea ? iommu_setup+0x289/0x289 ? rdinit_setup+0x34/0x34 pci_iommu_init+0x12/0x3a do_one_initcall+0x65/0x320 ? rdinit_setup+0x34/0x34 ? rcu_read_lock_sched_held+0x5a/0x80 kernel_init_freeable+0x28a/0x2f3 ? rest_init+0x1b0/0x1b0 kernel_init+0x1a/0x130 ret_from_fork+0x1f/0x30 </TASK> This rolls back dmar_global_lock to rcu_lock in get_resv_regions to avoid the lockdep splat. 
Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20220927053109.4053662-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-10-21	iommu: Add gfp parameter to iommu_alloc_resv_region	Lu Baolu	10	-18/+27
	Add gfp parameter to iommu_alloc_resv_region() for the callers to specify the memory allocation behavior. Thus iommu_alloc_resv_region() could also be available in critical contexts. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Link: https://lore.kernel.org/r/20220927053109.4053662-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-10-21	RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc	Anup Patel	3	-2/+19
	The kvm_riscv_vcpu_timer_pending() checks per-VCPU next_cycles and per-VCPU software injected VS timer interrupt. This function returns incorrect value when Sstc is available because the per-VCPU next_cycles are only updated by kvm_riscv_vcpu_timer_save() called from kvm_arch_vcpu_put(). As a result, when Sstc is available the VCPU does not block properly upon WFI traps. To fix the above issue, we introduce kvm_riscv_vcpu_timer_sync() which will update per-VCPU next_cycles upon every VM exit instead of kvm_riscv_vcpu_timer_save(). Fixes: 8f5cb44b1bae ("RISC-V: KVM: Support sstc extension") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>
2022-10-21	RISC-V: Fix compilation without RISCV_ISA_ZICBOM	Andrew Jones	3	-49/+38
	riscv_cbom_block_size and riscv_init_cbom_blocksize() should always be available and riscv_init_cbom_blocksize() should always be invoked, even when compiling without RISCV_ISA_ZICBOM enabled. This is because disabling RISCV_ISA_ZICBOM means "don't use zicbom instructions in the kernel" not "pretend there isn't zicbom, even when there is". When zicbom is available, whether the kernel enables its use with RISCV_ISA_ZICBOM or not, KVM will offer it to guests. Ensure we can build KVM and that the block size is initialized even when compiling without RISCV_ISA_ZICBOM. Fixes: 8f7e001e0325 ("RISC-V: Clean up the Zicbom block size probing") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Anup Patel <anup@brainfault.org>
2022-10-21	i2c: mlxbf: depend on ACPI; clean away ifdeffage	Adam Borowski	2	-9/+1
	This fixes maybe_unused warnings/errors. According to a comment during device tree removal, only ACPI is supported, thus let's actually require it. Fixes: be18c5ede25d ("i2c: mlxbf: remove device tree support") Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-10-20	nouveau: fix migrate_to_ram() for faulting page	Alistair Popple	1	-0/+1
	Commit 16ce101db85d ("mm/memory.c: fix race when faulting a device private page") changed the migrate_to_ram() callback to take a reference on the device page to ensure it can't be freed while handling the fault. Unfortunately the corresponding update to Nouveau to accommodate this change was inadvertently dropped from that patch causing GPU to CPU migration to fail so add it here. Link: https://lkml.kernel.org/r/20221019122934.866205-1-apopple@nvidia.com Fixes: 16ce101db85d ("mm/memory.c: fix race when faulting a device private page") Signed-off-by: Alistair Popple <apopple@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20	mm/huge_memory: do not clobber swp_entry_t during THP split	Mel Gorman	1	-1/+10
	The following has been observed when running stressng mmap since commit b653db77350c ("mm: Clear page->private when splitting or migrating a page") watchdog: BUG: soft lockup - CPU#75 stuck for 26s! [stress-ng:9546] CPU: 75 PID: 9546 Comm: stress-ng Tainted: G E 6.0.0-revert-b653db77-fix+ #29 0357d79b60fb09775f678e4f3f64ef0579ad1374 Hardware name: SGI.COM C2112-4GP3/X10DRT-P-Series, BIOS 2.0a 05/09/2016 RIP: 0010:xas_descend+0x28/0x80 Code: cc cc 0f b6 0e 48 8b 57 08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48 89 77 18 48 89 c1 83 e1 03 48 83 f9 02 75 08 <48> 3d fd 00 00 00 76 08 88 57 12 c3 cc cc cc cc 48 c1 e8 02 89 c2 RSP: 0018:ffffbbf02a2236a8 EFLAGS: 00000246 RAX: ffff9cab7d6a0002 RBX: ffffe04b0af88040 RCX: 0000000000000002 RDX: 0000000000000030 RSI: ffff9cab60509b60 RDI: ffffbbf02a2236c0 RBP: 0000000000000000 R08: ffff9cab60509b60 R09: ffffbbf02a2236c0 R10: 0000000000000001 R11: ffffbbf02a223698 R12: 0000000000000000 R13: ffff9cab4e28da80 R14: 0000000000039c01 R15: ffff9cab4e28da88 FS: 00007fab89b85e40(0000) GS:ffff9cea3fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fab84e00000 CR3: 00000040b73a4003 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> xas_load+0x3a/0x50 __filemap_get_folio+0x80/0x370 ? put_swap_page+0x163/0x360 pagecache_get_page+0x13/0x90 __try_to_reclaim_swap+0x50/0x190 scan_swap_map_slots+0x31e/0x670 get_swap_pages+0x226/0x3c0 folio_alloc_swap+0x1cc/0x240 add_to_swap+0x14/0x70 shrink_page_list+0x968/0xbc0 reclaim_page_list+0x70/0xf0 reclaim_pages+0xdd/0x120 madvise_cold_or_pageout_pte_range+0x814/0xf30 walk_pgd_range+0x637/0xa30 __walk_page_range+0x142/0x170 walk_page_range+0x146/0x170 madvise_pageout+0xb7/0x280 ? asm_common_interrupt+0x22/0x40 madvise_vma_behavior+0x3b7/0xac0 ? find_vma+0x4a/0x70 ? find_vma+0x64/0x70 ? 
madvise_vma_anon_name+0x40/0x40 madvise_walk_vmas+0xa6/0x130 do_madvise+0x2f4/0x360 __x64_sys_madvise+0x26/0x30 do_syscall_64+0x5b/0x80 ? do_syscall_64+0x67/0x80 ? syscall_exit_to_user_mode+0x17/0x40 ? do_syscall_64+0x67/0x80 ? syscall_exit_to_user_mode+0x17/0x40 ? do_syscall_64+0x67/0x80 ? do_syscall_64+0x67/0x80 ? common_interrupt+0x8b/0xa0 entry_SYSCALL_64_after_hwframe+0x63/0xcd The problem can be reproduced with the mmtests config config-workload-stressng-mmap. It does not always happen, and the time it takes to trigger varies, but it has happened on multiple machines. The intent of commit b653db77350c was to avoid the case where PG_private is clear but folio->private is not NULL. However, THP tail pages use page->private for "swp_entry_t if folio_test_swapcache()" as stated in the documentation for struct folio. This patch only clobbers page->private for tail pages if the head page was not in swapcache and warns once if page->private had an unexpected value. Link: https://lkml.kernel.org/r/20221019134156.zjyyn5aownakvztf@techsingularity.net Fixes: b653db77350c ("mm: Clear page->private when splitting or migrating a page") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Yang Shi <shy828301@gmail.com> Cc: Brian Foster <bfoster@redhat.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
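As a rough illustration of the rule described in this commit message (not the kernel code itself), the userspace C sketch below models what happens to a tail page's private field during a split. The names page_model, split_tail_private, private_after_split and warned_once are hypothetical stand-ins; only the condition itself (clear the field unless the head page was in swapcache, and note an unexpected non-zero value once) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a tail page: only the private field is represented. */
struct page_model {
	unsigned long private;
};

/* Set the first time an unexpected non-zero private value is seen. */
static bool warned_once;

/*
 * Model of the split-time fixup: tail pages of a swapcache THP keep their
 * private value (it holds the swap entry); all other tail pages are cleared.
 */
static void split_tail_private(struct page_model *tail, bool head_in_swapcache)
{
	if (head_in_swapcache)
		return;
	if (tail->private != 0)
		warned_once = true;
	tail->private = 0;
}

/* Returns the private value after splitting a tail page that started at 'initial'. */
static unsigned long private_after_split(unsigned long initial, bool head_in_swapcache)
{
	struct page_model tail = { .private = initial };

	split_tail_private(&tail, head_in_swapcache);
	return tail.private;
}
```

In this model a swapcache tail page passes through the split with its value intact, which is the property the fix restores; the unconditional clear is the behaviour it replaces.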
2022-10-20	hugetlb: fix memory leak associated with vma_lock structure	Mike Kravetz	1	-8/+27
	The hugetlb vma_lock structure hangs off the vm_private_data pointer of sharable hugetlb vmas. The structure is vma specific and can not be shared between vmas. At fork and various other times, vmas are duplicated via vm_area_dup(). When this happens, the pointer in the newly created vma must be cleared and the structure reallocated. Two hugetlb specific routines deal with this: hugetlb_dup_vma_private and hugetlb_vm_op_open. Both routines are called for newly created vmas. hugetlb_dup_vma_private would always clear the pointer and hugetlb_vm_op_open would allocate the new vma_lock structure. This did not work in the case of this calling sequence pointed out in [1]. move_vma copy_vma new_vma = vm_area_dup(vma); new_vma->vm_ops->open(new_vma); --> new_vma has its own vma lock. is_vm_hugetlb_page(vma) clear_vma_resv_huge_pages hugetlb_dup_vma_private --> vma->vm_private_data is set to NULL When clearing vm_private_data in hugetlb_dup_vma_private we actually leak the associated vma_lock structure. The vma_lock structure contains a pointer to the associated vma. This information can be used in hugetlb_dup_vma_private and hugetlb_vm_op_open to ensure we only clear the vm_private_data of newly created (copied) vmas. In such cases, the vma->vma_lock->vma field will not point to the vma. Update hugetlb_dup_vma_private and hugetlb_vm_op_open to not clear vm_private_data if vma->vma_lock->vma == vma. Also, log a warning if hugetlb_vm_op_open ever encounters the case where vma_lock has already been correctly allocated for the vma. 
[1] https://lore.kernel.org/linux-mm/5154292a-4c55-28cd-0935-82441e512fc3@huawei.com/ Link: https://lkml.kernel.org/r/20221019201957.34607-1-mike.kravetz@oracle.com Fixes: 131a79b474e9 ("hugetlb: fix vma lock handling during split vma and range unmapping") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
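The ownership check described in this commit message can be sketched in userspace C. This is a simplified model under stated assumptions, not the hugetlb implementation: vma_model, vma_lock_model, vma_lock_alloc, dup_vma_private and ownership_check_works are hypothetical names, and a plain struct copy stands in for vm_area_dup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct vma_model;

/* Hypothetical model of the vma_lock: it records which vma owns it. */
struct vma_lock_model {
	struct vma_model *vma;
};

/* Hypothetical model of a vma: private_data stands in for vm_private_data. */
struct vma_model {
	struct vma_lock_model *private_data;
};

/* Allocate a lock owned by 'vma'. Returns false on allocation failure. */
static bool vma_lock_alloc(struct vma_model *vma)
{
	struct vma_lock_model *lock = malloc(sizeof(*lock));

	if (!lock)
		return false;
	lock->vma = vma;
	vma->private_data = lock;
	return true;
}

/*
 * Clear the pointer only when it was inherited from another vma by a
 * struct copy; a lock whose back-pointer matches is owned and is kept.
 */
static void dup_vma_private(struct vma_model *vma)
{
	struct vma_lock_model *lock = vma->private_data;

	if (lock && lock->vma != vma)
		vma->private_data = NULL;
}

/* Exercise both cases; returns true if the check behaves as described. */
static bool ownership_check_works(void)
{
	struct vma_model orig = { NULL };
	struct vma_model copy;
	bool ok;

	if (!vma_lock_alloc(&orig))
		return false;
	copy = orig;             /* like vm_area_dup(): the pointer is copied */
	dup_vma_private(&copy);  /* inherited pointer: cleared */
	dup_vma_private(&orig);  /* owned lock: kept, so it is not leaked */
	ok = copy.private_data == NULL && orig.private_data != NULL &&
	     orig.private_data->vma == &orig;
	free(orig.private_data);
	return ok;
}
```

The back-pointer comparison is what lets one routine serve both cases: a freshly copied vma fails the comparison and is reset, while a vma that already went through its open routine keeps its lock.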
2022-10-20	mm/page_alloc: reduce potential fragmentation in make_alloc_exact()	Liam R. Howlett	1	-8/+12
	Try to avoid using the leftover split page on the next request for a page by calling __free_pages_ok() with FPI_TO_TAIL. This increases the potential for defragmenting memory when it's used for a short period of time. Link: https://lkml.kernel.org/r/20220531185626.yvlmymbxyoe5vags@revolver Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
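To show why freeing to the tail keeps a page from being handed out again right away, here is a minimal userspace C sketch. It assumes a free list that allocates from its head, modelled as a small ring buffer; free_list_model, free_to_head, free_to_tail and alloc_page_model are hypothetical names and do not correspond to the page allocator's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CAP 8

/* Hypothetical free list: a small ring buffer of page numbers. */
struct free_list_model {
	int pages[MODEL_CAP];
	int head;   /* index of the next page handed out */
	int count;  /* number of pages currently on the list */
};

/* Free to the head: this page is the next one allocated. */
static bool free_to_head(struct free_list_model *fl, int page)
{
	if (fl->count == MODEL_CAP)
		return false;
	fl->head = (fl->head + MODEL_CAP - 1) % MODEL_CAP;
	fl->pages[fl->head] = page;
	fl->count++;
	return true;
}

/* Free to the tail: this page is allocated only after all others. */
static bool free_to_tail(struct free_list_model *fl, int page)
{
	if (fl->count == MODEL_CAP)
		return false;
	fl->pages[(fl->head + fl->count) % MODEL_CAP] = page;
	fl->count++;
	return true;
}

/* Allocate from the head; returns -1 if the list is empty. */
static int alloc_page_model(struct free_list_model *fl)
{
	int page;

	if (fl->count == 0)
		return -1;
	page = fl->pages[fl->head];
	fl->head = (fl->head + 1) % MODEL_CAP;
	fl->count--;
	return page;
}
```

In the model, a page freed with free_to_tail() stays on the list until everything ahead of it has been allocated, so it remains available for longer than a page freed to the head would.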
2022-10-20	mm: /proc/pid/smaps_rollup: fix maple tree search	Hugh Dickins	1	-1/+1
	/proc/pid/smaps_rollup showed 0 kB for everything: now find first vma. Link: https://lkml.kernel.org/r/3011bee7-182-97a2-1083-d5f5b688e54b@google.com Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end") Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20	mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages	Rik van Riel	1	-1/+1
	The h->*_huge_pages counters are protected by the hugetlb_lock, but alloc_huge_page has a corner case where it can decrement the counter outside of the lock. This could lead to a corrupted value of h->resv_huge_pages, which we have observed on our systems. Take the hugetlb_lock before decrementing h->resv_huge_pages to avoid a potential race. Link: https://lkml.kernel.org/r/20221017202505.0e6a4fcd@imladris.surriel.com Fixes: a88c76954804 ("mm: hugetlb: fix hugepage memory leak caused by wrong reserve count") Signed-off-by: Rik van Riel <riel@surriel.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Glen McCready <gkmccready@meta.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20	mm/mmap: fix MAP_FIXED address return on VMA merge	Liam Howlett	1	-8/+7
	mmap should return the start address of the newly mapped area when successful. On a successful merge of a VMA, the return address was changed and thus was violating that expectation from userspace. This is a restoration of functionality provided by 309d08d9b3a3 (mm/mmap.c: fix mmap return value when vma is merged after call_mmap()). For completeness of fixing MAP_FIXED, implement the comments from the previous discussion to never update the address and fail if the address changes. The error is left as a WARN_ON() to avoid crashing the kernel. Link: https://lkml.kernel.org/r/20221018191613.4133459-1-Liam.Howlett@oracle.com Link: https://lore.kernel.org/all/Y06yk66SKxlrwwfb@lakrids/ Link: https://lore.kernel.org/all/20201203085350.22624-1-liuzixian4@huawei.com/ Fixes: 4dd1b84140c1 ("mm/mmap: use advanced maple tree API for mmap_region()") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Cc: Liu Zixian <liuzixian4@huawei.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
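The userspace expectation stated above can be checked with a short C program fragment. This is only a sketch of the contract (a successful MAP_FIXED mapping returns the requested address, even when it sits next to a compatible mapping it could merge with); it does not attempt to reproduce the regression itself, and the helper name fixed_mmap_returns_requested_address is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns true if a MAP_FIXED mapping comes back at the requested address. */
static bool fixed_mmap_returns_requested_address(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len;
	char *base;
	void *want, *got;
	bool ok;

	if (page <= 0)
		return false;
	len = (size_t)page * 2;

	/* Reserve two pages so the fixed address below is safe to replace. */
	base = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return false;

	/*
	 * Map the second page again with the same protection and flags as its
	 * neighbour, so the two areas are eligible for merging.
	 */
	want = base + page;
	got = mmap(want, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	ok = (got == want);

	munmap(base, len);
	return ok;
}
```

Reserving the range first matters because MAP_FIXED silently replaces whatever is already mapped at the target address.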
2022-10-20	mm/mmap.c: __vma_adjust(): suppress uninitialized var warning	Andrew Morton	1	-1/+2
	The code is OK, but it fools gcc. mm/mmap.c:802 __vma_adjust() error: uninitialized symbol 'next_next'. Fixes: 524e00b36e8c5 ("mm: remove rb tree.") Reported-by: kernel test robot <lkp@intel.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20	mm/mmap: undo ->mmap() when mas_preallocate() fails	Mike Kravetz	1	-1/+1
	A memory leak in hugetlb_reserve_pages was reported in [1]. The root cause was traced to an error path in mmap_region when mas_preallocate() fails. In this case, the vma is freed after a successful call to filesystem specific mmap. The hugetlbfs mmap routine may allocate data structures pointed to by vm_private_data. These need to be cleaned up by the hugetlb vm_ops->close() routine. The same issue was addressed by commit deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails") for the arch_validate_flags() test. Go to the same close_and_free_vma label if mas_preallocate() fails. [1] https://lore.kernel.org/linux-mm/CAKXUXMxf7OiCwbxib7MwfR4M1b5+b3cNTU7n5NV9Zm4967=FPQ@mail.gmail.com/ Link: https://lkml.kernel.org/r/20221018024945.415036-1-mike.kravetz@oracle.com Fixes: d4af56c5c7c6 ("mm: start tracking VMAs with maple tree") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Carlos Llamas <cmllamas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
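The error-path ordering behind this fix can be sketched in userspace C. The model below is an assumption-laden illustration, not mmap_region(): vma_model, fs_mmap_model, fs_close_model, map_region_model and the live_private counter are hypothetical, and only the label names close_and_free_vma and free_vma follow the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vma: private_data stands in for vm_private_data. */
struct vma_model {
	void *private_data;
};

/* Number of private allocations not yet released; non-zero means a leak. */
static int live_private;

/* Models a filesystem ->mmap() that allocates per-vma private data. */
static int fs_mmap_model(struct vma_model *vma)
{
	vma->private_data = malloc(16);
	if (!vma->private_data)
		return -1;
	live_private++;
	return 0;
}

/* Models vm_ops->close(): releases what ->mmap() allocated. */
static void fs_close_model(struct vma_model *vma)
{
	free(vma->private_data);
	vma->private_data = NULL;
	live_private--;
}

/* Returns 0 on success, -1 on failure; never leaves private data behind. */
static int map_region_model(int prealloc_fails)
{
	struct vma_model *vma = malloc(sizeof(*vma));
	int error = -1;

	if (!vma)
		return -1;
	vma->private_data = NULL;

	if (fs_mmap_model(vma))
		goto free_vma;            /* nothing to undo yet */

	if (prealloc_fails)
		goto close_and_free_vma;  /* ->mmap() succeeded, so undo it */

	/* Success. The model tears the mapping down at once to stay self-contained. */
	error = 0;

close_and_free_vma:
	fs_close_model(vma);
free_vma:
	free(vma);
	return error;
}
```

Jumping to free_vma after a successful fs_mmap_model() would leave live_private at 1, which is the leak pattern the commit message describes.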
2022-10-20	init: Kconfig: fix spelling mistake "satify" -> "satisfy"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a Kconfig description. Fix it. Link: https://lkml.kernel.org/r/20221007204339.2757753-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20	ocfs2: clear dinode links count in case of error	Joseph Qi	1	-2/+10
	In ocfs2_mknod(), if an error occurs after the dinode has been successfully allocated, the ocfs2 i_links_count will not be 0. So even though we clear the inode's i_nlink before iput in error handling, the inode still won't be wiped, since we refresh the inode from the dinode during inode lock. So, just as we clear the inode's i_nlink, clear the ocfs2 i_links_count as well. Also make the same change for ocfs2_symlink(). Link: https://lkml.kernel.org/r/20221017130227.234480-2-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Yan Wang <wangyan122@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20	ocfs2: fix BUG when iput after ocfs2_mknod fails	Joseph Qi	1	-10/+1
	Commit b1529a41f777 "ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error" tried to reclaim the claimed inode if __ocfs2_mknod_locked() fails later. But this introduces a race: the freed bit may be reused immediately by another thread, which will update the dinode, e.g. i_generation. A subsequent iput of this inode then leads to BUG: inode->i_generation != le32_to_cpu(fe->i_generation) We could mark this inode as bad, but we do want to perform operations like wipe in some cases. Since the claimed inode bit only means that a dinode is missing, and it will be returned after fsck, it does not seem to be a big problem. So just leave it as is by reverting the reclaim logic. Link: https://lkml.kernel.org/r/20221017130227.234480-1-joseph.qi@linux.alibaba.com Fixes: b1529a41f777 ("ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Yan Wang <wangyan122@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>