aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/scripts/generate_rust_analyzer.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2024-07-01gpio: add sloppy logic analyzer using pollingWolfram Sang6-0/+704
This is a sloppy logic analyzer using GPIOs. It comes with a script to isolate a CPU for polling. While this is definitely not a production level analyzer, it can be a helpful first view when remote debugging. Read the documentation for details. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240620094159.6785-2-wsa+renesas@sang-engineering.com [Bartosz: moved the Kconfig entry into a different category] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-07-01Documentation: gpio: Reconfiguration with unset direction (uAPI v2)Kent Gibson1-2/+5
Update description of reconfiguration rules, adding requirement that a direction flag be set to enable changing configuration for a line. Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240626052925.174272-5-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-07-01Documentation: gpio: Reconfiguration with unset direction (uAPI v1)Kent Gibson1-1/+4
Update description of reconfiguration rules, adding requirement that a direction flag be set or the configuration is considered invalid. Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240626052925.174272-4-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-28dt-bindings: gpio: fsl,qoriq-gpio: add common property gpio-line-namesFrank Li1-0/+4
Add common gpio-line-names property for fsl,qoriq-gpio to fix below warning. arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-kbox-a-230-ls.dtb: gpio@2300000: 'gpio-line-names' does not match any of the regexes: 'pinctrl-[0-9]+' from schema $id: http://devicetree.org/schemas/gpio/fsl,qoriq-gpio.yaml Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20240627202151.456812-1-Frank.Li@nxp.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-26gpio: ath79: convert to dynamic GPIO base allocationShiji Yang1-2/+0
ath79 target has already been converted to device tree based platform. Use dynamic GPIO numberspace base to suppress the warning: gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation. Tested on Atheros AR7241 and AR9344. Signed-off-by: Shiji Yang <yangshiji66@outlook.com> Suggested-by: Jonas Gorski <jonas.gorski@gmail.com> Link: https://lore.kernel.org/r/TYCP286MB089598EA71E964BD8AB9EFD3BCD62@TYCP286MB0895.JPNP286.PROD.OUTLOOK.COM [Bartosz: tweaked the commit message] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-25pinctrl: da9062: replace gpiochip_get_desc() with gpio_device_get_desc()Bartosz Golaszewski1-1/+1
In order to finally confine the unsafe gpiochip_get_desc() to drivers/gpio/, let's convert this driver to using the safer alternative that takes the gpio_device as argument. Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240618111824.15593-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-14gpiolib: put gpio_suffixes in a single compilation unitBartosz Golaszewski4-5/+10
The gpio_suffixes array is defined in the gpiolib.h header. This means the array is stored in .rodata of every compilation unit that includes it. Put the definition for the array in gpiolib.c and export just the symbol in the header. We need the size of the array so expose it too. Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20240612184821.58053-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-12Documentation: gpio: Clarify effect of active low flag on line edgesKent Gibson2-0/+10
The documentation does not make sufficiently clear that edge polarity is based on changes to the logical line values, and that the physical polarity of edges is dependent on the active low flag. Clarify the relationship between the active low flag and edge polarity for the functions that read edge events. Suggested-by: David C. Rankin <drankinatty@gmail.com> Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240610092157.9147-3-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-12Documentation: gpio: Clarify effect of active low flag on line valuesKent Gibson4-0/+28
The documentation does not make sufficiently clear that the uAPI deals with logical line values, nor does it describe the influence of the active low flag on the mapping between physical and logical line values. Clarify the relationship between physical and logical line values and the effect of the active low flag for ioctls passing line values. Suggested-by: David C. Rankin <drankinatty@gmail.com> Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240610092157.9147-2-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-11gpiolib: Remove data-less gpiochip_add() functionAndrew Davis3-8/+3
GPIO chips should be added with driver-private data associated with the chip. If none is needed, NULL can be used. All users already do this except one, fix that here. With no more users of the base gpiochip_add() we can drop this function so no more users show up later. Signed-off-by: Andrew Davis <afd@ti.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240610135313.142571-1-afd@ti.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-11gpio: sim: use devm_mutex_init()Bartosz Golaszewski1-10/+1
Drop the hand-coded devres action callback for destroying the mutex in favor of devm_mutex_init(). Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240610140548.35358-4-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-11gpio: sim: drop kernel.h includeBartosz Golaszewski1-1/+1
We included kernel.h for ARRAY_SIZE() which has since been moved into its own header. Use it instead. Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240610140548.35358-3-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-11gpio: sim: use device_match_name() instead of strcmp(dev_name(...Bartosz Golaszewski1-12/+12
Use the dedicated helper for comparing device names against strings. While at it: reshuffle the code a bit for less indentation. Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20240610140548.35358-2-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-10docs: gpio: prefer pread(2) for interrupt readingHuichun Feng1-3/+4
In legacy sysfs GPIO, when using poll(2) on the sysfs GPIO value for state change awaiting, a subsequent read(2) is required for consuming the event, which the doc recommends the use of lseek(2) or close-and-reopen to reset the file offset afterwards. The recommendations however, require at least 2 syscalls to consume the event. Gladly, use of pread(2) require only 1 syscall for the consumption. Let's advertise this usage by prioritizing its placement. Signed-off-by: Huichun Feng <foxhoundsk.tw@gmail.com> Link: https://lore.kernel.org/r/20240609173728.2950808-1-foxhoundsk.tw@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-04gpiolib: Show more info for interrupt only lines in debugfsAndy Shevchenko1-2/+2
Show more info for interrupt only lines in debugfs. It's useful to monitor the lines that have been never requested as GPIOs, but IRQs. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240530191418.1138003-3-andriy.shevchenko@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-04gpiolib: Return label, if set, for IRQ only lineAndy Shevchenko1-6/+6
If line has been locked as IRQ without requesting, still check its label and return it, if not NULL. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240530191418.1138003-2-andriy.shevchenko@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03gpiolib: make gpiochip_set_desc_names() return voidBartosz Golaszewski1-8/+4
gpiochip_set_desc_names() cannot fail so drop its return value. Link: https://lore.kernel.org/r/20240527194613.197810-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03dt-bindings: gpio: aspeed,sgpio: Specify #interrupt-cellsAndrew Jeffery1-0/+5
Squash warnings such as: arch/arm/boot/dts/aspeed/aspeed-ast2500-evb.dtb: sgpio@1e780200: '#interrupt-cells' does not match any of the regexes: 'pinctrl-[0-9]+' Also, mark #interrupt-cells as required. The kernel devicetrees already specified it where compatible nodes were defined, and u-boot pulls in the kernel devicetrees, so this should have minimal practical impact. Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240530-dt-warnings-gpio-sgpio-interrupt-cells-v2-2-912cd16e641f@codeconstruct.com.au Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03dt-bindings: gpio: aspeed,sgpio: Specify gpio-line-namesAndrew Jeffery1-0/+5
Some devicetrees specify gpio-line-names in the sgpio node despite it not being defined by the binding. It's a reasonable thing to do, so define the property to squash warnings such as: arch/arm/boot/dts/aspeed/aspeed-bmc-vegman-rx20.dtb: sgpio@1e780200: 'gpio-line-names' does not match any of the regexes: 'pinctrl-[0-9]+' Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Andrew Jeffery <andrew@codeconstruct.com.au> Link: https://lore.kernel.org/r/20240530-dt-warnings-gpio-sgpio-interrupt-cells-v2-1-912cd16e641f@codeconstruct.com.au Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03dt-bindings: gpio: mpc8xxx: Convert to yaml formatFrank Li2-53/+82
Convert binding doc from txt to yaml. Remove redundated "gpio1: gpio@2300000" example. Add gpio-controller at example "gpio@1100". Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20240531164357.1419858-1-Frank.Li@nxp.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03dt-bindings: gpio: pca95xx: Document the TI TCA9535 variantFabio Estevam1-0/+1
The TI TCA9535 variant has the same programming model as the NXP PCA9535. Document the "ti,tca9535" compatible string. Signed-off-by: Fabio Estevam <festevam@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20240531121801.2161154-2-festevam@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03gpio: pca953x: Add support for TI TCA9535 variantFabio Estevam1-0/+1
Add support for the TI TCA9535 variant. The NXP PCA9535 is already supported by the driver. TCA9535 supports lower voltage operation (down to 1.65V VCC) compared to PCA (down to 2.3V VCC). >From a software perspective, these models are equivalent as they have the same register map. Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20240531121801.2161154-1-festevam@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-03gpio: brcmstb: Allow building driver for ARCH_BCM2835Peter Robinson1-1/+1
The GPIO_BRCMSTB hardware IP is also included in the bcm2712 SoC so enable the driver to also be built for ARCH_BCM2835. Signed-off-by: Peter Robinson <pbrobinson@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240530181737.1261450-1-pbrobinson@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-30gpiolib: cdev: Cleanup kfifo_out() error handlingKent Gibson1-26/+27
The handling of kfifo_out() errors in read functions obscures any error. The error condition should never occur but, while a ret is set to -EIO, it is subsequently ignored and the read functions instead return the number of bytes copied to that point, potentially masking the fact that any error occurred. Log a warning and return -EIO in the case of a kfifo_out() error to make it clear something very odd is going on here. Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240529131953.195777-4-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-30gpiolib: cdev: Refactor allocation of linereq events kfifoKent Gibson1-13/+13
The allocation of the linereq events kfifo is performed in two separate places. Add a helper function to remove the duplication. Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240529131953.195777-3-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-30gpiolib: cdev: Add INIT_KFIFO() for linereq eventsKent Gibson1-0/+1
The initialisation of the linereq events kfifo relies on the struct being zeroed and a subsequent call to kfifo_alloc(). The call to kfifo_alloc() is deferred until edge detection is first enabled for the linereq. If the kfifo is inadvertently accessed before the call to kfifo_alloc(), as was the case in a recently discovered bug, it behaves as a FIFO of size 1 with an element size of 0, so writes and reads to the kfifo appear successful but copy no actual data. As a defensive measure, initialise the kfifo with INIT_KFIFO() when the events kfifo is constructed. This initialises the kfifo element size and zeroes its data pointer, so any inadvertant access prior to the kfifo_alloc() call will trigger an oops. Signed-off-by: Kent Gibson <warthog618@gmail.com> Link: https://lore.kernel.org/r/20240529131953.195777-2-warthog618@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-30gpio: rdc321x: Convert PCIBIOS_* return codes to errnosIlpo Järvinen1-3/+3
rdc_gpio_config() uses pci_{read,write}_config_dword() that return PCIBIOS_* codes. rdc_gpio_config() is used for direction_{input,output}() in the struct gpio_chip which both require normal errnos to be returned. Similarly, rdc321x_gpio_probe() that is probe function returns PCIBIOS_* codes without converting them first into normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning them to fix both issues. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240527132345.13956-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-30gpio: amd8111: Convert PCIBIOS_* return codes to errnosIlpo Järvinen1-1/+3
amd_gpio_init() uses pci_read_config_dword() that returns PCIBIOS_* codes. The return code is then returned as is but amd_gpio_init() is a module init function that should return normal errnos. Convert PCIBIOS_* returns code using pcibios_err_to_errno() into normal errno before returning it from amd_gpio_init(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240527132345.13956-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-29gpio: syscon: do not report bogus errorEtienne Buira1-11/+16
Do not issue "can't read the data register offset!" when gpio,syscon-dev is not set albeit unneeded. gpio-syscon is used with rk3328 chip, but this iomem region is documented in Documentation/devicetree/bindings/gpio/rockchip,rk3328-grf-gpio.yaml and does not require gpio,syscon-dev setting. Signed-off-by: Etienne Buira <etienne.buira@free.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/Zktwd4Y8zu6XSGaE@Z926fQmE5jqhFMgp6 Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-27dt-bindings: gpio: lsi,zevio-gpio: convert to dtschemaPratik Farkase2-16/+43
Convert Zevio GPIO Controller from text to dtschema. Adding `interrupts` property fixes the following warning: linux/out/arch/arm/boot/dts/nspire/nspire-tp.dtb: gpio@90000000: 'interrupts' does not match any of the regexes: 'pinctrl-[0-9]+'` while executing `make dtbs_check` on Texas Instruments nspire boards. Signed-off-by: Pratik Farkase <pratik.farkase@wsisweden.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20240522151616.27397-1-pratik.farkase@wsisweden.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-27gpio: prevent potential speculation leaks in gpio_device_get_desc()Hagar Hemdan1-1/+2
Userspace may trigger a speculative read of an address outside the gpio descriptor array. Users can do that by calling gpio_ioctl() with an offset out of range. Offset is copied from user and then used as an array index to get the gpio descriptor without sanitization in gpio_device_get_desc(). This change ensures that the offset is sanitized by using array_index_nospec() to mitigate any possibility of speculative information leaks. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Signed-off-by: Hagar Hemdan <hagarhem@amazon.com> Link: https://lore.kernel.org/r/20240523085332.1801-1-hagarhem@amazon.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-27gpio: Remove legacy API documentationAndy Shevchenko9-1900/+2
In order to discourage people to use old and legacy GPIO APIs remove the respective documentation completely. It also helps further cleanups of the legacy GPIO API leftovers, which is ongoing task. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Hu Haowen <2023002089@link.tyut.edu.cn> Link: https://lore.kernel.org/r/20240508101703.830066-1-andriy.shevchenko@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-05-26Linux 6.10-rc1Linus Torvalds1-3/+3
2024-05-26mm: percpu: Include smp.h in alloc_tag.hKent Overstreet1-0/+1
percpu.h depends on smp.h, but doesn't include it directly because of circular header dependency issues; percpu.h is needed in a bunch of low level headers. This fixes a randconfig build error on mips: include/linux/alloc_tag.h: In function '__alloc_tag_ref_set': include/asm-generic/percpu.h:31:40: error: implicit declaration of function 'raw_smp_processor_id' [-Werror=implicit-function-declaration] Reported-by: kernel test robot <lkp@intel.com> Fixes: 24e44cc22aa3 ("mm: percpu: enable per-cpu allocation tagging") Closes: https://lore.kernel.org/oe-kbuild-all/202405210052.DIrMXJNz-lkp@intel.com/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-26Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"Arnaldo Carvalho de Melo4-103/+68
This reverts commit 617824a7f0f73e4de325cf8add58e55b28c12493. This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels, as discussed at length in the threads in the Link tags below. The fix provided by Ian wasn't acceptable and work to fix this will take time we don't have at this point, so lets revert this and work on it on the next devel cycle. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com> Cc: Ethan Adams <j.ethan.adams@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tycho Andersen <tycho@tycho.pizza> Cc: Yang Jihong <yangjihong@bytedance.com> Link: https://lore.kernel.org/lkml/CAHk-=wi5Ri=yR2jBVk-4HzTzpoAWOgstr1LEvg_-OXtJvXXJOA@mail.gmail.com Link: https://lore.kernel.org/lkml/CAHk-=wiWvtFyedDNpoV7a8Fq_FpbB+F5KmWK2xPY3QoYseOf_A@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-24cifs: Fix missing set of remote_i_sizeDavid Howells2-3/+4
Occasionally, the generic/001 xfstest will fail indicating corruption in one of the copy chains when run on cifs against a server that supports FSCTL_DUPLICATE_EXTENTS_TO_FILE (eg. Samba with a share on btrfs). The problem is that the remote_i_size value isn't updated by cifs_setsize() when called by smb2_duplicate_extents(), but i_size *is*. This may cause cifs_remap_file_range() to then skip the bit after calling ->duplicate_extents() that sets sizes. Fix this by calling netfs_resize_file() in smb2_duplicate_extents() before calling cifs_setsize() to set i_size. This means we don't then need to call netfs_resize_file() upon return from ->duplicate_extents(), but we also fix the test to compare against the pre-dup inode size. [Note that this goes back before the addition of remote_i_size with the netfs_inode struct. It should probably have been setting cifsi->server_eof previously.] Fixes: cfc63fc8126a ("smb3: fix cached file size problems in duplicate extents (reflink)") Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev Signed-off-by: Steve French <stfrench@microsoft.com>
2024-05-24cifs: Fix smb3_insert_range() to move the zero_pointDavid Howells1-0/+1
Fix smb3_insert_range() to move the zero_point over to the new EOF. Without this, generic/147 fails as reads of data beyond the old EOF point return zeroes. Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib") Signed-off-by: David Howells <dhowells@redhat.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev Signed-off-by: Steve French <stfrench@microsoft.com>
2024-05-24mm/ksm: fix possible UAF of stable_nodeChengming Zhou1-1/+2
The commit 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication limit") introduced a possible failure case in the stable_tree_insert(), where we may free the new allocated stable_node_dup if we fail to prepare the missing chain node. Then that kfolio return and unlock with a freed stable_node set... And any MM activities can come in to access kfolio->mapping, so UAF. Fix it by moving folio_set_stable_node() to the end after stable_node is inserted successfully. Link: https://lkml.kernel.org/r/20240513-b4-ksm-stable-node-uaf-v1-1-f687de76f452@linux.dev Fixes: 2c653d0ee2ae ("ksm: introduce ksm_max_page_sharing per page deduplication limit") Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Stefan Roesch <shr@devkernel.io> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24mm/memory-failure: fix handling of dissolved but not taken off from buddy pagesMiaohe Lin1-2/+2
When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock againYuanyuan Zhong1-2/+7
After switching smaps_rollup to use VMA iterator, searching for next entry is part of the condition expression of the do-while loop. So the current VMA needs to be addressed before the continue statement. Otherwise, with some VMAs skipped, userspace observed memory consumption from /proc/pid/smaps_rollup will be smaller than the sum of the corresponding fields from /proc/pid/smaps. Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end") Signed-off-by: Yuanyuan Zhong <yzhong@purestorage.com> Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24nilfs2: fix potential hang in nilfs_detach_log_writer()Ryusuke Konishi1-3/+18
Syzbot has reported a potential hang in nilfs_detach_log_writer() called during nilfs2 unmount. Analysis revealed that this is because nilfs_segctor_sync(), which synchronizes with the log writer thread, can be called after nilfs_segctor_destroy() terminates that thread, as shown in the call trace below: nilfs_detach_log_writer nilfs_segctor_destroy nilfs_segctor_kill_thread --> Shut down log writer thread flush_work nilfs_iput_work_func nilfs_dispose_list iput nilfs_evict_inode nilfs_transaction_commit nilfs_construct_segment (if inode needs sync) nilfs_segctor_sync --> Attempt to synchronize with log writer thread *** DEADLOCK *** Fix this issue by changing nilfs_segctor_sync() so that the log writer thread returns normally without synchronizing after it terminates, and by forcing tasks that are already waiting to complete once after the thread terminates. The skipped inode metadata flushout will then be processed together in the subsequent cleanup work in nilfs_segctor_destroy(). Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+e3973c409251e136fdd0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0 Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Cc: "Bai, Shuangpeng" <sjb7183@psu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24nilfs2: fix unexpected freezing of nilfs_segctor_sync()Ryusuke Konishi1-4/+13
A potential and reproducible race issue has been identified where nilfs_segctor_sync() would block even after the log writer thread writes a checkpoint, unless there is an interrupt or other trigger to resume log writing. This turned out to be because, depending on the execution timing of the log writer thread running in parallel, the log writer thread may skip responding to nilfs_segctor_sync(), which causes a call to schedule() waiting for completion within nilfs_segctor_sync() to lose the opportunity to wake up. The reason why waking up the task waiting in nilfs_segctor_sync() may be skipped is that updating the request generation issued using a shared sequence counter and adding an wait queue entry to the request wait queue to the log writer, are not done atomically. There is a possibility that log writing and request completion notification by nilfs_segctor_wakeup() may occur between the two operations, and in that case, the wait queue entry is not yet visible to nilfs_segctor_wakeup() and the wake-up of nilfs_segctor_sync() will be carried over until the next request occurs. Fix this issue by performing these two operations simultaneously within the lock section of sc_state_lock. Also, following the memory barrier guidelines for event waiting loops, move the call to set_current_state() in the same location into the event waiting loop to ensure that a memory barrier is inserted just before the event condition determination. Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com Fixes: 9ff05123e3bf ("nilfs2: segment constructor") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Cc: "Bai, Shuangpeng" <sjb7183@psu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24nilfs2: fix use-after-free of timer for log writer threadRyusuke Konishi1-6/+19
Patch series "nilfs2: fix log writer related issues". This bug fix series covers three nilfs2 log writer-related issues, including a timer use-after-free issue and potential deadlock issue on unmount, and a potential freeze issue in event synchronization found during their analysis. Details are described in each commit log. This patch (of 3): A use-after-free issue has been reported regarding the timer sc_timer on the nilfs_sc_info structure. The problem is that even though it is used to wake up a sleeping log writer thread, sc_timer is not shut down until the nilfs_sc_info structure is about to be freed, and is used regardless of the thread's lifetime. Fix this issue by limiting the use of sc_timer only while the log writer thread is alive. Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: "Bai, Shuangpeng" <sjb7183@psu.edu> Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24selftests/mm: fix build warnings on ppc64Michael Ellerman2-0/+2
Fix warnings like: In file included from uffd-unit-tests.c:8: uffd-unit-tests.c: In function `uffd_poison_handle_fault': uffd-common.h:45:33: warning: format `%llu' expects argument of type `long long unsigned int', but argument 3 has type `__u64' {aka `long unsigned int'} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.au Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24arm64: patching: fix handling of execmem addressesWill Deacon1-1/+1
Klara Modin reported warnings for a kernel configured with BPF_JIT but without MODULES: [ 44.131296] Trying to vfree() bad address (000000004a17c299) [ 44.138024] WARNING: CPU: 1 PID: 193 at mm/vmalloc.c:3189 remove_vm_area (mm/vmalloc.c:3189 (discriminator 1)) [ 44.146675] CPU: 1 PID: 193 Comm: kworker/1:2 Tainted: G D W 6.9.0-01786-g2c9e5d4a0082 #25 [ 44.158229] Hardware name: Raspberry Pi 3 Model B (DT) [ 44.164433] Workqueue: events bpf_prog_free_deferred [ 44.170492] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 44.178601] pc : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1)) [ 44.183705] lr : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1)) [ 44.188772] sp : ffff800082a13c70 [ 44.193112] x29: ffff800082a13c70 x28: 0000000000000000 x27: 0000000000000000 [ 44.201384] x26: 0000000000000000 x25: ffff00003a44efa0 x24: 00000000d4202000 [ 44.209658] x23: ffff800081223dd0 x22: ffff00003a198a40 x21: ffff8000814dd880 [ 44.217924] x20: 00000000d4202000 x19: ffff8000814dd880 x18: 0000000000000006 [ 44.226206] x17: 0000000000000000 x16: 0000000000000020 x15: 0000000000000002 [ 44.234460] x14: ffff8000811a6370 x13: 0000000020000000 x12: 0000000000000000 [ 44.242710] x11: ffff8000811a6370 x10: 0000000000000144 x9 : ffff8000811fe370 [ 44.250959] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000811fe370 [ 44.259206] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 [ 44.267457] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000002203240 [ 44.275703] Call trace: [ 44.279158] remove_vm_area (mm/vmalloc.c:3189 (discriminator 1)) [ 44.283858] vfree (mm/vmalloc.c:3322) [ 44.287835] execmem_free (mm/execmem.c:70) [ 44.292347] bpf_jit_free_exec+0x10/0x1c [ 44.297283] bpf_prog_pack_free (kernel/bpf/core.c:1006) [ 44.302457] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195) [ 44.307951] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474) [ 44.312342] bpf_prog_free_deferred (kernel/bpf/core.c:2785) [ 44.317785] process_one_work (kernel/workqueue.c:3273) [ 44.322684] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2)) [ 44.327292] kthread (kernel/kthread.c:388) [ 44.331342] ret_from_fork (arch/arm64/kernel/entry.S:861) The problem is because bpf_arch_text_copy() silently fails to write to the read-only area as a result of patch_map() faulting and the resulting -EFAULT being chucked away. Update patch_map() to use CONFIG_EXECMEM instead of CONFIG_STRICT_MODULE_RWX to check for vmalloc addresses. Link: https://lkml.kernel.org/r/20240521213813.703309-1-rppt@kernel.org Fixes: 2c9e5d4a0082 ("bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of") Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Reported-by: Klara Modin <klarasmodin@gmail.com> Closes: https://lore.kernel.org/all/7983fbbf-0127-457c-9394-8d6e4299c685@gmail.com Tested-by: Klara Modin <klarasmodin@gmail.com> Cc: Björn Töpel <bjorn@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocationDev Jain1-22/+49
Reset nr_hugepages to zero before the start of the test. If a non-zero number of hugepages is already set before the start of the test, the following problems arise: - The probability of the test getting OOM-killed increases. Proof: The test wants to run on 80% of available memory to prevent OOM-killing (see original code comments). Let the value of mem_free at the start of the test, when nr_hugepages = 0, be x. In the other case, when nr_hugepages > 0, let the memory consumed by hugepages be y. In the former case, the test operates on 0.8 * x of memory. In the latter, the test operates on 0.8 * (x - y) of memory, with y already filled, hence, memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 * x. Q.E.D - The probability of a bogus test success increases. Proof: Let the memory consumed by hugepages be greater than 25% of x, with x and y defined as above. The definition of compaction_index is c_index = (x - y)/z where z is the memory consumed by hugepages after trying to increase them again. In check_compaction(), we set the number of hugepages to zero, and then increase them back; the probability that they will be set back to consume at least y amount of memory again is very high (since there is not much delay between the two attempts of changing nr_hugepages). Hence, z >= y > (x/4) (by the 25% assumption). Therefore, c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3 hence, c_index can always be forced to be less than 3, thereby the test succeeding always. Q.E.D Link: https://lkml.kernel.org/r/20240521074358.675031-4-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain@arm.com> Cc: <stable@vger.kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepagesDev Jain1-0/+2
Currently, the test tries to set nr_hugepages to zero, but that is not actually done because the file offset is not reset after read(). Fix that using lseek(). Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain@arm.com> Cc: <stable@vger.kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24selftests/mm: compaction_test: fix bogus test success on Aarch64Dev Jain1-7/+13
Patch series "Fixes for compaction_test", v2. The compaction_test memory selftest introduces fragmentation in memory and then tries to allocate as many hugepages as possible. This series addresses some problems. On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since compaction_index becomes 0, which is less than 3, due to no division by zero exception being raised. We fix that by checking for division by zero. Secondly, correctly set the number of hugepages to zero before trying to set a large number of them. Now, consider a situation in which, at the start of the test, a non-zero number of hugepages have been already set (while running the entire selftests/mm suite, or manually by the admin). The test operates on 80% of memory to avoid OOM-killer invocation, and because some memory is already blocked by hugepages, it would increase the chance of OOM-killing. Also, since mem_free used in check_compaction() is the value before we set nr_hugepages to zero, the chance that the compaction_index will be small is very high if the preset nr_hugepages was high, leading to a bogus test success. This patch (of 3): Currently, if at runtime we are not able to allocate a huge page, the test will trivially pass on Aarch64 due to no exception being raised on division by zero while computing compaction_index. Fix that by checking for nr_hugepages == 0. Anyways, in general, avoid a division by zero by exiting the program beforehand. While at it, fix a typo, and handle the case where the number of hugepages may overflow an integer. Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24mailmap: update email address for Satya PriyaSatya Priya Kakitapalli1-1/+1
Update mailmap with my latest email ID, quic_c_skakit@quicinc.com is no longer active. Link: https://lkml.kernel.org/r/20240515-mailmap-update-v1-1-df4853f757a3@quicinc.com Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com> Cc: Ajit Pandey <quic_ajipan@quicinc.com> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Imran Shaik <quic_imrashai@quicinc.com> Cc: Jagadeesh Kona <quic_jkona@quicinc.com> Cc: Konrad Dybcio <konrad.dybcio@linaro.org> Cc: Taniya Das <quic_tdas@quicinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24mm/huge_memory: don't unpoison huge_zero_folioMiaohe Lin1-0/+7
When I did memory failure tests recently, below panic occurs: kernel BUG at include/linux/mm.h:1135! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14 RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00 FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0 Call Trace: <TASK> do_shrink_slab+0x14f/0x6a0 shrink_slab+0xca/0x8c0 shrink_node+0x2d0/0x7d0 balance_pgdat+0x33a/0x720 kswapd+0x1f3/0x410 kthread+0xd5/0x100 ret_from_fork+0x2f/0x50 ret_from_fork_asm+0x1a/0x30 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0 RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246 RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0 RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492 R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00 FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0 The root cause is that HWPoison flag will be set for huge_zero_folio without increasing the folio refcnt. But then unpoison_memory() will decrease the folio refcnt unexpectedly as it appears like a successfully hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when releasing huge_zero_folio. Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue. We're not prepared to unpoison huge_zero_folio yet. Link: https://lkml.kernel.org/r/20240516122608.22610-1-linmiaohe@huawei.com Fixes: 478d134e9506 ("mm/huge_memory: do not overkill when splitting huge_zero_page") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Xu Yu <xuyu@linux.alibaba.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>