linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2022-02-19	ata: pata_hpt3x2n: disable fast interrupts in prereset() method	Sergey Shtylyov	1	-13/+10
	The PIO/DMA mode setting function is hardly a good place for disabling the fast interrupts on a channel -- let's move that code to the driver's prereset() method instead. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_hpt37x: disable fast interrupts in prereset() method	Sergey Shtylyov	1	-26/+22
	The PIO/DMA mode setting functions are hardly a good place for disabling the fast interrupts on a channel -- let's move that code to the driver's prereset() method instead. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_hpt366: disable fast interrupts in prereset() method	Sergey Shtylyov	1	-6/+7
	The PIO/DMA mode setting function is hardly a good place for disabling the fast interrupts on a channel -- let's move that code to the driver's prereset() method instead. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_mpc52xx: use GFP_KERNEL	Julia Lawall	1	-1/+1
	Platform_driver probe functions aren't called with locks held and thus don't need GFP_ATOMIC. Use GFP_KERNEL instead. Problem found with Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: sata_rcar: drop unused #define's	Sergey Shtylyov	1	-4/+0
	This driver has never used the SH-Navi2G/ATAPI-ATA compatible taskfile registers (the driver uses the taskfile registers in another location anyway), so drop their #define's... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_hpt366: check channel enable bits	Sergey Shtylyov	1	-2/+40
	HighPoint HPT36x chips did turn out to have the channel enable bits -- however, badly implemented. Make use of them, even though that is probably only going to burden the driver's code -- assuming both channels are always enabled by the HighPoint BIOS anyway... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: sata_rcar: make sata_rcar_ata_devchk() return 'bool'	Sergey Shtylyov	1	-4/+3
	sata_rcar_ata_devchk() returns 1 if a device is present, 0 if not -- the 'bool' type clearly fits better here than 'unsigned int'... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_samsung_cf: make pata_s3c_devchk() return 'bool'	Sergey Shtylyov	1	-4/+3
	pata_s3c_devchk() returns 1 if a device is present, 0 if not -- the 'bool' type clearly fits better here than 'unsigned int'... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: libata-sff: make ata_devchk() return 'bool'	Sergey Shtylyov	1	-3/+6
	ata_devchk() returns 1 if a device is present, 0 if not -- the 'bool' type clearly fits better here than 'unsigned int'... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_hpt3x2n: drop unused 'struct hpt_chip'	Sergey Shtylyov	1	-5/+0
	The driver has never used 'struct hpt_chip' -- drop its declaration. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_hpt3x2n: drop unused HPT_PCI_FAST	Sergey Shtylyov	1	-1/+0
	The driver has never used HPT_PCI_FAST -- drop it. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_artop: use switch in atp8xx_fixup()	Sergey Shtylyov	1	-5/+9
	This driver uses a string of if statements in atp8xx_fixup() where a switch statement would fit better... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: pata_artop: use switch in artop_init_one()	Sergey Shtylyov	1	-8/+11
	This driver uses a string of if statements in artop_init_one() where a switch statement would fit better. While fixing this, refactor the 6280 code to e.g. avoid a compound statement inside the case section... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	pata_hpt3x2n: fix writing to wrong register in hpt3x2n_bmdma_stop()	Sergey Shtylyov	1	-2/+2
	The driver's bmdma_stop() method writes to the wrong PCI config register (0x52 instead of 0x54) when trying to clear the state machine on secondary channel -- "luckily", the write falls on a read-only part of the primary channel MISC. control 3 register, so no collateral damage is done... Alan Cox fixed the HPT37x driver in commit 6929da4427b4 ("[PATCH] hpt37x: Two important bug fixes") but forgot to check the HPT3x2N driver which has the same bug. :-/ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	pata_hpt3x2n: check channel enable bits	Sergey Shtylyov	1	-1/+8
	The driver's prereset() method still doesn't check the channel enable bits. The bug was there for the entire time the driver has existed. :-/ Alan Cox fixed the HPT37x driver in commit b5bf24b94c65 ("[PATCH] hpt37x: Check the enablebits") but forgot to check the HPT3x2N driver which has the same bug. :-/ Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-19	ata: libata: make ata_host_suspend() void	Sergey Shtylyov	16	-53/+39
	ata_host_suspend() always returns 0, so the result checks in many drivers look pointless. Let's make this function return void instead of int. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-03	ata: libata: ata_{sff|std}_prereset() always return 0	Sergey Shtylyov	2	-5/+4
	ata_std_prereset() always returns 0, hence the check in ata_sff_prereset() is pointless and thus it also can return only 0 (however, we cannot change the prototypes of ata_{sff|std}_prereset() as they implement the driver's prereset() method). Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-03	ata: ahci: Skip 200 ms debounce delay for Marvell 88SE9235	Paul Menzel	1	-0/+2
	The 200 ms delay before debouncing the PHY in `sata_link_resume()` is not needed for the Marvell 88SE9235. $ lspci -nn -s 0021:0e:00.0 0021:0e:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller [1b4b:9235] (rev 11) So, remove it using the board_ahci_no_debounce_delay board definition. Tested on IBM S822LC with current Linux 5.17-rc1: Currently, without this patch (with 200 ms delay), device probe for ata1 takes 485 ms: [ 3.358158] ata1: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000100 irq 39 [ 3.358175] ata2: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000180 irq 39 [ 3.358191] ata3: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000200 irq 39 [ 3.358207] ata4: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000280 irq 39 […] [ 3.677542] ata3: SATA link down (SStatus 0 SControl 300) [ 3.677719] ata4: SATA link down (SStatus 0 SControl 300) [ 3.839242] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 3.839828] ata2.00: ATA-10: ST1000NX0313 00LY266 00LY265IBM, BE33, max UDMA/133 [ 3.840029] ata2.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 32), AA [ 3.841796] ata2.00: configured for UDMA/133 [ 3.843231] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 3.844083] ata1.00: ATA-10: ST1000NX0313 00LY266 00LY265IBM, BE33, max UDMA/133 [ 3.844313] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 32), AA [ 3.846043] ata1.00: configured for UDMA/133 With this patch (no delay) device probe for ata1 takes 273 ms: [ 3.624259] ata1: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000100 irq 39 [ 3.624436] ata2: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000180 irq 39 [ 3.624452] ata3: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000200 irq 39 [ 3.624468] ata4: SATA max UDMA/133 abar m2048@0x3fe881000000 port 0x3fe881000280 irq 39 […] [ 3.731966] ata3: SATA link down (SStatus 0 SControl 300) [ 3.732069] ata4: SATA link down (SStatus 0 SControl 300) [ 3.897448] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 3.897678] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [ 3.898140] ata1.00: ATA-10: ST1000NX0313 00LY266 00LY265IBM, BE33, max UDMA/133 [ 3.898175] ata2.00: ATA-10: ST1000NX0313 00LY266 00LY265IBM, BE33, max UDMA/133 [ 3.898287] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 32), AA [ 3.898349] ata2.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 32), AA [ 3.900070] ata1.00: configured for UDMA/133 [ 3.900166] ata2.00: configured for UDMA/133 Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-02-02	ata: libata-acpi: kill ata_acpi_on_suspend()	Sergey Shtylyov	3	-29/+1
	Since the commit c05e6ff035c1b25d17364a685432 ("libata-acpi: implement and use ata_acpi_init_gtm()") ata_acpi_on_suspend() just returns 0, so its call from ata_eh_handle_port_suspend() doesn't make sense anymore. Remove the function completely, at last... Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-01-31	ata: libata-scsi: Simplify scsi_XX_lba_len()	Damien Le Moal	1	-34/+6
	In the scsi_10_lba_len() and scsi_16_lba_len() functions, use get_unaligned_bexx() to access a cdb's LBA and length fields instead of hardcoding the byte retrieval. With this simplification, the functions can also be declared inline. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
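A minimal userspace sketch of the idea described above, assuming the standard 10-byte CDB layout (LBA in bytes 2 to 5, transfer length in bytes 7 to 8). The `sketch_*` names are hypothetical stand-ins for the kernel's `get_unaligned_be16()`/`get_unaligned_be32()` helpers and for `scsi_10_lba_len()`; this is not the driver's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for get_unaligned_be16(). */
static uint16_t sketch_get_be16(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Hypothetical userspace stand-in for get_unaligned_be32(). */
static uint32_t sketch_get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Extract LBA and length from a 10-byte CDB with two big-endian reads
 * instead of open-coded shifts on individual bytes.
 */
static void sketch_10_lba_len(const uint8_t *cdb, uint64_t *plba,
			      uint32_t *plen)
{
	assert(cdb != NULL && plba != NULL && plen != NULL);

	*plba = sketch_get_be32(&cdb[2]);	/* bytes 2..5: LBA */
	*plen = sketch_get_be16(&cdb[7]);	/* bytes 7..8: length */
}
```

Reading the fields byte by byte keeps the helpers safe for unaligned buffers, which is the property the kernel helpers also provide.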
2022-01-31	ata: libata-scsi: Simplify ata_scsi_mode_select_xlat()	Damien Le Moal	1	-3/+3
	Use get_unaligned_be16() instead of using hardcoded accesses to 16-bits big endian cdb fields. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2022-01-31	ata: libata-scsi: Cleanup ata_get_xlat_func()	Damien Le Moal	1	-1/+0
	Remove the unnecessary "break" after the return statement in the MODE_SELECT/MODE_SELECT_10 case. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de>
2022-01-31	ata: pata_pdc202xx_old: make static read-only array pio_timing const	Colin Ian King	1	-1/+1
	The static array pio_timing is read-only so it makes sense to make it const. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-01-31	ata: pata_atiixp: make static read-only arrays const	Colin Ian King	1	-2/+2
	The static arrays pio_timings and mwdma_timings are read-only so it makes sense to make them const. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-01-31	ata: pata_platform: Make use of platform_get_mem_or_io()	Lad Prabhakar	1	-12/+6
	Make use of platform_get_mem_or_io() to simplify the code. While at it, drop use of unlikely() from pata_platform_probe() as it isn't a hotpath. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-01-30	Linux 5.17-rc2	Linus Torvalds	1	-1/+1

2022-01-30	ocfs2: fix a deadlock when commit trans	Joseph Qi	1	-14/+11
	Commit 6f1b228529ae introduces a regression which can deadlock as follows: Task1: jbd2_journal_commit_transaction -> spin_lock(&jh->b_state_lock) -> __jbd2_journal_remove_checkpoint -> jbd2_journal_put_journal_head -> jbd_lock_bh_journal_head. Task2: ocfs2_test_bg_bit_allocatable -> jbd_lock_bh_journal_head -> spin_lock(&jh->b_state_lock). Task1 and Task2 lock bh->b_state and jh->b_state_lock in different order, which finally results in a deadlock. So use jbd2_journal_[grab|put]_journal_head instead in ocfs2_test_bg_bit_allocatable() to fix it. Link: https://lkml.kernel.org/r/20220121071205.100648-3-joseph.qi@linux.alibaba.com Fixes: 6f1b228529ae ("ocfs2: fix race between searching chunks and release journal_head from buffer_head") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> Tested-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> Reported-by: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	jbd2: export jbd2_journal_[grab|put]_journal_head	Joseph Qi	1	-0/+2
	Patch series "ocfs2: fix a deadlock case". This fixes a deadlock case in ocfs2. We first export jbd2 symbols jbd2_journal_[grab|put]_journal_head as preparation and later use them in ocfs2 instead of jbd_[lock|unlock]_bh_journal_head to fix the deadlock. This patch (of 2): This exports symbols jbd2_journal_[grab|put]_journal_head, which will be used outside modules, e.g. ocfs2. Link: https://lkml.kernel.org/r/20220121071205.100648-2-joseph.qi@linux.alibaba.com Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> Cc: Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n	Suren Baghdasaryan	1	-38/+41
	When CONFIG_PROC_FS is disabled psi code generates the following warnings: kernel/sched/psi.c:1364:30: warning: 'psi_cpu_proc_ops' defined but not used [-Wunused-const-variable=] 1364 | static const struct proc_ops psi_cpu_proc_ops = { | ^~~~~~~~~~~~~~~~ kernel/sched/psi.c:1355:30: warning: 'psi_memory_proc_ops' defined but not used [-Wunused-const-variable=] 1355 | static const struct proc_ops psi_memory_proc_ops = { | ^~~~~~~~~~~~~~~~~~~ kernel/sched/psi.c:1346:30: warning: 'psi_io_proc_ops' defined but not used [-Wunused-const-variable=] 1346 | static const struct proc_ops psi_io_proc_ops = { | ^~~~~~~~~~~~~~~ Make definitions of these structures and related functions conditional on CONFIG_PROC_FS config. Link: https://lkml.kernel.org/r/20220119223940.787748-3-surenb@google.com Fixes: 0e94682b73bf ("psi: introduce psi monitor") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	psi: fix "no previous prototype" warnings when CONFIG_CGROUPS=n	Suren Baghdasaryan	1	-6/+5
	When CONFIG_CGROUPS is disabled psi code generates the following warnings: kernel/sched/psi.c:1112:21: warning: no previous prototype for 'psi_trigger_create' [-Wmissing-prototypes] 1112 | struct psi_trigger *psi_trigger_create(struct psi_group *group, | ^~~~~~~~~~~~~~~~~~ kernel/sched/psi.c:1182:6: warning: no previous prototype for 'psi_trigger_destroy' [-Wmissing-prototypes] 1182 | void psi_trigger_destroy(struct psi_trigger *t) | ^~~~~~~~~~~~~~~~~~~ kernel/sched/psi.c:1249:10: warning: no previous prototype for 'psi_trigger_poll' [-Wmissing-prototypes] 1249 | __poll_t psi_trigger_poll(void **trigger_ptr, | ^~~~~~~~~~~~~~~~ Change the declarations of these functions in the header to provide the prototypes even when they are unused. Link: https://lkml.kernel.org/r/20220119223940.787748-2-surenb@google.com Fixes: 0e94682b73bf ("psi: introduce psi monitor") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	mm, kasan: use compare-exchange operation to set KASAN page tag	Peter Collingbourne	1	-5/+12
	It has been reported that the tag setting operation on newly-allocated pages can cause the page flags to be corrupted when performed concurrently with other flag updates as a result of the use of non-atomic operations. Fix the problem by using a compare-exchange loop to update the tag. Link: https://lkml.kernel.org/r/20220120020148.1632253-1-pcc@google.com Link: https://linux-review.googlesource.com/id/I456b24a2b9067d93968d43b4bb3351c0cec63101 Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	kasan: test: fix compatibility with FORTIFY_SOURCE	Marco Elver	1	-0/+5
	With CONFIG_FORTIFY_SOURCE enabled, string functions will also perform dynamic checks using __builtin_object_size(ptr), which panic the kernel when they fail. Because the KASAN test deliberately performs out-of-bounds operations, the kernel panics with FORTIFY_SOURCE, for example: | kernel BUG at lib/string_helpers.c:910! | invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI | CPU: 1 PID: 137 Comm: kunit_try_catch Tainted: G B 5.16.0-rc3+ #3 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 | RIP: 0010:fortify_panic+0x19/0x1b | ... | Call Trace: | kmalloc_oob_in_memset.cold+0x16/0x16 | ... Fix it by also hiding `ptr` from the optimizer, which will ensure that __builtin_object_size() does not return a valid size, preventing fortified string functions from panicking. Link: https://lkml.kernel.org/r/20220124160744.1244685-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Reported-by: Nico Pache <npache@redhat.com> Reviewed-by: Nico Pache <npache@redhat.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
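A rough userspace sketch of what "hiding `ptr` from the optimizer" can look like, assuming GCC or Clang. The `SKETCH_HIDE_VAR()` macro is a hypothetical stand-in modeled on an empty inline-asm barrier; the commit message does not say which helper the test actually uses, so treat the names here as assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Empty inline asm that claims to read and rewrite 'var'. The compiler
 * can no longer track where the pointer came from, so it cannot deduce
 * the size of the object it points to. GCC/Clang specific.
 */
#define SKETCH_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Returns what __builtin_object_size() reports for a 16-byte allocation
 * after the pointer has been hidden: "unknown", i.e. (size_t)-1.
 */
static size_t sketch_object_size_after_hiding(void)
{
	char *ptr = malloc(16);
	size_t size;

	assert(ptr != NULL);
	SKETCH_HIDE_VAR(ptr);
	size = __builtin_object_size(ptr, 0);
	free(ptr);
	return size;
}
```

With the size unknown, a fortified string function has no bound to check against at that call site, which is why a deliberately out-of-bounds test access reaches the sanitizer instead of tripping the fortify check first.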
2022-01-30	tools/testing/scatterlist: add missing defines	Maor Gottlieb	1	-1/+2
	The cited commits replaced preemptible with pagefault_disabled and flush_kernel_dcache_page with flush_dcache_page respectively, so the corresponding defines in the test need to be updated. scatterlist.c: In function ‘sg_miter_stop’: scatterlist.c:919:4: warning: implicit declaration of function ‘flush_dcache_page’ [-Wimplicit-function-declaration] flush_dcache_page(miter->page); ^~~~~~~~~~~~~~~~~ In file included from linux/scatterlist.h:8:0, from scatterlist.c:9: scatterlist.c:922:18: warning: implicit declaration of function ‘pagefault_disabled’ [-Wimplicit-function-declaration] WARN_ON_ONCE(!pagefault_disabled()); ^ linux/mm.h:23:25: note: in definition of macro ‘WARN_ON_ONCE’ int __ret_warn_on = !!(condition); \ ^~~~~~~~~ Link: https://lkml.kernel.org/r/20220118082105.1737320-1-maorg@nvidia.com Fixes: 723aca208516 ("mm/scatterlist: replace the !preemptible warning in sg_miter_stop()") Fixes: 0e84f5dbf8d6 ("scatterlist: replace flush_kernel_dcache_page with flush_dcache_page") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	mm: page->mapping folio->mapping should have the same offset	Wei Yang	1	-0/+1
	As with the other members of folio, the offset of page->mapping and folio->mapping must be the same. The compile-time check was inadvertently removed during development. Add it back. [willy@infradead.org: changelog redo] Link: https://lkml.kernel.org/r/20220104011734.21714-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
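A compile-time offset check of the kind described above can be sketched with C11 `static_assert` and `offsetof`. The two structures and the `SKETCH_MATCH()` macro below are simplified, hypothetical stand-ins; the real `struct page` and `struct folio` have many more members and the kernel uses its own assertion macro.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: two views of the same memory. */
struct sketch_page {
	unsigned long flags;
	void *lru_next;
	void *lru_prev;
	void *mapping;
	unsigned long index;
};

struct sketch_folio {
	unsigned long flags;
	void *lru_next;
	void *lru_prev;
	void *mapping;
	unsigned long index;
};

/* Fail the build if a member sits at different offsets in the two views. */
#define SKETCH_MATCH(pg, fl)						\
	static_assert(offsetof(struct sketch_page, pg) ==		\
		      offsetof(struct sketch_folio, fl),		\
		      "offset mismatch for " #pg)

SKETCH_MATCH(flags, flags);
SKETCH_MATCH(mapping, mapping);
SKETCH_MATCH(index, index);
```

Because the check runs at compile time, reordering a member in only one of the two structures breaks the build rather than silently corrupting data at runtime.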
2022-01-30	memory-failure: fetch compound_head after pgmap_pfn_valid()	Joao Martins	1	-0/+6
	memory_failure_dev_pagemap() at the moment assumes base pages (e.g. dax_lock_page()). For devmap with compound pages fetch the compound_head in case a tail page memory failure is being handled. Currently this is a nop, but with the advent of compound pages in dev_pagemap it allows memory_failure_dev_pagemap() to keep working. Without this fix memory-failure handling (i.e. MCEs on pmem) with device-dax configured namespaces will regress (and crash). Link: https://lkml.kernel.org/r/20211202204422.26777-2-joao.m.martins@oracle.com Reported-by: Jane Chu <jane.chu@oracle.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	ia64: make IA64_MCA_RECOVERY bool instead of tristate	Randy Dunlap	1	-1/+1
	In linux-next, IA64_MCA_RECOVERY uses the (new) function make_task_dead(), which is not exported for use by modules. Instead of exporting it for one user, convert IA64_MCA_RECOVERY to be a bool Kconfig symbol. In a config file from "kernel test robot <lkp@intel.com>" for a different problem, this linker error was exposed when CONFIG_IA64_MCA_RECOVERY=m. Fixes this build error: ERROR: modpost: "make_task_dead" [arch/ia64/kernel/mca_recovery.ko] undefined! Link: https://lkml.kernel.org/r/20220124213129.29306-1-rdunlap@infradead.org Fixes: 0e25498f8cd4 ("exit: Add and use make_task_dead.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	binfmt_misc: fix crash when load/unload module	Tong Zhang	1	-4/+4
	We should unregister the table upon module unload otherwise something horrible will happen when we load binfmt_misc module again. Also note that we should keep value returned by register_sysctl_mount_point() and release it later, otherwise it will leak. Also, per Christian's comment, to fully restore the old behavior that won't break userspace the check(binfmt_misc_header) should be eliminated. To reproduce: modprobe binfmt_misc modprobe -r binfmt_misc modprobe binfmt_misc modprobe -r binfmt_misc modprobe binfmt_misc resulting in modprobe: can't load module binfmt_misc (kernel/fs/binfmt_misc.ko): Cannot allocate memory and an unhappy kernel: binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point BUG: unable to handle page fault for address: fffffbfff8004802 Call Trace: init_misc_binfmt+0x2d/0x1000 [binfmt_misc] Link: https://lkml.kernel.org/r/20220124181812.1869535-2-ztong0001@gmail.com Fixes: 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file") Signed-off-by: Tong Zhang <ztong0001@gmail.com> Co-developed-by: Christian Brauner <brauner@kernel.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-30	include/linux/sysctl.h: fix register_sysctl_mount_point() return type	Andrew Morton	1	-1/+1
	The CONFIG_SYSCTL=n stub returns the wrong type. Fixes: ee9efac48a082 ("sysctl: add helper to register a sysctl mount point") Reported-by: kernel test robot <lkp@intel.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-28	dm: properly fix redundant bio-based IO accounting	Mike Snitzer	1	-2/+3
	Record the start_time for a bio but defer starting block core's IO accounting until after IO is submitted using bio_start_io_acct_time(). This approach avoids the need to mess around with any of the individual IO stats in response to a bio_split() that follows bio submission. Reported-by: Bud Brown <bubrown@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org Depends-on: e45c47d1f94e ("block: add bio_start_io_acct_time() to control start_time") Signed-off-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220128155841.39644-4-snitzer@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-28	dm: revert partial fix for redundant bio-based IO accounting	Mike Snitzer	1	-15/+0
	Reverts a1e1cb72d9649 ("dm: fix redundant IO accounting for bios that need splitting") because it was too narrow in scope (only addressed redundant 'sectors[]' accounting and not ios, nsecs[], etc). Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220128155841.39644-3-snitzer@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-28	block: add bio_start_io_acct_time() to control start_time	Mike Snitzer	2	-6/+20
	The bio_start_io_acct_time() interface is like bio_start_io_acct() but allows start_time to be passed in. This gives drivers the ability to defer starting accounting until after IO is issued (but possibly not entirely due to bio splitting). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20220128155841.39644-2-snitzer@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-01-28	security, lsm: dentry_init_security() Handle multi LSM registration	Vivek Goyal	2	-3/+14
	A ceph user has reported that ceph is crashing with kernel NULL pointer dereference. Following is the backtrace. /proc/version: Linux version 5.16.2-arch1-1 (linux@archlinux) (gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Thu, 20 Jan 2022 16:18:29 +0000 distro / arch: Arch Linux / x86_64 SELinux is not enabled ceph cluster version: 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) relevant dmesg output: [ 30.947129] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 30.947206] #PF: supervisor read access in kernel mode [ 30.947258] #PF: error_code(0x0000) - not-present page [ 30.947310] PGD 0 P4D 0 [ 30.947342] Oops: 0000 [#1] PREEMPT SMP PTI [ 30.947388] CPU: 5 PID: 778 Comm: touch Not tainted 5.16.2-arch1-1 #1 86fbf2c313cc37a553d65deb81d98e9dcc2a3659 [ 30.947486] Hardware name: Gigabyte Technology Co., Ltd. B365M DS3H/B365M DS3H, BIOS F5 08/13/2019 [ 30.947569] RIP: 0010:strlen+0x0/0x20 [ 30.947616] Code: b6 07 38 d0 74 16 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 31 d2 89 d6 89 d7 c3 48 89 f8 31 d2 89 d6 89 d7 c3 0f 1f 40 00 <80> 3f 00 74 12 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31 ff [ 30.947782] RSP: 0018:ffffa4ed80ffbbb8 EFLAGS: 00010246 [ 30.947836] RAX: 0000000000000000 RBX: ffffa4ed80ffbc60 RCX: 0000000000000000 [ 30.947904] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 30.947971] RBP: ffff94b0d15c0ae0 R08: 0000000000000000 R09: 0000000000000000 [ 30.948040] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 30.948106] R13: 0000000000000001 R14: ffffa4ed80ffbc60 R15: 0000000000000000 [ 30.948174] FS: 00007fc7520f0740(0000) GS:ffff94b7ced40000(0000) knlGS:0000000000000000 [ 30.948252] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 30.948308] CR2: 0000000000000000 CR3: 0000000104a40001 CR4: 00000000003706e0 [ 30.948376] Call Trace: [ 30.948404] <TASK> [ 30.948431] ceph_security_init_secctx+0x7b/0x240 [ceph 49f9c4b9bf5be8760f19f1747e26da33920bce4b] [ 30.948582] ceph_atomic_open+0x51e/0x8a0 [ceph 49f9c4b9bf5be8760f19f1747e26da33920bce4b] [ 30.948708] ? get_cached_acl+0x4d/0xa0 [ 30.948759] path_openat+0x60d/0x1030 [ 30.948809] do_filp_open+0xa5/0x150 [ 30.948859] do_sys_openat2+0xc4/0x190 [ 30.948904] __x64_sys_openat+0x53/0xa0 [ 30.948948] do_syscall_64+0x5c/0x90 [ 30.948989] ? exc_page_fault+0x72/0x180 [ 30.949034] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 30.949091] RIP: 0033:0x7fc7521e25bb [ 30.950849] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 54 24 28 64 48 2b 14 25 Core of the problem is that ceph checks for return code from security_dentry_init_security() and if return code is 0, it assumes everything is fine and continues to call strlen(name), which crashes. Typically SELinux LSM returns 0 and sets name to "security.selinux" and it is not a problem. Or if selinux is not compiled in or disabled, it returns -EOPNOTSUPP and ceph deals with it. But somehow in this configuration, 0 is being returned and "name" is not being initialized and that's creating the problem. Our suspicion is that BPF LSM is registering a hook for dentry_init_security() and returns hook default of 0. LSM_HOOK(int, 0, dentry_init_security, struct dentry *dentry,...) I have not been able to reproduce it just by doing CONFIG_BPF_LSM=y. Stephen has tested the patch though and confirms it solves the problem for him. dentry_init_security() is written in such a way that it expects only one LSM to register the hook. At least that's the expectation with current code. If another LSM registers a hook and returns the default, it will simply return 0 as of now and that will break ceph. Hence, the suggestion is to change the semantics of this hook a bit. If there are no LSMs or no LSM is taking ownership and initializing security context, then return -EOPNOTSUPP. Also allow at most one LSM to initialize security context. This hook can't deal with multiple LSMs trying to init security context. This patch implements this new behavior. Reported-by: Stephen Muth <smuth4@gmail.com> Tested-by: Stephen Muth <smuth4@gmail.com> Suggested-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Paul Moore <paul@paul-moore.com> Cc: <stable@vger.kernel.org> # 5.16.0 Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
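The hook semantics described above (no owner means "not supported", and the first LSM that takes ownership decides the result) might be sketched in userspace as follows. All names here (`sketch_hook_fn`, `sketch_noop_hook`, `sketch_selinux_like_hook`, `sketch_dentry_init_security`) are hypothetical, and the real hook takes more arguments than a single name pointer; this is an illustration of the dispatch rule, not the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified hook signature: fill in *name on success. */
typedef int (*sketch_hook_fn)(const char **name);

/* An LSM that registers the hook but does not own the security context. */
static int sketch_noop_hook(const char **name)
{
	(void)name;
	return -EOPNOTSUPP;
}

/* An LSM that takes ownership and initializes the name. */
static int sketch_selinux_like_hook(const char **name)
{
	*name = "security.selinux";
	return 0;
}

/*
 * Walk the registered hooks. The first hook that returns anything other
 * than "not supported" decides the result; if none does, the caller gets
 * -EOPNOTSUPP and never sees a success code with *name left untouched.
 */
static int sketch_dentry_init_security(const sketch_hook_fn *hooks,
				       size_t nr_hooks, const char **name)
{
	assert(name != NULL);

	for (size_t i = 0; i < nr_hooks; i++) {
		int rc = hooks[i](name);

		if (rc != -EOPNOTSUPP)
			return rc;
	}
	return -EOPNOTSUPP;
}
```

Under this rule, a caller that only proceeds to use `name` when the return value is 0 can rely on `name` having been set by whichever hook returned 0.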
2022-01-28	dt-bindings: interrupt-controller: sifive,plic: Group interrupt tuples	Geert Uytterhoeven	1	-6/+5
	To improve human readability and enable automatic validation, the tuples in "interrupts-extended" properties should be grouped using angle brackets. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/211705e74a2ce77de43d036c5dea032484119bf7.1643360419.git.geert@linux-m68k.org
2022-01-28	dt-bindings: interrupt-controller: sifive,plic: Fix number of interrupts	Geert Uytterhoeven	1	-0/+1
	The number of interrupts lacks an upper bound, thus assuming one, causing properly grouped "interrupts-extended" properties to be flagged as an error by "make dtbs_check". Fix this by adding the missing "maxItems", using the architectural maximum of 15872 interrupts. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/f73a0aead89e1426b146c4c64f797aa035868bf0.1643360419.git.geert@linux-m68k.org
2022-01-28	dt-bindings: irqchip: renesas-irqc: Add R-Car V3U support	Geert Uytterhoeven	1	-0/+1
	Document support for the Interrupt Controller for External Devices (INT-EC) in the Renesas R-Car V3U (r8a779a0) SoC. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/85b246cc0792663c72c1bb12a8576bd23d2299d3.1643200256.git.geert+renesas@glider.be
2022-01-28	arm64: cpufeature: List early Cortex-A510 parts as having broken dbm	James Morse	3	-0/+15
	Versions of Cortex-A510 before r0p3 are affected by a hardware erratum where the hardware update of the dirty bit is not correctly ordered. Add these cpus to the cpu_has_broken_dbm list. Signed-off-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20220125154040.549272-3-james.morse@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
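As a rough illustration of what "before r0p3" means, the check can be sketched in user-space C against a raw MIDR_EL1 value. The helper names below are hypothetical and this is not the kernel's matching code. The sketch assumes the architectural MIDR_EL1 layout (implementer in bits 31:24, variant in 23:20, part number in 15:4, revision in 3:0) and assumes 0x41 and 0xD46 as the Arm implementer code and the Cortex-A510 part number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field extraction for a MIDR_EL1 value (helper names are illustrative). */
static inline uint32_t midr_implementer(uint32_t midr) { return (midr >> 24) & 0xffu; }
static inline uint32_t midr_variant(uint32_t midr)     { return (midr >> 20) & 0xfu; }
static inline uint32_t midr_partnum(uint32_t midr)     { return (midr >> 4) & 0xfffu; }
static inline uint32_t midr_revision(uint32_t midr)    { return midr & 0xfu; }

/* Assumed identifiers: Arm Ltd. implementer code and Cortex-A510 part number. */
#define DEMO_IMPL_ARM          0x41u
#define DEMO_PART_CORTEX_A510  0xd46u

/*
 * True for Cortex-A510 r0p0 through r0p2, i.e. every version before r0p3.
 * In rNpM notation, N is the variant field and M is the revision field.
 */
static bool demo_a510_has_broken_dbm(uint32_t midr)
{
	return midr_implementer(midr) == DEMO_IMPL_ARM &&
	       midr_partnum(midr) == DEMO_PART_CORTEX_A510 &&
	       midr_variant(midr) == 0 &&
	       midr_revision(midr) <= 2;
}
```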
2022-01-28	ocfs2: fix subdirectory registration with register_sysctl()	Linus Torvalds	1	-12/+1
	The kernel test robot reports that commit c42ff46f97c1 ("ocfs2: simplify subdirectory registration with register_sysctl()") is broken, and results in kernel warning messages like sysctl table check failed: fs/ocfs2/nm Not a file sysctl table check failed: fs/ocfs2/nm No proc_handler sysctl table check failed: fs/ocfs2/nm bogus .mode 0555 and in fact this was already reported back in linux-next, but nobody seems to have reacted to that report. Possibly that original report only ever made it to the lkp list. The problem seems to be that the simplification didn't actually go far enough, and should have converted the whole directory path to the final sysctl file, rather than just the two first components. So take that last step. Fixes: c42ff46f97c1 ("ocfs2: simplify subdirectory registration with register_sysctl()") Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/all/20220128065310.GF8421@xsang-OptiPlex-9020/ Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/KQ2F6TPJWMDVEXJM4WTUC4DU3EH3YJVT/ Tested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-28	KVM: eventfd: Fix false positive RCU usage warning	Hou Wenlong	1	-4/+4
	Fix the following false positive warning: ============================= WARNING: suspicious RCU usage 5.16.0-rc4+ #57 Not tainted ----------------------------- arch/x86/kvm/../../../virt/kvm/eventfd.c:484 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by fc_vcpu 0/330: #0: ffff8884835fc0b0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x88/0x6f0 [kvm] #1: ffffc90004c0bb68 (&kvm->srcu){....}-{0:0}, at: vcpu_enter_guest+0x600/0x1860 [kvm] #2: ffffc90004c0c1d0 (&kvm->irq_srcu){....}-{0:0}, at: kvm_notify_acked_irq+0x36/0x180 [kvm] stack backtrace: CPU: 26 PID: 330 Comm: fc_vcpu 0 Not tainted 5.16.0-rc4+ Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x44/0x57 kvm_notify_acked_gsi+0x6b/0x70 [kvm] kvm_notify_acked_irq+0x8d/0x180 [kvm] kvm_ioapic_update_eoi+0x92/0x240 [kvm] kvm_apic_set_eoi_accelerated+0x2a/0xe0 [kvm] handle_apic_eoi_induced+0x3d/0x60 [kvm_intel] vmx_handle_exit+0x19c/0x6a0 [kvm_intel] vcpu_enter_guest+0x66e/0x1860 [kvm] kvm_arch_vcpu_ioctl_run+0x438/0x7f0 [kvm] kvm_vcpu_ioctl+0x38a/0x6f0 [kvm] __x64_sys_ioctl+0x89/0xc0 do_syscall_64+0x3a/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Since kvm_unregister_irq_ack_notifier() does synchronize_srcu(&kvm->irq_srcu), kvm->irq_ack_notifier_list is protected by kvm->irq_srcu. In fact, kvm->irq_srcu SRCU read lock is held in kvm_notify_acked_irq(), making it a false positive warning. So use hlist_for_each_entry_srcu() instead of hlist_for_each_entry_rcu(). Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com> Message-Id: <f98bac4f5052bad2c26df9ad50f7019e40434512.1643265976.git.houwenlong.hwl@antgroup.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-28	KVM: nVMX: Allow VMREAD when Enlightened VMCS is in use	Vitaly Kuznetsov	2	-16/+51
	Hyper-V TLFS explicitly forbids VMREAD and VMWRITE instructions when Enlightened VMCS interface is in use: "Any VMREAD or VMWRITE instructions while an enlightened VMCS is active is unsupported and can result in unexpected behavior." Windows 11 + WSL2 seems to ignore this; attempts to VMREAD VMCS field 0x4404 ("VM-exit interruption information") are observed. Failing these attempts with nested_vmx_failInvalid() makes such guests unbootable. Microsoft confirms this is a Hyper-V bug and claims that it'll get fixed eventually, but for the time being we need a workaround. (Temporarily) allow VMREAD to get data from the currently loaded Enlightened VMCS. Note: VMWRITE instructions remain forbidden; it is not clear how to handle them properly, and hopefully they won't ever be needed. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220112170134.1904308-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-28	KVM: nVMX: Implement evmcs_field_offset() suitable for handle_vmread()	Vitaly Kuznetsov	2	-10/+25
	In preparation for allowing reads from Enlightened VMCS from handle_vmread(), implement evmcs_field_offset() to get the correct read offset. get_evmcs_offset(), which is being used by KVM-on-Hyper-V, is almost what's needed but a few things need to be adjusted. First, WARN_ON() is unacceptable for handle_vmread() as any field can (in theory) be supplied by the guest and not all fields are defined in eVMCS v1. Second, we need to handle 'holes' in eVMCS (missing fields). It also sounds like a good idea to WARN_ON() if such fields are ever accessed by KVM-on-Hyper-V. Implement a dedicated evmcs_field_offset() helper. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220112170134.1904308-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
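The split described above, a lookup that quietly reports missing fields so that guest-supplied values cannot trigger a warning, can be sketched in user-space C. Everything here is hypothetical: the table, its offsets, and the demo_ names are invented for illustration and do not reflect the real eVMCS layout or the kernel helper's signature.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mapping from a VMCS field encoding to a byte offset. */
struct demo_field_map {
	uint16_t field;
	int offset;
};

/* Made-up table; any field not listed here is a "hole". */
static const struct demo_field_map demo_map[] = {
	{ 0x4402, 16 },
	{ 0x4404, 24 },
};

/*
 * Return the offset for a field, or -ENOENT for a hole. No warning is
 * raised here because the field value may come straight from the guest.
 */
static int demo_field_offset(uint16_t field)
{
	for (size_t i = 0; i < sizeof(demo_map) / sizeof(demo_map[0]); i++) {
		if (demo_map[i].field == field)
			return demo_map[i].offset;
	}
	return -ENOENT;
}

/*
 * Guest-facing read path: propagate the error to the caller instead of
 * warning, and copy the 32-bit value out only when the field exists.
 */
static int demo_read_field(const uint8_t *blob, uint16_t field, uint32_t *out)
{
	int offset = demo_field_offset(field);

	if (offset < 0)
		return offset;
	memcpy(out, blob + offset, sizeof(*out));
	return 0;
}
```

A host-side caller that only ever passes fields it expects to exist could wrap demo_field_offset() and warn on a negative result, which matches the message's suggestion that the warning belongs to the KVM-on-Hyper-V path rather than to guest-driven reads.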