linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2021-07-26	io_uring: fix race in unified task_work running	Jens Axboe	1	-1/+5
	We use a bit to manage if we need to add the shared task_work, but a list + lock for the pending work. Before aborting a current run of the task_work we check if the list is empty, but we do so without grabbing the lock that protects it. This can lead to races where we think we have nothing left to run, where in practice we could be racing with a task adding new work to the list. If we do hit that race condition, we could be left with work items that need processing, but the shared task_work is not active. Ensure that we grab the lock before checking if the list is empty, so we know if it's safe to exit the run or not. Link: https://lore.kernel.org/io-uring/c6bd5987-e9ae-cd02-49d0-1b3ac1ef65b1@tnonline.net/ Cc: stable@vger.kernel.org # 5.11+ Reported-by: Forza <forza@tnonline.net> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-26	io_uring: fix io_prep_async_link locking	Pavel Begunkov	1	-2/+11
	io_prep_async_link() may be called after arming a linked timeout, automatically making it unsafe to traverse the linked list. Guard with completion_lock if there was a linked timeout. Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/93f7c617e2b4f012a2a175b3dab6bc2f27cebc48.1627304436.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-23	io_uring: explicitly catch any illegal async queue attempt	Jens Axboe	2	-1/+17
	Catch an illegal case to queue async from an unrelated task that got the ring fd passed to it. This should not be possible to hit, but better be proactive and catch it explicitly. io-wq is extended to check for early IO_WQ_WORK_CANCEL being set on a work item as well, so it can run the request through the normal cancelation path. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-23	io_uring: never attempt iopoll reissue from release path	Jens Axboe	1	-7/+7
	There are two reasons why this shouldn't be done: 1) Ring is exiting, and we're canceling requests anyway. Any request should be canceled anyway. In theory, this could iterate for a number of times if someone else is also driving the target block queue into request starvation, however the likelihood of this happening is miniscule. 2) If the original task decided to pass the ring to another task, then we don't want to be reissuing from this context as it may be an unrelated task or context. No assumptions should be made about the context in which ->release() is run. This can only happen for pure read/write, and we'll get -EFAULT on them anyway. Link: https://lore.kernel.org/io-uring/YPr4OaHv0iv0KTOc@zeniv-ca.linux.org.uk/ Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-22	io_uring: fix early fdput() of file	Jens Axboe	1	-2/+4
	A previous commit shuffled some code around, and inadvertently used struct file after fdput() had been called on it. As we can't touch the file post fdput() dropping our reference, move the fdput() to after that has been done. Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/io-uring/YPnqM0fY3nM5RdRI@zeniv-ca.linux.org.uk/ Fixes: f2a48dd09b8e ("io_uring: refactor io_sq_offload_create()") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-20	io_uring: fix memleak in io_init_wq_offload()	Yang Yingliang	1	-1/+5
	I got memory leak report when doing fuzz test: BUG: memory leak unreferenced object 0xffff888107310a80 (size 96): comm "syz-executor.6", pid 4610, jiffies 4295140240 (age 20.135s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... backtrace: [<000000001974933b>] kmalloc include/linux/slab.h:591 [inline] [<000000001974933b>] kzalloc include/linux/slab.h:721 [inline] [<000000001974933b>] io_init_wq_offload fs/io_uring.c:7920 [inline] [<000000001974933b>] io_uring_alloc_task_context+0x466/0x640 fs/io_uring.c:7955 [<0000000039d0800d>] __io_uring_add_tctx_node+0x256/0x360 fs/io_uring.c:9016 [<000000008482e78c>] io_uring_add_tctx_node fs/io_uring.c:9052 [inline] [<000000008482e78c>] __do_sys_io_uring_enter fs/io_uring.c:9354 [inline] [<000000008482e78c>] __se_sys_io_uring_enter fs/io_uring.c:9301 [inline] [<000000008482e78c>] __x64_sys_io_uring_enter+0xabc/0xc20 fs/io_uring.c:9301 [<00000000b875f18f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000b875f18f>] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 [<000000006b0a8484>] entry_SYSCALL_64_after_hwframe+0x44/0xae CPU0 CPU1 io_uring_enter io_uring_enter io_uring_add_tctx_node io_uring_add_tctx_node __io_uring_add_tctx_node __io_uring_add_tctx_node io_uring_alloc_task_context io_uring_alloc_task_context io_init_wq_offload io_init_wq_offload hash = kzalloc hash = kzalloc ctx->hash_map = hash ctx->hash_map = hash <- one of the hash is leaked When calling io_uring_enter() in parallel, the 'hash_map' will be leaked, add uring_lock to protect 'hash_map'. Fixes: e941894eae31 ("io-wq: make buffered file write hashed work map per-ctx") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20210720083805.3030730-1-yangyingliang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-20	io_uring: remove double poll entry on arm failure	Pavel Begunkov	1	-0/+2
	__io_queue_proc() can enqueue both poll entries and still fail afterwards, so the callers trying to cancel it should also try to remove the second poll entry (if any). For example, it may leave the request alive referencing a io_uring context but not accessible for cancellation: [ 282.599913][ T1620] task:iou-sqp-23145 state:D stack:28720 pid:23155 ppid: 8844 flags:0x00004004 [ 282.609927][ T1620] Call Trace: [ 282.613711][ T1620] __schedule+0x93a/0x26f0 [ 282.634647][ T1620] schedule+0xd3/0x270 [ 282.638874][ T1620] io_uring_cancel_generic+0x54d/0x890 [ 282.660346][ T1620] io_sq_thread+0xaac/0x1250 [ 282.696394][ T1620] ret_from_fork+0x1f/0x30 Cc: stable@vger.kernel.org Fixes: 18bceab101add ("io_uring: allow POLL_ADD with double poll_wait() users") Reported-and-tested-by: syzbot+ac957324022b7132accf@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0ec1228fc5eda4cb524eeda857da8efdc43c331c.1626774457.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-20	io_uring: explicitly count entries for poll reqs	Pavel Begunkov	1	-6/+10
	If __io_queue_proc() fails to add a second poll entry, e.g. kmalloc() failed, but it goes on with a third waitqueue, it may succeed and overwrite the error status. Count the number of poll entries we added, so we can set pt->error to zero at the beginning and find out when the mentioned scenario happens. Cc: stable@vger.kernel.org Fixes: 18bceab101add ("io_uring: allow POLL_ADD with double poll_wait() users") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9d6b9e561f88bcc0163623b74a76c39f712151c3.1626774457.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-11	io_uring: fix io_drain_req()	Pavel Begunkov	1	-2/+4
	io_drain_req() return whether the request has been consumed or not, not an error code. Fix a stupid mistake slipped from optimisation patches. Reported-by: syzbot+ba6fcd859210f4e9e109@syzkaller.appspotmail.com Fixes: 76cc33d79175a ("io_uring: refactor io_req_defer()") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4d3c53c4274ffff307c8ae062fc7fda63b978df2.1626039606.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-11	io_uring: use right task for exiting checks	Pavel Begunkov	1	-1/+1
	When we use delayed_work for fallback execution of requests, current will be not of the submitter task, and so checks in io_req_task_submit() may not behave as expected. Currently, it leaves inline completions not flushed, so making io_ring_exit_work() to hang. Use the submitter task for all those checks. Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/cb413c715bed0bc9c98b169059ea9c8a2c770715.1625881431.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-11	Linux 5.14-rc1	Linus Torvalds	1	-2/+2

2021-07-11	mm/rmap: try_to_migrate() skip zone_device !device_private	Hugh Dickins	1	-3/+3
	I know nothing about zone_device pages and !device_private pages; but if try_to_migrate_one() will do nothing for them, then it's better that try_to_migrate() filter them first, than trawl through all their vmas. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Link: https://lore.kernel.org/lkml/1241d356-8ec9-f47b-a5ec-9b2bf66d242@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	mm/rmap: fix new bug: premature return from page_mlock_one()	Hugh Dickins	1	-6/+5
	In the unlikely race case that page_mlock_one() finds VM_LOCKED has been cleared by the time it got page table lock, page_vma_mapped_walk_done() must be called before returning, either explicitly, or by a final call to page_vma_mapped_walk() - otherwise the page table remains locked. Fixes: cd62734ca60d ("mm/rmap: split try_to_munlock from try_to_unmap") Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/lkml/20210711151446.GB4070@xsang-OptiPlex-9020/ Link: https://lore.kernel.org/lkml/f71f8523-cba7-3342-40a7-114abc5d1f51@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	mm/rmap: fix old bug: munlocking THP missed other mlocks	Hugh Dickins	1	-5/+8
	The kernel recovers in due course from missing Mlocked pages: but there was no point in calling page_mlock() (formerly known as try_to_munlock()) on a THP, because nothing got done even when it was found to be mapped in another VM_LOCKED vma. It's true that we need to be careful: Mlocked accounting of pte-mapped THPs is too difficult (so consistently avoided); but Mlocked accounting of only-pmd-mapped THPs is supposed to work, even when multiple mappings are mlocked and munlocked or munmapped. Refine the tests. There is already a VM_BUG_ON_PAGE(PageDoubleMap) in page_mlock(), so page_mlock_one() does not even have to worry about that complication. (I said the kernel recovers: but would page reclaim be likely to split THP before rediscovering that it's VM_LOCKED? I've not followed that up) Fixes: 9a73f61bdb8a ("thp, mlock: do not mlock PTE-mapped file huge pages") Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/lkml/cfa154c-d595-406-eb7d-eb9df730f944@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	mm/rmap: fix comments left over from recent changes	Hugh Dickins	2	-7/+2
	Parallel developments in mm/rmap.c have left behind some out-of-date comments: try_to_migrate_one() also accepts TTU_SYNC (already commented in try_to_migrate() itself), and try_to_migrate() returns nothing at all. TTU_SPLIT_FREEZE has just been deleted, so reword the comment about it in mm/huge_memory.c; and TTU_IGNORE_ACCESS was removed in 5.11, so delete the "recently referenced" comment from try_to_unmap_one() (once upon a time the comment was near the removed codeblock, but they drifted apart). Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Link: https://lore.kernel.org/lkml/563ce5b2-7a44-5b4d-1dfd-59a0e65932a9@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-10	mm/page_alloc: Revert pahole zero-sized workaround	Mel Gorman	2	-14/+0
	Commit dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock") folded in a workaround patch for pahole that was unable to deal with zero-sized percpu structures. A superior workaround is achieved with commit a0b8200d06ad ("kbuild: skip per-CPU BTF generation for pahole v1.18-v1.21"). This patch reverts the dummy field and the pahole version check. Fixes: dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-10	rtc: pcf8523: rename register and bit defines	Alexandre Belloni	1	-73/+73
	arch/arm/mach-ixp4xx/include/mach/platform.h now gets included indirectly and defines REG_OFFSET. Rename the register and bit definition to something specific to the driver. Fixes: 7fd70c65faac ("ARM: irqstat: Get rid of duplicated declaration") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210710211431.1393589-1-alexandre.belloni@bootlin.com
2021-07-10	rtc: pcf2127: handle timestamp interrupts	Mian Yousaf Kaukab	1	-59/+133
	commit 03623b4b041c ("rtc: pcf2127: add tamper detection support") added support for timestamp interrupts. However they are not being handled in the irq handler. If a timestamp interrupt occurs it results in kernel disabling the interrupt and displaying the call trace: [ 121.145580] irq 78: nobody cared (try booting with the "irqpoll" option) ... [ 121.238087] [<00000000c4d69393>] irq_default_primary_handler threaded [<000000000a90d25b>] pcf2127_rtc_irq [rtc_pcf2127] [ 121.248971] Disabling IRQ #78 Handle timestamp interrupts in pcf2127_rtc_irq(). Save time stamp before clearing TSF1 and TSF2 flags so that it can't be overwritten. Set a flag to mark if the timestamp is valid and only report to sysfs if the flag is set. To mimic the hardware behavior, don’t save another timestamp until the first one has been read by the userspace. However, if the alarm irq is not configured, keep the old way of handling timestamp interrupt in the timestamp0 sysfs calls. Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de> Reviewed-by: Bruno Thomsen <bruno.thomsen@gmail.com> Tested-by: Bruno Thomsen <bruno.thomsen@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210629150643.31551-1-ykaukab@suse.de
2021-07-10	rtc: at91sam9: Remove unnecessary offset variable checks	Nobuhiro Iwamatsu	1	-1/+1
	The offset variable is checked by at91_rtc_readalarm(), but this check is unnecessary because the previous check knew that the value of this variable was not 0. This removes that unnecessary offset variable checks. Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210708051340.341345-1-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: s5m: Check return value of s5m_check_peding_alarm_interrupt()	Nobuhiro Iwamatsu	1	-3/+1
	s5m_check_peding_alarm_interrupt() in s5m_rtc_read_alarm() gets the return value, but doesn't use it. This modifies using the s5m_check_peding_alarm_interrupt()"s return value as the s5m_rtc_read_alarm()'s return value. Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: linux-samsung-soc@vger.kernel.org Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210708051304.341278-1-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: spear: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-4/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-11-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: tps6586x: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-14/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-9-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: tps80031: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-14/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-8-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: rtd119x: Fix format of SPDX identifier	Nobuhiro Iwamatsu	1	-2/+1
	For C files, use the C99 format (//). Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-7-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: sc27xx: Fix format of SPDX identifier	Nobuhiro Iwamatsu	1	-1/+1
	For C files, use the C99 format (//). Cc: Orson Zhai <orsonzhai@gmail.com> Cc: Baolin Wang <baolin.wang7@gmail.com> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-6-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: palmas: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-14/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-5-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: max6900: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-5/+3
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-4-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: ds1374: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-5/+2
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-3-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: au1xxx: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-4/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-2-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-09	Revert "PCI: Coalesce host bridge contiguous apertures"	Bjorn Helgaas	1	-46/+4
	This reverts commit 65db04053efea3f3e412a7e0cc599962999c96b4. Guenter reported that after 65db04053efe, the ppc:sam460ex qemu emulation no longer boots from nvme: nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0 nvme nvme0: Removing after probe failure status: -19 Link: https://lore.kernel.org/r/20210709231529.GA3270116@roeck-us.net Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-07-10	rtc: pcf85063: Update the PCF85063A datasheet revision	Fabio Estevam	1	-1/+1
	After updating the datasheet URL, the PCF85063A datasheet revision has changed. Adjust it accordingly. Reported-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210624120953.2313378-1-festevam@gmail.com
2021-07-10	dt-bindings: rtc: ti,bq32k: take maintainership	Alexandre Belloni	1	-1/+1
	Take maintainership of the binding as PAvel said he doesn't have the hardware anymore. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Pavel Machek <pavel@ucw.cz> Link: https://lore.kernel.org/r/20210620224030.1115356-1-alexandre.belloni@bootlin.com
2021-07-09	perf test: Add free() calls for scandir() returned dirent entries	Riccardo Mancini	1	-4/+11
	ASan reported a memory leak for items of the entlist returned from scandir(). In fact, scandir() returns a malloc'd array of malloc'd dirents. This patch adds the missing (z)frees. Fixes: da963834fe6975a1 ("perf test: Iterate over shell tests in alphabetical order") Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Remi Bernon <rbernon@codeweavers.com> Link: http://lore.kernel.org/lkml/20210709163454.672082-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	libperf: Add tests for perf_evlist__set_leader()	Jiri Olsa	1	-6/+21
	Add a test for the newly added perf_evlist__set_leader() function. Committer testing: $ cd tools/lib/perf/ $ sudo make tests [sudo] password for acme: running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK $ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-8-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	libperf: Remove BUG_ON() from library code in get_group_fd()	Arnaldo Carvalho de Melo	1	-7/+16
	We shouldn't just panic, return a value that doesn't clash with what perf_evsel__open() was already returning in case of error, i.e. errno when sys_perf_event_open() fails. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Link: http://lore.kernel.org/lkml/YOiOA5zOtVH9IBbE@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	cifs: update internal version number	Steve French	1	-1/+1
	To 2.33 Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-09	cifs: prevent NULL deref in cifs_compose_mount_options()	Paulo Alcantara	1	-0/+3
	The optional @ref parameter might contain an NULL node_name, so prevent dereferencing it in cifs_compose_mount_options(). Addresses-Coverity: 1476408 ("Explicit null dereferenced") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-09	libperf: Add group support to perf_evsel__open()	Jiri Olsa	1	-2/+24
	Add support to set group_fd in perf_evsel__open() and make it follow the group setup. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	SMB3.1.1: Add support for negotiating signing algorithm	Steve French	4	-11/+86
	Support for faster packet signing (using GMAC instead of CMAC) can now be negotiated to some newer servers, including Windows. See MS-SMB2 section 2.2.3.17. This patch adds support for sending the new negotiate context with the first of three supported signing algorithms (AES-CMAC) and decoding the response. A followon patch will add support for sending the other two (including AES-GMAC, which is fastest) and changing the signing algorithm used based on what was negotiated. To allow the client to request GMAC signing set module parameter "enable_negotiate_signing" to 1. Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-07-09	perf tools: Fix pattern matching for same substring in different PMU type	Jin Yao	3	-2/+37
	Some different PMU types may have the same substring. For example, on Icelake server we have PMU types "uncore_imc" and "uncore_imc_free_running". Both PMU types have the substring "uncore_imc". But the parser wrongly thinks they are the same PMU type. We enable an imc event, perf stat -e uncore_imc/event=0xe3/ -a -- sleep 1 Perf actually expands the event to: uncore_imc_0/event=0xe3/ uncore_imc_1/event=0xe3/ uncore_imc_2/event=0xe3/ uncore_imc_3/event=0xe3/ uncore_imc_4/event=0xe3/ uncore_imc_5/event=0xe3/ uncore_imc_6/event=0xe3/ uncore_imc_7/event=0xe3/ uncore_imc_free_running_0/event=0xe3/ uncore_imc_free_running_1/event=0xe3/ uncore_imc_free_running_3/event=0xe3/ uncore_imc_free_running_4/event=0xe3/ That's because the "uncore_imc_free_running" matches the pattern "uncore_imc". Now we check that the last characters of PMU name is '_<digit>'. For example, for pattern "uncore_imc", "uncore_imc_0" is parsed ok, but "uncore_imc_free_running_0" fails. Fixes: b2b9d3a3f0211c5d ("perf pmu: Support wildcards on pmu name in dynamic pmu events") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210701064253.1175-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	perf record: Add a dummy event on hybrid systems to collect metadata records	Kan Liang	1	-4/+5
	Some symbols may not be resolved if a user only monitors one type of PMU. $ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload $ sudo perf report –stdio # Overhead Command Shared Object Symbol # ........ ......... ................. ..................... # 28.02% perf-exec [unknown] [.] 0x0000000000401cf6 11.32% perf-exec [unknown] [.] 0x0000000000401d04 10.90% perf-exec [unknown] [.] 0x0000000000401d11 10.61% perf-exec [unknown] [.] 0x0000000000401cfc To parse symbols the metadata records, e.g., PERF_RECORD_COMM, which are generated by the kernel, are required. To decide whether to generate the metadata records, the kernel relies on the event_filter_match() to filter the unrelated events. On a hybrid system, event_filter_match() further checks the CPU mask of the current enabled PMU. If an event is collected on the CPU which doesn't have an enabled PMU, it's treated as an unrelated event. The "big_small_workload" is created in a big core, but runs on a small core. The metadata records are filtered, because the user only monitors the PMU of the small core. The big core PMU is not enabled. For a hybrid system, a dummy event is required to generate the complete side-band events. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/1625760212-18441-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	perf stat: Add Topdown metrics L2 events as default events	Kan Liang	2	-1/+8
	The Topdown Microarchitecture Analysis (TMA) Method is a structured analysis methodology to identify critical performance bottlenecks in out-of-order processors. The Topdown metrics L1 event was added as default in 42641d6f4d15e6db ("perf stat: Add Topdown metrics events as default events") From the Sapphire Rapids server and later platforms, the same dedicated "metrics" register is extended to support both L1 and L2 events. Add both L1 and L2 Topdown metrics events as default to enrich the default measuring information if the new measurement register is available. On legacy systems there is no change to avoid extra multiplexing. The topdown_level indicates the max metrics level for the top-down statistics. Set it to 2 to display all L1 and L2 Topdown metrics events. With the patch: $ perf stat sleep 1 Performance counter stats for 'sleep 1': 0.59 msec task-clock # 0.001 CPUs utilized 1 context-switches # 1.687 K/sec 0 cpu-migrations # 0.000 /sec 76 page-faults # 128.198 K/sec 1,405,318 cycles # 2.371 GHz 1,471,136 instructions # 1.05 insn per cycle 310,132 branches # 523.136 M/sec 10,435 branch-misses # 3.36% of all branches 8,431,908 slots # 14.223 G/sec 1,554,116 topdown-retiring # 18.4% retiring 1,289,585 topdown-bad-spec # 15.2% bad speculation 2,810,636 topdown-fe-bound # 33.2% frontend bound 2,810,636 topdown-be-bound # 33.2% backend bound 231,464 topdown-heavy-ops # 2.7% heavy operations # 15.6% light operations 1,223,453 topdown-br-mispredict # 14.5% branch mispredict # 0.8% machine clears 1,884,779 topdown-fetch-lat # 22.3% fetch latency # 10.9% fetch bandwidth 1,454,917 topdown-mem-bound # 17.2% memory bound # 16.0% Core bound 1.001179699 seconds time elapsed 0.000000000 seconds user 0.001238000 seconds sys Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/1625760169-18396-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()	Jiri Olsa	7	-20/+26
	Move the implementation of evlist__set_leader() to a new libperf perf_evlist__set_leader() function with the same functionality make it a libperf exported API. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups	Jiri Olsa	11	-23/+23
	Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group interface to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	libperf: Move 'leader' from tools/perf to perf_evsel::leader	Jiri Olsa	20	-77/+103
	Move evsel::leader to perf_evsel::leader, so we can move the group interface to libperf. Also add several evsel helpers to ease up the transition: struct evsel evsel__leader(struct evsel evsel); - get leader evsel bool evsel__has_leader(struct evsel evsel, struct evsel leader); - true if evsel has leader as leader bool evsel__is_leader(struct evsel evsel); - true if evsel is itw own leader void evsel__set_leader(struct evsel evsel, struct evsel *leader); - set leader for evsel Committer notes: Fix this when building with 'make BUILD_BPF_SKEL=1' tools/perf/util/bpf_counter.c - if (evsel->leader->core.nr_members > 1) { + if (evsel->core.leader->nr_members > 1) { Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	libperf: Move 'idx' from tools/perf to perf_evsel::idx	Jiri Olsa	21	-58/+59
	Move evsel::idx to perf_evsel::idx, so we can move the group interface to libperf. Committer notes: Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that appeared in my tree in my local tree. Also fixed up these: $ find tools/perf/ -name "*.[ch]" \| xargs grep 'evsel->idx' tools/perf/ui/gtk/annotate.c: evsel->idx + i); tools/perf/ui/gtk/annotate.c: evsel->idx); $ That running 'make -C tools/perf build-test' caught. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09	io_uring: remove dead non-zero 'poll' check	Jens Axboe	1	-1/+1
	Colin reports that Coverity complains about checking for poll being non-zero after having dereferenced it multiple times. This is a valid complaint, and actually a leftover from back when this code was based on the aio poll code. Kill the redundant check. Link: https://lore.kernel.org/io-uring/fe70c532-e2a7-3722-58a1-0fa4e5c5ff2c@canonical.com/ Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-07-09	MIPS: vdso: Invalid GIC access through VDSO	Martin Fäcknitz	1	-1/+1
	Accessing raw timers (currently only CLOCK_MONOTONIC_RAW) through VDSO doesn't return the correct time when using the GIC as clock source. The address of the GIC mapped page is in this case not calculated correctly. The GIC mapped page is calculated from the VDSO data by subtracting PAGE_SIZE: void get_gic(const struct vdso_data data) { return (void __iomem *)data - PAGE_SIZE; } However, the data pointer is not page aligned for raw clock sources. This is because the VDSO data for raw clock sources (CS_RAW = 1) is stored after the VDSO data for coarse clock sources (CS_HRES_COARSE = 0). Therefore, only the VDSO data for CS_HRES_COARSE is page aligned: +--------------------+ \| \| \| vd[CS_RAW] \| ---+ \| vd[CS_HRES_COARSE] \| \| +--------------------+ \| -PAGE_SIZE \| \| \| \| GIC mapped page \| <--+ \| \| +--------------------+ When __arch_get_hw_counter() is called with &vd[CS_RAW], get_gic returns the wrong address (somewhere inside the GIC mapped page). The GIC counter values are not returned which results in an invalid time. Fixes: a7f4df4e21dd ("MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()") Signed-off-by: Martin Fäcknitz <faecknitz@hotsplots.de> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-07-09	irqchip/mips: Fix RCU violation when using irqdomain lookup on interrupt entry	Marc Zyngier	5	-11/+31
	Since d4a45c68dc81 ("irqdomain: Protect the linear revmap with RCU"), any irqdomain lookup requires the RCU read lock to be held. This assumes that the architecture code will be structured such as irq_enter() will be called before the interrupt is looked up in the irq domain. However, this isn't the case for MIPS, and a number of drivers are structured to do it the other way around when handling an interrupt in their root irqchip (secondary irqchips are OK by construction). This results in a RCU splat on a lockdep-enabled kernel when the kernel takes an interrupt from idle, as reported by Guenter Roeck. Note that this could have fired previously if any driver had used tree-based irqdomain, which always had the RCU requirement. To solve this, provide a MIPS-specific helper (do_domain_IRQ()) as the pendent of do_IRQ() that will do thing in the right order (and maybe save some cycles in the process). Ideally, MIPS would be moved over to using handle_domain_irq(), but that's much more ambitious. Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> [maz: add dependency on CONFIG_IRQ_DOMAIN after report from the kernelci bot] Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/r/20210705172352.GA56304@roeck-us.net Link: https://lore.kernel.org/r/20210706110647.3979002-1-maz@kernel.org
2021-07-08	cifs: use helpers when parsing uid/gid mount options and validate them	Ronnie Sahlberg	2	-5/+20
	Use the nice helpers to initialize and the uid/gid/cred_uid when passed as mount arguments. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>