aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2023-01-31mm, mremap: fix mremap() expanding for vma's with vm_ops->close()Vlastimil Babka1-6/+19
Fabian has reported another regression in 6.1 due to ca3d76b0aa80 ("mm: add merging after mremap resize"). The problem is that vma_merge() can fail when vma has a vm_ops->close() method, causing is_mergeable_vma() test to be negative. This was happening for vma mapping a file from fuse-overlayfs, which does have the method. But when we are simply expanding the vma, we never remove it due to the "merge" with the added area, so the test should not prevent the expansion. As a quick fix, check for such vmas and expand them using vma_adjust() directly as was done before commit ca3d76b0aa80. For a more robust long term solution we should try to limit the check for vma_ops->close only to cases that actually result in vma removal, so that no merge would be prevented unnecessarily. [akpm@linux-foundation.org: fix indenting whitespace, reflow comment] Link: https://lkml.kernel.org/r/20230117101939.9753-1-vbabka@suse.cz Fixes: ca3d76b0aa80 ("mm: add merging after mremap resize") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Fabian Vogt <fvogt@suse.com> Link: https://bugzilla.suse.com/show_bug.cgi?id=1206359#c35 Tested-by: Fabian Vogt <fvogt@suse.com> Cc: Jakub Matěna <matenajakub@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31squashfs: harden sanity check in squashfs_read_xattr_id_tableFedor Pchelkin1-1/+1
While mounting a corrupted filesystem, a signed integer '*xattr_ids' can become less than zero. This leads to the incorrect computation of 'len' and 'indexes' values which can cause null-ptr-deref in copy_bio_to_actor() or out-of-bounds accesses in the next sanity checks inside squashfs_read_xattr_id_table(). Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Link: https://lkml.kernel.org/r/20230117105226.329303-2-pchelkin@ispras.ru Fixes: 506220d2ba21 ("squashfs: add more sanity checks in xattr id lookup") Reported-by: <syzbot+082fa4af80a5bb1a9843@syzkaller.appspotmail.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31ia64: fix build error due to switch case label appearing next to declarationJames Morse1-2/+5
Since commit aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic: | ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres': | ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement | 189 | s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq); This line appears immediately after a case label in a switch. Move the declarations out of the case, to the top of the function. Link: https://lkml.kernel.org/r/20230117151632.393836-1-james.morse@arm.com Fixes: aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency") Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Sergei Trofimovich <slyich@gmail.com> Cc: Émeric Maschino <emeric.maschino@gmail.com> Cc: matoro <matoro_mailinglist_kernel@matoro.tk> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31mm: multi-gen LRU: fix crash during cgroup migrationYu Zhao1-1/+4
lru_gen_migrate_mm() assumes lru_gen_add_mm() runs prior to itself. This isn't true for the following scenario: CPU 1 CPU 2 clone() cgroup_can_fork() cgroup_procs_write() cgroup_post_fork() task_lock() lru_gen_migrate_mm() task_unlock() task_lock() lru_gen_add_mm() task_unlock() And when the above happens, kernel crashes because of linked list corruption (mm_struct->lru_gen.list). Link: https://lore.kernel.org/r/20230115134651.30028-1-msizanoen@qtmlabs.xyz/ Link: https://lkml.kernel.org/r/20230116034405.2960276-1-yuzhao@google.com Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks") Signed-off-by: Yu Zhao <yuzhao@google.com> Reported-by: msizanoen <msizanoen@qtmlabs.xyz> Tested-by: msizanoen <msizanoen@qtmlabs.xyz> Cc: <stable@vger.kernel.org> [6.1+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31Revert "mm: add nodes= arg to memory.reclaim"Michal Hocko4-68/+21
This reverts commit 12a5d3955227b0d7e04fb793ccceeb2a1dd275c5. Although it is recognized that a finer grained pro-active reclaim is something we need and want the semantic of this implementation is really ambiguous. In a follow up discussion it became clear that there are two essential usecases here. One is to use memory.reclaim to pro-actively reclaim memory and expectation is that the requested and reported amount of memory is uncharged from the memcg. Another usecase focuses on pro-active demotion when the memory is merely shuffled around to demotion targets while the overall charged memory stays unchanged. The current implementation considers demoted pages as reclaimed and that break both usecases. [1] has tried to address the reporting part but there are more issues with that summarized in [2] and follow up emails. Let's revert the nodemask based extension of the memcg pro-active reclaim for now until we settle with a more robust semantic. [1] http://lkml.kernel.org/r/http://lkml.kernel.org/r/20221206023406.3182800-1-almasrymina@google.com [2] http://lkml.kernel.org/r/Y5bsmpCyeryu3Zz1@dhcp22.suse.cz Link: https://lkml.kernel.org/r/Y5xASNe1x8cusiTx@dhcp22.suse.cz Fixes: 12a5d3955227b0d ("mm: add nodes= arg to memory.reclaim") Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: zefan li <lizefan.x@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31zsmalloc: fix a race with deferred_handles storingNhat Pham1-32/+205
Currently, there is a race between zs_free() and zs_reclaim_page(): zs_reclaim_page() finds a handle to an allocated object, but before the eviction happens, an independent zs_free() call to the same handle could come in and overwrite the object value stored at the handle with the last deferred handle. When zs_reclaim_page() finally gets to call the eviction handler, it will see an invalid object value (i.e the previous deferred handle instead of the original object value). This race happens quite infrequently. We only managed to produce it with out-of-tree developmental code that triggers zsmalloc writeback with a much higher frequency than usual. This patch fixes this race by storing the deferred handle in the object header instead. We differentiate the deferred handle from the other two cases (handle for allocated object, and linkage for free object) with a new tag. If zspage reclamation succeeds, we will free these deferred handles by walking through the zspage objects. On the other hand, if zspage reclamation fails, we reconstruct the zspage freelist (with the deferred handle tag and allocated tag) before trying again with the reclamation. [arnd@arndb.de: avoid unused-function warning] Link: https://lkml.kernel.org/r/20230117170507.2651972-1-arnd@kernel.org Link: https://lkml.kernel.org/r/20230110231701.326724-1-nphamcs@gmail.com Fixes: 9997bc017549 ("zsmalloc: implement writeback mechanism for zsmalloc") Signed-off-by: Nhat Pham <nphamcs@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31mm/khugepaged: fix ->anon_vma raceJann Horn1-1/+13
If an ->anon_vma is attached to the VMA, collapse_and_free_pmd() requires it to be locked. Page table traversal is allowed under any one of the mmap lock, the anon_vma lock (if the VMA is associated with an anon_vma), and the mapping lock (if the VMA is associated with a mapping); and so to be able to remove page tables, we must hold all three of them. retract_page_tables() bails out if an ->anon_vma is attached, but does this check before holding the mmap lock (as the comment above the check explains). If we racily merged an existing ->anon_vma (shared with a child process) from a neighboring VMA, subsequent rmap traversals on pages belonging to the child will be able to see the page tables that we are concurrently removing while assuming that nothing else can access them. Repeat the ->anon_vma check once we hold the mmap lock to ensure that there really is no concurrent page table access. Hitting this bug causes a lockdep warning in collapse_and_free_pmd(), in the line "lockdep_assert_held_write(&vma->anon_vma->root->rwsem)". It can also lead to use-after-free access. Link: https://lore.kernel.org/linux-mm/CAG48ez3434wZBKFFbdx4M9j6eUwSUVPd4dxhzW_k_POneSDF+A@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230111133351.807024-1-jannh@google.com Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages") Signed-off-by: Jann Horn <jannh@google.com> Reported-by: Zach O'Keefe <zokeefe@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@intel.linux.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-31maple_tree: fix mas_empty_area_rev() lower bound validationLiam Howlett2-9/+97
mas_empty_area_rev() was not correctly validating the start of a gap against the lower limit. This could lead to the range starting lower than the requested minimum. Fix the issue by better validating a gap once one is found. This commit also adds tests to the maple tree test suite for this issue and tests the mas_empty_area() function for similar bound checking. Link: https://lkml.kernel.org/r/20230111200136.1851322-1-Liam.Howlett@oracle.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216911 Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: <amanieu@gmail.com> Link: https://lore.kernel.org/linux-mm/0b9f5425-08d4-8013-aa4c-e620c3b10bb2@leemhuis.info/ Tested-by: Holger Hoffsttte <holger@applied-asynchrony.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19selftests/filesystems: grant executable permission to run_fat_tests.shPengfei Xu1-0/+0
When use tools/testing/selftests/kselftest_install.sh to make the kselftest-list.txt under tools/testing/selftests/kselftest_install. Then use tools/testing/selftests/kselftest_install/run_kselftest.sh to run all the kselftests in kselftest-list.txt, it will be blocked by case "filesystems/fat: run_fat_tests.sh" with "Warning: file run_fat_tests.sh is not executable", so grant executable permission to run_fat_tests.sh to fix this issue. Link: https://lkml.kernel.org/r/dfdbba6df8a1ab34bb1e81cd8bd7ca3f9ed5c369.1673424747.git.pengfei.xu@intel.com Fixes: dd7c9be330d8 ("selftests/filesystems: add a vfat RENAME_EXCHANGE test") Signed-off-by: Pengfei Xu <pengfei.xu@intel.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18selftests/vm: remove __USE_GNU in hugetlb-madvise.cPeter Xu1-1/+0
__USE_GNU should be an internal macro only used inside glibc. Either memfd_create() or fallocate() requires _GNU_SOURCE per man page, where __USE_GNU will further be defined by glibc headers include/features.h: #ifdef _GNU_SOURCE # define __USE_GNU 1 #endif This fixes: >> hugetlb-madvise.c:20: warning: "__USE_GNU" redefined 20 | #define __USE_GNU | In file included from /usr/include/x86_64-linux-gnu/bits/libc-header-start.h:33, from /usr/include/stdlib.h:26, from hugetlb-madvise.c:16: /usr/include/features.h:407: note: this is the location of the previous definition 407 | # define __USE_GNU 1 | Link: https://lkml.kernel.org/r/Y8V9z+z6Tk7NetI3@x1n Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18mm: fix a few rare cases of using swapin error pte markerPeter Xu3-3/+16
This patch should harden commit 15520a3f0469 ("mm: use pte markers for swap errors") on using pte markers for swapin errors on a few corner cases. 1. Propagate swapin errors across fork()s: if there're swapin errors in the parent mm, after fork()s the child should sigbus too when an error page is accessed. 2. Fix a rare condition race in pte_marker_clear() where a uffd-wp pte marker can be quickly switched to a swapin error. 3. Explicitly ignore swapin error pte markers in change_protection(). I mostly don't worry on (2) or (3) at all, but we should still have them. Case (1) is special because it can potentially cause silent data corrupt on child when parent has swapin error triggered with swapoff, but since swapin error is rare itself already it's probably not easy to trigger either. Currently there is a priority difference between the uffd-wp bit and the swapin error entry, in which the swapin error always has higher priority (e.g. we don't need to wr-protect a swapin error pte marker). If there will be a 3rd bit introduced, we'll probably need to consider a more involved approach so we may need to start operate on the bits. Let's leave that for later. This patch is tested with case (1) explicitly where we'll get corrupted data before in the child if there's existing swapin error pte markers, and after patch applied the child can be rightfully killed. We don't need to copy stable for this one since 15520a3f0469 just landed as part of v6.2-rc1, only "Fixes" applied. Link: https://lkml.kernel.org/r/20221214200453.1772655-3-peterx@redhat.com Fixes: 15520a3f0469 ("mm: use pte markers for swap errors") Signed-off-by: Peter Xu <peterx@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18mm/uffd: fix pte marker when fork() without fork eventPeter Xu1-6/+2
Patch series "mm: Fixes on pte markers". Patch 1 resolves the syzkiller report from Pengfei. Patch 2 further harden pte markers when used with the recent swapin error markers. The major case is we should persist a swapin error marker after fork(), so child shouldn't read a corrupted page. This patch (of 2): When fork(), dst_vma is not guaranteed to have VM_UFFD_WP even if src may have it and has pte marker installed. The warning is improper along with the comment. The right thing is to inherit the pte marker when needed, or keep the dst pte empty. A vague guess is this happened by an accident when there's the prior patch to introduce src/dst vma into this helper during the uffd-wp feature got developed and I probably messed up in the rebase, since if we replace dst_vma with src_vma the warning & comment it all makes sense too. Hugetlb did exactly the right here (copy_hugetlb_page_range()). Fix the general path. Reproducer: https://github.com/xupengfe/syzkaller_logs/blob/main/221208_115556_copy_page_range/repro.c Bugzilla report: https://bugzilla.kernel.org/show_bug.cgi?id=216808 Link: https://lkml.kernel.org/r/20221214200453.1772655-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20221214200453.1772655-2-peterx@redhat.com Fixes: c56d1b62cce8 ("mm/shmem: handle uffd-wp during fork()") Signed-off-by: Peter Xu <peterx@redhat.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: <stable@vger.kernel.org> # 5.19+ Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-15Linux 6.2-rc4Linus Torvalds1-1/+1
2023-01-13kbuild: Fix CFI hash randomization with KASANSami Tolvanen2-0/+2
Clang emits a asan.module_ctor constructor to each object file when KASAN is enabled, and these functions are indirectly called in do_ctors. With CONFIG_CFI_CLANG, the compiler also emits a CFI type hash before each address-taken global function so they can pass indirect call checks. However, in commit 0c3e806ec0f9 ("x86/cfi: Add boot time hash randomization"), x86 implemented boot time hash randomization, which relies on the .cfi_sites section generated by objtool. As objtool is run against vmlinux.o instead of individual object files with X86_KERNEL_IBT (enabled by default), CFI types in object files that are not part of vmlinux.o end up not being included in .cfi_sites, and thus won't get randomized and trip CFI when called. Only .vmlinux.export.o and init/version-timestamp.o are linked into vmlinux separately from vmlinux.o. As these files don't contain any functions, disable KASAN for both of them to avoid breaking hash randomization. Link: https://github.com/ClangBuiltLinux/linux/issues/1742 Fixes: 0c3e806ec0f9 ("x86/cfi: Add boot time hash randomization") Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230112224948.1479453-2-samitolvanen@google.com
2023-01-13firmware: coreboot: Check size of table entry and use flex-arrayKees Cook2-2/+8
The memcpy() of the data following a coreboot_table_entry couldn't be evaluated by the compiler under CONFIG_FORTIFY_SOURCE. To make it easier to reason about, add an explicit flexible array member to struct coreboot_device so the entire entry can be copied at once. Additionally, validate the sizes before copying. Avoids this run-time false positive warning: memcpy: detected field-spanning write (size 168) of single field "&device->entry" at drivers/firmware/google/coreboot_table.c:103 (size 8) Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/all/03ae2704-8c30-f9f0-215b-7cdf4ad35a9a@molgen.mpg.de/ Cc: Jack Rosenthal <jrosenth@chromium.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Julius Werner <jwerner@chromium.org> Cc: Brian Norris <briannorris@chromium.org> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Julius Werner <jwerner@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Link: https://lore.kernel.org/r/20230107031406.gonna.761-kees@kernel.org Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Jack Rosenthal <jrosenth@chromium.org> Link: https://lore.kernel.org/r/20230112230312.give.446-kees@kernel.org
2023-01-13kallsyms: Fix scheduling with interrupts disabled in self-testNicholas Piggin1-15/+6
kallsyms_on_each* may schedule so must not be called with interrupts disabled. The iteration function could disable interrupts, but this also changes lookup_symbol() to match the change to the other timing code. Reported-by: Erhard F. <erhard_f@mailbox.org> Link: https://lore.kernel.org/all/bug-216902-206035@https.bugzilla.kernel.org%2F/ Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.sang@intel.com Fixes: 30f3bb09778d ("kallsyms: Add self-test facility") Tested-by: "Erhard F." <erhard_f@mailbox.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-01-14ata: pata_cs5535: Don't build on UMLPeter Foley1-0/+1
This driver uses MSR functions that aren't implemented under UML. Avoid building it to prevent tripping up allyesconfig. e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3a3): undefined reference to `__tracepoint_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3d2): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x457): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x481): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4d5): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4f5): undefined reference to `do_trace_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x51c): undefined reference to `do_trace_write_msr' Signed-off-by: Peter Foley <pefoley2@pefoley.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-01-13lockref: stop doing cpu_relax in the cmpxchg loopMateusz Guzik1-1/+0
On the x86-64 architecture even a failing cmpxchg grants exclusive access to the cacheline, making it preferable to retry the failed op immediately instead of stalling with the pause instruction. To illustrate the impact, below are benchmark results obtained by running various will-it-scale tests on top of the 6.2-rc3 kernel and Cascade Lake (2 sockets * 24 cores * 2 threads) CPU. All results in ops/s. Note there is some variance in re-runs, but the code is consistently faster when contention is present. open3 ("Same file open/close"): proc stock no-pause 1 805603 814942 (+%1) 2 1054980 1054781 (-0%) 8 1544802 1822858 (+18%) 24 1191064 2199665 (+84%) 48 851582 1469860 (+72%) 96 609481 1427170 (+134%) fstat2 ("Same file fstat"): proc stock no-pause 1 3013872 3047636 (+1%) 2 4284687 4400421 (+2%) 8 3257721 5530156 (+69%) 24 2239819 5466127 (+144%) 48 1701072 5256609 (+209%) 96 1269157 6649326 (+423%) Additionally, a kernel with a private patch to help access() scalability: access2 ("Same file access"): proc stock patched patched +nopause 24 2378041 2005501 5370335 (-15% / +125%) That is, fixing the problems in access itself *reduces* scalability after the cacheline ping-pong only happens in lockref with the pause instruction. Note that fstat and access benchmarks are not currently integrated into will-it-scale, but interested parties can find them in pull requests to said project. Code at hand has a rather tortured history. First modification showed up in commit d472d9d98b46 ("lockref: Relax in cmpxchg loop"), written with Itanium in mind. Later it got patched up to use an arch-dependent macro to stop doing it on s390 where it caused a significant regression. Said macro had undergone revisions and was ultimately eliminated later, going back to cpu_relax. While I intended to only remove cpu_relax for x86-64, I got the following comment from Linus: I would actually prefer just removing it entirely and see if somebody else hollers. You have the numbers to prove it hurts on real hardware, and I don't think we have any numbers to the contrary. So I think it's better to trust the numbers and remove it as a failure, than say "let's just remove it on x86-64 and leave everybody else with the potentially broken code" Additionally, Will Deacon (maintainer of the arm64 port, one of the architectures previously benchmarked): So, from the arm64 side of the fence, I'm perfectly happy just removing the cpu_relax() calls from lockref. As such, come back full circle in history and whack it altogether. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=WdA@mail.gmail.com/ Acked-by: Tony Luck <tony.luck@intel.com> # ia64 Acked-by: Nicholas Piggin <npiggin@gmail.com> # powerpc Acked-by: Will Deacon <will@kernel.org> # arm64 Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-01-13x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM spaceBjorn Helgaas1-0/+31
Normally we reject ECAM space unless it is reported as reserved in the E820 table or via a PNP0C02 _CRS method (PCI Firmware, r3.3, sec 4.1.2). 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"), removes E820 entries that correspond to EfiMemoryMappedIO regions because some other firmware uses EfiMemoryMappedIO for PCI host bridge windows, and the E820 entries prevent Linux from allocating BAR space for hot-added devices. Some firmware doesn't report ECAM space via PNP0C02 _CRS methods, but does mention it as an EfiMemoryMappedIO region via EFI GetMemoryMap(), which is normally converted to an E820 entry by a bootloader or EFI stub. After 07eab0901ede, that E820 entry is removed, so we reject this ECAM space, which makes PCI extended config space (offsets 0x100-0xfff) inaccessible. The lack of extended config space breaks anything that relies on it, including perf, VSEC telemetry, EDAC, QAT, SR-IOV, etc. Allow use of ECAM for extended config space when the region is covered by an EfiMemoryMappedIO region, even if it's not included in E820 or PNP0C02 _CRS. Link: https://lore.kernel.org/r/ac2693d8-8ba3-72e0-5b66-b3ae008d539d@linux.intel.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216891 Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map") Link: https://lore.kernel.org/r/20230110180243.1590045-3-helgaas@kernel.org Reported-by: Kan Liang <kan.liang@linux.intel.com> Reported-by: Tony Luck <tony.luck@intel.com> Reported-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reported-by: Yunying Sun <yunying.sun@intel.com> Reported-by: Baowen Zheng <baowen.zheng@corigine.com> Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reported-by: Yang Lixiao <lixiao.yang@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Yunying Sun <yunying.sun@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
2023-01-13efi: tpm: Avoid READ_ONCE() for accessing the event logArd Biesheuvel1-2/+2
Nathan reports that recent kernels built with LTO will crash when doing EFI boot using Fedora's GRUB and SHIM. The culprit turns out to be a misaligned load from the TPM event log, which is annotated with READ_ONCE(), and under LTO, this gets translated into a LDAR instruction which does not tolerate misaligned accesses. Interestingly, this does not happen when booting the same kernel straight from the UEFI shell, and so the fact that the event log may appear misaligned in memory may be caused by a bug in GRUB or SHIM. However, using READ_ONCE() to access firmware tables is slightly unusual in any case, and here, we only need to ensure that 'event' is not dereferenced again after it gets unmapped, but this is already taken care of by the implicit barrier() semantics of the early_memunmap() call. Cc: <stable@vger.kernel.org> Cc: Peter Jones <pjones@redhat.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1782 Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-13io_uring: lock overflowing for IOPOLLPavel Begunkov1-1/+5
syzbot reports an issue with overflow filling for IOPOLL: WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0 Workqueue: events_unbound io_ring_exit_work Call trace:  io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734  io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773  io_fill_cqe_req io_uring/io_uring.h:168 [inline]  io_do_iopoll+0x474/0x62c io_uring/rw.c:1065  io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513  io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056  io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869  process_one_work+0x2d8/0x504 kernel/workqueue.c:2289  worker_thread+0x340/0x610 kernel/workqueue.c:2436  kthread+0x12c/0x158 kernel/kthread.c:376  ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863 There is no real problem for normal IOPOLL as flush is also called with uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL, for which __io_cqring_overflow_flush() happens from the CQ waiting path. Reported-and-tested-by: syzbot+6805087452d72929404e@syzkaller.appspotmail.com Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-13ALSA: pcm: Move rwsem lock inside snd_ctl_elem_read to prevent UAFClement Lecigne1-9/+15
Takes rwsem lock inside snd_ctl_elem_read instead of snd_ctl_elem_read_user like it was done for write in commit 1fa4445f9adf1 ("ALSA: control - introduce snd_ctl_notify_one() helper"). Doing this way we are also fixing the following locking issue happening in the compat path which can be easily triggered and turned into an use-after-free. 64-bits: snd_ctl_ioctl snd_ctl_elem_read_user [takes controls_rwsem] snd_ctl_elem_read [lock properly held, all good] [drops controls_rwsem] 32-bits: snd_ctl_ioctl_compat snd_ctl_elem_write_read_compat ctl_elem_write_read snd_ctl_elem_read [missing lock, not good] CVE-2023-0266 was assigned for this issue. Cc: stable@kernel.org # 5.13+ Signed-off-by: Clement Lecigne <clecigne@google.com> Reviewed-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20230113120745.25464-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-01-13iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()Christophe JAILLET1-1/+3
A clk, prepared and enabled in mtk_iommu_v1_hw_init(), is not released in the error handling path of mtk_iommu_v1_probe(). Add the corresponding clk_disable_unprepare(), as already done in the remove function. Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/593e7b7d97c6e064b29716b091a9d4fd122241fb.1671473163.git.christophe.jaillet@wanadoo.fr Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13iommu/iova: Fix alloc iova overflows issueYunfei Wang1-2/+2
In __alloc_and_insert_iova_range, there is an issue that retry_pfn overflows. The value of iovad->anchor.pfn_hi is ~0UL, then when iovad->cached_node is iovad->anchor, curr_iova->pfn_hi + 1 will overflow. As a result, if the retry logic is executed, low_pfn is updated to 0, and then new_pfn < low_pfn returns false to make the allocation successful. This issue occurs in the following two situations: 1. The first iova size exceeds the domain size. When initializing iova domain, iovad->cached_node is assigned as iovad->anchor. For example, the iova domain size is 10M, start_pfn is 0x1_F000_0000, and the iova size allocated for the first time is 11M. The following is the log information, new->pfn_lo is smaller than iovad->cached_node. Example log as follows: [ 223.798112][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range start_pfn:0x1f0000,retry_pfn:0x0,size:0xb00,limit_pfn:0x1f0a00 [ 223.799590][T1705487] sh: [name:iova&]__alloc_and_insert_iova_range success start_pfn:0x1f0000,new->pfn_lo:0x1efe00,new->pfn_hi:0x1f08ff 2. The node with the largest iova->pfn_lo value in the iova domain is deleted, iovad->cached_node will be updated to iovad->anchor, and then the alloc iova size exceeds the maximum iova size that can be allocated in the domain. After judging that retry_pfn is less than limit_pfn, call retry_pfn+1 to fix the overflow issue. Signed-off-by: jianjiao zeng <jianjiao.zeng@mediatek.com> Signed-off-by: Yunfei Wang <yf.wang@mediatek.com> Cc: <stable@vger.kernel.org> # 5.15.* Fixes: 4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search fails") Acked-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230111063801.25107-1-yf.wang@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13iommu: Fix refcount leak in iommu_device_claim_dma_ownerMiaoqian Lin1-3/+5
iommu_group_get() returns the group with the reference incremented. Move iommu_group_get() after owner check to fix the refcount leak. Fixes: 89395ccedbc1 ("iommu: Add device-centric DMA ownership interfaces") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221230083100.1489569-1-linmq006@gmail.com [ joro: Remove *group = NULL initialization ] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13iommu/arm-smmu-v3: Don't unregister on shutdownVladimir Oltean1-1/+3
Similar to SMMUv2, this driver calls iommu_device_unregister() from the shutdown path, which removes the IOMMU groups with no coordination whatsoever with their users - shutdown methods are optional in device drivers. This can lead to NULL pointer dereferences in those drivers' DMA API calls, or worse. Instead of calling the full arm_smmu_device_remove() from arm_smmu_device_shutdown(), let's pick only the relevant function call - arm_smmu_device_disable() - more or less the reverse of arm_smmu_device_reset() - and call just that from the shutdown path. Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration") Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20221215141251.3688780-2-vladimir.oltean@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-13iommu/arm-smmu: Don't unregister on shutdownVladimir Oltean1-8/+14
Michael Walle says he noticed the following stack trace while performing a shutdown with "reboot -f". He suggests he got "lucky" and just hit the correct spot for the reboot while there was a packet transmission in flight. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098 CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 6.1.0-rc5-00088-gf3600ff8e322 #1930 Hardware name: Kontron KBox A-230-LS (DT) pc : iommu_get_dma_domain+0x14/0x20 lr : iommu_dma_map_page+0x9c/0x254 Call trace: iommu_get_dma_domain+0x14/0x20 dma_map_page_attrs+0x1ec/0x250 enetc_start_xmit+0x14c/0x10b0 enetc_xmit+0x60/0xdc dev_hard_start_xmit+0xb8/0x210 sch_direct_xmit+0x11c/0x420 __dev_queue_xmit+0x354/0xb20 ip6_finish_output2+0x280/0x5b0 __ip6_finish_output+0x15c/0x270 ip6_output+0x78/0x15c NF_HOOK.constprop.0+0x50/0xd0 mld_sendpack+0x1bc/0x320 mld_ifc_work+0x1d8/0x4dc process_one_work+0x1e8/0x460 worker_thread+0x178/0x534 kthread+0xe0/0xe4 ret_from_fork+0x10/0x20 Code: d503201f f9416800 d503233f d50323bf (f9404c00) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt This appears to be reproducible when the board has a fixed IP address, is ping flooded from another host, and "reboot -f" is used. The following is one more manifestation of the issue: $ reboot -f kvm: exiting hardware virtualization cfg80211: failed to load regulatory.db arm-smmu 5000000.iommu: disabling translation sdhci-esdhc 2140000.mmc: Removing from iommu group 11 sdhci-esdhc 2150000.mmc: Removing from iommu group 12 fsl-edma 22c0000.dma-controller: Removing from iommu group 17 dwc3 3100000.usb: Removing from iommu group 9 dwc3 3110000.usb: Removing from iommu group 10 ahci-qoriq 3200000.sata: Removing from iommu group 2 fsl-qdma 8380000.dma-controller: Removing from iommu group 20 platform f080000.display: Removing from iommu group 0 etnaviv-gpu f0c0000.gpu: Removing from iommu group 1 etnaviv etnaviv: Removing from iommu group 1 caam_jr 8010000.jr: Removing from iommu group 13 caam_jr 8020000.jr: Removing from iommu group 14 caam_jr 8030000.jr: Removing from iommu group 15 caam_jr 8040000.jr: Removing from iommu group 16 fsl_enetc 0000:00:00.0: Removing from iommu group 4 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000 fsl_enetc 0000:00:00.1: Removing from iommu group 5 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000002, GFSYNR1 0x00000429, GFSYNR2 0x00000000 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000 fsl_enetc 0000:00:00.2: Removing from iommu group 6 fsl_enetc_mdio 0000:00:00.3: Removing from iommu group 8 mscc_felix 0000:00:00.5: Removing from iommu group 3 fsl_enetc 0000:00:00.6: Removing from iommu group 7 pcieport 0001:00:00.0: Removing from iommu group 18 arm-smmu 5000000.iommu: Blocked unknown Stream ID 0x429; boot with "arm-smmu.disable_bypass=0" to allow, but this may have security implications arm-smmu 5000000.iommu: GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000429, GFSYNR2 0x00000000 pcieport 0002:00:00.0: Removing from iommu group 19 Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a8 pc : iommu_get_dma_domain+0x14/0x20 lr : iommu_dma_unmap_page+0x38/0xe0 Call trace: iommu_get_dma_domain+0x14/0x20 dma_unmap_page_attrs+0x38/0x1d0 enetc_unmap_tx_buff.isra.0+0x6c/0x80 enetc_poll+0x170/0x910 __napi_poll+0x40/0x1e0 net_rx_action+0x164/0x37c __do_softirq+0x128/0x368 run_ksoftirqd+0x68/0x90 smpboot_thread_fn+0x14c/0x190 Code: d503201f f9416800 d503233f d50323bf (f9405400) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- The problem seems to be that iommu_group_remove_device() is allowed to run with no coordination whatsoever with the shutdown procedure of the enetc PCI device. In fact, it almost seems as if it implies that the pci_driver :: shutdown() method is mandatory if DMA is used with an IOMMU, otherwise this is inevitable. That was never the case; shutdown methods are optional in device drivers. This is the call stack that leads to iommu_group_remove_device() during reboot: kernel_restart -> device_shutdown -> platform_shutdown -> arm_smmu_device_shutdown -> arm_smmu_device_remove -> iommu_device_unregister -> bus_for_each_dev -> remove_iommu_group -> iommu_release_device -> iommu_group_remove_device I don't know much about the arm_smmu driver, but arm_smmu_device_shutdown() invoking arm_smmu_device_remove() looks suspicious, since it causes the IOMMU device to unregister and that's where everything starts to unravel. It forces all other devices which depend on IOMMU groups to also point their ->shutdown() to ->remove(), which will make reboot slower overall. There are 2 moments relevant to this behavior. First was commit b06c076ea962 ("Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"") when arm_smmu_device_shutdown() was made to run the exact same thing as arm_smmu_device_remove(). Prior to that, there was no iommu_device_unregister() call in arm_smmu_device_shutdown(). However, that was benign until commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration"), which made iommu_device_unregister() call remove_iommu_group(). Restore the old shutdown behavior by making remove() call shutdown(), but shutdown() does not call the remove() specific bits. Fixes: 57365a04c921 ("iommu: Move bus setup to IOMMU device registration") Reported-by: Michael Walle <michael@walle.cc> Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20221215141251.3688780-1-vladimir.oltean@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-13iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY even bettererRobin Murphy1-2/+8
Although it's vanishingly unlikely that anyone would integrate an SMMU within a coherent interconnect without also making the pagetable walk interface coherent, the same effect happens if a coherent SMMU fails to advertise CTTW correctly. This turns out to be the case on some popular NXP SoCs, where VFIO started failing the IOMMU_CAP_CACHE_COHERENCY test, even though IOMMU_CACHE *was* previously achieving the desired effect anyway thanks to the underlying integration. While those SoCs stand to gain some more general benefits from a firmware update to override CTTW correctly in DT/ACPI, it's also easy to work around this in Linux as well, to avoid imposing too much on affected users - since the upstream client devices *are* correctly marked as coherent, we can trivially infer their coherent paths through the SMMU as well. Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com> Fixes: df198b37e72c ("iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY better") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/d6dc41952961e5c7b21acac08a8bf1eb0f69e124.1671123115.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-01-13platform/x86: thinkpad_acpi: Fix profile mode display in AMT modeMark Pearson1-6/+17
Recently AMT mode was enabled (somewhat unexpectedly) on the Lenovo Z13 platform. The FW is advertising it is available and the driver tries to use it - unfortunately it reports the profile mode incorrectly. Note, there is also some extra work needed to enable the dynamic aspect of AMT support that I will be following up with; but more testing is needed first. This patch just fixes things so the profiles are reported correctly. Link: https://gitlab.freedesktop.org/hadess/power-profiles-daemon/-/issues/115 Fixes: 46dcbc61b739 ("platform/x86: thinkpad-acpi: Add support for automatic mode transitions") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca> Link: https://lore.kernel.org/r/20230112221228.490946-1-mpearson-lenovo@squebb.ca Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-13ALSA: usb-audio: Fix possible NULL pointer dereference in snd_usb_pcm_has_fixed_rate()Jaroslav Kysela1-1/+2
The subs function argument may be NULL, so do not use it before the NULL check. Fixes: 291e9da91403 ("ALSA: usb-audio: Always initialize fixed_rate in snd_usb_find_implicit_fb_sync_format()") Reported-by: coverity-bot <keescook@chromium.org> Link: https://lore.kernel.org/alsa-devel/202301121424.4A79A485@keescook/ Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20230113085311.623325-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-01-12platform/x86: int3472/discrete: Ensure the clk/power enable pins are in output modeHans de Goede2-0/+7
acpi_get_and_request_gpiod() does not take a gpio_lookup_flags argument specifying that the pins direction should be initialized to a specific value. This means that in some cases the pins might be left in input mode, causing the gpiod_set() calls made to enable the clk / regulator to not work. One example of this problem is the clk-enable GPIO for the ov01a1s sensor on a Dell Latitude 9420 being left in input mode causing the clk to never get enabled. Explicitly set the direction of the pins to output to fix this. Fixes: 5de691bffe57 ("platform/x86: Add intel_skl_int3472 driver") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Reviewed-by: Daniel Scally <djrscally@gmail.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Link: https://lore.kernel.org/r/20230111201426.947853-1-hdegoede@redhat.com
2023-01-12platform/x86/amd: Fix refcount leak in amd_pmc_probeMiaoqian Lin1-1/+1
pci_get_domain_bus_and_slot() takes reference, the caller should release the reference by calling pci_dev_put() after use. Call pci_dev_put() in the error path to fix this. Fixes: 3d7d407dfb05 ("platform/x86: amd-pmc: Add support for AMD Spill to DRAM STB feature") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20221229072534.1381432-1-linmq006@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: intel/pmc/core: Add Meteor Lake mobile supportGayatri Kammela1-0/+1
Add Meteor Lake mobile support to pmc core driver. Meteor Lake mobile parts reuse all the Meteor Lake PCH IPs. Cc: David E Box <david.e.box@linux.intel.com> Signed-off-by: Gayatri Kammela <gayatri.kammela@linux.intel.com> Link: https://lore.kernel.org/r/20221228230553.2497183-1-gayatri.kammela@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: simatic-ipc: add another modelHenning Schild2-0/+2
Add IPC PX-39A support. Signed-off-by: Henning Schild <henning.schild@siemens.com> Link: https://lore.kernel.org/r/20221222103720.8546-3-henning.schild@siemens.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: simatic-ipc: correct name of a modelHenning Schild2-2/+2
What we called IPC427G should be renamed to BX-39A to be more in line with the actual product name. Signed-off-by: Henning Schild <henning.schild@siemens.com> Link: https://lore.kernel.org/r/20221222103720.8546-2-henning.schild@siemens.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: dell-privacy: Only register SW_CAMERA_LENS_COVER if presentHans de Goede1-4/+15
Unlike keys where userspace only reacts to keypresses, userspace may act on switches in both (0 and 1) of their positions. For example if a SW_TABLET_MODE switch is registered then GNOME will not automatically show the onscreen keyboard when a text field gets focus on touchscreen devices when SW_TABLET_MODE reports 0 and when SW_TABLET_MODE reports 1 libinput will block (filter out) builtin keyboard and touchpad events. So to avoid unwanted side-effects EV_SW type inputs should only be registered if they are actually present, only register SW_CAMERA_LENS_COVER if it is actually there. Fixes: 8af9fa37b8a3 ("platform/x86: dell-privacy: Add support for Dell hardware privacy") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20221221220724.119594-2-hdegoede@redhat.com
2023-01-12platform/x86: dell-privacy: Fix SW_CAMERA_LENS_COVER reportingHans de Goede1-6/+16
Use KE_VSW instead of KE_SW for the SW_CAMERA_LENS_COVER key_entry and get the value of the switch from the status field when handling SW_CAMERA_LENS_COVER events, instead of always reporting 0. Also correctly set the initial SW_CAMERA_LENS_COVER value. Fixes: 8af9fa37b8a3 ("platform/x86: dell-privacy: Add support for Dell hardware privacy") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20221221220724.119594-1-hdegoede@redhat.com
2023-01-12platform/x86: asus-wmi: Don't load fan curves without fanThomas Weißschuh1-0/+3
If we do not have a fan it does not make sense to load curves for it. This removes the following warnings from the kernel log: asus_wmi: fan_curve_get_factory_default (0x00110024) failed: -19 asus_wmi: fan_curve_get_factory_default (0x00110025) failed: -19 Fixes: a2bdf10ce96e ("platform/x86: asus-wmi: Increase FAN_CURVE_BUF_LEN to 32") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221221-asus-fan-v1-3-e07f3949725b@weissschuh.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: asus-wmi: Ignore fan on E410MAThomas Weißschuh1-0/+13
The ASUS VivoBook has a fan device described in its ACPI tables but does not actually contain any physical fan. Use the quirk to inhibit fan handling. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221221-asus-fan-v1-2-e07f3949725b@weissschuh.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: asus-wmi: Add quirk wmi_ignore_fanThomas Weißschuh2-1/+4
Some laptops have a fan device listed in their ACPI tables but do not actually contain a fan. Introduce a quirk that can be used to override the fan detection logic. This was observed with a ASUS VivoBook E410MA running firmware E410MAB.304. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221221-asus-fan-v1-1-e07f3949725b@weissschuh.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/x86: asus-nb-wmi: Add alternate mapping for KEY_SCREENLOCKHans de Goede1-0/+1
The 0x33 keycode is emitted by Fn + F6 on a ASUS FX705GE laptop. Reported-by: Nemcev Aleksey <Nemcev_Aleksey@inbox.ru> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230112181841.84652-1-hdegoede@redhat.com
2023-01-12platform/x86: asus-nb-wmi: Add alternate mapping for KEY_CAMERAThomas Weißschuh1-0/+1
This keycode is emitted on a Asus VivoBook E410MAB with firmware E410MAB.304. The physical key has a strikken-through camera printed on it. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20221216-asus-key-v1-1-45da124119a3@weissschuh.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/surface: aggregator: Add missing call to ssam_request_sync_free()Maximilian Luz1-1/+3
Although rare, ssam_request_sync_init() can fail. In that case, the request should be freed via ssam_request_sync_free(). Currently it is leaked instead. Fix this. Fixes: c167b9c7e3d6 ("platform/surface: Add Surface Aggregator subsystem") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20221220175608.1436273-1-luzmaximilian@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12platform/surface: aggregator: Ignore command messages not intended for usMaximilian Luz1-0/+14
It is possible that we (the host/kernel driver) receive command messages that are not intended for us. Ignore those for now. The whole story is a bit more complicated: It is possible to enable debug output on SAM, which is sent via SSH command messages. By default this output is sent to a debug connector, with its own target ID (TID=0x03). It is possible to override the target of the debug output and set it to the host/kernel driver. This, however, does not change the original target ID of the message. Meaning, we receive messages with TID=0x03 (debug) but expect to only receive messages with TID=0x00 (host). The problem is that the different target ID also comes with a different scope of request IDs. In particular, these do not follow the standard event rules (i.e. do not fall into a set of small reserved values). Therefore, current message handling interprets them as responses to pending requests and tries to match them up via the request ID. However, these debug output messages are not in fact responses, and therefore this will at best fail to find the request and at worst pass on the wrong data as response for a request. Therefore ignore any command messages not intended for us (host) for now. We can implement support for the debug messages once we have a better understanding of them. Note that this may also provide a bit more stability and avoid some driver confusion in case any other targets want to talk to us in the future, since we don't yet know what to do with those as well. A warning for the dropped messages should suffice for now and also give us a chance of discovering new targets if they come along without any potential for bugs/instabilities. Fixes: c167b9c7e3d6 ("platform/surface: Add Surface Aggregator subsystem") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20221202223327.690880-2-luzmaximilian@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12io_uring/poll: attempt request issue after racy poll wakeupJens Axboe1-11/+20
If we have multiple requests waiting on the same target poll waitqueue, then it's quite possible to get a request triggered and get disappointed in not being able to make any progress with it. If we race in doing so, we'll potentially leave the poll request on the internal tables, but removed from the waitqueue. That means that any subsequent trigger of the poll waitqueue will not kick that request into action, causing an application to potentially wait for completion of a request that will never happen. Fix this by adding a new poll return state, IOU_POLL_REISSUE. Rather than have complicated logic for how to re-arm a given type of request, just punt it for a reissue. While in there, move the 'ret' variable to the only section where it gets used. This avoids confusion the scope of it. Cc: stable@vger.kernel.org Fixes: eb0089d629ba ("io_uring: single shot poll removal optimisation") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-12platform/x86: touchscreen_dmi: Add info for the CSL Panther Tab HDMichael Klein1-0/+25
Add touchscreen info for the CSL Panther Tab HD. Signed-off-by: Michael Klein <m.klein@mvz-labor-lb.de> Link: https://lore.kernel.org/r/20221220121103.uiwn5l7fii2iggct@LLGMVZLB-0037 Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-01-12cifs: Fix uninitialized memory read for smb311 posix symlink createVolker Lendecke1-0/+1
If smb311 posix is enabled, we send the intended mode for file creation in the posix create context. Instead of using what's there on the stack, create the mfsymlink file with 0644. Fixes: ce558b0e17f8a ("smb3: Add posix create context for smb3.11 posix mounts") Cc: stable@vger.kernel.org Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-01-12r8152: add vendor/device ID pair for Microsoft DevkitAndre Przywara1-0/+1
The Microsoft Devkit 2023 is a an ARM64 based machine featuring a Realtek 8153 USB3.0-to-GBit Ethernet adapter. As in their other machines, Microsoft uses a custom USB device ID. Add the respective ID values to the driver. This makes Ethernet work on the MS Devkit device. The chip has been visually confirmed to be a RTL8153. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Link: https://lore.kernel.org/r/20230111133228.190801-1-andre.przywara@arm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-12drm: Optimize drm buddy top-down allocation methodArunpravin Paneer Selvam1-27/+54
We are observing performance drop in many usecases which include games, 3D benchmark applications,etc.. To solve this problem, We are strictly not allowing top down flag enabled allocations to steal the memory space from cpu visible region. The idea is, we are sorting each order list entries in ascending order and compare the last entry of each order list in the freelist and return the max block. This patch improves the 3D benchmark scores and solves fragmentation issues. All drm buddy selftests are verfied. drm_buddy: pass:6 fail:0 skip:0 total:6 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230112120027.3072-1-Arunpravin.PaneerSelvam@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> CC: Cc: stable@vger.kernel.org # 5.18+
2023-01-12drm/ttm: Fix a regression causing kernel oops'esZack Rusin1-1/+1
The branch is explicitly taken if ttm == NULL which means that to avoid a null pointer reference the ttm object can not be used inside. Switch back to dst_mem to avoid kernel oops'es. This fixes kernel oops'es with any buffer objects which don't have ttm_tt, e.g. with vram based screen objects on vmwgfx. Signed-off-by: Zack Rusin <zackr@vmware.com> Fixes: e3c92eb4a84f ("drm/ttm: rework on ttm_resource to use size_t type") Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com> Cc: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111175015.1134923-1-zack@kde.org Signed-off-by: Christian König <christian.koenig@amd.com>