linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2022-01-11	sections: Fix __is_kernel() to include init ranges	Helge Deller	1	-3/+7
	With CONFIG_KALLSYMS_ALL=y, the function is_ksym_addr() is used to determine if a symbol is from inside the kernel range. For that the given symbol address is checked if it's inside the _stext to _end range. Although this is correct, some architectures (e.g. parisc) may have the init area before the _stext address and as such the check in is_ksym_addr() fails. By extending the range check to include the init section, __is_kernel() will now detect symbols in this range as well. This fixes an issue on parisc where addresses of kernel functions in init sections aren't resolved to their symbol names. Signed-off-by: Helge Deller <deller@gmx.de>
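The extended range check described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct kernel_layout`, `in_range()` and `is_kernel_addr()` are hypothetical names, and the half-open bounds are a choice made for the sketch, not the kernel's actual `__is_kernel()` code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the linker-provided section boundaries. */
struct kernel_layout {
    uintptr_t init_begin, init_end; /* init section, may lie before stext */
    uintptr_t stext, end;           /* main kernel image */
};

/* Half-open range test: lo <= addr < hi (an assumption of this sketch). */
static bool in_range(uintptr_t addr, uintptr_t lo, uintptr_t hi)
{
    return addr >= lo && addr < hi;
}

/* An address counts as a kernel address if it is in the main image
 * OR in the init section, wherever the init section is placed. */
static bool is_kernel_addr(const struct kernel_layout *k, uintptr_t addr)
{
    assert(k != NULL);
    return in_range(addr, k->stext, k->end) ||
           in_range(addr, k->init_begin, k->init_end);
}
```

With a layout where the init section precedes the main image, as the commit message reports for parisc, an address inside the init section is accepted only because of the second test.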
2022-01-11	parisc: Re-use toc_stack as hpmc_stack	Helge Deller	1	-4/+2
	There is no need for a separate hpmc_stack. Just re-use the toc_stack of the monarch CPU, since only one of a TOC or an HPMC will happen at any given time. This reduces the kernel memory footprint by 16k. Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-11	parisc: Enable TOC (transfer of contents) feature unconditionally	Helge Deller	5	-41/+30
	Before this patch, the TOC code used a pre-allocated stack of 16kb for each possible CPU. That space overhead was the reason why the TOC feature wasn't enabled by default for 32-bit kernels. This patch rewrites the TOC code to use a per-cpu stack. That way we use much less memory now and as such we enable the TOC feature by default on all kernels. Additionally the dump of the registers and the stacktrace wasn't serialized, which led to multiple CPUs printing the stack backtrace at once which rendered the output unreadable. Now the backtraces are nicely serialized by a lock. Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: io: Improve the outb(), outw() and outl() macros	Bart Van Assche	1	-3/+3
	This patch fixes the following build error for source file drivers/scsi/pcmcia/sym53c500_cs.c: In file included from ./include/linux/bug.h:5, from ./include/linux/cpumask.h:14, from ./include/linux/mm_types_task.h:14, from ./include/linux/mm_types.h:5, from ./include/linux/buildid.h:5, from ./include/linux/module.h:14, from drivers/scsi/pcmcia/sym53c500_cs.c:42: drivers/scsi/pcmcia/sym53c500_cs.c: In function ‘SYM53C500_intr’: ./arch/parisc/include/asm/bug.h:28:2: error: expected expression before ‘do’ 28 \| do { \ \| ^~ ./arch/parisc/include/asm/io.h:276:20: note: in expansion of macro ‘BUG’ 276 \| #define outb(x, y) BUG() \| ^~~ drivers/scsi/pcmcia/sym53c500_cs.c:124:19: note: in expansion of macro ‘outb’ 124 \| #define REG0(x) (outb(C4_IMG, (x) + CONFIG4)) \| ^~~~ drivers/scsi/pcmcia/sym53c500_cs.c:362:2: note: in expansion of macro ‘REG0’ 362 \| REG0(port_base); \| ^~~~ Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-parisc@vger.kernel.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: pdc_stable: use default_groups in kobj_type	Greg Kroah-Hartman	1	-1/+2
	There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the parisc pdc_stable sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: linux-parisc@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Add kgdb io_module to read chars via PDC	Helge Deller	1	-0/+21
	Add a simplistic keyboard driver for usage of PDC I/O functions with kgdb. This driver makes it possible to use KGDB with QEMU. Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Fix pdc_toc_pim_11 and pdc_toc_pim_20 definitions	Helge Deller	1	-9/+23
	The definitions for pdc_toc_pim_11 and pdc_toc_pim_20 are wrong since they include an entry for a hversion field which doesn't exist in the specification. Fix this and clean up some whitespace so that the whole file will be in sync with its copy in the SeaBIOS-hppa sources. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.16
2022-01-07	parisc: Add lws_atomic_xchg and lws_atomic_store syscalls	John David Anglin	1	-1/+392
	This patch adds two new LWS routines - lws_atomic_xchg and lws_atomic_store. These are simpler than the CAS routines. Currently, we use the CAS routines for atomic stores. This is inefficient since it requires both winning the spinlock and a successful CAS operation. The change has been tested on c8000 and rp3440. In v2, I moved the code to disable/enable page faults inside the spinlocks. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Rewrite light-weight syscall and futex code	John David Anglin	3	-212/+231
	The parisc architecture lacks general hardware support for compare and swap. Particularly for userspace, it is difficult to implement software atomic support. Page faults in critical regions can cause processes to sleep and block the forward progress of other processes. Thus, it is essential that page faults be disabled in critical regions. For performance reasons, we also need to disable external interrupts in critical regions. In order to do this, we need a mechanism to trigger COW breaks outside the critical region. Fortunately, parisc has the "stbys,e" instruction. When the leftmost byte of a word is addressed, this instruction triggers all the exceptions of a normal store but it does not write to memory. Thus, we can use it to trigger COW breaks outside the critical region without modifying the data that is to be updated atomically. COW breaks occur randomly. So even if we have previously executed a "stbys,e" instruction, we still need to disable pagefaults around the critical region. If a fault occurs in the critical region, we return -EAGAIN. I had to add a wrapper around _arch_futex_atomic_op_inuser() as I found in testing that returning -EAGAIN caused problems for some processes even though it is listed as a possible return value. The patch implements the above. The code no longer attempts to sleep with interrupts disabled and I haven't seen any stalls with the change. I have attempted to merge common code and streamline the fast path. In the futex code, we only compute the spinlock address once. I eliminated some debug code in the original CAS routine that just made the flow more complicated. I don't clip the arguments when called from wide mode. As a result, the LWS routines should work when called from 64-bit processes. I defined TASK_PAGEFAULT_DISABLED offset for use in the lws_pagefault_disable and lws_pagefault_enable macros. 
Since we now disable interrupts on the gateway page where necessary, it might be possible to allow processes to be scheduled when they are on the gateway page. The change has been tested on c8000 and rp3440. It improves glibc build and test time by about 10%. In v2, I removed the lws_atomic_xchg and lws_atomic_store calls. I also removed the bug fixes that were not directly related to this patch. In v3, I removed the code to force interruptions from arch_futex_atomic_op_inuser(). It is always called with page faults disabled, so this code had no effect. In v4, I fixed a typo in the depi_safe line. In v5, I moved the code to disable/enable page faults inside the spinlocks. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Enhance page fault termination message	John David Anglin	1	-4/+10
	In debugging kernel panics, I believe it is useful to know what type of page fault caused the termination. "Bad Address" is too vague. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Don't call faulthandler_disabled() in do_page_fault()	John David Anglin	1	-3/+0
	It is dangerous to call faulthandler_disabled() when user_mode(regs) is true. The task pagefault_disabled counter is racy and it is not updated atomically on parisc. As a result, calling faulthandler_disabled() may cause erroneous termination. We now handle exception fixups and termination when user_mode(regs) is false in handle_interruption(). Thus, we can just remove the faulthandler_disabled() check from do_page_fault(). Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Switch user access functions to signal errors in r29 instead of r8	Helge Deller	2	-7/+11
	Use register r29 instead of register r8 to signal faults when accessing user memory. In case of faults, the fixup routine will store -EFAULT in this register. This change saves up to 752 bytes on a 32bit kernel, partly because the compiler doesn't need to save and restore the old r8 value on the stack. bloat-o-meter results for usage with r29 register: add/remove: 0/0 grow/shrink: 23/86 up/down: 228/-980 (-752) bloat-o-meter results for usage with r28 register: add/remove: 0/0 grow/shrink: 28/83 up/down: 296/-956 (-660) Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Avoid calling faulthandler_disabled() twice	John David Anglin	1	-1/+1
	In handle_interruption(), we call faulthandler_disabled() to check whether the fault handler is not disabled. If the fault handler is disabled, we immediately call do_page_fault(). It then calls faulthandler_disabled(). If disabled, do_page_fault() attempts to fixup the exception by jumping to no_context: no_context: if (!user_mode(regs) && fixup_exception(regs)) { return; } parisc_terminate("Bad Address (null pointer deref?)", regs, code, address); Apart from the error messages, the two blocks of code perform the same function. We can avoid two calls to faulthandler_disabled() by a simple revision to the code in handle_interruption(). Note: I didn't try to fix the formatting of this code block. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-07	parisc: Fix lpa and lpa_user defines	John David Anglin	1	-20/+24
	While working on the rewrite to the light-weight syscall and futex code, I experimented with using a hash index based on the user physical address of atomic variable. This exposed two problems with the lpa and lpa_user defines. Because of the copy instruction, the pa argument needs to be an early clobber argument. This prevents gcc from allocating the va and pa arguments to the same register. Secondly, the lpa instruction can cause a page fault so we need to catch exceptions. Signed-off-by: John David Anglin <dave.anglin@bell.net> Fixes: 116d753308cf ("parisc: Use lpa instruction to load physical addresses in driver code") Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v5.2+
2022-01-07	parisc: Define depi_safe macro	John David Anglin	1	-0/+10
	The depi instruction is similar to the extru instruction on 64-bit machines. It leaves the most-significant 32 bits of the target register in an undefined state. On 64-bit machines, the macro uses depdi to perform safe deposits in the least-significant 32 bits. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-05	parisc: decompressor: do not copy source files while building	Masahiro Yamada	5	-10/+9
	As commit 7ae4a78daacf ("ARM: 8969/1: decompressor: simplify libfdt builds") stated, copying source files during the build time may not end up with as clean code as expected. Do similar for parisc to clean up the Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de>
2022-01-02	Linux 5.16-rc8	Linus Torvalds	1	-1/+1

2022-01-02	perf top: Fix TUI exit screen refresh race condition	yaowenbin	1	-3/+5
	When the following command is executed several times, a coredump file is generated. $ timeout -k 9 5 perf top -e task-clock ***** *** ***** 0.01% [kernel] [k] __do_softirq 0.01% libpthread-2.28.so [.] __pthread_mutex_lock 0.01% [kernel] [k] __ll_sc_atomic64_sub_return double free or corruption (!prev) perf top --sort comm,dso timeout: the monitored command dumped core When we terminate "perf top" by sending a signal, SLsmg_reset_smg() is called. SLsmg_reset_smg() resets the SLsmg screen management routines by freeing all memory allocated while it was active. However, SLsmg_reinit_smg() may be called by another thread. SLsmg_reinit_smg() will free the same memory accessed by SLsmg_reset_smg(), which results in a double free. SLsmg_reinit_smg() is already called under the protection of ui__lock, so we fix the problem by adding pthread_mutex_trylock of ui__lock when calling SLsmg_reset_smg(). Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wuxu.wu@huawei.com Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.com Signed-off-by: Hewenliang <hewenliang4@huawei.com> Signed-off-by: yaowenbin <yaowenbin1@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-02	perf pmu: Fix alias events list	John Garry	1	-6/+17
	Commit 0e0ae8742207c3b4 ("perf list: Display hybrid PMU events with cpu type") changes the event list for uncore PMUs or arm64 heterogeneous CPU systems, such that duplicate aliases are incorrectly listed per PMU (which they should not be), like: # perf list ... unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] ... Notice how the events are listed twice. The named commit changed how we remove duplicate events, in that events for different PMUs are not treated as duplicates. I suppose this is to handle how "Each hybrid pmu event has been assigned with a pmu name". Fix PMU alias listing by restoring behaviour to remove duplicates for non-hybrid PMUs. Fixes: 0e0ae8742207c3b4 ("perf list: Display hybrid PMU events with cpu type") Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-31	mm: vmscan: reduce throttling due to a failure to make progress -fix	Mel Gorman	1	-1/+2
	Hugh Dickins reported the following My tmpfs swapping load (tweaked to use huge pages more heavily than in real life) is far from being a realistic load: but it was notably slowed down by your throttling mods in 5.16-rc, and this patch makes it well again - thanks. But: it very quickly hit NULL pointer until I changed that last line to if (first_pgdat) consider_reclaim_throttle(first_pgdat, sc); The likely issue is that huge pages are a major component of the test workload. When this is the case, first_pgdat may never get set if compaction is ready to continue due to this check if (IS_ENABLED(CONFIG_COMPACTION) && sc->order > PAGE_ALLOC_COSTLY_ORDER && compaction_ready(zone, sc)) { sc->compaction_ready = true; continue; } If this was true for every zone in the zonelist, first_pgdat would never get set resulting in a NULL pointer exception. Link: https://lkml.kernel.org/r/20211209095453.GM3366@techsingularity.net Fixes: 1b4e3f26f9f75 ("mm: vmscan: Reduce throttling due to a failure to make progress") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
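The failure mode and the guard can be illustrated outside the kernel. The sketch below is not the vmscan code: `struct node`, `struct zone` and `shrink_zones_sketch()` are hypothetical stand-ins, and setting a `throttled` flag stands in for the call to consider_reclaim_throttle().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node { int throttled; };                 /* stand-in for pg_data_t */
struct zone { struct node *node; bool compaction_ready; };

/* Walks the zones, remembers the node of the first zone that is not
 * skipped, and throttles on it. Returns that node, or NULL if every
 * zone was skipped because compaction was ready. */
static struct node *shrink_zones_sketch(struct zone *zones, size_t n)
{
    struct node *first = NULL;

    assert(zones != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (zones[i].compaction_ready)
            continue;               /* skipped zones never set 'first' */
        if (!first)
            first = zones[i].node;
    }

    if (first)                      /* the guard added by the fix */
        first->throttled = 1;
    return first;
}
```

Without the `if (first)` test, the case where every zone takes the `continue` path would dereference a NULL pointer, which matches the crash described in the report.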
2021-12-31	mm: vmscan: Reduce throttling due to a failure to make progress	Mel Gorman	3	-10/+59
	Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar problems due to reclaim throttling for excessive lengths of time. In Alexey's case, a memory hog that should go OOM quickly stalls for several minutes before stalling. In Mike and Darrick's cases, a small memcg environment stalled excessively even though the system had enough memory overall. Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made") introduced the problem although commit a19594ca4a8b ("mm/vmscan: increase the timeout if page reclaim is not making progress") made it worse. Systems at or near an OOM state that cannot be recovered must reach OOM quickly and memcg should kill tasks if a memcg is near OOM. To address this, only stall for the first zone in the zonelist, reduce the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if the scan control nr_reclaimed is 0, kswapd is still active and there were excessive pages pending for writeback. If kswapd has stopped reclaiming due to excessive failures, do not stall at all so that OOM triggers relatively quickly. Similarly, if an LRU is simply congested, only lightly throttle similar to NOPROGRESS. Alexey's original case was the most straightforward: for i in {1..3}; do tail /dev/zero; done On vanilla 5.16-rc1, this test stalled heavily; after the patch, the test completes in a few seconds, similar to 5.15. Alexey's second test case added watching a youtube video while tail runs 10 times. On 5.15, playback only jitters slightly, 5.16-rc1 stalls a lot with lots of frames missing and numerous audio glitches. With this patch applied, the video plays similarly to 5.15. 
[lkp@intel.com: Fix W=1 build warning] Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net Link: https://lore.kernel.org/r/20211202150614.22440-1-mgorman@techsingularity.net Link: https://linux-regtracking.leemhuis.info/regzbot/regression/20211124011954.7cab9bb4@mail.inbox.lv/ Reported-and-tested-by: Alexey Avramov <hakavlad@inbox.lv> Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Reported-and-tested-by: Darrick J. Wong <djwong@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Hugh Dickins <hughd@google.com> Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info> Fixes: 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31	mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'	SeongJae Park	1	-2/+7
	DAMON debugfs interface increases the reference counts of 'struct pid's for targets from the 'target_ids' file write callback ('dbgfs_target_ids_write()'), but decreases the counts only in DAMON monitoring termination callback ('dbgfs_before_terminate()'). Therefore, when 'target_ids' file is repeatedly written without DAMON monitoring start/termination, the reference count is not decreased and therefore memory for the 'struct pid' cannot be freed. This commit fixes this issue by decreasing the reference counts when 'target_ids' is written. Link: https://lkml.kernel.org/r/20211229124029.23348-1-sj@kernel.org Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31	userfaultfd/selftests: fix hugetlb area allocations	Mike Kravetz	1	-6/+10
	Currently, userfaultfd selftest for hugetlb as run from run_vmtests.sh or any environment where there are 'just enough' hugetlb pages will always fail with: testing events (fork, remap, remove): ERROR: UFFDIO_COPY error: -12 (errno=12, line=616) The ENOMEM error code implies there are not enough hugetlb pages. However, there are free hugetlb pages but they are all reserved. There is a basic problem with the way the test allocates hugetlb pages which has existed since the test was originally written. Due to the way 'cleanup' was done between different phases of the test, this issue was masked until recently. The issue was uncovered by commit 8ba6e8640844 ("userfaultfd/selftests: reinitialize test context in each test"). For the hugetlb test, src and dst areas are allocated as PRIVATE mappings of a hugetlb file. This means that at mmap time, pages are reserved for the src and dst areas. At the start of event testing (and other tests) the src area is populated which results in allocation of huge pages to fill the area and consumption of reserves associated with the area. Then, a child is forked to fault in the dst area. Note that the dst area was allocated in the parent and hence the parent owns the reserves associated with the mapping. The child has normal access to the dst area, but can not use the reserves created/owned by the parent. Thus, if there are no other huge pages available allocation of a page for the dst by the child will fail. Fix by not creating reserves for the dst area. In this way the child can use free (non-reserved) pages. Also, MAP_PRIVATE of a file only makes sense if you are interested in the contents of the file before making a COW copy. The test does not do this. So, just use MAP_ANONYMOUS \| MAP_HUGETLB to create an anonymous hugetlb mapping. There is no need to create a hugetlb file in the non-shared case. 
Link: https://lkml.kernel.org/r/20211217172919.7861-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31	Docs: Fixes link to I2C specification	Deep Majumder	1	-3/+5
	The link to the I2C specification is broken. Although "https://www.nxp.com" hosts Rev 7 (2021) of this specification, it is behind a login-wall. Thus, an additional link has been added (which doesn't require a login) and the NXP official docs link has been updated. Signed-off-by: Deep Majumder <deep@fastmail.in> [wsa: minor updates to text and commit message] Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-31	i2c: validate user data in compat ioctl	Pavel Skripkin	1	-0/+3
	Wrong user data may cause a warning in i2c_transfer(), e.g. zero msgs. Userspace should not be able to trigger warnings, so this patch adds validation checks for user data in the compat ioctl to prevent the reported warnings. Reported-and-tested-by: syzbot+e417648b303855b91d8a@syzkaller.appspotmail.com Fixes: 7d5cb45655f2 ("i2c compat ioctls: move to ->compat_ioctl()") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-30	Input: spaceball - fix parsing of movement data packets	Leo L. Schwab	1	-2/+9
	The spaceball.c module was not properly parsing the movement reports coming from the device. The code read axis data as signed 16-bit little-endian values starting at offset 2. In fact, axis data in Spaceball movement reports are signed 16-bit big-endian values starting at offset 3. This was determined first by visually inspecting the data packets, and later verified by consulting: http://spacemice.org/pdf/SpaceBall_2003-3003_Protocol.pdf If this ever worked properly, it was in the time before Git... Signed-off-by: Leo L. Schwab <ewhac@ewhac.org> Link: https://lore.kernel.org/r/20211221101630.1146385-1-ewhac@ewhac.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
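The corrected decoding can be sketched as follows. This is an illustration rather than the driver's code: `be16_to_s16()` and `parse_axes()` are hypothetical helpers, and the axis count of six is an assumption of the sketch; only the big-endian signed 16-bit format and the starting offset of 3 come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AXIS_OFFSET 3   /* from the commit message */
#define NUM_AXES    6   /* assumption: three translation + three rotation */

/* Decode one signed 16-bit big-endian value (high byte first). */
static int16_t be16_to_s16(const uint8_t *p)
{
    return (int16_t)(uint16_t)((p[0] << 8) | p[1]);
}

/* Fill out[] with the axis values; returns -1 if the packet is too short. */
static int parse_axes(const uint8_t *pkt, size_t len, int16_t out[NUM_AXES])
{
    assert(pkt != NULL && out != NULL);
    if (len < (size_t)(AXIS_OFFSET + 2 * NUM_AXES))
        return -1;
    for (size_t i = 0; i < NUM_AXES; i++)
        out[i] = be16_to_s16(pkt + AXIS_OFFSET + 2 * i);
    return 0;
}
```

Reading the same bytes as little-endian from offset 2 would pair the low byte of one field with a byte from a neighbouring field, which is consistent with the misparsed movement data the commit describes.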
2021-12-30	Input: appletouch - initialize work before device registration	Pavel Skripkin	1	-2/+2
	Syzbot has reported a warning in __flush_work(). This warning is caused by work->func == NULL, which means the work was never initialized. This may happen since input_dev->close() calls cancel_work_sync(&dev->work), but dev->work initialization happens _after_ the input_register_device() call. So this patch moves dev->work initialization before registering the input device. Fixes: 5a6eb676d3bc ("Input: appletouch - improve powersaving for Geyser3 devices") Reported-and-tested-by: syzbot+b88c5eae27386b252bbd@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211230141151.17300-1-paskripkin@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2021-12-30	fs/mount_setattr: always cleanup mount_kattr	Christian Brauner	1	-5/+4
	Make sure that finish_mount_kattr() is called after mount_kattr was successfully built in both the success and failure case to prevent leaking any references we took when we built it. We returned early if path lookup failed, thereby risking the leak of an additional reference we took when building mount_kattr when an idmapped mount was requested. Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 9caccd41541a ("fs: introduce MOUNT_ATTR_IDMAP") Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-30	fsl/fman: Fix missing put_device() call in fman_port_probe	Miaoqian Lin	1	-5/+7
	The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the error handling paths. Fixes: 18a6c85fcc78 ("fsl/fman: Add FMan Port Support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30	selftests: net: using ping6 for IPv6 in udpgro_fwd.sh	Jianguo Wu	1	-1/+3
	udpgro_fwd.sh outputs the following message: ping: 2001:db8:1::100: Address family for hostname not supported. Use ping6 when pinging IPv6 addresses. Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30	Documentation: fix outdated interpretation of ip_no_pmtu_disc	xu xin	1	-2/+4
	The way the PMTU is updated has changed, but the documentation still describes the old behaviour. This patch updates the interpretation of ip_no_pmtu_disc and min_pmtu. See commit 28d35bcdd3925 ("net: ipv4: don't let PMTU updates increase route MTU") Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-29	net/ncsi: check for error return from call to nla_put_u32	Jiasheng Jiang	1	-1/+5
	As the comment on nla_put() notes, it can return -EMSGSIZE if the tailroom of the skb is insufficient. Therefore, it is better to check the return value of nla_put_u32() and return the error code if an error occurs. Also, many other functions have the same problem, and if this patch is correct, I will commit a new version to fix them all. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Link: https://lore.kernel.org/r/20211229032118.1706294-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29	net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper	Nikolay Aleksandrov	1	-3/+3
	We need to first check if the context is a vlan one, then we need to check the global bridge multicast vlan snooping flag, and finally the vlan's multicast flag, otherwise we will unnecessarily enable vlan mcast processing (e.g. querier timers). Fixes: 7b54aaaf53cb ("net: bridge: multicast: add vlan state initialization and control") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20211228153142.536969-1-nikolay@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
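The order of checks described above can be sketched with plain booleans. This is one plausible reading of the commit message, not the bridge code itself: `struct mcast_ctx` and its three fields are hypothetical stand-ins for the context type, the bridge-wide vlan snooping option and the per-vlan multicast flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of a multicast context. */
struct mcast_ctx {
    bool is_vlan;               /* is this a per-vlan context at all? */
    bool bridge_vlan_snooping;  /* global bridge mcast vlan snooping flag */
    bool vlan_mcast_enabled;    /* the vlan's own multicast flag */
};

/* A vlan context is globally disabled if either the bridge-wide snooping
 * flag or the vlan's flag is off. A non-vlan context is never reported
 * as disabled by this helper, so the vlan check comes first. */
static bool ctx_vlan_global_disabled(const struct mcast_ctx *c)
{
    assert(c != NULL);
    return c->is_vlan &&
           (!c->bridge_vlan_snooping || !c->vlan_mcast_enabled);
}
```

Testing only the vlan's flag would treat a vlan context as enabled even when bridge-wide vlan snooping is off, which is the kind of unnecessary processing (e.g. querier timers) the commit message mentions.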
2021-12-29	net: fix use-after-free in tw_timer_handler	Muchun Song	1	-6/+4
	A real-world panic was found in Linux 5.4, as follows. BUG: unable to handle page fault for address: ffffde49a863de28 PGD 7e6fe62067 P4D 7e6fe62067 PUD 7e6fe63067 PMD f51e064067 PTE 0 RIP: 0010:tw_timer_handler+0x20/0x40 Call Trace: <IRQ> call_timer_fn+0x2b/0x120 run_timer_softirq+0x1ef/0x450 __do_softirq+0x10d/0x2b8 irq_exit+0xc7/0xd0 smp_apic_timer_interrupt+0x68/0x120 apic_timer_interrupt+0xf/0x20 This issue has also been reported since 2017 in the thread [1]; unfortunately, it can still be reproduced after fixing DCCP. The ipv4_mib_exit_net is called before tcp_sk_exit_batch when a net namespace is destroyed since tcp_sk_ops is registered before ipv4_mib_ops, which means tcp_sk_ops is in front of ipv4_mib_ops in the pernet_list. There will be a use-after-free on net->mib.net_statistics in tw_timer_handler after ipv4_mib_exit_net if there are some inflight time-wait timers. This bug was not introduced by commit f2bf415cfed7 ("mib: add net to NET_ADD_STATS_BH") since at that point net_statistics was a global variable rather than dynamically allocated and freed. Actually, commit 61a7e26028b9 ("mib: put net statistics on struct net") introduced the bug since it put net statistics on struct net and frees them when the net namespace is destroyed. Move init_ipv4_mibs() before tcp_init() to fix this bug, and replace pr_crit() with panic() since continuing is meaningless when init_ipv4_mibs() fails. [1] https://groups.google.com/g/syzkaller/c/p1tn-_Kc6l4/m/smuL_FMAAgAJ?pli=1 Fixes: 61a7e26028b9 ("mib: put net statistics on struct net") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Cong Wang <cong.wang@bytedance.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211228104145.9426-1-songmuchun@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29	selftests: net: Fix a typo in udpgro_fwd.sh	Jianguo Wu	1	-1/+1
	$rvs -> $rcv Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Link: https://lore.kernel.org/r/d247d7c8-a03a-0abf-3c71-4006a051d133@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29	selftests/net: udpgso_bench_tx: fix dst ip argument	wujianguo	1	-1/+7
	udpgso_bench_tx calls setup_sockaddr() for the destination address before parsing all arguments. If we specify "-p ${dst_port}" after "-D ${dst_ip}", then ${dst_port} is ignored and the default cfg_port 8000 is used. This causes the test case "multiple GRO socks" in udpgro.sh to fail. Set up the sockaddr after parsing all arguments. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/ff620d9f-5b52-06ab-5286-44b945453002@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29	x86/build: Use the proper name CONFIG_FW_LOADER	Lukas Bulwahn	1	-1/+1
	Commit in Fixes intends to add the expression regex only when FW_LOADER is enabled - not FW_LOADER_BUILTIN. Latter is a leftover from a previous patchset and not a valid config item. So, adjust the condition to the actual name of the config. [ bp: Cleanup commit message. ] Fixes: c8dcf655ec81 ("x86/build: Tuck away built-in firmware under FW_LOADER") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20211229111553.5846-1-lukas.bulwahn@gmail.com
2021-12-29	net: bridge: mcast: add and enforce startup query interval minimum	Nikolay Aleksandrov	5	-3/+22
	As reported[1] if startup query interval is set too low in combination with large number of startup queries and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the admin know that the startup interval has been set to the minimum. It doesn't make sense to make the startup interval lower than the normal query interval so use the same value of 1 second. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29	net: bridge: mcast: add and enforce query interval minimum	Nikolay Aleksandrov	5	-3/+22
	As reported[1] if query interval is set too low and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the administrator know that the interval has been set to the minimum. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
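The clamp-instead-of-reject behaviour described in the two bridge commits above can be sketched in user-space C. This is only an illustration, not the bridge code: the function name `set_query_interval`, the constant `QUERY_INTVL_MIN_MS`, and the millisecond unit are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical 1-second floor, expressed in milliseconds for this sketch. */
#define QUERY_INTVL_MIN_MS 1000UL

/*
 * Store a user-supplied interval. A value below the minimum is raised to
 * the minimum instead of being rejected with an error, so existing
 * user-space callers keep working. A message is logged when that happens.
 * Returns 1 if the value was clamped, 0 otherwise.
 */
static int set_query_interval(unsigned long *stored, unsigned long requested_ms)
{
	if (requested_ms < QUERY_INTVL_MIN_MS) {
		fprintf(stderr, "interval %lu ms too low, using minimum %lu ms\n",
			requested_ms, QUERY_INTVL_MIN_MS);
		*stored = QUERY_INTVL_MIN_MS;
		return 1;
	}
	*stored = requested_ms;
	return 0;
}
```

Calling `set_query_interval(&v, 10)` in this sketch stores 1000 and reports the clamp, while a request of 5000 is stored unchanged.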
2021-12-29	ipv6: raw: check passed optlen before reading	Tamir Duberstein	1	-0/+3
	Add a check that the user-provided option is at least as long as the number of bytes we intend to read. Before this patch we would blindly read sizeof(int) bytes even in cases where the user passed optlen<sizeof(int), which would potentially read garbage or fault. Discovered by new tests in https://github.com/google/gvisor/pull/6957 . The original get_user call predates history in the git repo. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20211229200947.2862255-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
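A minimal user-space sketch of the length check described above, assuming a handler that reads one int-sized option. The function name `read_int_option` and the choice of `-EINVAL` as the error are assumptions for illustration, not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an option handler that reads an int.
 * The length is validated before optval is touched, so a caller that
 * passes fewer than sizeof(int) bytes gets an error instead of a read
 * past the end of its buffer.
 */
static int read_int_option(const void *optval, size_t optlen, int *out)
{
	if (optlen < sizeof(int))
		return -EINVAL;

	memcpy(out, optval, sizeof(int));
	return 0;
}
```

With this ordering, a one-byte buffer is rejected and `*out` is left untouched.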
2021-12-29	xsk: Initialise xskb free_list_node	Ciara Loftus	1	-0/+1
	This commit initialises the xskb's free_list_node when the xskb is allocated. This prevents a potential false negative returned from a call to list_empty for that node, such as the one introduced in commit 199d983bc015 ("xsk: Fix crash on double free in buffer pool"). In my environment this issue caused packets to not be received by the xdpsock application if the traffic was running prior to application launch. This happened when the first batch of packets failed the xskmap lookup and XDP_PASS was returned from the bpf program. This action is handled in the i40e zc driver (and others) by allocating an skbuff, freeing the xdp_buff and adding the associated xskb to the xsk_buff_pool's free_list if it hadn't been added already. Without this fix, the xskb is not added to the free_list because the check to determine if it was added already wrongly reports that it was. Later, this caused allocation errors in the driver and the failure to receive packets. Fixes: 199d983bc015 ("xsk: Fix crash on double free in buffer pool") Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20211220155250.2746-1-ciara.loftus@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
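Why an uninitialised list node defeats an emptiness check can be shown with a small user-space sketch. The `list_node`, `list_init`, `list_is_empty`, `buf`, and `buf_alloc` names below are simplified, hypothetical stand-ins written for this example; they only mimic the convention that an empty node points at itself.

```c
#include <stdlib.h>

/* Simplified doubly linked list node; an empty node points at itself. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static int list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Hypothetical buffer with an embedded free-list node. */
struct buf {
	struct list_node free_list_node;
};

/*
 * Allocate a buffer. When init_node is 0 the node is left with NULL
 * pointers, modelling a zeroed allocation with no list initialisation.
 */
static struct buf *buf_alloc(int init_node)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->free_list_node.next = NULL;
	b->free_list_node.prev = NULL;
	if (init_node)
		list_init(&b->free_list_node);
	return b;
}
```

In this sketch a buffer from `buf_alloc(0)` fails `list_is_empty()` even though it was never added to any list, so code that only adds nodes that test as empty would skip it. A buffer from `buf_alloc(1)` tests as empty, as expected.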
2021-12-28	net/mlx5e: Fix wrong features assignment in case of error	Gal Pressman	1	-6/+5
	In case of an error in mlx5e_set_features(), 'netdev->features' must be updated with the correct state of the device to indicate which features were updated successfully. To do that we maintain a copy of 'netdev->features' and update it after successful feature changes, so we can assign it back to 'netdev->features' if needed. However, since not all netdev features are handled by the driver (e.g. GRO/TSO/etc), some features may not be updated correctly in case of an error updating another feature. For example, while requesting to disable TSO (feature which is not handled by the driver) and enable HW-GRO, if an error occurs during HW-GRO enable, 'oper_features' will be assigned with 'netdev->features' and HW-GRO turned off. TSO will remain enabled in such case, which is a bug. To solve that, instead of using 'netdev->features' as the baseline of 'oper_features' and changing it on set feature success, use 'features' instead and update it in case of errors. Fixes: 75b81ce719b7 ("net/mlx5e: Don't override netdev features field unless in error flow") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
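The baseline change described above can be sketched with plain bitmasks. This is an illustrative model, not the mlx5e code: the flags `F_TSO` and `F_HW_GRO`, the function `set_features`, and the `hw_gro_fails` switch that simulates a driver error are all assumptions made for the example.

```c
typedef unsigned long features_t;

#define F_TSO    (1UL << 0)  /* not handled by the driver in this sketch */
#define F_HW_GRO (1UL << 1)  /* handled by the driver in this sketch */

/*
 * Compute the feature set to report after a change request.
 * The baseline is the requested set ('wanted'), so features the driver
 * does not handle are taken over as requested. Only a driver-handled
 * feature whose update failed is reverted to its current state.
 */
static features_t set_features(features_t current, features_t wanted,
			       int hw_gro_fails)
{
	features_t oper = wanted;

	if ((current ^ wanted) & F_HW_GRO) {
		if (hw_gro_fails)
			oper = (oper & ~F_HW_GRO) | (current & F_HW_GRO);
	}
	return oper;
}
```

For the example in the commit message (TSO currently on, request to turn TSO off and HW-GRO on, HW-GRO enable fails), this sketch returns a set with both bits clear: TSO is disabled as requested and HW-GRO stays off.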
2021-12-28	net/mlx5e: TC, Fix memory leak with rules with internal port	Roi Dayan	1	-0/+2
	Fix a memory leak with decap rule with internal port as destination device. The driver allocates a modify hdr action but doesn't set the flow attr modify hdr action which results in skipping releasing the modify hdr action when releasing the flow. backtrace: [<000000005f8c651c>] krealloc+0x83/0xd0 [<000000009f59b143>] alloc_mod_hdr_actions+0x156/0x310 [mlx5_core] [<000000002257f342>] mlx5e_tc_match_to_reg_set_and_get_id+0x12a/0x360 [mlx5_core] [<00000000b44ea75a>] mlx5e_tc_add_fdb_flow+0x962/0x1470 [mlx5_core] [<0000000003e384a0>] __mlx5e_add_fdb_flow+0x54c/0xb90 [mlx5_core] [<00000000ed8b22b6>] mlx5e_configure_flower+0xe45/0x4af0 [mlx5_core] [<00000000024f4ab5>] mlx5e_rep_indr_offload.isra.0+0xfe/0x1b0 [mlx5_core] [<000000006c3bb494>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core] [<00000000d3dac2ea>] tc_setup_cb_add+0x1d2/0x420 Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-28	ionic: Initialize the 'lif->dbid_inuse' bitmap	Christophe JAILLET	1	-1/+1
	When allocated, this bitmap is not initialized. Only the first bit is set a few lines below. Use bitmap_zalloc() to make sure that it is cleared before being used. Fixes: 6461b446f2a0 ("ionic: Add interrupts and doorbells") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Shannon Nelson <snelson@pensando.io> Link: https://lore.kernel.org/r/6a478eae0b5e6c63774e1f0ddb1a3f8c38fa8ade.1640527506.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28	drm/amd/display: Changed pipe split policy to allow for multi-display pipe split	Angus Wang	8	-8/+8
	[WHY] Current implementation of pipe split policy prevents pipe split with multiple displays connected, which caused the MCLK speed to be stuck at max [HOW] Changed the pipe split policies so that pipe split is allowed for multi-display configurations Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1522 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1709 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1655 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1403 Note this is a backport of this commit from amdgpu drm-next for 5.16. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Angus Wang <angus.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-12-28	drm/amd/display: Fix USB4 null pointer dereference in update_psp_stream_config	Nicholas Kazlauskas	1	-4/+1
	[Why] A porting error on a previous patch left the block of code that causes the crash from a NULL pointer dereference. More specifically, we try to access link_enc before it's assigned in the USB4 case in the following assignment: config.dio_output_idx = link_enc->transmitter - TRANSMITTER_UNIPHY_A; [How] That assignment occurs later depending on the ASIC version. It's only needed on DCN31 and only after link_enc is already assigned. Fixes: 986430446c917b ("drm/amd/display: fix a crash on USB4 over C20 PHY") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28	drm/amd/display: Set optimize_pwr_state for DCN31	Nicholas Kazlauskas	1	-0/+1
	[Why] We'll exit optimized power state to do link detection but we won't enter back into the optimized power state. This could potentially block s2idle entry depending on the sequencing, but it also means we're losing some power during the transition period. [How] Hook up the handler like DCN21. It was also missed like the exit_optimized_pwr_state callback. Fixes: 64b1d0e8d500 ("drm/amd/display: Add DCN3.1 HWSEQ") Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Eric Yang <Eric.Yang2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28	drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization	Nicholas Kazlauskas	1	-0/+1
	[Why] Otherwise SMU won't mark Display as idle when trying to perform s2idle. [How] Mark the bit in the dcn31 codepath, doesn't apply to older ASIC. It needed to be split from phy refclk off to prevent entering s2idle when PSR was engaged but driver was not ready. Fixes: 118a33151658 ("drm/amd/display: Add DCN3.1 clock manager support") Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Eric Yang <Eric.Yang2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28	drm/amd/display: Added power down for DCN10	Lai, Derek	1	-0/+1
	[Why] The change of setting a timer callback on boot for 10 seconds is still working, just lacked power down for DCN10. [How] Added power down for DCN10. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Derek Lai <Derek.Lai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28	drm/amd/display: fix B0 TMDS deepcolor no dislay issue	Charlene Liu	2	-2/+54
	[why] On B0, PHY C maps to F and D maps to G. The driver uses the logical instance and dmub does the remap, but the driver still needs to use the right PHY instance to access the right HW. [how] Use the physical instance when programming PHY registers. [note] resync_control programming could move to dmub next. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>