Age | Commit message (Collapse) | Author | Files | Lines |
|
Some devices send (duplicate) tablet-mode events when moved around even
though the mode has not changed; and they do this even when suspended.
Change the tablet-mode event handling when priv->wakeup_mode is set to
update the switch state in case it changed and then return immediately
(without calling pm_wakeup_hard_event()) to avoid spurious wakeups.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212537
Fixes: 537b0dd4729e ("platform/x86: intel-hid: Add support for SW_TABLET_MODE")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Elia Devito <eliadevito@gmail.com>
Link: https://lore.kernel.org/r/20210404143831.25173-1-hdegoede@redhat.com
|
|
Initialize them in set_nameidata() and make sure that terminate_walk() clears them
once the pointers become potentially invalid (i.e. we leave RCU mode or drop them
in non-RCU one). Currently we have "path_init() always initializes them and nobody
accesses them outside of path_init()/terminate_walk() segments", which is asking
for trouble.
With that change we would have nd->path.{mnt,dentry}
1) always valid - NULL or pointing to currently allocated objects.
2) non-NULL while we are successfully walking
3) NULL when we are not walking at all
4) contributing to refcounts whenever non-NULL outside of RCU mode.
Fixes: 6c6ec2b0a3e0 ("fs: add support for LOOKUP_CACHED")
Reported-by: syzbot+c88a7030da47945a3cc3@syzkaller.appspotmail.com
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
with some additional cleanups by Helge.
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
commit b344d6a83d01 ("parisc: add support for cmpxchg on u8 pointers")
can generate a sparse warning ("cast truncates bits from constant
value"), which has been reported several times [1] [2] [3].
The original code worked as expected, but anyway, let silence such
sparse warning as what others did [4].
[1] https://lore.kernel.org/r/202104061220.nRMBwCXw-lkp@intel.com
[2] https://lore.kernel.org/r/202012291914.T5Agcn99-lkp@intel.com
[3] https://lore.kernel.org/r/202008210829.KVwn7Xeh%25lkp@intel.com
[4] https://lore.kernel.org/r/20210315131512.133720-2-jacopo+renesas@jmondi.org
Cc: Liam Beguin <liambeguin@gmail.com>
Cc: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Add a dependency to the SBA IOMMU driver to avoid:
ERROR: modpost: "sba_list" [drivers/char/agp/parisc-agp.ko] undefined!
Reported-by: kernel test robot <lkp@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
struct task_struct is declared twice. One has been declared
at 154th line. Remove the duplicate.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
|
|
For each device, the nosy driver allocates a pcilynx structure.
A use-after-free might happen in the following scenario:
1. Open nosy device for the first time and call ioctl with command
NOSY_IOC_START, then a new client A will be malloced and added to
doubly linked list.
2. Open nosy device for the second time and call ioctl with command
NOSY_IOC_START, then a new client B will be malloced and added to
doubly linked list.
3. Call ioctl with command NOSY_IOC_START for client A, then client A
will be readded to the doubly linked list. Now the doubly linked
list is messed up.
4. Close the first nosy device and nosy_release will be called. In
nosy_release, client A will be unlinked and freed.
5. Close the second nosy device, and client A will be referenced,
resulting in UAF.
The root cause of this bug is that the element in the doubly linked list
is reentered into the list.
Fix this bug by adding a check before inserting a client. If a client
is already in the linked list, don't insert it.
The following KASAN report reveals it:
BUG: KASAN: use-after-free in nosy_release+0x1ea/0x210
Write of size 8 at addr ffff888102ad7360 by task poc
CPU: 3 PID: 337 Comm: poc Not tainted 5.12.0-rc5+ #6
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
nosy_release+0x1ea/0x210
__fput+0x1e2/0x840
task_work_run+0xe8/0x180
exit_to_user_mode_prepare+0x114/0x120
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Allocated by task 337:
nosy_open+0x154/0x4d0
misc_open+0x2ec/0x410
chrdev_open+0x20d/0x5a0
do_dentry_open+0x40f/0xe80
path_openat+0x1cf9/0x37b0
do_filp_open+0x16d/0x390
do_sys_openat2+0x11d/0x360
__x64_sys_open+0xfd/0x1a0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 337:
kfree+0x8f/0x210
nosy_release+0x158/0x210
__fput+0x1e2/0x840
task_work_run+0xe8/0x180
exit_to_user_mode_prepare+0x114/0x120
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff888102ad7300 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 96 bytes inside of 128-byte region [ffff888102ad7300, ffff888102ad7380)
[ Modified to use 'list_empty()' inside proper lock - Linus ]
Link: https://lore.kernel.org/lkml/1617433116-5930-1-git-send-email-zheyuma97@gmail.com/
Reported-and-tested-by: 马哲宇 (Zheyu Ma) <zheyuma97@gmail.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
84;0;0c84;0;0c
There are two workqueue-specific watchdog timestamps:
+ @wq_watchdog_touched_cpu (per-CPU) updated by
touch_softlockup_watchdog()
+ @wq_watchdog_touched (global) updated by
touch_all_softlockup_watchdogs()
watchdog_timer_fn() checks only the global @wq_watchdog_touched for
unbound workqueues. As a result, unbound workqueues are not aware
of touch_softlockup_watchdog(). The watchdog might report a stall
even when the unbound workqueues are blocked by a known slow code.
Solution:
touch_softlockup_watchdog() must touch also the global @wq_watchdog_touched
timestamp.
The global timestamp can no longer be used for bound workqueues because
it is now updated from all CPUs. Instead, bound workqueues have to check
only @wq_watchdog_touched_cpu and these timestamps have to be updated for
all CPUs in touch_all_softlockup_watchdogs().
Beware:
The change might cause the opposite problem. An unbound workqueue
might get blocked on CPU A because of a real softlockup. The workqueue
watchdog would miss it when the timestamp got touched on CPU B.
It is acceptable because softlockups are detected by softlockup
watchdog. The workqueue watchdog is there to detect stalls where
a work never finishes, for example, because of dependencies of works
queued into the same workqueue.
V3:
- Modify the commit message clearly according to Petr's suggestion.
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The debug_work_activate() is called on the premise that
the work can be inserted, because if wq be in WQ_DRAINING
status, insert work may be failed.
Fixes: e41e704bc4f4 ("workqueue: improve destroy_workqueue() debuggability")
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The header file <linux/errno.h> is already included above and can be
removed here.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Mateusz Holenko <mholenko@antmicro.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
|
|
kernel test robot correctly pinpoints a compilation failure if
CONFIG_BLOCK isn't set:
fs/io_uring.c: In function '__io_complete_rw':
>> fs/io_uring.c:2509:48: error: implicit declaration of function 'io_rw_should_reissue'; did you mean 'io_rw_reissue'? [-Werror=implicit-function-declaration]
2509 | if ((res == -EAGAIN || res == -EOPNOTSUPP) && io_rw_should_reissue(req)) {
| ^~~~~~~~~~~~~~~~~~~~
| io_rw_reissue
cc1: some warnings being treated as errors
Ensure that we have a stub declaration of io_rw_should_reissue() for
!CONFIG_BLOCK.
Fixes: 230d50d448ac ("io_uring: move reissue into regular IO path")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use the `marvell,reg-init` DT property to configure the LED[2]/INTn pin
of the Marvell 88E1514 ethernet PHY on Turris Omnia into interrupt mode.
Without this the pin is by default in LED[2] mode, and the Marvell PHY
driver configures LED[2] into "On - Link, Blink - Activity" mode.
This fixes the issue where the pca9538 GPIO/interrupt controller (which
can't mask interrupts in HW) received too many interrupts and after a
time started ignoring the interrupt with error message:
IRQ 71: nobody cared
There is a work in progress to have the Marvell PHY driver support
parsing PHY LED nodes from OF and registering the LEDs as Linux LED
class devices. Once this is done the PHY driver can also automatically
set the pin into INTn mode if it does not find LED[2] in OF.
Until then, though, we fix this via `marvell,reg-init` DT property.
Signed-off-by: Marek Behún <kabel@kernel.org>
Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
Fixes: 26ca8b52d6e1 ("ARM: dts: add support for Turris Omnia")
Cc: Uwe Kleine-König <uwe@kleine-koenig.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Gregory CLEMENT <gregory.clement@bootlin.com>
Cc: <stable@vger.kernel.org>
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
|
|
Hardware buffer management has never worked on the Turris Omnia, as the
required MBus window hadn't been reserved. Fix thusly.
Fixes: 018b88eee1a2 ("ARM: dts: turris-omnia: enable HW buffer management")
Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Tested-by: Klaus Kudielka <klaus.kudielka@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
|
|
The driver part of this support was not merged which leads to break
AHCI on all Marvell Armada 7k8k / CN913x platforms as it was reported
by Marcin Wojtas.
So for now let's remove it in order to fix the issue waiting for the
driver part really be merged.
This reverts commit 53e950d597e3578da84238b86424bfcc9e101d87.
Fixes: 53e950d597e3 ("arm64: dts: marvell: armada-cp110: Switch to per-port SATA interrupts")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The big top of the file comment talk about grand plans that never
happened, so remove them to not confuse the readers. Also mark the
devname and volname fields as ignored as they were never used by the
kernel.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
It's non-obvious how retry is done for block backed files, when it happens
off the kiocb done path. It also makes it tricky to deal with the iov_iter
handling.
Just mark the req as needing a reissue, and handling it from the
submission path instead. This makes it directly obvious that we're not
re-importing the iovec from userspace past the submit point, and it means
that we can just reuse our usual -EAGAIN retry path from the read/write
handling.
At some point in the future, we'll gain the ability to always reliably
return -EAGAIN through the stack. A previous attempt on the block side
didn't pan out and got reverted, hence the need to check for this
information out-of-band right now.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If IOCB_NOWAIT is set on submission, then that needs to get propagated to
REQ_NOWAIT on the block side. Otherwise we completely lose this
information, and any issuer of IOCB_NOWAIT IO will potentially end up
blocking on eg request allocation on the storage side.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
syzbot reported a bug when putting the last reference to a tasks file
descriptor table. Debugging this showed we didn't recalculate the
current maximum fd number for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC
after we unshared the file descriptors table. So max_fd could exceed the
current fdtable maximum causing us to set excessive bits. As a concrete
example, let's say the user requested everything from fd 4 to ~0UL to be
closed and their current fdtable size is 256 with their highest open fd
being 4. With CLOSE_RANGE_UNSHARE the caller will end up with a new
fdtable which has room for 64 file descriptors since that is the lowest
fdtable size we accept. But now max_fd will still point to 255 and needs
to be adjusted. Fix this by retrieving the correct maximum fd value in
__range_cloexec().
Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com
Fixes: 582f1fb6b721 ("fs, close_range: add flag CLOSE_RANGE_CLOEXEC")
Fixes: fec8a6a69103 ("close_range: unshare all fds for CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
NUMA is useless when NOMMU, and it leads some build error,
make it depend on MMU.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
Eliminate the following coccicheck warning:
./arch/riscv/mm/kasan_init.c:219:2-3: Unneeded semicolon
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
In RV64, the size of each entry in excp_vect_table is 8 bytes. If the
base of the table is not 8-byte aligned, loading an entry in the table
will raise a misaligned exception. Although such exception will be
handled by opensbi/bbl, this still causes performance degradation.
Signed-off-by: Zihao Yu <yuzihao@ict.ac.cn>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
The <asm/uaccess.h> header has a problem with put_user(a, ptr) if
the 'a' is not a simple variable, such as a function. This can lead
to the compiler producing code as so:
1: enable_user_access()
2: evaluate 'a' into register 'r'
3: put 'r' to 'ptr'
4: disable_user_acess()
The issue is that 'a' is now being evaluated with the user memory
protections disabled. So we try and force the evaulation by assigning
'x' to __val at the start, and hoping the compiler barriers in
enable_user_access() do the job of ordering step 2 before step 1.
This has shown up in a bug where 'a' sleeps and thus schedules out
and loses the SR_SUM flag. This isn't sufficient to fully fix, but
should reduce the window of opportunity. The first instance of this
we found is in scheudle_tail() where the code does:
$ less -N kernel/sched/core.c
4263 if (current->set_child_tid)
4264 put_user(task_pid_vnr(current), current->set_child_tid);
Here, the task_pid_vnr(current) is called within the block that has
enabled the user memory access. This can be made worse with KASAN
which makes task_pid_vnr() a rather large call with plenty of
opportunity to sleep.
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
Suggested-by: Arnd Bergman <arnd@arndb.de>
--
Changes since v1:
- fixed formatting and updated the patch description with more info
Changes since v2:
- fixed commenting on __put_user() (schwab@linux-m68k.org)
Change since v3:
- fixed RFC in patch title. Should be ready to merge.
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
The const annotation should not be used for 'sp', or it will
become read only and lead to bad stack output.
Fixes: dec822771b01 ("riscv: stacktrace: Move register keyword to beginning of declaration")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
Merge module sections only when using Clang LTO. With ld.bfd, merging
sections does not appear to update the symbol tables for the module,
e.g. 'readelf -s' shows the value that a symbol would have had, if
sections were not merged. ld.lld does not show this problem.
The stale symbol table breaks gdb's function disassembler, and presumably
other things, e.g.
gdb -batch -ex "file arch/x86/kvm/kvm.ko" -ex "disassemble kvm_init"
reads the wrong bytes and dumps garbage.
Fixes: dd2776222abb ("kbuild: lto: merge module sections")
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210322234438.502582-1-seanjc@google.com
|
|
Commit cbc3b92ce037 fixed an issue to modify the macros of the stack trace
event so that user space could parse it properly. Originally the stack
trace format to user space showed that the called stack was a dynamic
array. But it is not actually a dynamic array, in the way that other
dynamic event arrays worked, and this broke user space parsing for it. The
update was to make the array look to have 8 entries in it. Helper
functions were added to make it parse it correctly, as the stack was
dynamic, but was determined by the size of the event stored.
Although this fixed user space on how it read the event, it changed the
internal structure used for the stack trace event. It changed the array
size from [0] to [8] (added 8 entries). This increased the size of the
stack trace event by 8 words. The size reserved on the ring buffer was the
size of the stack trace event plus the number of stack entries found in
the stack trace. That commit caused the amount to be 8 more than what was
needed because it did not expect the caller field to have any size. This
produced 8 entries of garbage (and reading random data) from the stack
trace event:
<idle>-0 [002] d... 1976396.837549: <stack trace>
=> trace_event_raw_event_sched_switch
=> __traceiter_sched_switch
=> __schedule
=> schedule_idle
=> do_idle
=> cpu_startup_entry
=> secondary_startup_64_no_verify
=> 0xc8c5e150ffff93de
=> 0xffff93de
=> 0
=> 0
=> 0xc8c5e17800000000
=> 0x1f30affff93de
=> 0x00000004
=> 0x200000000
Instead, subtract the size of the caller field from the size of the event
to make sure that only the amount needed to store the stack trace is
reserved.
Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/
Cc: stable@vger.kernel.org
Fixes: cbc3b92ce037 ("tracing: Set kernel_stack's caller size properly")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
|
|
iov_iter_revert() is done in completion handlers that happensf before
read/write returns -EIOCBQUEUED, no need to repeat reverting afterwards.
Moreover, even though it may appear being just a no-op, it's actually
races with 1) user forging a new iovec of a different size 2) reissue,
that is done via io-wq continues completely asynchronously.
Fixes: 3e6a0d3c7571c ("io_uring: fix -EAGAIN retry with IOPOLL")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
task_pid may be large enough to not fit into the left space of
TASK_COMM_LEN-sized buffers and overflow in sprintf. We not so care
about uniqueness, so replace it with safer snprintf().
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1702c6145d7e1c46fbc382f28334c02e1a3d3994.1617267273.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
S_ISBLK is marked as unbounded work for async preparation, because it
doesn't match S_ISREG. That is incorrect, as any read/write to a block
device is also a bounded operation. Fix it up and ensure that S_ISBLK
isn't marked unbounded.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Clang warns about the comparison when using a 32-bit phys_addr_t:
drivers/bus/mvebu-mbus.c:621:17: error: result of comparison of constant 4294967296 with expression of type 'phys_addr_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (reg_start >= 0x100000000ULL)
Add a cast to shut up the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20210323131952.2835509-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The default initializer at the start of the array causes a warning
when building with W=1:
In file included from arch/arm/mach-pxa/mainstone.c:47:
arch/arm/mach-pxa/mainstone.h:124:33: error: initialized field overwritten [-Werror=override-init]
124 | #define MAINSTONE_IRQ(x) (MAINSTONE_NR_IRQS + (x))
| ^
arch/arm/mach-pxa/mainstone.h:133:33: note: in expansion of macro 'MAINSTONE_IRQ'
133 | #define MAINSTONE_S0_CD_IRQ MAINSTONE_IRQ(9)
| ^~~~~~~~~~~~~
arch/arm/mach-pxa/mainstone.c:506:15: note: in expansion of macro 'MAINSTONE_S0_CD_IRQ'
506 | [5] = MAINSTONE_S0_CD_IRQ,
| ^~~~~~~~~~~~~~~~~~~
Rework the initializer to list each element explicitly and only once.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210323130849.2362001-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The clang integrated assembler fails to build one file with
a complex asm instruction:
arch/arm/mach-omap1/ams-delta-fiq-handler.S:249:2: error: invalid instruction, any one of the following would fix this:
mov r10, #(1 << (((NR_IRQS_LEGACY + 12) - NR_IRQS_LEGACY) % 32)) @ set deferred_fiq bit
^
arch/arm/mach-omap1/ams-delta-fiq-handler.S:249:2: note: instruction requires: armv6t2
mov r10, #(1 << (((NR_IRQS_LEGACY + 12) - NR_IRQS_LEGACY) % 32)) @ set deferred_fiq bit
^
arch/arm/mach-omap1/ams-delta-fiq-handler.S:249:2: note: instruction requires: thumb2
mov r10, #(1 << (((NR_IRQS_LEGACY + 12) - NR_IRQS_LEGACY) % 32)) @ set deferred_fiq bit
^
The problem is that 'NR_IRQS_LEGACY' is not defined here. Apparently
gas does not care because we first add and then subtract this number,
leading to the immediate value to be the same regardless of the
specific definition of NR_IRQS_LEGACY.
Neither the way that 'gas' just silently builds this file, nor the
way that clang IAS makes nonsensical suggestions for how to fix it
is great. Fortunately there is an easy fix, which is to #include
the header that contains the definition.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210308153430.2530616-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
When building with W=1, gcc points out that the __packed attribute
on struct qm_eqcr_entry conflicts with the 8-byte alignment
attribute on struct qm_fd inside it:
drivers/soc/fsl/qbman/qman.c:189:1: error: alignment 1 of 'struct qm_eqcr_entry' is less than 8 [-Werror=packed-not-aligned]
I assume that the alignment attribute is the correct one, and
that qm_eqcr_entry cannot actually be unaligned in memory,
so add the same alignment on the outer struct.
Fixes: c535e923bb97 ("soc/fsl: Introduce DPAA 1.x QMan device driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210323131530.2619900-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
clang warns about an impossible condition when building with 32-bit
phys_addr_t:
arch/arm/mach-keystone/keystone.c:79:16: error: result of comparison of constant 51539607551 with expression of type 'phys_addr_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
mem_end > KEYSTONE_HIGH_PHYS_END) {
~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~
arch/arm/mach-keystone/keystone.c:78:16: error: result of comparison of constant 34359738368 with expression of type 'phys_addr_t' (aka 'unsigned int') is always true [-Werror,-Wtautological-constant-out-of-range-compare]
if (mem_start < KEYSTONE_HIGH_PHYS_START ||
~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~
Change the temporary variable to a fixed-size u64 to avoid the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Link: https://lore.kernel.org/r/20210323131814.2751750-1-arnd@kernel.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Memory backed or zoned null block devices may generate actual request
timeout errors due to the submission path being blocked on memory
allocation or zone locking. Unlike fake timeouts or injected timeouts,
the request submission path will call blk_mq_complete_request() or
blk_mq_end_request() for these real timeout errors, causing a double
completion and use after free situation as the block layer timeout
handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in
blk_mq_check_expired(). This problem often triggers a NULL pointer
dereference such as:
BUG: kernel NULL pointer dereference, address: 0000000000000050
RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20
...
Call Trace:
dd_finish_request+0x56/0x80
blk_mq_free_request+0x37/0x130
null_handle_cmd+0xbf/0x250 [null_blk]
? null_queue_rq+0x67/0xd0 [null_blk]
blk_mq_dispatch_rq_list+0x122/0x850
__blk_mq_do_dispatch_sched+0xbb/0x2c0
__blk_mq_sched_dispatch_requests+0x13d/0x190
blk_mq_sched_dispatch_requests+0x30/0x60
__blk_mq_run_hw_queue+0x49/0x90
process_one_work+0x26c/0x580
worker_thread+0x55/0x3c0
? process_one_work+0x580/0x580
kthread+0x134/0x150
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x1f/0x30
This problem very often triggers when running the full btrfs xfstests
on a memory-backed zoned null block device in a VM with limited amount
of memory.
Avoid this by executing blk_mq_complete_request() in null_timeout_rq()
only for commands that are marked for a fake timeout completion using
the fake_timeout boolean in struct null_cmd. For timeout errors injected
through debugfs, the timeout handler will execute
blk_mq_complete_request()i as before. This is safe as the submission
path does not execute complete requests in this case.
In null_timeout_rq(), also make sure to set the command error field to
BLK_STS_TIMEOUT and to propagate this error through to the request
completion.
Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Reviewed-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Instead of just reporting an assertion failure, report enough information
that we can start diagnosing exactly went wrong.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
The throbber could race with creation of the anchor entry and cause the
IDR to have zero entries in it, which would cause the test to fail.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
When run on a single CPU, this test would frequently access already-freed
memory. Due to timing, this bug never showed up on multi-CPU tests.
Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Several test runners register individual worker threads with the
RCU library, but neglect to register the main thread, which can lead
to objects being freed while the main thread is in what appears to be
an RCU critical section.
Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Commit 496121c02127 ("ACPI: processor: idle: Allow probing on platforms
with one ACPI C-state") broke CPU0 hotplug on certain systems, e.g.
I'm observing the following on AWS Nitro (e.g r5b.xlarge but other
instance types are affected as well):
# echo 0 > /sys/devices/system/cpu/cpu0/online
# echo 1 > /sys/devices/system/cpu/cpu0/online
<10 seconds delay>
-bash: echo: write error: Input/output error
In fact, the above mentioned commit only revealed the problem and did
not introduce it. On x86, to wakeup CPU an NMI is being used and
hlt_play_dead()/mwait_play_dead() loops are prepared to handle it:
/*
* If NMI wants to wake up CPU0, start CPU0.
*/
if (wakeup_cpu0())
start_cpu0();
cpuidle_play_dead() -> acpi_idle_play_dead() (which is now being called on
systems where it wasn't called before the above mentioned commit) serves
the same purpose but it doesn't have a path for CPU0. What happens now on
wakeup is:
- NMI is sent to CPU0
- wakeup_cpu0_nmi() works as expected
- we get back to while (1) loop in acpi_idle_play_dead()
- safe_halt() puts CPU0 to sleep again.
The straightforward/minimal fix is add the special handling for CPU0 on x86
and that's what the patch is doing.
Fixes: 496121c02127 ("ACPI: processor: idle: Allow probing on platforms with one ACPI C-state")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Add a test for the issue when KVM_SET_CLOCK(0) call could cause
TSC page value to go very big because of a signedness issue around
hv_clock->system_time.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210326155551.17446-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
When guest time is reset with KVM_SET_CLOCK(0), it is possible for
'hv_clock->system_time' to become a small negative number. This happens
because in KVM_SET_CLOCK handling we set 'kvm->arch.kvmclock_offset' based
on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled,
kvm_guest_time_update() does (masterclock in use case):
hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset;
And 'master_kernel_ns' represents the last time when masterclock
got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a
problem, the difference is very small, e.g. I'm observing
hv_clock.system_time = -70 ns. The issue comes from the fact that
'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in
compute_tsc_page_parameters() becomes a very big number.
Use 'master_kernel_ns' instead of get_kvmclock_ns() when masterclock is in
use and get_kvmclock_base_ns() when it's not to prevent 'system_time' from
going negative.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210331124130.337992-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
pvclock_gtod_sync_lock can be taken with interrupts disabled if the
preempt notifier calls get_kvmclock_ns to update the Xen
runstate information:
spin_lock include/linux/spinlock.h:354 [inline]
get_kvmclock_ns+0x25/0x390 arch/x86/kvm/x86.c:2587
kvm_xen_update_runstate+0x3d/0x2c0 arch/x86/kvm/xen.c:69
kvm_xen_update_runstate_guest+0x74/0x320 arch/x86/kvm/xen.c:100
kvm_xen_runstate_set_preempted arch/x86/kvm/xen.h:96 [inline]
kvm_arch_vcpu_put+0x2d8/0x5a0 arch/x86/kvm/x86.c:4062
So change the users of the spinlock to spin_lock_irqsave and
spin_unlock_irqrestore.
Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
Fixes: 30b5c851af79 ("KVM: x86/xen: Add support for vCPU runstate information")
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
There is no need to include changes to vcpu->requests into
the pvclock_gtod_sync_lock critical section. The changes to
the shared data structures (in pvclock_update_vm_gtod_copy)
already occur under the lock.
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Fixing nested_vmcb_check_save to avoid all TOC/TOU races
is a bit harder in released kernels, so do the bare minimum
by avoiding that EFER.SVME is cleared. This is problematic
because svm_set_efer frees the data structures for nested
virtualization if EFER.SVME is cleared.
Also check that EFER.SVME remains set after a nested vmexit;
clearing it could happen if the bit is zero in the save area
that is passed to KVM_SET_NESTED_STATE (the save area of the
nested state corresponds to the nested hypervisor's state
and is restored on the next nested vmexit).
Cc: stable@vger.kernel.org
Fixes: 2fcf4876ada ("KVM: nSVM: implement on demand allocation of the nested state")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Avoid races between check and use of the nested VMCB controls. This
for example ensures that the VMRUN intercept is always reflected to the
nested hypervisor, instead of being processed by the host. Without this
patch, it is possible to end up with svm->nested.hsave pointing to
the MSR permission bitmap for nested guests.
This bug is CVE-2021-29657.
Reported-by: Felix Wilhelm <fwilhelm@google.com>
Cc: stable@vger.kernel.org
Fixes: 2fcf4876ada ("KVM: nSVM: implement on demand allocation of the nested state")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The page table of AMDGPU requires an alignment to CPU page so we should
check ioctl parameters for it. Return -EINVAL if some parameter is
unaligned to CPU page, instead of corrupt the page table sliently.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
In Mesa, dev_info.gart_page_size is used for alignment and it was
set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU
driver requires an alignment on CPU pages. So, for non-4KB page system,
gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE).
Signed-off-by: Rui Wang <wangr@lemote.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1
[Xi: rebased for drm-next, use max_t for checkpatch,
and reworded commit message.]
Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549
Tested-by: Dan Horák <dan@danny.cz>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
Do the same thing we do for Renoir. We can check, but since
the sbios has started DPM, it will always return true which
causes the driver to skip some of the SMU init when it shouldn't.
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|