|
In preparation for removing logbuf_lock, inline log_output()
and log_store() into vprintk_store(). This will simplify dealing
with the various code branches and fallbacks that are possible.
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20201209004453.17720-2-john.ogness@linutronix.de
|
|
Any record with a trailing newline (LOG_NEWLINE flag) cannot
be continued because the newline has been stripped and will
not be visible if the message is appended. This was already
handled correctly when committing in log_output() but was
not handled correctly when committing in log_store().
Fixes: f5f022e53b87 ("printk: reimplement log_cont using record extension")
Link: https://lore.kernel.org/r/20201126114836.14750-1-john.ogness@linutronix.de
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
0day reported a -22.7% regression for the will-it-scale page_fault2
case [1] on a 4-socket, 144-CPU platform, and bisected it to Waiman's
optimization (commit bd0b230fe1) that saves one 'struct page_counter'
worth of space in 'struct mem_cgroup'.
Initially we thought it was due to the cache alignment change introduced
by the patch, but further debugging shows that it is due to some hot data
members ('vmstats_local', 'vmstats_percpu', 'vmstats') sitting in two
adjacent cache lines (2N and 2N+1), and when adjacent-cache-line prefetch
is enabled, this triggers an "extended" form of cache false sharing
across the two adjacent cache lines.
So exchange the two member blocks while mostly keeping the original
cache alignment, which restores and even improves the performance, and
saves 64 bytes in 'struct mem_cgroup' (from 2880 to 2816 bytes, with
0day's default RHEL-8.3 kernel config).
[1] https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
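For readers unfamiliar with adjacent-cache-line prefetch, here is a small,
self-contained userspace sketch of the kind of member regrouping described
above. The struct names and members are hypothetical stand-ins, not the real
'struct mem_cgroup' layout; it only illustrates how placement decides whether
hot read-mostly members share a 128-byte prefetch pair with a frequently
written counter.
/* Hypothetical layouts illustrating adjacent-cache-line (128-byte) false
 * sharing; build with: cc -o layout layout.c && ./layout */
#include <stdio.h>
#include <stddef.h>

#define CACHELINE 64

struct bad_layout {
	/* frequently written counter occupies cache line 2N */
	long hot_counter __attribute__((aligned(2 * CACHELINE)));
	char pad[CACHELINE - sizeof(long)];
	/* read-mostly hot pointers sit in line 2N+1: no direct sharing, but
	 * with adjacent-cache-line prefetch the two lines act as one
	 * 128-byte unit, so writes to hot_counter still bounce them */
	void *vmstats_local;
	void *vmstats_percpu;
};

struct good_layout {
	/* read-mostly hot members grouped in their own 128-byte pair */
	void *vmstats_local __attribute__((aligned(2 * CACHELINE)));
	void *vmstats_percpu;
	/* frequently written counter pushed into the next 128-byte pair */
	long hot_counter __attribute__((aligned(2 * CACHELINE)));
};

#define SAME_PAIR(type, a, b) \
	(offsetof(type, a) / (2 * CACHELINE) == offsetof(type, b) / (2 * CACHELINE))

int main(void)
{
	printf("bad:  hot_counter and vmstats_local share a 128B pair: %d\n",
	       SAME_PAIR(struct bad_layout, hot_counter, vmstats_local));
	printf("good: hot_counter and vmstats_local share a 128B pair: %d\n",
	       SAME_PAIR(struct good_layout, hot_counter, vmstats_local));
	return 0;
}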
Fixes: bd0b230fe145 ("mm/memcg: unify swap and memsw page counters")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
bdi_dev_name() returns a char [64], while __entry->name is a char [32].
It may be dangerous to TP_printk("%s", __entry->name) after the
strncpy(), since the copy is not guaranteed to be NUL-terminated.
CC: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201124165205.GA23937@rlk
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Hui Su <sh_def@163.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Twice now, when exercising ext4 looped on shmem huge pages, I have crashed
on the PF_ONLY_HEAD check inside PageWaiters(): ext4_finish_bio() calling
end_page_writeback() calling wake_up_page() on tail of a shmem huge page,
no longer an ext4 page at all.
The problem is that PageWriteback is not accompanied by a page reference
(as the NOTE at the end of test_clear_page_writeback() acknowledges): as
soon as TestClearPageWriteback has been done, that page could be removed
from page cache, freed, and reused for something else by the time that
wake_up_page() is reached.
https://lore.kernel.org/linux-mm/20200827122019.GC14765@casper.infradead.org/
Matthew Wilcox suggested avoiding or weakening the PageWaiters() tail
check; but I'm paranoid about even looking at an unreferenced struct page,
lest its memory might itself have already been reused or hotremoved (and
wake_up_page_bit() may modify that memory with its ClearPageWaiters()).
Then on crashing a second time, realized there's a stronger reason against
that approach. If my testing just occasionally crashes on that check,
when the page is reused for part of a compound page, wouldn't it be much
more common for the page to get reused as an order-0 page before reaching
wake_up_page()? And on rare occasions, might that reused page already be
marked PageWriteback by its new user, and already be waited upon? What
would that look like?
It would look like BUG_ON(PageWriteback) after wait_on_page_writeback()
in write_cache_pages() (though I have never seen that crash myself).
Matthew Wilcox explaining this to himself:
"page is allocated, added to page cache, dirtied, writeback starts,
--- thread A ---
filesystem calls end_page_writeback()
test_clear_page_writeback()
--- context switch to thread B ---
truncate_inode_pages_range() finds the page, it doesn't have writeback set,
we delete it from the page cache. Page gets reallocated, dirtied, writeback
starts again. Then we call write_cache_pages(), see
PageWriteback() set, call wait_on_page_writeback()
--- context switch back to thread A ---
wake_up_page(page, PG_writeback);
... thread B is woken, but because the wakeup was for the old use of
the page, PageWriteback is still set.
Devious"
And prior to 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic")
this would have been much less likely: before that, wake_page_function()'s
non-exclusive case would stop walking and not wake if it found Writeback
already set again; whereas now the non-exclusive case proceeds to wake.
I have not thought of a fix that does not add a little overhead: the
simplest fix is for end_page_writeback() to get_page() before calling
test_clear_page_writeback(), then put_page() after wake_up_page().
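In code, the ordering described above would look roughly like this (a sketch
of the idea against the existing helpers, not necessarily the exact committed
hunk):
void end_page_writeback(struct page *page)
{
	/* ... existing PageReclaim handling elided ... */

	/*
	 * Take a reference so the page cannot be freed and reused between
	 * clearing PG_writeback and waking any waiters.
	 */
	get_page(page);
	if (!test_clear_page_writeback(page))
		BUG();

	smp_mb__after_atomic();
	wake_up_page(page, PG_writeback);
	put_page(page);
}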
Was there a chance of missed wakeups before, since a page freed before
reaching wake_up_page() would have PageWaiters cleared? I think not,
because each waiter does hold a reference on the page. This bug comes
when the old use of the page, the one we do TestClearPageWriteback on,
had *no* waiters, so no additional page reference beyond the page cache
(and whoever racily freed it). The reuse of the page has a waiter
holding a reference, and its own PageWriteback set; but the belated
wake_up_page() has woken the reuse to hit that BUG_ON(PageWriteback).
Reported-by: syzbot+3622cea378100f45d59f@syzkaller.appspotmail.com
Reported-by: Qian Cai <cai@lca.pw>
Fixes: 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We need to disable interrupts in load_fpu_regs(). Otherwise an
interrupt might come in after the registers are loaded, but before
CIF_FPU is cleared in load_fpu_regs(). When the interrupt returns,
CIF_FPU will be cleared and the registers will never be restored.
The entry.S code usually saves the interrupt state in __SF_EMPTY on the
stack when disabling/restoring interrupts. sie64a however saves the pointer
to the sie control block in __SF_SIE_CONTROL, which references the same
location. This is non-obvious to the reader. To avoid clobbering the sie
control block pointer in load_fpu_regs(), move the __SIE_* offsets eight
bytes after __SF_EMPTY on the stack.
Cc: <stable@vger.kernel.org> # 5.8
Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S")
Reported-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
|
|
Commit 8410e7f3b31e ("cpufreq: scmi: Fix OPP addition failure with a
dummy clock provider") registers a dummy clock provider using
devm_of_clk_add_hw_provider. These *_hw_provider functions are defined
only when CONFIG_COMMON_CLK=y. One possible fix is to add the Kconfig
dependency, but since we plan to move away from the clock dependency
for scmi cpufreq, it is preferable to avoid that.
Let us just conditionally compile out the offending call to
devm_of_clk_add_hw_provider. The change also uses the variable 'dev'
outside of the #ifdef block to avoid a build warning.
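As a rough sketch (illustrative only; the exact probe body in scmi-cpufreq.c
may differ), the shape of such a change is:
static int scmi_cpufreq_probe(struct scmi_device *sdev)
{
	struct device *dev = &sdev->dev;
	int ret;

#ifdef CONFIG_COMMON_CLK
	/* dummy clock provider as needed by OPP if clocks property is used */
	if (of_find_property(dev->of_node, "#clock-cells", NULL))
		devm_of_clk_add_hw_provider(dev, of_clk_hw_simple_get, NULL);
#endif

	ret = cpufreq_register_driver(&scmi_cpufreq_driver);
	if (ret)
		/* 'dev' is referenced outside the #ifdef, so it never
		 * becomes an unused variable when CONFIG_COMMON_CLK=n */
		dev_err(dev, "registering cpufreq failed, err: %d\n", ret);

	return ret;
}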
Fixes: 8410e7f3b31e ("cpufreq: scmi: Fix OPP addition failure with a dummy clock provider")
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
When doing a lookup in a directory, the afs filesystem uses a bulk
status fetch to speculatively retrieve the statuses of up to 48 other
vnodes found in the same directory and it will then either update extant
inodes or create new ones - effectively doing 'lookup ahead'.
To avoid the possibility of deadlocking itself, however, the filesystem
doesn't lock all of those inodes; rather just the directory inode is
locked (by the VFS).
When the operation completes, afs_inode_init_from_status() or
afs_apply_status() is called, depending on whether the inode already
exists, to commit the new status.
A case exists, however, where the speculative status fetch operation may
straddle a modification operation on one of those vnodes. What can then
happen is that the speculative bulk status RPC retrieves the old status,
and whilst that is happening, the modification happens - which returns
an updated status, then the modification status is committed, then we
attempt to commit the speculative status.
This results in something like the following being seen in dmesg:
kAFS: vnode modified {100058:861} 8->9 YFS.InlineBulkStatus
showing that for vnode 861 on volume 100058, we saw YFS.InlineBulkStatus
say that the vnode had data version 8 when we'd already recorded version
9 due to a local modification. This was causing the cache to be
invalidated for that vnode when it shouldn't have been. If it happens
on a data file, this might lead to local changes being lost.
Fix this by ignoring speculative status updates if the data version
doesn't match the expected value.
Note that it is possible to get a DV regression if a volume gets
restored from a backup - but we should get a callback break in such a
case that should trigger a recheck anyway. It might be worth checking
the volume creation time in the volsync info and, if a change is
observed in that (as would happen on a restore), invalidate all caches
associated with the volume.
Fixes: 5cf9dd55a0ec ("afs: Prospectively look up extra files when doing a single lookup")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The calculation of the end page index was incorrect, leading to a
regression of 70% when running stress-ng.
With this fix, we instead see a performance improvement of 3%.
Fixes: e6e88712e43b ("mm: optimise madvise WILLNEED")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: "Chen, Rong A" <rong.a.chen@intel.com>
Link: https://lkml.kernel.org/r/20201109134851.29692-1-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The attr->set() callback receives a u64 value, but simple_strtoll() is
used for the conversion. This leads to an erroneous cast if the user
passes a negative value.
Use kstrtoull() instead of simple_strtoll() to convert the string
received from the user to an unsigned value. The former returns -EINVAL
if it gets a negative value, while the latter cannot handle that
situation correctly. Make 'val' an unsigned long long, as kstrtoull()
expects; this also eliminates a compile warning on non-64-bit
architectures.
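A small, self-contained userspace demonstration of the cast problem, with
strtoll() standing in for simple_strtoll() (in the kernel, kstrtoull()
instead rejects such input with -EINVAL):
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	/* what the old code effectively did: signed parse, then store as u64 */
	long long signed_val = strtoll("-1", NULL, 0);
	unsigned long long as_u64 = (unsigned long long)signed_val;

	/* prints 18446744073709551615 - a huge value silently handed on */
	printf("signed parse of \"-1\" stored as u64: %llu\n", as_u64);
	return 0;
}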
Fixes: f7b88631a897 ("fs/libfs.c: fix simple_attr_write() on 32bit machines")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/1605341356-11872-1-git-send-email-yangyicong@hisilicon.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Alexander reported a syzkaller / KASAN finding on s390, see below for
complete output.
In do_huge_pmd_anonymous_page(), the pre-allocated pagetable will be
freed in some cases. In the case of userfaultfd_missing(), this will
happen after calling handle_userfault(), which might have released the
mmap_lock. Therefore, the following pte_free(vma->vm_mm, pgtable) will
access an unstable vma->vm_mm, which could have been freed or re-used
already.
For all architectures other than s390 this will go w/o any negative
impact, because pte_free() simply frees the page and ignores the
passed-in mm. The implementation for SPARC32 would also access
mm->page_table_lock for pte_free(), but there is no THP support in
SPARC32, so the buggy code path will not be used there.
For s390, the mm->context.pgtable_list is being used to maintain the 2K
pagetable fragments, and operating on an already freed or even re-used
mm could result in various more or less subtle bugs due to list /
pagetable corruption.
Fix this by calling pte_free() before handle_userfault(), similar to how
it is already done in __do_huge_pmd_anonymous_page() for the WRITE /
non-huge_zero_page case.
Commit 6b251fc96cf2c ("userfaultfd: call handle_userfault() for
userfaultfd_missing() faults") introduced both the
do_huge_pmd_anonymous_page() and the __do_huge_pmd_anonymous_page()
changes with respect to calling handle_userfault(), but only in the
latter case did it put the pte_free() before calling handle_userfault().
BUG: KASAN: use-after-free in do_huge_pmd_anonymous_page+0xcda/0xd90 mm/huge_memory.c:744
Read of size 8 at addr 00000000962d6988 by task syz-executor.0/9334
CPU: 1 PID: 9334 Comm: syz-executor.0 Not tainted 5.10.0-rc1-syzkaller-07083-g4c9720875573 #0
Hardware name: IBM 3906 M04 701 (KVM/Linux)
Call Trace:
do_huge_pmd_anonymous_page+0xcda/0xd90 mm/huge_memory.c:744
create_huge_pmd mm/memory.c:4256 [inline]
__handle_mm_fault+0xe6e/0x1068 mm/memory.c:4480
handle_mm_fault+0x288/0x748 mm/memory.c:4607
do_exception+0x394/0xae0 arch/s390/mm/fault.c:479
do_dat_exception+0x34/0x80 arch/s390/mm/fault.c:567
pgm_check_handler+0x1da/0x22c arch/s390/kernel/entry.S:706
copy_from_user_mvcos arch/s390/lib/uaccess.c:111 [inline]
raw_copy_from_user+0x3a/0x88 arch/s390/lib/uaccess.c:174
_copy_from_user+0x48/0xa8 lib/usercopy.c:16
copy_from_user include/linux/uaccess.h:192 [inline]
__do_sys_sigaltstack kernel/signal.c:4064 [inline]
__s390x_sys_sigaltstack+0xc8/0x240 kernel/signal.c:4060
system_call+0xe0/0x28c arch/s390/kernel/entry.S:415
Allocated by task 9334:
slab_alloc_node mm/slub.c:2891 [inline]
slab_alloc mm/slub.c:2899 [inline]
kmem_cache_alloc+0x118/0x348 mm/slub.c:2904
vm_area_dup+0x9c/0x2b8 kernel/fork.c:356
__split_vma+0xba/0x560 mm/mmap.c:2742
split_vma+0xca/0x108 mm/mmap.c:2800
mlock_fixup+0x4ae/0x600 mm/mlock.c:550
apply_vma_lock_flags+0x2c6/0x398 mm/mlock.c:619
do_mlock+0x1aa/0x718 mm/mlock.c:711
__do_sys_mlock2 mm/mlock.c:738 [inline]
__s390x_sys_mlock2+0x86/0xa8 mm/mlock.c:728
system_call+0xe0/0x28c arch/s390/kernel/entry.S:415
Freed by task 9333:
slab_free mm/slub.c:3142 [inline]
kmem_cache_free+0x7c/0x4b8 mm/slub.c:3158
__vma_adjust+0x7b2/0x2508 mm/mmap.c:960
vma_merge+0x87e/0xce0 mm/mmap.c:1209
userfaultfd_release+0x412/0x6b8 fs/userfaultfd.c:868
__fput+0x22c/0x7a8 fs/file_table.c:281
task_work_run+0x200/0x320 kernel/task_work.c:151
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
do_notify_resume+0x100/0x148 arch/s390/kernel/signal.c:538
system_call+0xe6/0x28c arch/s390/kernel/entry.S:416
The buggy address belongs to the object at 00000000962d6948 which belongs to the cache vm_area_struct of size 200
The buggy address is located 64 bytes inside of 200-byte region [00000000962d6948, 00000000962d6a10)
The buggy address belongs to the page: page:00000000313a09fe refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x962d6 flags: 0x3ffff00000000200(slab)
raw: 3ffff00000000200 000040000257e080 0000000c0000000c 000000008020ba00
raw: 0000000000000000 000f001e00000000 ffffffff00000001 0000000096959501
page dumped because: kasan: bad access detected
page->mem_cgroup:0000000096959501
Memory state around the buggy address:
00000000962d6880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000000962d6900: 00 fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb
>00000000962d6980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
00000000962d6a00: fb fb fc fc fc fc fc fc fc fc 00 00 00 00 00 00
00000000962d6a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Fixes: 6b251fc96cf2c ("userfaultfd: call handle_userfault() for userfaultfd_missing() faults")
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: <stable@vger.kernel.org> [4.3+]
Link: https://lkml.kernel.org/r/20201110190329.11920-1-gerald.schaefer@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If we reparent the slab objects to the root memcg, when we free the slab
object, we need to update the per-memcg vmstats to keep it correct for
the root memcg. Now this at least affects the vmstat of
NR_KERNEL_STACK_KB for !CONFIG_VMAP_STACK when the thread stack size is
smaller than the PAGE_SIZE.
David said:
"I assume that without this fix that the root memcg's vmstat would
always be inflated if we reparented"
Fixes: ec9f02384f60 ("mm: workingset: fix vmstat counters for shadow nodes")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yafang Shao <laoar.shao@gmail.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: <stable@vger.kernel.org> [5.3+]
Link: https://lkml.kernel.org/r/20201110031015.15715-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Both btrfs and fuse have reported faults caused by seeing a retry entry
instead of the page they were looking for. This was caused by a missing
check in the iterator.
As can be seen in the panic log below, accessing 0x402 causes a
panic. In xarray.h, 0x402 is the RETRY_ENTRY value.
BUG: kernel NULL pointer dereference, address: 0000000000000402
CPU: 14 PID: 306003 Comm: as Not tainted 5.9.0-1-amd64 #1 Debian 5.9.1-1
Hardware name: Lenovo ThinkSystem SR665/7D2VCTO1WW, BIOS D8E106Q-1.01 05/30/2020
RIP: 0010:fuse_readahead+0x152/0x470 [fuse]
Code: 41 8b 57 18 4c 8d 54 10 ff 4c 89 d6 48 8d 7c 24 10 e8 d2 e3 28 f9 48 85 c0 0f 84 fe 00 00 00 44 89 f2 49 89 04 d4 44 8d 72 01 <48> 8b 10 41 8b 4f 1c 48 c1 ea 10 83 e2 01 80 fa 01 19 d2 81 e2 01
RSP: 0018:ffffad99ceaebc50 EFLAGS: 00010246
RAX: 0000000000000402 RBX: 0000000000000001 RCX: 0000000000000002
RDX: 0000000000000000 RSI: ffff94c5af90bd98 RDI: ffffad99ceaebc60
RBP: ffff94ddc1749a00 R08: 0000000000000402 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000100 R12: ffff94de6c429ce0
R13: ffff94de6c4d3700 R14: 0000000000000001 R15: ffffad99ceaebd68
FS: 00007f228c5c7040(0000) GS:ffff94de8ed80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000402 CR3: 0000001dbd9b4000 CR4: 0000000000350ee0
Call Trace:
read_pages+0x83/0x270
page_cache_readahead_unbounded+0x197/0x230
generic_file_buffered_read+0x57a/0xa20
new_sync_read+0x112/0x1a0
vfs_read+0xf8/0x180
ksys_read+0x5f/0xe0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
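The missing check amounts to skipping such transient retry entries while
walking the XArray. A hedged sketch of the pattern, using the standard xas
helpers (not necessarily the exact hunk applied to the readahead iterator):
	XA_STATE(xas, &mapping->i_pages, index);
	struct page *page;

	rcu_read_lock();
	xas_for_each(&xas, page, last_index) {
		/* a retry entry (0x402) is not a page; retry this slot */
		if (xas_retry(&xas, page))
			continue;
		/* ... hand the real page to the readahead batch ... */
	}
	rcu_read_unlock();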
Fixes: 042124cc64c3 ("mm: add new readahead_control API")
Reported-by: David Sterba <dsterba@suse.com>
Reported-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201103142852.8543-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20201103124349.16722-1-vvghjk1234@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The core-mm has a default __weak implementation of phys_to_target_node()
to mirror the weak definition of memory_add_physaddr_to_nid(). That
symbol is exported for modules. However, while the export in
mm/memory_hotplug.c exported the symbol in the configuration cases of:
CONFIG_NUMA_KEEP_MEMINFO=y
CONFIG_MEMORY_HOTPLUG=y
...and:
CONFIG_NUMA_KEEP_MEMINFO=n
CONFIG_MEMORY_HOTPLUG=y
...it failed to export the symbol in the case of:
CONFIG_NUMA_KEEP_MEMINFO=y
CONFIG_MEMORY_HOTPLUG=n
Not only is that broken, but Christoph points out that the kernel should
not be exporting any __weak symbol, which means that
memory_add_physaddr_to_nid() example that phys_to_target_node() copied
is broken too.
Rework the definition of phys_to_target_node() and
memory_add_physaddr_to_nid() to not require weak symbols. Move to the
common arch override design-pattern of an asm header defining a symbol
to replace the default implementation.
The only common header that all memory_add_physaddr_to_nid() producing
architectures implement is asm/sparsemem.h. In fact, powerpc already
defines its memory_add_physaddr_to_nid() helper in sparsemem.h.
Double-down on that observation and define phys_to_target_node() where
necessary in asm/sparsemem.h. An alternate consideration that was
discarded was to put this override in asm/numa.h, but that entangles
with the definition of MAX_NUMNODES relative to the inclusion of
linux/nodemask.h, and requires powerpc to grow a new header.
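The override pattern referred to above looks roughly like this (a generic
sketch of the design pattern, not the literal hunks; the fallback's exact
return value is an assumption here):
/* arch/<arch>/include/asm/sparsemem.h: the arch supplies a real mapping */
int phys_to_target_node(u64 start);
#define phys_to_target_node phys_to_target_node

/* generic header: fallback used only when the arch did not define the
 * macro above -- no __weak symbol and no export gymnastics needed */
#ifndef phys_to_target_node
static inline int phys_to_target_node(u64 start)
{
	return 0;	/* assumed default; the real stub may differ */
}
#endif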
The dependency on NUMA_KEEP_MEMINFO for DEV_DAX_HMEM_DEVICES is invalid
now that the symbol is properly exported / stubbed in all combinations
of CONFIG_NUMA_KEEP_MEMINFO and CONFIG_MEMORY_HOTPLUG.
[dan.j.williams@intel.com: v4]
Link: https://lkml.kernel.org/r/160461461867.1505359.5301571728749534585.stgit@dwillia2-desk3.amr.corp.intel.com
[dan.j.williams@intel.com: powerpc: fix create_section_mapping compile warning]
Link: https://lkml.kernel.org/r/160558386174.2948926.2740149041249041764.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: a035b6bf863e ("mm/memory_hotplug: introduce default phys_to_target_node() implementation")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/160447639846.1133764.7044090803980177548.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
bpftrace parses the kernel headers and uses Clang under the hood.
Remove the version check when __BPF_TRACING__ is defined (as bpftrace
does) so that this tool can continue to parse kernel headers, even with
older clang sources.
Fixes: commit 1f7a44f63e6c ("compiler-clang: add build check for clang 10.0.1")
Reported-by: Chen Yu <yu.chen.surf@gmail.com>
Reported-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://lkml.kernel.org/r/20201104191052.390657-1-ndesaulniers@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The early return in process_madvise() will produce a memory leak.
Fix it.
Fixes: ecb8ac8b1f14 ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20201116155132.GA3805951@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It looks like the seccomp selftests were never actually built for sh.
This fixes it, though I don't have an environment to do a runtime test
of it yet.
Fixes: 0bb605c2c7f2b4b3 ("sh: Add SECCOMP_FILTER")
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/lkml/a36d7b48-6598-1642-e403-0c77a86f416d@physik.fu-berlin.de
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
A typo sneaked into the powerpc selftest. Fix the name so it builds again.
Fixes: 46138329faea ("selftests/seccomp: powerpc: Fix seccomp return value testing")
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/87y2ix2895.fsf@mpe.ellerman.id.au
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
x86 Hyper-V used to essentially always overwrite the effective cache type
of guest memory accesses to WB. This was problematic in cases where there
is a physical device assigned to the VM, since that often requires that
the VM should have control over cache types. Thus, on newer Hyper-V since
2018, Hyper-V always honors the VM's cache type, but unexpectedly Linux VM
users started to complain that the Linux VM's VRAM had become very slow, and
it turned out that the Linux VM should not map the VRAM uncacheable with
ioremap().
Fix this slowness issue by using ioremap_cache().
On ARM64, ioremap_cache() is also required as the host also maps the VRAM
cacheable, otherwise VM Connect can't display properly with ioremap() or
ioremap_wc().
With this change, the VRAM on new Hyper-V is as fast as regular RAM, so
it's no longer necessary to use the hacks we added to mitigate the
slowness, i.e. we no longer need to allocate physical memory and use
it to back up the VRAM in Generation-1 VM, and we also no longer need to
allocate physical memory to back up the framebuffer in a Generation-2 VM
and copy the framebuffer to the real VRAM. A further big change will
address these for v5.11.
Fixes: 68a2d20b79b1 ("drivers/video: add Hyper-V Synthetic Video Frame Buffer Driver")
Tested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/20201118000305.24797-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
|
|
The idea of the warning in ext4_update_dx_flag() is that we should warn
when we are clearing EXT4_INODE_INDEX on a filesystem with metadata
checksums enabled since after clearing the flag, checksums for internal
htree nodes will become invalid. So there's no need to warn (or actually
do anything) when EXT4_INODE_INDEX is not set.
Link: https://lore.kernel.org/r/20201118153032.17281-1-jack@suse.cz
Fixes: 48a34311953d ("ext4: fix checksum errors with indexed dirs")
Reported-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
Kernel-doc markup should use this format:
identifier - description
They should not have any type before that, as otherwise
the parser won't do the right thing.
Also, some identifiers have different names between their
prototypes and the kernel-doc markup.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Link: https://lore.kernel.org/r/72f5c6628f5f278d67625f60893ffbc2ca28d46e.1605521731.git.mchehab+huawei@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
This reverts commit 6ff646b2ceb0eec916101877f38da0b73e3a5b7f.
Your maintainer committed a major braino in the rmap code by adding the
attr fork, bmbt, and unwritten extent usage bits into rmap record key
comparisons. While XFS uses the usage bits *in the rmap records* for
cross-referencing metadata in xfs_scrub and xfs_repair, it only needs
the owner and offset information to distinguish between reverse mappings
of the same physical extent into the data fork of a file at multiple
offsets. The other bits are not important for key comparisons for index
lookups, and never have been.
Eric Sandeen reports that this causes regressions in generic/299, so
undo this patch before it does more damage.
Reported-by: Eric Sandeen <sandeen@sandeen.net>
Fixes: 6ff646b2ceb0 ("xfs: fix rmap key and record comparison functions")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
|
|
The options in /proc/mounts must be valid mount options --- and
fast_commit is not a mount option. Otherwise, command sequences like
this will fail:
# mount /dev/vdc /vdc
# mkdir -p /vdc/phoronix_test_suite /pts
# mount --bind /vdc/phoronix_test_suite /pts
# mount -o remount,nodioread_nolock /pts
mount: /pts: mount point not mounted or bad option.
And in the system logs, you'll find:
EXT4-fs (vdc): Unrecognized mount option "fast_commit" or missing value
Fixes: 995a3ed67fc8 ("ext4: add fast_commit feature and handling for extended mount options")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Forcing mocs:1 [used for our winsys follows-pte mode] to be cached
caused display glitches. Though it is documented as deprecated (and so
likely behaves as uncached) use the follow-pte bit and force it out of
L3 cache.
Testcase: igt/kms_frontbuffer_tracking
Testcase: igt/kms_big_fb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-4-chris@chris-wilson.co.uk
(cherry picked from commit a04ac827366594c7244f60e9be79fcb404af69f0)
Fixes: 849c0fe9e831 ("drm/i915/gt: Initialize reserved and unspecified MOCS indices")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Updated Fixes tag]
|
|
Fix an unbalanced mutex_unlock(): if copy_from_user() fails, the error
path calls mutex_unlock() even though the mutex has not been locked at
that point.
Fixes: 4b1a29a7f542 ("error-injection: Support fault injection framework")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Luo Meng <luomeng12@huawei.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/160570737118.263807.8358435412898356284.stgit@devnote2
|
|
Previously, bpf_probe_read_user_str() could potentially overcopy the
trailing bytes after the NUL due to how do_strncpy_from_user() does the
copy in long-sized strides. The issue has been fixed in the previous
commit.
This commit adds a selftest that ensures we don't regress
bpf_probe_read_user_str() again.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/4d977508fab4ec5b7b574b85bdf8b398868b6ee9.1605642949.git.dxu@dxuuu.xyz
|
|
do_strncpy_from_user() may copy some extra bytes after the NUL
terminator into the destination buffer. This usually does not matter for
normal string operations. However, when BPF programs key BPF maps with
strings, this matters a lot.
A BPF program may read strings from user memory by calling the
bpf_probe_read_user_str() helper which eventually calls
do_strncpy_from_user(). The program can then key a map with the
destination buffer. BPF map keys are fixed-width and string-agnostic,
meaning that map keys are treated as a set of bytes.
The issue is when do_strncpy_from_user() overcopies bytes after the NUL
terminator, it can result in seemingly identical strings occupying
multiple slots in a BPF map. This behavior is subtle and totally
unexpected by the user.
This commit masks out the bytes following the NUL while preserving
long-sized stride in the fast path.
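To see why copying in long-sized strides can drag in bytes past the NUL, and
how masking the final word repairs it, here is a simplified, self-contained
userspace sketch (the kernel's strncpy_from_user() fast path uses
word-at-a-time helpers rather than this byte loop; little-endian assumed):
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Copy 'src' into 'dst' 8 bytes at a time, the way a fast-path string copy
 * might.  Without masking, bytes that happen to follow the NUL inside the
 * final word are copied too (little-endian assumed). */
static void copy_words(char *dst, const char *src, size_t words, int mask_tail)
{
	for (size_t i = 0; i < words; i++) {
		uint64_t w;

		memcpy(&w, src + i * 8, 8);
		for (int b = 0; b < 8; b++) {
			if (((w >> (b * 8)) & 0xff) == 0) {
				if (mask_tail)	/* keep only bytes before the NUL */
					w &= b ? (~0ULL >> (64 - b * 8)) : 0;
				memcpy(dst + i * 8, &w, 8);
				return;
			}
		}
		memcpy(dst + i * 8, &w, 8);
	}
}

int main(void)
{
	/* "hi\0" followed by stale bytes that share the same 8-byte word */
	char user_buf[16] = { 'h', 'i', 0, 'X', 'Y', 'Z', '!', '?' };
	char key_a[16] = { 0 }, key_b[16] = { 0 };

	copy_words(key_a, user_buf, 2, 0);	/* old behaviour: tail copied */
	copy_words(key_b, user_buf, 2, 1);	/* fixed: tail masked out     */

	/* identical as C strings, different as fixed-width map keys */
	printf("equal as strings: %d, equal as map keys: %d\n",
	       strcmp(key_a, key_b) == 0, memcmp(key_a, key_b, 16) == 0);
	return 0;
}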
Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/21efc982b3e9f2f7b0379eed642294caaa0c27a7.1605642949.git.dxu@dxuuu.xyz
|
|
Commit 7053e0eab473 ("drm/vram-helper: stop using TTM placement flags")
cleared the BO placement flags if top-down placement had been selected.
Hence, BOs that were supposed to go into VRAM are now placed in a default
location in system memory.
Trying to scanout the incorrectly pinned BO results in displayed garbage
and an error message.
[ 146.108127] ------------[ cut here ]------------
[ 146.108180] WARNING: CPU: 0 PID: 152 at drivers/gpu/drm/drm_gem_vram_helper.c:284 drm_gem_vram_offset+0x59/0x60 [drm_vram_helper]
...
[ 146.108591] ast_cursor_page_flip+0x3e/0x150 [ast]
[ 146.108622] ast_cursor_plane_helper_atomic_update+0x8a/0xc0 [ast]
[ 146.108654] drm_atomic_helper_commit_planes+0x197/0x4c0
[ 146.108699] drm_atomic_helper_commit_tail_rpm+0x59/0xa0
[ 146.108718] commit_tail+0x103/0x1c0
...
[ 146.109302] ---[ end trace d901a1ba1d949036 ]---
Fix the bug by keeping the placement flags. The top-down placement flag
is stored in a separate variable.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: 7053e0eab473 ("drm/vram-helper: stop using TTM placement flags")
Reported-by: Pu Wen <puwen@hygon.cn> [for 5.10-rc1]
Tested-by: Pu Wen <puwen@hygon.cn>
Cc: Christian König <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200921142536.4392-1-tzimmermann@suse.de
(cherry picked from commit b8f8dbf6495850b0babc551377bde754b7bc0eea)
[pulled into fixes from drm-next]
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Sparse complains 3 times about:
net/smc/smc_ib.c:203:52: warning: incorrect type in argument 1 (different address spaces)
net/smc/smc_ib.c:203:52: expected struct net_device const *dev
net/smc/smc_ib.c:203:52: got struct net_device [noderef] __rcu *const ndev
Fix that by using the existing and validated ndev variable instead of
accessing attr->ndev directly.
Fixes: 5102eca9039b ("net/smc: Use rdma_read_gid_l2_fields to L2 fields")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With the multi-subnet support of SMC-Dv2 the match for existing link
groups should not include the vlanid of the network device.
Set ini->smcd_version accordingly before the call to smc_conn_create()
and use this value in smc_conn_create() to skip the vlanid check.
Fixes: 5c21c4ccafe8 ("net/smc: determine accepted ISM devices")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
IPV6=m
NF_DEFRAG_IPV6=y
ld: net/ipv6/netfilter/nf_conntrack_reasm.o: in function `nf_ct_frag6_gather':
net/ipv6/netfilter/nf_conntrack_reasm.c:462: undefined reference to `ipv6_frag_thdr_truncated'
Netfilter is depending on ipv6 symbol ipv6_frag_thdr_truncated. This
dependency is forcing IPV6=y.
Remove this dependency by moving ipv6_frag_thdr_truncated out of ipv6. This
is the same solution as was used for a similar issue; see
commit 70b095c843266 ("ipv6: remove dependency of nf_defrag_ipv6 on ipv6
module").
Fixes: 9d9e937b1c8b ("ipv6/netfilter: Discard first fragment not including all headers")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Georg Kohmann <geokohma@cisco.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20201119095833.8409-1-geokohma@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The code change for switching to non-atomic mode introduced an unexpected
mutex deadlock in get_msg(). It replaced the spinlock with the existing
mutex, but some callers already hold that mutex. Since the only place that
needs the extra lock is the code path from snd_mixart_send_msg(), remove
the mutex lock in get_msg() and take it in the caller instead, fixing the
mutex deadlock.
Fixes: 8d3a8b5cb57d ("ALSA: mixart: Use nonatomic PCM ops")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201119121440.18945-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Jens has reported a situation where partial direct IOs can be issued
and completed yet still return -EAGAIN. We don't want this to report
a short IO as we want XFS to complete user DIO entirely or not at
all.
This partial IO situation can occur on a write IO that is split
across an allocated extent and a hole, and the second mapping is
returning EAGAIN because allocation would be required.
The trivial reproducer:
$ sudo xfs_io -fdt -c "pwrite 0 4k" -c "pwrite -V 1 -b 8k -N 0 8k" /mnt/scr/foo
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0001 sec (27.509 MiB/sec and 7042.2535 ops/sec)
pwrite: Resource temporarily unavailable
$
The pwritev2(0, 8kB, RWF_NOWAIT) call returns EAGAIN having done
the first 4kB write:
xfs_file_direct_write: dev 259:1 ino 0x83 size 0x1000 offset 0x0 count 0x2000
iomap_apply: dev 259:1 ino 0x83 pos 0 length 8192 flags WRITE|DIRECT|NOWAIT (0x31) ops xfs_direct_write_iomap_ops caller iomap_dio_rw actor iomap_dio_actor
xfs_ilock_nowait: dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_for_iomap
xfs_iunlock: dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_direct_write_iomap_begin
xfs_iomap_found: dev 259:1 ino 0x83 size 0x1000 offset 0x0 count 8192 fork data startoff 0x0 startblock 24 blockcount 0x1
iomap_apply_dstmap: dev 259:1 ino 0x83 bdev 259:1 addr 102400 offset 0 length 4096 type MAPPED flags DIRTY
Here the first iomap loop has mapped the first 4kB of the file and
issued the IO, and we enter the second iomap_apply loop:
iomap_apply: dev 259:1 ino 0x83 pos 4096 length 4096 flags WRITE|DIRECT|NOWAIT (0x31) ops xfs_direct_write_iomap_ops caller iomap_dio_rw actor iomap_dio_actor
xfs_ilock_nowait: dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_ilock_for_iomap
xfs_iunlock: dev 259:1 ino 0x83 flags ILOCK_SHARED caller xfs_direct_write_iomap_begin
And we exit with -EAGAIN out because we hit the allocate case trying
to make the second 4kB block.
Then IO completes on the first 4kB and the original IO context
completes and unlocks the inode, returning -EAGAIN to userspace:
xfs_end_io_direct_write: dev 259:1 ino 0x83 isize 0x1000 disize 0x1000 offset 0x0 count 4096
xfs_iunlock: dev 259:1 ino 0x83 flags IOLOCK_SHARED caller xfs_file_dio_aio_write
There are other vectors to the same problem when we re-enter the
mapping code if we have to make multiple mappings under NOWAIT
conditions, e.g. failing trylocks, COW extents being found,
allocation being required, and so on.
Avoid all these potential problems by only allowing IOMAP_NOWAIT IO
to go ahead if the mapping we retrieve for the IO spans an entire
allocated extent. This avoids the possibility of subsequent mappings
to complete the IO from triggering NOWAIT semantics by any means as
NOWAIT IO will now only enter the mapping code once per NOWAIT IO.
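Conceptually, the change gates NOWAIT IO on the first mapping covering the
whole requested range. A hedged sketch of that check (helper name and exact
placement are assumptions, not necessarily the committed code):
static bool imap_spans_range(struct xfs_bmbt_irec *imap,
			     xfs_fileoff_t offset_fsb,
			     xfs_fileoff_t end_fsb)
{
	return imap->br_startoff <= offset_fsb &&
	       imap->br_startoff + imap->br_blockcount >= end_fsb;
}

	/*
	 * In the NOWAIT mapping path: if this mapping cannot satisfy the
	 * whole IO, fail now rather than issue a partial IO and then hit
	 * -EAGAIN on a later mapping attempt.
	 */
	if ((flags & IOMAP_NOWAIT) &&
	    !imap_spans_range(&imap, offset_fsb, end_fsb)) {
		error = -EAGAIN;
		goto out_unlock;
	}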
Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
We remove "other info" from "readelf -s --wide" output when
parsing GLOBAL_SYM_COUNT variable, which was added in [1].
But we don't do that for VERSIONED_SYM_COUNT and it's failing
the check_abi target on powerpc Fedora 33.
The extra "other info" wasn't problem for VERSIONED_SYM_COUNT
parsing until commit [2] added awk in the pipe, which assumes
that the last column is symbol, but it can be "other info".
Adding "other info" removal for VERSIONED_SYM_COUNT the same
way as we did for GLOBAL_SYM_COUNT parsing.
[1] aa915931ac3e ("libbpf: Fix readelf output parsing for Fedora")
[2] 746f534a4809 ("tools/libbpf: Avoid counting local symbols in ABI check")
Fixes: 746f534a4809 ("tools/libbpf: Avoid counting local symbols in ABI check")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201118211350.1493421-1-jolsa@kernel.org
|
|
Some users are pairing the Dinovo keyboards with the MX5000 or MX5500
receivers, instead of with the Dinovo receivers. The receivers are
mostly the same (and the air protocol obviously is compatible) but
currently the Dinovo receivers are handled by hid-lg.c while the
MX5x00 receivers are handled by logitech-dj.c.
When using a Dinovo keyboard, with its builtin touchpad, through
logitech-dj.c then the touchpad stops working because when asking the
receiver for paired devices, we get only 1 paired device with
a device_type of REPORT_TYPE_KEYBOARD. And since we don't see a paired
mouse, we have nowhere to send mouse-events to, so we drop them.
Extend the existing fix for the Dinovo Edge for this to also cover the
Dinovo Mini keyboard and also add a mapping to logitech-hidpp for the
Media key on the Dinovo Mini, so that that keeps working too.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1811424
Fixes: f2113c3020ef ("HID: logitech-dj: add support for Logitech Bluetooth Mini-Receiver")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
|
|
Fix an error in the mouse / INPUT(2) descriptor used for quad/bt2.0 combo
receivers. Replace INPUT with INPUT (Data,Var,Abs) for the field for the
4 extra buttons which share their report-byte with the low-res hwheel.
This is likely a copy and paste error. I've verified that the new
0x81, 0x02 value matches both the mouse descriptor for the currently
supported MX5000 / MX5500 receivers, as well as the INPUT(2) mouse
descriptors for the Dinovo receivers for which support is being
worked on.
Cc: stable@vger.kernel.org
Fixes: f2113c3020ef ("HID: logitech-dj: add support for Logitech Bluetooth Mini-Receiver")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
|
|
This reverts commit 8986f223bd777a73119f5d593c15b4d630ff49bb.
The proper fix is queued in Will's tree now.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
pseries|pnv_setup_rfi_flush already does the count cache flush setup, and
we just added entry and uaccess flushes, so the name is not very accurate
any more. On both platforms we then also immediately set up the STF flush.
Rename them to _setup_security_mitigations and fold the STF flush in.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
For simplicity in backporting, the original entry_flush test contained
a lot of duplicated code from the rfi_flush test. De-duplicate that code.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Add a test modelled on the RFI flush test which counts the number
of L1D misses doing a simple syscall with the entry flush on and off.
For simplicity of backporting, this test duplicates a lot of code from
rfi_flush. We clean that up in the next patch.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
In kup.h we currently include kup-radix.h for all 64-bit builds, which
includes Book3S and Book3E. The latter doesn't make sense, Book3E
never uses the Radix MMU.
This has worked up until now, but almost by accident, and the recent
uaccess flush changes introduced a build breakage on Book3E because of
the bad structure of the code.
So disentangle things so that we only use kup-radix.h for Book3S. This
requires some more stubs in kup.h and fixing an include in
syscall_64.c.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache after user accesses.
This is part of the fix for CVE-2020-4788.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.
However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.
This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache on kernel entry.
This is part of the fix for CVE-2020-4788.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
We are about to add an entry flush. The rfi (exit) flush test measures
the number of L1D flushes over a syscall with the RFI flush enabled and
disabled. But if the entry flush is also enabled, the effect of enabling
and disabling the RFI flush is masked.
If there is a debugfs entry for the entry flush, disable it during the RFI
flush and restore it later.
Reported-by: Spoorthy S <spoorts2@in.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
CONFIG_PCI=n leads to a compile warning like:
sound/pci/hda/patch_ca0132.c:8214:10: warning: no case matching constant switch condition '0'
due to the missed handling of QUIRK_NONE in ca0132_mmio_init().
Fix it.
Fixes: bf2aa9ccc8e5 ("ALSA: hda/ca0132 - Cleanup ca0132_mmio_init function.")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20201119120404.16833-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Joerg is recovering from an injury, so temporarily add myself to the
IOMMU MAINTAINERS entry so that I'm more likely to get CC'd on patches
while I help to look after the tree for him.
Suggested-by: Joerg Roedel <joro@8bytes.org>
Link: https://lore.kernel.org/r/20201117100953.GR22888@8bytes.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Fix the compile error below (CONFIG_PCI_ATS not set):
drivers/iommu/intel/dmar.c: In function ‘vf_inherit_msi_domain’:
drivers/iommu/intel/dmar.c:338:59: error: ‘struct pci_dev’ has no member named ‘physfn’; did you mean ‘is_physfn’?
338 | dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
| ^~~~~~
| is_physfn
Fixes: ff828729be44 ("iommu/vt-d: Cure VF irqdomain hickup")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-iommu/CAMuHMdXA7wfJovmfSH2nbAhN0cPyCiFHodTvg4a8Hm9rx5Dj-w@mail.gmail.com/
Link: https://lore.kernel.org/r/20201119055119.2862701-1-baolu.lu@linux.intel.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Since we allocate some breadcrumbs for the virtual engine, and the
virtual engine has a custom destructor, we also need to free the
breadcrumbs after use.
Fixes: b3786b29379c ("drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201118133839.1783-1-chris@chris-wilson.co.uk
(cherry picked from commit 45e50f48b7907e650cfbbc7879abfe3a0c419c73)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is the max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110210447.27454-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 2ca5a7b85b0c2b97ef08afbd7799b022e29f192e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|