Age | Commit message (Collapse) | Author | Files | Lines |
|
Modern x86 laptops are starting to use GPIO pins as interrupts more
and more, e.g. touchpads and touchscreens have almost all moved away
from PS/2 and USB to using I2C with a GPIO pin as interrupt.
Modern x86 laptops also have almost all moved to using s2idle instead
of using the system S3 ACPI power state to suspend.
The Intel and AMD pinctrl drivers do not define irq_retrigger handlers
for the irqchips they register, this is causing edge triggered interrupts
which happen while suspended using s2idle to get lost.
One specific example of this is the lid switch on some devices, lid
switches used to be handled by the embedded-controller, but now the
lid open/closed sensor is sometimes directly connected to a GPIO pin.
On most devices the ACPI code for this looks like this:
Method (_E00, ...) {
Notify (LID0, 0x80) // Status Change
}
Where _E00 is an ACPI event handler for changes on both edges of the GPIO
connected to the lid sensor, this event handler is then combined with an
_LID method which directly reads the pin. When the device is resumed by
opening the lid, the GPIO interrupt will wake the system, but because the
pinctrl irqchip doesn't have an irq_retrigger handler, the Notify will not
happen. This is not a problem in the case the _LID method directly reads
the GPIO, because the drivers/acpi/button.c code will call _LID on resume
anyways.
But some devices have an event handler for the GPIO connected to the
lid sensor which looks like this:
Method (_E00, ...) {
if (LID_GPIO == One)
LIDS = One
else
LIDS = Zero
Notify (LID0, 0x80) // Status Change
}
And the _LID method returns the cached LIDS value, since on open we
do not re-run the edge-interrupt handler when we re-enable IRQS on resume
(because of the missing irq_retrigger handler), _LID now will keep
reporting closed, as LIDS was never changed to reflect the open status,
this causes userspace to re-resume the laptop again shortly after opening
the lid.
The Intel GPIO controllers do not allow implementing irq_retrigger without
emulating it in software, at which point we are better of just using the
generic HARDIRQS_SW_RESEND mechanism rather then re-implementing software
emulation for this separately in aprox. 14 different pinctrl drivers.
Select HARDIRQS_SW_RESEND to solve the problem of edge-triggered GPIO
interrupts not being re-triggered on resume when they were triggered during
suspend (s2idle) and/or when they were the cause of the wakeup.
This requires
008f1d60fe25 ("x86/apic/vector: Force interupt handler invocation to irq context")
c16816acd086 ("genirq: Add protection against unsafe usage of generic_handle_irq()")
to protect the APIC based interrupts from being wreckaged by a software
resend.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200123210242.53367-1-hdegoede@redhat.com
|
|
irq_domain_alloc_irqs_hierarchy() has 3 call sites in the compilation unit
but only one of them checks for the pointer which is being dereferenced
inside the called function. Move the check into the function. This allows
for catching the error instead of the following crash:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
PC is at 0x0
LR is at gpiochip_hierarchy_irq_domain_alloc+0x11f/0x140
...
[<c06c23ff>] (gpiochip_hierarchy_irq_domain_alloc)
[<c0462a89>] (__irq_domain_alloc_irqs)
[<c0462dad>] (irq_create_fwspec_mapping)
[<c06c2251>] (gpiochip_to_irq)
[<c06c1c9b>] (gpiod_to_irq)
[<bf973073>] (gpio_irqs_init [gpio_irqs])
[<bf974048>] (gpio_irqs_exit+0xecc/0xe84 [gpio_irqs])
Code: bad PC value
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200306174720.82604-1-alexander.sverdlin@nokia.com
|
|
The AER error injection mechanism just blindly abuses generic_handle_irq()
which is really not meant for consumption by random drivers. The include of
linux/irq.h should have been a red flag in the first place. Driver code,
unless implementing interrupt chips or low level hypervisor functionality
has absolutely no business with that.
Invoking generic_handle_irq() from non interrupt handling context can have
nasty side effects at least on x86 due to the hardware trainwreck which
makes interrupt affinity changes a fragile beast. Sathyanarayanan triggered
a NULL pointer dereference in the low level APIC code that way. While the
particular pointer could be checked this would only paper over the issue
because there are other ways to trigger warnings or silently corrupt state.
Invoke the new irq_inject_interrupt() mechanism, which has the necessary
sanity checks in place and injects the interrupt via the irq_retrigger()
mechanism, which is at least halfways safe vs. the fragile x86 affinity
change mechanics.
It's safe on x86 as it does not corrupt state, but it still can cause a
premature completion of an interrupt affinity change causing the interrupt
line to become stale. Very unlikely, but possible.
For regular operations this is a non issue as AER error injection is meant
for debugging and testing and not for usage on production systems. People
using this should better know what they are doing.
Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lkml.kernel.org/r/20200306130624.098374457@linutronix.de
|
|
Error injection mechanisms need a half ways safe way to inject interrupts as
invoking generic_handle_irq() or the actual device interrupt handler
directly from e.g. a debugfs write is not guaranteed to be safe.
On x86 generic_handle_irq() is unsafe due to the hardware trainwreck which
is the base of x86 interrupt delivery and affinity management.
Move the irq debugfs injection code into a separate function which can be
used by error injection code as well.
The implementation prevents at least that state is corrupted, but it cannot
close a very tiny race window on x86 which might result in a stale and not
serviced device interrupt under very unlikely circumstances.
This is explicitly for debugging and testing and not for production use or
abuse in random driver code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.990928309@linutronix.de
|
|
The code sets IRQS_REPLAY unconditionally whether the resend happens or
not. That doesn't have bad side effects right now, but inconsistent state
is always a latent source of problems.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.882129117@linutronix.de
|
|
In preparation for an interrupt injection interface which can be used
safely by error injection mechanisms. e.g. PCI-E/ AER, add a return value
to check_irq_resend() so errors can be propagated to the caller.
Split out the software resend code so the ugly #ifdef in check_irq_resend()
goes away and the whole thing becomes readable.
Fix up the caller in debugfs. The caller in irq_startup() does not care
about the return value as this is unconditionally invoked for all
interrupts and the resend is best effort anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.775200917@linutronix.de
|
|
Sathyanarayanan reported that the PCI-E AER error injection mechanism
can result in a NULL pointer dereference in apic_ack_edge():
BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
RIP: 0010:apic_ack_edge+0x1e/0x40
Call Trace:
handle_edge_irq+0x7d/0x1e0
generic_handle_irq+0x27/0x30
aer_inject_write+0x53a/0x720
It crashes in irq_complete_move() which dereferences get_irq_regs() which
is obviously NULL when this is called from non interrupt context.
Of course the pointer could be checked, but that just papers over the real
issue. Invoking the low level interrupt handling mechanism from random code
can wreckage the fragile interrupt affinity mechanism of x86 as interrupts
can only be moved in interrupt context or with special care when a CPU goes
offline and the move has to be enforced.
In the best case this triggers the warning in the MSI affinity setter, but
if the call happens on the correct CPU it just corrupts state and might
prevent further interrupt delivery for the affected device.
Mark the APIC interrupts as unsuitable for being invoked in random contexts.
This prevents the AER injection from proliferating the wreckage, but that's
less broken than the current state of affairs and more correct than just
papering over the problem by sprinkling random checks all over the place
and silently corrupting state.
Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200306130623.684591280@linutronix.de
|
|
In general calling generic_handle_irq() with interrupts disabled from non
interrupt context is harmless. For some interrupt controllers like the x86
trainwrecks this is outright dangerous as it might corrupt state if an
interrupt affinity change is pending.
Add infrastructure which allows to mark interrupts as unsafe and catch such
usage in generic_handle_irq().
Reported-by: sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lkml.kernel.org/r/20200306130623.590923677@linutronix.de
|
|
Interrupts cannot be injected when the interrupt is not activated and when
a replay is already in progress.
Fixes: 536e2e34bd00 ("genirq/debugfs: Triggering of interrupts from userspace")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200306130623.500019114@linutronix.de
|
|
The function got renamed at some point, but the kernel-doc was not updated.
Signed-off-by: luanshi <zhangliguang@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1583200125-58806-1-git-send-email-zhangliguang@linux.alibaba.com
|
|
|
|
KVM emulates UMIP on hardware that doesn't support it by setting the
'descriptor table exiting' VM-execution control and performing
instruction emulation. When running nested, this emulation is broken as
KVM refuses to emulate L2 instructions by default.
Correct this regression by allowing the emulation of descriptor table
instructions if L1 hasn't requested 'descriptor table exiting'.
Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode")
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
If sbi->s_flex_groups_allocated is zero and the first allocation fails
then this code will crash. The problem is that "i--" will set "i" to
-1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
is type promoted to unsigned and becomes UINT_MAX. Since UINT_MAX
is more than zero, the condition is true so we call kvfree(new_groups[-1]).
The loop will carry on freeing invalid memory until it crashes.
Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access")
Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Removing attach_adapter from this driver caused a regression for at
least some machines. Those machines had the sensors described in their
DT, too, so they didn't need manual creation of the sensor devices. The
old code worked, though, because manual creation came first. Creation of
DT devices then failed later and caused error logs, but the sensors
worked nonetheless because of the manually created devices.
When removing attach_adaper, manual creation now comes later and loses
the race. The sensor devices were already registered via DT, yet with
another binding, so the driver could not be bound to it.
This fix refactors the code to remove the race and only manually creates
devices if there are no DT nodes present. Also, the DT binding is updated
to match both, the DT and manually created devices. Because we don't
know which device creation will be used at runtime, the code to start
the kthread is moved to do_probe() which will be called by both methods.
Fixes: 3e7bed52719d ("macintosh: therm_windtunnel: drop using attach_adapter")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Tested-by: Erhard Furtner <erhard_f@mailbox.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org # v4.19+
|
|
journal_head::b_transaction and journal_head::b_next_transaction could
be accessed concurrently as noticed by KCSAN,
LTP: starting fsync04
/dev/zero: Can't open blockdev
EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem
EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
==================================================================
BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2]
write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70:
__jbd2_journal_refile_buffer+0xdd/0x210 [jbd2]
__jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569
jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2]
(inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034
kjournald2+0x13b/0x450 [jbd2]
kthread+0x1cd/0x1f0
ret_from_fork+0x27/0x50
read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68:
jbd2_write_access_granted+0x1b2/0x250 [jbd2]
jbd2_write_access_granted at fs/jbd2/transaction.c:1155
jbd2_journal_get_write_access+0x2c/0x60 [jbd2]
__ext4_journal_get_write_access+0x50/0x90 [ext4]
ext4_mb_mark_diskspace_used+0x158/0x620 [ext4]
ext4_mb_new_blocks+0x54f/0xca0 [ext4]
ext4_ind_map_blocks+0xc79/0x1b40 [ext4]
ext4_map_blocks+0x3b4/0x950 [ext4]
_ext4_get_block+0xfc/0x270 [ext4]
ext4_get_block+0x3b/0x50 [ext4]
__block_write_begin_int+0x22e/0xae0
__block_write_begin+0x39/0x50
ext4_write_begin+0x388/0xb50 [ext4]
generic_perform_write+0x15d/0x290
ext4_buffered_write_iter+0x11f/0x210 [ext4]
ext4_file_write_iter+0xce/0x9e0 [ext4]
new_sync_write+0x29c/0x3b0
__vfs_write+0x92/0xa0
vfs_write+0x103/0x260
ksys_write+0x9d/0x130
__x64_sys_write+0x4c/0x60
do_syscall_64+0x91/0xb05
entry_SYSCALL_64_after_hwframe+0x49/0xbe
5 locks held by fsync04/25724:
#0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260
#1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4]
#2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2]
#3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4]
#4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2]
irq event stamp: 1407125
hardirqs last enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790
hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790
softirqs last enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c
softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0
Reported by Kernel Concurrency Sanitizer on:
CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7
Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
The plain reads are outside of jh->b_state_lock critical section which result
in data races. Fix them by adding pairs of READ|WRITE_ONCE().
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
In older version of systemd(219), at boot time, udevadm is called with :
/usr/bin/udevadm trigger --type=devices --action=add"
This program generates an echo "add" in /sys/devices/system/cpu/cpu<x>/uevent,
leading to the "kvm: disabled by bios" message in case of your Bios disabled
the virtualization extensions.
On a modern system running up to 256 CPU threads, this pollutes the Kernel logs.
This patch offers to ratelimit this message to avoid any userspace program triggering
this uevent printing this message too often.
This patch is only a workaround but greatly reduce the pollution without
breaking the current behavior of printing a message if some try to instantiate
KVM on a system that doesn't support it.
Note that recent versions of systemd (>239) do not have trigger this behavior.
This patch will be useful at least for some using older systemd with recent Kernels.
Signed-off-by: Erwan Velu <e.velu@criteo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
struct cpufreq_policy is quite big and it is not a good idea
to allocate one on the stack. Just use cpufreq_cpu_get and
cpufreq_cpu_put which is even simpler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Restrict -Werror to well-tested configurations and allow disabling it
via Kconfig.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Compile error with CONFIG_KVM_INTEL=y and W=1:
CC arch/x86/kvm/vmx/vmx.o
arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=]
68 | static const struct x86_cpu_id vmx_cpu_id[] = {
| ^~~~~~~~~~
cc1: all warnings being treated as errors
When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a
reference to the structure (or any code at all). This makes W=1 compiles
unhappy.
Wrap both in a #ifdef to avoid the issue.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
[Do the same for CONFIG_KVM_AMD. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Nick Desaulniers Reported:
When building with:
$ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
The following warning is observed:
arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
vector)
^
Debugging with:
https://github.com/ClangBuiltLinux/frame-larger-than
via:
$ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
kvm_send_ipi_mask_allbutself
points to the stack allocated `struct cpumask newmask` in
`kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is
potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for
the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as
8192, making a single instance of a `struct cpumask` 1024 B.
This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for
both pv tlb and pv ipis..
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Introduce some pv check helpers for consistency.
Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Sparse notices that declaration and implementation do not match:
arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: warning: incorrect type in return expression (different address spaces)
arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: expected struct kvm_vcpu [noderef] <asn:3> **
arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: got struct kvm_vcpu *[noderef] <asn:3> *
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Even if APICv is disabled at startup, the backing page and ir_list need
to be initialized in case they are needed later. The only case in
which this can be skipped is for userspace irqchip, and that must be
done because avic_init_backing_page dereferences vcpu->arch.apic
(which is NULL for userspace irqchip).
Tested-by: rmuncrief@humanavance.com
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
de80f95ccb9c ("PCI: cadence: Move all files to per-device cadence
directory") moved files of the PCI cadence drivers, but did not update the
MAINTAINERS entry.
Since then, ./scripts/get_maintainer.pl --self-test complains:
warning: no file matches F: drivers/pci/controller/pcie-cadence*
Repair the MAINTAINERS entry.
Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
We must set MSG_CMSG_COMPAT if we're in compatability mode, otherwise
the iovec import for these commands will not do the right thing and fail
the command with -EINVAL.
Found by running the test suite compiled as 32-bit.
Cc: stable@vger.kernel.org
Fixes: aa1fa28fc73e ("io_uring: add support for recvmsg()")
Fixes: 0fa03c624d8f ("io_uring: add support for sendmsg()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add missing ~ to the usage of the mask.
Reported-by: Kevin Benson <Kevin.Benson@zii.aero>
Reported-by: Chris Healy <Chris.Healy@zii.aero>
Fixes: 5c74c54ce6ff ("net: dsa: mv88e6xxx: Split monitor port configuration")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
During initialization the driver issues a reset to the device and waits
for 100ms before checking if the firmware is ready. The waiting is
necessary because before that the device is irresponsive and the first
read can result in a completion timeout.
While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is
insufficient for Spectrum-3.
Fix this by increasing the timeout to 200ms.
Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can't just use the top bits of the last sync event as they could be
off-by-one every 65,536 seconds, giving an error in reconstruction of
65,536 seconds.
This patch uses the difference in the bottom 16 bits (mod 2^16) to
calculate an offset that needs to be applied to the last sync event to
get to the current time.
Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Acked-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some transports (hyperv, virtio) acquire the sock lock during the
.release() callback.
In the vsock_stream_connect() we call vsock_assign_transport(); if
the socket was previously assigned to another transport, the
vsk->transport->release() is called, but the sock lock is already
held in the vsock_stream_connect(), causing a deadlock reported by
syzbot:
INFO: task syz-executor280:9768 blocked for more than 143 seconds.
Not tainted 5.6.0-rc1-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor280 D27912 9768 9766 0x00000000
Call Trace:
context_switch kernel/sched/core.c:3386 [inline]
__schedule+0x934/0x1f90 kernel/sched/core.c:4082
schedule+0xdc/0x2b0 kernel/sched/core.c:4156
__lock_sock+0x165/0x290 net/core/sock.c:2413
lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832
vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
__sys_connect_file+0x161/0x1c0 net/socket.c:1857
__sys_connect+0x174/0x1b0 net/socket.c:1874
__do_sys_connect net/socket.c:1885 [inline]
__se_sys_connect net/socket.c:1882 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:1882
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
To avoid this issue, this patch remove the lock acquiring in the
.release() callback of hyperv and virtio transports, and it holds
the lock when we call vsk->transport->release() in the vsock core.
Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com
Fixes: 408624af4c89 ("vsock: use local transport when it is loaded")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixes: 3a12500ed5dd ("unix: define and set show_fdinfo only if procfs is enabled")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Packet forwarding is not working in rmnet bridge mode.
Because when a packet is forwarded, skb_push() for an ethernet header
is needed. But it doesn't call skb_push().
So, the ethernet header will be lost.
Test commands:
modprobe rmnet
ip netns add nst
ip netns add nst2
ip link add veth0 type veth peer name veth1
ip link add veth2 type veth peer name veth3
ip link set veth1 netns nst
ip link set veth3 netns nst2
ip link add rmnet0 link veth0 type rmnet mux_id 1
ip link set veth2 master rmnet0
ip link set veth0 up
ip link set veth2 up
ip link set rmnet0 up
ip a a 192.168.100.1/24 dev rmnet0
ip netns exec nst ip link set veth1 up
ip netns exec nst ip a a 192.168.100.2/24 dev veth1
ip netns exec nst2 ip link set veth3 up
ip netns exec nst2 ip a a 192.168.100.3/24 dev veth3
ip netns exec nst2 ping 192.168.100.2
Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to attach a bridge interface to the rmnet interface,
"master" operation is used.
(e.g. ip link set dummy1 master rmnet0)
But, in the rmnet_add_bridge(), which is a callback of ->ndo_add_slave()
doesn't register lower interface.
So, ->ndo_del_slave() doesn't work.
There are other problems too.
1. It couldn't detect circular upper/lower interface relationship.
2. It couldn't prevent stack overflow because of too deep depth
of upper/lower interface
3. It doesn't check the number of lower interfaces.
4. Panics because of several reasons.
The root problem of these issues is actually the same.
So, in this patch, these all problems will be fixed.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link add dummy1 master rmnet0 type dummy
ip link add dummy2 master rmnet0 type dummy
ip link del rmnet0
ip link del dummy2
ip link del dummy1
Splat looks like:
[ 41.867595][ T1164] general protection fault, probably for non-canonical address 0xdffffc0000000101I
[ 41.869993][ T1164] KASAN: null-ptr-deref in range [0x0000000000000808-0x000000000000080f]
[ 41.872950][ T1164] CPU: 0 PID: 1164 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 41.873915][ T1164] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 41.875161][ T1164] RIP: 0010:rmnet_unregister_bridge.isra.6+0x71/0xf0 [rmnet]
[ 41.876178][ T1164] Code: 48 89 ef 48 89 c6 5b 5d e9 fc fe ff ff e8 f7 f3 ff ff 48 8d b8 08 08 00 00 48 ba 00 7
[ 41.878925][ T1164] RSP: 0018:ffff8880c4d0f188 EFLAGS: 00010202
[ 41.879774][ T1164] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000101
[ 41.887689][ T1164] RDX: dffffc0000000000 RSI: ffffffffb8cf64f0 RDI: 0000000000000808
[ 41.888727][ T1164] RBP: ffff8880c40e4000 R08: ffffed101b3c0e3c R09: 0000000000000001
[ 41.889749][ T1164] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 1ffff110189a1e3c
[ 41.890783][ T1164] R13: ffff8880c4d0f200 R14: ffffffffb8d56160 R15: ffff8880ccc2c000
[ 41.891794][ T1164] FS: 00007f4300edc0c0(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
[ 41.892953][ T1164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.893800][ T1164] CR2: 00007f43003bc8c0 CR3: 00000000ca53e001 CR4: 00000000000606f0
[ 41.894824][ T1164] Call Trace:
[ 41.895274][ T1164] ? rcu_is_watching+0x2c/0x80
[ 41.895895][ T1164] rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
[ 41.896687][ T1164] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[ 41.897611][ T1164] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[ 41.898508][ T1164] ? __module_text_address+0x13/0x140
[ 41.899162][ T1164] notifier_call_chain+0x90/0x160
[ 41.899814][ T1164] rollback_registered_many+0x660/0xcf0
[ 41.900544][ T1164] ? netif_set_real_num_tx_queues+0x780/0x780
[ 41.901316][ T1164] ? __lock_acquire+0xdfe/0x3de0
[ 41.901958][ T1164] ? memset+0x1f/0x40
[ 41.902468][ T1164] ? __nla_validate_parse+0x98/0x1ab0
[ 41.903166][ T1164] unregister_netdevice_many.part.133+0x13/0x1b0
[ 41.903988][ T1164] rtnl_delete_link+0xbc/0x100
[ ... ]
Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
netdev_upper_dev_link() is useful to manage lower/upper interfaces.
And this function internally validates looping, maximum depth.
All or most virtual interfaces that could have a real interface
(e.g. macsec, macvlan, ipvlan etc.) use lower/upper infrastructure.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add rmnet1 link dummy0 type rmnet mux_id 1
for i in {2..100}
do
let A=$i-1
ip link add rmnet$i link rmnet$A type rmnet mux_id $i
done
ip link del dummy0
The purpose of the test commands is to make stack overflow.
Splat looks like:
[ 52.411438][ T1395] BUG: KASAN: slab-out-of-bounds in find_busiest_group+0x27e/0x2c00
[ 52.413218][ T1395] Write of size 64 at addr ffff8880c774bde0 by task ip/1395
[ 52.414841][ T1395]
[ 52.430720][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 52.496511][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 52.513597][ T1395] Call Trace:
[ 52.546516][ T1395]
[ 52.558773][ T1395] Allocated by task 3171537984:
[ 52.588290][ T1395] BUG: unable to handle page fault for address: ffffffffb999e260
[ 52.589311][ T1395] #PF: supervisor read access in kernel mode
[ 52.590529][ T1395] #PF: error_code(0x0000) - not-present page
[ 52.591374][ T1395] PGD d6818067 P4D d6818067 PUD d6819063 PMD 0
[ 52.592288][ T1395] Thread overran stack, or stack corrupted
[ 52.604980][ T1395] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 52.605856][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 52.611764][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 52.621520][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30
[ 52.622296][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0
[ 52.627887][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006
[ 52.628735][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000
[ 52.631773][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0
[ 52.649584][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403
[ 52.674857][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0
[ 52.678257][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000
[ 52.694541][ T1395] FS: 00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
[ 52.764039][ T1395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 52.815008][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0
[ 52.862312][ T1395] Call Trace:
[ 52.887133][ T1395] Modules linked in: dummy rmnet veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_dex
[ 52.936749][ T1395] CR2: ffffffffb999e260
[ 52.965695][ T1395] ---[ end trace 7e32ca99482dbb31 ]---
[ 52.966556][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30
[ 52.971083][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0
[ 53.003650][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006
[ 53.043183][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000
[ 53.076480][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0
[ 53.093858][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403
[ 53.112795][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0
[ 53.139837][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000
[ 53.141500][ T1395] FS: 00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
[ 53.143343][ T1395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 53.152007][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0
[ 53.156459][ T1395] Kernel panic - not syncing: Fatal exception
[ 54.213570][ T1395] Shutting down cpus with NMI
[ 54.354112][ T1395] Kernel Offset: 0x33000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0x)
[ 54.355687][ T1395] Rebooting in 5 seconds..
Fixes: b37f78f234bf ("net: qualcomm: rmnet: Fix crash on real dev unregistration")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Basically, duplicate mux id isn't be allowed.
So, the creation of rmnet will be failed if there is duplicate mux id
is existing.
But, changelink routine doesn't check duplicate mux id.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link add rmnet1 link dummy0 type rmnet mux_id 2
ip link set rmnet1 type rmnet mux_id 1
Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The notifier_call() of the slave interface removes rmnet interface with
unregister_netdevice_queue().
But, before calling unregister_netdevice_queue(), it acquires
rcu readlock.
In the RCU critical section, sleeping isn't be allowed.
But, unregister_netdevice_queue() internally calls synchronize_net(),
which would sleep.
So, suspicious RCU usage warning occurs.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link set dummy1 master rmnet0
ip link del dummy0
Splat looks like:
[ 79.639245][ T1195] =============================
[ 79.640134][ T1195] WARNING: suspicious RCU usage
[ 79.640852][ T1195] 5.6.0-rc1+ #447 Not tainted
[ 79.641657][ T1195] -----------------------------
[ 79.642472][ T1195] ./include/linux/rcupdate.h:273 Illegal context switch in RCU read-side critical section!
[ 79.644043][ T1195]
[ 79.644043][ T1195] other info that might help us debug this:
[ 79.644043][ T1195]
[ 79.645682][ T1195]
[ 79.645682][ T1195] rcu_scheduler_active = 2, debug_locks = 1
[ 79.646980][ T1195] 2 locks held by ip/1195:
[ 79.647629][ T1195] #0: ffffffffa3cf64f0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x457/0x890
[ 79.649312][ T1195] #1: ffffffffa39256c0 (rcu_read_lock){....}, at: rmnet_config_notify_cb+0xf0/0x590 [rmnet]
[ 79.651717][ T1195]
[ 79.651717][ T1195] stack backtrace:
[ 79.652650][ T1195] CPU: 3 PID: 1195 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 79.653702][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 79.655037][ T1195] Call Trace:
[ 79.655560][ T1195] dump_stack+0x96/0xdb
[ 79.656252][ T1195] ___might_sleep+0x345/0x440
[ 79.656994][ T1195] synchronize_net+0x18/0x30
[ 79.661132][ T1195] netdev_rx_handler_unregister+0x40/0xb0
[ 79.666266][ T1195] rmnet_unregister_real_device+0x42/0xb0 [rmnet]
[ 79.667211][ T1195] rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
[ 79.668121][ T1195] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[ 79.669166][ T1195] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[ 79.670286][ T1195] ? __module_text_address+0x13/0x140
[ 79.671139][ T1195] notifier_call_chain+0x90/0x160
[ 79.671973][ T1195] rollback_registered_many+0x660/0xcf0
[ 79.672893][ T1195] ? netif_set_real_num_tx_queues+0x780/0x780
[ 79.675091][ T1195] ? __lock_acquire+0xdfe/0x3de0
[ 79.675825][ T1195] ? memset+0x1f/0x40
[ 79.676367][ T1195] ? __nla_validate_parse+0x98/0x1ab0
[ 79.677290][ T1195] unregister_netdevice_many.part.133+0x13/0x1b0
[ 79.678163][ T1195] rtnl_delete_link+0xbc/0x100
[ ... ]
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rmnet_get_port() internally calls rcu_dereference_rtnl(),
which checks RTNL.
But rmnet_get_port() could be called by packet path.
The packet path is not protected by RTNL.
So, the suspicious RCU usage problem occurs.
Test commands:
modprobe rmnet
ip netns add nst
ip link add veth0 type veth peer name veth1
ip link set veth1 netns nst
ip link add rmnet0 link veth0 type rmnet mux_id 1
ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1
ip netns exec nst ip link set veth1 up
ip netns exec nst ip link set rmnet1 up
ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1
ip link set veth0 up
ip link set rmnet0 up
ip a a 192.168.100.1/24 dev rmnet0
ping 192.168.100.2
Splat looks like:
[ 146.630958][ T1174] WARNING: suspicious RCU usage
[ 146.631735][ T1174] 5.6.0-rc1+ #447 Not tainted
[ 146.632387][ T1174] -----------------------------
[ 146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() !
[ 146.634742][ T1174]
[ 146.634742][ T1174] other info that might help us debug this:
[ 146.634742][ T1174]
[ 146.645992][ T1174]
[ 146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1
[ 146.646937][ T1174] 5 locks held by ping/1174:
[ 146.647609][ T1174] #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980
[ 146.662463][ T1174] #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150
[ 146.671696][ T1174] #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940
[ 146.673064][ T1174] #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150
[ 146.690358][ T1174] #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020
[ 146.699875][ T1174]
[ 146.699875][ T1174] stack backtrace:
[ 146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ #447
[ 146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 146.706565][ T1174] Call Trace:
[ 146.707102][ T1174] dump_stack+0x96/0xdb
[ 146.708007][ T1174] rmnet_get_port.part.9+0x76/0x80 [rmnet]
[ 146.709233][ T1174] rmnet_egress_handler+0x107/0x420 [rmnet]
[ 146.710492][ T1174] ? sch_direct_xmit+0x1e2/0x1020
[ 146.716193][ T1174] rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet]
[ 146.717012][ T1174] dev_hard_start_xmit+0x160/0x740
[ 146.717854][ T1174] sch_direct_xmit+0x265/0x1020
[ 146.718577][ T1174] ? register_lock_class+0x14d0/0x14d0
[ 146.719429][ T1174] ? dev_watchdog+0xac0/0xac0
[ 146.723738][ T1174] ? __dev_queue_xmit+0x15fd/0x2940
[ 146.724469][ T1174] ? lock_acquire+0x164/0x3b0
[ 146.725172][ T1174] __dev_queue_xmit+0x20c7/0x2940
[ ... ]
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the rmnet_changelink(), it uses IFLA_LINK without checking
NULL pointer.
tb[IFLA_LINK] could be NULL pointer.
So, NULL-ptr-deref could occur.
rmnet already has a lower interface (real_dev).
So, after this patch, rmnet_changelink() does not use IFLA_LINK anymore.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link set rmnet0 type rmnet mux_id 2
Splat looks like:
[ 90.578726][ T1131] general protection fault, probably for non-canonical address 0xdffffc0000000000I
[ 90.581121][ T1131] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 90.582380][ T1131] CPU: 2 PID: 1131 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 90.584285][ T1131] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 90.587506][ T1131] RIP: 0010:rmnet_changelink+0x5a/0x8a0 [rmnet]
[ 90.588546][ T1131] Code: 83 ec 20 48 c1 ea 03 80 3c 02 00 0f 85 6f 07 00 00 48 8b 5e 28 48 b8 00 00 00 00 00 0
[ 90.591447][ T1131] RSP: 0018:ffff8880ce78f1b8 EFLAGS: 00010247
[ 90.592329][ T1131] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff8880ce78f8b0
[ 90.593253][ T1131] RDX: 0000000000000000 RSI: ffff8880ce78f4a0 RDI: 0000000000000004
[ 90.594058][ T1131] RBP: ffff8880cf543e00 R08: 0000000000000002 R09: 0000000000000002
[ 90.594859][ T1131] R10: ffffffffc0586a40 R11: 0000000000000000 R12: ffff8880ca47c000
[ 90.595690][ T1131] R13: ffff8880ca47c000 R14: ffff8880cf545000 R15: 0000000000000000
[ 90.596553][ T1131] FS: 00007f21f6c7e0c0(0000) GS:ffff8880da400000(0000) knlGS:0000000000000000
[ 90.597504][ T1131] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 90.599418][ T1131] CR2: 0000556e413db458 CR3: 00000000c917a002 CR4: 00000000000606e0
[ 90.600289][ T1131] Call Trace:
[ 90.600631][ T1131] __rtnl_newlink+0x922/0x1270
[ 90.601194][ T1131] ? lock_downgrade+0x6e0/0x6e0
[ 90.601724][ T1131] ? rtnl_link_unregister+0x220/0x220
[ 90.602309][ T1131] ? lock_acquire+0x164/0x3b0
[ 90.602784][ T1131] ? is_bpf_image_address+0xff/0x1d0
[ 90.603331][ T1131] ? rtnl_newlink+0x4c/0x90
[ 90.603810][ T1131] ? kernel_text_address+0x111/0x140
[ 90.604419][ T1131] ? __kernel_text_address+0xe/0x30
[ 90.604981][ T1131] ? unwind_get_return_address+0x5f/0xa0
[ 90.605616][ T1131] ? create_prof_cpu_mask+0x20/0x20
[ 90.606304][ T1131] ? arch_stack_walk+0x83/0xb0
[ 90.606985][ T1131] ? stack_trace_save+0x82/0xb0
[ 90.607656][ T1131] ? stack_trace_consume_entry+0x160/0x160
[ 90.608503][ T1131] ? deactivate_slab.isra.78+0x2c5/0x800
[ 90.609336][ T1131] ? kasan_unpoison_shadow+0x30/0x40
[ 90.610096][ T1131] ? kmem_cache_alloc_trace+0x135/0x350
[ 90.610889][ T1131] ? rtnl_newlink+0x4c/0x90
[ 90.611512][ T1131] rtnl_newlink+0x65/0x90
[ ... ]
Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rmnet registers IFLA_LINK interface as a lower interface.
But, IFLA_LINK could be NULL.
In the current code, rmnet doesn't check IFLA_LINK.
So, panic would occur.
Test commands:
modprobe rmnet
ip link add rmnet0 type rmnet mux_id 1
Splat looks like:
[ 36.826109][ T1115] general protection fault, probably for non-canonical address 0xdffffc0000000000I
[ 36.838817][ T1115] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 36.839908][ T1115] CPU: 1 PID: 1115 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 36.840569][ T1115] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 36.841408][ T1115] RIP: 0010:rmnet_newlink+0x54/0x510 [rmnet]
[ 36.841986][ T1115] Code: 83 ec 18 48 c1 e9 03 80 3c 01 00 0f 85 d4 03 00 00 48 8b 6a 28 48 b8 00 00 00 00 00 c
[ 36.843923][ T1115] RSP: 0018:ffff8880b7e0f1c0 EFLAGS: 00010247
[ 36.844756][ T1115] RAX: dffffc0000000000 RBX: ffff8880d14cca00 RCX: 1ffff11016fc1e99
[ 36.845859][ T1115] RDX: 0000000000000000 RSI: ffff8880c3d04000 RDI: 0000000000000004
[ 36.846961][ T1115] RBP: 0000000000000000 R08: ffff8880b7e0f8b0 R09: ffff8880b6ac2d90
[ 36.848020][ T1115] R10: ffffffffc0589a40 R11: ffffed1016d585b7 R12: ffffffff88ceaf80
[ 36.848788][ T1115] R13: ffff8880c3d04000 R14: ffff8880b7e0f8b0 R15: ffff8880c3d04000
[ 36.849546][ T1115] FS: 00007f50ab3360c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000
[ 36.851784][ T1115] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 36.852422][ T1115] CR2: 000055871afe5ab0 CR3: 00000000ae246001 CR4: 00000000000606e0
[ 36.853181][ T1115] Call Trace:
[ 36.853514][ T1115] __rtnl_newlink+0xbdb/0x1270
[ 36.853967][ T1115] ? lock_downgrade+0x6e0/0x6e0
[ 36.854420][ T1115] ? rtnl_link_unregister+0x220/0x220
[ 36.854936][ T1115] ? lock_acquire+0x164/0x3b0
[ 36.855376][ T1115] ? is_bpf_image_address+0xff/0x1d0
[ 36.855884][ T1115] ? rtnl_newlink+0x4c/0x90
[ 36.856304][ T1115] ? kernel_text_address+0x111/0x140
[ 36.856857][ T1115] ? __kernel_text_address+0xe/0x30
[ 36.857440][ T1115] ? unwind_get_return_address+0x5f/0xa0
[ 36.858063][ T1115] ? create_prof_cpu_mask+0x20/0x20
[ 36.858644][ T1115] ? arch_stack_walk+0x83/0xb0
[ 36.859171][ T1115] ? stack_trace_save+0x82/0xb0
[ 36.859710][ T1115] ? stack_trace_consume_entry+0x160/0x160
[ 36.860357][ T1115] ? deactivate_slab.isra.78+0x2c5/0x800
[ 36.860928][ T1115] ? kasan_unpoison_shadow+0x30/0x40
[ 36.861520][ T1115] ? kmem_cache_alloc_trace+0x135/0x350
[ 36.862125][ T1115] ? rtnl_newlink+0x4c/0x90
[ 36.864073][ T1115] rtnl_newlink+0x65/0x90
[ ... ]
Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Don't attempt to interpret the PHY specific status register unless
the PHY is indicating that the resolution is valid.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current code causes problems when the unregistering netdevice could
be different then the registering one.
Since the check in mlx5_lag_netdev_event() does not allow any other
network namespace anyway, fix this by registerting the lag notifier
per init network namespace only.
Fixes: d48834f9d4b4 ("mlx5: Use dev_net netdevice notifier registrations")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Aya Levin <ayal@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Follow the pattern used with other *_show_fdinfo functions and only
define unix_show_fdinfo and set it in proto_ops if CONFIG_PROCFS
is set.
Fixes: 3c32da19a858 ("unix: Show number of pending scm files of receive queue in fdinfo")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
should use real receive queue number to configure hw rss
indirect table rather than maximal queue number
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
a reserved field is used to signify prime physical function index
in the latest firmware version, so we must assign a value to it
correctly
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
can not use a local variable as an input parameter of
irq_set_affinity_hint
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Completions need to consumed in the same order the controller submitted
them, otherwise future completion entries may overwrite ones we haven't
handled yet. Hold the nvme queue's poll lock while completing new CQEs to
prevent another thread from freeing command tags for reuse out-of-order.
Fixes: dabcefab45d3 ("nvme: provide optimized poll function for separate poll queues")
Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Some older compilers have no implementation for the helper for 64-bit
unsigned division/modulo, so linking pcie-brcmstb driver causes the
"undefined reference to `__aeabi_uldivmod'" error.
*rc_bar2_size is always a power of two, because it is calculated as:
"1ULL << fls64(entry->res->end - entry->res->start)", so the modulo
operation in the subsequent check can be replaced by a simple logical
AND with a proper mask.
Link: https://lore.kernel.org/r/20200227115146.24515-1-m.szyprowski@samsung.com
Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
|
|
Follow the pattern used with other *_show_fdinfo functions and only
define and use io_uring_show_fdinfo and its helper functions if
CONFIG_PROC_FS is set.
Fixes: 87ce955b24c9 ("io_uring: add ->show_fdinfo() for the io_uring file descriptor")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Before commit 1e4f63aecb53 ("cpufreq: Avoid creating excessively
large stack frames") the initial value of the policy field in struct
cpufreq_policy set by the driver's ->init() callback was implicitly
passed from cpufreq_init_policy() to cpufreq_set_policy() if the
default governor was neither "performance" nor "powersave". After
that commit, however, cpufreq_init_policy() must take that case into
consideration explicitly and handle it as appropriate, so make that
happen.
Fixes: 1e4f63aecb53 ("cpufreq: Avoid creating excessively large stack frames")
Link: https://lore.kernel.org/linux-pm/39fb762880c27da110086741315ca8b111d781cd.camel@gmail.com/
Reported-by: Artem Bityutskiy <dedekind1@gmail.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
|
|
In smc_ib_remove_dev() check if the provided ib device was actually
initialized for SMC before.
Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com
Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We cannot register the same netdev notifier multiple times when probing
stmmac devices. Register the notifier only once in module init, and also
make debugfs creation/deletion safe against simultaneous notifier call.
Fixes: 481a7d154cbb ("stmmac: debugfs entry name is not be changed when udev rename device name.")
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|