| Age | Commit message (Collapse) | Author | Files | Lines |
|
Flag (1 << 0) is ignored is set, never used, reject it it with EINVAL
instead.
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
xt_find_revision() expects u8, restrict it to this datatype.
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
inet_recv_error() is called without holding the socket lock.
IPv6 socket could mutate to IPv4 with IPV6_ADDRFORM
socket option and trigger a KCSAN warning.
Fixes: f4713a3dfad0 ("net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socks")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If hv_netvsc driver is unloaded and reloaded, the NET_DEVICE_REGISTER
handler cannot perform VF register successfully as the register call
is received before netvsc_probe is finished. This is because we
register register_netdevice_notifier() very early( even before
vmbus_driver_register()).
To fix this, we try to register each such matching VF( if it is visible
as a netdevice) at the end of netvsc_probe.
Cc: stable@vger.kernel.org
Fixes: 85520856466e ("hv_netvsc: Fix race of register_netdevice_notifier and VF register")
Suggested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When qmem_alloc and pfvf->hw_ops->sq_aq_init fails, sq->sg should be
freed to prevent memleak.
Fixes: c9c12d339d93 ("octeontx2-pf: Add support for PTP clock")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When alloc_scq fails, card->vcs[0] (i.e. vc) should be freed. Otherwise,
in the following call chain:
idt77252_init_one
|-> idt77252_dev_open
|-> open_card_ubr0
|-> alloc_scq [failed]
|-> deinit_card
|-> vfree(card->vcs);
card->vcs is freed and card->vcs[0] is leaked.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the ICMPv6 error is built from a non-linear skb we get the following
splat,
BUG: KASAN: slab-out-of-bounds in do_csum+0x220/0x240
Read of size 4 at addr ffff88811d402c80 by task netperf/820
CPU: 0 PID: 820 Comm: netperf Not tainted 6.8.0-rc1+ #543
...
kasan_report+0xd8/0x110
do_csum+0x220/0x240
csum_partial+0xc/0x20
skb_tunnel_check_pmtu+0xeb9/0x3280
vxlan_xmit_one+0x14c2/0x4080
vxlan_xmit+0xf61/0x5c00
dev_hard_start_xmit+0xfb/0x510
__dev_queue_xmit+0x7cd/0x32a0
br_dev_queue_push_xmit+0x39d/0x6a0
Use skb_checksum instead of csum_partial who cannot deal with non-linear
SKBs.
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For XDP_TX action xdp_buff is converted to xdp_frame. The conversion is
done by xdp_convert_buff_to_frame(). The memory type of the resulting
xdp_frame depends on the memory type of the xdp_buff. For page pool
based xdp_buff it produces xdp_frame with memory type
MEM_TYPE_PAGE_POOL. For zero copy XSK pool based xdp_buff it produces
xdp_frame with memory type MEM_TYPE_PAGE_ORDER0.
tsnep_xdp_xmit_back() is not prepared for that and uses always the page
pool buffer type TSNEP_TX_TYPE_XDP_TX. This leads to invalid mappings
and the transmission of undefined data.
Improve tsnep_xdp_xmit_back() to use the generic buffer type
TSNEP_TX_TYPE_XDP_NDO for zero copy XDP_TX.
Fixes: 3fc2333933fd ("tsnep: Add XDP socket zero-copy RX support")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Using hard-coded constant timeout to wait for some expected
event is deemed to fail sooner or later, especially in slow
env.
Our CI has spotted another of such race:
# TEST: ipv6: cleanup of cached exceptions - nexthop objects [FAIL]
# can't delete veth device in a timely manner, PMTU dst likely leaked
Replace the crude sleep with a loop looking for the expected condition
at low interval for a much longer range.
Fixes: b3cc4f8a8a41 ("selftests: pmtu: add explicit tests for PMTU exceptions cleanup")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/fd5c745e9bb665b724473af6a9373a8c2a62b247.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The pmtu.sh test uses a few TCP listener in a problematic way:
It hard-codes a constant timeout to wait for the listener starting-up
in background. That introduces unneeded latency and on very slow and
busy host it can fail.
Additionally the test starts again the same listener in the same
namespace on the same port, just after the previous connection
completed. Fast host can attempt starting the new server before the
old one really closed the socket.
Address the issues using the wait_local_port_listen helper and
explicitly waiting for the background listener process exit.
Fixes: 136a1b434bbb ("selftests: net: test vxlan pmtu exceptions with tcp")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/f8e8f6d44427d8c45e9f6a71ee1a321047452087.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The setup_ns helper marks the testns global variable as
readonly. Later attempts to set such variable are unsuccessful,
causing a couple test failures.
Avoid completely the variable re-initialization and let the
function access the global value.
Fixes: e9ce7ededf14 ("selftests: rtnetlink: use setup_ns in bonding test")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/6e7c937c8ff73ca52a21a4a536a13a76ec0173a8.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The udpgro_fwd.sh self-tests are somewhat unstable. There are
a few timing constraints the we struggle to meet on very slow
environments.
Instead of skipping the whole tests in such envs, increase the
test resilience WRT very slow hosts: increase the inter-packets
timeouts, avoid resetting the counters every second and finally
disable reduce the background traffic noise.
Tested with:
for I in $(seq 1 100); do
./tools/testing/selftests/kselftest_install/run_kselftest.sh \
-t net:udpgro_fwd.sh || exit -1
done
in a slow environment.
Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/f4b6b11064a0d39182a9ae6a853abae3e9b4426a.1706812005.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Function aq_ring_hwts_rx_alloc() maps extra AQ_CFG_RXDS_DEF bytes
for PTP HWTS ring but then generic aq_ring_free() does not take this
into account.
Create and use a specific function to free HWTS ring to fix this
issue.
Trace:
[ 215.351607] ------------[ cut here ]------------
[ 215.351612] DMA-API: atlantic 0000:4b:00.0: device driver frees DMA memory with different size [device address=0x00000000fbdd0000] [map size=34816 bytes] [unmap size=32768 bytes]
[ 215.351635] WARNING: CPU: 33 PID: 10759 at kernel/dma/debug.c:988 check_unmap+0xa6f/0x2360
...
[ 215.581176] Call Trace:
[ 215.583632] <TASK>
[ 215.585745] ? show_trace_log_lvl+0x1c4/0x2df
[ 215.590114] ? show_trace_log_lvl+0x1c4/0x2df
[ 215.594497] ? debug_dma_free_coherent+0x196/0x210
[ 215.599305] ? check_unmap+0xa6f/0x2360
[ 215.603147] ? __warn+0xca/0x1d0
[ 215.606391] ? check_unmap+0xa6f/0x2360
[ 215.610237] ? report_bug+0x1ef/0x370
[ 215.613921] ? handle_bug+0x3c/0x70
[ 215.617423] ? exc_invalid_op+0x14/0x50
[ 215.621269] ? asm_exc_invalid_op+0x16/0x20
[ 215.625480] ? check_unmap+0xa6f/0x2360
[ 215.629331] ? mark_lock.part.0+0xca/0xa40
[ 215.633445] debug_dma_free_coherent+0x196/0x210
[ 215.638079] ? __pfx_debug_dma_free_coherent+0x10/0x10
[ 215.643242] ? slab_free_freelist_hook+0x11d/0x1d0
[ 215.648060] dma_free_attrs+0x6d/0x130
[ 215.651834] aq_ring_free+0x193/0x290 [atlantic]
[ 215.656487] aq_ptp_ring_free+0x67/0x110 [atlantic]
...
[ 216.127540] ---[ end trace 6467e5964dd2640b ]---
[ 216.132160] DMA-API: Mapped at:
[ 216.132162] debug_dma_alloc_coherent+0x66/0x2f0
[ 216.132165] dma_alloc_attrs+0xf5/0x1b0
[ 216.132168] aq_ring_hwts_rx_alloc+0x150/0x1f0 [atlantic]
[ 216.132193] aq_ptp_ring_alloc+0x1bb/0x540 [atlantic]
[ 216.132213] aq_nic_init+0x4a1/0x760 [atlantic]
Fixes: 94ad94558b0f ("net: aquantia: add PTP rings infrastructure")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201094752.883026-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Many syzbot reports include the following trace [1]
If nsim_dev_trap_report_work() can not grab the mutex,
it should rearm itself at least one jiffie later.
[1]
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 32383 Comm: kworker/0:2 Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: events nsim_dev_trap_report_work
RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:89 [inline]
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:104 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:129 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:161 [inline]
RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline]
RIP: 0010:kasan_check_range+0x101/0x190 mm/kasan/generic.c:189
Code: 07 49 39 d1 75 0a 45 3a 11 b8 01 00 00 00 7c 0b 44 89 c2 e8 21 ed ff ff 83 f0 01 5b 5d 41 5c c3 48 85 d2 74 4f 48 01 ea eb 09 <48> 83 c0 01 48 39 d0 74 41 80 38 00 74 f2 eb b6 41 bc 08 00 00 00
RSP: 0018:ffffc90012dcf998 EFLAGS: 00000046
RAX: fffffbfff258af1e RBX: fffffbfff258af1f RCX: ffffffff8168eda3
RDX: fffffbfff258af1f RSI: 0000000000000004 RDI: ffffffff92c578f0
RBP: fffffbfff258af1e R08: 0000000000000000 R09: fffffbfff258af1e
R10: ffffffff92c578f3 R11: ffffffff8acbcbc0 R12: 0000000000000002
R13: ffff88806db38400 R14: 1ffff920025b9f42 R15: ffffffff92c578e8
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c00994e078 CR3: 000000002c250000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
instrument_atomic_read include/linux/instrumented.h:68 [inline]
atomic_read include/linux/atomic/atomic-instrumented.h:32 [inline]
queued_spin_is_locked include/asm-generic/qspinlock.h:57 [inline]
debug_spin_unlock kernel/locking/spinlock_debug.c:101 [inline]
do_raw_spin_unlock+0x53/0x230 kernel/locking/spinlock_debug.c:141
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:150 [inline]
_raw_spin_unlock_irqrestore+0x22/0x70 kernel/locking/spinlock.c:194
debug_object_activate+0x349/0x540 lib/debugobjects.c:726
debug_work_activate kernel/workqueue.c:578 [inline]
insert_work+0x30/0x230 kernel/workqueue.c:1650
__queue_work+0x62e/0x11d0 kernel/workqueue.c:1802
__queue_delayed_work+0x1bf/0x270 kernel/workqueue.c:1953
queue_delayed_work_on+0x106/0x130 kernel/workqueue.c:1989
queue_delayed_work include/linux/workqueue.h:563 [inline]
schedule_delayed_work include/linux/workqueue.h:677 [inline]
nsim_dev_trap_report_work+0x9c0/0xc80 drivers/net/netdevsim/dev.c:842
process_one_work+0x886/0x15d0 kernel/workqueue.c:2633
process_scheduled_works kernel/workqueue.c:2706 [inline]
worker_thread+0x8b9/0x1290 kernel/workqueue.c:2787
kthread+0x2c6/0x3a0 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
Fixes: 012ec02ae441 ("netdevsim: convert driver to use unlocked devlink API during init/fini")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201175324.3752746-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While inlining csum_and_memcpy() into memcpy_to_iter_csum(), the from
address passed to csum_partial_copy_nocheck() was accidentally changed.
This causes a regression in applications using UDP, as for example
OpenAFS, causing loss of datagrams.
Fixes: dc32bff195b4 ("iov_iter, net: Fold in csum_and_memcpy()")
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Cc: regressions@lists.linux.dev
Signed-off-by: Michael Lass <bevan@bi-co.net>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
XGMAC core") checks and reports safety errors, but leaves the
Data Path Parity Errors for each channel in DMA unhandled at all, lead to
a storm of interrupt.
Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.
Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IFLA_DPLL_PIN was added to rt_link messages but not to the spec, which
breaks ynl. Add the missing definitions to the rt_link ynl spec.
Fixes: 5f1842692880 ("netdev: expose DPLL pin handle for netdevice")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240201113853.37432-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the arm random config file, kconfig option 'CONFIG_AEABI' is
disabled which results in adding the compiler flag '-mabi=apcs-gnu'.
This causes the compiler to add padding in virtchnl2_ptype
structure to align it to 8 bytes, resulting in the following
size check failure:
include/linux/build_bug.h:78:41: error: static assertion failed: "(6) == sizeof(struct virtchnl2_ptype)"
78 | #define __static_assert(expr, msg, ...) _Static_assert(expr, msg)
| ^~~~~~~~~~~~~~
include/linux/build_bug.h:77:34: note: in expansion of macro '__static_assert'
77 | #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr)
| ^~~~~~~~~~~~~~~
drivers/net/ethernet/intel/idpf/virtchnl2.h:26:9: note: in expansion of macro 'static_assert'
26 | static_assert((n) == sizeof(struct X))
| ^~~~~~~~~~~~~
drivers/net/ethernet/intel/idpf/virtchnl2.h:982:1: note: in expansion of macro 'VIRTCHNL2_CHECK_STRUCT_LEN'
982 | VIRTCHNL2_CHECK_STRUCT_LEN(6, virtchnl2_ptype);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Avoid the compiler padding by using "__packed" structure
attribute for the virtchnl2_ptype struct. Also align the
structure by using "__aligned(2)" for better code optimization.
Fixes: 0d7502a9b4a7 ("virtchnl: add virtchnl version 2 ops")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312220250.ufEm8doQ-lkp@intel.com
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240131222241.2087516-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the "Fixes" commits mentioned below, the newly added "userspace
pm" subtests of mptcp_join selftests are launching the whole transfer in
the background, do the required checks, then wait for the end of
transfer.
There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds on slow environments.
While at it, use 'mptcp_lib_kill_wait()' helper everywhere, instead of
on a specific one with 'kill_tests_wait()'.
Fixes: b2e2248f365a ("selftests: mptcp: userspace pm create id 0 subflow")
Fixes: e3b47e460b4b ("selftests: mptcp: userspace pm remove initial subflow")
Fixes: b9fb176081fb ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
Cc: stable@vger.kernel.org
Reviewed-and-tested-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-9-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the "Fixes" commit mentioned below, "userspace pm" subtests of
mptcp_join selftests introduced in v6.5 are launching the whole transfer
in the background, do the required checks, then wait for the end of
transfer.
There is no need to wait longer, especially because the checks at the
end of the transfer are ignored (which is fine). This saves quite a few
seconds in slow environments.
Note that old versions will need commit bdbef0a6ff10 ("selftests: mptcp:
add mptcp_lib_kill_wait") as well to get 'mptcp_lib_kill_wait()' helper.
Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org # 6.5.x: bdbef0a6ff10: selftests: mptcp: add mptcp_lib_kill_wait
Cc: stable@vger.kernel.org # 6.5.x
Reviewed-and-tested-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-8-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If a CI executes the same selftest multiple times with different
options, all results from the same subtests will have the same title,
which confuse the CI. With the same title printed in TAP, the tests are
considered as the same ones.
Now, it is possible to override this prefix by using MPTCP_LIB_KSFT_TEST
env var, and have a different title.
While at it, use 'basename' to remove the suffix as well instead of
using an extra 'sed'.
Fixes: c4192967e62f ("selftests: mptcp: lib: format subtests results in TAP")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-7-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When running the simult_flow selftest in slow environments -- e.g. QEmu
without KVM support --, the results can be unstable. This selftest
checks if the aggregated bandwidth is (almost) fully used as expected.
To help improving the stability while still keeping the same validation
in place, the BW and the delay are reduced to lower the pressure on the
CPU.
Fixes: 1a418cb8e888 ("mptcp: simult flow self-tests")
Fixes: 219d04992b68 ("mptcp: push pending frames when subflow has free space")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-6-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On very slow environments -- e.g. when QEmu is used without KVM --,
mptcp_join.sh selftest can take a bit more than 20 minutes. Bump the
default timeout by 50% as it seems normal to take that long on some
environments.
When a debug kernel config is used, this selftest will take even longer,
but that's certainly not a common test env to consider for the timeout.
The Fixes tag that has been picked here is there simply to help having
this patch backported to older stable versions. It is difficult to point
to the exact commit that made some env reaching the timeout from time to
time.
Fixes: d17b968b9876 ("selftests: mptcp: increase timeout to 20 minutes")
Cc: stable@vger.kernel.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-5-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the commit mentioned below, 'mptcp_join' selftests is using
IPTables to add rules to the Mangle table, only in IPv4.
This KConfig is usually enabled by default in many defconfig, but we
recently noticed that some CI were running our selftests without them
enabled.
Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-4-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the commit mentioned below, 'mptcp_join' selftests is using
IPTables to add rules to the Filter table for IPv6.
It is then required to have IP6_NF_FILTER KConfig.
This KConfig is usually enabled by default in many defconfig, but we
recently noticed that some CI were running our selftests without them
enabled.
Fixes: 523514ed0a99 ("selftests: mptcp: add ADD_ADDR IPv6 test cases")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-3-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the commit mentioned below, 'mptcp_join' selftests is using
IPTables to add rules to the Filter table.
It is then required to have IP_NF_FILTER KConfig.
This KConfig is usually enabled by default in many defconfig, but we
recently noticed that some CI were running our selftests without them
enabled.
Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the MPTCP PM detects that a subflow is stale, all the packet
scheduler must re-inject all the mptcp-level unacked data. To avoid
acquiring unneeded locks, it first try to check if any unacked data
is present at all in the RTX queue, but such check is currently
broken, as it uses TCP-specific helper on an MPTCP socket.
Funnily enough fuzzers and static checkers are happy, as the accessed
memory still belongs to the mptcp_sock struct, and even from a
functional perspective the recovery completed successfully, as
the short-cut test always failed.
A recent unrelated TCP change - commit d5fed5addb2b ("tcp: reorganize
tcp_sock fast path variables") - exposed the issue, as the tcp field
reorganization makes the mptcp code always skip the re-inection.
Fix the issue dropping the bogus call: we are on a slow path, the early
optimization proved once again to be evil.
Fixes: 1e1d9d6f119c ("mptcp: handle pending data on closed subflow")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/468
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240131-upstream-net-20240131-mptcp-ci-issues-v1-1-4c1c11e571ff@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The rtnetlink tests require additional options currently
off by default.
Fixes: 2766a11161cc ("selftests: rtnetlink: add ipsec offload API test")
Fixes: 5e596ee171ba ("selftests: add xfrm state-policy-monitor to rtnetlink.sh")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/9048ca58e49b962f35dba1dfb2beaf3dab3e0411.1706723341.git.pabeni@redhat.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
amt test uses the TTL iptables module:
ip netns exec "${RELAY}" iptables -t mangle -I PREROUTING \
-d 239.0.0.1 -j TTL --ttl-set 2
Fixes: c08e8baea78e ("selftests: add amt interface selftest script")
Link: https://lore.kernel.org/r/20240131165605.4051645-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some scripts are not tests themselves; they contain utility functions used
by other tests. According to Documentation/dev-tools/kselftest.rst, such
files should be listed in TEST_FILES. Currently they are incorrectly listed
in TEST_PROGS_EXTENDED so rename the variable.
Fixes: c085dbfb1cfc ("selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED")
Suggested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-6-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some scripts are not tests themselves; they contain utility functions used
by other tests. According to Documentation/dev-tools/kselftest.rst, such
files should be listed in TEST_FILES. Move those utility scripts to
TEST_FILES.
Fixes: 1751eb42ddb5 ("selftests: net: use TEST_PROGS_EXTENDED")
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Fixes: b99ac1841147 ("kselftests/net: add missed setup_loopback.sh/setup_veth.sh to Makefile")
Fixes: f5173fe3e13b ("selftests: net: included needed helper in the install targets")
Suggested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-5-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
setup_loopback.sh and net_helper.sh are meant to be sourced from other
scripts, not executed directly. Therefore, remove the executable bits from
those files' permissions.
This change is similar to commit 49078c1b80b6 ("selftests: forwarding:
Remove executable bits from lib.sh")
Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
Fixes: 3bdd9fd29cb0 ("selftests/net: synchronize udpgro tests' tx and rx connection")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-4-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The purpose of the test_LAG_cleanup() function is to check that some
hardware addresses are removed from underlying devices after they have been
unenslaved. The test function simply checks that those addresses are not
present at the end. However, if the addresses were never added to begin
with due to some error in device setup, the test function currently passes.
This is a false positive since in that situation the test did not actually
exercise the intended functionality.
Add a check that the expected addresses are indeed present after device
setup. This makes the test function more robust.
I noticed this problem when running the team/dev_addr_lists.sh test on a
system without support for dummy and ipv6:
tools/testing/selftests/drivers/net/team# ./dev_addr_lists.sh
Error: Unknown device type.
Error: Unknown device type.
This program is not intended to be run as root.
RTNETLINK answers: Operation not supported
TEST: team cleanup mode lacp [ OK ]
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-3-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Similar to commit dd2d40acdbb2 ("selftests: bonding: Add more missing
config options"), add more networking-specific config options which are
needed for team device tests.
For testing, I used the minimal config generated by virtme-ng and I added
the options in the config file. Afterwards, the team device test passed.
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240131140848.360618-2-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In commit ac5047671758 ("hv_netvsc: Disable NAPI before closing the
VMBus channel"), napi_disable was getting called for all channels,
including all subchannels without confirming if they are enabled or not.
This caused hv_netvsc getting hung at napi_disable, when netvsc_probe()
has finished running but nvdev->subchan_work has not started yet.
netvsc_subchan_work() -> rndis_set_subchannel() has not created the
sub-channels and because of that netvsc_sc_open() is not running.
netvsc_remove() calls cancel_work_sync(&nvdev->subchan_work), for which
netvsc_subchan_work did not run.
netif_napi_add() sets the bit NAPI_STATE_SCHED because it ensures NAPI
cannot be scheduled. Then netvsc_sc_open() -> napi_enable will clear the
NAPIF_STATE_SCHED bit, so it can be scheduled. napi_disable() does the
opposite.
Now during netvsc_device_remove(), when napi_disable is called for those
subchannels, napi_disable gets stuck on infinite msleep.
This fix addresses this problem by ensuring that napi_disable() is not
getting called for non-enabled NAPI struct.
But netif_napi_del() is still necessary for these non-enabled NAPI struct
for cleanup purpose.
Call trace:
[ 654.559417] task:modprobe state:D stack: 0 pid: 2321 ppid: 1091 flags:0x00004002
[ 654.568030] Call Trace:
[ 654.571221] <TASK>
[ 654.573790] __schedule+0x2d6/0x960
[ 654.577733] schedule+0x69/0xf0
[ 654.581214] schedule_timeout+0x87/0x140
[ 654.585463] ? __bpf_trace_tick_stop+0x20/0x20
[ 654.590291] msleep+0x2d/0x40
[ 654.593625] napi_disable+0x2b/0x80
[ 654.597437] netvsc_device_remove+0x8a/0x1f0 [hv_netvsc]
[ 654.603935] rndis_filter_device_remove+0x194/0x1c0 [hv_netvsc]
[ 654.611101] ? do_wait_intr+0xb0/0xb0
[ 654.615753] netvsc_remove+0x7c/0x120 [hv_netvsc]
[ 654.621675] vmbus_remove+0x27/0x40 [hv_vmbus]
Cc: stable@vger.kernel.org
Fixes: ac5047671758 ("hv_netvsc: Disable NAPI before closing the VMBus channel")
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/1706686551-28510-1-git-send-email-schakrabarti@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Invoking the make_tx_response() / push_tx_responses() pair with no lock
held would be acceptable only if all such invocations happened from the
same context (NAPI instance or dealloc thread). Since this isn't the
case, and since the interface "spec" also doesn't demand that multicast
operations may only be performed with no in-flight transmits,
MCAST_{ADD,DEL} processing also needs to acquire the response lock
around the invocations.
To prevent similar mistakes going forward, "downgrade" the present
functions to private helpers of just the two remaining ones using them
directly, with no forward declarations anymore. This involves renaming
what so far was make_tx_response(), for the new function of that name
to serve the new (wrapper) purpose.
While there,
- constify the txp parameters,
- correct xenvif_idx_release()'s status parameter's type,
- rename {,_}make_tx_response()'s status parameters for consistency with
xenvif_idx_release()'s.
Fixes: 210c34dcd8d9 ("xen-netback: add support for multicast control")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/980c6c3d-e10e-4459-8565-e8fbde122f00@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The documentation is pointing to the wrong path for the interface.
Documentation is pointing to /sys/class/<iface>, instead of
/sys/class/net/<iface>.
Fix it by adding the `net/` directory before the interface.
Fixes: 1a02ef76acfa ("net: sysfs: add documentation entries for /sys/class/<iface>/queues")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240131102150.728960-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
XDP queues are created/destroyed when a XDP program
is attached/detached. In current driver xdp_queues are not
getting destroyed on program exit due to incorrect xdp_queue
and tot_tx_queue count values.
This patch fixes the issue by setting tot_tx_queue and xdp_queue
count to correct values. It also fixes xdp.data_hard_start address.
Fixes: 06059a1a9a4a ("octeontx2-pf: Add XDP support to netdev PF")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Link: https://lore.kernel.org/r/20240130120610.16673-1-gakula@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
It appears that Sony DVMC-DA1 has a quirk that the descriptor leaf entry
locates just after the vendor directory entry in root directory. This is
not conformant to the legacy layout of configuration ROM described in
Configuration ROM for AV/C Devices 1.0 (1394 Trading Association, Dec 2000,
TA Document 1999027).
This commit changes current implementation to parse configuration ROM for
device attributes so that the descriptor leaf entry can be detected for
the vendor name.
$ config-rom-pretty-printer < Sony-DVMC-DA1.img
ROM header and bus information block
-----------------------------------------------------------------
1024 041ee7fb bus_info_length 4, crc_length 30, crc 59387
1028 31333934 bus_name "1394"
1032 e0644000 irmc 1, cmc 1, isc 1, bmc 0, cyc_clk_acc 100, max_rec 4 (32)
1036 08004603 company_id 080046 |
1040 0014193c device_id 12886219068 | EUI-64 576537731003586876
root directory
-----------------------------------------------------------------
1044 0006b681 directory_length 6, crc 46721
1048 03080046 vendor
1052 0c0083c0 node capabilities: per IEEE 1394
1056 8d00000a --> eui-64 leaf at 1096
1060 d1000003 --> unit directory at 1072
1064 c3000005 --> vendor directory at 1084
1068 8100000a --> descriptor leaf at 1108
unit directory at 1072
-----------------------------------------------------------------
1072 0002cdbf directory_length 2, crc 52671
1076 1200a02d specifier id
1080 13010000 version
vendor directory at 1084
-----------------------------------------------------------------
1084 00020cfe directory_length 2, crc 3326
1088 17fa0000 model
1092 81000008 --> descriptor leaf at 1124
eui-64 leaf at 1096
-----------------------------------------------------------------
1096 0002c66e leaf_length 2, crc 50798
1100 08004603 company_id 080046 |
1104 0014193c device_id 12886219068 | EUI-64 576537731003586876
descriptor leaf at 1108
-----------------------------------------------------------------
1108 00039e26 leaf_length 3, crc 40486
1112 00000000 textual descriptor
1116 00000000 minimal ASCII
1120 536f6e79 "Sony"
descriptor leaf at 1124
-----------------------------------------------------------------
1124 0005001d leaf_length 5, crc 29
1128 00000000 textual descriptor
1132 00000000 minimal ASCII
1136 44564d43 "DVMC"
1140 2d444131 "-DA1"
1144 00000000
Suggested-by: Adam Goldman <adamg@pobox.com>
Tested-by: Adam Goldman <adamg@pobox.com>
Link: https://lore.kernel.org/r/20240130100409.30128-3-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
Against its current description, the kernel API can accepts all types of
directory entries.
This commit corrects the documentation.
Cc: stable@vger.kernel.org
Fixes: 3c2c58cb33b3 ("firewire: core: fw_csr_string addendum")
Link: https://lore.kernel.org/r/20240130100409.30128-2-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
When running the pmtu.sh via the kselftest infra, accessing
/dev/stdout gives unexpected results:
# dd: failed to open '/dev/stdout': Device or resource busy
# TEST: IPv4, bridged vxlan4: PMTU exceptions [FAIL]
Let dd use directly the standard output to fix the above:
# TEST: IPv4, bridged vxlan4: PMTU exceptions - nexthop objects [ OK ]
Fixes: 136a1b434bbb ("selftests: net: test vxlan pmtu exceptions with tcp")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/23d7592c5d77d75cff9b34f15c227f92e911c2ae.1706635101.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The pmtu.sh test tries to detect the tunnel protocols available
in the running kernel and properly skip the unsupported cases.
In a few more complex setup, such detection is unsuccessful, as
the script currently ignores some intermediate error code at
setup time.
Before:
# which: no nettest in (/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin)
# TEST: vti6: PMTU exceptions (ESP-in-UDP) [FAIL]
# PMTU exception wasn't created after creating tunnel exceeding link layer MTU
# ./pmtu.sh: line 931: kill: (7543) - No such process
# ./pmtu.sh: line 931: kill: (7544) - No such process
After:
# xfrm4 not supported
# TEST: vti4: PMTU exceptions [SKIP]
Fixes: ece1278a9b81 ("selftests: net: add ESP-in-UDP PMTU test")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/cab10e75fda618e6fff8c595b632f47db58b9309.1706635101.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The mentioned test uses a few Kconfig still missing the
net config, add them.
Before:
# Error: Specified qdisc kind is unknown.
# Error: Specified qdisc kind is unknown.
# Error: Qdisc not classful.
# We have an error talking to the kernel
# Error: Qdisc not classful.
# We have an error talking to the kernel
# policy_routing not supported
# TEST: ICMPv4 with DSCP and ECN: PMTU exceptions [SKIP]
After:
# TEST: ICMPv4 with DSCP and ECN: PMTU exceptions [ OK ]
Fixes: ec730c3e1f0e ("selftest: net: Test IPv4 PMTU exceptions with DSCP and ECN")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/8d27bf6762a5c7b3acc457d6e6872c533040f9c1.1706635101.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently the teardown/setup flow for driver probe/remove is quite
a bit different from the reset flows in pdsc_fw_down()/pdsc_fw_up().
One key piece that's missing are the calls to pci_alloc_irq_vectors()
and pci_free_irq_vectors(). The pcie reset case is calling
pci_free_irq_vectors() on reset_prepare, but not calling the
corresponding pci_alloc_irq_vectors() on reset_done. This is causing
unexpected/unwanted interrupt behavior due to the adminq interrupt
being accidentally put into legacy interrupt mode. Also, the
pci_alloc_irq_vectors()/pci_free_irq_vectors() functions are being
called directly in probe/remove respectively.
Fix this inconsistency by making the following changes:
1. Always call pdsc_dev_init() in pdsc_setup(), which calls
pci_alloc_irq_vectors() and get rid of the now unused
pds_dev_reinit().
2. Always free/clear the pdsc->intr_info in pdsc_teardown()
since this structure will get re-alloced in pdsc_setup().
3. Move the calls of pci_free_irq_vectors() to pdsc_teardown()
since pci_alloc_irq_vectors() will always be called in
pdsc_setup()->pdsc_dev_init() for both the probe/remove and
reset flows.
4. Make sure to only create the debugfs "identity" entry when it
doesn't already exist, which it will in the reset case because
it's already been created in the initial call to pdsc_dev_init().
Fixes: ffa55858330f ("pds_core: implement pci reset handlers")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240129234035.69802-7-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During reset the BARs might be accessed when they are
unmapped. This can cause unexpected issues, so fix it by
clearing the cached BAR values so they are not accessed
until they are re-mapped.
Also, make sure any places that can access the BARs
when they are NULL are prevented.
Fixes: 49ce92fbee0b ("pds_core: add FW update feature to devlink")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-6-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are multiple paths that can result in using the pdsc's
adminq.
[1] pdsc_adminq_isr and the resulting work from queue_work(),
i.e. pdsc_work_thread()->pdsc_process_adminq()
[2] pdsc_adminq_post()
When the device goes through reset via PCIe reset and/or
a fw_down/fw_up cycle due to bad PCIe state or bad device
state the adminq is destroyed and recreated.
A NULL pointer dereference can happen if [1] or [2] happens
after the adminq is already destroyed.
In order to fix this, add some further state checks and
implement reference counting for adminq uses. Reference
counting was used because multiple threads can attempt to
access the adminq at the same time via [1] or [2]. Additionally,
multiple clients (i.e. pds-vfio-pci) can be using [2]
at the same time.
The adminq_refcnt is initialized to 1 when the adminq has been
allocated and is ready to use. Users/clients of the adminq
(i.e. [1] and [2]) will increment the refcnt when they are using
the adminq. When the driver goes into a fw_down cycle it will
set the PDSC_S_FW_DEAD bit and then wait for the adminq_refcnt
to hit 1. Setting the PDSC_S_FW_DEAD before waiting will prevent
any further adminq_refcnt increments. Waiting for the
adminq_refcnt to hit 1 allows for any current users of the adminq
to finish before the driver frees the adminq. Once the
adminq_refcnt hits 1 the driver clears the refcnt to signify that
the adminq is deleted and cannot be used. On the fw_up cycle the
driver will once again initialize the adminq_refcnt to 1 allowing
the adminq to be used again.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-5-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The initial design for the adminq interrupt was done based
on client drivers having their own adminq and adminq
interrupt. So, each client driver's adminq isr would use
their specific adminqcq for the private data struct. For the
time being the design has changed to only use a single
adminq for all clients. So, instead use the struct pdsc for
the private data to simplify things a bit.
This also has the benefit of not dereferencing the adminqcq
to access the pdsc struct when the PDSC_S_STOPPING_DRIVER bit
is set and the adminqcq has actually been cleared/freed.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-4-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a small window where pdsc_work_thread()
calls pdsc_process_adminq() and pdsc_process_adminq()
passes the PDSC_S_STOPPING_DRIVER check and starts
to process adminq/notifyq work and then the driver
starts a fw_down cycle. This could cause some
undefined behavior if the notifyqcq/adminqcq are
free'd while pdsc_process_adminq() is running. Use
cancel_work_sync() on the adminqcq's work struct
to make sure any pending work items are cancelled
and any in progress work items are completed.
Also, make sure to not call cancel_work_sync() if
the work item has not be initialized. Without this,
traces will happen in cases where a reset fails and
teardown is called again or if reset fails and the
driver is removed.
Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-3-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The PCIe reset handlers can run at the same time as the
health thread. This can cause the health thread to
stomp on the PCIe reset. Fix this by preventing the
health thread from running while a PCIe reset is happening.
As part of this use timer_shutdown_sync() during reset and
remove to make sure the timer doesn't ever get rearmed.
Fixes: ffa55858330f ("pds_core: implement pci reset handlers")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240129234035.69802-2-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported a lockdep splat [1].
Blamed commit hinted about the possible lockdep
violation, and code used unix_state_lock_nested()
in an attempt to silence lockdep.
It is not sufficient, because unix_state_lock_nested()
is already used from unix_state_double_lock().
We need to use a separate subclass.
This patch adds a distinct enumeration to make things
more explicit.
Also use swap() in unix_state_double_lock() as a clean up.
v2: add a missing inline keyword to unix_state_lock_nested()
[1]
WARNING: possible circular locking dependency detected
6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0 Not tainted
syz-executor.1/2542 is trying to acquire lock:
ffff88808b5df9e8 (rlock-AF_UNIX){+.+.}-{2:2}, at: skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
but task is already holding lock:
ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&u->lock/1){+.+.}-{2:2}:
lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
sk_diag_dump_icons net/unix/diag.c:87 [inline]
sk_diag_fill+0x6ea/0xfe0 net/unix/diag.c:157
sk_diag_dump net/unix/diag.c:196 [inline]
unix_diag_dump+0x3e9/0x630 net/unix/diag.c:220
netlink_dump+0x5c1/0xcd0 net/netlink/af_netlink.c:2264
__netlink_dump_start+0x5d7/0x780 net/netlink/af_netlink.c:2370
netlink_dump_start include/linux/netlink.h:338 [inline]
unix_diag_handler_dump+0x1c3/0x8f0 net/unix/diag.c:319
sock_diag_rcv_msg+0xe3/0x400
netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2543
sock_diag_rcv+0x2a/0x40 net/core/sock_diag.c:280
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1367
netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
sock_write_iter+0x39a/0x520 net/socket.c:1160
call_write_iter include/linux/fs.h:2085 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xa74/0xca0 fs/read_write.c:590
ksys_write+0x1a0/0x2c0 fs/read_write.c:643
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
-> #0 (rlock-AF_UNIX){+.+.}-{2:2}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmmsg+0x3b2/0x730 net/socket.c:2724
__do_sys_sendmmsg net/socket.c:2753 [inline]
__se_sys_sendmmsg net/socket.c:2750 [inline]
__x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&u->lock/1);
lock(rlock-AF_UNIX);
lock(&u->lock/1);
lock(rlock-AF_UNIX);
*** DEADLOCK ***
1 lock held by syz-executor.1/2542:
#0: ffff88808b5dfe70 (&u->lock/1){+.+.}-{2:2}, at: unix_dgram_sendmsg+0xfc7/0x2200 net/unix/af_unix.c:2089
stack backtrace:
CPU: 1 PID: 2542 Comm: syz-executor.1 Not tainted 6.8.0-rc1-syzkaller-00356-g8a696a29c690 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
check_noncircular+0x366/0x490 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x1909/0x5ab0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1345/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e3/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
skb_queue_tail+0x36/0x120 net/core/skbuff.c:3863
unix_dgram_sendmsg+0x15d9/0x2200 net/unix/af_unix.c:2112
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmmsg+0x3b2/0x730 net/socket.c:2724
__do_sys_sendmmsg net/socket.c:2753 [inline]
__se_sys_sendmmsg net/socket.c:2750 [inline]
__x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2750
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f26d887cda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f26d95a60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00007f26d89abf80 RCX: 00007f26d887cda9
RDX: 000000000000003e RSI: 00000000200bd000 RDI: 0000000000000004
RBP: 00007f26d88c947a R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000008c0 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f26d89abf80 R15: 00007ffcfe081a68
Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240130184235.1620738-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|