Age | Commit message (Collapse) | Author | Files | Lines |
|
In the bareudp6_xmit_skb(), it calculates min_headroom.
At that point, it uses struct iphdr, but it's not correct.
So panic could occur.
The struct ipv6hdr should be used.
Test commands:
ip netns add A
ip netns add B
ip link add veth0 netns A type veth peer name veth1 netns B
ip netns exec A ip link set veth0 up
ip netns exec A ip a a 2001:db8:0::1/64 dev veth0
ip netns exec B ip link set veth1 up
ip netns exec B ip a a 2001:db8:0::2/64 dev veth1
for i in {10..1}
do
let A=$i-1
ip netns exec A ip link add bareudp$i type bareudp dstport $i \
ethertype 0x86dd
ip netns exec A ip link set bareudp$i up
ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev bareudp$i
ip netns exec A ip -6 r a 2001:db8:$i::2 encap ip6 src \
2001:db8:$A::1 dst 2001:db8:$A::2 via 2001:db8:$i::2 \
dev bareudp$i
ip netns exec B ip link add bareudp$i type bareudp dstport $i \
ethertype 0x86dd
ip netns exec B ip link set bareudp$i up
ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev bareudp$i
ip netns exec B ip -6 r a 2001:db8:$i::1 encap ip6 src \
2001:db8:$A::2 dst 2001:db8:$A::1 via 2001:db8:$i::1 \
dev bareudp$i
done
ip netns exec A ping 2001:db8:7::2
Splat looks like:
[ 66.436679][ C2] skbuff: skb_under_panic: text:ffffffff928614c8 len:454 put:14 head:ffff88810abb4000 data:ffff88810abb3ffa tail:0x1c0 end:0x3ec0 dev:veth0
[ 66.441626][ C2] ------------[ cut here ]------------
[ 66.443458][ C2] kernel BUG at net/core/skbuff.c:109!
[ 66.445313][ C2] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 66.447606][ C2] CPU: 2 PID: 913 Comm: ping Not tainted 5.10.0+ #819
[ 66.450251][ C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 66.453713][ C2] RIP: 0010:skb_panic+0x15d/0x15f
[ 66.455345][ C2] Code: 98 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 60 8b 78 93 41 57 41 56 41 55 48 8b 54 24 20 48 8b 74 24 28 e8 b5 40 f9 ff <0f> 0b 48 8b 6c 24 20 89 34 24 e8 08 c9 98 fe 8b 34 24 48 c7 c1 80
[ 66.462314][ C2] RSP: 0018:ffff888119209648 EFLAGS: 00010286
[ 66.464281][ C2] RAX: 0000000000000089 RBX: ffff888003159000 RCX: 0000000000000000
[ 66.467216][ C2] RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed10232412c0
[ 66.469768][ C2] RBP: ffff88810a53d440 R08: ffffed102328018d R09: ffffed102328018d
[ 66.472297][ C2] R10: ffff888119400c67 R11: ffffed102328018c R12: 000000000000000e
[ 66.474833][ C2] R13: ffff88810abb3ffa R14: 00000000000001c0 R15: 0000000000003ec0
[ 66.477361][ C2] FS: 00007f37c0c72f00(0000) GS:ffff888119200000(0000) knlGS:0000000000000000
[ 66.480214][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 66.482296][ C2] CR2: 000055a058808570 CR3: 000000011039e002 CR4: 00000000003706e0
[ 66.484811][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 66.487793][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 66.490424][ C2] Call Trace:
[ 66.491469][ C2] <IRQ>
[ 66.492374][ C2] ? eth_header+0x28/0x190
[ 66.494054][ C2] ? eth_header+0x28/0x190
[ 66.495401][ C2] skb_push.cold.99+0x22/0x22
[ 66.496700][ C2] eth_header+0x28/0x190
[ 66.497867][ C2] neigh_resolve_output+0x3de/0x720
[ 66.499615][ C2] ? __neigh_update+0x7e8/0x20a0
[ 66.501176][ C2] __neigh_update+0x8bd/0x20a0
[ 66.502749][ C2] ndisc_update+0x34/0xc0
[ 66.504010][ C2] ndisc_recv_na+0x8da/0xb80
[ 66.505041][ C2] ? pndisc_redo+0x20/0x20
[ 66.505888][ C2] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 66.506965][ C2] ndisc_rcv+0x3a0/0x470
[ 66.507797][ C2] icmpv6_rcv+0xad9/0x1b00
[ 66.508645][ C2] ip6_protocol_deliver_rcu+0xcd6/0x1560
[ 66.509719][ C2] ip6_input_finish+0x5b/0xf0
[ 66.510615][ C2] ip6_input+0xcd/0x2d0
[ 66.511406][ C2] ? ip6_input_finish+0xf0/0xf0
[ 66.512327][ C2] ? rcu_read_lock_held+0x91/0xa0
[ 66.513279][ C2] ? ip6_protocol_deliver_rcu+0x1560/0x1560
[ 66.514414][ C2] ipv6_rcv+0xe8/0x300
[ ... ]
Acked-by: Guillaume Nault <gnault@redhat.com>
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20201228152146.24270-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Like other tunneling interfaces, the bareudp doesn't need TXLOCK.
So, It is good to set the NETIF_F_LLTX flag to improve performance and
to avoid lockdep's false-positive warning.
Test commands:
ip netns add A
ip netns add B
ip link add veth0 netns A type veth peer name veth1 netns B
ip netns exec A ip link set veth0 up
ip netns exec A ip a a 10.0.0.1/24 dev veth0
ip netns exec B ip link set veth1 up
ip netns exec B ip a a 10.0.0.2/24 dev veth1
for i in {2..1}
do
let A=$i-1
ip netns exec A ip link add bareudp$i type bareudp \
dstport $i ethertype ip
ip netns exec A ip link set bareudp$i up
ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i
ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \
dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i
ip netns exec B ip link add bareudp$i type bareudp \
dstport $i ethertype ip
ip netns exec B ip link set bareudp$i up
ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i
ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \
dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i
done
ip netns exec A ping 10.0.2.2
Splat looks like:
[ 96.992803][ T822] ============================================
[ 96.993954][ T822] WARNING: possible recursive locking detected
[ 96.995102][ T822] 5.10.0+ #819 Not tainted
[ 96.995927][ T822] --------------------------------------------
[ 96.997091][ T822] ping/822 is trying to acquire lock:
[ 96.998083][ T822] ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[ 96.999813][ T822]
[ 96.999813][ T822] but task is already holding lock:
[ 97.001192][ T822] ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[ 97.002908][ T822]
[ 97.002908][ T822] other info that might help us debug this:
[ 97.004401][ T822] Possible unsafe locking scenario:
[ 97.004401][ T822]
[ 97.005784][ T822] CPU0
[ 97.006407][ T822] ----
[ 97.007010][ T822] lock(_xmit_NONE#2);
[ 97.007779][ T822] lock(_xmit_NONE#2);
[ 97.008550][ T822]
[ 97.008550][ T822] *** DEADLOCK ***
[ 97.008550][ T822]
[ 97.010057][ T822] May be due to missing lock nesting notation
[ 97.010057][ T822]
[ 97.011594][ T822] 7 locks held by ping/822:
[ 97.012426][ T822] #0: ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00
[ 97.014191][ T822] #1: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[ 97.016045][ T822] #2: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[ 97.017897][ T822] #3: ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960
[ 97.019684][ T822] #4: ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp]
[ 97.021573][ T822] #5: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020
[ 97.023424][ T822] #6: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960
[ 97.025259][ T822]
[ 97.025259][ T822] stack backtrace:
[ 97.026349][ T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819
[ 97.027609][ T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[ 97.029407][ T822] Call Trace:
[ 97.030015][ T822] dump_stack+0x99/0xcb
[ 97.030783][ T822] __lock_acquire.cold.77+0x149/0x3a9
[ 97.031773][ T822] ? stack_trace_save+0x81/0xa0
[ 97.032661][ T822] ? register_lock_class+0x1910/0x1910
[ 97.033673][ T822] ? register_lock_class+0x1910/0x1910
[ 97.034679][ T822] ? rcu_read_lock_sched_held+0x91/0xc0
[ 97.035697][ T822] ? rcu_read_lock_bh_held+0xa0/0xa0
[ 97.036690][ T822] lock_acquire+0x1b2/0x730
[ 97.037515][ T822] ? __dev_queue_xmit+0x1f52/0x2960
[ 97.038466][ T822] ? check_flags+0x50/0x50
[ 97.039277][ T822] ? netif_skb_features+0x296/0x9c0
[ 97.040226][ T822] ? validate_xmit_skb+0x29/0xb10
[ 97.041151][ T822] _raw_spin_lock+0x30/0x70
[ 97.041977][ T822] ? __dev_queue_xmit+0x1f52/0x2960
[ 97.042927][ T822] __dev_queue_xmit+0x1f52/0x2960
[ 97.043852][ T822] ? netdev_core_pick_tx+0x290/0x290
[ 97.044824][ T822] ? mark_held_locks+0xb7/0x120
[ 97.045712][ T822] ? lockdep_hardirqs_on_prepare+0x12c/0x3e0
[ 97.046824][ T822] ? __local_bh_enable_ip+0xa5/0xf0
[ 97.047771][ T822] ? ___neigh_create+0x12a8/0x1eb0
[ 97.048710][ T822] ? trace_hardirqs_on+0x41/0x120
[ 97.049626][ T822] ? ___neigh_create+0x12a8/0x1eb0
[ 97.050556][ T822] ? __local_bh_enable_ip+0xa5/0xf0
[ 97.051509][ T822] ? ___neigh_create+0x12a8/0x1eb0
[ 97.052443][ T822] ? check_chain_key+0x244/0x5f0
[ 97.053352][ T822] ? rcu_read_lock_bh_held+0x56/0xa0
[ 97.054317][ T822] ? ip_finish_output2+0x6ea/0x2020
[ 97.055263][ T822] ? pneigh_lookup+0x410/0x410
[ 97.056135][ T822] ip_finish_output2+0x6ea/0x2020
[ ... ]
Acked-by: Guillaume Nault <gnault@redhat.com>
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ppp_cp_event is called directly or indirectly by ppp_rx with "ppp->lock"
held. It may call mod_timer to add a new timer. However, at the same time
ppp_timer may be already running and waiting for "ppp->lock". In this
case, there's no need for ppp_timer to continue running and it can just
exit.
If we let ppp_timer continue running, it may call add_timer. This causes
kernel panic because add_timer can't be called with a timer pending.
This patch fixes this problem.
Fixes: e022c2f07ae5 ("WAN: new synchronous PPP implementation for generic HDLC.")
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This was tested on a RaptorCS Talos II with IBM POWER9 DD2.2 CPUs and an
ASUS XG-C100F PCI-e card without any issue. Speeds of ~8Gbps could be
attained with not-very-scientific (wget HTTP) both-ways measurements on
a local network. No warning or error reported in kernel logs. The
drivers seems to be portable enough for it not to be gated like such.
Signed-off-by: Léo Le Bouter <lle-bout@zaclys.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not
have an erspan header. So the check in gre_parse_header() is wrong,
we have to distinguish version 1 from version 0.
We can just check the gre header length like is_erspan_type1().
Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com
Cc: William Tu <u9012063@gmail.com>
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The function skb_copy() could return NULL, the return value
need to be checked.
Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Check Scell_log shift size in red_check_params() and modify all callers
of red_check_params() to pass Scell_log.
This prevents a shift out-of-bounds as detected by UBSAN:
UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
shift exponent 72 is too large for 32-bit type 'int'
Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
Cc: Nogah Frankel <nogahf@mellanox.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pneigh_enqueue() tries to obtain a random delay by mod
NEIGH_VAR(p, PROXY_DELAY). However, NEIGH_VAR(p, PROXY_DELAY)
migth be zero at that point because someone could write zero
to /proc/sys/net/ipv4/neigh/[device]/proxy_delay after the
callers check it.
This patch uses prandom_u32_max() to get a random delay instead
which avoids potential division by zero.
Signed-off-by: weichenchen <weichen.chen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
RT_TOS() only clears one of the ECN bits. Therefore, when
fib_compute_spec_dst() resorts to a fib lookup, it can return
different results depending on the value of the second ECN bit.
For example, ECT(0) and ECT(1) packets could be treated differently.
$ ip netns add ns0
$ ip netns add ns1
$ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1
$ ip -netns ns0 link set dev lo up
$ ip -netns ns1 link set dev lo up
$ ip -netns ns0 link set dev veth01 up
$ ip -netns ns1 link set dev veth10 up
$ ip -netns ns0 address add 192.0.2.10/24 dev veth01
$ ip -netns ns1 address add 192.0.2.11/24 dev veth10
$ ip -netns ns1 address add 192.0.2.21/32 dev lo
$ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21
$ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0
With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21
(ping uses -Q to set all TOS and ECN bits):
$ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255
[...]
64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms
But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11
because the "tos 4" route isn't matched:
$ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
[...]
64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms
After this patch the ECN bits don't affect the result anymore:
$ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255
[...]
64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms
Fixes: 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The packet coalescing interrupt threshold has separated registers
for different aggregated/cpu (sw-thread). The required value should
be loaded for every thread but not only for 1 current cpu.
Fixes: 213f428f5056 ("net: mvpp2: add support for TX interrupts and RX queue distribution modes")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608748521-11033-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Callers of evt_ring_command() no longer care whether the command
times out, and don't use what evt_ring_command() returns. Redefine
that function to have void return type.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 428b448ee764a ("net: ipa: use state to determine event ring command success")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Callers of gsi_channel_command() no longer care whether the command
times out, and don't use what gsi_channel_command() returns. Redefine
that function to have void return type.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 6ffddf3b3d182 ("net: ipa: use state to determine channel command success")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TQM rings are hardware resources that require host context memory
managed by the driver. The driver supports up to 9 TQM rings and
the number of rings to use is requested by firmware during run-time.
Cap this number to the maximum supported to prevent accessing beyond
the array. Future firmware may request more than 9 TQM rings. Define
macros to remove the magic number 9 from the C code.
Fixes: ac3158cb0108 ("bnxt_en: Allocate TQM ring context memory according to fw specification.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A recent change skips sending firmware messages to the firmware when
pci_channel_offline() is true during fatal AER error. To make this
complete, we need to move the re-initialization sequence to
bnxt_io_resume(), otherwise the firmware messages to re-initialize
will all be skipped. In any case, it is more correct to re-initialize
in bnxt_io_resume().
Also, fix the reverse x-mas tree format when defining variables
in bnxt_io_slot_reset().
Fixes: b340dc680ed4 ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
the following syzkaller reproducer:
r0 = socket$inet_mptcp(0x2, 0x1, 0x106)
bind$inet(r0, &(0x7f0000000080)={0x2, 0x4e24, @multicast2}, 0x10)
connect$inet(r0, &(0x7f0000000480)={0x2, 0x4e24, @local}, 0x10)
sendto$inet(r0, &(0x7f0000000100)="f6", 0xffffffe7, 0xc000, 0x0, 0x0)
systematically triggers the following warning:
WARNING: CPU: 2 PID: 8618 at net/core/stream.c:208 sk_stream_kill_queues+0x3fa/0x580
Modules linked in:
CPU: 2 PID: 8618 Comm: syz-executor Not tainted 5.10.0+ #334
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/04
RIP: 0010:sk_stream_kill_queues+0x3fa/0x580
Code: df 48 c1 ea 03 0f b6 04 02 84 c0 74 04 3c 03 7e 40 8b ab 20 02 00 00 e9 64 ff ff ff e8 df f0 81 2
RSP: 0018:ffffc9000290fcb0 EFLAGS: 00010293
RAX: ffff888011cb8000 RBX: 0000000000000000 RCX: ffffffff86eecf0e
RDX: 0000000000000000 RSI: ffffffff86eecf6a RDI: 0000000000000005
RBP: 0000000000000e28 R08: ffff888011cb8000 R09: fffffbfff1f48139
R10: ffffffff8fa409c7 R11: fffffbfff1f48138 R12: ffff8880215e6220
R13: ffffffff8fa409c0 R14: ffffc9000290fd30 R15: 1ffff92000521fa2
FS: 00007f41c78f4800(0000) GS:ffff88802d000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f95c803d088 CR3: 0000000025ed2000 CR4: 00000000000006f0
Call Trace:
__mptcp_destroy_sock+0x4f5/0x8e0
mptcp_close+0x5e2/0x7f0
inet_release+0x12b/0x270
__sock_release+0xc8/0x270
sock_close+0x18/0x20
__fput+0x272/0x8e0
task_work_run+0xe0/0x1a0
exit_to_user_mode_prepare+0x1df/0x200
syscall_exit_to_user_mode+0x19/0x50
entry_SYSCALL_64_after_hwframe+0x44/0xa9
userspace programs provide arbitrarily high values of 'len' in sendmsg():
this is causing integer overflow of 'amount'. Cap forward allocation to 1
megabyte: higher values are not really useful.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Fixes: e93da92896bc ("mptcp: implement wmem reservation")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/3334d00d8b2faecafdfab9aa593efcbf61442756.1608584474.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently the tun_napi_alloc_frags() function returns -ENOMEM when the
number of iovs exceeds MAX_SKB_FRAGS + 1. However this is inappropriate,
we should use -EMSGSIZE instead of -ENOMEM.
The following distinctions are matters:
1. the caller need to drop the bad packet when -EMSGSIZE is returned,
which means meeting a persistent failure.
2. the caller can try again when -ENOMEM is returned, which means
meeting a transient failure.
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/1608864736-24332-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The CPTS driver registers PTP PHC clock when first netif is going up and
unregister it when all netif are down. Now ethtool will show:
- PTP PHC clock index 0 after boot until first netif is up;
- the last assigned PTP PHC clock index even if PTP PHC clock is not
registered any more after all netifs are down.
This patch ensures that -1 is returned by ethtool when PTP PHC clock is not
registered any more.
Fixes: 8a2c9a5ab4b9 ("net: ethernet: ti: cpts: rework initialization/deinitialization")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201224162405.28032-1-grygorii.strashko@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.
Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Two race conditions can be triggered when storing xps rxqs, resulting in
various oops and invalid memory accesses:
1. Calling netdev_set_num_tc while netif_set_xps_queue:
- netif_set_xps_queue uses dev->tc_num as one of the parameters to
compute the size of new_dev_maps when allocating it. dev->tc_num is
also used to access the map, and the compiler may generate code to
retrieve this field multiple times in the function.
- netdev_set_num_tc sets dev->tc_num.
If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
is set to a higher value through netdev_set_num_tc, later accesses to
new_dev_maps in netif_set_xps_queue could lead to accessing memory
outside of new_dev_maps; triggering an oops.
2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
2.1. netdev_set_num_tc starts by resetting the xps queues,
dev->tc_num isn't updated yet.
2.2. netif_set_xps_queue is called, setting up the map with the
*old* dev->num_tc.
2.3. netdev_set_num_tc updates dev->tc_num.
2.4. Later accesses to the map lead to out of bound accesses and
oops.
A similar issue can be found with netdev_reset_tc.
One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_rxqs in a concurrent thread. With the right timing an oops is
triggered.
Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_rxqs_store.
Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Accesses to dev->xps_cpus_map (when using dev->num_tc) should be
protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't
see an actual bug being triggered, but let's be safe here and take the
rtnl lock while accessing the map in sysfs.
Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Two race conditions can be triggered when storing xps cpus, resulting in
various oops and invalid memory accesses:
1. Calling netdev_set_num_tc while netif_set_xps_queue:
- netif_set_xps_queue uses dev->tc_num as one of the parameters to
compute the size of new_dev_maps when allocating it. dev->tc_num is
also used to access the map, and the compiler may generate code to
retrieve this field multiple times in the function.
- netdev_set_num_tc sets dev->tc_num.
If new_dev_maps is allocated using dev->tc_num and then dev->tc_num
is set to a higher value through netdev_set_num_tc, later accesses to
new_dev_maps in netif_set_xps_queue could lead to accessing memory
outside of new_dev_maps; triggering an oops.
2. Calling netif_set_xps_queue while netdev_set_num_tc is running:
2.1. netdev_set_num_tc starts by resetting the xps queues,
dev->tc_num isn't updated yet.
2.2. netif_set_xps_queue is called, setting up the map with the
*old* dev->num_tc.
2.3. netdev_set_num_tc updates dev->tc_num.
2.4. Later accesses to the map lead to out of bound accesses and
oops.
A similar issue can be found with netdev_reset_tc.
One way of triggering this is to set an iface up (for which the driver
uses netdev_set_num_tc in the open path, such as bnx2x) and writing to
xps_cpus in a concurrent thread. With the right timing an oops is
triggered.
Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc
and netdev_reset_tc should be mutually exclusive. We do that by taking
the rtnl lock in xps_cpus_store.
Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cdc_ncm driver passes network connection notifications up to
usbnet_link_change(), which is the right place for any logging.
Remove the netdev_info() duplicating this from the driver itself.
This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN"
(ID 20f4:e02b) adapter from spamming the kernel log with
cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected
messages every 60 msec or so.
Signed-off-by: Roland Dreier <roland@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201224032116.2453938-1-roland@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of directly comparing task->tgid and task->pid, use the
thread_group_leader() helper. This helps with readability, and
there should be no functional change.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201218185032.2464558-3-jonathan.lemon@gmail.com
|
|
On some systems, some variant of the following splat is
repeatedly seen. The common factor in all traces seems
to be the entry point to task_file_seq_next(). With the
patch, all warnings go away.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: \x0926-....: (20992 ticks this GP) idle=d7e/1/0x4000000000000002 softirq=81556231/81556231 fqs=4876
\x09(t=21033 jiffies g=159148529 q=223125)
NMI backtrace for cpu 26
CPU: 26 PID: 2015853 Comm: bpftool Kdump: loaded Not tainted 5.6.13-0_fbk4_3876_gd8d1f9bf80bb #1
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A12 10/08/2018
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold.6+0x13/0x50
? lapic_can_unplug_cpu.cold.30+0x40/0x40
nmi_trigger_cpumask_backtrace+0xba/0xca
rcu_dump_cpu_stacks+0x99/0xc7
rcu_sched_clock_irq.cold.90+0x1b4/0x3aa
? tick_sched_do_timer+0x60/0x60
update_process_times+0x24/0x50
tick_sched_timer+0x37/0x70
__hrtimer_run_queues+0xfe/0x270
hrtimer_interrupt+0xf4/0x210
smp_apic_timer_interrupt+0x5e/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:get_pid_task+0x38/0x80
Code: 89 f6 48 8d 44 f7 08 48 8b 00 48 85 c0 74 2b 48 83 c6 55 48 c1 e6 04 48 29 f0 74 19 48 8d 78 20 ba 01 00 00 00 f0 0f c1 50 20 <85> d2 74 27 78 11 83 c2 01 78 0c 48 83 c4 08 c3 31 c0 48 83 c4 08
RSP: 0018:ffffc9000d293dc8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: ffff888637c05600 RBX: ffffc9000d293e0c RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000550 RDI: ffff888637c05620
RBP: ffffffff8284eb80 R08: ffff88831341d300 R09: ffff88822ffd8248
R10: ffff88822ffd82d0 R11: 00000000003a93c0 R12: 0000000000000001
R13: 00000000ffffffff R14: ffff88831341d300 R15: 0000000000000000
? find_ge_pid+0x1b/0x20
task_seq_get_next+0x52/0xc0
task_file_seq_get_next+0x159/0x220
task_file_seq_next+0x4f/0xa0
bpf_seq_read+0x159/0x390
vfs_read+0x8a/0x140
ksys_read+0x59/0xd0
do_syscall_64+0x42/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f95ae73e76e
Code: Bad RIP value.
RSP: 002b:00007ffc02c1dbf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000000000170faa0 RCX: 00007f95ae73e76e
RDX: 0000000000001000 RSI: 00007ffc02c1dc30 RDI: 0000000000000007
RBP: 00007ffc02c1ec70 R08: 0000000000000005 R09: 0000000000000006
R10: fffffffffffff20b R11: 0000000000000246 R12: 00000000019112a0
R13: 0000000000000000 R14: 0000000000000007 R15: 00000000004283c0
If unable to obtain the file structure for the current task,
proceed to the next task number after the one returned from
task_seq_get_next(), instead of the next task number from the
original iterator.
Also, save the stopping task number from task_seq_get_next()
on failure in case of restarts.
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201218185032.2464558-2-jonathan.lemon@gmail.com
|
|
20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked") introduced
a possibility of getting EBUSY error on lock contention, which seems to happen
very deterministically in test_maps when running 1024 threads on low-CPU
machine. In libbpf CI case, it's a 2 CPU VM and it's hitting this 100% of the
time. Work around by retrying on EBUSY (and EAGAIN, while we are at it) after
a small sleep. sched_yield() is too agressive and fails even after 20 retries,
so I went with usleep(1) for backoff.
Also log actual error returned to make it easier to see what's going on.
Fixes: 20b6cc34ea74 ("bpf: Avoid hashtab deadlock with map_locked")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201223200652.3417075-1-andrii@kernel.org
|
|
This flag can be used by an end user to disable S0ix flows on a
buggy system or by an OEM for development purposes.
If you need this flag to be persisted across reboots, it's suggested
to use a udev rule to call adjust it until the kernel could have your
configuration in a disallow list.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
commit e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME
systems") disabled s0ix flows for systems that have various incarnations of
the i219-LM ethernet controller. This changed caused power consumption
regressions on the following shipping Dell Comet Lake based laptops:
* Latitude 5310
* Latitude 5410
* Latitude 5410
* Latitude 5510
* Precision 3550
* Latitude 5411
* Latitude 5511
* Precision 3551
* Precision 7550
* Precision 7750
This commit was introduced because of some regressions on certain Thinkpad
laptops. This comment was potentially caused by an earlier
commit 632fbd5eb5b0e ("e1000e: fix S0ix flows for cable connected case").
or it was possibly caused by a system not meeting platform architectural
requirements for low power consumption. Other changes made in the driver
with extended timeouts are expected to make the driver more impervious to
platform firmware behavior.
Fixes: e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME systems")
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Per guidance from Intel ethernet architecture team, it may take
up to 1 second for unconfiguring ULP mode.
However in practice this seems to be taking up to 2 seconds on
some Lenovo machines. Detect scenarios that take more than 1 second
but less than 2.5 seconds and emit a warning on resume for those
scenarios.
Suggested-by: Aaron Ma <aaron.ma@canonical.com>
Suggested-by: Sasha Netfin <sasha.neftin@intel.com>
Suggested-by: Hans de Goede <hdegoede@redhat.com>
CC: Mark Pearson <markpearson@lenovo.com>
Fixes: f15bb6dde738cc8fa0 ("e1000e: Add support for S0ix")
BugLink: https://bugs.launchpad.net/bugs/1865570
Link: https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20200323191639.48826-1-aaron.ma@canonical.com/
Link: https://lkml.org/lkml/2020/12/13/15
Link: https://lkml.org/lkml/2020/12/14/708
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
If the shutdown failed, the part will be thawed and running
S0ix flows will put it into an undefined state.
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Tested-by: Yijun Shen <Yijun.shen@dell.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Commit f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
says "If the passive
CRQ initialization occurs before the FATAL reset task is processed,
the FATAL error reset task would try to access a CRQ message queue
that was freed, causing an oops. The problem may be most likely to
occur during DLPAR add vNIC with a non-default MTU, because the DLPAR
process will automatically issue a change MTU request.
Fix this by not processing fatal error reset if CRQ is passively
initialized after client-driven CRQ initialization fails."
The original commit skips a specific reset condition, but that does
not fix the problem it claims to fix, and misses a reset condition.
The effective fix is commit 0e435befaea4 ("ibmvnic: fix NULL pointer
dereference in ibmvic_reset_crq") and commit a0faaa27c716 ("ibmvnic:
fix NULL pointer dereference in reset_sub_crq_queues"). With above
two fixes, there are no more crashes seen as described even without
the original commit, so I would like to revert the original commit.
Fixes: f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201223204904.12677-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When mdiobus_register() fails, priv->mdio allocated
by mdiobus_alloc() has not been freed, which leads
to memleak.
Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201223110615.31389-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When aggregating ncsi interfaces and dedicated interfaces to bond
interfaces, the ncsi response handler will use the wrong net device to
find ncsi_dev, so that the ncsi interface will not work properly.
Here, we use the original net device to fix it.
Fixes: 138635cc27c9 ("net/ncsi: NCSI response packet handler")
Signed-off-by: John Wang <wangzhiqiang.bj@bytedance.com>
Link: https://lore.kernel.org/r/20201223055523.2069-1-wangzhiqiang.bj@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DCB uses the same handler function for both RTM_GETDCB and RTM_SETDCB
messages. dcb_doit() bounces RTM_SETDCB mesasges if the user does not have
the CAP_NET_ADMIN capability.
However, the operation to be performed is not decided from the DCB message
type, but from the DCB command. Thus DCB_CMD_*_GET commands are used for
reading DCB objects, the corresponding SET and DEL commands are used for
manipulation.
The assumption is that set-like commands will be sent via an RTM_SETDCB
message, and get-like ones via RTM_GETDCB. However, this assumption is not
enforced.
It is therefore possible to manipulate DCB objects without CAP_NET_ADMIN
capability by sending the corresponding command in an RTM_GETDCB message.
That is a bug. Fix it by validating the type of the request message against
the type used for the response.
Fixes: 2f90b8657ec9 ("ixgbe: this patch adds support for DCB to the kernel and ixgbe driver")
Signed-off-by: Petr Machata <me@pmachata.org>
Link: https://lore.kernel.org/r/a2a9b88418f3a58ef211b718f2970128ef9e3793.1608673640.git.me@pmachata.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch implements the same basic fix for event rings as the
previous one does for channels.
The result of issuing an event ring control command should be that
the event ring changes state. If enabled, a completion interrupt
signals that the event ring state has changed. This interrupt is
enabled by gsi_evt_ring_command() and disabled again after the
command has completed (or we time out).
There is a window of time during which the command could complete
successfully without interrupting. This would cause the event ring
to transition to the desired new state.
So whether a event ring command ends via completion interrupt or
timeout, we can consider the command successful if the event ring
has entered the desired state (and a failure if it has not,
regardless of the cause).
Fixes: b4175f8731f78 ("net: ipa: only enable GSI event control IRQs when needed")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The result of issuing a channel control command should be that the
channel changes state. If enabled, a completion interrupt signals
that the channel state has changed. This interrupt is enabled by
gsi_channel_command() and disabled again after the command has
completed (or we time out).
There is a window of time--after the completion interrupt is disabled
but before the channel state is read--during which the command could
complete successfully without interrupting. This would cause the
channel to transition to the desired new state.
So whether a channel command ends via completion interrupt or
timeout, we can consider the command successful if the channel
has entered the desired state (and a failure if it has not,
regardless of the cause).
Fixes: d6c9e3f506ae8 ("net: ipa: only enable generic command completion IRQ when needed");
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We enable the completion interrupt for channel or event ring
commands only when we issue them. The interrupt is disabled after
the interrupt has fired, or after we have timed out waiting for it.
If we time out, the command could complete after the interrupt has
been disabled, causing a state change in the channel or event ring.
The interrupt associated with that state change would be delivered
the next time the completion interrupt is enabled.
To avoid previous command completions interfering with new commands,
clear all pending completion interrupts before re-enabling them for
a new command.
Fixes: b4175f8731f78 ("net: ipa: only enable GSI event control IRQs when needed")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add TGL-H PCI info and PCI IDs for the new TSN Controller to the list
of supported devices.
Signed-off-by: Noor Azura Ahmad Tarmizi <noor.azura.ahmad.tarmizi@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Link: https://lore.kernel.org/r/20201222160337.30870-1-muhammad.husaini.zulkifli@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the core clock rate and interconnect bandwidth specifications
were moved into configuration data, a copy/paste bug was introduced,
causing the memory interconnect bandwidth to be set three times
rather than enabling the three different interconnects.
Fix this bug.
Fixes: 91d02f9551501 ("net: ipa: use config data for clocking")
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Georgi Djakov <georgi.djakov@linaro.org>
Link: https://lore.kernel.org/r/20201222151613.5730-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
virtnet_set_channels can recursively call cpus_read_lock if CONFIG_XPS
and CONFIG_HOTPLUG are enabled.
The path is:
virtnet_set_channels - calls get_online_cpus(), which is a trivial
wrapper around cpus_read_lock()
netif_set_real_num_tx_queues
netif_reset_xps_queues_gt
netif_reset_xps_queues - calls cpus_read_lock()
This call chain and potential deadlock happens when the number of TX
queues is reduced.
This commit the removes netif_set_real_num_[tr]x_queues calls from
inside the get/put_online_cpus section, as they don't require that it
be held.
Fixes: 47be24796c13 ("virtio-net: fix the set affinity bug when CPU IDs are not consecutive")
Signed-off-by: Jeff Dike <jdike@akamai.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20201223025421.671-1-jdike@akamai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
IPIP tunnels packets are unknown to device,
hence these packets are incorrectly parsed and
caused the packet corruption, so disable offlods
for such packets at run time.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Sudarsana Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://lore.kernel.org/r/20201221145530.7771-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Join adjacent questions to a single question line. This fixes the
formatting of questions that were not part of the heading.
Also, drop Q: and A: prefixes. We don't need them now that questions and
answers are visually separated.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lore.kernel.org/r/f76078ba5547744f2ec178984c32fbc7dcd29a2b.1608454187.git.baruch@tkos.co.il
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When mvneta_port_power_up() fails, we should execute
cleanup functions after label err_netdev to avoid memleak.
Fixes: 41c2b6b4f0f80 ("net: ethernet: mvneta: Add back interface mode validation")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20201220082930.21623-1-dinghao.liu@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks") frees
login_rsp_buffer in release_resources() and send_login()
because handle_login_rsp() does not free it.
Commit f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in
ibmvnic_adapter struct") frees login_rsp_buffer in handle_login_rsp().
It seems unnecessary to free it in release_resources() and send_login().
There are chances that handle_login_rsp returns earlier without freeing
buffers. Double-checking the buffer is harmless since
release_login_buffer and release_login_rsp_buffer will
do nothing if buffer is already freed.
Fixes: f3ae59c0c015 ("ibmvnic: store RX and TX subCRQ handle array in ibmvnic_adapter struct")
Fixes: 34f0f4e3f488 ("ibmvnic: Fix login buffer memory leaks")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201219213919.21045-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When searching for inactive maintainers it's useful to filter
out mailing list addresses. Such "maintainers" will obviously
never feature in a "From:" line of an email or a review tag.
Since "L:" entries only provide the address of a mailing list
without a fancy name extend this pattern to "M:" entries.
Alternatively we could reserve M: entries for humans only
and move the fake "maintainers" to L:. While I'd personally
prefer to reserve M: for humans only, I'm not 100% that's
a great choice either, given most L: entries are in fact
open mailing lists with public archives.
Link: https://lore.kernel.org/r/20201219185538.750076-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The dwmac glue registers on Amlogic Meson8b and newer SoCs has two clock
inputs:
- Meson8b and Meson8m2: MPLL2 and MPLL2 (the same parent is wired to
both inputs)
- GXBB, GXL, GXM, AXG, G12A, G12B, SM1: FCLK_DIV2 and MPLL2
All known vendor kernels and u-boots are using the first input only. We
let the common clock framework automatically choose the "right" parent.
For some boards this causes a problem though, specificially with G12A and
newer SoCs. The clock input is used for generating the 125MHz RGMII TX
clock. For the two input clocks this means on G12A:
- FCLK_DIV2: 999999985Hz / 8 = 124999998.125Hz
- MPLL2: 499999993Hz / 4 = 124999998.25Hz
In theory MPLL2 is the "better" clock input because it's gets us 0.125Hz
closer to the requested frequency than FCLK_DIV2. In reality however
there is a resource conflict because MPLL2 is needed to generate some of
the audio clocks. dwmac-meson8b probes first and sets up the clock tree
with MPLL2. This works fine until the audio driver comes and "steals"
the MPLL2 clocks and configures it with it's own rate (294909637Hz). The
common clock framework happily changes the MPLL2 rate but does not
reconfigure our RGMII TX clock tree, which then ends up at 73727409Hz,
which is more than 40% off the requested 125MHz.
Don't use the second clock input for now to force the common clock
framework to always select the first parent. This mimics the behavior
from the vendor driver and fixes the clock resource conflict with the
audio driver on G12A boards. Once the common clock framework can handle
this situation this change can be reverted again.
Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
Reported-by: Thomas Graichen <thomas.graichen@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Tested-by: thomas graichen <thomas.graichen@gmail.com>
Link: https://lore.kernel.org/r/20201219135036.3216017-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During GoP port 2 Networking Complex Control mode of operation configurations,
also GoP port 3 mode of operation was wrongly set.
Patch removes these configurations.
Fixes: f84bf386f395 ("net: mvpp2: initialize the GoP")
Acked-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1608462149-1702-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PPPIOCGL2TPSTATS already uses 54. This shouldn't be a problem in
practice, but let's keep the logical decreasing assignment scheme.
Fixes: 4cf476ced45d ("ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://lore.kernel.org/r/e3a4c355e3820331d8e1fffef8522739aae58b57.1608380117.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This error path needs to disable the pci device before returning.
Fixes: ede58ef28e10 ("atm: remove deprecated use of pci api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/X93dmC4NX0vbTpGp@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Let the FW know we have enough receive buffer space for the
vlan tag if it isn't stripped.
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/20201218215001.64696-1-snelson@pensando.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ugeth is the netdiv_priv() part of the netdevice. Accessing the memory
pointed to by ugeth (such as done by ucc_geth_memclean() and the two
of_node_puts) after free_netdev() is thus use-after-free.
Fixes: 80a9fad8e89a ("ucc_geth: fix module removal")
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|