linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-01-25	ionic: Query FW when getting VF info via ndo_get_vf_config	Brett Creeley	5	-8/+148
	Currently when an administrator configures a VF via ndo_set_vf, the driver will send the set command to FW and then update the cached value. The cached value is then used when reporting VF info via ndo_get_vf_config. A problem is that the VF info may have been updated between the last ndo_set_vf and ndo_get_vf_info commands via some other method, i.e. a VF changes its MAC address (assuming it's allowed to do so) and since this is all managed by the FW, this new value won't be reflected in the PF's cache of values. To fix this, update the driver to always get the latest VF information by making use of the IONIC_CMD_VF_GETATTR dev command. The FW may not support getting all the attributes for IONIC_CMD_VF_GETATTR, so the driver will only update the cached VF config members if their associated IONIC_CMD_VF_GETATTR was successful. Otherwise the cached VF config members will remain the same as what was set in ndo_set_vf*. Fixes: fbb39807e9ae ("ionic: support sr-iov operations") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: Allow flexibility for error reporting on dev commands	Brett Creeley	2	-4/+30
	When dev commands fail, an error message will always be printed, which may be overly alarming the to system administrators, especially if the driver shouldn't be printing the error due to some unsupported capability. Similar to recent adminq request changes, we can update the dev command interface with the ability to selectively print error messages to allow the driver to prevent printing errors that are expected. Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: Correctly print AQ errors if completions aren't received	Brett Creeley	1	-0/+2
	Recent changes went into the driver to allow flexibility when printing error messages. Unfortunately this had the unexpected consequence of printing confusing messages like the following: IONIC_CMD_RX_FILTER_ADD (31) failed: IONIC_RC_SUCCESS (-6) In cases like this the completion of the admin queue command never completes, so the completion status is 0, hence IONIC_RC_SUCCESS is printed even though the command clearly failed. For example, this could happen when the driver tries to add a filter and at the same time the FW goes through a reset, so the AQ command never completes. Fix this by forcing the FW completion status to IONIC_RC_ERROR in cases where we never get the completion. Fixes: 8c9d956ab6fb ("ionic: allow adminq requests to override default error message") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: fix up printing of timeout error	Shannon Nelson	1	-2/+6
	Make sure we print the TIMEOUT string if we had a timeout error, rather than printing the wrong status. Fixes: 8c9d956ab6fb ("ionic: allow adminq requests to override default error message") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: better handling of RESET event	Shannon Nelson	2	-10/+24
	When IONIC_EVENT_RESET is received, we only need to start the fw_down process if we aren't already down, and we need to be sure to set the FW_STOPPING state on the way. If this is how we noticed that FW was stopped, it is most likely from a FW update, and we'll see a new FW generation. The update happens quickly enough that we might not see fw_status==0, so we need to be sure things get restarted when we see the fw_generation change. Fixes: d2662072c094 ("ionic: monitor fw status generation") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: add FW_STOPPING state	Shannon Nelson	4	-12/+19
	Between fw running and fw actually stopped into reset, we need a fw_stopping concept to catch and block some actions while we're transitioning to FW_RESET state. This will help to be sure the fw_up task is not scheduled until after the fw_down task has completed. On some rare occasion timing, it is possible for the fw_up task to try to run before the fw_down task, then not get run after the fw_down task has run, leaving the device in a down state. This is possible if the watchdog goes off in between finding the down transition and starting the fw_down task, where the later watchdog sees the FW is back up and schedules a fw_up task. Fixes: c672412f6172 ("ionic: remove lifs on fw reset") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: Don't send reset commands if FW isn't running	Brett Creeley	3	-13/+23
	It's possible the FW is already shutting down while the driver is being removed and/or when the driver is going through reset. This can cause unexpected/unnecessary errors to be printed: eth0: DEV_CMD IONIC_CMD_PORT_RESET (12) error, IONIC_RC_ERROR (29) failed eth1: DEV_CMD IONIC_CMD_RESET (3) error, IONIC_RC_ERROR (29) failed Fix this by checking the FW status register before issuing the reset commands. Also, since err may not be assigned in ionic_port_reset(), assign it a default value of 0, and remove an unnecessary log message. Fixes: fbfb8031533c ("ionic: Add hardware init and device commands") Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: separate function for watchdog init	Shannon Nelson	1	-12/+19
	Pull the watchdog init code out to a separate bite-sized function. Code cleaning for now, will be a useful change in the near future. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: start watchdog after all is setup	Shannon Nelson	2	-4/+3
	The watchdog expects the lif to fully exist when it goes off, so lets not start the watchdog until all is ready in case there is some quirky time dialation that makes probe take multiple seconds. Fixes: 089406bc5ad6 ("ionic: add a watchdog timer to monitor heartbeat") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ionic: fix type complaint in ionic_dev_cmd_clean()	Shannon Nelson	1	-3/+3
	Sparse seems to have gotten a little more picky lately and we need to revisit this bit of code to make sparse happy. warning: incorrect type in initializer (different address spaces) expected union ionic_dev_cmd_regs regs got union ionic_dev_cmd_regs [noderef] __iomem dev_cmd_regs warning: incorrect type in argument 2 (different address spaces) expected void [noderef] __iomem * got unsigned int * warning: incorrect type in argument 1 (different address spaces) expected void volatile [noderef] __iomem * got union ionic_dev_cmd * Fixes: d701ec326a31 ("ionic: clean up sparse complaints") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ipv4: get rid of fib_info_hash_{alloc\|free}	Eric Dumazet	1	-33/+11
	Use kvzalloc()/kvfree() instead of hand coded functions. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-25	ip6_tunnel: allow routing IPv4 traffic in NBMA mode	Qing Deng	1	-0/+5
	Since IPv4 routes support IPv6 gateways now, we can route IPv4 traffic in NBMA tunnels. Signed-off-by: Qing Deng <i@moy.cat> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	net: use bool values to pass bool param of phy_init_eee()	Jisheng Zhang	7	-7/+7
	The 2nd param of phy_init_eee(): clk_stop_enable is a bool param, use true or false instead of 1/0. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220123152241.1480-1-jszhang@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24	net: fec_ptp: remove redundant initialization of variable val	Colin Ian King	1	-1/+0
	Variable val is being initialized with a value that is never read, it is being re-assigned later. The assignment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20220123184936.113486-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24	net: usb: asix: remove redundant assignment to variable reg	Colin Ian King	1	-1/+0
	Variable reg is being masked however the variable is never read after this. The assignment is redundant and can be removed. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20220123184035.112785-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24	selftests, bpf: Do not yet switch to new libbpf XDP APIs	Daniel Borkmann	7	-47/+50
	Revert commit 544356524dd6 ("selftests/bpf: switch to new libbpf XDP APIs") for now given this will heavily conflict with 4b27480dcaa7 ("bpf/selftests: convert xdp_link test to ASSERT_* macros") upon merge. Andrii agreed to redo the conversion cleanly after trees merged. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org>
2022-01-24	can: flexcan: mark RX via mailboxes as supported on MCF5441X	Marc Kleine-Budde	2	-1/+2
	Most flexcan IP cores support 2 RX modes: - FIFO - mailbox The flexcan IP core on the MCF5441X cannot receive CAN RTR messages via mailboxes. However the mailbox mode is more performant. The commit \| 1c45f5778a3b ("can: flexcan: add ethtool support to change rx-rtr setting during runtime") added support to switch from FIFO to mailbox mode on these cores. After testing the mailbox mode on the MCF5441X by Angelo Dureghello, this patch marks it (without RTR capability) as supported. Further the IP core overview table is updated, that RTR reception via mailboxes is not supported. Link: https://lore.kernel.org/all/20220121084425.3141218-1-mkl@pengutronix.de Tested-by: Angelo Dureghello <angelo@kernel-space.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24	can: tcan4x5x: regmap: fix max register value	Marc Kleine-Budde	1	-1/+1
	The MRAM of the tcan4x5x has a size of 2K and starts at 0x8000. There are no further registers in the tcan4x5x making 0x87fc the biggest addressable register. This patch fixes the max register value of the regmap config from 0x8ffc to 0x87fc. Fixes: 6e1caaf8ed22 ("can: tcan4x5x: fix max register value") Link: https://lore.kernel.org/all/20220119064011.2943292-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24	can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0	Marc Kleine-Budde	1	-0/+6
	In order to optimize FIFO access, especially on m_can cores attached to slow busses like SPI, in patch \| e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors") bulk read/write support has been added to the m_can_fifo_{read,write} functions. That change leads to the tcan driver to call regmap_bulk_{read,write}() with a length of 0 (for CAN frames with 0 data length). regmap treats this as an error: \| tcan4x5x spi1.0 tcan4x5x0: FIFO write returned -22 This patch fixes the problem by not calling the cdev->ops->{read,write)_fifo() in case of a 0 length read/write. Fixes: e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors") Link: https://lore.kernel.org/all/20220114155751.2651888-1-mkl@pengutronix.de Cc: stable@vger.kernel.org Cc: Matt Kline <matt@bitbashing.io> Cc: Chandrasekar Ramakrishnan <rcsekar@samsung.com> Reported-by: Michael Anochin <anochin@photo-meter.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24	dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config	Marc Kleine-Budde	1	-1/+1
	This tcan4x5x only comes with 2K of MRAM, a RX FIFO with a dept of 32 doesn't fit into the MRAM. Use a depth of 16 instead. Fixes: 4edd396a1911 ("dt-bindings: can: tcan4x5x: Add DT bindings for TCAN4x5X driver") Link: https://lore.kernel.org/all/20220119062951.2939851-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24	mailmap: update email address of Brian Silverman	Marc Kleine-Budde	1	-0/+1
	Brian Silverman's address at bluerivertech.com is not valid anymore, use Brian's private email address instead. Link: https://lore.kernel.org/all/20220110082359.2019735-1-mkl@pengutronix.de Cc: Brian Silverman <bsilver16384@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-01-24	selftests, xsk: Fix rx_full stats test	Magnus Karlsson	1	-1/+4
	Fix the rx_full stats test so that it correctly reports pass even when the fill ring is not full of buffers. Fixes: 872a1184dbf2 ("selftests: xsk: Put the same buffer only once in the fill ring") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20220121123508.12759-1-magnus.karlsson@gmail.com
2022-01-24	bpf: Fix flexible_array.cocci warnings	kernel test robot	1	-1/+1
	Zero-length and one-element arrays are deprecated, see: Documentation/process/deprecated.rst Flexible-array members should be used instead. Generated by: scripts/coccinelle/misc/flexible_array.cocci Fixes: c1ff181ffabc ("selftests/bpf: Extend kfunc selftests") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/alpine.DEB.2.22.394.2201221206320.12220@hadrien
2022-01-24	net: stmmac: remove unused members in struct stmmac_priv	Jisheng Zhang	1	-2/+0
	The tx_coalesce and mii_irq are not used at all now, so remove them. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	net: atlantic: Use the bitmap API instead of hand-writing it	Christophe JAILLET	1	-4/+2
	Simplify code by using bitmap_weight() and bitmap_zero() instead of hand-writing these functions. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	ping: fix the sk_bound_dev_if match in ping_lookup	Xin Long	1	-1/+2
	When 'ping' changes to use PING socket instead of RAW socket by: # sysctl -w net.ipv4.ping_group_range="0 100" the selftests 'router_broadcast.sh' will fail, as such command # ip vrf exec vrf-h1 ping -I veth0 198.51.100.255 -b can't receive the response skb by the PING socket. It's caused by mismatch of sk_bound_dev_if and dif in ping_rcv() when looking up the PING socket, as dif is vrf-h1 if dif's master was set to vrf-h1. This patch is to fix this regression by also checking the sk_bound_dev_if against sdif so that the packets can stil be received even if the socket is not bound to the vrf device but to the real iif. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	net/smc: Transitional solution for clcsock race issue	Wen Gu	1	-12/+51
	We encountered a crash in smc_setsockopt() and it is caused by accessing smc->clcsock after clcsock was released. BUG: kernel NULL pointer dereference, address: 0000000000000020 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E 5.16.0-rc4+ #53 RIP: 0010:smc_setsockopt+0x59/0x280 [smc] Call Trace: <TASK> __sys_setsockopt+0xfc/0x190 __x64_sys_setsockopt+0x20/0x30 do_syscall_64+0x34/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f16ba83918e </TASK> This patch tries to fix it by holding clcsock_release_lock and checking whether clcsock has already been released before access. In case that a crash of the same reason happens in smc_getsockopt() or smc_switch_to_fallback(), this patch also checkes smc->clcsock in them too. And the caller of smc_switch_to_fallback() will identify whether fallback succeeds according to the return value. Fixes: fd57770dd198 ("net/smc: wait for pending work before clcsock release_sock") Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/ Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	ibmvnic: remove unused ->wait_capability	Sukadev Bhattiprolu	2	-25/+14
	With previous bug fix, ->wait_capability flag is no longer needed and can be removed. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	ibmvnic: don't spin in tasklet	Sukadev Bhattiprolu	1	-6/+0
	ibmvnic_tasklet() continuously spins waiting for responses to all capability requests. It does this to avoid encountering an error during initialization of the vnic. However if there is a bug in the VIOS and we do not receive a response to one or more queries the tasklet ends up spinning continuously leading to hard lock ups. If we fail to receive a message from the VIOS it is reasonable to timeout the login attempt rather than spin indefinitely in the tasklet. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	ibmvnic: init ->running_cap_crqs early	Sukadev Bhattiprolu	1	-35/+71
	We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should send out the next protocol message type. i.e when we get back responses to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs. Similiary, when we get responses to all the REQUEST_CAPABILITY crqs, we send out the QUERY_IP_OFFLOAD CRQ. We currently increment ->running_cap_crqs as we send out each CRQ and have the ibmvnic_tasklet() send out the next message type, when this running_cap_crqs count drops to 0. This assumes that all the CRQs of the current type were sent out before the count drops to 0. However it is possible that we send out say 6 CRQs, get preempted and receive all the 6 responses before we send out the remaining CRQs. This can result in ->running_cap_crqs count dropping to zero before all messages of the current type were sent and we end up sending the next protocol message too early. Instead initialize the ->running_cap_crqs upfront so the tasklet will only send the next protocol message after all responses are received. Use the cap_reqs local variable to also detect any discrepancy (either now or in future) in the number of capability requests we actually send. Currently only send_query_cap() is affected by this behavior (of sending next message early) since it is called from the worker thread (during reset) and from application thread (during ->ndo_open()) and they can be preempted. send_request_cap() is only called from the tasklet which processes CRQ responses sequentially, is not be affected. But to maintain the existing symmtery with send_query_capability() we update send_request_capability() also. Fixes: 249168ad07cd ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	ibmvnic: Allow extra failures before disabling	Sukadev Bhattiprolu	1	-4/+17
	If auto-priority-failover (APF) is enabled and there are at least two backing devices of different priorities, some resets like fail-over, change-param etc can cause at least two back to back failovers. (Failover from high priority backing device to lower priority one and then back to the higher priority one if that is still functional). Depending on the timimg of the two failovers it is possible to trigger a "hard" reset and for the hard reset to fail due to failovers. When this occurs, the driver assumes that the network is unstable and disables the VNIC for a 60-second "settling time". This in turn can cause the ethtool command to fail with "No such device" while the vnic automatically recovers a little while later. Given that it's possible to have two back to back failures, allow for extra failures before disabling the vnic for the settling time. Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	ipv4: fix ip option filtering for locally generated fragments	Jakub Kicinski	1	-3/+12
	During IP fragmentation we sanitize IP options. This means overwriting options which should not be copied with NOPs. Only the first fragment has the original, full options. ip_fraglist_prepare() copies the IP header and options from previous fragment to the next one. Commit 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators") moved sanitizing options before ip_fraglist_prepare() which means options are sanitized and then overwritten again with the old values. Fixing this is not enough, however, nor did the sanitization work prior to aforementioned commit. ip_options_fragment() (which does the sanitization) uses ipcb->opt.optlen for the length of the options. ipcb->opt of fragments is not populated (it's 0), only the head skb has the state properly built. So even when called at the right time ip_options_fragment() does nothing. This seems to date back all the way to v2.5.44 when the fast path for pre-fragmented skbs had been introduced. Prior to that ip_options_build() would have been called for every fragment (in fact ever since v2.5.44 the fragmentation handing in ip_options_build() has been dead code, I'll clean it up in -next). In the original patch (see Link) caixf mentions fixing the handling for fragments other than the second one, but I'm not sure how _any_ fragment could have had their options sanitized with the code as it stood. Tested with python (MTU on lo lowered to 1000 to force fragmentation): import socket s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) s.setsockopt(socket.IPPROTO_IP, socket.IP_OPTIONS, bytearray([7,4,5,192, 20\|0x80,4,1,0])) s.sendto(b'1'*2000, ('127.0.0.1', 1234)) Before: IP (tos 0x0, ttl 64, id 1053, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost.36500 > localhost.search-agent: UDP, length 2000 IP (tos 0x0, ttl 64, id 1053, offset 968, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost > localhost: udp IP (tos 0x0, ttl 64, id 1053, offset 1936, flags [none], proto UDP (17), length 100, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost > localhost: udp After: IP (tos 0x0, ttl 96, id 42549, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256)) localhost.51607 > localhost.search-agent: UDP, bad length 2000 > 960 IP (tos 0x0, ttl 96, id 42549, offset 968, flags [+], proto UDP (17), length 996, options (NOP,NOP,NOP,NOP,RA value 256)) localhost > localhost: udp IP (tos 0x0, ttl 96, id 42549, offset 1936, flags [none], proto UDP (17), length 100, options (NOP,NOP,NOP,NOP,RA value 256)) localhost > localhost: udp RA (20 \| 0x80) is now copied as expected, RR (7) is "NOPed out". Link: https://lore.kernel.org/netdev/20220107080559.122713-1-ooppublic@163.com/ Fixes: 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: caixf <ooppublic@163.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	net-procfs: show net devices bound packet types	Jianguo Wu	1	-3/+32
	After commit:7866a621043f ("dev: add per net_device packet type chains"), we can not get packet types that are bound to a specified net device by /proc/net/ptype, this patch fix the regression. Run "tcpdump -i ens192 udp -nns0" Before and after apply this patch: Before: [root@localhost ~]# cat /proc/net/ptype Type Device Function 0800 ip_rcv 0806 arp_rcv 86dd ipv6_rcv After: [root@localhost ~]# cat /proc/net/ptype Type Device Function ALL ens192 tpacket_rcv 0800 ip_rcv 0806 arp_rcv 86dd ipv6_rcv v1 -> v2: - fix the regression rather than adding new /proc API as suggested by Stephen Hemminger. Fixes: 7866a621043f ("dev: add per net_device packet type chains") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	bonding: use rcu_dereference_rtnl when get bonding active slave	Hangbin Liu	2	-5/+1
	bond_option_active_slave_get_rcu() should not be used in rtnl_mutex as it use rcu_dereference(). Replace to rcu_dereference_rtnl() so we also can use this function in rtnl protected context. With this update, we can rmeove the rcu_read_lock/unlock in bonding .ndo_eth_ioctl and .get_ts_info. Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com> Fixes: 94dd016ae538 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-24	net: sfp: ignore disabled SFP node	Marek Behún	1	-0/+5
	Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages") added code which finds SFP bus DT node even if the node is disabled with status = "disabled". Because of this, when phylink is created, it ends with non-null .sfp_bus member, even though the SFP module is not probed (because the node is disabled). We need to ignore disabled SFP bus node. Fixes: ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages") Signed-off-by: Marek Behún <kabel@kernel.org> Cc: stable@vger.kernel.org # 2203cbf2c8b5 ("net: sfp: move fwnode parsing into sfp-bus layer") Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-21	selftests: net: ioam: expect support for Queue depth data	Justin Iurman	1	-4/+1
	The IOAM queue-depth data field was added a few weeks ago, but the test unit was not updated accordingly. Reported-by: kernel test robot <oliver.sang@intel.com> Fixes: b63c5478e9cb ("ipv6: ioam: Support for Queue depth data field") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://lore.kernel.org/r/20220121173449.26918-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-21	mptcp: Use struct_group() to avoid cross-field memset()	Kees Cook	1	-3/+3
	In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use struct_group() to capture the fields to be reset, so that memset() can be appropriately bounds-checked by the compiler. Cc: Matthieu Baerts <matthieu.baerts@tessares.net> Cc: mptcp@lists.linux.dev Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220121073935.1154263-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-22	rxrpc: Adjust retransmission backoff	David Howells	2	-6/+4
	Improve retransmission backoff by only backing off when we retransmit data packets rather than when we set the lost ack timer. To this end: (1) In rxrpc_resend(), use rxrpc_get_rto_backoff() when setting the retransmission timer and only tell it that we are retransmitting if we actually have things to retransmit. Note that it's possible for the retransmission algorithm to race with the processing of a received ACK, so we may see no packets needing retransmission. (2) In rxrpc_send_data_packet(), don't bump the backoff when setting the ack_lost_at timer, as it may then get bumped twice. With this, when looking at one particular packet, the retransmission intervals were seen to be 1.5ms, 2ms, 3ms, 5ms, 9ms, 17ms, 33ms, 71ms, 136ms, 264ms, 544ms, 1.088s, 2.1s, 4.2s and 8.3s. Fixes: c410bf01933e ("rxrpc: Fix the excessive initial retransmission timeout") Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/164138117069.2023386.17446904856843997127.stgit@warthog.procyon.org.uk/ Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-21	xdp: disable XDP_REDIRECT for xdp frags	Lorenzo Bianconi	1	-0/+8
	XDP_REDIRECT is not fully supported yet for xdp frags since not all XDP capable drivers can map non-linear xdp_frame in ndo_xdp_xmit so disable it for the moment. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/0da25e117d0e2673f5d0ce6503393c55c6fb1be9.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags	Lorenzo Bianconi	6	-1/+185
	Verify compatibility checks attaching a XDP frags program to a CPUMAP/DEVMAP Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/d94b4d35adc1e42c9ca5004e6b2cdfd75992304d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest	Lorenzo Bianconi	2	-0/+146
	Introduce kernel selftest for new bpf_xdp_{load,store}_bytes helpers. and bpf_xdp_pointer/bpf_xdp_copy_buf utility routines. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/2c99ae663a5dcfbd9240b1d0489ad55dea4f4601.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	net: xdp: introduce bpf_xdp_pointer utility routine	Lorenzo Bianconi	3	-38/+174
	Similar to skb_header_pointer, introduce bpf_xdp_pointer utility routine to return a pointer to a given position in the xdp_buff if the requested area (offset + len) is contained in a contiguous memory area otherwise it will be copied in a bounce buffer provided by the caller. Similar to the tc counterpart, introduce the two following xdp helpers: - bpf_xdp_load_bytes - bpf_xdp_store_bytes Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: generalise tail call map compatibility check	Toke Hoiland-Jorgensen	6	-40/+48
	The check for tail call map compatibility ensures that tail calls only happen between maps of the same type. To ensure backwards compatibility for XDP frags we need a similar type of check for cpumap and devmap programs, so move the state from bpf_array_aux into bpf_map, add xdp_has_frags to the check, and apply the same check to cpumap and devmap. Acked-by: John Fastabend <john.fastabend@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Toke Hoiland-Jorgensen <toke@redhat.com> Link: https://lore.kernel.org/r/f19fd97c0328a39927f3ad03e1ca6b43fd53cdfd.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	libbpf: Add SEC name for xdp frags programs	Lorenzo Bianconi	1	-0/+8
	Introduce support for the following SEC entries for XDP frags property: - SEC("xdp.frags") - SEC("xdp.frags/devmap") - SEC("xdp.frags/cpumap") Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/af23b6e4841c171ad1af01917839b77847a4bc27.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: selftests: update xdp_adjust_tail selftest to include xdp frags	Eelco Chaudron	3	-7/+160
	This change adds test cases for the xdp frags scenarios when shrinking and growing. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/d2e6a0ebc52db6f89e62b9befe045032e5e0a5fe.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature	Lorenzo Bianconi	1	-9/+39
	introduce xdp_shared_info pointer in bpf_test_finish signature in order to copy back paged data from a xdp frags frame to userspace buffer Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c803673798c786f915bcdd6c9338edaa9740d3d6.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: introduce frags support to bpf_prog_test_run_xdp()	Lorenzo Bianconi	1	-13/+45
	Introduce the capability to allocate a xdp frags in bpf_prog_test_run_xdp routine. This is a preliminary patch to introduce the selftests for new xdp frags ebpf helpers Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/b7c0e425a9287f00f601c4fc0de54738ec6ceeea.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: move user_size out of bpf_test_init	Lorenzo Bianconi	1	-6/+7
	Rely on data_size_in in bpf_test_init routine signature. This is a preliminary patch to introduce xdp frags selftest Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/6b48d38ed3d60240d7d6bb15e6fa7fabfac8dfb2.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: add frags support to xdp copy helpers	Eelco Chaudron	4	-36/+137
	This patch adds support for frags for the following helpers: - bpf_xdp_output() - bpf_perf_event_output() Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/340b4a99cdc24337b40eaf8bb597f9f9e7b0373e.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21	bpf: add frags support to the bpf_xdp_adjust_tail() API	Eelco Chaudron	4	-8/+88
	This change adds support for tail growing and shrinking for XDP frags. When called on a non-linear packet with a grow request, it will work on the last fragment of the packet. So the maximum grow size is the last fragments tailroom, i.e. no new buffer will be allocated. A XDP frags capable driver is expected to set frag_size in xdp_rxq_info data structure to notify the XDP core the fragment size. frag_size set to 0 is interpreted by the XDP core as tail growing is not allowed. Introduce __xdp_rxq_info_reg utility routine to initialize frag_size field. When shrinking, it will work from the last fragment, all the way down to the base buffer depending on the shrinking size. It's important to mention that once you shrink down the fragment(s) are freed, so you can not grow again to the original size. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/eabda3485dda4f2f158b477729337327e609461d.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>