wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2021-07-14	r8152: Fix potential PM refcount imbalance	Takashi Iwai	1	-1/+2
	rtl8152_close() takes the refcount via usb_autopm_get_interface() but it doesn't release when RTL8152_UNPLUG test hits. This may lead to the imbalance of PM refcount. This patch addresses it. Link: https://bugzilla.suse.com/show_bug.cgi?id=1186194 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-14	fs: add vfs_parse_fs_param_source() helper	Christian Brauner	3	-27/+43
	Add a simple helper that filesystems can use in their parameter parser to parse the "source" parameter. A few places open-coded this function and that already caused a bug in the cgroup v1 parser that we fixed. Let's make it harder to get this wrong by introducing a helper which performs all necessary checks. Link: https://syzkaller.appspot.com/bug?id=6312526aba5beae046fdae8f00399f87aab48b12 Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-14	cgroup: verify that source is a string	Christian Brauner	1	-0/+2
	The following sequence can be used to trigger a UAF: int fscontext_fd = fsopen("cgroup"); int fd_null = open("/dev/null, O_RDONLY); int fsconfig(fscontext_fd, FSCONFIG_SET_FD, "source", fd_null); close_range(3, ~0U, 0); The cgroup v1 specific fs parser expects a string for the "source" parameter. However, it is perfectly legitimate to e.g. specify a file descriptor for the "source" parameter. The fs parser doesn't know what a filesystem allows there. So it's a bug to assume that "source" is always of type fs_value_is_string when it can reasonably also be fs_value_is_file. This assumption in the cgroup code causes a UAF because struct fs_parameter uses a union for the actual value. Access to that union is guarded by the param->type member. Since the cgroup paramter parser didn't check param->type but unconditionally moved param->string into fc->source a close on the fscontext_fd would trigger a UAF during put_fs_context() which frees fc->source thereby freeing the file stashed in param->file causing a UAF during a close of the fd_null. Fix this by verifying that param->type is actually a string and report an error if not. In follow up patches I'll add a new generic helper that can be used here and by other filesystems instead of this error-prone copy-pasta fix. But fixing it in here first makes backporting a it to stable a lot easier. Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing") Reported-by: syzbot+283ce5a46486d6acdbaf@syzkaller.appspotmail.com Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@kernel.org> Cc: syzkaller-bugs <syzkaller-bugs@googlegroups.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-13	net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave()	Vladimir Oltean	1	-2/+2
	This was not caught because there is no switch driver which implements the .port_bridge_join but not .port_bridge_leave method, but it should nonetheless be fixed, as in certain conditions (driver development) it might lead to NULL pointer dereference. Fixes: f66a6a69f97a ("net: dsa: permit cross-chip bridging between all trees in the system") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	sfc: add logs explaining XDP_TX/REDIRECT is not available	Íñigo Huguet	1	-0/+4
	If it's not possible to allocate enough channels for XDP, XDP_TX and XDP_REDIRECT don't work. However, only a message saying that not enough channels were available was shown, but not saying what are the consequences in that case. The user didn't know if he/she can use XDP or not, if the performance is reduced, or what. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	sfc: ensure correct number of XDP queues	Íñigo Huguet	1	-7/+8
	Commit 99ba0ea616aa ("sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues") intended to fix a problem caused by a round up when calculating the number of XDP channels and queues. However, this was not the real problem. The real problem was that the number of XDP TX queues had been reduced to half in commit e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues"), but the variable xdp_tx_queue_count had remained the same. Once the correct number of XDP TX queues is created again in the previous patch of this series, this also can be reverted since the error doesn't actually exist. Only in the case that there is a bug in the code we can have different values in xdp_queue_number and efx->xdp_tx_queue_count. Because of this, and per Edward Cree's suggestion, I add instead a WARN_ON to catch if it happens again in the future. Note that the number of allocated queues can be higher than the number of used ones due to the round up, as explained in the existing comment in the code. That's why we also have to stop increasing xdp_queue_number beyond efx->xdp_tx_queue_count. Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	sfc: fix lack of XDP TX queues - error XDP TX failed (-22)	Íñigo Huguet	1	-1/+2
	Fixes: e26ca4b53582 sfc: reduce the number of requested xdp ev queues The buggy commit intended to allocate less channels for XDP in order to be more unlikely to reach the limit of 32 channels of the driver. The idea was to use each IRQ/eventqeue for more XDP TX queues than before, calculating which is the maximum number of TX queues that one event queue can handle. For example, in EF10 each event queue could handle up to 8 queues, better than the 4 they were handling before the change. This way, it would have to allocate half of channels than before for XDP TX. The problem is that the TX queues are also contained inside the channel structs, and there are only 4 queues per channel. Reducing the number of channels means also reducing the number of queues, resulting in not having the desired number of 1 queue per CPU. This leads to getting errors on XDP_TX and XDP_REDIRECT if they're executed from a high numbered CPU, because there only exist queues for the low half of CPUs, actually. If XDP_TX/REDIRECT is executed in a low numbered CPU, the error doesn't happen. This is the error in the logs (repeated many times, even rate limited): sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22) This errors happens in function efx_xdp_tx_buffers, where it expects to have a dedicated XDP TX queue per CPU. Reverting the change makes again more likely to reach the limit of 32 channels in machines with many CPUs. If this happen, no XDP_TX/REDIRECT will be possible at all, and we will have this log error messages: At interface probe: sfc 0000:5e:00.0: Insufficient resources for 12 XDP event queues (24 other channels, max 32) At every subsequent XDP_TX/REDIRECT failure, rate limited: sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22) However, without reverting the change, it makes the user to think that everything is OK at probe time, but later it fails in an unpredictable way, depending on the CPU that handles the packet. It is better to restore the predictable behaviour. If the user sees the error message at probe time, he/she can try to configure the best way it fits his/her needs. At least, he/she will have 2 options: - Accept that XDP_TX/REDIRECT is not available (he/she may not need it) - Load sfc module with modparam 'rss_cpus' with a lower number, thus creating less normal RX queues/channels, letting more free resources for XDP, with some performance penalty. Anyway, let the calculation of maximum TX queues that can be handled by a single event queue, and use it only if it's less than the number of TX queues per channel. This doesn't happen in practice, but could happen if some constant values are tweaked in the future, such us EFX_MAX_TXQ_PER_CHANNEL, EFX_MAX_EVQ_SIZE or EFX_MAX_DMAQ_SIZE. Related mailing list thread: https://lore.kernel.org/bpf/20201215104327.2be76156@carbon/ Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	net: fddi: fix UAF in fza_probe	Pavel Skripkin	1	-2/+1
	fp is netdev private data and it cannot be used after free_netdev() call. Using fp after free_netdev() can cause UAF bug. Fix it by moving free_netdev() after error message. Fixes: 61414f5ec983 ("FDDI: defza: Add support for DEC FDDIcontroller 700 TURBOchannel adapter") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	net: dsa: sja1105: fix address learning getting disabled on the CPU port	Vladimir Oltean	1	-8/+6
	In May 2019 when commit 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol") was introduced, the comment that "STP does not get called for the CPU port" was true. This changed after commit 0394a63acfe2 ("net: dsa: enable and disable all ports") in August 2019 and went largely unnoticed, because the sja1105_bridge_stp_state_set() method did nothing different compared to the static setup done by sja1105_init_mac_settings(). With the ability to turn address learning off introduced by the blamed commit, there is a new priv->learn_ena port mask in the driver. When sja1105_bridge_stp_state_set() gets called and we are in BR_STATE_LEARNING or later, address learning is enabled or not depending on priv->learn_ena & BIT(port). So what happens is that priv->learn_ena is not being set from anywhere for the CPU port, and the static configuration done by sja1105_init_mac_settings() is being overwritten. To solve this, acknowledge that the static configuration of STP state is no longer necessary because the STP state is being set by the DSA core now, but what is necessary is to set priv->learn_ena for the CPU port. Fixes: 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to device") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload	Vladimir Oltean	1	-4/+5
	The point with a dev and a brport_dev is that when we have a LAG net device that is a bridge port, dev is an ocelot net device and brport_dev is the bonding/team net device. The ocelot net device beneath the LAG does not exist from the bridge's perspective, so we need to sync the switchdev objects belonging to the brport_dev and not to the dev. Fixes: e4bd44e89dcf ("net: ocelot: replay switchdev events when joining bridge") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-13	net: Use nlmsg_unicast() instead of netlink_unicast()	Yajun Deng	8	-27/+13
	It has 'if (err >0 )' statement in nlmsg_unicast(), so use nlmsg_unicast() instead of netlink_unicast(), this looks more concise. v2: remove the change in netfilter. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-12	sd: don't mess with SD_MINORS for CONFIG_DEBUG_BLOCK_EXT_DEVT	Christoph Hellwig	1	-4/+0
	No need to give up the original sd minor even with this option, and if we did we'd also need to fix the number of minors for this configuration to actually work. Fixes: 7c3f828b522b0 ("block: refactor device number setup in __device_add_disk") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-12	mm: Make copy_huge_page() always available	Matthew Wilcox (Oracle)	4	-53/+11
	Rewrite copy_huge_page() and move it into mm/util.c so it's always available. Fixes an exposure of uninitialised memory on configurations with HUGETLB and UFFD enabled and MIGRATION disabled. Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-12	mm/rmap: fix munlocking Anon THP with mlocked ptes	Hugh Dickins	1	-17/+22
	Many thanks to Kirill for reminding that PageDoubleMap cannot be relied on to warn of pte mappings in the Anon THP case; and a scan of subpages does not seem appropriate here. Note how follow_trans_huge_pmd() does not even mark an Anon THP as mlocked when compound_mapcount != 1: multiple mlocking of Anon THP is avoided, so simply return from page_mlock() in this case. Link: https://lore.kernel.org/lkml/cfa154c-d595-406-eb7d-eb9df730f944@google.com/ Fixes: d9770fcc1c0c ("mm/rmap: fix old bug: munlocking THP missed other mlocks") Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-12	octeontx2-pf: Fix uninitialized boolean variable pps	Colin Ian King	1	-1/+1
	In the case where act->id is FLOW_ACTION_POLICE and also act->police.rate_bytes_ps > 0 or act->police.rate_pkt_ps is not > 0 the boolean variable pps contains an uninitialized value when function otx2_tc_act_set_police is called. Fix this by initializing pps to false. Addresses-Coverity: ("Uninitialized scalar variable)" Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-12	ipv6: allocate enough headroom in ip6_finish_output2()	Vasily Averin	1	-0/+28
	When TEE target mirrors traffic to another interface, sk_buff may not have enough headroom to be processed correctly. ip_finish_output2() detect this situation for ipv4 and allocates new skb with enogh headroom. However ipv6 lacks this logic in ip_finish_output2 and it leads to skb_under_panic: skbuff: skb_under_panic: text:ffffffffc0866ad4 len:96 put:24 head:ffff97be85e31800 data:ffff97be85e317f8 tail:0x58 end:0xc0 dev:gre0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:110! invalid opcode: 0000 [#1] SMP PTI CPU: 2 PID: 393 Comm: kworker/2:2 Tainted: G OE 5.13.0 #13 Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.4 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:skb_panic+0x48/0x4a Call Trace: skb_push.cold.111+0x10/0x10 ipgre_header+0x24/0xf0 [ip_gre] neigh_connected_output+0xae/0xf0 ip6_finish_output2+0x1a8/0x5a0 ip6_output+0x5c/0x110 nf_dup_ipv6+0x158/0x1000 [nf_dup_ipv6] tee_tg6+0x2e/0x40 [xt_TEE] ip6t_do_table+0x294/0x470 [ip6_tables] nf_hook_slow+0x44/0xc0 nf_hook.constprop.34+0x72/0xe0 ndisc_send_skb+0x20d/0x2e0 ndisc_send_ns+0xd1/0x210 addrconf_dad_work+0x3c8/0x540 process_one_work+0x1d1/0x370 worker_thread+0x30/0x390 kthread+0x116/0x130 ret_from_fork+0x22/0x30 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-12	net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific	Randy Dunlap	6	-24/+24
	Rename module_init & module_exit functions that are named "mod_init" and "mod_exit" so that they are unique in both the System.map file and in initcall_debug output instead of showing up as almost anonymous "mod_init". This is helpful for debugging and in determining how long certain module_init calls take to execute. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: netdev@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Martin Schiller <ms@dev.tdt.de> Cc: linux-x25@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-11	Linux 5.14-rc1	Linus Torvalds	1	-2/+2

2021-07-11	mm/rmap: try_to_migrate() skip zone_device !device_private	Hugh Dickins	1	-3/+3
	I know nothing about zone_device pages and !device_private pages; but if try_to_migrate_one() will do nothing for them, then it's better that try_to_migrate() filter them first, than trawl through all their vmas. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Link: https://lore.kernel.org/lkml/1241d356-8ec9-f47b-a5ec-9b2bf66d242@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	mm/rmap: fix new bug: premature return from page_mlock_one()	Hugh Dickins	1	-6/+5
	In the unlikely race case that page_mlock_one() finds VM_LOCKED has been cleared by the time it got page table lock, page_vma_mapped_walk_done() must be called before returning, either explicitly, or by a final call to page_vma_mapped_walk() - otherwise the page table remains locked. Fixes: cd62734ca60d ("mm/rmap: split try_to_munlock from try_to_unmap") Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/lkml/20210711151446.GB4070@xsang-OptiPlex-9020/ Link: https://lore.kernel.org/lkml/f71f8523-cba7-3342-40a7-114abc5d1f51@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	mm/rmap: fix old bug: munlocking THP missed other mlocks	Hugh Dickins	1	-5/+8
	The kernel recovers in due course from missing Mlocked pages: but there was no point in calling page_mlock() (formerly known as try_to_munlock()) on a THP, because nothing got done even when it was found to be mapped in another VM_LOCKED vma. It's true that we need to be careful: Mlocked accounting of pte-mapped THPs is too difficult (so consistently avoided); but Mlocked accounting of only-pmd-mapped THPs is supposed to work, even when multiple mappings are mlocked and munlocked or munmapped. Refine the tests. There is already a VM_BUG_ON_PAGE(PageDoubleMap) in page_mlock(), so page_mlock_one() does not even have to worry about that complication. (I said the kernel recovers: but would page reclaim be likely to split THP before rediscovering that it's VM_LOCKED? I've not followed that up) Fixes: 9a73f61bdb8a ("thp, mlock: do not mlock PTE-mapped file huge pages") Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/lkml/cfa154c-d595-406-eb7d-eb9df730f944@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	mm/rmap: fix comments left over from recent changes	Hugh Dickins	2	-7/+2
	Parallel developments in mm/rmap.c have left behind some out-of-date comments: try_to_migrate_one() also accepts TTU_SYNC (already commented in try_to_migrate() itself), and try_to_migrate() returns nothing at all. TTU_SPLIT_FREEZE has just been deleted, so reword the comment about it in mm/huge_memory.c; and TTU_IGNORE_ACCESS was removed in 5.11, so delete the "recently referenced" comment from try_to_unmap_one() (once upon a time the comment was near the removed codeblock, but they drifted apart). Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Link: https://lore.kernel.org/lkml/563ce5b2-7a44-5b4d-1dfd-59a0e65932a9@google.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Yang Shi <shy828301@gmail.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-11	net: bridge: multicast: fix MRD advertisement router port marking race	Nikolay Aleksandrov	1	-0/+4
	When an MRD advertisement is received on a bridge port with multicast snooping enabled, we mark it as a router port automatically, that includes adding that port to the router port list. The multicast lock protects that list, but it is not acquired in the MRD advertisement case leading to a race condition, we need to take it to fix the race. Cc: stable@vger.kernel.org Cc: linus.luessing@c0d3.blue Fixes: 4b3087c7e37f ("bridge: Snoop Multicast Router Advertisements") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-11	net: bridge: multicast: fix PIM hello router port marking race	Nikolay Aleksandrov	1	-0/+2
	When a PIM hello packet is received on a bridge port with multicast snooping enabled, we mark it as a router port automatically, that includes adding that port the router port list. The multicast lock protects that list, but it is not acquired in the PIM message case leading to a race condition, we need to take it to fix the race. Cc: stable@vger.kernel.org Fixes: 91b02d3d133b ("bridge: mcast: add router port on PIM hello message") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-11	net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340	Marek Behún	2	-10/+36
	It seems that we cannot differentiate 88X3310 from 88X3340 by simply looking at bit 3 of revision ID. This only works on revisions A0 and A1. On revision B0, this bit is always 1. Instead use the 3.d00d register for differentiation, since this register contains information about number of ports on the device. Fixes: 9885d016ffa9 ("net: phy: marvell10g: add separate structure for 88X3340") Signed-off-by: Marek Behún <kabel@kernel.org> Reported-by: Matteo Croce <mcroce@linux.microsoft.com> Tested-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-11	dsa: fix for_each_child.cocci warnings	kernel test robot	1	-1/+3
	For_each_available_child_of_node should have of_node_put() before return around line 423. Generated by: scripts/coccinelle/iterators/for_each_child.cocci CC: Alexander Lobakin <alobakin@pm.me> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-10	mm/page_alloc: Revert pahole zero-sized workaround	Mel Gorman	2	-14/+0
	Commit dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock") folded in a workaround patch for pahole that was unable to deal with zero-sized percpu structures. A superior workaround is achieved with commit a0b8200d06ad ("kbuild: skip per-CPU BTF generation for pahole v1.18-v1.21"). This patch reverts the dummy field and the pahole version check. Fixes: dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-10	rtc: pcf8523: rename register and bit defines	Alexandre Belloni	1	-73/+73
	arch/arm/mach-ixp4xx/include/mach/platform.h now gets included indirectly and defines REG_OFFSET. Rename the register and bit definition to something specific to the driver. Fixes: 7fd70c65faac ("ARM: irqstat: Get rid of duplicated declaration") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210710211431.1393589-1-alexandre.belloni@bootlin.com
2021-07-10	virtio_net: check virtqueue_add_sgs() return value	Yunjian Wang	1	-1/+7
	As virtqueue_add_sgs() can fail, we should check the return value. Addresses-Coverity-ID: 1464439 ("Unchecked return value") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09	mptcp: properly account bulk freed memory	Paolo Abeni	4	-6/+18
	After commit 879526030c8b ("mptcp: protect the rx path with the msk socket spinlock") the rmem currently used by a given msk is really sk_rmem_alloc - rmem_released. The safety check in mptcp_data_ready() does not take the above in due account, as a result legit incoming data is kept in subflow receive queue with no reason, delaying or blocking MPTCP-level ack generation. This change addresses the issue introducing a new helper to fetch the rmem memory and using it as needed. Additionally add a MIB counter for the exceptional event described above - the peer is misbehaving. Finally, introduce the required annotation when rmem_released is updated. Fixes: 879526030c8b ("mptcp: protect the rx path with the msk socket spinlock") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/211 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09	selftests: mptcp: fix case multiple subflows limited by server	Jianguo Wu	1	-1/+1
	After patch "mptcp: fix syncookie process if mptcp can not_accept new subflow", if subflow is limited, MP_JOIN SYN is dropped, and no SYN/ACK will be replied. So in case "multiple subflows limited by server", the expected SYN/ACK number should be 1. Fixes: 00587187ad30 ("selftests: mptcp: add test cases for mptcp join tests with syn cookies") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09	mptcp: avoid processing packet if a subflow reset	Jianguo Wu	3	-12/+31
	If check_fully_established() causes a subflow reset, it should not continue to process the packet in tcp_data_queue(). Add a return value to mptcp_incoming_options(), and return false if a subflow has been reset, else return true. Then drop the packet in tcp_data_queue()/tcp_rcv_state_process() if mptcp_incoming_options() return false. Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09	mptcp: fix syncookie process if mptcp can not_accept new subflow	Jianguo Wu	1	-3/+3
	Lots of "TCP: tcp_fin: Impossible, sk->sk_state=7" in client side when doing stress testing using wrk and webfsd. There are at least two cases may trigger this warning: 1.mptcp is in syncookie, and server recv MP_JOIN SYN request, in subflow_check_req(), the mptcp_can_accept_new_subflow() return false, so subflow_init_req_cookie_join_save() isn't called, i.e. not store the data present in the MP_JOIN syn request and the random nonce in hash table - join_entries[], but still send synack. When recv 3rd-ack, mptcp_token_join_cookie_init_state() will return false, and 3rd-ack is dropped, then if mptcp conn is closed by client, client will send a DATA_FIN and a MPTCP FIN, the DATA_FIN doesn't have MP_CAPABLE or MP_JOIN, so mptcp_subflow_init_cookie_req() will return 0, and pass the cookie check, MP_JOIN request is fallback to normal TCP. Server will send a TCP FIN if closed, in client side, when process TCP FIN, it will do reset, the code path is: tcp_data_queue()->mptcp_incoming_options() ->check_fully_established()->mptcp_subflow_reset(). mptcp_subflow_reset() will set sock state to TCP_CLOSE, so tcp_fin will hit TCP_CLOSE, and print the warning. 2.mptcp is in syncookie, and server recv 3rd-ack, in mptcp_subflow_init_cookie_req(), mptcp_can_accept_new_subflow() return false, and subflow_req->mp_join is not set to 1, so in subflow_syn_recv_sock() will not reset the MP_JOIN subflow, but fallback to normal TCP, and then the same thing happens when server will send a TCP FIN if closed. For case1, subflow_check_req() return -EPERM, then tcp_conn_request() will drop MP_JOIN SYN. For case2, let subflow_syn_recv_sock() call mptcp_can_accept_new_subflow(), and do fatal fallback, send reset. Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in use") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09	mptcp: remove redundant req destruct in subflow_check_req()	Jianguo Wu	1	-5/+0
	In subflow_check_req(), if subflow sport is mismatch, will put msk, destroy token, and destruct req, then return -EPERM, which can be done by subflow_req_destructor() via: tcp_conn_request() \|--__reqsk_free() \|--subflow_req_destructor() So we should remove these redundant code, otherwise will call tcp_v4_reqsk_destructor() twice, and may double free inet_rsk(req)->ireq_opt. Fixes: 5bc56388c74f ("mptcp: add port number check for MP_JOIN") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09	mptcp: fix warning in __skb_flow_dissect() when do syn cookie for subflow join	Jianguo Wu	1	-1/+15
	I did stress test with wrk[1] and webfsd[2] with the assistance of mptcp-tools[3]: Server side: ./use_mptcp.sh webfsd -4 -R /tmp/ -p 8099 Client side: ./use_mptcp.sh wrk -c 200 -d 30 -t 4 http://192.168.174.129:8099/ and got the following warning message: [ 55.552626] TCP: request_sock_subflow: Possible SYN flooding on port 8099. Sending cookies. Check SNMP counters. [ 55.553024] ------------[ cut here ]------------ [ 55.553027] WARNING: CPU: 0 PID: 10 at net/core/flow_dissector.c:984 __skb_flow_dissect+0x280/0x1650 ... [ 55.553117] CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.12.0+ #18 [ 55.553121] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 [ 55.553124] RIP: 0010:__skb_flow_dissect+0x280/0x1650 ... [ 55.553133] RSP: 0018:ffffb79580087770 EFLAGS: 00010246 [ 55.553137] RAX: 0000000000000000 RBX: ffffffff8ddb58e0 RCX: ffffb79580087888 [ 55.553139] RDX: ffffffff8ddb58e0 RSI: ffff8f7e4652b600 RDI: 0000000000000000 [ 55.553141] RBP: ffffb79580087858 R08: 0000000000000000 R09: 0000000000000008 [ 55.553143] R10: 000000008c622965 R11: 00000000d3313a5b R12: ffff8f7e4652b600 [ 55.553146] R13: ffff8f7e465c9062 R14: 0000000000000000 R15: ffffb79580087888 [ 55.553149] FS: 0000000000000000(0000) GS:ffff8f7f75e00000(0000) knlGS:0000000000000000 [ 55.553152] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 55.553154] CR2: 00007f73d1d19000 CR3: 0000000135e10004 CR4: 00000000003706f0 [ 55.553160] Call Trace: [ 55.553166] ? __sha256_final+0x67/0xd0 [ 55.553173] ? sha256+0x7e/0xa0 [ 55.553177] __skb_get_hash+0x57/0x210 [ 55.553182] subflow_init_req_cookie_join_save+0xac/0xc0 [ 55.553189] subflow_check_req+0x474/0x550 [ 55.553195] ? ip_route_output_key_hash+0x67/0x90 [ 55.553200] ? xfrm_lookup_route+0x1d/0xa0 [ 55.553207] subflow_v4_route_req+0x8e/0xd0 [ 55.553212] tcp_conn_request+0x31e/0xab0 [ 55.553218] ? selinux_socket_sock_rcv_skb+0x116/0x210 [ 55.553224] ? tcp_rcv_state_process+0x179/0x6d0 [ 55.553229] tcp_rcv_state_process+0x179/0x6d0 [ 55.553235] tcp_v4_do_rcv+0xaf/0x220 [ 55.553239] tcp_v4_rcv+0xce4/0xd80 [ 55.553243] ? ip_route_input_rcu+0x246/0x260 [ 55.553248] ip_protocol_deliver_rcu+0x35/0x1b0 [ 55.553253] ip_local_deliver_finish+0x44/0x50 [ 55.553258] ip_local_deliver+0x6c/0x110 [ 55.553262] ? ip_rcv_finish_core.isra.19+0x5a/0x400 [ 55.553267] ip_rcv+0xd1/0xe0 ... After debugging, I found in __skb_flow_dissect(), skb->dev and skb->sk are both NULL, then net is NULL, and trigger WARN_ON_ONCE(!net), actually net is always NULL in this code path, as skb->dev is set to NULL in tcp_v4_rcv(), and skb->sk is never set. Code snippet in __skb_flow_dissect() that trigger warning: 975 if (skb) { 976 if (!net) { 977 if (skb->dev) 978 net = dev_net(skb->dev); 979 else if (skb->sk) 980 net = sock_net(skb->sk); 981 } 982 } 983 984 WARN_ON_ONCE(!net); So, using seq and transport header derived hash. [1] https://github.com/wg/wrk [2] https://github.com/ourway/webfsd [3] https://github.com/pabeni/mptcp-tools Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in use") Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-10	rtc: pcf2127: handle timestamp interrupts	Mian Yousaf Kaukab	1	-59/+133
	commit 03623b4b041c ("rtc: pcf2127: add tamper detection support") added support for timestamp interrupts. However they are not being handled in the irq handler. If a timestamp interrupt occurs it results in kernel disabling the interrupt and displaying the call trace: [ 121.145580] irq 78: nobody cared (try booting with the "irqpoll" option) ... [ 121.238087] [<00000000c4d69393>] irq_default_primary_handler threaded [<000000000a90d25b>] pcf2127_rtc_irq [rtc_pcf2127] [ 121.248971] Disabling IRQ #78 Handle timestamp interrupts in pcf2127_rtc_irq(). Save time stamp before clearing TSF1 and TSF2 flags so that it can't be overwritten. Set a flag to mark if the timestamp is valid and only report to sysfs if the flag is set. To mimic the hardware behavior, don’t save another timestamp until the first one has been read by the userspace. However, if the alarm irq is not configured, keep the old way of handling timestamp interrupt in the timestamp0 sysfs calls. Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de> Reviewed-by: Bruno Thomsen <bruno.thomsen@gmail.com> Tested-by: Bruno Thomsen <bruno.thomsen@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210629150643.31551-1-ykaukab@suse.de
2021-07-10	rtc: at91sam9: Remove unnecessary offset variable checks	Nobuhiro Iwamatsu	1	-1/+1
	The offset variable is checked by at91_rtc_readalarm(), but this check is unnecessary because the previous check knew that the value of this variable was not 0. This removes that unnecessary offset variable checks. Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210708051340.341345-1-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: s5m: Check return value of s5m_check_peding_alarm_interrupt()	Nobuhiro Iwamatsu	1	-3/+1
	s5m_check_peding_alarm_interrupt() in s5m_rtc_read_alarm() gets the return value, but doesn't use it. This modifies using the s5m_check_peding_alarm_interrupt()"s return value as the s5m_rtc_read_alarm()'s return value. Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: linux-samsung-soc@vger.kernel.org Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210708051304.341278-1-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: spear: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-4/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-11-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: tps6586x: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-14/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-9-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: tps80031: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-14/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-8-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: rtd119x: Fix format of SPDX identifier	Nobuhiro Iwamatsu	1	-2/+1
	For C files, use the C99 format (//). Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-7-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: sc27xx: Fix format of SPDX identifier	Nobuhiro Iwamatsu	1	-1/+1
	For C files, use the C99 format (//). Cc: Orson Zhai <orsonzhai@gmail.com> Cc: Baolin Wang <baolin.wang7@gmail.com> Cc: Chunyan Zhang <zhang.lyra@gmail.com> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-6-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: palmas: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-14/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-5-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: max6900: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-5/+3
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-4-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: ds1374: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-5/+2
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-3-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-10	rtc: au1xxx: convert to SPDX identifier	Nobuhiro Iwamatsu	1	-4/+1
	Use SPDX-License-Identifier instead of a verbose license text. Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210707075804.337458-2-nobuhiro1.iwamatsu@toshiba.co.jp
2021-07-09	Revert "PCI: Coalesce host bridge contiguous apertures"	Bjorn Helgaas	1	-46/+4
	This reverts commit 65db04053efea3f3e412a7e0cc599962999c96b4. Guenter reported that after 65db04053efe, the ppc:sam460ex qemu emulation no longer boots from nvme: nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0 nvme nvme0: Removing after probe failure status: -19 Link: https://lore.kernel.org/r/20210709231529.GA3270116@roeck-us.net Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-07-10	rtc: pcf85063: Update the PCF85063A datasheet revision	Fabio Estevam	1	-1/+1
	After updating the datasheet URL, the PCF85063A datasheet revision has changed. Adjust it accordingly. Reported-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org> Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210624120953.2313378-1-festevam@gmail.com
2021-07-10	dt-bindings: rtc: ti,bq32k: take maintainership	Alexandre Belloni	1	-1/+1
	Take maintainership of the binding as PAvel said he doesn't have the hardware anymore. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Pavel Machek <pavel@ucw.cz> Link: https://lore.kernel.org/r/20210620224030.1115356-1-alexandre.belloni@bootlin.com