aboutsummaryrefslogtreecommitdiffstats
path: root/scripts/bpf_doc.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-06-23mm: Clear page->private when splitting or migrating a pageMatthew Wilcox (Oracle)2-0/+2
In our efforts to remove uses of PG_private, we have found folios with the private flag clear and folio->private not-NULL. That is the root cause behind 642d51fb0775 ("ceph: check folio PG_private bit instead of folio->private"). It can also affect a few other filesystems that haven't yet reported a problem. compaction_alloc() can return a page with uninitialised page->private, and rather than checking all the callers of migrate_pages(), just zero page->private after calling get_new_page(). Similarly, the tail pages from split_huge_page() may also have an uninitialised page->private. Reported-by: Xiubo Li <xiubli@redhat.com> Tested-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-23net: openvswitch: fix parsing of nw_proto for IPv6 fragmentsRosemarie O'Riorden1-1/+1
When a packet enters the OVS datapath and does not match any existing flows installed in the kernel flow cache, the packet will be sent to userspace to be parsed, and a new flow will be created. The kernel and OVS rely on each other to parse packet fields in the same way so that packets will be handled properly. As per the design document linked below, OVS expects all later IPv6 fragments to have nw_proto=44 in the flow key, so they can be correctly matched on OpenFlow rules. OpenFlow controllers create pipelines based on this design. This behavior was changed by the commit in the Fixes tag so that nw_proto equals the next_header field of the last extension header. However, there is no counterpart for this change in OVS userspace, meaning that this field is parsed differently between OVS and the kernel. This is a problem because OVS creates actions based on what is parsed in userspace, but the kernel-provided flow key is used as a match criteria, as described in Documentation/networking/openvswitch.rst. This leads to issues such as packets incorrectly matching on a flow and thus the wrong list of actions being applied to the packet. Such changes in packet parsing cannot be implemented without breaking the userspace. The offending commit is partially reverted to restore the expected behavior. The change technically made sense and there is a good reason that it was implemented, but it does not comply with the original design of OVS. If in the future someone wants to implement such a change, then it must be user-configurable and disabled by default to preserve backwards compatibility with existing OVS versions. Cc: stable@vger.kernel.org Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags") Link: https://docs.openvswitch.org/en/latest/topics/design/#fragments Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/20220621204845.9721-1-roriorden@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-23sock: redo the psock vs ULP protection checkJakub Kicinski4-3/+12
Commit 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") has moved the inet_csk_has_ulp(sk) check from sk_psock_init() to the new tcp_bpf_update_proto() function. I'm guessing that this was done to allow creating psocks for non-inet sockets. Unfortunately the destruction path for psock includes the ULP unwind, so we need to fail the sk_psock_init() itself. Otherwise if ULP is already present we'll notice that later, and call tcp_update_ulp() with the sk_proto of the ULP itself, which will most likely result in the ULP looping its callbacks. Fixes: 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220620191353.1184629-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-23Revert "net/tls: fix tls_sk_proto_close executed repeatedly"Jakub Kicinski1-3/+0
This reverts commit 69135c572d1f84261a6de2a1268513a7e71753e2. This commit was just papering over the issue, ULP should not get ->update() called with its own sk_prot. Each ULP would need to add this check. Fixes: 69135c572d1f ("net/tls: fix tls_sk_proto_close executed repeatedly") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20220620191353.1184629-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-22virtio_net: fix xdp_rxq_info bug after suspend/resumeStephan Gerhold1-19/+6
The following sequence currently causes a driver bug warning when using virtio_net: # ip link set eth0 up # echo mem > /sys/power/state (or e.g. # rtcwake -s 10 -m mem) <resume> # ip link set eth0 down Missing register, driver bug WARNING: CPU: 0 PID: 375 at net/core/xdp.c:138 xdp_rxq_info_unreg+0x58/0x60 Call trace: xdp_rxq_info_unreg+0x58/0x60 virtnet_close+0x58/0xac __dev_close_many+0xac/0x140 __dev_change_flags+0xd8/0x210 dev_change_flags+0x24/0x64 do_setlink+0x230/0xdd0 ... This happens because virtnet_freeze() frees the receive_queue completely (including struct xdp_rxq_info) but does not call xdp_rxq_info_unreg(). Similarly, virtnet_restore() sets up the receive_queue again but does not call xdp_rxq_info_reg(). Actually, parts of virtnet_freeze_down() and virtnet_restore_up() are almost identical to virtnet_close() and virtnet_open(): only the calls to xdp_rxq_info_(un)reg() are missing. This means that we can fix this easily and avoid such problems in the future by just calling virtnet_close()/open() from the freeze/restore handlers. Aside from adding the missing xdp_rxq_info calls the only difference is that the refill work is only cancelled if netif_running(). However, this should not make any functional difference since the refill work should only be active if the network interface is actually up. Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info") Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20220621114845.3650258-1-stephan.gerhold@kernkonzept.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22igb: Make DMA faster when CPU is active on the PCIe linkKai-Heng Feng1-7/+5
Intel I210 on some Intel Alder Lake platforms can only achieve ~750Mbps Tx speed via iperf. The RR2DCDELAY shows around 0x2xxx DMA delay, which will be significantly lower when 1) ASPM is disabled or 2) SoC package c-state stays above PC3. When the RR2DCDELAY is around 0x1xxx the Tx speed can reach to ~950Mbps. According to the I210 datasheet "8.26.1 PCIe Misc. Register - PCIEMISC", "DMA Idle Indication" doesn't seem to tie to DMA coalesce anymore, so set it to 1b for "DMA is considered idle when there is no Rx or Tx AND when there are no TLPs indicating that CPU is active detected on the PCIe link (such as the host executes CSR or Configuration register read or write operation)" and performing Tx should also fall under "active CPU on PCIe link" case. In addition to that, commit b6e0c419f040 ("igb: Move DMA Coalescing init code to separate function.") seems to wrongly changed from enabling E1000_PCIEMISC_LX_DECISION to disabling it, also fix that. Fixes: b6e0c419f040 ("igb: Move DMA Coalescing init code to separate function.") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220621221056.604304-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22net: dsa: qca8k: reduce mgmt ethernet timeoutChristian Marangi1-1/+1
The current mgmt ethernet timeout is set to 100ms. This value is too big and would slow down any mdio command in case the mgmt ethernet packet have some problems on the receiving part. Reduce it to just 5ms to handle case when some operation are done on the master port that would cause the mgmt ethernet to not work temporarily. Fixes: 5950c7c0a68c ("net: dsa: qca8k: add support for mgmt read/write in Ethernet packet") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20220621151633.11741-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22net: dsa: qca8k: reset cpu port on MTU changeChristian Marangi1-1/+21
It was discovered that the Documentation lacks of a fundamental detail on how to correctly change the MAX_FRAME_SIZE of the switch. In fact if the MAX_FRAME_SIZE is changed while the cpu port is on, the switch panics and cease to send any packet. This cause the mgmt ethernet system to not receive any packet (the slow fallback still works) and makes the device not reachable. To recover from this a switch reset is required. To correctly handle this, turn off the cpu ports before changing the MAX_FRAME_SIZE and turn on again after the value is applied. Fixes: f58d2598cf70 ("net: dsa: qca8k: implement the port MTU callbacks") Cc: stable@vger.kernel.org Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20220621151122.10220-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22MAINTAINERS: Add a maintainer for OCP Time CardVadim Fedorenko1-0/+1
I've been contributing and reviewing patches for ptp_ocp driver for some time and I'm taking care of it's github mirror. On Jakub's suggestion, I would like to step forward and become a maintainer for this driver. This patch adds a dedicated entry to MAINTAINERS. Signed-off-by: Vadim Fedorenko <vadfed@fb.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/r/20220621233131.21240-1-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22hinic: Replace memcpy() with direct assignmentKees Cook1-3/+1
Under CONFIG_FORTIFY_SOURCE=y and CONFIG_UBSAN_BOUNDS=y, Clang is bugged here for calculating the size of the destination buffer (0x10 instead of 0x14). This copy is a fixed size (sizeof(struct fw_section_info_st)), with the source and dest being struct fw_section_info_st, so the memcpy should be safe, assuming the index is within bounds, which is UBSAN_BOUNDS's responsibility to figure out. Avoid the whole thing and just do a direct assignment. This results in no change to the executable code. [This is a duplicate of commit 2c0ab32b73cf ("hinic: Replace memcpy() with direct assignment") which was applied to net-next.] Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: llvm@lists.linux.dev Link: https://github.com/ClangBuiltLinux/linux/issues/1592 Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build Link: https://lore.kernel.org/r/20220616052312.292861-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-22ALSA: hda/realtek: Add quirk for Clevo NS50PUTim Crawford1-0/+1
Fixes headset detection on Clevo NS50PU. Signed-off-by: Tim Crawford <tcrawford@system76.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220622150017.9897-1-tcrawford@system76.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-22Revert "drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c"Jakub Kicinski1-1/+1
This reverts commit 8fc74d18639a2402ca52b177e990428e26ea881f. BAR0 is the main (only?) register bank for this device. We most obviously can't unmap it before the netdev is unregistered. This was pointed out in review but the patch got reposted and merged, anyway. The author of the patch was only testing it with a QEMU model, which I presume does not emulate enough for the netdev to be brought up (author's replies are not visible in lore because they kept sending their emails in HTML). Link: https://lore.kernel.org/all/20220616085059.680dc215@kernel.org/ Fixes: 8fc74d18639a ("drivers/net/ethernet/neterion/vxge: Fix a use-after-free bug in vxge-main.c") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-21net: phy: smsc: Disable Energy Detect Power-Down in interrupt modeLukas Wunner1-2/+4
Simon reports that if two LAN9514 USB adapters are directly connected without an intermediate switch, the link fails to come up and link LEDs remain dark. The issue was introduced by commit 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling"). The PHY suffers from a known erratum wherein link detection becomes unreliable if Energy Detect Power-Down is used. In poll mode, the driver works around the erratum by briefly disabling EDPD for 640 msec to detect a neighbor, then re-enabling it to save power. In interrupt mode, no interrupt is signaled if EDPD is used by both link partners, so it must not be enabled at all. We'll recoup the power savings by enabling SUSPEND1 mode on affected LAN95xx chips in a forthcoming commit. Fixes: 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling") Reported-by: Simon Han <z.han@kunbus.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/439a3f3168c2f9d44b5fd9bb8d2b551711316be6.1655714438.git.lukas@wunner.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-21ice: ethtool: Prohibit improper channel config for DCBAnatolii Gerasymenko2-5/+47
Do not allow setting less channels, than Traffic Classes there are via ethtool. There must be at least one channel per Traffic Class. If you set less channels, than Traffic Classes there are, then during ice_vsi_rebuild there would be allocated only the requested amount of tx/rx rings in ice_vsi_alloc_arrays. But later in ice_vsi_setup_q_map there would be requested at least one channel per Traffic Class. This results in setting num_rxq > alloc_rxq and num_txq > alloc_txq. Later, there would be a NULL pointer dereference in ice_vsi_map_rings_to_vectors, because we go beyond of rx_rings or tx_rings arrays. Change ice_set_channels() to return error if you try to allocate less channels, than Traffic Classes there are. Change ice_vsi_setup_q_map() and ice_vsi_setup_q_map_mqprio() to return status code instead of void. Add error handling for ice_vsi_setup_q_map() and ice_vsi_setup_q_map_mqprio() in ice_vsi_init() and ice_vsi_cfg_tc(). [53753.889983] INFO: Flow control is disabled for this traffic class (0) on this vsi. [53763.984862] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 [53763.992915] PGD 14b45f5067 P4D 0 [53763.996444] Oops: 0002 [#1] SMP NOPTI [53764.000312] CPU: 12 PID: 30661 Comm: ethtool Kdump: loaded Tainted: GOE --------- - - 4.18.0-240.el8.x86_64 #1 [53764.011825] Hardware name: Intel Corporation WilsonCity/WilsonCity, BIOS WLYDCRB1.SYS.0020.P21.2012150710 12/15/2020 [53764.022584] RIP: 0010:ice_vsi_map_rings_to_vectors+0x7e/0x120 [ice] [53764.029089] Code: 41 0d 0f b7 b7 12 05 00 00 0f b6 d0 44 29 de 44 0f b7 c6 44 01 c2 41 39 d0 7d 2d 4c 8b 47 28 44 0f b7 ce 83 c6 01 4f 8b 04 c8 <49> 89 48 28 4 c 8b 89 b8 01 00 00 4d 89 08 4c 89 81 b8 01 00 00 44 [53764.048379] RSP: 0018:ff550dd88ea47b20 EFLAGS: 00010206 [53764.053884] RAX: 0000000000000002 RBX: 0000000000000004 RCX: ff385ea42fa4a018 [53764.061301] RDX: 0000000000000006 RSI: 0000000000000005 RDI: ff385e9baeedd018 [53764.068717] RBP: 0000000000000010 R08: 0000000000000000 R09: 0000000000000004 [53764.076133] R10: 0000000000000002 R11: 0000000000000004 R12: 0000000000000000 [53764.083553] R13: 0000000000000000 R14: ff385e658fdd9000 R15: ff385e9baeedd018 [53764.090976] FS: 000014872c5b5740(0000) GS:ff385e847f100000(0000) knlGS:0000000000000000 [53764.099362] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [53764.105409] CR2: 0000000000000028 CR3: 0000000a820fa002 CR4: 0000000000761ee0 [53764.112851] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [53764.120301] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [53764.127747] PKRU: 55555554 [53764.130781] Call Trace: [53764.133564] ice_vsi_rebuild+0x611/0x870 [ice] [53764.138341] ice_vsi_recfg_qs+0x94/0x100 [ice] [53764.143116] ice_set_channels+0x1a8/0x3e0 [ice] [53764.147975] ethtool_set_channels+0x14e/0x240 [53764.152667] dev_ethtool+0xd74/0x2a10 [53764.156665] ? __mod_lruvec_state+0x44/0x110 [53764.161280] ? __mod_lruvec_state+0x44/0x110 [53764.165893] ? page_add_file_rmap+0x15/0x170 [53764.170518] ? inet_ioctl+0xd1/0x220 [53764.174445] ? netdev_run_todo+0x5e/0x290 [53764.178808] dev_ioctl+0xb5/0x550 [53764.182485] sock_do_ioctl+0xa0/0x140 [53764.186512] sock_ioctl+0x1a8/0x300 [53764.190367] ? selinux_file_ioctl+0x161/0x200 [53764.195090] do_vfs_ioctl+0xa4/0x640 [53764.199035] ksys_ioctl+0x60/0x90 [53764.202722] __x64_sys_ioctl+0x16/0x20 [53764.206845] do_syscall_64+0x5b/0x1a0 [53764.210887] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 87324e747fde ("ice: Implement ethtool ops for channels") Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21ice: ethtool: advertise 1000M speeds properlyAnatolii Gerasymenko1-1/+38
In current implementation ice_update_phy_type enables all link modes for selected speed. This approach doesn't work for 1000M speeds, because both copper (1000baseT) and optical (1000baseX) standards cannot be enabled at once. Fix this, by adding the function `ice_set_phy_type_from_speed()` for 1000M speeds. Fixes: 48cb27f2fd18 ("ice: Implement handlers for ethtool PHY/link operations") Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21mips: lantiq: Add missing of_node_put() in irq.cLiang He1-0/+1
In icu_of_init(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21ice: Fix switchdev rules book keepingWojciech Drewek1-0/+1
Adding two filters with same matching criteria ends up with one rule in hardware with act = ICE_FWD_TO_VSI_LIST. In order to remove them properly we have to keep the information about vsi handle which is used in VSI bitmap (ice_adv_fltr_mgmt_list_entry::vsi_list_info::vsi_map). Fixes: 0d08a441fb1a ("ice: ndo_setup_tc implementation for PF") Reported-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21PM: hibernate: Use kernel_can_power_off()Dmitry Osipenko1-1/+1
Use new kernel_can_power_off() API instead of legacy pm_power_off global variable to fix regressed hibernation to disk where machine no longer powers off when it should because ACPI power driver transitioned to the new sys-off based API and it doesn't use pm_power_off anymore. Fixes: 98f30d0ecf79 ("ACPI: power: Switch to sys-off handler API") Tested-by: Ken Moffat <zarniwhoop@ntlworld.com> Reported-by: Ken Moffat <zarniwhhop@ntlworld.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-21ice: ignore protocol field in GTP offloadMarcin Szycik1-1/+3
Commit 34a897758efe ("ice: Add support for inner etype in switchdev") added the ability to match on inner ethertype. A side effect of that change is that it is now impossible to add some filters for protocols which do not contain inner ethtype field. tc requires the protocol field to be specified when providing certain other options, e.g. src_ip. This is a problem in case of GTP - when user wants to specify e.g. src_ip, they also need to specify protocol in tc command (otherwise tc fails with: Illegal "src_ip"). Because GTP is a tunnel, the protocol field is treated as inner protocol. GTP does not contain inner ethtype field and the filter cannot be added. To fix this, ignore the ethertype field in case of GTP filters. Fixes: 9a225f81f540 ("ice: Support GTP-U and GTP-C offload in switchdev") Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-21afs: Fix dynamic root getattrDavid Howells1-1/+2
The recent patch to make afs_getattr consult the server didn't account for the pseudo-inodes employed by the dynamic root-type afs superblock not having a volume or a server to access, and thus an oops occurs if such a directory is stat'd. Fix this by checking to see if the vnode->volume pointer actually points anywhere before following it in afs_getattr(). This can be tested by stat'ing a directory in /afs. It may be sufficient just to do "ls /afs" and the oops looks something like: BUG: kernel NULL pointer dereference, address: 0000000000000020 ... RIP: 0010:afs_getattr+0x8b/0x14b ... Call Trace: <TASK> vfs_statx+0x79/0xf5 vfs_fstatat+0x49/0x62 Fixes: 2aeb8c86d499 ("afs: Fix afs_getattr() to refetch file status if callback break occurred") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/165408450783.1031787.7941404776393751186.stgit@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-21efi/x86: libstub: Fix typo in __efi64_argmap* nameEvgeniy Baskov1-1/+1
The actual name of the DXE services function used is set_memory_space_attributes(), not set_memory_space_descriptor(). Change EFI mixed mode helper macro name to match the function name. Fixes: 31f1a0edff78 ("efi/x86: libstub: Make DXE calls mixed mode safe") Signed-off-by: Evgeniy Baskov <baskov@ispras.ru> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-21efi: sysfb_efi: remove unnecessary <asm/efi.h> includeJavier Martinez Canillas1-2/+0
Nothing defined in the header is used by drivers/firmware/efi/sysfb_efi.c but also, including it can lead to build errors when built on arches that don't have an asm/efi.h header file. This can happen for example if a driver that is built when COMPILE_TEST is enabled selects the SYSFB symbol, e.g. on powerpc with allyesconfig: drivers/firmware/efi/sysfb_efi.c:29:10: fatal error: asm/efi.h: No such file or directory 29 | #include <asm/efi.h> | ^~~~~~~~~~~ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-21mips: dts: ingenic: Add TCU clock to x1000/x1830 tcu device nodeAidan MacDonald2-4/+6
This clock is a gate for the TCU hardware block on these SoCs, but it wasn't included in the device tree since the ingenic-tcu driver erroneously did not request it. Reviewed-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21certs: Add FIPS selftestsDavid Howells5-1/+251
Add some selftests for signature checking when FIPS mode is enabled. These need to be done before we start actually using the signature checking for things and must panic the kernel upon failure. Note that the tests must not check the blacklist lest this provide a way to prevent a kernel from booting by installing a hash of a test key in the appropriate UEFI table. Reported-by: Simo Sorce <simo@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Reviewed-by: Herbert Xu <herbert@gondor.apana.org.au> cc: keyrings@vger.kernel.org cc: linux-crypto@vger.kernel.org Link: https://lore.kernel.org/r/165515742832.1554877.2073456606206090838.stgit@warthog.procyon.org.uk/
2022-06-21certs: Move load_certificate_list() to be with the asymmetric keys codeDavid Howells7-22/+17
Move load_certificate_list(), which loads a series of binary X.509 certificates from a blob and inserts them as keys into a keyring, to be with the asymmetric keys code that it drives. This makes it easier to add FIPS selftest code in which we need to load up a private keyring for the tests to use. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Simo Sorce <simo@redhat.com> Reviewed-by: Herbert Xu <herbert@gondor.apana.org.au> cc: keyrings@vger.kernel.org cc: linux-crypto@vger.kernel.org Link: https://lore.kernel.org/r/165515742145.1554877.13488098107542537203.stgit@warthog.procyon.org.uk/
2022-06-21mips/pic32/pic32mzda: Fix refcount leak bugsLiang He2-1/+9
of_find_matching_node(), of_find_compatible_node() and of_find_node_by_path() will return node pointers with refcout incremented. We should call of_node_put() when they are not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21mips: lantiq: xway: Fix refcount leak bug in sysctrlLiang He1-0/+4
In ltq_soc_init(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21mips: lantiq: falcon: Fix refcount leak bug in sysctrlLiang He1-0/+6
In ltq_soc_init(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21mips: ralink: Fix refcount leak in of.cLiang He1-0/+2
In plat_of_remap_node(), plat_of_remap_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21mips: mti-malta: Fix refcount leak in malta-time.cLiang He1-0/+2
In update_gic_frequency_dt(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_node_put() when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21arch: mips: generic: Add missing of_node_put() in board-ranchu.cLiang He1-0/+1
In ranchu_measure_hpt_freq(), of_find_compatible_node() will return a node pointer with refcount incremented. We should use of_put_node() when it is not used anymore. Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21MIPS: Remove repetitive increase irq_err_counthuhai2-4/+0
commit 979934da9e7a ("[PATCH] mips: update IRQ handling for vr41xx") added a function irq_dispatch, and it'll increase irq_err_count when the get_irq callback returns a negative value, but increase irq_err_count in get_irq was not removed. And also, modpost complains once gpio-vr41xx drivers become modules. ERROR: modpost: "irq_err_count" [drivers/gpio/gpio-vr41xx.ko] undefined! So it would be a good idea to remove repetitive increase irq_err_count in get_irq callback. Fixes: 27fdd325dace ("MIPS: Update VR41xx GPIO driver to use gpiolib") Fixes: 979934da9e7a ("[PATCH] mips: update IRQ handling for vr41xx") Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: huhai <huhai@kylinos.cn> Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-06-21ALSA: hda: Fix discovery of i915 graphics PCI deviceTakashi Iwai1-9/+6
It's been reported that the recent fix for skipping the component-binding with D-GPU caused a regression on some systems; it resulted in the completely missing component binding with i915 GPU. The problem was the use of pci_get_class() function. It matches with the full PCI class bits, while we want to match only partially the PCI base class bits. So, when a system has an i915 graphics device with the PCI class 0380, it won't hit because we're looking for only the PCI class 0300. This patch fixes i915_gfx_present() to look up each PCI device and match with PCI base class explicitly instead of pci_get_class(). Fixes: c9db8a30d9f0 ("ALSA: hda/i915 - skip acomp init if no matching display") Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Tested-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Cc: <stable@vger.kernel.org> Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1200611 Link: https://lore.kernel.org/r/87bkunztec.wl-tiwai@suse.de Link: https://lore.kernel.org/r/20220621120044.11573-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-21netfilter: nf_dup_netdev: add and use recursion counterFlorian Westphal1-4/+15
Now that the egress function can be called from egress hook, we need to avoid recursive calls into the nf_tables traverser, else crash. Fixes: f87b9464d152 ("netfilter: nft_fwd_netdev: Support egress hook") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-21netfilter: nf_dup_netdev: do not push mac header a second timeFlorian Westphal1-4/+10
Eric reports skb_under_panic when using dup/fwd via bond+egress hook. Before pushing mac header, we should make sure that we're called from ingress to put back what was pulled earlier. In egress case, the MAC header is already there; we should leave skb alone. While at it be more careful here: skb might have been altered and headroom reduced, so add a skb_cow() before so that headroom is increased if necessary. nf_do_netdev_egress() assumes skb ownership (it normally ends with a call to dev_queue_xmit), so we must free the packet on error. Fixes: f87b9464d152 ("netfilter: nft_fwd_netdev: Support egress hook") Reported-by: Eric Garver <eric@garver.life> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-21selftests: netfilter: correct PKTGEN_SCRIPT_PATHS in nft_concat_range.shJie2x Zhou1-1/+1
Before change: make -C netfilter TEST: performance net,port [SKIP] perf not supported port,net [SKIP] perf not supported net6,port [SKIP] perf not supported port,proto [SKIP] perf not supported net6,port,mac [SKIP] perf not supported net6,port,mac,proto [SKIP] perf not supported net,mac [SKIP] perf not supported After change: net,mac [ OK ] baseline (drop from netdev hook): 2061098pps baseline hash (non-ranged entries): 1606741pps baseline rbtree (match on first field only): 1191607pps set with 1000 full, ranged entries: 1639119pps ok 8 selftests: netfilter: nft_concat_range.sh Fixes: 611973c1e06f ("selftests: netfilter: Introduce tests for sets with range concatenation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jie2x Zhou <jie2x.zhou@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-20filemap: Handle sibling entries in filemap_get_read_batch()Matthew Wilcox (Oracle)1-0/+2
If a read races with an invalidation followed by another read, it is possible for a folio to be replaced with a higher-order folio. If that happens, we'll see a sibling entry for the new folio in the next iteration of the loop. This manifests as a NULL pointer dereference while holding the RCU read lock. Handle this by simply returning. The next call will find the new folio and handle it correctly. The other ways of handling this rare race are more complex and it's just not worth it. Reported-by: Dave Chinner <david@fromorbit.com> Reported-by: Brian Foster <bfoster@redhat.com> Debugged-by: Brian Foster <bfoster@redhat.com> Tested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20filemap: Correct the conditions for marking a folio as accessedMatthew Wilcox (Oracle)1-3/+10
We had an off-by-one error which meant that we never marked the first page in a read as accessed. This was visible as a slowdown when re-reading a file as pages were being evicted from cache too soon. In reviewing this code, we noticed a second bug where a multi-page folio would be marked as accessed multiple times when doing reads that were less than the size of the folio. Abstract the comparison of whether two file positions are in the same folio into a new function, fixing both of these bugs. Reported-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20udmabuf: add back sanity checkGerd Hoffmann1-1/+4
Check vm_fault->pgoff before using it. When we removed the warning, we also removed the check. Fixes: 7b26e4e2119d ("udmabuf: drop WARN_ON() check.") Reported-by: zdi-disclosures@trendmicro.com Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-20ALSA: hda/via: Fix missing beep setupTakashi Iwai1-2/+2
Like the previous fix for Conexant codec, the beep_nid has to be set up before calling snd_hda_gen_parse_auto_config(); otherwise it'd miss the path setup. Fix the call order for addressing the missing beep setup. Fixes: 0e8f9862493a ("ALSA: hda/via - Simplify control management") Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216152 Link: https://lore.kernel.org/r/20220620104008.1994-2-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-20ALSA: hda/conexant: Fix missing beep setupTakashi Iwai1-2/+2
Currently the Conexant codec driver sets up the beep NID after calling snd_hda_gen_parse_auto_config(). It turned out that this results in the insufficient setup for the beep control, as the generic parser handles the fake path in snd_hda_gen_parse_auto_config() only if the beep_nid is set up beforehand. For dealing with the beep widget properly, call cx_auto_parse_beep() before snd_hda_gen_parse_auto_config() call. Fixes: 51e19ca5f755 ("ALSA: hda/conexant - Clean up beep code") Cc: <stable@vger.kernel.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216152 Link: https://lore.kernel.org/r/20220620104008.1994-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-20random: update comment from copy_to_user() -> copy_to_iter()Jason A. Donenfeld1-1/+1
This comment wasn't updated when we moved from read() to read_iter(), so this patch makes the trivial fix. Fixes: 1b388e7765f2 ("random: convert to using fops->read_iter()") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-20net/tls: fix tls_sk_proto_close executed repeatedlyZiyang Xuan1-0/+3
After setting the sock ktls, update ctx->sk_proto to sock->sk_prot by tls_update(), so now ctx->sk_proto->close is tls_sk_proto_close(). When close the sock, tls_sk_proto_close() is called for sock->sk_prot->close is tls_sk_proto_close(). But ctx->sk_proto->close() will be executed later in tls_sk_proto_close(). Thus tls_sk_proto_close() executed repeatedly occurred. That will trigger the following bug. ================================================================= KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] RIP: 0010:tls_sk_proto_close+0xd8/0xaf0 net/tls/tls_main.c:306 Call Trace: <TASK> tls_sk_proto_close+0x356/0xaf0 net/tls/tls_main.c:329 inet_release+0x12e/0x280 net/ipv4/af_inet.c:428 __sock_release+0xcd/0x280 net/socket.c:650 sock_close+0x18/0x20 net/socket.c:1365 Updating a proto which is same with sock->sk_prot is incorrect. Add proto and sock->sk_prot equality check at the head of tls_update() to fix it. Fixes: 95fa145479fb ("bpf: sockmap/tls, close can race with map free") Reported-by: syzbot+29c3c12f3214b85ad081@syzkaller.appspotmail.com Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20erspan: do not assume transport header is always setEric Dumazet2-10/+20
Rewrite tests in ip6erspan_tunnel_xmit() and erspan_fb_xmit() to not assume transport header is set. syzbot reported: WARNING: CPU: 0 PID: 1350 at include/linux/skbuff.h:2911 skb_transport_header include/linux/skbuff.h:2911 [inline] WARNING: CPU: 0 PID: 1350 at include/linux/skbuff.h:2911 ip6erspan_tunnel_xmit+0x15af/0x2eb0 net/ipv6/ip6_gre.c:963 Modules linked in: CPU: 0 PID: 1350 Comm: aoe_tx0 Not tainted 5.19.0-rc2-syzkaller-00160-g274295c6e53f #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 RIP: 0010:skb_transport_header include/linux/skbuff.h:2911 [inline] RIP: 0010:ip6erspan_tunnel_xmit+0x15af/0x2eb0 net/ipv6/ip6_gre.c:963 Code: 0f 47 f0 40 88 b5 7f fe ff ff e8 8c 16 4b f9 89 de bf ff ff ff ff e8 a0 12 4b f9 66 83 fb ff 0f 85 1d f1 ff ff e8 71 16 4b f9 <0f> 0b e9 43 f0 ff ff e8 65 16 4b f9 48 8d 85 30 ff ff ff ba 60 00 RSP: 0018:ffffc90005daf910 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 000000000000ffff RCX: 0000000000000000 RDX: ffff88801f032100 RSI: ffffffff882e8d3f RDI: 0000000000000003 RBP: ffffc90005dafab8 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000000 R12: ffff888024f21d40 R13: 000000000000a288 R14: 00000000000000b0 R15: ffff888025a2e000 FS: 0000000000000000(0000) GS:ffff88802c800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2e425000 CR3: 000000006d099000 CR4: 0000000000152ef0 Call Trace: <TASK> __netdev_start_xmit include/linux/netdevice.h:4805 [inline] netdev_start_xmit include/linux/netdevice.h:4819 [inline] xmit_one net/core/dev.c:3588 [inline] dev_hard_start_xmit+0x188/0x880 net/core/dev.c:3604 sch_direct_xmit+0x19f/0xbe0 net/sched/sch_generic.c:342 __dev_xmit_skb net/core/dev.c:3815 [inline] __dev_queue_xmit+0x14a1/0x3900 net/core/dev.c:4219 dev_queue_xmit include/linux/netdevice.h:2994 [inline] tx+0x6a/0xc0 drivers/block/aoe/aoenet.c:63 kthread+0x1e7/0x3b0 drivers/block/aoe/aoecmd.c:1229 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302 </TASK> Fixes: d5db21a3e697 ("erspan: auto detect truncated ipv6 packets.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20ipv4: fix bind address validity regression testsRiccardo Paolo Bestetti1-9/+27
Commit 8ff978b8b222 ("ipv4/raw: support binding to nonlocal addresses") introduces support for binding to nonlocal addresses, as well as some basic test coverage for some of the related cases. Commit b4a028c4d031 ("ipv4: ping: fix bind address validity check") fixes a regression which incorrectly removed some checks for bind address validation. In addition, it introduces regression tests for those specific checks. However, those regression tests are defective, in that they perform the tests using an incorrect combination of bind flags. As a result, those tests fail when they should succeed. This commit introduces additional regression tests for nonlocal binding and fixes the defective regression tests. It also introduces new set_sysctl calls for the ipv4_bind test group, as to perform the ICMP binding tests it is necessary to allow ICMP socket creation by setting the net.ipv4.ping_group_range knob. Fixes: b4a028c4d031 ("ipv4: ping: fix bind address validity check") Reported-by: Riccardo Paolo Bestetti <pbl@bestov.io> Signed-off-by: Riccardo Paolo Bestetti <pbl@bestov.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-20ALSA: memalloc: Drop x86-specific hack for WC allocationsTakashi Iwai1-22/+1
The recent report for a crash on Haswell machines implied that the x86-specific (rather hackish) implementation for write-cache memory buffer allocation in ALSA core is buggy with the recent kernel in some corner cases. This patch drops the x86-specific implementation and uses the standard dma_alloc_wc() & co generically for avoiding the bug and also for simplification. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216112 Cc: <stable@vger.kernel.org> # v5.18+ Link: https://lore.kernel.org/r/20220620073440.7514-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-06-19random: quiet urandom warning ratelimit suppression messageJason A. Donenfeld2-5/+9
random.c ratelimits how much it warns about uninitialized urandom reads using __ratelimit(). When the RNG is finally initialized, it prints the number of missed messages due to ratelimiting. It has been this way since that functionality was introduced back in 2018. Recently, cc1e127bfa95 ("random: remove ratelimiting for in-kernel unseeded randomness") put a bit more stress on the urandom ratelimiting, which teased out a bug in the implementation. Specifically, when under pressure, __ratelimit() will print its own message and reset the count back to 0, making the final message at the end less useful. Secondly, it does so as a pr_warn(), which apparently is undesirable for people's CI. Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly for this purpose, so we set the flag. Fixes: 4e00b339e264 ("random: rate limit unseeded randomness warnings") Cc: stable@vger.kernel.org Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Ron Economos <re@w6rz.net> Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-19random: schedule mix_interrupt_randomness() less oftenJason A. Donenfeld1-1/+1
It used to be that mix_interrupt_randomness() would credit 1 bit each time it ran, and so add_interrupt_randomness() would schedule mix() to run every 64 interrupts, a fairly arbitrary number, but nonetheless considered to be a decent enough conservative estimate. Since e3e33fc2ea7f ("random: do not use input pool from hard IRQs"), mix() is now able to credit multiple bits, depending on the number of calls to add(). This was done for reasons separate from this commit, but it has the nice side effect of enabling this patch to schedule mix() less often. Currently the rules are: a) Credit 1 bit for every 64 calls to add(). b) Schedule mix() once a second that add() is called. c) Schedule mix() once every 64 calls to add(). Rules (a) and (c) no longer need to be coupled. It's still important to have _some_ value in (c), so that we don't "over-saturate" the fast pool, but the once per second we get from rule (b) is a plenty enough baseline. So, by increasing the 64 in rule (c) to something larger, we avoid calling queue_work_on() as frequently during irq storms. This commit changes that 64 in rule (c) to be 1024, which means we schedule mix() 16 times less often. And it does *not* need to change the 64 in rule (a). Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Cc: stable@vger.kernel.org Cc: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-19Linux 5.19-rc3Linus Torvalds1-1/+1
2022-06-19tools headers UAPI: Sync linux/prctl.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+9
To pick the changes in: 9e4ab6c891094720 ("arm64/sme: Implement vector length configuration prctl()s") That don't result in any changes in tooling: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after $ Just silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Link: http://lore.kernel.org/lkml/Yq81we+XFOqlBWyu@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>