|
bpf_iter_attach_map() acquires a map uref, and the uref may be released
before or in the middle of iterating map elements. For example, the uref
could be released in bpf_iter_detach_map() as part of
bpf_link_release(), or could be released in bpf_map_put_with_uref() as
part of bpf_map_release().
An alternative fix is to acquire an extra bpf_link reference, just like
a pinned map iterator does, but that introduces an unnecessary dependency
on bpf_link instead of bpf_map.
So choose another fix: acquire an extra map uref in .init_seq_private
for the array map iterator.
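A minimal sketch of that approach (abridged; the callback and field names
follow the existing arraymap iterator code but are shown here as an
illustration, not the exact patch):

  /* Take an extra map uref when the iterator's seq_file private data is
   * set up, and drop it on teardown, so the map cannot go away while
   * elements are still being iterated.
   */
  static int bpf_iter_init_array_map(void *priv_data,
                                     struct bpf_iter_aux_info *aux)
  {
      struct bpf_iter_seq_array_map_info *seq_info = priv_data;

      /* ... existing per-map setup elided ... */
      bpf_map_inc_with_uref(aux->map);    /* extra uref held by the iterator */
      seq_info->map = aux->map;
      return 0;
  }

  static void bpf_iter_fini_array_map(void *priv_data)
  {
      struct bpf_iter_seq_array_map_info *seq_info = priv_data;

      bpf_map_put_with_uref(seq_info->map);    /* release the extra uref */
  }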
Fixes: d3cc2ab546ad ("bpf: Implement bpf iterator for array maps")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220810080538.1845898-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The verifier cannot perform sufficient validation of the bpf_attr->test.ctx_in
pointer, therefore bpf programs should not be allowed to call the BPF_PROG_RUN
command from within the program.
To fix this issue, split a normal kern_sys_bpf() kernel function out of the
bpf_sys_bpf() helper; kern_sys_bpf() can only be used by the kernel light
skeleton directly.
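A rough sketch of the resulting split, assuming both entry points funnel
into __sys_bpf() with a kernel bpfptr (simplified, not the exact upstream
code; the allowlisted commands are illustrative):

  /* Kernel-only entry point, called directly by the light skeleton
   * loader; never reachable from a BPF program.
   */
  int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
  {
      return __sys_bpf(cmd, KERNEL_BPFPTR(attr), size);
  }

  /* The helper visible to BPF programs rejects commands such as
   * BPF_PROG_RUN whose pointer arguments it cannot validate.
   */
  BPF_CALL_3(bpf_sys_bpf, int, cmd, union bpf_attr *, attr, u32, attr_size)
  {
      switch (cmd) {
      case BPF_BTF_LOAD:
      case BPF_MAP_CREATE:
      case BPF_PROG_LOAD:
          return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size);
      default:
          return -EINVAL;
      }
  }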
Reported-by: YiFei Zhu <zhuyifei@google.com>
Fixes: b1d18a7574d0 ("bpf: Extend sys_bpf commands for bpf_syscall programs.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The sparse tool complains as follows:
arch/arm64/net/bpf_jit_comp.c:1684:16:
warning: incorrect type in assignment (different base types)
arch/arm64/net/bpf_jit_comp.c:1684:16:
expected unsigned int [usertype] *branch
arch/arm64/net/bpf_jit_comp.c:1684:16:
got restricted __le32 [usertype] *
arch/arm64/net/bpf_jit_comp.c:1700:52:
error: subtraction of different types can't work (different base
types)
arch/arm64/net/bpf_jit_comp.c:1734:29:
warning: incorrect type in assignment (different base types)
arch/arm64/net/bpf_jit_comp.c:1734:29:
expected unsigned int [usertype] *
arch/arm64/net/bpf_jit_comp.c:1734:29:
got restricted __le32 [usertype] *
arch/arm64/net/bpf_jit_comp.c:1918:52:
error: subtraction of different types can't work (different base
types)
This is because the variable 'branch' in function invoke_bpf_prog and the
variable 'branches' in function prepare_trampoline are defined as type
u32 *, which conflicts with ctx->image's type __le32 *, so sparse complains
when an assignment or arithmetic operation is performed on these variables
and ctx->image.
Since arm64 instructions are always little-endian, change the type of
these two variables to __le32 * and call cpu_to_le32() to convert
instruction to little-endian before writing it to memory. This is also
in line with emit() which internally does cpu_to_le32(), too.
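An illustrative before/after of the type change (surrounding trampoline
code elided):

  /* before */
  u32 *branch;
  ...
  branch = ctx->image + ctx->idx;
  ...
  *branch = A64_CBZ(1, A64_R(10), ctx->idx - (branch - ctx->image));

  /* after: match ctx->image's __le32 * type and store the instruction
   * little-endian, as emit() already does internally
   */
  __le32 *branch;
  ...
  branch = ctx->image + ctx->idx;
  ...
  *branch = cpu_to_le32(A64_CBZ(1, A64_R(10),
                                ctx->idx - (branch - ctx->image)));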
Fixes: efc9909fdce0 ("bpf, arm64: Add bpf trampoline for arm64")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/bpf/20220808040735.1232002-1-xukuohai@huawei.com
|
|
Add a regression test to check against invalid check_and_init_map_value
call inside prealloc_lru_pop.
The kptr, once set after deleting the map element, should not be reset to
NULL. Hence, we trigger a program that updates the element, causing its
reuse, and checks whether the unref kptr is reset or not. If it is,
prealloc_lru_pop does an incorrect check_and_init_map_value call and the
test fails.
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220809213033.24147-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The LRU map that is preallocated may have its elements reused while
another program holds a pointer to it from bpf_map_lookup_elem. Hence,
only check_and_free_fields is appropriate when the element is being
deleted, as it ensures proper synchronization against concurrent access
of the map value. After that, we cannot call check_and_init_map_value
again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while
they can be concurrently accessed from a BPF program.
This is safe to do as when the map entry is deleted, concurrent access
is protected against by check_and_free_fields, i.e. an existing timer
would be freed, and any existing kptr will be released by it. The
program can create further timers and kptrs after check_and_free_fields,
but they will eventually be released once the preallocated items are
freed on map destruction, even if the item is never reused again. Hence,
the deleted item sitting in the free list can still have resources
attached to it, and they would never leak.
With spin_lock, we never touch the field at all on delete or update, as
we may end up modifying the state of the lock. Since the verifier
ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock
call, the program will eventually release the lock so that on reuse the
new user of the value can take the lock.
Essentially, for the preallocated case, we must assume that the map
value may always be in use by the program, even when it is sitting in
the freelist, and handle things accordingly, i.e. use proper
synchronization inside check_and_free_fields, and never reinitialize the
special fields when it is reused on update.
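A simplified sketch of what this means for element reuse in the
preallocated LRU path (abridged from the hashtab code, not the exact
patch):

  static struct htab_elem *prealloc_lru_pop(struct bpf_htab *htab,
                                            void *key, u32 hash)
  {
      struct bpf_lru_node *node = bpf_lru_pop_free(&htab->lru, hash);
      struct htab_elem *l;

      if (!node)
          return NULL;
      l = container_of(node, struct htab_elem, lru_node);
      memcpy(l->key, key, htab->map.key_size);
      /* No check_and_init_map_value() here: the spin_lock, timer and
       * kptr fields may still be in use by a program holding a pointer
       * from an earlier lookup; they were handled at delete time by
       * check_and_free_fields().
       */
      return l;
  }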
Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/20220809213033.24147-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In addition to TC hook, enable these in tracing programs so that they
can be used in selftests.
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220809213033.24147-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
It is not necessary to allocate contiguous physical memory for the BPF
program buffer using kcalloc. When the BPF program is larger than the
memory page size, kcalloc allocates multiple memory pages from the buddy
system. If the device cannot provide sufficient memory, for example on
low-end Android devices [0], the memory allocation for the BPF program is
likely to fail.
Test cases in lib/test_bpf.c all pass on ARM64 QEMU.
[0]
AndroidTestSuit: page allocation failure: order:4,
mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=foreground,mems_allowed=0
Call trace:
dump_stack+0xa4/0x114
warn_alloc+0xf8/0x14c
__alloc_pages_slowpath+0xac8/0xb14
__alloc_pages_nodemask+0x194/0x3d0
kmalloc_order_trace+0x44/0x1e8
__kmalloc+0x29c/0x66c
bpf_int_jit_compile+0x17c/0x568
bpf_prog_select_runtime+0x4c/0x1b0
bpf_prepare_filter+0x5fc/0x6bc
bpf_prog_create_from_user+0x118/0x1c0
seccomp_set_mode_filter+0x1c4/0x7cc
__do_sys_prctl+0x380/0x1424
__arm64_sys_prctl+0x20/0x2c
el0_svc_common+0xc8/0x22c
el0_svc_handler+0x1c/0x28
el0_svc+0x8/0x100
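A hedged sketch of the allocation change this implies in the arm64 JIT
(the buffer is assumed to be the per-instruction offset array; labels and
surrounding code are abridged):

  /* was: ctx.offset = kcalloc(prog->len + 1, sizeof(int), GFP_KERNEL); */
  ctx.offset = kvcalloc(prog->len + 1, sizeof(int), GFP_KERNEL);
  if (!ctx.offset)
      goto out;        /* fall back to the interpreter */
  ...
  kvfree(ctx.offset);  /* was: kfree(ctx.offset) */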
Signed-off-by: Aijun Sun <aijun.sun@unisoc.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220804025442.22524-1-aijun.sun@unisoc.com
|
|
Apparently, no existing selftest covers it. Add a new one where
we load cgroup/bind4 program and attach fentry to it. Calling
bpf_obj_get_info_by_fd on the fentry program should return non-zero
btf_id/btf_obj_id instead of crashing the kernel.
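A short sketch of the check the selftest performs (scaffolding assumed
from the test_progs framework; fentry_fd is the fd of the fentry program
attached to the cgroup/bind4 program, and the attach_btf_* info fields are
assumed from the current uapi):

  struct bpf_prog_info info = {};
  __u32 info_len = sizeof(info);
  int err;

  err = bpf_obj_get_info_by_fd(fentry_fd, &info, &info_len);
  if (!ASSERT_OK(err, "get_info"))
      return;
  ASSERT_NEQ(info.attach_btf_id, 0, "attach_btf_id");
  ASSERT_NEQ(info.attach_btf_obj_id, 0, "attach_btf_obj_id");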
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220804201140.1340684-2-sdf@google.com
|
|
When attaching to a program, the program itself might not be attached
to anything (and, hence, might not have an attach_btf), so we can't
unconditionally use 'prog->aux->dst_prog->aux->attach_btf'.
Instead, use bpf_prog_get_target_btf to pick proper target BTF:
* when attached to dst_prog, use dst_prog->aux->btf
* when attached to kernel btf, use prog->aux->attach_btf
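For reference, the selection logic roughly matches this helper shape
(sketch, simplified):

  static struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog)
  {
      /* attached to another BPF program: use that program's BTF */
      if (prog->aux->dst_prog)
          return prog->aux->dst_prog->aux->btf;
      /* attached to kernel (vmlinux/module) BTF */
      return prog->aux->attach_btf;
  }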
Fixes: b79c9fc9551b ("bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220804201140.1340684-1-sdf@google.com
|
|
The btf_sock_ids array needs struct mptcp_sock BTF ID for the
bpf_skc_to_mptcp_sock helper.
When CONFIG_MPTCP is disabled, the 'struct mptcp_sock' is not
defined and resolve_btfids will complain with:
[...]
BTFIDS vmlinux
WARN: resolve_btfids: unresolved symbol mptcp_sock
[...]
Add an empty definition for struct mptcp_sock when CONFIG_MPTCP
is disabled.
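A minimal sketch of the stub (exact header placement is an assumption):

  #if IS_ENABLED(CONFIG_MPTCP)
  /* full definition provided by MPTCP */
  #else
  struct mptcp_sock { };    /* empty stub so the BTF ID resolves */
  #endif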
Fixes: 3bc253c2e652 ("bpf: Add bpf_skc_to_mptcp_sock_proto")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220802163324.1873044-1-jolsa@kernel.org
|
|
We need to release the possible hash from the trampoline fops object
before removing it; otherwise we leak it.
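A sketch of the teardown order implied above (the tr->fops field name is
taken from the IPMODIFY series; the call site is abridged):

  if (tr->fops) {
      ftrace_free_filter(tr->fops);    /* release the filter hash */
      kfree(tr->fops);
  }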
Fixes: 00963a2e75a8 ("bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20220802135651.1794015-1-jolsa@kernel.org
|
|
The bpf_sys_bpf() helper function allows an eBPF program to load another
eBPF program from within the kernel. In this case the argument union
bpf_attr pointer (as well as the insns and license pointers inside) is a
kernel address instead of a userspace address (which is the case of a
usual bpf() syscall). To make the memory copying process in the syscall
work in both cases, bpfptr_t was introduced to wrap around the pointer
and distinguish its origin. Specifically, when copying memory contents
from a bpfptr_t, a copy_from_user() is performed in case of a userspace
address and a memcpy() is performed for a kernel address.
This can lead to problems because the in-kernel pointer is never checked
for validity. The problem happens when an eBPF syscall program tries to
call bpf_sys_bpf() to load a program but provides a bad insns pointer --
say 0xdeadbeef -- in the bpf_attr union. The helper calls __sys_bpf()
which would then call bpf_prog_load() to load the program.
bpf_prog_load() is responsible for copying the eBPF instructions to the
newly allocated memory for the program; it creates a kernel bpfptr_t for
insns and invokes copy_from_bpfptr(). Internally, all bpfptr_t
operations are backed by the corresponding sockptr_t operations, which
perform a direct memcpy() on kernel pointers for copy_from/strncpy_from
operations. Therefore, the code is always happy to dereference the bad
pointer, triggering an unhandleable page fault and, in turn, an oops.
However, this is not supposed to happen because at that point the eBPF
program is already verified and should not cause a memory error.
Sample KASAN trace:
[ 25.685056][ T228] ==================================================================
[ 25.685680][ T228] BUG: KASAN: user-memory-access in copy_from_bpfptr+0x21/0x30
[ 25.686210][ T228] Read of size 80 at addr 00000000deadbeef by task poc/228
[ 25.686732][ T228]
[ 25.686893][ T228] CPU: 3 PID: 228 Comm: poc Not tainted 5.19.0-rc7 #7
[ 25.687375][ T228] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014
[ 25.687991][ T228] Call Trace:
[ 25.688223][ T228] <TASK>
[ 25.688429][ T228] dump_stack_lvl+0x73/0x9e
[ 25.688747][ T228] print_report+0xea/0x200
[ 25.689061][ T228] ? copy_from_bpfptr+0x21/0x30
[ 25.689401][ T228] ? _printk+0x54/0x6e
[ 25.689693][ T228] ? _raw_spin_lock_irqsave+0x70/0xd0
[ 25.690071][ T228] ? copy_from_bpfptr+0x21/0x30
[ 25.690412][ T228] kasan_report+0xb5/0xe0
[ 25.690716][ T228] ? copy_from_bpfptr+0x21/0x30
[ 25.691059][ T228] kasan_check_range+0x2bd/0x2e0
[ 25.691405][ T228] ? copy_from_bpfptr+0x21/0x30
[ 25.691734][ T228] memcpy+0x25/0x60
[ 25.692000][ T228] copy_from_bpfptr+0x21/0x30
[ 25.692328][ T228] bpf_prog_load+0x604/0x9e0
[ 25.692653][ T228] ? cap_capable+0xb4/0xe0
[ 25.692956][ T228] ? security_capable+0x4f/0x70
[ 25.693324][ T228] __sys_bpf+0x3af/0x580
[ 25.693635][ T228] bpf_sys_bpf+0x45/0x240
[ 25.693937][ T228] bpf_prog_f0ec79a5a3caca46_bpf_func1+0xa2/0xbd
[ 25.694394][ T228] bpf_prog_run_pin_on_cpu+0x2f/0xb0
[ 25.694756][ T228] bpf_prog_test_run_syscall+0x146/0x1c0
[ 25.695144][ T228] bpf_prog_test_run+0x172/0x190
[ 25.695487][ T228] __sys_bpf+0x2c5/0x580
[ 25.695776][ T228] __x64_sys_bpf+0x3a/0x50
[ 25.696084][ T228] do_syscall_64+0x60/0x90
[ 25.696393][ T228] ? fpregs_assert_state_consistent+0x50/0x60
[ 25.696815][ T228] ? exit_to_user_mode_prepare+0x36/0xa0
[ 25.697202][ T228] ? syscall_exit_to_user_mode+0x20/0x40
[ 25.697586][ T228] ? do_syscall_64+0x6e/0x90
[ 25.697899][ T228] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 25.698312][ T228] RIP: 0033:0x7f6d543fb759
[ 25.698624][ T228] Code: 08 5b 89 e8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 a6 0e 00 f7 d8 64 89 01 48
[ 25.699946][ T228] RSP: 002b:00007ffc3df78468 EFLAGS: 00000287 ORIG_RAX: 0000000000000141
[ 25.700526][ T228] RAX: ffffffffffffffda RBX: 00007ffc3df78628 RCX: 00007f6d543fb759
[ 25.701071][ T228] RDX: 0000000000000090 RSI: 00007ffc3df78478 RDI: 000000000000000a
[ 25.701636][ T228] RBP: 00007ffc3df78510 R08: 0000000000000000 R09: 0000000000300000
[ 25.702191][ T228] R10: 0000000000000005 R11: 0000000000000287 R12: 0000000000000000
[ 25.702736][ T228] R13: 00007ffc3df78638 R14: 000055a1584aca68 R15: 00007f6d5456a000
[ 25.703282][ T228] </TASK>
[ 25.703490][ T228] ==================================================================
[ 25.704050][ T228] Disabling lock debugging due to kernel taint
Update copy_from_bpfptr() and strncpy_from_bpfptr() so that:
- for a kernel pointer, it uses the safe copy_from_kernel_nofault() and
strncpy_from_kernel_nofault() functions.
- for a userspace pointer, it performs copy_from_user() and
strncpy_from_user().
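A hedged sketch of the resulting copy helper (simplified; the actual
error-code handling may differ):

  static inline int copy_from_bpfptr_offset(void *dst, bpfptr_t src,
                                            size_t offset, size_t size)
  {
      if (!bpfptr_is_kernel(src))
          return copy_from_user(dst, src.user + offset, size) ? -EFAULT : 0;
      /* kernel pointer: use the fault-tolerant copy instead of memcpy() */
      return copy_from_kernel_nofault(dst, src.kernel + offset, size);
  }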
Fixes: af2ac3e13e45 ("bpf: Prepare bpf syscall to be used from kernel and user space.")
Link: https://lore.kernel.org/bpf/20220727132905.45166-1-jinghao@linux.ibm.com/
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220729201713.88688-1-jinghao@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch updates bpf_design_QA.rst to clarify that passing a function
to the BTF_ID macro does not make that function part of the Linux
kernel's ABI.
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220802173913.4170192-3-paulmck@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch updates bpf_design_QA.rst to clarify that the ability to
attach a BPF program to an arbitrary function in the kernel does not
make that function part of the Linux kernel's ABI.
[ paulmck: Apply Daniel Borkmann feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220802173913.4170192-2-paulmck@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch updates bpf_design_QA.rst to clarify that the ability to
attach a BPF program to a given point in the kernel code via kprobes
does not make that attachment point part of the Linux kernel's ABI.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20220802173913.4170192-1-paulmck@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Previously, the port flag was not set to `NFP_PORT_CHANGED` when running
`ethtool -m DEVNAME`, so the port state (e.g. interface) was not updated.
As a result, `ethtool -m DEVNAME` sometimes could not read the correct
information.
E.g. `ethtool -m DEVNAME` did not work when the driver was loaded before
plugging in the optical module, as the port interface was still NONE
without a port update.
Now update the port state before sending info to the NIC to ensure that
the port interface is correct (latest state).
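A sketch of the change in the ethtool module-EEPROM path (helper and flag
names are assumptions based on the existing nfp port code):

  struct nfp_port *port = nfp_port_from_netdev(netdev);
  struct nfp_eth_table_port *eth_port;

  /* force a port state refresh so the interface type is current */
  set_bit(NFP_PORT_CHANGED, &port->flags);
  eth_port = nfp_port_get_eth_port(port);
  if (!eth_port)
      return -EOPNOTSUPP;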
Fixes: 61f7c6f44870 ("nfp: implement ethtool get module EEPROM")
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220802093355.69065-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Calling mdio_bus_phy_resume() with neither the PHY state machine set to
PHY_HALTED nor phydev->mac_managed_pm set to true is a good indication
that we can produce a race condition looking like this:
CPU0                                    CPU1
bcmgenet_resume
-> phy_resume
-> phy_init_hw
-> phy_start
-> phy_resume
phy_start_aneg()
                                        mdio_bus_phy_resume
                                        -> phy_resume
                                        -> phy_write(..., BMCR_RESET)
                                        -> usleep() -> phy_read()
with the phy_resume() function triggering PHY behavior that may need to
be worked around (see bf8bfc4336f7 ("net: phy: broadcom: Fix
brcm_fet_config_init()") for instance) and that ultimately leads to an
error reading from the PHY.
Fixes: fba863b81604 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220801233403.258871-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ARP monitoring no longer depends on dev->last_rx or dev_trans_start(),
so delete this information.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit e66e257a5d8368d9c0ba13d4630f474436533e8b. The veth
driver no longer needs these hacks, which are slightly detrimental to
the fast path performance, because the bonding driver is keeping track
of TX times of ARP and NS probes by itself, as it should.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that the bonding driver keeps track of the last TX time of ARP and
NS probes, we effectively revert the following commits:
32d3e51a82d4 ("net_sched: use macvlan real dev trans_start in dev_trans_start()")
07ce76aa9bcf ("net_sched: make dev_trans_start return vlan's real dev trans_start")
Note that the approach of continuing to hack at this function would not
get us very far, hence the desire to take a different approach. DSA is
also a virtual device that uses NETIF_F_LLTX, but there, many uppers
share the same lower (DSA master, i.e. the physical host port of a
switch). By making dev_trans_start() on a DSA interface return the
dev_trans_start() of the master, we effectively assume that all other
DSA interfaces are silent, otherwise this corrupts the validity of the
probe timestamp data from the bonding driver's perspective.
Furthermore, the hacks didn't take into consideration the fact that the
lower interface of @dev may not have been physical either. For example,
VLAN over VLAN, or DSA with 2 masters in a LAG.
And even furthermore, there are NETIF_F_LLTX devices which are not
stacked, like veth. The hack here would not work with those, because it
would have nothing to give the bonding driver to chew on at all.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The bonding driver piggybacks on time stamps kept by the network stack
for the purpose of the netdev TX watchdog, and this is problematic
because it does not work with NETIF_F_LLTX devices.
It is hard to say why the driver looks at dev_trans_start() of the
slave->dev, considering that this is updated even by non-ARP/NS probes
sent by us, and even by traffic not sent by us at all (for example PTP
on physical slave devices). ARP monitoring in active-backup mode appears
to still work even if we track only the last TX time of actual ARP
probes.
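A sketch of tracking the probe TX time directly on the slave (the
slave->last_tx field and helper names are assumptions for illustration,
not necessarily the exact ones this series adds):

  static inline void slave_update_last_tx(struct slave *slave)
  {
      WRITE_ONCE(slave->last_tx, jiffies);
  }

  static inline unsigned long slave_last_tx(struct slave *slave)
  {
      return READ_ONCE(slave->last_tx);
  }

  /* in the ARP/NS probe TX path: */
  arp_xmit(skb);
  slave_update_last_tx(slave);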
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The commit in Fixes: has changed a .txt file into a .yaml file. Update the
documentation accordingly.
While at it, add some `` around some file names to improve the output.
Fixes: 70991f1e6858 ("dt-bindings: net: convert sff,sfp to dtschema")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/be3c7e87ca7f027703247eccfe000b8e34805094.1659247114.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The commit a1a2b7125e10 ("of/platform: Drop static setup of IRQ
resource from DT core") stopped IRQ resources being available as
platform resources. This broke the sanity check for the expected
number of resources in the Marvell SATA driver which expected two
resources, the IO memory and the interrupt.
Change the sanity check to only expect the IO memory.
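A sketch of the adjusted sanity check in the driver's platform probe
(abridged; the interrupt is obtained via platform_get_irq() instead):

  /* Only the IO memory is passed as a platform resource now. */
  if (unlikely(pdev->num_resources != 1)) {
      dev_err(&pdev->dev, "invalid number of resources\n");
      return -EINVAL;
  }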
Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Fixes: a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core")
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
This should open up various possibilities like time travel execution, and
is also just another platform to help shake out bugs.
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In case push_rcu() and related functions are buggy, there's a
WARN_ON(len >= 128), which the selftest tries to hit by being tricky. In
case it is hit, we shouldn't corrupt the kernel's stack, though;
otherwise it may be hard to even receive the report that it's buggy. So
conditionalize the stack write based on that WARN_ON()'s return value.
Note that this never *actually* happens anyway. The WARN_ON() in the
first place is bounded by IS_ENABLED(DEBUG), and isn't expected to ever
actually hit. This is just a debugging sanity check.
Additionally, hoist the constant 128 into a named enum,
MAX_ALLOWEDIPS_BITS, so that it's clear why this value is chosen.
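A sketch of the bounded push described above (close to, but not
necessarily identical to, the actual helper):

  enum { MAX_ALLOWEDIPS_BITS = 128 };

  static void push_rcu(struct allowedips_node **stack,
                       struct allowedips_node __rcu *p, unsigned int *len)
  {
      if (rcu_access_pointer(p)) {
          /* Debug-only sanity check; skip the write if it would
           * overflow the on-stack array.
           */
          if (WARN_ON(IS_ENABLED(DEBUG) && *len >= MAX_ALLOWEDIPS_BITS))
              return;
          stack[(*len)++] = rcu_dereference_raw(p);
      }
  }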
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wjJZGA6w_DxA+k7Ejbqsq+uGK==koPai3sqdsfJqemvag@mail.gmail.com/
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The kernel.config and debug.config fragments in wireguard selftests mention
some config symbols that have been reworked:
Commit c5665868183f ("mm: kmemleak: use the memory pool for early
allocations") removes the config DEBUG_KMEMLEAK_EARLY_LOG_SIZE and since
then, the config's feature is available without further configuration.
Commit 4675ff05de2d ("kmemcheck: rip it out") removes kmemcheck and the
corresponding arch config HAVE_ARCH_KMEMCHECK. There is no need for this
config.
Commit 3bf195ae6037 ("netfilter: nat: merge nf_nat_ipv4,6 into nat core")
removes the config NF_NAT_IPV4 and since then, the config's feature is
available without further configuration.
Commit 41a2901e7d22 ("rcu: Remove SPARSE_RCU_POINTER Kconfig option")
removes the config SPARSE_RCU_POINTER and since then, the config's feature
is enabled by default.
Commit dfb4357da6dd ("time: Remove CONFIG_TIMER_STATS") removes the feature
and config CONFIG_TIMER_STATS without any replacement.
Commit 3ca17b1f3628 ("lib/ubsan: remove null-pointer checks") removes the
check and config UBSAN_NULL without any replacement.
Adjust the config fragments to those changes in configs.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Using msleep() is problematic because it's compared against
ratelimiter.c's ktime_get_coarse_boottime_ns(), which means on systems
with slow jiffies (such as UML's forced HZ=100), the result is
inaccurate. So switch to using schedule_hrtimeout().
However, hrtimer gives us access only to the traditional posix timers,
and none of the _COARSE variants. So now, rather than being too
imprecise like jiffies, it's too precise.
One solution would be to give it a large "range" value, but this will
still fire early on a loaded system. A better solution is to align the
timeout to the actual coarse timer, and then round up to the nearest
tick, plus change.
So add the timeout to the current coarse time, and then
schedule_hrtimeout() until the absolute computed time.
This should hopefully reduce flakes in CI as well. Note that we keep the
retry loop in case the entire function is running behind, because the
test could still be scheduled out, by either the kernel or by the
hypervisor's kernel, in which case restarting the test and hoping to not
be scheduled out still helps.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
No one calls mpage_writepages with a NULL get_block parameter, so remove
support for that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
All callers of mpage_writepage use block_write_full_page as their
->writepage implementation when called from mpage_writepages
(although for ntfs3 this is obfuscated a bit).
Just call block_write_full_page directly instead of going through
the ->writepage indirection.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
All callers are gone, so remove the now dead code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
The nobh mode is an obscure feature to save lowmem on large memory
32-bit configurations while trading it for much slower performance, and
it has long been obsolete. Switch to the regular buffer head based
helpers instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
The nobh mode is an obscure feature to save lowmem on large memory
32-bit configurations while trading it for much slower performance, and
it has long been obsolete. Remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Handle the resident case with an explicit generic_writepages call instead
of using the obscure overload that makes mpage_writepages with a NULL
get_block do the same thing.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
migrate_page_move_mapping(), migrate_page_copy() and migrate_page_states()
are all now unused after converting all the filesystems from
aops->migratepage() to aops->migrate_folio().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
With all users converted to migrate_folio(), remove this operation.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
This is little more than changing the types over; there's no real work
being done in this function.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
This involves converting migrate_huge_page_move_mapping(). We also need a
folio variant of hugetlb_set_page_subpool(), but that's for a later patch.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
|
|
Use a folio throughout this function.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
filemap_migrate_folio() fits f2fs's needs perfectly.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Chao Yu <chao@kernel.org>
|
|
filemap_migrate_folio() is a little more general than ubifs really needs,
but it's better to share the code.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
Use filemap_migrate_folio() to do the bulk of the work, and then copy
the ordered flag across if needed.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
|
|
There is nothing iomap-specific about iomap_migratepage(), and it fits
a pattern used by several other filesystems, so move it to mm/migrate.c,
convert it to be filemap_migrate_folio() and convert the iomap filesystems
to use it.
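Roughly, the generic helper ends up with this shape (simplified sketch):

  int filemap_migrate_folio(struct address_space *mapping,
                            struct folio *dst, struct folio *src,
                            enum migrate_mode mode)
  {
      int ret;

      ret = folio_migrate_mapping(mapping, dst, src, 0);
      if (ret != MIGRATEPAGE_SUCCESS)
          return ret;

      if (folio_test_private(src))
          folio_attach_private(dst, folio_detach_private(src));

      folio_migrate_flags(dst, src);
      return MIGRATEPAGE_SUCCESS;
  }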
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
|
Convert all callers to pass a folio. Most have the folio
already available. Switch all users from aops->migratepage to
aops->migrate_folio. Also turn the documentation into kerneldoc.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
|
|
Use a folio throughout this function. migrate_page() will be converted
later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Use a folio throughout this function. migrate_page() will be converted
later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com>
|
|
Now that both callers have a folio, convert this function to
take a folio & rename it.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Use a folio throughout __buffer_migrate_folio(), add kernel-doc for
buffer_migrate_folio() and buffer_migrate_folio_norefs(), move their
declarations to buffer.h and switch all filesystems that have wired
them up.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Use a folio throughout this function.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Use a folio throughout. migrate_page() will be converted to
migrate_folio() later.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Provide a folio-based replacement for aops->migratepage. Update the
documentation to document migrate_folio instead of migratepage.
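The new method has this shape in struct address_space_operations (sketch
of the signature that replaces the old page-based hook):

  struct address_space_operations {
      ...
      /* migrate the contents of src to dst */
      int (*migrate_folio)(struct address_space *mapping,
                           struct folio *dst, struct folio *src,
                           enum migrate_mode mode);
      ...
  };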
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|