wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2024-09-09	RDMA/mlx5: Drop redundant work canceling from clean_keys()	Michael Guralnik	1	-1/+0
	The canceling of dealyed work in clean_keys() is a leftover from years back and was added to prevent races in the cleanup process of MR cache. The cleanup process was rewritten a few years ago and the canceling of delayed work and flushing of workqueue was added before the call to clean_keys(). Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/943d21f5a9dba7b98a3e1d531e3561ffe9745d71.1725362530.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09	RDMA/erdma: Return QP state in erdma_query_qp	Cheng Xu	1	-1/+24
	Fix qp_state and cur_qp_state to return correct values in struct ib_qp_attr. Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation") Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://patch.msgid.link/20240902112920.58749-4-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09	RDMA/erdma: Add disassociate ucontext support	Cheng Xu	3	-0/+6
	All IO pages mapped to user space are handled by rdma_user_mmap_io, so add empty stub for disassociate ucontext. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://patch.msgid.link/20240902112920.58749-3-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09	RDMA/erdma: Refactor the initialization and destruction of EQ	Cheng Xu	4	-69/+51
	We extracted the common parts of the initialization/destruction process to make the code cleaner. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://patch.msgid.link/20240902112920.58749-2-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09	RDMA/mlx5: Enable ATS when allocating kernel MRs	Maher Sanalla	1	-0/+5
	When creating kernel MRs, it is not definitive whether they will be used for peer-to-peer transactions or for other usecases, since address mapping is performed only after the MR is created. Since peer-to-peer transactions benefit significantly from ATS performance-wise, enable ATS on newly-allocated kernel MRs when supported. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Gal Shalom <galshalom@nvidia.com> Link: https://patch.msgid.link/fafd4c9f14cf438d2882d88649c2947e1d05d0b4.1725273403.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-04	IB/core: Fix ib_cache_setup_one error flow cleanup	Patrisious Haddad	1	-1/+3
	When ib_cache_update return an error, we exit ib_cache_setup_one instantly with no proper cleanup, even though before this we had already successfully done gid_table_setup_one, that results in the kernel WARN below. Do proper cleanup using gid_table_cleanup_one before returning the err in order to fix the issue. WARNING: CPU: 4 PID: 922 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x181/0x1a0 Modules linked in: CPU: 4 UID: 0 PID: 922 Comm: c_repro Not tainted 6.11.0-rc1+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:gid_table_release_one+0x181/0x1a0 Code: 44 8b 38 75 0c e8 2f cb 34 ff 4d 8b b5 28 05 00 00 e8 23 cb 34 ff 44 89 f9 89 da 4c 89 f6 48 c7 c7 d0 58 14 83 e8 4f de 21 ff <0f> 0b 4c 8b 75 30 e9 54 ff ff ff 48 8 3 c4 10 5b 5d 41 5c 41 5d 41 RSP: 0018:ffffc90002b835b0 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff811c8527 RDX: 0000000000000000 RSI: ffffffff811c8534 RDI: 0000000000000001 RBP: ffff8881011b3d00 R08: ffff88810b3abe00 R09: 205d303839303631 R10: 666572207972746e R11: 72746e6520444947 R12: 0000000000000001 R13: ffff888106390000 R14: ffff8881011f2110 R15: 0000000000000001 FS: 00007fecc3b70800(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000340 CR3: 000000010435a001 CR4: 00000000003706b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? show_regs+0x94/0xa0 ? __warn+0x9e/0x1c0 ? gid_table_release_one+0x181/0x1a0 ? report_bug+0x1f9/0x340 ? gid_table_release_one+0x181/0x1a0 ? handle_bug+0xa2/0x110 ? exc_invalid_op+0x31/0xa0 ? asm_exc_invalid_op+0x16/0x20 ? __warn_printk+0xc7/0x180 ? __warn_printk+0xd4/0x180 ? gid_table_release_one+0x181/0x1a0 ib_device_release+0x71/0xe0 ? __pfx_ib_device_release+0x10/0x10 device_release+0x44/0xd0 kobject_put+0x135/0x3d0 put_device+0x20/0x30 rxe_net_add+0x7d/0xa0 rxe_newlink+0xd7/0x190 nldev_newlink+0x1b0/0x2a0 ? __pfx_nldev_newlink+0x10/0x10 rdma_nl_rcv_msg+0x1ad/0x2e0 rdma_nl_rcv_skb.constprop.0+0x176/0x210 netlink_unicast+0x2de/0x400 netlink_sendmsg+0x306/0x660 __sock_sendmsg+0x110/0x120 ____sys_sendmsg+0x30e/0x390 ___sys_sendmsg+0x9b/0xf0 ? kstrtouint+0x6e/0xa0 ? kstrtouint_from_user+0x7c/0xb0 ? get_pid_task+0xb0/0xd0 ? proc_fail_nth_write+0x5b/0x140 ? __fget_light+0x9a/0x200 ? preempt_count_add+0x47/0xa0 __sys_sendmsg+0x61/0xd0 do_syscall_64+0x50/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 1901b91f9982 ("IB/core: Fix potential NULL pointer dereference in pkey cache") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Link: https://patch.msgid.link/79137687d829899b0b1c9835fcb4b258004c439a.1725273354.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-04	IB/mlx5: Fix UMR pd cleanup on error flow of driver init	Chris Mi	1	-0/+3
	The cited commit moves the pd allocation from function mlx5r_umr_resource_cleanup() to a new function mlx5r_umr_cleanup(). So the fix in commit [1] is broken. In error flow, will hit panic [2]. Fix it by checking pd pointer to avoid panic if it is NULL; [1] RDMA/mlx5: Fix UMR cleanup on error flow of driver init [2] [ 347.567063] infiniband mlx5_0: Couldn't register device with driver model [ 347.591382] BUG: kernel NULL pointer dereference, address: 0000000000000020 [ 347.593438] #PF: supervisor read access in kernel mode [ 347.595176] #PF: error_code(0x0000) - not-present page [ 347.596962] PGD 0 P4D 0 [ 347.601361] RIP: 0010:ib_dealloc_pd_user+0x12/0xc0 [ib_core] [ 347.604171] RSP: 0018:ffff888106293b10 EFLAGS: 00010282 [ 347.604834] RAX: 0000000000000000 RBX: 000000000000000e RCX: 0000000000000000 [ 347.605672] RDX: ffff888106293ad0 RSI: 0000000000000000 RDI: 0000000000000000 [ 347.606529] RBP: 0000000000000000 R08: ffff888106293ae0 R09: ffff888106293ae0 [ 347.607379] R10: 0000000000000a06 R11: 0000000000000000 R12: 0000000000000000 [ 347.608224] R13: ffffffffa0704dc0 R14: 0000000000000001 R15: 0000000000000001 [ 347.609067] FS: 00007fdc720cd9c0(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000 [ 347.610094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 347.610727] CR2: 0000000000000020 CR3: 0000000103012003 CR4: 0000000000370eb0 [ 347.611421] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 347.612113] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 347.612804] Call Trace: [ 347.613130] <TASK> [ 347.613417] ? __die+0x20/0x60 [ 347.613793] ? page_fault_oops+0x150/0x3e0 [ 347.614243] ? free_msg+0x68/0x80 [mlx5_core] [ 347.614840] ? cmd_exec+0x48f/0x11d0 [mlx5_core] [ 347.615359] ? exc_page_fault+0x74/0x130 [ 347.615808] ? asm_exc_page_fault+0x22/0x30 [ 347.616273] ? ib_dealloc_pd_user+0x12/0xc0 [ib_core] [ 347.616801] mlx5r_umr_cleanup+0x23/0x90 [mlx5_ib] [ 347.617365] mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x36/0x40 [mlx5_ib] [ 347.618025] __mlx5_ib_add+0x96/0xd0 [mlx5_ib] [ 347.618539] mlx5r_probe+0xe9/0x310 [mlx5_ib] [ 347.619032] ? kernfs_add_one+0x107/0x150 [ 347.619478] ? __mlx5_ib_add+0xd0/0xd0 [mlx5_ib] [ 347.619984] auxiliary_bus_probe+0x3e/0x90 [ 347.620448] really_probe+0xc5/0x3a0 [ 347.620857] __driver_probe_device+0x80/0x160 [ 347.621325] driver_probe_device+0x1e/0x90 [ 347.621770] __driver_attach+0xec/0x1c0 [ 347.622213] ? __device_attach_driver+0x100/0x100 [ 347.622724] bus_for_each_dev+0x71/0xc0 [ 347.623151] bus_add_driver+0xed/0x240 [ 347.623570] driver_register+0x58/0x100 [ 347.623998] __auxiliary_driver_register+0x6a/0xc0 [ 347.624499] ? driver_register+0xae/0x100 [ 347.624940] ? 0xffffffffa0893000 [ 347.625329] mlx5_ib_init+0x16a/0x1e0 [mlx5_ib] [ 347.625845] do_one_initcall+0x4a/0x2a0 [ 347.626273] ? gcov_event+0x2e2/0x3a0 [ 347.626706] do_init_module+0x8a/0x260 [ 347.627126] init_module_from_file+0x8b/0xd0 [ 347.627596] __x64_sys_finit_module+0x1ca/0x2f0 [ 347.628089] do_syscall_64+0x4c/0x100 Fixes: 638420115cc4 ("IB/mlx5: Create UMR QP just before first reg_mr occurs") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Link: https://patch.msgid.link/778c40c60287992da5d6ec92bb07b67f7bb5e6ef.1725273295.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Add support for MR Relaxed Ordering	Kalesh AP	2	-0/+19
	Some of the adapters support Relaxed Ordering for the MRs. Driver queries support for Memory region relax ordering support from firmware and set relax ordering bit in REGISTER_MR request, if the users request for the support. Also, this is supported only if the PCIe device has enabled relaxed ordering attribute. Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Reviewed-by: Vijay Kumar Mandadapu <vijaykumar.mandadapu@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1725256351-12751-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Avoid an extra hwrm per MR creation	Kalesh AP	4	-14/+36
	Firmware now have a new mr registration command where both MR allocation and registration can be done in a single hwrm command. Driver has to issue this new hwrm command whenever the support flag is set. This reduces the number of hwrm issued per MR creation and speed up the MR creation. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1725256351-12751-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Rename a variable	Kalesh AP	3	-7/+7
	Renaming flags to access_flags for clarity. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1725256351-12751-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Update HW interface headers	Kalesh AP	1	-9/+27
	Updating the HW structures for the pcie relax ordering support. Newly added interface structures will be used in the followup patch. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1725256351-12751-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page	Long Li	1	-3/+3
	When mapping doorbell page from user-mode, the driver should use the system page size as this memory is allocated via mmap() from user-mode. Cc: stable@vger.kernel.org Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Long Li <longli@microsoft.com> Link: https://patch.msgid.link/1725030993-16213-2-git-send-email-longli@linuxonhyperv.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/mana_ib: use the correct page table index based on hardware page size	Long Li	1	-1/+1
	MANA hardware uses 4k page size. When calculating the page table index, it should use the hardware page size, not the system page size. Cc: stable@vger.kernel.org Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Long Li <longli@microsoft.com> Link: https://patch.msgid.link/1725030993-16213-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Share a page to expose per SRQ info with userspace	Chandramohan Akula	5	-2/+47
	Gen P7 adapters needs to share a toggle bits information received in kernel driver with the user space. User space needs this info to arm the SRQ. User space application can get this page using the UAPI routines. Library will mmap this page and get the toggle bits to be used in the next ARM Doorbell. Uses a hash list to map the SRQ structure from the SRQ ID. SRQ structure is retrieved from the hash list while the library calls the UAPI routine to get the toggle page mapping. Currently the full page is mapped per SRQ. This can be optimized to enable multiple SRQs from the same application share the same page and different offsets in the page Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1724945645-14989-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Refactor the BNXT_RE_METHOD_GET_TOGGLE_MEM method	Chandramohan Akula	1	-11/+7
	Refactor the code in this function to have common code. This is used in subsequent patches. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1724945645-14989-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/bnxt_re: Get the toggle bits from SRQ events	Hongguang Gao	3	-0/+13
	SRQ arming requires the toggle bits received from hardware. Get the toggle bits from SRQ notification for the gen p7 adapters. This value will be zero for the older adapters. Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1724945645-14989-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-02	RDMA/rdmavt: Convert to use ERR_CAST()	Shen Lichuan	1	-3/+3
	As opposed to open-code, using the ERR_CAST macro clearly indicates that this is a pointer to an error value and a type conversion was performed. Signed-off-by: Shen Lichuan <shenlichuan@vivo.com> Link: https://patch.msgid.link/20240828082720.33231-1-shenlichuan@vivo.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/cxgb4: Remove unused declarations	Yue Haibing	1	-4/+0
	Since commit be4c9bad9d0e ("MAINTAINERS: Add cxgb4 and iw_cxgb4 entries") c4iw_post_terminate() declaration is not used anymore. And other declarations were never implemented since introduction in commit cfdda9d76436 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC"). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Link: https://patch.msgid.link/20240824091629.3659565-1-yuehaibing@huawei.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Remove an extra space	Jack Wang	1	-1/+1
	No functional changes. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Alexei Pastuchov <alexei.pastuchov@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-12-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Do local invalidate after write io completion	Jack Wang	1	-13/+7
	Switch local invalidate after write io completion avoid the chain usage of WR, this fixed the local protection error on LOCAL INVALIDATE WR. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-11-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs: Register ib event handler	Grzegorz Prajsner	5	-2/+58
	Use ib_register_event_handler() to register event handlers for both client and server side. For now, all those handlers do, is to print type of incoming event. Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-10-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-srv: Avoid null pointer deref during path establishment	Md Haris Iqbal	1	-2/+11
	For RTRS path establishment, RTRS client initiates and completes con_num of connections. After establishing all its connections, the information is exchanged between the client and server through the info_req message. During this exchange, it is essential that all connections have been established, and the state of the RTRS srv path is CONNECTED. So add these sanity checks, to make sure we detect and abort process in error scenarios to avoid null pointer deref. Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-9-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Print request type for errors	Jack Wang	1	-2/+4
	Extend the output to print also the request type. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-8-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Reset cid to con_num - 1 to stay in bounds	Md Haris Iqbal	1	-0/+6
	In the function init_conns(), after the create_con() and create_cm() for loop if something fails. In the cleanup for loop after the destroy tag, we access out of bound memory because cid is set to clt_path->s.con_num. This commits resets the cid to clt_path->s.con_num - 1, to stay in bounds in the cleanup loop later. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-7-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Reuse need_inval from mr	Jack Wang	2	-10/+9
	mr has a member need_inval, which can be used to indicate if local invalidate is needed, switch to it and remove need_inv from rtrs_clt_io_req. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-6-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs: Reset hb_missed_cnt after receiving other traffic from peer	Jack Wang	2	-1/+3
	Reset hb_missed_cnt after receiving traffic from other peer, so hb is more robust again high load on host or network. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-5-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Rate limit errors in IO path	Jack Wang	1	-3/+3
	On network errors, a large number of these logs are printed due to all the inflight IOs, rate limit them so they do not clutter kernel log. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-4-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs-clt: Fix need_inv setting in error case	Jack Wang	1	-11/+9
	In some cases need_inv can be missed for write requests, additionally driver has to handle missing invalidates for write requests. While at it, remove the else case from write invalidate path as it is possible to reach there. Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-3-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-28	RDMA/rtrs: For HB error add additional clt/srv specific logging	Md Haris Iqbal	2	-0/+6
	In case of HB error, we need to know the specific path on which it happened, for better debugging. Since the clt/srv path structures are not available in rtrs.c, it needs to be done in the individual HB error handler. This commit add those loging. A sample kernel log output after this commit: rtrs_core L357: <blya>: HB missed max reached. rtrs_server L717: <blya>: HB err handler for path=ip:x.x.x.x@ip:x.x.x.x . . rtrs_core L357: <blya>: HB missed max reached. rtrs_client L1519: <blya>: HB err handler for path=ip:x.x.x.x@ip:x.x.x.x Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Grzegorz Prajsner <grzegorz.prajsner@ionos.com> Link: https://patch.msgid.link/20240821112217.41827-2-haris.iqbal@ionos.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-08-27	RDMA/bnxt_re: Enable variable size WQEs for user space applications	Selvin Xavier	2	-0/+6
	Add backward compatibility code to enable variable size WQEs only if the user lib supports it. Link: https://patch.msgid.link/r/1724042847-1481-6-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-27	RDMA/bnxt_re: Handle variable WQE support for user applications	Selvin Xavier	3	-46/+82
	User library calculates the number of slots required for user applications and it can pass that information to the driver. Driver can use this value and update the HW directly. This mechanism is currently used only for the newly introduced variable size WQEs. Extend the bnxt_re_qp_req structure to pass the Send Queue slot count. Reorganize the code to get the sq_slots before initializing the Send Queue attributes. Link: https://patch.msgid.link/r/1724042847-1481-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-27	RDMA/bnxt_re: Fix the table size for PSN/MSN entries	Selvin Xavier	1	-0/+2
	HW MSN table size is always a power of 2. So the pages should be mapped accordingly. Use the power of two calculation while get the number of PSN/MSN entries. Fixes: 6f6bfbc595fb ("RDMA/bnxt_re: Expose the MSN table capability for user library") Link: https://patch.msgid.link/r/1724042847-1481-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-27	RDMA/bnxt_re: Get the WQE index from slot index while completing the WQEs	Selvin Xavier	2	-1/+52
	While reporting the completions, SQ Work Queue index is required to identify the WQE that generated the completions. In variable WQE mode, FW returns the slot index for Error completions. Driver need to walk through the shadow queue between the consumer index and producer index and matches the slot index returned by FW. If a match is found, the next index of the shadow queue is the WQE index to be considered for remaining poll_cq loop. Link: https://patch.msgid.link/r/1724042847-1481-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-27	RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters	Selvin Xavier	6	-27/+47
	Variable size WQE means that each send Work Queue Entry to HW can use different WQE sizes as opposed to the static WQE size on the current devices. Set variable WQE mode for Gen P7 devices. Depth of the Queue will be a multiple of slot which is 16 bytes. The number of slots should be a multiple of 256 as per the HW requirement. Initialize the Software shadow queue to hold requests equal to the number of slots. Also, do not expose the variable size WQE capability until the last patch in the series. Link: https://patch.msgid.link/r/1724042847-1481-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-25	Linux 6.11-rc5	Linus Torvalds	1	-1/+1

2024-08-24	bcachefs: Fix rebalance_work accounting	Kent Overstreet	5	-27/+98
	rebalance_work was keying off of the presence of rebelance_opts in the extent - but that was incorrect, we keep those around after rebalance for indirect extents since the inode's options are not directly available Fixes: 20ac515a9cc7 ("bcachefs: bch_acct_rebalance_work") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-24	bcachefs: Fix failure to flush moves before sleeping in copygc	Kent Overstreet	1	-1/+1
	This fixes an apparent deadlock - rebalance would get stuck trying to take nocow locks because they weren't being released by copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-23	RDMA/efa: Add support for node guid	Yehuda Yitschak	4	-0/+6
	Propagate the unique, per device, ID in the device attributes to the standard node_guid value in IB device. Link: https://patch.msgid.link/r/20240822171143.2800-1-mrgolin@amazon.com Reviewed-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Yehuda Yitschak <yehuday@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/iwcm: Fix WARNING:at_kernel/workqueue.c:#check_flush_dependency	Zhu Yanjun	1	-1/+1
	In the commit aee2424246f9 ("RDMA/iwcm: Fix a use-after-free related to destroying CM IDs"), the function flush_workqueue is invoked to flush the work queue iwcm_wq. But at that time, the work queue iwcm_wq was created via the function alloc_ordered_workqueue without the flag WQ_MEM_RECLAIM. Because the current process is trying to flush the whole iwcm_wq, if iwcm_wq doesn't have the flag WQ_MEM_RECLAIM, verify that the current process is not reclaiming memory or running on a workqueue which doesn't have the flag WQ_MEM_RECLAIM as that can break forward-progress guarantee leading to a deadlock. The call trace is as below: [ 125.350876][ T1430] Call Trace: [ 125.356281][ T1430] <TASK> [ 125.361285][ T1430] ? __warn (kernel/panic.c:693) [ 125.367640][ T1430] ? check_flush_dependency (kernel/workqueue.c:3706 (discriminator 9)) [ 125.375689][ T1430] ? report_bug (lib/bug.c:180 lib/bug.c:219) [ 125.382505][ T1430] ? handle_bug (arch/x86/kernel/traps.c:239) [ 125.388987][ T1430] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) [ 125.395831][ T1430] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) [ 125.403125][ T1430] ? check_flush_dependency (kernel/workqueue.c:3706 (discriminator 9)) [ 125.410984][ T1430] ? check_flush_dependency (kernel/workqueue.c:3706 (discriminator 9)) [ 125.418764][ T1430] __flush_workqueue (kernel/workqueue.c:3970) [ 125.426021][ T1430] ? __pfx___might_resched (kernel/sched/core.c:10151) [ 125.433431][ T1430] ? destroy_cm_id (drivers/infiniband/core/iwcm.c:375) iw_cm [ 125.441209][ T1430] ? __pfx___flush_workqueue (kernel/workqueue.c:3910) [ 125.473900][ T1430] ? _raw_spin_lock_irqsave (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162) [ 125.473909][ T1430] ? __pfx__raw_spin_lock_irqsave (kernel/locking/spinlock.c:161) [ 125.482537][ T1430] _destroy_id (drivers/infiniband/core/cma.c:2044) rdma_cm [ 125.495072][ T1430] nvme_rdma_free_queue (drivers/nvme/host/rdma.c:656 drivers/nvme/host/rdma.c:650) nvme_rdma [ 125.505827][ T1430] nvme_rdma_reset_ctrl_work (drivers/nvme/host/rdma.c:2180) nvme_rdma [ 125.505831][ T1430] process_one_work (kernel/workqueue.c:3231) [ 125.515122][ T1430] worker_thread (kernel/workqueue.c:3306 kernel/workqueue.c:3393) [ 125.515127][ T1430] ? __pfx_worker_thread (kernel/workqueue.c:3339) [ 125.531837][ T1430] kthread (kernel/kthread.c:389) [ 125.539864][ T1430] ? __pfx_kthread (kernel/kthread.c:342) [ 125.550628][ T1430] ret_from_fork (arch/x86/kernel/process.c:147) [ 125.558840][ T1430] ? __pfx_kthread (kernel/kthread.c:342) [ 125.558844][ T1430] ret_from_fork_asm (arch/x86/entry/entry_64.S:257) [ 125.566487][ T1430] </TASK> [ 125.566488][ T1430] ---[ end trace 0000000000000000 ]--- Fixes: aee2424246f9 ("RDMA/iwcm: Fix a use-after-free related to destroying CM IDs") Link: https://patch.msgid.link/r/20240820113336.19860-1-yanjun.zhu@linux.dev Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202408151633.fc01893c-oliver.sang@intel.com Tested-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/rxe: Fix __bth_set_resv6a	zhenwei pi	1	-1/+1
	__bth_set_resv6a is used to clear BIT [24, 29] of rxe_bth::qpn, the wrong expression leads other BITs into 1. Link: https://patch.msgid.link/r/20240822065223.1117056-4-pizhenwei@bytedance.com Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/rxe: Fix misspelling of 'rmda'	zhenwei pi	1	-1/+1
	Fix 'rmda' into 'RDMA'. Link: https://patch.msgid.link/r/20240822065223.1117056-3-pizhenwei@bytedance.com Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/rxe: Use sizeof instead of hard code number	zhenwei pi	1	-1/+1
	Use 'sizeof(union rdma_network_hdr)' instead of hard code GRH length for GSI and UD. Link: https://patch.msgid.link/r/20240822065223.1117056-2-pizhenwei@bytedance.com Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/mlx4: Simplify an alloc_ordered_workqueue() invocation	Jinjie Ruan	1	-7/+3
	Let alloc_ordered_workqueue() format the workqueue name instead of calling snprintf() explicitly. Link: https://patch.msgid.link/r/20240823101840.515398-5-ruanjinjie@huawei.com Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/mlx4: Simplify an alloc_ordered_workqueue() invocation	Jinjie Ruan	1	-3/+1
	Let alloc_ordered_workqueue() format the workqueue name instead of calling snprintf() explicitly. Link: https://patch.msgid.link/r/20240823101840.515398-4-ruanjinjie@huawei.com Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/mad: Simplify an alloc_ordered_workqueue() invocation	Jinjie Ruan	1	-3/+2
	Let alloc_ordered_workqueue() format the workqueue name instead of calling snprintf() explicitly. Link: https://patch.msgid.link/r/20240823101840.515398-3-ruanjinjie@huawei.com Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-23	RDMA/qib: Simplify an alloc_ordered_workqueue() invocation	Jinjie Ruan	1	-6/+3
	Let alloc_ordered_workqueue() format the workqueue name instead of calling snprintf() explicitly. Link: https://patch.msgid.link/r/20240823101840.515398-2-ruanjinjie@huawei.com Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-08-22	NFS: Avoid unnecessary rescanning of the per-server delegation list	Trond Myklebust	1	-10/+5
	If the call to nfs_delegation_grab_inode() fails, we will not have dropped any locks that require us to rescan the list. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22	NFSv4: Fix clearing of layout segments in layoutreturn	Trond Myklebust	2	-6/+8
	Make sure that we clear the layout segments in cases where we see a fatal error, and also in the case where the layout is invalid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22	NFSv4: Add missing rescheduling points in nfs_client_return_marked_delegations	Trond Myklebust	1	-0/+2
	We're seeing reports of soft lockups when iterating through the loops, so let's add rescheduling points. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2024-08-22	nfs: fix bitmap decoder to handle a 3rd word	Jeff Layton	1	-2/+4
	It only decodes the first two words at this point. Have it decode the third word as well. Without this, the client doesn't send delegated timestamps in the CB_GETATTR response. With this change we also need to expand the on-stack bitmap in decode_recallany_args to 3 elements, in case the server sends a larger bitmap than expected. Fixes: 43df7110f4a9 ("NFSv4: Add CB_GETATTR support for delegated attributes") Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>