linux-rng/drivers/infiniband, branch master

Merge tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2025-12-02T04:18:59Z

Pull objtool updates from Ingo Molnar: - klp-build livepatch module generation (Josh Poimboeuf) Introduce new objtool features and a klp-build script to generate livepatch modules using a source .patch as input. This builds on concepts from the longstanding out-of-tree kpatch project which began in 2012 and has been used for many years to generate livepatch modules for production kernels. However, this is a complete rewrite which incorporates hard-earned lessons from 12+ years of maintaining kpatch. Key improvements compared to kpatch-build: - Integrated with objtool: Leverages objtool's existing control-flow graph analysis to help detect changed functions. - Works on vmlinux.o: Supports late-linked objects, making it compatible with LTO, IBT, and similar. - Simplified code base: ~3k fewer lines of code. - Upstream: No more out-of-tree #ifdef hacks, far less cruft. - Cleaner internals: Vastly simplified logic for symbol/section/reloc inclusion and special section extraction. - Robust __LINE__ macro handling: Avoids false positive binary diffs caused by the __LINE__ macro by introducing a fix-patch-lines script which injects #line directives into the source .patch to preserve the original line numbers at compile time. - Disassemble code with libopcodes instead of running objdump (Alexandre Chartre) - Disassemble support (-d option to objtool) by Alexandre Chartre, which supports the decoding of various Linux kernel code generation specials such as alternatives: 17ef: sched_balance_find_dst_group+0x62f mov 0x34(%r9),%edx 17f3: sched_balance_find_dst_group+0x633 | | X86_FEATURE_POPCNT 17f3: sched_balance_find_dst_group+0x633 | call 0x17f8 <__sw_hweight64> | popcnt %rdi,%rax 17f8: sched_balance_find_dst_group+0x638 cmp %eax,%edx ... jump table alternatives: 1895: sched_use_asym_prio+0x5 test $0x8,%ch 1898: sched_use_asym_prio+0x8 je 0x18a9 189a: sched_use_asym_prio+0xa | | JUMP 189a: sched_use_asym_prio+0xa | jmp 0x18ae | nop2 189c: sched_use_asym_prio+0xc mov $0x1,%eax 18a1: sched_use_asym_prio+0x11 and $0x80,%ecx ... exception table alternatives: native_read_msr: 5b80: native_read_msr+0x0 mov %edi,%ecx 5b82: native_read_msr+0x2 | | EXCEPTION 5b82: native_read_msr+0x2 | rdmsr | resume at 0x5b84 5b84: native_read_msr+0x4 shl $0x20,%rdx .... x86 feature flag decoding (also see the X86_FEATURE_POPCNT example in sched_balance_find_dst_group() above): 2faaf: start_thread_common.constprop.0+0x1f jne 0x2fba4 2fab5: start_thread_common.constprop.0+0x25 | | X86_FEATURE_ALWAYS | X86_BUG_NULL_SEG 2fab5: start_thread_common.constprop.0+0x25 | jmp 0x2faba <.altinstr_aux+0x2f4> | jmp 0x4b0 | nop5 2faba: start_thread_common.constprop.0+0x2a mov $0x2b,%eax ... NOP sequence shortening: 1048e2: snapshot_write_finalize+0xc2 je 0x104917 1048e4: snapshot_write_finalize+0xc4 nop6 1048ea: snapshot_write_finalize+0xca nop11 1048f5: snapshot_write_finalize+0xd5 nop11 104900: snapshot_write_finalize+0xe0 mov %rax,%rcx 104903: snapshot_write_finalize+0xe3 mov 0x10(%rdx),%rax ... and much more. - Function validation tracing support (Alexandre Chartre) - Various -ffunction-sections fixes (Josh Poimboeuf) - Clang AutoFDO (Automated Feedback-Directed Optimizations) support (Josh Poimboeuf) - Misc fixes and cleanups (Borislav Petkov, Chen Ni, Dylan Hatch, Ingo Molnar, John Wang, Josh Poimboeuf, Pankaj Raghav, Peter Zijlstra, Thorsten Blum) * tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits) objtool: Fix segfault on unknown alternatives objtool: Build with disassembly can fail when including bdf.h objtool: Trim trailing NOPs in alternative objtool: Add wide output for disassembly objtool: Compact output for alternatives with one instruction objtool: Improve naming of group alternatives objtool: Add Function to get the name of a CPU feature objtool: Provide access to feature and flags of group alternatives objtool: Fix address references in alternatives objtool: Disassemble jump table alternatives objtool: Disassemble exception table alternatives objtool: Print addresses with alternative instructions objtool: Disassemble group alternatives objtool: Print headers for alternatives objtool: Preserve alternatives order objtool: Add the --disas= action objtool: Do not validate IBT for .return_sites and .call_sites objtool: Improve tracing of alternative instructions objtool: Add functions to better name alternatives objtool: Identify the different types of alternatives ...

Merge tag 'v6.18-rc5' into objtool/core, to pick up fixes

2025-11-13T06:58:43Z

Signed-off-by: Ingo Molnar

mlx5: Fix default values in create CQ

2025-11-11T14:12:18Z

Currently, CQs without a completion function are assigned the mlx5_add_cq_to_tasklet function by default. This is problematic since only user CQs created through the mlx5_ib driver are intended to use this function. Additionally, all CQs that will use doorbells instead of polling for completions must call mlx5_cq_arm. However, the default CQ creation flow leaves a valid value in the CQ's arm_db field, allowing FW to send interrupts to polling-only CQs in certain corner cases. These two factors would allow a polling-only kernel CQ to be triggered by an EQ interrupt and call a completion function intended only for user CQs, causing a null pointer exception. Some areas in the driver have prevented this issue with one-off fixes but did not address the root cause. This patch fixes the described issue by adding defaults to the create CQ flow. It adds a default dummy completion function to protect against null pointer exceptions, and it sets an invalid command sequence number by default in kernel CQs to prevent the FW from sending an interrupt to the CQ until it is armed. User CQs are responsible for their own initialization values. Callers of mlx5_core_create_cq are responsible for changing the completion function and arming the CQ per their needs. Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Akiva Goldberger Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Acked-by: Leon Romanovsky Link: https://patch.msgid.link/1762681743-1084694-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni

RDMA/irdma: Fix vf_id size to u16 to avoid overflow

2025-11-02T11:46:01Z

Correctly size the vf_id to u16 to avoid overflow. Signed-off-by: Jay Bhat Signed-off-by: Tatyana Nikolova Link: https://patch.msgid.link/20251031021726.1003-6-tatyana.e.nikolova@intel.com Signed-off-by: Leon Romanovsky

RDMA/hns: Remove an extra blank line

2025-10-27T09:44:00Z

Remove an extra blank line. Signed-off-by: Guofeng Yue Signed-off-by: Junxian Huang Link: https://patch.msgid.link/20251016114051.1963197-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky

RDMA/hns: Fix wrong WQE data when QP wraps around

2025-10-27T09:44:00Z

When QP wraps around, WQE data from the previous use at the same position still remains as driver does not clear it. The WQE field layout differs across different opcodes, causing that the fields that are not explicitly assigned for the current opcode retain stale values, and are issued to HW by mistake. Such fields are as follows: * MSG_START_SGE_IDX field in ATOMIC WQE * BLOCK_SIZE and ZBVA fields in FRMR WQE * DirectWQE fields when DirectWQE not used For ATOMIC WQE, always set the latest sge index in MSG_START_SGE_IDX as required by HW. For FRMR WQE and DirectWQE, clear only those unassigned fields instead of the entire WQE to avoid performance penalty. Fixes: 68a997c5d28c ("RDMA/hns: Add FRMR support for hip08") Signed-off-by: Junxian Huang Link: https://patch.msgid.link/20251016114051.1963197-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky

RDMA/hns: Fix the modification of max_send_sge

2025-10-27T09:44:00Z

The actual sge number may exceed the value specified in init_attr->cap when HW needs extra sge to enable inline feature. Since these extra sges are not expected by ULP, return the user-specified value to ULP instead of the expanded sge number. Fixes: 0c5e259b06a8 ("RDMA/hns: Fix incorrect sge nums calculation") Signed-off-by: wenglianfa Signed-off-by: Junxian Huang Link: https://patch.msgid.link/20251016114051.1963197-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky

RDMA/hns: Fix recv CQ and QP cache affinity

2025-10-27T09:44:00Z

Currently driver enforces affinity between QP cache and send CQ cache, which helps improve the performance of sending, but doesn't set affinity with recv CQ cache, resulting in suboptimal performance of receiving. Use one CQ bank per context to ensure the affinity among QP, send CQ and recv CQ. For kernel ULP, CQ bank is fixed to 0. Fixes: 9e03dbea2b06 ("RDMA/hns: Fix CQ and QP cache affinity") Signed-off-by: Chengchang Tang Signed-off-by: Junxian Huang Link: https://patch.msgid.link/20251016114051.1963197-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky

RDMA/uverbs: Fix umem release in UVERBS_METHOD_CQ_CREATE

2025-10-19T11:31:25Z

In `UVERBS_METHOD_CQ_CREATE`, umem should be released if anything goes wrong. Currently, if `create_cq_umem` fails, umem would not be released or referenced, causing a possible leak. In this patch, we release umem at `UVERBS_METHOD_CQ_CREATE`, the driver should not release umem if it returns an error code. Fixes: 1a40c362ae26 ("RDMA/uverbs: Add a common way to create CQ with umem") Signed-off-by: Shuhao Fu Link: https://patch.msgid.link/aOh1le4YqtYwj-hH@osx.local Signed-off-by: Leon Romanovsky

RDMA/irdma: Set irdma_cq cq_num field during CQ create

2025-10-19T11:02:11Z

The driver maintains a CQ table that is used to ensure that a CQ is still valid when processing CQ related AEs. When a CQ is destroyed, the table entry is cleared, using irdma_cq.cq_num as the index. This field was never being set, so it was just always clearing out entry 0. Additionally, the cq_num field size was increased to accommodate HW supporting more than 64K CQs. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Jacob Moroni Link: https://patch.msgid.link/20250923142439.943930-1-jmoroni@google.com Acked-by: Tatyana Nikolova Signed-off-by: Leon Romanovsky