aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-07-02drm/gem: Acquire references on GEM handles for framebuffersThomas Zimmermann3-11/+51
A GEM handle can be released while the GEM buffer object is attached to a DRM framebuffer. This leads to the release of the dma-buf backing the buffer object, if any. [1] Trying to use the framebuffer in further mode-setting operations leads to a segmentation fault. Most easily happens with driver that use shadow planes for vmap-ing the dma-buf during a page flip. An example is shown below. [ 156.791968] ------------[ cut here ]------------ [ 156.796830] WARNING: CPU: 2 PID: 2255 at drivers/dma-buf/dma-buf.c:1527 dma_buf_vmap+0x224/0x430 [...] [ 156.942028] RIP: 0010:dma_buf_vmap+0x224/0x430 [ 157.043420] Call Trace: [ 157.045898] <TASK> [ 157.048030] ? show_trace_log_lvl+0x1af/0x2c0 [ 157.052436] ? show_trace_log_lvl+0x1af/0x2c0 [ 157.056836] ? show_trace_log_lvl+0x1af/0x2c0 [ 157.061253] ? drm_gem_shmem_vmap+0x74/0x710 [ 157.065567] ? dma_buf_vmap+0x224/0x430 [ 157.069446] ? __warn.cold+0x58/0xe4 [ 157.073061] ? dma_buf_vmap+0x224/0x430 [ 157.077111] ? report_bug+0x1dd/0x390 [ 157.080842] ? handle_bug+0x5e/0xa0 [ 157.084389] ? exc_invalid_op+0x14/0x50 [ 157.088291] ? asm_exc_invalid_op+0x16/0x20 [ 157.092548] ? dma_buf_vmap+0x224/0x430 [ 157.096663] ? dma_resv_get_singleton+0x6d/0x230 [ 157.101341] ? __pfx_dma_buf_vmap+0x10/0x10 [ 157.105588] ? __pfx_dma_resv_get_singleton+0x10/0x10 [ 157.110697] drm_gem_shmem_vmap+0x74/0x710 [ 157.114866] drm_gem_vmap+0xa9/0x1b0 [ 157.118763] drm_gem_vmap_unlocked+0x46/0xa0 [ 157.123086] drm_gem_fb_vmap+0xab/0x300 [ 157.126979] drm_atomic_helper_prepare_planes.part.0+0x487/0xb10 [ 157.133032] ? lockdep_init_map_type+0x19d/0x880 [ 157.137701] drm_atomic_helper_commit+0x13d/0x2e0 [ 157.142671] ? drm_atomic_nonblocking_commit+0xa0/0x180 [ 157.147988] drm_mode_atomic_ioctl+0x766/0xe40 [...] [ 157.346424] ---[ end trace 0000000000000000 ]--- Acquiring GEM handles for the framebuffer's GEM buffer objects prevents this from happening. The framebuffer's cleanup later puts the handle references. Commit 1a148af06000 ("drm/gem-shmem: Use dma_buf from GEM object instance") triggers the segmentation fault easily by using the dma-buf field more widely. The underlying issue with reference counting has been present before. v2: - acquire the handle instead of the BO (Christian) - fix comment style (Christian) - drop the Fixes tag (Christian) - rename err_ gotos - add missing Link tag Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://elixir.bootlin.com/linux/v6.15/source/drivers/gpu/drm/drm_gem.c#L241 # [1] Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Anusha Srivatsa <asrivats@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: <stable@vger.kernel.org> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250630084001.293053-1-tzimmermann@suse.de
2025-07-01drm/sched: Increment job count before swapping tail spsc queueMatthew Brost1-1/+3
A small race exists between spsc_queue_push and the run-job worker, in which spsc_queue_push may return not-first while the run-job worker has already idled due to the job count being zero. If this race occurs, job scheduling stops, leading to hangs while waiting on the job’s DMA fences. Seal this race by incrementing the job count before appending to the SPSC queue. This race was observed on a drm-tip 6.16-rc1 build with the Xe driver in an SVM test case. Fixes: 1b1f42d8fde4 ("drm: move amd_gpu_scheduler into common location") Fixes: 27105db6c63a ("drm/amdgpu: Add SPSC queue to scheduler.") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250613212013.719312-1-matthew.brost@intel.com
2025-07-01drm/i915/gsc: mei interrupt top half should be in irq disabled contextJunxiao Chang1-1/+1
MEI GSC interrupt comes from i915. It has top half and bottom half. Top half is called from i915 interrupt handler. It should be in irq disabled context. With RT kernel, by default i915 IRQ handler is in threaded IRQ. MEI GSC top half might be in threaded IRQ context. generic_handle_irq_safe API could be called from either IRQ or process context, it disables local IRQ then calls MEI GSC interrupt top half. This change fixes A380/A770 GPU boot hang issue with RT kernel. Fixes: 1e3dc1d8622b ("drm/i915/gsc: add gsc as a mei auxiliary device") Tested-by: Furong Zhou <furong.zhou@intel.com> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> Link: https://lore.kernel.org/r/20250425151108.643649-1-junxiao.chang@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit dccf655f69002d496a527ba441b4f008aa5bebbf) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-07-01drm/i915/gt: Fix timeline left held on VMA alloc errorJanusz Krzysztofik1-1/+2
The following error has been reported sporadically by CI when a test unbinds the i915 driver on a ring submission platform: <4> [239.330153] ------------[ cut here ]------------ <4> [239.330166] i915 0000:00:02.0: [drm] drm_WARN_ON(dev_priv->mm.shrink_count) <4> [239.330196] WARNING: CPU: 1 PID: 18570 at drivers/gpu/drm/i915/i915_gem.c:1309 i915_gem_cleanup_early+0x13e/0x150 [i915] ... <4> [239.330640] RIP: 0010:i915_gem_cleanup_early+0x13e/0x150 [i915] ... <4> [239.330942] Call Trace: <4> [239.330944] <TASK> <4> [239.330949] i915_driver_late_release+0x2b/0xa0 [i915] <4> [239.331202] i915_driver_release+0x86/0xa0 [i915] <4> [239.331482] devm_drm_dev_init_release+0x61/0x90 <4> [239.331494] devm_action_release+0x15/0x30 <4> [239.331504] release_nodes+0x3d/0x120 <4> [239.331517] devres_release_all+0x96/0xd0 <4> [239.331533] device_unbind_cleanup+0x12/0x80 <4> [239.331543] device_release_driver_internal+0x23a/0x280 <4> [239.331550] ? bus_find_device+0xa5/0xe0 <4> [239.331563] device_driver_detach+0x14/0x20 ... <4> [357.719679] ---[ end trace 0000000000000000 ]--- If the test also unloads the i915 module then that's followed with: <3> [357.787478] ============================================================================= <3> [357.788006] BUG i915_vma (Tainted: G U W N ): Objects remaining on __kmem_cache_shutdown() <3> [357.788031] ----------------------------------------------------------------------------- <3> [357.788204] Object 0xffff888109e7f480 @offset=29824 <3> [357.788670] Allocated in i915_vma_instance+0xee/0xc10 [i915] age=292729 cpu=4 pid=2244 <4> [357.788994] i915_vma_instance+0xee/0xc10 [i915] <4> [357.789290] init_status_page+0x7b/0x420 [i915] <4> [357.789532] intel_engines_init+0x1d8/0x980 [i915] <4> [357.789772] intel_gt_init+0x175/0x450 [i915] <4> [357.790014] i915_gem_init+0x113/0x340 [i915] <4> [357.790281] i915_driver_probe+0x847/0xed0 [i915] <4> [357.790504] i915_pci_probe+0xe6/0x220 [i915] ... Closer analysis of CI results history has revealed a dependency of the error on a few IGT tests, namely: - igt@api_intel_allocator@fork-simple-stress-signal, - igt@api_intel_allocator@two-level-inception-interruptible, - igt@gem_linear_blits@interruptible, - igt@prime_mmap_coherency@ioctl-errors, which invisibly trigger the issue, then exhibited with first driver unbind attempt. All of the above tests perform actions which are actively interrupted with signals. Further debugging has allowed to narrow that scope down to DRM_IOCTL_I915_GEM_EXECBUFFER2, and ring_context_alloc(), specific to ring submission, in particular. If successful then that function, or its execlists or GuC submission equivalent, is supposed to be called only once per GEM context engine, followed by raise of a flag that prevents the function from being called again. The function is expected to unwind its internal errors itself, so it may be safely called once more after it returns an error. In case of ring submission, the function first gets a reference to the engine's legacy timeline and then allocates a VMA. If the VMA allocation fails, e.g. when i915_vma_instance() called from inside is interrupted with a signal, then ring_context_alloc() fails, leaving the timeline held referenced. On next I915_GEM_EXECBUFFER2 IOCTL, another reference to the timeline is got, and only that last one is put on successful completion. As a consequence, the legacy timeline, with its underlying engine status page's VMA object, is still held and not released on driver unbind. Get the legacy timeline only after successful allocation of the context engine's VMA. v2: Add a note on other submission methods (Krzysztof Karas): Both execlists and GuC submission use lrc_alloc() which seems free from a similar issue. Fixes: 75d0a7f31eec ("drm/i915: Lift timeline into intel_context") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061 Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://lore.kernel.org/r/20250611104352.1014011-2-janusz.krzysztofik@linux.intel.com (cherry picked from commit cc43422b3cc79eacff4c5a8ba0d224688ca9dd4f) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-06-30drm/vmwgfx: Fix guests running with TDX/SEVMarko Kiiskila1-1/+1
Commit 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") changed the default behavior of memremap(MEMREMAP_WB) and started mapping memory as encrypted. The driver requires the fifo memory to be decrypted to communicate with the host but was relaying on the old default behavior of memremap(MEMREMAP_WB) and thus broke. Fix it by explicitly specifying the desired behavior and passing MEMREMAP_DEC to memremap. Fixes: 81256a50aa0f ("x86/mm: Make memremap(MEMREMAP_WB) map memory as encrypted by default") Signed-off-by: Marko Kiiskila <marko.kiiskila@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20250618192926.1092450-1-zack.rusin@broadcom.com
2025-06-30drm/amd/display: Don't allow OLED to go down to fully offMario Limonciello1-5/+7
[Why] OLED panels can be fully off, but this behavior is unexpected. [How] Ensure that minimum luminance is at least 1. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4338 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 51496c7737d06a74b599d0aa7974c3d5a4b1162e)
2025-06-30drm/amd/display: Added case for when RR equals panel's max RR using freesyncHarold Sun2-0/+9
[WHY] Rounding error sometimes occurs when the refresh rate is equal to a panel's max refresh rate, causing HDMI compliance failures. [HOW] Added a case so that we round up to avoid v_total_min to be below a panel's minimum bound. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Harold Sun <Harold.Sun@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fe7645d22bc0f7c1558296538ec49987bf268ef6)
2025-06-30drm/amdkfd: add hqd_sdma_get_doorbell callbacks for gfx7/8Alex Deucher2-0/+16
These were missed when support was added for other generations. The callbacks are called unconditionally so we need to make sure all generations have them. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4304 Link: https://github.com/ROCm/ROCm/issues/4965 Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+") Cc: Jonathan Kim <jonathan.kim@amd.com> Reported-by: Johl Brown <johlbrown@gmail.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1e9d17a5dcf1242e9518e461d8e63ad35240e49e) Cc: stable@vger.kernel.org
2025-06-30drm/amdgpu: Fix memory leak in amdgpu_ctx_mgr_entity_finiLin.Cao1-0/+1
patch dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check") removed ctx put, which will cause amdgpu_ctx_fini() cannot be called and then cause some finished fence that added by amdgpu_ctx_add_fence() cannot be released and cause memleak. Fixes: dd64956685fa ("drm/amdgpu: Remove duplicated "context still alive" check") Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8cf66089e28108dedd47e6156a48489303cf525c) Cc: stable@vger.kernel.org
2025-06-30drm/amdkfd: Don't call mmput from MMU notifier callbackPhilip Yang1-23/+20
If the process is exiting, the mmput inside mmu notifier callback from compactd or fork or numa balancing could release the last reference of mm struct to call exit_mmap and free_pgtable, this triggers deadlock with below backtrace. The deadlock will leak kfd process as mmu notifier release is not called and cause VRAM leaking. The fix is to take mm reference mmget_non_zero when adding prange to the deferred list to pair with mmput in deferred list work. If prange split and add into pchild list, the pchild work_item.mm is not used, so remove the mm parameter from svm_range_unmap_split and svm_range_add_child. The backtrace of hung task: INFO: task python:348105 blocked for more than 64512 seconds. Call Trace: __schedule+0x1c3/0x550 schedule+0x46/0xb0 rwsem_down_write_slowpath+0x24b/0x4c0 unlink_anon_vmas+0xb1/0x1c0 free_pgtables+0xa9/0x130 exit_mmap+0xbc/0x1a0 mmput+0x5a/0x140 svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu] mn_itree_invalidate+0x72/0xc0 __mmu_notifier_invalidate_range_start+0x48/0x60 try_to_unmap_one+0x10fa/0x1400 rmap_walk_anon+0x196/0x460 try_to_unmap+0xbb/0x210 migrate_page_unmap+0x54d/0x7e0 migrate_pages_batch+0x1c3/0xae0 migrate_pages_sync+0x98/0x240 migrate_pages+0x25c/0x520 compact_zone+0x29d/0x590 compact_zone_order+0xb6/0xf0 try_to_compact_pages+0xbe/0x220 __alloc_pages_direct_compact+0x96/0x1a0 __alloc_pages_slowpath+0x410/0x930 __alloc_pages_nodemask+0x3a9/0x3e0 do_huge_pmd_anonymous_page+0xd7/0x3e0 __handle_mm_fault+0x5e3/0x5f0 handle_mm_fault+0xf7/0x2e0 hmm_vma_fault.isra.0+0x4d/0xa0 walk_pmd_range.isra.0+0xa8/0x310 walk_pud_range+0x167/0x240 walk_pgd_range+0x55/0x100 __walk_page_range+0x87/0x90 walk_page_range+0xf6/0x160 hmm_range_fault+0x4f/0x90 amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu] amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu] init_user_pages+0xb1/0x2a0 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu] kfd_ioctl+0x29d/0x500 [amdgpu] Fixes: fa582c6f3684 ("drm/amdkfd: Use mmget_not_zero in MMU notifier") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a29e067bd38946f752b0ef855f3dfff87e77bec7) Cc: stable@vger.kernel.org
2025-06-30drm/amdgpu: Include sdma_4_4_4.binKent Russell1-0/+1
This got missed during SDMA 4.4.4 support. Fixes: 968e3811c3e8 ("drm/amdgpu: add initial support for sdma444") Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 51526efe02714339ed6139f7bc348377d363200a) Cc: stable@vger.kernel.org
2025-06-30amdkfd: MTYPE_UC for ext-coherent system memoryDavid Yat Sin1-1/+1
Set memory mtype to UC host memory when ext-coherent flag is set and memory is registered as a SVM allocation. Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5d14fdab4778c29cfd39e62c3ce84d232b4a7d8c)
2025-06-30drm/amdgpu/sdma5.x: suspend KFD queues in ring resetAlex Deucher2-2/+12
SDMA 5.x only supports engine soft reset which resets all queues on the engine. As such, we need to suspend KFD queues around resets like we do for SDMA 4.x. Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 61feed0baa1a0d094af0e07e968b1e6e875f07d0)
2025-06-30drm/bridge: aux-hpd-bridge: fix assignment of the of_nodeDmitry Baryshkov1-1/+2
Perform fix similar to the one in the commit 85e444a68126 ("drm/bridge: Fix assignment of the of_node of the parent to aux bridge"). The assignment of the of_node to the aux HPD bridge needs to mark the of_node as reused, otherwise driver core will attempt to bind resources like pinctrl, which is going to fail as corresponding pins are already marked as used by the parent device. Fix that by using the device_set_of_node_from_dev() helper instead of assigning it directly. Fixes: e560518a6c2e ("drm/bridge: implement generic DP HPD bridge") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250608-fix-aud-hpd-bridge-v1-1-4641a6f8e381@oss.qualcomm.com
2025-06-30drm/bridge: panel: move prepare_prev_first handling to drm_panel_bridge_add_typedDmitry Baryshkov1-4/+1
The commit 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") and commit 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") added handling of panel's prepare_prev_first to devm_panel_bridge_add() and drmm_panel_bridge_add(). However if the driver calls drm_panel_bridge_add_typed() directly, then the flag won't be handled and thus the drm_bridge.pre_enable_prev_first will not be set. Move prepare_prev_first handling to the drm_panel_bridge_add_typed() so that there is no way to miss the flag. Fixes: 5ea6b1702781 ("drm/panel: Add prepare_prev_first flag to drm_panel") Fixes: 0974687a19c3 ("drm/bridge: panel: Set pre_enable_prev_first from drmm_panel_bridge_add") Reported-by: Svyatoslav Ryhel <clamor95@gmail.com> Closes: https://lore.kernel.org/dri-devel/CAPVz0n3YZass3Bns1m0XrFxtAC0DKbEPiW6vXimQx97G243sXw@mail.gmail.com/ Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250220-panel_prev_first-v1-1-b9e787825a1a@linaro.org
2025-06-30drm/ttm: fix error handling in ttm_buffer_object_transferChristian König1-6/+7
Unlocking the resv object was missing in the error path, additionally to that we should move over the resource only after the fence slot was reserved. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Fixes: c8d4c18bfbc4a ("dma-buf/drivers: make reserving a shared slot mandatory v4") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250616130726.22863-3-christian.koenig@amd.com
2025-06-30dma-buf: fix timeout handling in dma_resv_wait_timeout v2Christian König1-5/+7
Even the kerneldoc says that with a zero timeout the function should not wait for anything, but still return 1 to indicate that the fences are signaled now. Unfortunately that isn't what was implemented, instead of only returning 1 we also waited for at least one jiffies. Fix that by adjusting the handling to what the function is actually documented to do. v2: improve code readability Reported-by: Marek Olšák <marek.olsak@amd.com> Reported-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20250129105841.1806-1-christian.koenig@amd.com
2025-06-30drm/i915/dsi: Fix NULL pointer deref in vlv_dphy_param_init()Hans de Goede1-1/+1
Commit 77ba0b856225 ("drm/i915/dsi: convert vlv_dsi.[ch] to struct intel_display") added a to_intel_display(connector) call to vlv_dphy_param_init() but when vlv_dphy_param_init() gets called the connector object has not been initialized yet, so this leads to a NULL pointer deref: BUG: kernel NULL pointer dereference, address: 000000000000000c ... Hardware name: ASUSTeK COMPUTER INC. T100TA/T100TA, BIOS T100TA.314 08/13/2015 RIP: 0010:vlv_dsi_init+0x4e6/0x1600 [i915] ... Call Trace: <TASK> ? intel_step_name+0x4be8/0x5c30 [i915] intel_setup_outputs+0x2d6/0xbd0 [i915] intel_display_driver_probe_nogem+0x13f/0x220 [i915] i915_driver_probe+0x3d9/0xaf0 [i915] Use to_intel_display(&intel_dsi->base) instead to fix this. Fixes: 77ba0b856225 ("drm/i915/dsi: convert vlv_dsi.[ch] to struct intel_display") Signed-off-by: Hans de Goede <hansg@kernel.org> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250626143317.101706-1-hansg@kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 0dc6bfb50a5d0759e726cd36a3d3b7529fd2a627) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-06-30drm/i915/selftests: Change mock_request() to return error pointersDan Carpenter2-11/+11
There was an error pointer vs NULL bug in __igt_breadcrumbs_smoketest(). The __mock_request_alloc() function implements the smoketest->request_alloc() function pointer. It was supposed to return error pointers, but it propogates the NULL return from mock_request() so in the event of a failure, it would lead to a NULL pointer dereference. To fix this, change the mock_request() function to return error pointers and update all the callers to expect that. Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/685c1417.050a0220.696f5.5c05@mx.google.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 778fa8ad5f0f23397d045c7ebca048ce8def1c43) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2025-06-29Linux 6.16-rc4Linus Torvalds1-1/+1
2025-06-29drm/bridge: samsung-dsim: Don't use %pK through printkThomas Weißschuh1-2/+2
In the past %pK was preferable to %p as it would not leak raw pointer values into the kernel log. Since commit ad67b74d2469 ("printk: hash addresses printed with %p") the regular %p has been improved to avoid this issue. Furthermore, restricted pointers ("%pK") were never meant to be used through printk(). They can still unintentionally leak raw pointers or acquire sleeping locks in atomic contexts. Switch to the regular pointer formatting which is safer and easier to reason about. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2025-06-29drm/exynos: fimd: Guard display clock control with runtime PM callsMarek Szyprowski1-0/+12
Commit c9b1150a68d9 ("drm/atomic-helper: Re-order bridge chain pre-enable and post-disable") changed the call sequence to the CRTC enable/disable and bridge pre_enable/post_disable methods, so those bridge methods are now called when CRTC is not yet enabled. This causes a lockup observed on Samsung Peach-Pit/Pi Chromebooks. The source of this lockup is a call to fimd_dp_clock_enable() function, when FIMD device is not yet runtime resumed. It worked before the mentioned commit only because the CRTC implemented by the FIMD driver was always enabled what guaranteed the FIMD device to be runtime resumed. This patch adds runtime PM guards to the fimd_dp_clock_enable() function to enable its proper operation also when the CRTC implemented by FIMD is not yet enabled. Fixes: 196e059a8a6a ("drm/exynos: convert clock_enable crtc callback to pipeline clock") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2025-06-29drm/exynos: exynos7_drm_decon: add vblank check in IRQ handlingKaustabh Chakraborty1-0/+4
If there's support for another console device (such as a TTY serial), the kernel occasionally panics during boot. The panic message and a relevant snippet of the call stack is as follows: Unable to handle kernel NULL pointer dereference at virtual address 000000000000000 Call trace: drm_crtc_handle_vblank+0x10/0x30 (P) decon_irq_handler+0x88/0xb4 [...] Otherwise, the panics don't happen. This indicates that it's some sort of race condition. Add a check to validate if the drm device can handle vblanks before calling drm_crtc_handle_vblank() to avoid this. Cc: stable@vger.kernel.org Fixes: 96976c3d9aff ("drm/exynos: Add DECON driver") Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2025-06-29drm/exynos: Don't use %pK through printkThomas Weißschuh2-17/+17
In the past %pK was preferable to %p as it would not leak raw pointer values into the kernel log. Since commit ad67b74d2469 ("printk: hash addresses printed with %p") the regular %p has been improved to avoid this issue. Furthermore, restricted pointers ("%pK") were never meant to be used through printk(). They can still unintentionally leak raw pointers or acquire sleeping locks in atomic contexts. Switch to the regular pointer formatting which is safer and easier to reason about. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2025-06-27tracing: Fix filter logic errorEdward Adam Davis1-7/+7
If the processing of the tr->events loop fails, the filter that has been added to filter_head will be released twice in free_filter_list(&head->rcu) and __free_filter(filter). After adding the filter of tr->events, add the filter to the filter_head process to avoid triggering uaf. Link: https://lore.kernel.org/tencent_4EF87A626D702F816CD0951CE956EC32CD0A@qq.com Fixes: a9d0aab5eb33 ("tracing: Fix regression of filter waiting a long time on RCU synchronization") Reported-by: syzbot+daba72c4af9915e9c894@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=daba72c4af9915e9c894 Tested-by: syzbot+daba72c4af9915e9c894@syzkaller.appspotmail.com Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-27drm/vesadrm: Avoid NULL-ptr deref in vesadrm_pmi_cmap_write()Thomas Zimmermann1-4/+9
Only set PMI fields if the screen_info's Vesa PM segment has been set. Vesa PMI is the power-management interface. It also provides means to set the color palette. The interface is optional, so not all VESA graphics cards support it. Print vesafb's warning [1] if the hardware palette cannot be set at all. If unsupported the field PrimaryPalette in struct vesadrm.pmi is NULL, which results in a segmentation fault. Happens with qemu's Cirrus emulation. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 814d270b31d2 ("drm/sysfb: vesadrm: Add gamma correction") Link: https://elixir.bootlin.com/linux/v6.15/source/drivers/video/fbdev/vesafb.c#L375 # 1 Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Acked-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/r/20250617140944.142392-1-tzimmermann@suse.de
2025-06-27i2c: scx200_acb: depends on HAS_IOPORTJohannes Berg1-1/+1
It already depends on X86_32, but that's also set for ARCH=um. Recent changes made UML no longer have IO port access since it's not needed, but this driver uses it. Build it only for HAS_IOPORT. This is pretty much the same as depending on X86, but on the off-chance that HAS_IOPORT will ever be optional on x86 HAS_IOPORT is the real prerequisite. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-06-27LoongArch: KVM: Disable updating of "num_cpu" and "feature"Bibo Mao1-0/+6
Property "num_cpu" and "feature" are read-only once eiointc is created, which are set with KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL attr group before device creation. Attr group KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS is to update register and software state for migration and reset usage, property "num_cpu" and "feature" can not be update again if it is created already. Here discard write operation with property "num_cpu" and "feature" in attr group KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL. Cc: stable@vger.kernel.org Fixes: 1ad7efa552fd ("LoongArch: KVM: Add EIOINTC user mode read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-27LoongArch: KVM: Check validity of "num_cpu" from user spaceBibo Mao1-5/+14
The maximum supported cpu number is EIOINTC_ROUTE_MAX_VCPUS about irqchip EIOINTC, here add validation about cpu number to avoid array pointer overflow. Cc: stable@vger.kernel.org Fixes: 1ad7efa552fd ("LoongArch: KVM: Add EIOINTC user mode read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-27LoongArch: KVM: Check interrupt route from physical CPUBibo Mao1-6/+18
With EIOINTC interrupt controller, physical CPU ID is set for irq route. However the function kvm_get_vcpu() is used to get destination vCPU when delivering irq. With API kvm_get_vcpu(), the logical CPU ID is used. With API kvm_get_vcpu_by_cpuid(), vCPU ID can be searched from physical CPU ID. Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-27LoongArch: KVM: Fix interrupt route update with EIOINTCBibo Mao1-7/+6
With function eiointc_update_sw_coremap(), there is forced assignment like val = *(u64 *)pvalue. Parameter pvalue may be pointer to char type or others, there is problem with forced assignment with u64 type. Here the detailed value is passed rather address pointer. Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-27LoongArch: KVM: Add address alignment check for IOCSR emulationBibo Mao1-0/+10
IOCSR instruction supports 1/2/4/8 bytes access, the address should be naturally aligned with its access size. Here address alignment check is added in the EIOINTC kernel emulation. Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-27drm/panel: panel-simple: get rid of panel_dpi hackMaxime Ripard1-9/+18
The empty panel_dpi struct was only ever used as a discriminant, but it's kind of a hack, and with the reworks done in the previous patches, we shouldn't need it anymore. Let's get rid of it. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-5-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/panel: panel-simple: Add function to look panel data upMaxime Ripard1-33/+49
Commit de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") moved the call to drm_panel_init into the devm_drm_panel_alloc(), which needs a connector type to initialize properly. In the panel-dpi compatible case, the passed panel_desc structure is an empty one used as a discriminant, and the connector type it contains isn't actually initialized. It is initialized through a call to panel_dpi_probe() later in the function, which used to be before the call to drm_panel_init() that got merged into devm_drm_panel_alloc(). So, we do need a proper panel_desc pointer before the call to devm_drm_panel_alloc() now. All cases associate their panel_desc with the panel compatible and use of_device_get_match_data, except for the panel-dpi compatible. In that case, we're expected to call panel_dpi_probe, which will allocate and initialize the panel_desc for us. Let's create such a helper function that would be called first in the driver and will lookup the desc by compatible, or allocate one if relevant. Reported-by: Francesco Dolcini <francesco@dolcini.it> Closes: https://lore.kernel.org/all/20250612081834.GA248237@francesco-nb/ Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-4-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/panel: panel-simple: Make panel_simple_probe return its panelMaxime Ripard1-13/+20
In order to fix the regession introduced by commit de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()"), we need to move the panel_desc lookup into the common panel_simple_probe() function. There's two callers for that function, the probe implementations of the platform and MIPI-DSI drivers panel-simple implements. The MIPI-DSI driver's probe will need to access the current panel_desc to initialize properly, which won't be possible anymore if we make that lookup in panel_simple_probe(). However, we can make panel_simple_probe() return the initialized panel_simple structure it allocated, which will contain a pointer to the associated panel_desc in its desc field. This doesn't fix de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") still, but makes progress towards that goal. Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-3-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/panel: panel-simple: make panel_dpi_probe return a panel_descMaxime Ripard1-11/+11
If the panel-simple driver is probed from a panel-dpi compatible, the driver will use an empty panel_desc structure as a descriminant. It will then allocate and fill another panel_desc as part of its probe. However, that allocation needs to happen after the panel_simple structure has been allocated, since panel_dpi_probe(), the function doing the panel_desc allocation and initialization, takes a panel_simple pointer as an argument. This pointer is used to fill the panel_simple->desc pointer that is still initialized with the empty panel_desc when panel_dpi_probe() is called. Since commit de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()"), we will need the panel connector type found in panel_desc to allocate panel_simple. This creates a circular dependency where we need panel_desc to create panel_simple, and need panel_simple to create panel_desc. Let's break that dependency by making panel_dpi_probe simply return the panel_desc it initialized and move the panel_simple->desc assignment to the caller. This will not fix the breaking commit entirely, but will move us towards the right direction. Fixes: de04bb0089a9 ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-2-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-27drm/mipi-dsi: Add dev_is_mipi_dsi functionMaxime Ripard2-1/+5
This will be especially useful for generic panels (like panel-simple) which can take different code path depending on if they are MIPI-DSI devices or platform devices. Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Colibri iMX6 Link: https://lore.kernel.org/r/20250626-drm-panel-simple-fixes-v2-1-5afcaa608bdc@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-26io_uring/kbuf: flag partial buffer mappingsJens Axboe3-9/+18
A previous commit aborted mapping more for a non-incremental ring for bundle peeking, but depending on where in the process this peeking happened, it would not necessarily prevent a retry by the user. That can create gaps in the received/read data. Add struct buf_sel_arg->partial_map, which can pass this information back. The networking side can then map that to internal state and use it to gate retry as well. Since this necessitates a new flag, change io_sr_msg->retry to a retry_flags member, and store both the retry and partial map condition in there. Cc: stable@vger.kernel.org Fixes: 26ec15e4b0c1 ("io_uring/kbuf: don't truncate end buffer for multiple buffer peeks") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-26cifs: Fix reading into an ITER_FOLIOQ from the smbdirect codeDavid Howells1-95/+17
When performing a file read from RDMA, smbd_recv() prints an "Invalid msg type 4" error and fails the I/O. This is due to the switch-statement there not handling the ITER_FOLIOQ handed down from netfslib. Fix this by collapsing smbd_recv_buf() and smbd_recv_page() into smbd_recv() and just using copy_to_iter() instead of memcpy(). This future-proofs the function too, in case more ITER_* types are added. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Talpey <tom@talpey.com> cc: Paulo Alcantara (Red Hat) <pc@manguebit.com> cc: Matthew Wilcox <willy@infradead.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-26cifs: Fix the smbd_response slab to allow usercopyDavid Howells1-5/+13
The handling of received data in the smbdirect client code involves using copy_to_iter() to copy data from the smbd_reponse struct's packet trailer to a folioq buffer provided by netfslib that encapsulates a chunk of pagecache. If, however, CONFIG_HARDENED_USERCOPY=y, this will result in the checks then performed in copy_to_iter() oopsing with something like the following: CIFS: Attempting to mount //172.31.9.1/test CIFS: VFS: RDMA transport established usercopy: Kernel memory exposure attempt detected from SLUB object 'smbd_response_0000000091e24ea1' (offset 81, size 63)! ------------[ cut here ]------------ kernel BUG at mm/usercopy.c:102! ... RIP: 0010:usercopy_abort+0x6c/0x80 ... Call Trace: <TASK> __check_heap_object+0xe3/0x120 __check_object_size+0x4dc/0x6d0 smbd_recv+0x77f/0xfe0 [cifs] cifs_readv_from_socket+0x276/0x8f0 [cifs] cifs_read_from_socket+0xcd/0x120 [cifs] cifs_demultiplex_thread+0x7e9/0x2d50 [cifs] kthread+0x396/0x830 ret_from_fork+0x2b8/0x3b0 ret_from_fork_asm+0x1a/0x30 The problem is that the smbd_response slab's packet field isn't marked as being permitted for usercopy. Fix this by passing parameters to kmem_slab_create() to indicate that copy_to_iter() is permitted from the packet region of the smbd_response slab objects, less the header space. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Stefan Metzmacher <metze@samba.org> Link: https://lore.kernel.org/r/acb7f612-df26-4e2a-a35d-7cd040f513e1@samba.org/ Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Stefan Metzmacher <metze@samba.org> Tested-by: Stefan Metzmacher <metze@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-26smb: client: fix potential deadlock when reconnecting channelsPaulo Alcantara2-22/+37
Fix cifs_signal_cifsd_for_reconnect() to take the correct lock order and prevent the following deadlock from happening ====================================================== WARNING: possible circular locking dependency detected 6.16.0-rc3-build2+ #1301 Tainted: G S W ------------------------------------------------------ cifsd/6055 is trying to acquire lock: ffff88810ad56038 (&tcp_ses->srv_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x134/0x200 but task is already holding lock: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&ret_buf->chan_lock){+.+.}-{3:3}: validate_chain+0x1cf/0x270 __lock_acquire+0x60e/0x780 lock_acquire.part.0+0xb4/0x1f0 _raw_spin_lock+0x2f/0x40 cifs_setup_session+0x81/0x4b0 cifs_get_smb_ses+0x771/0x900 cifs_mount_get_session+0x7e/0x170 cifs_mount+0x92/0x2d0 cifs_smb3_do_mount+0x161/0x460 smb3_get_tree+0x55/0x90 vfs_get_tree+0x46/0x180 do_new_mount+0x1b0/0x2e0 path_mount+0x6ee/0x740 do_mount+0x98/0xe0 __do_sys_mount+0x148/0x180 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (&ret_buf->ses_lock){+.+.}-{3:3}: validate_chain+0x1cf/0x270 __lock_acquire+0x60e/0x780 lock_acquire.part.0+0xb4/0x1f0 _raw_spin_lock+0x2f/0x40 cifs_match_super+0x101/0x320 sget+0xab/0x270 cifs_smb3_do_mount+0x1e0/0x460 smb3_get_tree+0x55/0x90 vfs_get_tree+0x46/0x180 do_new_mount+0x1b0/0x2e0 path_mount+0x6ee/0x740 do_mount+0x98/0xe0 __do_sys_mount+0x148/0x180 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #0 (&tcp_ses->srv_lock){+.+.}-{3:3}: check_noncircular+0x95/0xc0 check_prev_add+0x115/0x2f0 validate_chain+0x1cf/0x270 __lock_acquire+0x60e/0x780 lock_acquire.part.0+0xb4/0x1f0 _raw_spin_lock+0x2f/0x40 cifs_signal_cifsd_for_reconnect+0x134/0x200 __cifs_reconnect+0x8f/0x500 cifs_handle_standard+0x112/0x280 cifs_demultiplex_thread+0x64d/0xbc0 kthread+0x2f7/0x310 ret_from_fork+0x2a/0x230 ret_from_fork_asm+0x1a/0x30 other info that might help us debug this: Chain exists of: &tcp_ses->srv_lock --> &ret_buf->ses_lock --> &ret_buf->chan_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ret_buf->chan_lock); lock(&ret_buf->ses_lock); lock(&ret_buf->chan_lock); lock(&tcp_ses->srv_lock); *** DEADLOCK *** 3 locks held by cifsd/6055: #0: ffffffff857de398 (&cifs_tcp_ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x7b/0x200 #1: ffff888119c64060 (&ret_buf->ses_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0x9c/0x200 #2: ffff888119c64330 (&ret_buf->chan_lock){+.+.}-{3:3}, at: cifs_signal_cifsd_for_reconnect+0xcf/0x200 Cc: linux-cifs@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Fixes: d7d7a66aacd6 ("cifs: avoid use of global locks for high contention data") Reviewed-by: David Howells <dhowells@redhat.com> Tested-by: David Howells <dhowells@redhat.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-26block: fix false warning in bdev_count_inflight_rw()Yu Kuai1-11/+15
While bdev_count_inflight is interating all cpus, if some IOs are issued from traversed cpu and then completed from the cpu that is not traversed yet: cpu0 cpu1 bdev_count_inflight //for_each_possible_cpu // cpu0 is 0 infliht += 0 // issue a io blk_account_io_start // cpu0 inflight ++ cpu2 // the io is done blk_account_io_done // cpu2 inflight -- // cpu 1 is 0 inflight += 0 // cpu2 is -1 inflight += -1 ... In this case, the total inflight will be -1, causing lots of false warning. Fix the problem by removing the warning. Noted there is still a valid warning for nvme-mpath(From Yi) that is not fixed yet. Fixes: f5482ee5edb9 ("block: WARN if bdev inflight counter is negative") Reported-by: Yi Zhang <yi.zhang@redhat.com> Closes: https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/#mae89155a5006463d0a21a4a2c35ae0034b26a339 Reported-and-tested-by: Calvin Owens <calvin@wbinvd.org> Closes: https://lore.kernel.org/linux-block/aFtUXy-lct0WxY2w@mozart.vkv.me/T/#m1d935a00070bf95055d0ac84e6075158b08acaef Reported-by: Dave Chinner <david@fromorbit.com> Closes: https://lore.kernel.org/linux-block/aFuypjqCXo9-5_En@dread.disaster.area/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20250626115743.1641443-1-yukuai3@huawei.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-26ublk: sanity check add_dev input for underflowRonnie Sahlberg1-1/+2
Add additional checks that queue depth and number of queues are non-zero. Signed-off-by: Ronnie Sahlberg <rsahlberg@whamcloud.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250626022046.235018-1-ronniesahlberg@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-26drm/xe: Process deferred GGTT node removals on device unwindMichal Wajdeczko1-0/+11
While we are indirectly draining our dedicated workqueue ggtt->wq that we use to complete asynchronous removal of some GGTT nodes, this happends as part of the managed-drm unwinding (ggtt_fini_early), which could be later then manage-device unwinding, where we could already unmap our MMIO/GMS mapping (mmio_fini). This was recently observed during unsuccessful VF initialization: [ ] xe 0000:00:02.1: probe with driver xe failed with error -62 [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747340 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747540 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747240 __xe_bo_unpin_map_no_vm (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747040 tiles_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e746840 mmio_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e747f40 xe_bo_pinned_fini (16 bytes) [ ] xe 0000:00:02.1: DEVRES REL ffff88811e746b40 devm_drm_dev_init_release (16 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] drmres release begin [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef81640 __fini_relay (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80d40 guc_ct_fini (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80040 __drmm_mutex_release (8 bytes) [ ] xe 0000:00:02.1: [drm:drm_managed_release] REL ffff88810ef80140 ggtt_fini_early (8 bytes) and this was leading to: [ ] BUG: unable to handle page fault for address: ffffc900058162a0 [ ] #PF: supervisor write access in kernel mode [ ] #PF: error_code(0x0002) - not-present page [ ] Oops: Oops: 0002 [#1] SMP NOPTI [ ] Tainted: [W]=WARN [ ] Workqueue: xe-ggtt-wq ggtt_node_remove_work_func [xe] [ ] RIP: 0010:xe_ggtt_set_pte+0x6d/0x350 [xe] [ ] Call Trace: [ ] <TASK> [ ] xe_ggtt_clear+0xb0/0x270 [xe] [ ] ggtt_node_remove+0xbb/0x120 [xe] [ ] ggtt_node_remove_work_func+0x30/0x50 [xe] [ ] process_one_work+0x22b/0x6f0 [ ] worker_thread+0x1e8/0x3d Add managed-device action that will explicitly drain the workqueue with all pending node removals prior to releasing MMIO/GSM mapping. Fixes: 919bb54e989c ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250612220937.857-2-michal.wajdeczko@intel.com (cherry picked from commit 89d2835c3680ab1938e22ad81b1c9f8c686bd391) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26drm/xe/guc: Explicitly exit CT safe mode on unwindMichal Wajdeczko1-4/+6
During driver probe we might be briefly using CT safe mode, which is based on a delayed work, but usually we are able to stop this once we have IRQ fully operational. However, if we abort the probe quite early then during unwind we might try to destroy the workqueue while there is still a pending delayed work that attempts to restart itself which triggers a WARN. This was recently observed during unsuccessful VF initialization: [ ] xe 0000:00:02.1: probe with driver xe failed with error -62 [ ] ------------[ cut here ]------------ [ ] workqueue: cannot queue safe_mode_worker_func [xe] on wq xe-g2h-wq [ ] WARNING: CPU: 9 PID: 0 at kernel/workqueue.c:2257 __queue_work+0x287/0x710 [ ] RIP: 0010:__queue_work+0x287/0x710 [ ] Call Trace: [ ] delayed_work_timer_fn+0x19/0x30 [ ] call_timer_fn+0xa1/0x2a0 Exit the CT safe mode on unwind to avoid that warning. Fixes: 09b286950f29 ("drm/xe/guc: Allow CTB G2H processing without G2H IRQ") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250612220937.857-3-michal.wajdeczko@intel.com (cherry picked from commit 2ddbb73ec20b98e70a5200cb85deade22ccea2ec) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26drm/xe: move DPT l2 flush to a more sensible placeMatthew Auld1-2/+3
Only need the flush for DPT host updates here. Normal GGTT updates don't need special flush. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-4-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 35db1da40c8cfd7511dc42f342a133601eb45449) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26drm/xe: Move DSB l2 flush to a more sensible placeMaarten Lankhorst1-7/+4
Flushing l2 is only needed after all data has been written. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: stable@vger.kernel.org # v6.12+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://lore.kernel.org/r/20250606104546.1996818-3-matthew.auld@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 0dd2dd0182bc444a62652e89d08c7f0e4fde15ba) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-26LoongArch: KVM: Avoid overflow with array indexBibo Mao1-10/+7
The variable index is modified and reused as array index when modify register EIOINTC_ENABLE. There will be array index overflow problem. Cc: stable@vger.kernel.org Fixes: 3956a52bc05b ("LoongArch: KVM: Add EIOINTC read and write functions") Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-26LoongArch: Handle KCOV __init vs inline mismatchesKees Cook3-4/+4
When the KCOV is enabled all functions get instrumented, unless the __no_sanitize_coverage attribute is used. To prepare for __no_sanitize_coverage being applied to __init functions, we have to handle differences in how GCC's inline optimizations get resolved. For LoongArch this exposed several places where __init annotations were missing but ended up being "accidentally correct". So fix these cases. Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-06-26LoongArch: Reserve the EFI memory map regionMing Wang1-0/+12
The EFI memory map at 'boot_memmap' is crucial for kdump to understand the primary kernel's memory layout. This memory region, typically part of EFI Boot Services (BS) data, can be overwritten after ExitBootServices if not explicitly preserved by the kernel. This commit addresses this by: 1. Calling memblock_reserve() to reserve the entire physical region occupied by the EFI memory map (header + descriptors). This prevents the primary kernel from reallocating and corrupting this area. 2. Setting the EFI_PRESERVE_BS_REGIONS flag in efi.flags. This indicates that efforts have been made to preserve critical BS code/data regions which can be useful for other kernel subsystems or debugging. These changes ensure the original EFI memory map data remains intact, improving kdump reliability and potentially aiding other EFI-related functionalities that might rely on preserved BS code/data. Signed-off-by: Ming Wang <wangming01@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>