aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu (follow)
AgeCommit message (Collapse)AuthorFilesLines
7 daystreewide, timers: Rename from_timer() to timer_container_of()Ingo Molnar21-23/+29
Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
8 daysMerge tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds2-2/+4
Pull more drm fixes from Simona Vetter: "Another small batch of drm fixes, this time with a different baseline and hence separate. Drivers: - ivpu: - dma_resv locking - warning fixes - reset failure handling - improve logging - update fw file names - fix cmdqueue unregister - panel-simple: add Evervision VGG644804 Core Changes: - sysfb: screen_info type check - video: screen_info for relocated pci fb - drm/sched: signal fence of killed job - dummycon: deferred takeover fix" * tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel: sysfb: Fix screen_info type check for VGA video: screen_info: Relocate framebuffers behind PCI bridges accel/ivpu: Fix warning in ivpu_gem_bo_free() accel/ivpu: Trigger device recovery on engine reset/resume failure accel/ivpu: Use dma_resv_lock() instead of a custom mutex drm/panel-simple: fix the warnings for the Evervision VGG644804 accel/ivpu: Reorder Doorbell Unregister and Command Queue Destruction accel/ivpu: Use firmware names from upstream repo accel/ivpu: Improve buffer object logging dummycon: Trigger redraw when switching consoles with deferred takeover drm/scheduler: signal scheduled fence when kill job
9 daysMerge tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds134-617/+2274
Pull drm fixes from Dave Airlie: "This is pretty much two weeks worth of fixes, plus one thing that might be considered next: amdkfd is now able to be enabled on risc-v platforms. Otherwise, amdgpu and xe with the majority of fixes, and then a smattering all over. panel: - nt37801: fix IS_ERR - nt37801: fix KConfig connector: - Fix null deref in HDMI audio helper. bridge: - analogix_dp: fixup clk-disable removal nouveau: - minor typo fix (',' vs ';') msm: - mailmap updates i915: - Fix the enabling/disabling of DP audio SDP splitting - Fix PSR register definitions for ALPM - Fix u32 overflow in SNPS PHY HDMI PLL setup - Fix GuC pending message underflow when submit fails - Fix GuC wakeref underflow race during reset xe: - Two documentation fixes - A couple of vm init fixes - Hwmon fixes - Drop reduntant conversion to bool - Fix CONFIG_INTEL_VSEC dependency - Rework eviction rejection of bound external bos - Stop re-submitting signalled jobs - A couple of pxp fixes - Add back a fix that got lost in a merge - Create LRC bo without VM - Fix for the above fix amdgpu: - UserQ fixes - SMU 13.x fixes - VCN fixes - JPEG fixes - Misc cleanups - runtime pm fix - DCN 4.0.1 fixes - Misc display fixes - ISP fix - VRAM manager fix - RAS fixes - IP discovery fix - Cleaner shader fix for GC 10.1.x - OD fix - Non-OLED panel fix - Misc display fixes - Brightness fixes amdkfd: - Enable CONFIG_HSA_AMD on RISCV - SVM fix - Misc cleanups - Ref leak fix - WPTR BO fix radeon: - Misc cleanups" * tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel: (105 commits) drm/nouveau/vfn/r535: Convert comma to semicolon drm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init() drm/xe: Create LRC BO without VM drm/xe/guc_submit: add back fix drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready drm/xe/pxp: Use the correct define in the set_property_funcs array drm/xe/sched: stop re-submitting signalled jobs drm/xe: Rework eviction rejection of bound external bos drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency drm/xe: drop redundant conversion to bool drm/xe/hwmon: Move card reactive critical power under channel card drm/xe/hwmon: Add support to manage power limits though mailbox drm/xe/vm: move xe_svm_init() earlier drm/xe/vm: move rebind_work init earlier MAINTAINERS: .mailmap: update Rob Clark's email address mailmap: Update entry for Akhil P Oommen MAINTAINERS: update my email address MAINTAINERS: drop myself as maintainer drm/i915/display: Fix u32 overflow in SNPS PHY HDMI PLL setup drm/amd/display: Fix default DC and AC levels ...
9 daysMerge tag 'drm-misc-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixesSimona Vetter1-2/+3
Short summary of fixes pull: ivpu: - gem: Use dma-resv lock - gem. Fix a warning - Trigger recovery on device engine reset/resume failure panel: - panel-simple: Fix settings for Evervision VGG644804 sysfb: - Fix screen_info type check video: - Update screen_info for relocated PCI framebuffers Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250606072853.GA13099@linux.fritz.box
9 daysMerge tag 'drm-misc-fixes-2025-05-28' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixesSimona Vetter1-0/+1
Short summary of fixes pull: drm-scheduler: - signal scheduled fence when killing job dummycon: - trigger deferred takeover when switching consoles ivpu: - improve logging - update firmware filenames - reorder steps in command-queue unregistering Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250528153550.GA21050@linux.fritz.box
9 daysdrm/nouveau/vfn/r535: Convert comma to semicolonChen Ni1-1/+1
Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Fixes: cd3c62282b61 ("drm/nouveau/gsp: add usermode class id to gpu hal") Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250603061027.1310267-1-nichen@iscas.ac.cn
9 daysMerge tag 'amd-drm-fixes-6.16-2025-06-05' of https://gitlab.freedesktop.org/agd5f/linux into drm-nextDave Airlie17-51/+205
amd-drm-fixes-6.16-2025-06-05: amdgpu: - IP discovery fix - Cleaner shader fix for GC 10.1.x - OD fix - UserQ fixes - Non-OLED panel fix - Misc display fixes - Brightness fixes amdkfd: - Enable CONFIG_HSA_AMD on RISCV Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250606015932.835829-1-alexander.deucher@amd.com
9 daysMerge tag 'drm-xe-next-fixes-2025-06-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-nextDave Airlie19-176/+476
Driver Changes: - A couple of vm init fixes (Matt Auld) - Hwmon fixes (Karthik) - Drop reduntant conversion to bool (Raag) - Fix CONFIG_INTEL_VSEC dependency (Arnd) - Rework eviction rejection of bound external bos (Thomas) - Stop re-submitting signalled jobs (Matt Auld) - A couple of pxp fixes (Daniele) - Add back a fix that got lost in a merge (Matt Auld) - Create LRC bo without VM (Niranjana) - Fix for the above fix (Maciej) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/aEHq44uIAZwfK-mG@fedora
9 daysMerge tag 'drm-misc-next-fixes-2025-06-05' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-nextDave Airlie4-17/+12
drm-misc-fixes for v6.16-rc1: - Fixes for nt37801 panel - Fix null deref in HDMI audio helper. - Fixes for analogix_dp. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/14c2eff8-701d-4699-b187-08862715e1ac@linux.intel.com
9 daysMerge tag 'drm-intel-next-fixes-2025-06-05' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-nextDave Airlie3-14/+25
- Fix PSR register definitions for ALPM - Fix u32 overflow in SNPS PHY HDMI PLL setup - Fix GuC pending message underflow when submit fails - Fix GuC wakeref underflow race during reset Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://lore.kernel.org/r/aEFW1wGnt1kTVNGF@jlahtine-mobl
9 daysdrm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init()Maciej Patelczyk1-5/+1
There is unmatched xe_vm_unlock() in the __xe_exec_queue_init(). Leftover from commit fbeaad071a98 ("drm/xe: Create LRC BO without VM") Fixes: 2b0a0ce0c20b ("drm/xe: Create LRC BO without VM") Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://lore.kernel.org/r/20250530135627.2821612-1-maciej.patelczyk@intel.com (cherry picked from commit 28b996ce73982a44fa86736ca0e3684cb1ae8b24) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe: Create LRC BO without VMNiranjana Vishwanathapura2-28/+4
Specifying VM during lrc->bo creation requires VM's reference to be held for the lifetime of lrc->bo as it will use VM's dma reservation object. Using VM's dma reservation object for lrc->bo doesn't provide any advantage. Hence do not pass VM while creating lrc->bo. v2: Use xe_bo_unpin_map_no_vm (Matthew Brost) Fixes: 264eecdba211 ("drm/xe: Decouple xe_exec_queue and xe_lrc") Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250529052031.2429120-2-niranjana.vishwanathapura@intel.com (cherry picked from commit fbeaad071a98fef87deccee81d564de1c8e8e16d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/guc_submit: add back fixMatthew Auld1-0/+11
Daniele noticed that the fix in commit 2d2be279f1ca ("drm/xe: fix UAF around queue destruction") looks to have been unintentionally removed as part of handling a conflict in some past merge commit. Add it back. Fixes: ac44ff7cec33 ("Merge tag 'drm-xe-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes") Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250603174213.1543579-2-matthew.auld@intel.com (cherry picked from commit 9d9fca62dc49d96f97045b6d8e7402a95f8cf92a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/pxp: Clarify PXP queue creation behavior if PXP is not readyDaniele Ceraolo Spurio1-2/+6
The expected flow of operations when using PXP is to query the PXP status and wait for it to transition to "ready" before attempting to create an exec_queue. This flow is followed by the Mesa driver, but there is no guarantee that an incorrectly coded (or malicious) app will not attempt to create the queue first without querying the status. Therefore, we need to clarify what the expected behavior of the queue creation ioctl is in this scenario. Currently, the ioctl always fails with an -EBUSY code no matter the error, but for consistency it is better to distinguish between "failed to init" (-EIO) and "not ready" (-EBUSY), the same way the query ioctl does. Note that, while this is a change in the return code of an ioctl, the behavior of the ioctl in this particular corner case was not clearly spec'd, so no one should have been relying on it (and we know that Mesa, which is the only known userspace for this, didn't). v2: Minor rework of the doc (Rodrigo) Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-7-daniele.ceraolospurio@intel.com (cherry picked from commit 21784ca96025b62d95b670b7639ad70ddafa69b8) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/pxp: Use the correct define in the set_property_funcs arrayDaniele Ceraolo Spurio1-1/+1
The define of the extension type was accidentally used instead of the one of the property itself. They're both zero, so no functional issue, but we should use the correct define for code correctness. Fixes: 41a97c4a1294 ("drm/xe/pxp/uapi: Add API to mark a BO as using PXP") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-6-daniele.ceraolospurio@intel.com (cherry picked from commit 1d891ee820fd0fbb4101eacb0d922b5050a24933) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/sched: stop re-submitting signalled jobsMatthew Auld1-1/+9
Customer is reporting a really subtle issue where we get random DMAR faults, hangs and other nasties for kernel migration jobs when stressing stuff like s2idle/s3/s4. The explosions seems to happen somewhere after resuming the system with splats looking something like: PM: suspend exit rfkill: input handler disabled xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0 xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1] xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out The likely cause appears to be a race between suspend cancelling the worker that processes the free_job()'s, such that we still have pending jobs to be freed after the cancel. Following from this, on resume the pending_list will now contain at least one already complete job, but it looks like we call drm_sched_resubmit_jobs(), which will then call run_job() on everything still on the pending_list. But if the job was already complete, then all the resources tied to the job, like the bb itself, any memory that is being accessed, the iommu mappings etc. might be long gone since those are usually tied to the fence signalling. This scenario can be seen in ftrace when running a slightly modified xe_pm IGT (kernel was only modified to inject artificial latency into free_job to make the race easier to hit): xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ... xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13 xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4 xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3 xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3 xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3 xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13 xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ... ..... xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13 So the job_run() is clearly triggered twice for the same job, even though the first must have already signalled to completion during suspend. We can also see a CAT error after the re-submit. To prevent this only resubmit jobs on the pending_list that have not yet signalled. v2: - Make sure to re-arm the fence callbacks with sched_start(). v3 (Matt B): - Stop using drm_sched_resubmit_jobs(), which appears to be deprecated and just open-code a simple loop such that we skip calling run_job() on anything already signalled. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: William Tseng <william.tseng@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20250528113328.289392-2-matthew.auld@intel.com (cherry picked from commit 38fafa9f392f3110d2de431432d43f4eef99cd1b) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe: Rework eviction rejection of bound external bosThomas Hellström3-18/+105
For preempt_fence mode VM's we're rejecting eviction of shared bos during VM_BIND. However, since we do this in the move() callback, we're getting an eviction failure warning from TTM. The TTM callback intended for these things is eviction_valuable(). However, the latter doesn't pass in the struct ttm_operation_ctx needed to determine whether the caller needs this. Instead, attach the needed information to the vm under the vm->resv, until we've been able to update TTM to provide the needed information. And add sufficient lockdep checks to prevent misuse and races. v2: - Fix a copy-paste error in xe_vm_clear_validating() v3: - Fix kerneldoc errors. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 0af944f0e308 ("drm/xe: Reject BO eviction if BO is bound to current VM") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 9d5558649f68e2e84a87a909631b30e15ca0f8ec) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/vsec: fix CONFIG_INTEL_VSEC dependencyArnd Bergmann1-1/+2
The XE driver can be built with or without VSEC support, but fails to link as built-in if vsec is in a loadable module: x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init': (.text+0x1e83e16): undefined reference to `intel_vsec_register' The normal fix for this is to add a 'depends on INTEL_VSEC || !INTEL_VSEC', forcing XE to be a loadable module as well, but that causes a circular dependency: symbol DRM_XE depends on INTEL_VSEC symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES symbol X86_PLATFORM_DEVICES is selected by DRM_XE The problem here is selecting a symbol from another subsystem, so change that as well and rephrase the 'select' into the corresponding dependency. Since X86_PLATFORM_DEVICES is 'default y', there is no change to defconfig builds here. Fixes: 0c45e76fcc62 ("drm/xe/vsec: Support BMG devices") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit e4931f8be347ec5f19df4d6d33aea37145378c42) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe: drop redundant conversion to boolRaag Jadav1-1/+1
The result of integer comparison already evaluates to bool. No need for explicit conversion. No functional impact. Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/ Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 61761a6b57f2818983466d24aab60baab471ba21) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/hwmon: Move card reactive critical power under channel cardKarthik Poosa1-3/+3
Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this represents the entire card critical power/current. v2: Update the date of curr1_crit also in hwmon documentation. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 25e963a09e059ffdb15c09cc79cfded855b43668) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/hwmon: Add support to manage power limits though mailboxKarthik Poosa8-106/+318
Add support to manage power limits using pcode mailbox commands for supported platforms. v2: - Address review comments. (Badal) - Use mailbox commands instead of registers to manage power limits for BMG. - Clamp the maximum power limit to GPU firmware default value. v3: - Clamp power limit in write also for platforms with mailbox support. v4: - Remove unnecessary debug prints. (Badal) v5: - Update description of variable pl1_on_boot to fix kernel-doc error. v6: - Improve commit message, refer to BIOS as GPU firmware. - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW. - Rectify drm_warn to drm_info. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 7596d839f6228757fe17a810da2d1c5f3305078c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/vm: move xe_svm_init() earlierMatthew Auld1-7/+12
In xe_vm_close_and_put() we need to be able to call xe_svm_fini(), however during vm creation we can call this on the error path, before having actually initialised the svm state, leading to various splats followed by a fatal NPD. Fixes: 6fd979c2f331 ("drm/xe: Add SVM init / close / fini to faulting VMs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com (cherry picked from commit 4f296d77cf49fcb5f90b4674123ad7f3a0676165) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
9 daysdrm/xe/vm: move rebind_work init earlierMatthew Auld1-4/+4
In xe_vm_close_and_put() we need to be able to call flush_work(rebind_work), however during vm creation we can call this on the error path, before having actually set up the worker, leading to a splat from flush_work(). It looks like we can simply move the worker init step earlier to fix this. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com (cherry picked from commit 96af397aa1a2d1032a6e28ff3f4bc0ab4be40e1d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
10 daysdrm/panel-simple: fix the warnings for the Evervision VGG644804Michael Walle1-2/+3
The panel lacked the connector type which causes a warning. Adding the connector type reveals wrong bus_flags and bits per pixel. Fix all of it. Fixes: 1319f2178bdf ("drm/panel-simple: add Evervision VGG644804 panel entry") Signed-off-by: Michael Walle <mwalle@kernel.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250520074110.655114-1-mwalle@kernel.org
10 daysMerge tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pciLinus Torvalds13-15/+12
Pull pci updates from Bjorn Helgaas: "Enumeration: - Print the actual delay time in pci_bridge_wait_for_secondary_bus() instead of assuming it was 1000ms (Wilfred Mallawa) - Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI devices', which broke resume from system sleep on AMD platforms and has been fixed by other commits (Lukas Wunner) Resource management: - Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated and unnecessary (Philipp Stanner) - Remove pcim_iounmap_regions() and pcim_request_region_exclusive() and related flags since all uses have been removed (Philipp Stanner) - Rework devres 'request' functions so they are no longer 'hybrid', i.e., their behavior no longer depends on whether pcim_enable_device or pci_enable_device() was used, and remove related code (Philipp Stanner) - Warn (not BUG()) about failure to assign optional resources (Ilpo Järvinen) Error handling: - Log the DPC Error Source ID only when it's actually valid (when ERR_FATAL or ERR_NONFATAL was received from a downstream device) and decode into bus/device/function (Bjorn Helgaas) - Determine AER log level once and save it so all related messages use the same level (Karolina Stolarek) - Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable Errors (Karolina Stolarek) - Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs controls on interval and burst count, to avoid flooding logs and RCU stall warnings (Jon Pan-Doh) Power management: - Increment PM usage counter when probing reset methods so we don't try to read config space of a powered-off device (Alex Williamson) - Set all devices to D0 during enumeration to ensure ACPI opregion is connected via _REG (Mario Limonciello) Power control: - Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match the filename paths. Retain old deprecated symbols for compatibility, except for the pwrctrl slot driver (PCI_PWRCTRL_SLOT) (Johan Hovold) - When unregistering pwrctrl, cancel outstanding rescan work before cleaning up data structures to avoid use-after-free issues (Brian Norris) Bandwidth control: - Simplify link bandwidth controller by replacing the count of Link Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag (Ilpo Järvinen) - Update the Link Speed after retraining, since the Link Speed may have changed (Ilpo Järvinen) PCIe native device hotplug: - Ignore Presence Detect Changed caused by DPC. pciehp already ignores Link Down/Up events caused by DPC, but on slots using in-band presence detect, DPC causes a spurious Presence Detect Changed event (Lukas Wunner) - Ignore Link Down/Up caused by Secondary Bus Reset. On hotplug ports using in-band presence detect, the reset causes a Presence Detect Changed event, which mistakenly caused teardown and re-enumeration of the device. Drivers may need to annotate code that resets their device (Lukas Wunner) Virtualization: - Add an ACS quirk for Loongson Root Ports that don't advertise ACS but don't allow peer-to-peer transactions between Root Ports; the quirk allows each Root Port to be in a separate IOMMU group (Huacai Chen) Endpoint framework: - For fixed-size BARs, retain both the actual size and the possibly larger size allocated to accommodate iATU alignment requirements (Jerome Brunet) - Simplify ctrl/SPAD space allocation and avoid allocating more space than needed (Jerome Brunet) - Correct MSI-X PBA offset calculations for DesignWare and Cadence endpoint controllers (Niklas Cassel) - Align the return value (number of interrupts) encoding for pci_epc_get_msi()/pci_epc_ops::get_msi() and pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel) - Align the nr_irqs parameter encoding for pci_epc_set_msi()/pci_epc_ops::set_msi() and pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel) Common host controller library: - Convert pci-host-common to a library so platforms that don't need native host controller drivers don't need to include these helper functions (Manivannan Sadhasivam) Apple PCIe controller driver: - Extract ECAM bridge creation helper from pci_host_common_probe() to separate driver-specific things like MSI from PCI things (Marc Zyngier) - Dynamically allocate RID-to_SID bitmap to prepare for SoCs with varying capabilities (Marc Zyngier) - Skip ports disabled in DT when setting up ports (Janne Grunau) - Add t6020 compatible string (Alyssa Rosenzweig) - Add T602x PCIe support (Hector Martin) - Directly set/clear INTx mask bits because T602x dropped the accessors that could do this without locking (Marc Zyngier) - Move port PHY registers to their own reg items to accommodate T602x, which moves them around; retain default offsets for existing DTs that lack phy%d entries with the reg offsets (Hector Martin) - Stop polling for core refclk, which doesn't work on T602x and the bootloader has already done anyway (Hector Martin) - Use gpiod_set_value_cansleep() when asserting PERST# in probe because we're allowed to sleep there (Hector Martin) Cadence PCIe controller driver: - Drop a runtime PM 'put' to resolve a runtime atomic count underflow (Hans Zhang) - Make the cadence core buildable as a module (Kishon Vijay Abraham I) - Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by loadable drivers when they are removed (Siddharth Vadapalli) Freescale i.MX6 PCIe controller driver: - Apply link training workaround only on IMX6Q, IMX6SX, IMX6SP (Richard Zhu) - Remove redundant dw_pcie_wait_for_link() from imx_pcie_start_link(); since the DWC core does this, imx6 only needs it when retraining for a faster link speed (Richard Zhu) - Toggle i.MX95 core reset to align with PHY powerup (Richard Zhu) - Set SYS_AUX_PWR_DET to work around i.MX95 ERR051624 erratum: in some cases, the controller can't exit 'L23 Ready' through Beacon or PERST# deassertion (Richard Zhu) - Clear GEN3_ZRXDC_NONCOMPL to work around i.MX95 ERR051586 erratum: controller can't meet 2.5 GT/s ZRX-DC timing when operating at 8 GT/s, causing timeouts in L1 (Richard Zhu) - Wait for i.MX95 PLL lock before enabling controller (Richard Zhu) - Save/restore i.MX95 LUT for suspend/resume (Richard Zhu) Mobiveil PCIe controller driver: - Return bool (not int) for link-up check in mobiveil_pab_ops.link_up() and layerscape-gen4, mobiveil (Hans Zhang) NVIDIA Tegra194 PCIe controller driver: - Create debugfs directory for 'aspm_state_cnt' only when CONFIG_PCIEASPM is enabled, since there are no other entries (Hans Zhang) Qualcomm PCIe controller driver: - Add OF support for parsing DT 'eq-presets-<N>gts' property for lane equalization presets (Krishna Chaitanya Chundru) - Read Maximum Link Width from the Link Capabilities register if DT lacks 'num-lanes' property (Krishna Chaitanya Chundru) - Add Physical Layer 64 GT/s Capability ID and register offsets for 8, 32, and 64 GT/s lane equalization registers (Krishna Chaitanya Chundru) - Add generic dwc support for configuring lane equalization presets (Krishna Chaitanya Chundru) - Add DT and driver support for PCIe on IPQ5018 SoC (Nitheesh Sekar) Renesas R-Car PCIe controller driver: - Describe endpoint BAR 4 as being fixed size (Jerome Brunet) - Document how to obtain R-Car V4H (r8a779g0) controller firmware (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Reorder rockchip_pci_core_rsts because reset_control_bulk_deassert() deasserts in reverse order, to fix a link training regression (Jensen Huang) - Mark RK3399 as being capable of raising INTx interrupts (Niklas Cassel) Rockchip DesignWare PCIe controller driver: - Check only PCIE_LINKUP, not LTSSM status, to determine whether the link is up (Shawn Lin) - Increase N_FTS (used in L0s->L0 transitions) and enable ASPM L0s for Root Complex and Endpoint modes (Shawn Lin) - Hide the broken ATS Capability in rockchip_pcie_ep_init() instead of rockchip_pcie_ep_pre_init() so it stays hidden after PERST# resets non-sticky registers (Shawn Lin) - Call phy_power_off() before phy_exit() in rockchip_pcie_phy_deinit() (Diederik de Haas) Synopsys DesignWare PCIe controller driver: - Set PORT_LOGIC_LINK_WIDTH to one lane to make initial link training more robust; this will not affect the intended link width if all lanes are functional (Wenbin Yao) - Return bool (not int) for link-up check in dw_pcie_ops.link_up() and armada8k, dra7xx, dw-rockchip, exynos, histb, keembay, keystone, kirin, meson, qcom, qcom-ep, rcar_gen4, spear13xx, tegra194, uniphier, visconti (Hans Zhang) - Add debugfs support for exposing DWC device-specific PTM context (Manivannan Sadhasivam) TI J721E PCIe driver: - Make j721e buildable as a loadable and removable module (Siddharth Vadapalli) - Fix j721e host/endpoint dependencies that result in link failures in some configs (Arnd Bergmann) Device tree bindings: - Add qcom DT binding for 'global' interrupt (PCIe controller and link-specific events) for ipq8074, ipq8074-gen3, ipq6018, sa8775p, sc7280, sc8180x sdm845, sm8150, sm8250, sm8350 (Manivannan Sadhasivam) - Add qcom DT binding for 8 MSI SPI interrupts for msm8998, ipq8074, ipq8074-gen3, ipq6018 (Manivannan Sadhasivam) - Add dw rockchip DT binding for rk3576 and rk3562 (Kever Yang) - Correct indentation and style of examples in brcm,stb-pcie, cdns,cdns-pcie-ep, intel,keembay-pcie-ep, intel,keembay-pcie, microchip,pcie-host, rcar-pci-ep, rcar-pci-host, xilinx-versal-cpm (Krzysztof Kozlowski) - Convert Marvell EBU (dove, kirkwood, armada-370, armada-xp) and armada8k from text to schema DT bindings (Rob Herring) - Remove obsolete .txt DT bindings for content that has been moved to schemas (Rob Herring) - Add qcom DT binding for MHI registers in IPQ5332, IPQ6018, IPQ8074 and IPQ9574 (Varadarajan Narayanan) - Convert v3,v360epc-pci from text to DT schema binding (Rob Herring) - Change microchip,pcie-host DT binding to be 'dma-noncoherent' since PolarFire may be configured that way (Conor Dooley) Miscellaneous: - Drop 'pci' suffix from intel_mid_pci.c filename to match similar files (Andy Shevchenko) - All platforms with PCI have an MMU, so add PCI Kconfig dependency on MMU to simplify build testing and avoid inadvertent build regressions (Arnd Bergmann) - Update Krzysztof Wilczyński's email address in MAINTAINERS (Krzysztof Wilczyński) - Update Manivannan Sadhasivam's email address in MAINTAINERS (Manivannan Sadhasivam)" * tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (147 commits) MAINTAINERS: Update Manivannan Sadhasivam email address PCI: j721e: Fix host/endpoint dependencies PCI: j721e: Add support to build as a loadable module PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup PCI: cadence: Add support to build pcie-cadence library as a kernel module MAINTAINERS: Update Krzysztof Wilczyński email address PCI: Remove unnecessary linesplit in __pci_setup_bridge() PCI: WARN (not BUG()) when we fail to assign optional resources PCI: Remove unused pci_printk() PCI: qcom: Replace PERST# sleep time with proper macro PCI: dw-rockchip: Replace PERST# sleep time with proper macro PCI: host-common: Convert to library for host controller drivers PCI/ERR: Remove misleading TODO regarding kernel panic PCI: cadence: Remove duplicate message code definitions PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding PCI: cadence-ep: Correct PBA offset in .set_msix() callback ...
10 daysdrm/ttm: Fix compile error when CONFIG_SHMEM is not setSteven Rostedt1-0/+1
When CONFIG_SHMEM is not set, the following compiler error occurs: ld: vmlinux.o: in function `ttm_backup_backup_page': (.text+0x10363bc): undefined reference to `shmem_writeout' make[3]: *** [/work/build/trace/nobackup/linux.git/scripts/Makefile.vmlinux:91: vmlinux.unstripped] Error 1 This is due to the replacement of writepage and calling swap_writeout() and shmem_writeout() directly. The issue is that when CONFIG_SHMEM is not defined, shmem_writeout() is also not defined. The function ttm_backup_backup_page() called mapping->a_ops->writepage() which was then changed to call shmem_writeout() directly. Even before commit 84798514db50 ("mm: Remove swap_writepage() and shmem_writepage()"), it didn't make sense to call anything other than shmem_writeout() as the ttm_backup deals only with shmem folios. Have DRM_TTM config option select SHMEM to guarantee that shmem_writeout() is available. Link: https://lore.kernel.org/all/20250602170500.48713a2b@gandalf.local.home/ Suggested-by: Hugh Dickins <hughd@google.com> Fixes: 84798514db50 ("mm: Remove swap_writepage() and shmem_writepage()") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 daysdrm/i915/display: Fix u32 overflow in SNPS PHY HDMI PLL setupDibin Moolakadan Subrahmanian1-8/+8
When configuring the HDMI PLL, calculations use DIV_ROUND_UP_ULL and DIV_ROUND_DOWN_ULL macros, which internally rely on do_div. However, do_div expects a 32-bit (u32) divisor, and at higher data rates, the divisor can exceed this limit. This leads to incorrect division results and ultimately misconfigured PLL values. This fix replaces do_div calls with div64_base64 calls where diviser can exceed u32 limit. Fixes: 5947642004bf ("drm/i915/display: Add support for SNPS PHY HDMI PLL algorithm for DG2") Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Dibin Moolakadan Subrahmanian <dibin.moolakadan.subrahmanian@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250528064557.4172149-1-dibin.moolakadan.subrahmanian@intel.com (cherry picked from commit ce924116e43ffbfa544d82976c4b9d11bcde9334) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
11 daysdrm/amd/display: Fix default DC and AC levelsMario Limonciello1-2/+2
[Why] DC and AC levels are advertised in a percentage, not a luminance. [How] Scale DC and AC levels to supported values. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4221 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amd/display: Add debugging message for brightness capsMario Limonciello1-1/+6
[Why] Default BIOS brightness caps are buried in ACPI. [How] Add extra dynamic debug that can show default brightness caps. Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amd/display: Use DC log instead of using DM error msgCruise Hung1-1/+1
[Why & How] It sent an error msg when it failed to read the DP tunneling DPCD field. This should just be a warning msg. Use a DC log instead of a DM error msg. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Cruise Hung <Cruise.Hung@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amd/display: Avoid calling blank_stream() twiceZhongwei Zhang3-2/+13
[Why] We've made fix for garbage in dcn31_reset_back_end_for_pipe(), adding blank_stream() before disable_crtc(). And set_dpms_off() will call blank_stream() again. [How] Add flag to avoid calling blank_stream() twice. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amd/display: Correct non-OLED pre_T11_delay.Zhongwei Zhang1-3/+4
[Why] Only OLED panels require non-zero pre_T11_delay defaultly. Others should be controlled by power sequence. [How] For non OLED, pre_T11_delay delay in code should be zero. Also post_T7_delay. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Zhongwei Zhang <Zhongwei.Zhang@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amdgpu: Add userq fence support to SDMAv7.0Arunpravin Paneer Selvam3-23/+120
- Add userq fence support to SDMAv7.0. - GFX12's user fence irq src id differs from GFX11's, hence we need create a new irq srcid header file for GFX12. User fence irq src id information- GFX11 and SDMA6.0 - 0x43 GFX12 and SDMA7.0 - 0x46 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amdgpu: Fix integer overflow in amdgpu_gem_add_input_fence()Dan Carpenter1-1/+1
The "num_syncobj_handles" is a u32 value that comes from the user via the ioctl. On 32bit systems the "sizeof(uint32_t) * num_syncobj_handles" multiplication can have an integer overflow. Use size_mul() to fix that. Fixes: 38c67ec9aa4b ("drm/amdgpu: Add input fence to sync bo map/unmap") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amdgpu: Fix integer overflow issues in amdgpu_userq_fence.cDan Carpenter1-4/+4
This patch only affects 32bit systems. There are several integer overflows bugs here but only the "sizeof(u32) * num_syncobj" multiplication is a problem at runtime. (The last lines of this patch). These variables are u32 variables that come from the user. The issue is the multiplications can overflow leading to us allocating a smaller buffer than intended. For the first couple integer overflows, the syncobj_handles = memdup_user() allocation is immediately followed by a kmalloc_array(): syncobj = kmalloc_array(num_syncobj_handles, sizeof(*syncobj), GFP_KERNEL); In that situation the kmalloc_array() works as a bounds check and we haven't accessed the syncobj_handlesp[] array yet so the integer overflow is harmless. But the "num_syncobj" multiplication doesn't have that and the integer overflow could lead to an out of bounds access. Fixes: a292fdecd728 ("drm/amdgpu: Implement userqueue signal/wait IOCTL") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amdgpu: disable workload profile switching when OD is enabledAlex Deucher3-0/+31
Users have reported that they have to reduce the level of undervolting to acheive stability when dynamic workload profiles are enabled on GC 10.3.x. Disable dynamic workload profiles if the user has enabled OD. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4262 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.15.x
11 daysdrm/amdgpu/gfx10: Refine Cleaner Shader for GFX10.1.10Vitaly Prosyak2-10/+9
This patch updates the cleaner shader, which is responsible for initializing GPU resources such as Local Data Share (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs). Changes include adjustments to register clearing and shader configuration. - Updated GPU resource initialization addresses in the cleaner shader from `be803080` to `be803000`. - Simplified the logic in the SGPR clearing section, ensuring all SGPRs are set to zero. Fixes: 25961bad9212 ("drm/amdgpu/gfx10: Add cleaner shader for GFX10.1.10") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Manu Rastogi <manu.rastogi@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amdgpu: Add more checks to discovery fetchLijo Lazar1-3/+13
Add more checks for valid vram size and log error, if any. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 daysdrm/amdkfd: enable kfd on RISCV systemsXuemei Liu1-1/+1
KFD has been confirmed that can run on RISCV systems. It's necessary to support CONFIG_HSA_AMD on RISCV. Signed-off-by: Xuemei Liu <liu.xuemei1@zte.com.cn> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 daysMerge tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linuxLinus Torvalds1-97/+11
Pull bitmap updates from Yury Norov: - dead code cleanups for cpumasks and nodemasks (me) - fixed-width flavors of GENMASK() and BIT() (Vincent, Lucas and me) - FIELD_MODIFY() helper (Luo) - for_each_node_with_cpus() optimization (me) - bitmap-str fixes (Andy) * tag 'bitmap-for-6.16-rc1' of https://github.com/norov/linux: topology: make for_each_node_with_cpus() O(N) bitfield: Add FIELD_MODIFY() helper bitmap-str: Add missing header(s) bitmap-str: Get rid of 'extern' for function prototypes build_bug.h: more user friendly error messages in BUILD_BUG_ON_ZERO() test_bits: add tests for BIT_U*() test_bits: add tests for GENMASK_U*() drm/i915: Convert REG_GENMASK*() to fixed-width GENMASK_U*() bits: introduce fixed-type BIT_U*() bits: introduce fixed-type GENMASK_U*() bits: add comments and newlines to #if, #else and #endif directives cpumask: drop cpumask_assign_cpu() riscv: switch set_icache_stale_mask() to using non-atomic assign_cpu() cpumask: add non-atomic __assign_cpu() nodemask: drop nodes_shift
13 daysdrm/i915/guc: Handle race condition where wakeref count drops below 0Jesus Narvaez1-3/+14
There is a rare race condition when preparing for a reset where guc_lrc_desc_unpin() could be in the process of deregistering a context while a different thread is scrubbing outstanding contexts and it alters the context state and does a wakeref put. Then, if there is a failure with deregister_context(), a second wakeref put could occur. As a result the wakeref count could drop below 0 and fail an INTEL_WAKEREF_BUG_ON() check. Therefore if there is a failure with deregister_context(), undo the context state changes and do a wakeref put only if the context was set to be destroyed earlier. v2: Expand comment to better explain change. (Daniele) v3: Removed addition to the original comment. (Daniele) Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss") Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Mousumi Jana <mousumi.jana@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250528230551.1855177-1-jesus.narvaez@intel.com (cherry picked from commit f36a75aba1c3176d177964bca76f86a075d2943a) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
13 daysdrm/i915/psr: Fix using wrong mask in REG_FIELD_PREPJouni Högander1-2/+2
Wrong mask is used in PORT_ALPM_LFPS_CTL_FIRST_LFPS_HALF_CYCLE_DURATION and PORT_ALPM_LFPS_CTL_LAST_LFPS_HALF_CYCLE_DURATION. Fixes: 295099580f04 ("drm/i915/psr: Add missing ALPM AUX-Less register definitions") Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://lore.kernel.org/r/20250526120512.1702815-12-jouni.hogander@intel.com (cherry picked from commit 8097128a40ff378761034ec72cdbf6f46e466dc0) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
13 daysdrm/i915/guc: Check if expecting reply before decrementing outstanding_submission_g2hJesus Narvaez1-1/+1
When sending a H2G message where a reply is expected in guc_submission_send_busy_loop(), outstanding_submission_g2h is incremented before the send. However, if there is an error sending the message, outstanding_submission_g2h is decremented without checking if a reply is expected. Therefore, check if reply is expected when there is a failure before decrementing outstanding_submission_g2h. Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss") Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Mousumi Jana <mousumi.jana@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250514225224.4142684-1-jesus.narvaez@intel.com (cherry picked from commit a6a26786f22a4ab0227bcf610510c4c9c2df0808) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
13 daysMerge tag 'amd-drm-fixes-6.16-2025-05-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-nextDave Airlie88-347/+1533
amd-drm-fixes-6.16-2025-05-29: amdgpu: - UserQ fixes - SMU 13.x fixes - VCN fixes - JPEG fixes - Misc cleanups - runtime pm fix - DCN 4.0.1 fixes - Misc display fixes - ISP fix - VRAM manager fix - RAS fixes amdkfd: - SVM fix - Misc cleanups - Ref leak fix - WPTR BO fix radeon: - Misc cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250529205215.6790-1-alexander.deucher@amd.com
2025-05-31Merge tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds1-2/+2
Pull non-MM updates from Andrew Morton: - "hung_task: extend blocking task stacktrace dump to semaphore" from Lance Yang enhances the hung task detector. The detector presently dumps the blocking tasks's stack when it is blocked on a mutex. Lance's series extends this to semaphores - "nilfs2: improve sanity checks in dirty state propagation" from Wentao Liang addresses a couple of minor flaws in nilfs2 - "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn fixes a couple of issues in the gdb scripts - "Support kdump with LUKS encryption by reusing LUKS volume keys" from Coiby Xu addresses a usability problem with kdump. When the dump device is LUKS-encrypted, the kdump kernel may not have the keys to the encrypted filesystem. A full writeup of this is in the series [0/N] cover letter - "sysfs: add counters for lockups and stalls" from Max Kellermann adds /sys/kernel/hardlockup_count and /sys/kernel/hardlockup_count and /sys/kernel/rcu_stall_count - "fork: Page operation cleanups in the fork code" from Pasha Tatashin implements a number of code cleanups in fork.c - "scripts/gdb/symbols: determine KASLR offset on s390 during early boot" from Ilya Leoshkevich fixes some s390 issues in the gdb scripts * tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (67 commits) llist: make llist_add_batch() a static inline delayacct: remove redundant code and adjust indentation squashfs: add optional full compressed block caching crash_dump, nvme: select CONFIGFS_FS as built-in scripts/gdb/symbols: determine KASLR offset on s390 during early boot scripts/gdb/symbols: factor out pagination_off() scripts/gdb/symbols: factor out get_vmlinux() kernel/panic.c: format kernel-doc comments mailmap: update and consolidate Casey Connolly's name and email nilfs2: remove wbc->for_reclaim handling fork: define a local GFP_VMAP_STACK fork: check charging success before zeroing stack fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code fork: clean-up ifdef logic around stack allocation kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count x86/crash: make the page that stores the dm crypt keys inaccessible x86/crash: pass dm crypt keys to kdump kernel Revert "x86/mm: Remove unused __set_memory_prot()" crash_dump: retrieve dm crypt keys in kdump kernel ...
2025-05-31Merge tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds1-2/+2
Pull MM updates from Andrew Morton: - "Add folio_mk_pte()" from Matthew Wilcox simplifies the act of creating a pte which addresses the first page in a folio and reduces the amount of plumbing which architecture must implement to provide this. - "Misc folio patches for 6.16" from Matthew Wilcox is a shower of largely unrelated folio infrastructure changes which clean things up and better prepare us for future work. - "memory,x86,acpi: hotplug memory alignment advisement" from Gregory Price adds early-init code to prevent x86 from leaving physical memory unused when physical address regions are not aligned to memory block size. - "mm/compaction: allow more aggressive proactive compaction" from Michal Clapinski provides some tuning of the (sadly, hard-coded (more sadly, not auto-tuned)) thresholds for our invokation of proactive compaction. In a simple test case, the reduction of a guest VM's memory consumption was dramatic. - "Minor cleanups and improvements to swap freeing code" from Kemeng Shi provides some code cleaups and a small efficiency improvement to this part of our swap handling code. - "ptrace: introduce PTRACE_SET_SYSCALL_INFO API" from Dmitry Levin adds the ability for a ptracer to modify syscalls arguments. At this time we can alter only "system call information that are used by strace system call tampering, namely, syscall number, syscall arguments, and syscall return value. This series should have been incorporated into mm.git's "non-MM" branch, but I goofed. - "fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions" from Andrei Vagin extends the info returned by the PAGEMAP_SCAN ioctl against /proc/pid/pagemap. This permits CRIU to more efficiently get at the info about guard regions. - "Fix parameter passed to page_mapcount_is_type()" from Gavin Shan implements that fix. No runtime effect is expected because validate_page_before_insert() happens to fix up this error. - "kernel/events/uprobes: uprobe_write_opcode() rewrite" from David Hildenbrand basically brings uprobe text poking into the current decade. Remove a bunch of hand-rolled implementation in favor of using more current facilities. - "mm/ptdump: Drop assumption that pxd_val() is u64" from Anshuman Khandual provides enhancements and generalizations to the pte dumping code. This might be needed when 128-bit Page Table Descriptors are enabled for ARM. - "Always call constructor for kernel page tables" from Kevin Brodsky ensures that the ctor/dtor is always called for kernel pgtables, as it already is for user pgtables. This permits the addition of more functionality such as "insert hooks to protect page tables". This change does result in various architectures performing unnecesary work, but this is fixed up where it is anticipated to occur. - "Rust support for mm_struct, vm_area_struct, and mmap" from Alice Ryhl adds plumbing to permit Rust access to core MM structures. - "fix incorrectly disallowed anonymous VMA merges" from Lorenzo Stoakes takes advantage of some VMA merging opportunities which we've been missing for 15 years. - "mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE" from SeongJae Park optimizes process_madvise()'s TLB flushing. Instead of flushing each address range in the provided iovec, we batch the flushing across all the iovec entries. The syscall's cost was approximately halved with a microbenchmark which was designed to load this particular operation. - "Track node vacancy to reduce worst case allocation counts" from Sidhartha Kumar makes the maple tree smarter about its node preallocation. stress-ng mmap performance increased by single-digit percentages and the amount of unnecessarily preallocated memory was dramaticelly reduced. - "mm/gup: Minor fix, cleanup and improvements" from Baoquan He removes a few unnecessary things which Baoquan noted when reading the code. - ""Enhance sysfs handling for memory hotplug in weighted interleave" from Rakie Kim "enhances the weighted interleave policy in the memory management subsystem by improving sysfs handling, fixing memory leaks, and introducing dynamic sysfs updates for memory hotplug support". Fixes things on error paths which we are unlikely to hit. - "mm/damon: auto-tune DAMOS for NUMA setups including tiered memory" from SeongJae Park introduces new DAMOS quota goal metrics which eliminate the manual tuning which is required when utilizing DAMON for memory tiering. - "mm/vmalloc.c: code cleanup and improvements" from Baoquan He provides cleanups and small efficiency improvements which Baoquan found via code inspection. - "vmscan: enforce mems_effective during demotion" from Gregory Price changes reclaim to respect cpuset.mems_effective during demotion when possible. because presently, reclaim explicitly ignores cpuset.mems_effective when demoting, which may cause the cpuset settings to violated. This is useful for isolating workloads on a multi-tenant system from certain classes of memory more consistently. - "Clean up split_huge_pmd_locked() and remove unnecessary folio pointers" from Gavin Guo provides minor cleanups and efficiency gains in in the huge page splitting and migrating code. - "Use kmem_cache for memcg alloc" from Huan Yang creates a slab cache for `struct mem_cgroup', yielding improved memory utilization. - "add max arg to swappiness in memory.reclaim and lru_gen" from Zhongkun He adds a new "max" argument to the "swappiness=" argument for memory.reclaim MGLRU's lru_gen. This directs proactive reclaim to reclaim from only anon folios rather than file-backed folios. - "kexec: introduce Kexec HandOver (KHO)" from Mike Rapoport is the first step on the path to permitting the kernel to maintain existing VMs while replacing the host kernel via file-based kexec. At this time only memblock's reserve_mem is preserved. - "mm: Introduce for_each_valid_pfn()" from David Woodhouse provides and uses a smarter way of looping over a pfn range. By skipping ranges of invalid pfns. - "sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems" from Libo Chen removes a lot of pointless VMA scanning when a task is pinned a single NUMA mode. Dramatic performance benefits were seen in some real world cases. - "JFS: Implement migrate_folio for jfs_metapage_aops" from Shivank Garg addresses a warning which occurs during memory compaction when using JFS. - "move all VMA allocation, freeing and duplication logic to mm" from Lorenzo Stoakes moves some VMA code from kernel/fork.c into the more appropriate mm/vma.c. - "mm, swap: clean up swap cache mapping helper" from Kairui Song provides code consolidation and cleanups related to the folio_index() function. - "mm/gup: Cleanup memfd_pin_folios()" from Vishal Moola does that. - "memcg: Fix test_memcg_min/low test failures" from Waiman Long addresses some bogus failures which are being reported by the test_memcontrol selftest. - "eliminate mmap() retry merge, add .mmap_prepare hook" from Lorenzo Stoakes commences the deprecation of file_operations.mmap() in favor of the new file_operations.mmap_prepare(). The latter is more restrictive and prevents drivers from messing with things in ways which, amongst other problems, may defeat VMA merging. - "memcg: decouple memcg and objcg stocks"" from Shakeel Butt decouples the per-cpu memcg charge cache from the objcg's one. This is a step along the way to making memcg and objcg charging NMI-safe, which is a BPF requirement. - "mm/damon: minor fixups and improvements for code, tests, and documents" from SeongJae Park is yet another batch of miscellaneous DAMON changes. Fix and improve minor problems in code, tests and documents. - "memcg: make memcg stats irq safe" from Shakeel Butt converts memcg stats to be irq safe. Another step along the way to making memcg charging and stats updates NMI-safe, a BPF requirement. - "Let unmap_hugepage_range() and several related functions take folio instead of page" from Fan Ni provides folio conversions in the hugetlb code. * tag 'mm-stable-2025-05-31-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (285 commits) mm: pcp: increase pcp->free_count threshold to trigger free_high mm/hugetlb: convert use of struct page to folio in __unmap_hugepage_range() mm/hugetlb: refactor __unmap_hugepage_range() to take folio instead of page mm/hugetlb: refactor unmap_hugepage_range() to take folio instead of page mm/hugetlb: pass folio instead of page to unmap_ref_private() memcg: objcg stock trylock without irq disabling memcg: no stock lock for cpu hot-unplug memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs memcg: make count_memcg_events re-entrant safe against irqs memcg: make mod_memcg_state re-entrant safe against irqs memcg: move preempt disable to callers of memcg_rstat_updated memcg: memcg_rstat_updated re-entrant safe against irqs mm: khugepaged: decouple SHMEM and file folios' collapse selftests/eventfd: correct test name and improve messages alloc_tag: check mem_profiling_support in alloc_tag_init Docs/damon: update titles and brief introductions to explain DAMOS selftests/damon/_damon_sysfs: read tried regions directories in order mm/damon/tests/core-kunit: add a test for damos_set_filters_default_reject() mm/damon/paddr: remove unused variable, folio_list, in damon_pa_stat() mm/damon/sysfs-schemes: fix wrong comment on damons_sysfs_quota_goal_metric_strs ...
2025-05-31Merge tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-genericLinus Torvalds1-5/+1
Pull compiler version requirement update from Arnd Bergmann: "Require gcc-8 and binutils-2.30 x86 already uses gcc-8 as the minimum version, this changes all other architectures to the same version. gcc-8 is used is Debian 10 and Red Hat Enterprise Linux 8, both of which are still supported, and binutils 2.30 is the oldest corresponding version on those. Ubuntu Pro 18.04 and SUSE Linux Enterprise Server 15 both use gcc-7 as the system compiler but additionally include toolchains that remain supported. With the new minimum toolchain versions, a number of workarounds for older versions can be dropped, in particular on x86_64 and arm64. Importantly, the updated compiler version allows removing two of the five remaining gcc plugins, as support for sancov and structeak features is already included in modern compiler versions. I tried collecting the known changes that are possible based on the new toolchain version, but expect that more cleanups will be possible. Since this touches multiple architectures, I merged the patches through the asm-generic tree." * tag 'gcc-minimum-version-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: Makefile.kcov: apply needed compiler option unconditionally in CFLAGS_KCOV Documentation: update binutils-2.30 version reference gcc-plugins: remove SANCOV gcc plugin Kbuild: remove structleak gcc plugin arm64: drop binutils version checks raid6: skip avx512 checks kbuild: require gcc-8 and binutils-2.30
2025-05-29drm/amdkfd: Map wptr BO to GART unconditionallyLang Yu2-13/+13
For simulation C models that don't run CP FW where adev->mes.sched_version is not populated correctly. This causes NULL dereference in amdgpu_amdkfd_free_gtt_mem(dev->adev, (void **)&pqn->q->wptr_bo_gart) and warning on unpinned BO in amdgpu_bo_gpu_offset(q->properties.wptr_bo). Compared with adding version check here and there, always map wptr BO to GART simplifies things. v2: Add NULL check in amdgpu_amdkfd_free_gtt_mem.(Philip) Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29drm/amdgpu/mes: remove some unused functionsAlex Deucher2-67/+0
Nothing uses them so remove them. Leftover from MES bring up. Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-29drm/amdgpu/mes: add missing locking in helper functionsAlex Deucher1-0/+16
We need to take the MES lock. Reviewed-by: Michael Chen <michael.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org