aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-05-07drm/xe: Move xe_device_sysfs_init() to xe_device_probe()Raag Jadav3-11/+13
Since xe_device_sysfs_init() exposes device specific attributes, a better place for it is xe_device_probe(). Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250506054835.3395220-2-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-07drm/xe: Release force wake first then runtime powerShuicheng Lin1-4/+5
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for the release path, xe_force_wake_put() should be called first then xe_pm_runtime_put(). Combine the error path and normal path together with goto. Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-05-07drm/xe: Add config control for svm flush workShuicheng Lin3-3/+19
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below warning pops. Refine the flush work code to be controlled by the config to avoid below warning: " [ 453.132028] ------------[ cut here ]------------ [ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0 [ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec [ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full) [ 453.135405] Tainted: [U]=USER, [W]=WARN ... [ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0 [ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe [ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246 [ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000 [ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8 [ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000 [ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001 [ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00 [ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000 [ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0 [ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 453.147061] Call Trace: [ 453.147336] <TASK> [ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30 [ 453.148067] ? xas_load+0x9/0xb0 [ 453.148435] ? xa_load+0x6f/0xb0 [ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe] [ 453.149338] ? dev_printk_emit+0x48/0x70 [ 453.149762] ? _dev_printk+0x57/0x80 [ 453.150148] ? drm_ioctl+0x17c/0x440 [ 453.150544] ? __drm_dev_vprintk+0x36/0x90 [ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0 [ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.152560] drm_ioctl_kernel+0x9f/0xf0 [ 453.152968] drm_ioctl+0x20f/0x440 [ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100 [ 453.154489] ? memory_bm_test_bit+0x5/0x60 [ 453.154935] xe_drm_ioctl+0x47/0x70 [xe] [ 453.155419] __x64_sys_ioctl+0x8d/0xc0 [ 453.155824] do_syscall_64+0x47/0x110 [ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e " v2 (Matt): refine commit message to have more details add Fixes tag move the code to xe_svm.h which already have the config remove a blank line per codestyle suggestion Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com
2025-05-07drm/xe: Use copy_from_user() instead of __copy_from_user()Harish Chegondi6-17/+16
copy_from_user() has more checks and is more safer than __copy_from_user() Suggested-by: Kees Cook <kees@kernel.org> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/acabf20aa8621c7bc8de09b1bffb8d14b5376484.1746126614.git.harish.chegondi@intel.com
2025-05-05drm/xe/gsc: do not flush the GSC worker from the reset pathDaniele Ceraolo Spurio7-2/+44
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be as we need to do memory allocations in the gsc worker). Therefore, we can't flush the latter from the former. The reason why we had such a flush was to avoid interrupting either the GSC FW load or in progress GSC proxy operations. GSC proxy operations fall into 2 categories: 1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC. 2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will timeout and the driver request will be failed, but overall the GSC will keep working fine. Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something is already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that ensure us that all the load and init operations have been completed). Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut. v2: fix spelling in commit msg, rename waiter function (Julia) Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
2025-05-01drm/xe: Do not print timedout job message on killed exec queuesMatthew Brost1-3/+6
If a user ctrl-c an app while something is running on the GPU, jobs are expected to timeout. Do not spam dmesg with timedout job messages in this case. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250428175505.935694-1-matthew.brost@intel.com
2025-05-01drm/xe: fix devcoredump chunk alignmnent calculationArnd Bergmann1-4/+9
The device core dumps are copied in 1.5GB chunks, which leads to a link-time error on 32-bit builds because of the 64-bit division not getting trivially turned into mask and shift operations: ERROR: modpost: "__moddi3" [drivers/gpu/drm/xe/xe.ko] undefined! On top of this, I noticed that the ALIGN_DOWN() usage here cannot work because that is only defined for power-of-two alignments. Change ALIGN_DOWN into an explicit div_u64_rem() that avoids the link error and hopefully produces the right results. Doing a 1.5GB kvmalloc() does seem a bit suspicious as well, e.g. this will clearly fail on any 32-bit platform and is also likely to run out of memory on 64-bit systems under memory pressure, so using a much smaller power-of-two chunk size might be a good idea instead. v2: - Always call div_u64_rem (Matt) Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202504251238.JsNgFeFc-lkp@intel.com/ Fixes: c4a2e5f865b7 ("drm/xe: Add devcoredump chunking") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250501012545.1045247-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-04-29drm/xe/vf: Fix guc_info debugfs for VFsDaniele Ceraolo Spurio1-21/+23
The guc_info debugfs attempts to read a bunch of registers that the VFs doesn't have access to, so fix it by skipping the reads. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4775 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Lukasz Laguna <lukasz.laguna@intel.com> Link: https://lore.kernel.org/r/20250423173908.1571412-1-daniele.ceraolospurio@intel.com
2025-04-29drm/gpusvm: set has_dma_mapping inside mapping loopDafna Hirschfeld1-1/+1
The 'has_dma_mapping' flag should be set once there is a mapping so it could be unmapped in case of error. v2: - Resend for CI Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250428024752.881292-1-matthew.brost@intel.com
2025-04-29drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regsTejas Upadhyay1-2/+5
LNCF registers report wrong values when XE_FORCEWAKE_GT only is held. Holding XE_FORCEWAKE_ALL ensures correct operations on LNCF regs. V2(Himal): - Use xe_force_wake_ref_has_domain Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999 Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit") Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-04-27drm/xe: Drop force_alloc from xe_bo_evict in selftestsMatthew Brost3-3/+3
The force_alloc flag was removed from TTM / Xe but updating the selftests to new function interfaces was missed. Remove argument from xe_bo_evict in selftests. v2: - Fix dma-buf, migrate selftests (CI) Fixes: 55df7c0c62c1 ("drm/ttm/xe: drop unused force_alloc flag") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Link: https://lore.kernel.org/r/20250428022318.877860-1-matthew.brost@intel.com
2025-04-25drm/xe/eustall: Do not support EU stall on SRIOV VFHarish Chegondi2-1/+5
EU stall sampling is not supported on SRIOV VF. Do not initialize or open EU stall stream on SRIOV VF. Fixes: 9a0b11d4cf3b ("drm/xe/eustall: Add support to init, enable and disable EU stall sampling") Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/10db5d1c7e17aadca7078ff74575b7ffc0d5d6b8.1745215022.git.harish.chegondi@intel.com
2025-04-25drm/xe/eustall: Resolve a possible circular locking dependencyHarish Chegondi1-2/+9
Use a separate lock in the polling function eu_stall_data_buf_poll() instead of eu_stall->stream_lock. This would prevent a possible circular locking dependency leading to a deadlock as described below. This would also require additional locking with the new lock in the read function. <4> [787.192986] ====================================================== <4> [787.192988] WARNING: possible circular locking dependency detected <4> [787.192991] 6.14.0-rc7-xe+ #1 Tainted: G U <4> [787.192993] ------------------------------------------------------ <4> [787.192994] xe_eu_stall/20093 is trying to acquire lock: <4> [787.192996] ffff88819847e2c0 ((work_completion) (&(&stream->buf_poll_work)->work)), at: __flush_work+0x1f8/0x5e0 <4> [787.193005] but task is already holding lock: <4> [787.193007] ffff88814ce83ba8 (&gt->eu_stall->stream_lock){3:3}, at: xe_eu_stall_stream_ioctl+0x41/0x6a0 [xe] <4> [787.193090] which lock already depends on the new lock. <4> [787.193093] the existing dependency chain (in reverse order) is: <4> [787.193095] -> #1 (&gt->eu_stall->stream_lock){+.+.}-{3:3}: <4> [787.193099] __mutex_lock+0xb4/0xe40 <4> [787.193104] mutex_lock_nested+0x1b/0x30 <4> [787.193106] eu_stall_data_buf_poll_work_fn+0x44/0x1d0 [xe] <4> [787.193155] process_one_work+0x21c/0x740 <4> [787.193159] worker_thread+0x1db/0x3c0 <4> [787.193161] kthread+0x10d/0x270 <4> [787.193164] ret_from_fork+0x44/0x70 <4> [787.193168] ret_from_fork_asm+0x1a/0x30 <4> [787.193172] -> #0 ((work_completion)(&(&stream->buf_poll_work)->work)){+.+.}-{0:0}: <4> [787.193176] __lock_acquire+0x1637/0x2810 <4> [787.193180] lock_acquire+0xc9/0x300 <4> [787.193183] __flush_work+0x219/0x5e0 <4> [787.193186] cancel_delayed_work_sync+0x87/0x90 <4> [787.193189] xe_eu_stall_disable_locked+0x9a/0x260 [xe] <4> [787.193237] xe_eu_stall_stream_ioctl+0x5b/0x6a0 [xe] <4> [787.193285] __x64_sys_ioctl+0xa4/0xe0 <4> [787.193289] x64_sys_call+0x131e/0x2650 <4> [787.193292] do_syscall_64+0x91/0x180 <4> [787.193295] entry_SYSCALL_64_after_hwframe+0x76/0x7e <4> [787.193299] other info that might help us debug this: <4> [787.193302] Possible unsafe locking scenario: <4> [787.193304] CPU0 CPU1 <4> [787.193305] ---- ---- <4> [787.193306] lock(&gt->eu_stall->stream_lock); <4> [787.193308] lock((work_completion) (&(&stream->buf_poll_work)->work)); <4> [787.193311] lock(&gt->eu_stall->stream_lock); <4> [787.193313] lock((work_completion) (&(&stream->buf_poll_work)->work)); <4> [787.193315] *** DEADLOCK *** Fixes: 760edec939685 ("drm/xe/eustall: Add support to read() and poll() EU stall data") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4598 Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://lore.kernel.org/r/c896932fca84f79db2df5942911997ed77b2b9b6.1744934656.git.harish.chegondi@intel.com
2025-04-24drm/xe: Abort printing coredump in VM printer output if fullMatthew Brost1-0/+3
Abort printing coredump in VM printer output if full. Helps speedup large coredumps which need to walked multiple times in xe_devcoredump_read. v2: - s/drm_printer_is_full/drm_coredump_printer_is_full (Jani) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-5-matthew.brost@intel.com
2025-04-24drm/print: Add drm_coredump_printer_is_fullMatthew Brost1-0/+20
Add drm_coredump_printer_is_full which indicates if a drm printer's output is full. Useful to short circuit coredump printing once printer's output is full. v2: - s/drm_printer_is_full/drm_coredump_printer_is_full (Jani) v3: - Bail if not a coredump printer (Michal) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-4-matthew.brost@intel.com
2025-04-24drm/xe: Update xe_ttm_access_memory to use GPU for non-visible accessMatthew Brost3-19/+218
Add migrate layer functions to access VRAM and update xe_ttm_access_memory to use for non-visible access and large (more than 16k) BO access. 8G devcoreump on BMG observed 3 minute CPU copy time vs. 3s GPU copy time. v4: - Fix non-page aligned accesses - Add support for small / unaligned access - Update commit message indicating migrate used for large accesses (Auld) - Fix warning in xe_res_cursor for non-zero offset v5: - Fix 32 bit build (CI) v6: - Rebase and use SVM migration copy functions v7: - Fix build error (CI) v8: - Remove ifdef around VRAM copy functions (CI) - Use break statement in dma unmmaping (Jonathan) - Use if/else rather than goto (Jonathan) - Use single return point (Jonathan) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-3-matthew.brost@intel.com
2025-04-24drm/xe: Add devcoredump chunkingMatthew Brost2-12/+49
Chunk devcoredump into 1.5G pieces to avoid hitting the kvmalloc limit of 2G. Simple algorithm reads 1.5G at time in xe_devcoredump_read callback as needed. Some memory allocations are changed to GFP_ATOMIC as they done in xe_devcoredump_read which holds lock in the path of reclaim. The allocations are small, so in practice should never fail. v2: - Update commit message wrt gfp atomic (John H) v6: - Drop GFP_ATOMIC change for hwconfig (John H) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250423171725.597955-2-matthew.brost@intel.com
2025-04-24drm/xe/hwmon: Fix kernel version documentation for fan speedLucas De Marchi1-3/+3
The version in the sysfs attribute should correspond to the version in which this is enabled and visible for end users. It usually doesn't correspond to the version in which the patch was developed, but rather a release that will contain it. Update them to 6.16. Fixes: 28f79ac609de ("drm/xe/hwmon: expose fan speed") Reported-by: Ulisses Furquim <ulisses.furquim@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4841 Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20250421-hwmon-doc-fix-v1-2-9f68db702249@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-24drm/xe/hwmon: Fix kernel version documentation for temperatureLucas De Marchi1-2/+2
The version in the sysfs attribute should correspond to the version in which this is enabled and visible for end users. It usually doesn't correspond to the version in which the patch was developed, but rather a release that will contain it. Update them to 6.15. Fixes: dac328dea701 ("drm/xe/hwmon: expose package and vram temperature") Reported-by: Ulisses Furquim <ulisses.furquim@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4840 Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://lore.kernel.org/r/20250421-hwmon-doc-fix-v1-1-9f68db702249@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-24drm/ttm/xe: drop unused force_alloc flagDave Airlie6-9/+3
This flag used to be used in the old memory tracking code, that code got migrated into the vmwgfx driver[1], and then got removed from the tree[2], but this piece got left behind. [1] f07069da6b4c ("drm/ttm: move memory accounting into vmwgfx v4") [2] 8aadeb8ad874 ("drm/vmwgfx: Remove the dedicated memory accounting") Cleanup the dead code. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-04-23drm/xe: Fix CFI violation when accessing sysfs filesJeevaka Prabu Badrappan3-93/+107
When an attribute group is created with sysfs_create_group() or sysfs_create_files() the ->sysfs_ops() callback is set to kobj_sysfs_ops, which sets the ->show() callback to kobj_attr_show(). kobj_attr_show() uses container_of() to get the ->show() callback from the attribute it was passed, meaning the ->show() callback needs to be the same type as the ->show() callback in 'struct kobj_attribute'. However, cur_freq_show() has the type of the ->show() callback in 'struct device_attribute', which causes a CFI violation when opening the 'id' sysfs node under gtidle/freq/throttle. This happens to work because the layout of 'struct kobj_attribute' and 'struct device_attribute' are the same, so the container_of() cast happens to allow the ->show() callback to still work. Changed the type of cur_freq_show() and few more functions to match the ->show() callback in 'struct kobj_attributes' to resolve the CFI violation. CFI failure seen while accessing sysfs files under /sys/class/drm/card0/device/tile0/gt*/gtidle/* /sys/class/drm/card0/device/tile0/gt*/freq0/* /sys/class/drm/card0/device/tile0/gt*/freq0/throttle/* [ 2599.618075] RIP: 0010:__cfi_cur_freq_show+0xd/0x10 [xe] [ 2599.624452] Code: 44 c1 44 89 fa e8 03 95 39 f2 48 98 5b 41 5e 41 5f 5d c3 c9 [ 2599.646638] RSP: 0018:ffffbe438ead7d10 EFLAGS: 00010286 [ 2599.652823] RAX: ffff9f7d8b3845d8 RBX: ffff9f7dee8c95d8 RCX: 0000000000000000 [ 2599.661246] RDX: ffff9f7e6f439000 RSI: ffffffffc13ada30 RDI: ffff9f7d975d4b00 [ 2599.669669] RBP: ffffbe438ead7d18 R08: 0000000000001000 R09: ffff9f7e6f439000 [ 2599.678092] R10: 00000000e07304a6 R11: ffffffffc1241ca0 R12: ffffffffb4836ea0 [ 2599.688435] R13: ffff9f7e45fb1180 R14: ffff9f7d975d4b00 R15: ffff9f7e6f439000 [ 2599.696860] FS: 000076b02b66cfc0(0000) GS:ffff9f80ef400000(0000) knlGS:00000 [ 2599.706412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2599.713196] CR2: 00005f80d94641a9 CR3: 00000001e44ec006 CR4: 0000000100f72ef0 [ 2599.721618] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2599.730041] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400 [ 2599.738464] PKRU: 55555554 [ 2599.741655] Call Trace: [ 2599.744541] <TASK> [ 2599.747017] ? __die_body+0x69/0xb0 [ 2599.751151] ? die+0xa9/0xd0 [ 2599.754548] ? do_trap+0x89/0x160 [ 2599.758476] ? __cfi_cur_freq_show+0xd/0x10 [xe b37985c94829727668bd7c5b33c1] [ 2599.768315] ? handle_invalid_op+0x69/0x90 [ 2599.773167] ? __cfi_cur_freq_show+0xd/0x10 [xe b37985c94829727668bd7c5b33c1] [ 2599.783010] ? exc_invalid_op+0x36/0x60 [ 2599.787552] ? fred_hwexc+0x123/0x1a0 [ 2599.791873] ? fred_entry_from_kernel+0x7b/0xd0 [ 2599.797219] ? asm_fred_entrypoint_kernel+0x45/0x70 [ 2599.802976] ? act_freq_show+0x70/0x70 [xe b37985c94829727668bd7c5b33c1d9998] [ 2599.812301] ? __cfi_cur_freq_show+0xd/0x10 [xe b37985c94829727668bd7c5b33c1] [ 2599.822137] ? __kmalloc_node_noprof+0x1f3/0x420 [ 2599.827594] ? __kvmalloc_node_noprof+0xcb/0x180 [ 2599.833045] ? kobj_attr_show+0x22/0x40 [ 2599.837571] sysfs_kf_seq_show+0xa8/0x110 [ 2599.842302] kernfs_seq_show+0x38/0x50 Signed-off-by: Jeevaka Prabu Badrappan <jeevaka.badrappan@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250422171852.85558-1-jeevaka.badrappan@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-04-23drm/xe: handle pinned memory in PM notifierMatthew Auld5-19/+176
Userspace is still alive and kicking at this point so actually moving pinned stuff here is tricky. However, we can instead pre-allocate the backup storage upfront from the notifier, such that we scoop up as much as we can, and then leave the final .suspend() to do the actual copy (or allocate anything that we missed). That way the bulk of our allocations will hopefully be done outside the more restrictive .suspend(). We do need to be extra careful though, since the pinned handling can now race with PM notifier, like something becoming unpinned after we prepare it from the notifier. v2 (Thomas): - Fix kernel doc and drop the pin as soon as we are done with the restore, instead of deferring to later. Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250416150913.434369-8-matthew.auld@intel.com
2025-04-23drm/xe: share bo dma-resv with backup objectMatthew Auld2-15/+15
We end up needing to grab both locks together anyway and keep them held until we complete the copy or add the fence. Plus the backup_obj is short lived and tied to the parent object, so seems reasonable to share the same dma-resv. This will simplify the locking here, and in follow up patches. v2: - Hold reference to the parent bo to be sure the shared dma-resv can't go out of scope too soon. (Thomas) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250416150913.434369-7-matthew.auld@intel.com
2025-04-23drm/xe: evict user memory in PM notifierMatthew Auld6-24/+84
In the case of VRAM we might need to allocate large amounts of GFP_KERNEL memory on suspend, however doing that directly in the driver .suspend()/.prepare() callback is not advisable (no swap for example). To improve on this we can instead hook up to the PM notifier framework which is invoked at an earlier stage. We effectively call the evict routine twice, where the notifier will have hopefully have cleared out most if not everything by the time we call it a second time when entering the .suspend() callback. For s4 we also get the added benefit of allocating the system pages before the hibernation image size is calculated, which looks more sensible. Note that the .suspend() hook is still responsible for dealing with all the pinned memory. Improving that is left to another patch. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1181 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4566 Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250416150913.434369-6-matthew.auld@intel.com
2025-04-22drm/xe/guc: Cache DSS info when creating capture register listJohn Harrison2-50/+48
Calculating the DSS id (index of a steered register) currently requires reading state from the hwconfig table and that currently requires dynamically allocating memory. The GuC based register capture (for dev core dumps) includes this index as part of the register name in the dump. However, it was calculating said index at the time of the dump for every dump. That is wasteful. It also breaks anyone trying to do the dump at a time when memory allocations are not allowed. So rather than calculating on every print, just calculate at start of day when creating the register list in the first place. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250417213303.3021243-1-John.C.Harrison@Intel.com
2025-04-22drm/xe/guc: Use the steering flag when printing registersJohn Harrison1-4/+2
The printing code was doing a test on which list a register was in to decide whether it is steered or not. That might be valid at this moment but there may be other reasons for extended lists in the future. Plus, there is a flag specifically for identifying steered registers. So, just use that instead - it is simpler and safer. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250417195215.3002210-3-John.C.Harrison@Intel.com
2025-04-22drm/xe/guc: Fix capture of steering registersJohn Harrison1-1/+1
The list of registers to capture on a GPU hang includes some that require steering. Unfortunately, the flag to say this was being wiped to due a missing OR on the assignment of the next flag field. Fix that. Fixes: b170d696c1e2 ("drm/xe/guc: Add XE_LP steered register lists") Cc: Zhanjun Dong <zhanjun.dong@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Link: https://lore.kernel.org/r/20250417195215.3002210-2-John.C.Harrison@Intel.com
2025-04-22drm/xe/svm: fix dereferencing error pointer in drm_gpusvm_range_alloc()Harshit Mogalapalli1-1/+1
xe_svm_range_alloc() returns ERR_PTR(-ENOMEM) on failure and there is a dereference of "range" after that: --> range->gpusvm = gpusvm; In xe_svm_range_alloc(), when memory allocation fails return NULL instead to handle this situation. Fixes: 99624bdff867 ("drm/gpusvm: Add support for GPU Shared Virtual Memory") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/adaef4dd-5866-48ca-bc22-4a1ddef20381@stanley.mountain/ Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250323124907.3946370-1-harshit.m.mogalapalli@oracle.com
2025-04-17drm/xe: Introduce fault injection for guc CTB send/recvSatyanarayana K V P1-0/+1
Fault can be injected with below steps. FAILTYPE=fail_function FAILFUNC=xe_guc_ct_send_recv echo > /sys/kernel/debug/$FAILTYPE/inject echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject printf %#x -19 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval echo N > /sys/kernel/debug/$FAILTYPE/task-filter echo 10 > /sys/kernel/debug/$FAILTYPE/probability echo 0 > /sys/kernel/debug/$FAILTYPE/interval echo -1 > /sys/kernel/debug/$FAILTYPE/times echo 0 > /sys/kernel/debug/$FAILTYPE/space echo 1 > /sys/kernel/debug/$FAILTYPE/verbose Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250403120641.7258-3-satyanarayana.k.v.p@intel.com
2025-04-17drm/xe: Introduce fault injection for guc mmio send/recv.Satyanarayana K V P1-0/+1
Fault can be injected with below steps. FAILTYPE=fail_function FAILFUNC=xe_guc_mmio_send_recv echo > /sys/kernel/debug/$FAILTYPE/inject echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject printf %#x -5 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval echo N > /sys/kernel/debug/$FAILTYPE/task-filter echo 10 > /sys/kernel/debug/$FAILTYPE/probability echo 0 > /sys/kernel/debug/$FAILTYPE/interval echo -1 > /sys/kernel/debug/$FAILTYPE/times echo 0 > /sys/kernel/debug/$FAILTYPE/space echo 1 > /sys/kernel/debug/$FAILTYPE/verbose Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250403120641.7258-2-satyanarayana.k.v.p@intel.com
2025-04-17drm/xe: Use GT oriented message to report engine activity errorMichal Wajdeczko1-2/+3
We are enabling/disabling engine activity on per-GT basis, so any errors should be also reported per GT, like: [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to enable engine activity function stats (-ENOSPC) [ ] xe 0000:00:02.0: [drm] GT1: PF: Failed to enable engine activity function stats (-ENOSPC) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250414202347.1909-2-michal.wajdeczko@intel.com
2025-04-17drm/xe/guc: Fix out-of-bound while enabling engine activity statsMichal Wajdeczko1-2/+5
In the PF mode we allocate array of struct engine_activity_group that holds activity data split for the PF and all potential VFs. But while preparing data for use by VFs we ended with bad index. [ ] BUG: KASAN: slab-out-of-bounds in xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] Call Trace: [ ] <TASK> [ ] dump_stack_lvl+0x91/0xf0 [ ] print_report+0xd1/0x680 [ ] ? __virt_addr_valid+0x23a/0x440 [ ] ? kasan_addr_to_slab+0xd/0xb0 [ ] kasan_report+0xe7/0x130 [ ] ? xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] ? xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] __asan_report_store8_noabort+0x17/0x30 [ ] xe_guc_engine_activity_function_stats+0x41e/0x4f0 [xe] [ ] pf_engine_activity_stats+0x1b6/0x7f0 [xe] [ ] ? kobject_put+0x5f/0x470 [ ] xe_pci_sriov_configure+0x28c9/0x3270 [xe] [ ] ? __pfx_dev_attr_store+0x10/0x10 [ ] ? kstrtoull+0x3b/0x70 [ ] ? __pfx___lock_acquire+0x10/0x10 [ ] ? kstrtou16+0x65/0xf0 [ ] sriov_numvfs_store+0x20c/0x400 [ ] ? __pfx_sriov_numvfs_store+0x10/0x10 [ ] ? __pfx__copy_from_iter+0x10/0x10 [ ] ? __pfx_dev_attr_store+0x10/0x10 [ ] dev_attr_store+0x3b/0x80 [ ] ? sysfs_file_ops+0x135/0x190 Fixes: 2de3f38fbf89 ("drm/xe: Add support for per-function engine activity") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250414202347.1909-1-michal.wajdeczko@intel.com
2025-04-17drm/xe/pxp: do not queue unneeded terminations from debugfsDaniele Ceraolo Spurio1-2/+11
The PXP terminate debugfs currently unconditionally simulates a termination, no matter what the HW status is. This is unneeded if PXP is not in use and can cause errors if the HW init hasn't completed yet. To solve these issues, we can simply limit the terminations to the cases where PXP is fully initialized and in use. v2: s/pxp_status/ready/ to avoid confusion with pxp->status (John) Fixes: 385a8015b214 ("drm/xe/pxp: Add PXP debugfs support") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4749 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250416201622.1295369-1-daniele.ceraolospurio@intel.com
2025-04-16drm/xe/dma_buf: stop relying on placement in unmapMatthew Auld1-4/+1
The is_vram() is checking the current placement, however if we consider exported VRAM with dynamic dma-buf, it looks possible for the xe driver to async evict the memory, notifying the importer, however importer does not have to call unmap_attachment() immediately, but rather just as "soon as possible", like when the dma-resv idles. Following from this we would then pipeline the move, attaching the fence to the manager, and then update the current placement. But when the unmap_attachment() runs at some later point we might see that is_vram() is now false, and take the complete wrong path when dma-unmapping the sg, leading to explosions. To fix this check if the sgl was mapping a struct page. v2: - The attachment can be mapped multiple times it seems, so we can't really rely on encoding something in the attachment->priv. Instead see if the page_link has an encoded struct page. For vram we expect this to be NULL. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4563 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250410162716.159403-2-matthew.auld@intel.com
2025-04-16drm/xe/userptr: fix notifier vs folio deadlockMatthew Auld1-24/+0
User is reporting what smells like notifier vs folio deadlock, where migrate_pages_batch() on core kernel side is holding folio lock(s) and then interacting with the mappings of it, however those mappings are tied to some userptr, which means calling into the notifier callback and grabbing the notifier lock. With perfect timing it looks possible that the pages we pulled from the hmm fault can get sniped by migrate_pages_batch() at the same time that we are holding the notifier lock to mark the pages as accessed/dirty, but at this point we also want to grab the folio locks(s) to mark them as dirty, but if they are contended from notifier/migrate_pages_batch side then we deadlock since folio lock won't be dropped until we drop the notifier lock. Fortunately the mark_page_accessed/dirty is not really needed in the first place it seems and should have already been done by hmm fault, so just remove it. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4765 Fixes: 0a98219bcc96 ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250414132539.26654-2-matthew.auld@intel.com
2025-04-15drm/xe: Adjust ringbuf emission for maximum possible sizeTvrtko Ursulin1-1/+1
MAX_JOB_SIZE_DW seems to be undersized. For the worst case emission from __emit_job_gen12_render_compute I hand count 57 dwords so lets bump this to an even 58. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250403190317.6064-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-14drm/xe: Set LRC addresses before guc loadLucas De Marchi1-30/+45
The metadata saved in the ADS is read by GuC when it's initialized. Saving the addresses to the LRCs when they are populated is too late as GuC will keep using the old ones. This was causing GuC to use the RCS LRC for any engine class. It's not a big problem on a Linux-only scenario since the they are used by GuC only on media engines when the watchdog is triggered. However, in a virtualization scenario with Windows as the VF, it causes the wrong LRCs to be loaded as the watchdog is used for all engines. Fix it by letting guc_golden_lrc_init() initialize the metadata, like other *_init() functions, and later guc_golden_lrc_populate() to copy the LRCs to the right places. The former is called before the second GuC load, while the latter is called after LRCs have been recorded. Cc: Chee Yin Wong <chee.yin.wong@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> # v6.11+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Chee Yin Wong <chee.yin.wong@intel.com> Link: https://lore.kernel.org/r/20250409-fix-guc-ads-v1-1-494135f7a5d0@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-04-14drm/xe/pf: Don't show GGTT/LMEM debugfs files under media GTMichal Wajdeczko1-17/+49
Most of the PF's debugfs files (and their implementations) are based on the GT hierarchy even if files are related to GGTT or LMEM data, that are related to the tile. While we could reach the tile data from any GT, to avoid potential misuse, some functions allow to be used on the primary GT only, and may use asserts to enforce that. In our case, the following assert could be seen when reading the /sys/kernel/debug/dri/0000:00:02.0/gt1/pf/ggtt_available [ ] xe 0000:00:02.0: [drm] Assertion `!xe_gt_is_media_type(gt)` failed! [ ] WARNING: CPU: 4 PID: 10609 at drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:379 pf_get_spare_ggtt+0x256/0x4e0 [xe] [ ] RIP: 0010:pf_get_spare_ggtt+0x256/0x4e0 [xe] [ ] Call Trace: [ ] <TASK> [ ] xe_gt_sriov_pf_config_print_available_ggtt+0xb7/0x480 [xe] [ ] ? __memcg_slab_post_alloc_hook+0x12f/0x3f0 [ ] xe_gt_debugfs_simple_show+0x7b/0xb0 [xe] [ ] ? __pfx___drm_printfn_seq_file+0x10/0x10 [ ] ? __pfx___drm_puts_seq_file+0x10/0x10 [ ] seq_read_iter+0x139/0x4e0 [ ] seq_read+0x11d/0x160 [ ] full_proxy_read+0x6b/0xb0 [ ] vfs_read+0xfa/0x390 Fix that by moving GGTT/LMEM debugfs attributes to separate lists and register them only when applicable (on primary GT, on DGFX). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com> Reviewed-by: Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250411193030.1865-1-michal.wajdeczko@intel.com
2025-04-13Linux 6.15-rc2Linus Torvalds1-1/+1
2025-04-12ext4: fix off-by-one error in do_splitArtem Sadovnikov1-1/+1
Syzkaller detected a use-after-free issue in ext4_insert_dentry that was caused by out-of-bounds access due to incorrect splitting in do_split. BUG: KASAN: use-after-free in ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109 Write of size 251 at addr ffff888074572f14 by task syz-executor335/5847 CPU: 0 UID: 0 PID: 5847 Comm: syz-executor335 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106 ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109 add_dirent_to_buf+0x3d9/0x750 fs/ext4/namei.c:2154 make_indexed_dir+0xf98/0x1600 fs/ext4/namei.c:2351 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2455 ext4_add_nondir+0x8d/0x290 fs/ext4/namei.c:2796 ext4_symlink+0x920/0xb50 fs/ext4/namei.c:3431 vfs_symlink+0x137/0x2e0 fs/namei.c:4615 do_symlinkat+0x222/0x3a0 fs/namei.c:4641 __do_sys_symlink fs/namei.c:4662 [inline] __se_sys_symlink fs/namei.c:4660 [inline] __x64_sys_symlink+0x7a/0x90 fs/namei.c:4660 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> The following loop is located right above 'if' statement. for (i = count-1; i >= 0; i--) { /* is more than half of this entry in 2nd half of the block? */ if (size + map[i].size/2 > blocksize/2) break; size += map[i].size; move++; } 'i' in this case could go down to -1, in which case sum of active entries wouldn't exceed half the block size, but previous behaviour would also do split in half if sum would exceed at the very last block, which in case of having too many long name files in a single block could lead to out-of-bounds access and following use-after-free. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Cc: stable@vger.kernel.org Fixes: 5872331b3d91 ("ext4: fix potential negative array index in do_split()") Signed-off-by: Artem Sadovnikov <a.sadovnikov@ispras.ru> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250404082804.2567-3-a.sadovnikov@ispras.ru Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: make block validity check resistent to sb bh corruptionOjaswin Mujoo2-6/+6
Block validity checks need to be skipped in case they are called for journal blocks since they are part of system's protected zone. Currently, this is done by checking inode->ino against sbi->s_es->s_journal_inum, which is a direct read from the ext4 sb buffer head. If someone modifies this underneath us then the s_journal_inum field might get corrupted. To prevent against this, change the check to directly compare the inode with journal->j_inode. **Slight change in behavior**: During journal init path, check_block_validity etc might be called for journal inode when sbi->s_journal is not set yet. In this case we now proceed with ext4_inode_block_valid() instead of returning early. Since systems zones have not been set yet, it is okay to proceed so we can perform basic checks on the blocks. Suggested-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/0c06bc9ebfcd6ccfed84a36e79147bf45ff5adc1.1743142920.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: avoid -Wflex-array-member-not-at-end warningGustavo A. R. Silva1-10/+8
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the `DEFINE_RAW_FLEX()` helper for an on-stack definition of a flexible structure where the size of the flexible-array member is known at compile-time, and refactor the rest of the code, accordingly. So, with these changes, fix the following warning: fs/ext4/mballoc.c:3041:40: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Link: https://patch.msgid.link/Z-SF97N3AxcIMlSi@kspp Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12Documentation: ext4: Add fields to ext4_super_block documentationTom Vierjahn1-6/+14
Documentation and implementation of the ext4 super block have slightly diverged: Padding has been removed in order to make room for new fields that are still missing in the documentation. Add the new fields s_encryption_level, s_first_error_errorcode, s_last_error_errorcode to the documentation of the ext4 super block. Fixes: f542fbe8d5e8 ("ext4 crypto: reserve codepoints used by the ext4 encryption feature") Fixes: 878520ac45f9 ("ext4: save the error code which triggered an ext4_error() in the superblock") Signed-off-by: Tom Vierjahn <tom.vierjahn@acm.org> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://patch.msgid.link/20250324221004.5268-1-tom.vierjahn@acm.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12rv: Fix out-of-bound memory access in rv_is_container_monitor()Nam Cao1-1/+6
When rv_is_container_monitor() is called on the last monitor in rv_monitors_list, KASAN yells: BUG: KASAN: global-out-of-bounds in rv_is_container_monitor+0x101/0x110 Read of size 8 at addr ffffffff97c7c798 by task setup/221 The buggy address belongs to the variable: rv_monitors_list+0x18/0x40 This is due to list_next_entry() is called on the last entry in the list. It wraps around to the first list_head, and the first list_head is not embedded in struct rv_monitor_def. Fix it by checking if the monitor is last in the list. Cc: stable@vger.kernel.org Cc: Gabriele Monaco <gmonaco@redhat.com> Fixes: cb85c660fcd4 ("rv: Add option for nested monitors and include sched") Link: https://lore.kernel.org/e85b5eeb7228bfc23b8d7d4ab5411472c54ae91b.1744355018.git.namcao@linutronix.de Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-12ftrace: Do not have print_graph_retval() add a newlineSteven Rostedt1-6/+5
The retval and retaddr options for function_graph tracer will add a comment at the end of a function for both leaf and non leaf functions that looks like: __wake_up_common(); /* ret=0x1 */ } /* pick_next_task_fair ret=0x0 */ The function print_graph_retval() adds a newline after the "*/". But if that's not called, the caller function needs to make sure there's a newline added. This is confusing and when the function parameters code was added, it added a newline even when calling print_graph_retval() as the fact that the print_graph_retval() function prints a newline isn't obvious. This caused an extra newline to be printed and that made it fail the selftests when the retval option was set, as the selftests were not expecting blank lines being injected into the trace. Instead of having print_graph_retval() print a newline, just have the caller always print the newline regardless if it calls print_graph_retval() or not. This not only fixes this bug, but it also simplifies the code. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250411133015.015ca393@gandalf.local.home Reported-by: Mark Brown <broonie@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/all/ccc40f2b-4b9e-4abd-8daf-d22fce2a86f0@sirena.org.uk/ Fixes: ff5c9c576e754 ("ftrace: Add support for function argument to graph tracer") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>