Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull drm updates from Dave Airlie:
"Lots of stuff all over, some new AMD IP support and gang submit
support. i915 has further DG2 and Meteorlake pieces, and a bunch of
i915 display refactoring. msm has a shrinker rework. There are also a
bunch of conversions to use kunit.
This has two external pieces, some MEI changes needed for future Intel
discrete GPUs. These should be acked by Greg. There is also a cross
maintainer shared tree with some backlight rework from Hans in here.
Core:
- convert selftests to kunit
- managed init for more objects
- move to idr_init_base
- rename fb and gem cma helpers to dma
- hide unregistered connectors from getconnector ioctl
- DSC passthrough aux support
- backlight handling improvements
- add dma_resv_assert_held to vmap/vunmap
edid:
- move luminance calculation to core
fbdev:
- fix aperture helper usage
fourcc:
- add more format helpers
- add DRM_FORMAT_Cxx, DRM_FORMAT_Rxx, DRM_FORMAT_Dxx
- add packed AYUV8888, XYUV8888
- add some kunit tests
ttm:
- allow bos without backing store
- rewrite placement to use intersect/compatible functions
dma-buf:
- docs update
- improve signalling when debugging
udmabuf:
- fix failure path GPF
dp:
- drop dp/mst legacy code
- atomic mst state support
- audio infoframe packing
panel:
- Samsung LTL101AL01
- B120XAN01.0
- R140NWF5 RH
- DMT028VGHMCMI-1A T
- AUO B133UAN02.1
- IVO M133NW4J-R3
- Innolux N120ACA-EA1
amdgpu:
- Gang submit support
- Mode2 reset for RDNA2
- New IP support:
DCN 3.1.4, 3.2
SMU 13.x
NBIO 7.7
GC 11.x
PSP 13.x
SDMA 6.x
GMC 11.x
- DSC passthrough support
- PSP fixes for TA support
- vangogh GFXOFF stats
- clang fixes
- gang submit CS cleanup prep work
- fix VRAM eviction issues
amdkfd:
- GC 10.3 IP ISA fixes
- fix CRIU regression
- CPU fault on COW mapping fixes
i915:
- align fw versioning with kernel practices
- add display substruct to i915 private
- add initial runtime info to driver info
- split out HDCP and backlight registers
- MEI XeHP SDV GSC support
- add per-gt sysfs defaults
- TLB invalidation improvements
- Disable PCI BAR resize on 32-bit
- GuC firmware updates and compat changes
- GuC log timestamp translation
- DG2 preemption workaround changes
- DG2 improved HDMI pixel clocks support
- PCI BAR sanity checks
- Enable DC5 on DG2
- DG2 DMC fw bumped
- ADL-S PCI ID added
- Meteorlake enablement
- Rename ggtt_view to gtt_view
- host RPS fixes
- release mmaps on rpm suspend on discrete
- clocking and dpll refactoring
- VBT definitions and parsing updates
- SKL watermark code extracted to separate file
- allow seamless M/N changes on eDP panels
- BUG_ON removal and cleanups
msm:
- DPU:
simplified VBIF configuration
cleanup CTL interfaces
- DSI:
removed unused msm_display_dsc_config struct
switch regulator calls to new API
switched to PANEL_BRIDGE for direct attached panels
- DSI_PHY: convert drivers to parent_hws
- DP: cleanup pixel_rate handling
- HDMI: turned hdmi-phy-8996 into OF clk provider
- misc dt-bindings fixes
- choose eDP as primary display if it's available
- support getting interconnects from either the mdss or the mdp5/dpu
device nodes
- gem: Shrinker + LRU re-work:
- adds a shared GEM LRU+shrinker helper and moves msm over to that
- reduce lock contention between retire and submit by avoiding the
need to acquire obj lock in retire path (and instead using resv
seeing obj's busyness in the shrinker
- fix reclaim vs submit issues
- GEM fault injection for triggering userspace error paths
- Map/unmap optimization
- Improved robustness for a6xx GPU recovery
virtio:
- improve error and edge conditions handling
- convert to use managed helpers
- stop exposing LINEAR modifier
mgag200:
- split modeset handling per model
udl:
- suspend/disconnect handling improvements
vc4:
- rework HDMI power up
- depend on PM
- better unplugging support
ast:
- resolution handling improvements
ingenic:
- add JZ4760(B) support
- avoid a modeset when sharpness property is unchanged
- use the new PM ops
it6505:
- power seq and clock updates
ssd130x:
- regmap bulk write
- use atomic helpers instead of simple helpers
via:
- rename via_drv to via_dri1, consolidate all code.
radeon:
- drop DP MST experimental support
- delayed work flush fix
- use time_after
ti-sn65dsi86:
- DP support
mediatek:
- MT8195 DP support
- drop of_gpio header
- remove unneeded result
- small DP code improvements
vkms:
- RGB565, XRGB64 and ARGB64 support
sun4i:
- tv: convert to atomic
rcar-du:
- Synopsys DW HDMI bridge DT bindings update
exynos:
- use drm_display_info.is_hdmi
- correct return of mixer_mode_valid and hdmi_mode_valid
omap:
- refcounting fix
rockchip:
- RK3568 support
- RK3399 gamma support"
* tag 'drm-next-2022-10-05' of git://anongit.freedesktop.org/drm/drm: (1374 commits)
drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
drm/amdkfd: Track unified memory when switching xnack mode
drm/amdgpu: Enable sram on vcn_4_0_2
drm/amdgpu: Enable VCN DPG for GC11_0_1
drm/msm: Fix build break with recent mm tree
drm/panel: simple: Use dev_err_probe() to simplify code
drm/panel: panel-edp: Use dev_err_probe() to simplify code
drm/panel: simple: Add Multi-Inno Technology MI0800FT-9
dt-bindings: display: simple: Add Multi-Inno Technology MI0800FT-9 panel
drm/amdgpu: correct the memcpy size for ip discovery firmware
drm/amdgpu: Skip put_reset_domain if it doesn't exist
drm/amdgpu: remove switch from amdgpu_gmc_noretry_set
drm/amdgpu: Fix mc_umc_status used uninitialized warning
drm/amd/display: Prevent OTG shutdown during PSR SU
drm/amdgpu: add page retirement handling for CPU RAS
drm/amdgpu: use RAS error address convert api in mca notifier
drm/amdgpu: support to convert dedicated umc mca address
drm/amdgpu: export umc error address convert interface
drm/amdgpu: fix sdma v4 init microcode error
drm/amd/display: fix array-bounds error in dc_stream_remove_writeback()
...
|
|
drm/i915 feature pull #2 for v6.1:
Features and functionality:
- More Meteorlake platform enabling (Radhakrishna, Imre, Madhumitha)
- Allow seamless M/N changes on eDP panels that support it (Ville)
- Switch DSC debugfs from output bpp to input bpc (Swati)
Refactoring and cleanups:
- Clocking and DPLL refactoring and cleanups to support seamless M/N (Ville)
- Plenty of VBT definition and parsing updates and cleanups (Ville)
- Extract SKL watermark code to a separate file, and clean up (Ville)
- Clean up IPC interfaces and debugfs (Jani)
- Continue moving display data under drm_i915_private display sub-struct (Jani)
- Display quirk handling refactoring and abstractions (Jani)
- Stop using implicit dev_priv in gmbus registers (Jani)
- BUG_ON() removals and conversions to drm_WARN_ON() and BUILD_BUG_ON() (Jani)
- Use drm_dp_phy_name() for logging (Jani)
- Use REG_BIT() macros for CDCLK registers (Stan)
- Move display and media IP versions to runtime info (Radhakrishna)
Fixes:
- Fix DP MST suspend to avoid use-after-free (Andrzej)
- Fix HPD suspend to avoid use-after-free for fbdev (Andrzej)
- Fix various PSR issues regarding selective update and damage clips (Jouni)
- Fix runtime pm wakerefs for driver remove and release (Mitul Golani)
- Fix conditions for filtering fixed modes for panels (Ville)
- Fix TV encoder clock computation (Ville)
- Fix dvo mode_valid hook return type (Nathan Huckleberry)
Merges:
- Backmerge drm-next to sync the DP MST atomic changes (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87o7vfr064.fsf@intel.com
|
|
Cross-subsystem Changes:
- MEI subsystem pieces for XeHP SDV GSC support
These are Acked-by Greg.
Driver Changes:
- Release mmaps on RPM suspend on discrete GPUs (Anshuman)
- Update GuC version to 7.5 on DG1, DG2 and ADL
- Revert "drm/i915/dg2: extend Wa_1409120013 to DG2" (Lucas)
- MTL enabling incl. standalone media (Matt R, Lucas)
- Explicitly clear BB_OFFSET for new contexts on Gen8+ (Chris)
- Fix throttling / perf limit reason decoding (Ashutosh)
- XeHP SDV GSC support (Vitaly, Alexander, Tomas)
- Fix issues with overrding firmware file paths (John)
- Invert if-else ladders to check latest version first (Lucas)
- Cancel GuC engine busyness worker synchronously (Umesh)
- Skip applying copy engine fuses outside PVC (Lucas)
- Eliminate Gen10 frequency read function (Lucas)
- Static code checker fixes (Gaosheng)
- Selftest improvements (Chris)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YyQ4Jgl3cpGL1/As@jlahtine-mobl.ger.corp.intel.com
|
|
Due to i915_perf assuming that it can use the i915_gem_context reference
to protect its i915->gem.contexts.list iteration, we need to defer removal
of the context from the list until last reference to the context is put.
However, there is a risk of triggering kernel warning on contexts list not
empty at driver release time if we deleagate that task to a worker for
i915_gem_context_release_work(), unless that work is flushed first.
Unfortunately, it is not flushed on driver release. Fix it.
Instead of additionally calling flush_workqueue(), either directly or via
a new dedicated wrapper around it, replace last call to
i915_gem_drain_freed_objects() with existing i915_gem_drain_workqueue()
that performs both tasks.
Fixes: 75eefd82581f ("drm/i915: Release i915_gem_context from a worker")
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: stable@kernel.org # v5.16+
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220916092403.201355-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 1cec34442408a77ba5396b19725fed2c398005c3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
drm/i915 feature pull for v6.1:
Features and functionality:
- Early Meteorlake (MTL) enabling (José, Radhakrishna, Clint, Imre, Vandita, Ville, Jani)
- Support more HDMI pixel clock frequencies on DG2 (Clint)
- Sanity check PCI BARs (Piotr Piórkowski)
- Enable DC5 on DG2 (Anusha)
- DG2 DMC firmware version bump to v2.07 (Madhumitha)
- New ADL-S PCI ID (José)
Refactoring and cleanups:
- Add display sub-struct to struct drm_i915_private (Jani)
- Add initial runtime info to device info (Jani)
- Split out HDCP and backlight registers to separate files (Jani)
Fixes:
- Skip wm/ddb readout for disabled pipes (Ville)
- HDMI port timing quirk for GLK ECS Liva Q2 (Diego Santa Cruz)
- Fix bw init null pointer dereference (Łukasz Bartosik)
- Disable PPS power hook for DP AUX backlight (Jouni)
- Avoid warnings on registering multiple backlight devices (Arun)
- Fix dual-link DSI backlight and CABC ports for display 11+ (Jani)
- Fix Type-C PHY ownership programming in HDMI legacy mode (Imre)
- Fix unclaimed register access while loading PIPEDMC-C/D (Imre)
- Bump up CDCLK for DG2 (Stan)
- Prune modes that require HDMI 2.1 FRL (Ankit)
- Disable FBC when PSR1 is enabled in display 12-13 (Matt)
- Fix TGL+ HDMI transcoder clock and DDI BUF disable order (Imre)
- Disable PSR before disable pipe (José)
- Disable DMC handlers during firmware loading/disabling on display 12+ (Imre)
- Disable clock gating for PIPEDMC-A/B as a workaround (Imre)
Merges:
- Two drm-next backmerges (Rodrigo, Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87k06rfaku.fsf@intel.com
|
|
Release all mmap mapping for all lmem objects which are associated
with userfault such that, while pcie function in D3hot, any access
to memory mappings will raise a userfault.
Runtime resume the dgpu(when gem object lies in lmem).
This will transition the dgpu graphics function to D0
state if it was in D3 in order to access the mmap memory
mappings.
v2:
- Squashes the patches. [Matt Auld]
- Add adequate locking for lmem_userfault_list addition. [Matt Auld]
- Reused obj->userfault_count to avoid double addition. [Matt Auld]
- Added i915_gem_object_lock to check
i915_gem_object_is_lmem. [Matt Auld]
v3:
- Use i915_ttm_cpu_maps_iomem. [Matt Auld]
- Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
- Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
- Delete the mmaped obj from lmem_userfault_list in obj
destruction path. [Matt Auld]
- Get a wakeref for object destruction patch. [Matt Auld]
- Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
v4:
- Avoid using mmo offset to get the vma_node. [Matt Auld]
- Added comment to use the lmem_userfault_lock. [Matt Auld]
- Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
[Matt Auld]
- Fixed kernel test robot generated warning.
v5:
- Addressed the cosmetics comments. [Andi]
- Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
PCIe Specs 5.3.1.4.1
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-3-anshuman.gupta@intel.com
|
|
Refactor userfault_wakeref to re-use for discrete lmem mmap mapping
as well, as on discrete GTT mmap are not supported. Moving
userfault_wakeref from ggtt to gt structure.
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-2-anshuman.gupta@intel.com
|
|
So far, different views (normal, partial, rotated and remapped)
into the same object are only supported for GGTT mappings.
But with the upcoming VM_BIND feature, PPGTT will also use the
partial view mapping. Hence rename ggtt_view to more generic
gtt_view.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220901183854.3446-1-niranjana.vishwanathapura@intel.com
|
|
I can't idenfity a single hot path that would require
i915_gem_drain_freed_objects() to be inline. Un-inline it.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6c289c55afee0d9a3067122db63277b8d60cf74f.1662390010.git.jani.nikula@intel.com
|
|
i915_gem_drain_workqueue() is not used on any hot paths. Un-unline it.
Replace the do-while with a for loop while at it.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/2c89e7e0a3528caf7ba9ffa29b2bb9f13f2357d1.1662390010.git.jani.nikula@intel.com
|
|
Move display frontbuffer tracking related members under drm_i915_private
display sub-struct.
Rename struct i915_frontbuffer_tracking to intel_frontbuffer_tracking
while at it.
FIXME: fb_tracking.lock mutex init should be moved away from
i915_gem_init_early().
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/a5444d0a373afca46da9a2f6e4db442af21b429b.1661779055.git.jani.nikula@intel.com
|
|
The lone gem quirk is an outlier, not even handled by the common quirk
code. Split it to a separate gem_quirks member.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fe9c0cb1e49da0ddc31d24c996af5fd09bce3042.1661346845.git.jani.nikula@intel.com
|
|
If it's modified runtime, it's runtime info.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Maarten Lankhort <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f6825dd97d2ba63aa395c30131c4b9e6ef32b0c8.1660910433.git.jani.nikula@intel.com
|
|
Tracking DRM clients more explicitly will allow later patches to
accumulate past and current GPU usage in a centralised place and also
consolidate access to owning task pid/name.
Unique client id is also assigned for the purpose of distinguishing/
consolidating between multiple file descriptors owned by the same process.
v2:
Chris Wilson:
* Enclose new members into dedicated structs.
* Protect against failed sysfs registration.
v3:
* sysfs_attr_init.
v4:
* Fix for internal clients.
v5:
* Use cyclic ida for client id. (Chris)
* Do not leak pid reference. (Chris)
* Tidy code with some locals.
v6:
* Use xa_alloc_cyclic to simplify locking. (Chris)
* No need to unregister individial sysfs files. (Chris)
* Rebase on top of fpriv kref.
* Track client closed status and reflect in sysfs.
v7:
* Make drm_client more standalone concept.
v8:
* Simplify sysfs show. (Chris)
* Always track name and pid.
v9:
* Fix cyclic id assignment.
v10:
* No need for a mutex around xa_alloc_cyclic.
* Refactor sysfs into own function.
* Unregister sysfs before freeing pid and name.
* Move clients setup into own function.
v11:
* Call clients init directly from driver init. (Chris)
v12:
* Do not fail client add on id wrap. (Maciej)
v13 (Lucas): Rebase.
v14:
* Dropped sysfs bits.
v15:
* Dropped tracking of pid/ and name.
* Dropped RCU freeing of the client object.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v11
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v11
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-2-tvrtko.ursulin@linux.intel.com
|
|
On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
more framebuffers/scanout buffers results in only one that is mappable/
fenceable. Therefore, pageflipping between these 2 FBs where only one
is mappable/fenceable creates latencies large enough to miss alternate
vblanks thereby producing less optimal framerate.
This mainly happens because when i915_gem_object_pin_to_display_plane()
is called to pin one of the FB objs, the associated vma is identified
as misplaced -- because there is no space for it in the aperture --
and therefore i915_vma_unbind() is called which unbinds and evicts it.
This misplaced vma gets subseqently pinned only when
i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This whole
thing results in a latency of ~10ms and happens every other repaint cycle.
Therefore, to fix this issue, we just ensure that the misplaced VMA
does not get evicted when we try to pin it with PIN_MAPPABLE -- by
returning early if the mappable/fenceable flag is not set.
Testcase:
Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
with a 8K@60 mode results in only ~40 FPS (compared to ~59 FPS with
this patch). Since upstream Weston submits a frame ~7ms before the
next vblank, the latencies seen between atomic commit and flip event
are 7, 24 (7 + 16.66), 7, 24..... suggesting that it misses the
vblank every other frame.
Here is the ftrace snippet that shows the source of the ~10ms latency:
i915_gem_object_pin_to_display_plane() {
0.102 us | i915_gem_object_set_cache_level();
i915_gem_object_ggtt_pin_ww() {
0.390 us | i915_vma_instance();
0.178 us | i915_vma_misplaced();
i915_vma_unbind() {
__i915_active_wait() {
0.082 us | i915_active_acquire_if_busy();
0.475 us | }
intel_runtime_pm_get() {
0.087 us | intel_runtime_pm_acquire();
0.259 us | }
__i915_active_wait() {
0.085 us | i915_active_acquire_if_busy();
0.240 us | }
__i915_vma_evict() {
ggtt_unbind_vma() {
gen8_ggtt_clear_range() {
10507.255 us | }
10507.689 us | }
10508.516 us | }
v2:
- Expand the code comments to describe the ping-pong issue.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321005431.1113890-1-vivek.kasireddy@intel.com
|
|
The test for vma should always return true, and when assigning -EBUSY
to ret, the variable should already have that value.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-4-thomas.hellstrom@linux.intel.com
|
|
Now that i915_vma_parked() is taking the object lock on vma destruction,
and the only user of the vma refcount, i915_gem_object_unbind()
also takes the object lock, remove the vma refcount.
v3: Documentation update.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-3-thomas.hellstrom@linux.intel.com
|
|
vms are not getting properly closed. Rather than fixing that,
Remove the vm open count and instead rely on the vm refcount.
The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.
Unfortunately if the vm destructor and the object destructor both
wants to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.
v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack
and should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.
v3:
- Documentation update
- Commit message formatting
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com
|
|
UAPI Changes:
- Weak parallel submission support for execlists
Minimal implementation of the parallel submission support for
execlists backend that was previously only implemented for GuC.
Support one sibling non-virtual engine.
Core Changes:
- Two backmerges of drm/drm-next for header file renames/changes and
i915_regs reorganization
Driver Changes:
- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)
- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)
- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)
- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
|
|
Include it only in files that use it.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/14edab4a193ea3f73f387a88e3836c8555401871.1644507885.git.jani.nikula@intel.com
|
|
Limit the scope of struct drm_i915_file_private to the files that
actually need it.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e375859dc1729a1b988036e4103e5b1bd48caa00.1644507885.git.jani.nikula@intel.com
|
|
Remove the duplicates.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/48f5ace6393533372da5d13df3de0203c8db11b3.1644507885.git.jani.nikula@intel.com
|
|
Catch-up with 5.17-rc2 and trying to align with drm-intel-gt-next
for a possible topic branch for merging the split of i915_regs...
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
We want to remove more members of i915_vma, which requires the locking to
be held more often.
Start requiring gem object lock for i915_vma_unbind, as it's one of the
callers that may unpin pages.
Some special care is needed when evicting, because the last reference to
the object may be held by the VMA, so after __i915_vma_unbind, vma may be
garbage, and we need to cache vma->obj before unlocking.
Changes since v1:
- Make trylock failing a WARN. (Matt)
- Remove double i915_vma_wait_for_bind() (Matt)
- Move atomic_set to right before mutex_unlock(), to make it more clear
they belong together. (Matt)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-5-maarten.lankhorst@linux.intel.com
|
|
Implement async (non-blocking) unbinding by not syncing the vma before
calling unbind on the vma_resource.
Add the resulting unbind fence to the object's dma_resv from where it is
picked up by the ttm migration code.
Ideally these unbind fences should be coalesced with the migration blit
fence to avoid stalling the migration blit waiting for unbind, as they
can certainly go on in parallel, but since we don't yet have a
reasonable data structure to use to coalesce fences and attach the
resulting fence to a timeline, we defer that for now.
Note that with async unbinding, even while the unbind waits for the
preceding bind to complete before unbinding, the vma itself might have been
destroyed in the process, clearing the vma pages. Therefore we can
only allow async unbinding if we have a refcounted sg-list and keep a
refcount on that for the vma resource pages to stay intact until
binding occurs. If this condition is not met, a request for an async
unbind is diverted to a sync unbind.
v2:
- Use a separate kmem_cache for vma resources for now to isolate their
memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless
of whether the vm is closed, we need the unbind fence to properly wait
for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
async unbind capable.
- Fix braces around a single-line if statement.
v6:
- Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
- Don't allow async unbinding if the vma_res pages are not the same as
the object pages. (Matthew Auld)
v7:
- s/unsigned long/u64/ in a number of places (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
|
|
We already have the gem/i915_gem_userptr.c file.
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c29f66604ebd973b8eff1cce7d7c53615a26480f.1641561552.git.jani.nikula@intel.com
|
|
GGTT is currently available both through i915->ggtt and gt->ggtt, and we
eventually want to get rid of the i915->ggtt one.
Use to_gt() for all i915->ggtt accesses to help with the future
refactoring.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220104223550.56135-1-andi.shyti@linux.intel.com
|
|
We will need the lock to unbind the vma, and wait for bind to complete.
Remove the special casing for the !ww path, and force ww locking for all.
Changes since v1:
- Pass err to for_i915_gem_ww handling for -EDEADLK handling.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-6-maarten.lankhorst@linux.intel.com
|
|
Use to_gt() helper consistently throughout the codebase.
Pure mechanical s/i915->gt/to_gt(i915). No functional changes.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-10-andi.shyti@linux.intel.com
|
|
Gem-TTM objects that are backed by shmem might have populated
page-vectors without having the GEM pages set. Those objects
aren't moved to the correct shrinker / purge list by gem_madvise.
For such objects, identified by having the
_SELF_MANAGED_SHRINK_LIST set, make sure they end up on the
correct list.
v2:
- Revert a change that made swapped-out objects inaccessible for
truncating. (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211108123637.929617-1-thomas.hellstrom@linux.intel.com
|
|
Move it next to its partner in crime; gpu_write_needs_clflush. For
better readability lets keep gpu vs cpu at least in the same file.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027161813.3094681-3-matthew.auld@intel.com
|
|
Support for multiple GT's within a single i915 device will be arriving
soon. Since each GT may have its own fusing and require different
workarounds, we need to make the GT workaround functions and multicast
steering setup per-gt.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210917170845.836358-1-matthew.d.roper@intel.com
|
|
As we're about to add more ww-related functionality,
break out the dma_resv ww locking utilities to their own files
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-3-thomas.hellstrom@linux.intel.com
|
|
Since the ww transaction endpoint easily end up far out-of-scope of
the objects on the ww object list, particularly for contending lock
objects, make sure we reference objects on the list so they don't
disappear under us.
This comes with a performance penalty so it's been debated whether this
is really needed. But I think this is motivated by the fact that locking
is typically difficult to get right, and whatever we can do to make it
simpler for developers moving forward should be done, unless the
performance impact is far too high.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-2-thomas.hellstrom@linux.intel.com
|
|
Between
commit ae30af84edb5b7cc95485922e43afd909a892e1b
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date: Tue Mar 23 16:50:00 2021 +0100
drm/i915: Disable userptr pread/pwrite support.
and
commit 0049b688459b846f819b6e51c24cd0781fcfde41
Author: Matthew Auld <matthew.auld@intel.com>
Date: Thu Nov 5 15:49:33 2020 +0000
drm/i915/gem: Allow backends to override pread implementation
this accidentally landed twice.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616090350.828696-1-daniel.vetter@ffwll.ch
|
|
Most logical place to introduce TTM buffer objects is as an i915
gem object backend. We need to add some ops to account for added
functionality like delayed delete and LRU list manipulation.
Initially we support only LMEM and SYSTEM memory, but SYSTEM
(which in this case means evicted LMEM objects) is not
visible to i915 GEM yet. The plan is to move the i915 gem system region
over to the TTM system memory type in upcoming patches.
We set up GPU bindings directly both from LMEM and from the system region,
as there is no need to use the legacy TTM_TT memory type. We reserve
that for future porting of GGTT bindings to TTM.
Remove the old lmem backend.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610070152.572423-2-thomas.hellstrom@linux.intel.com
|
|
UAPI Changes:
- Disable mmap ioctl for gen12+ (excl. TGL-LP)
- Start enabling HuC loading by default for upcoming Gen12+
platforms (excludes TGL and RKL)
Core Changes:
- Backmerge of drm-next
Driver Changes:
- Revert "i915: use io_mapping_map_user" (Eero, Matt A)
- Initialize the TTM device and memory managers (Thomas)
- Major rework to the GuC submission backend to prepare
for enabling on new platforms (Michal Wa., Daniele,
Matt B, Rodrigo)
- Fix i915_sg_page_sizes to record dma segments rather
than physical pages (Thomas)
- Locking rework to prep for TTM conversion (Thomas)
- Replace IS_GEN and friends with GRAPHICS_VER (Lucas)
- Use DEVICE_ATTR_RO macro (Yue)
- Static code checker fixes (Zhihao)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMHeDxg9VLiFtyn3@jlahtine-mobl.ger.corp.intel.com
|
|
Cross-subsystem Changes:
- x86/gpu: add JasperLake to gen11 early quirks
(Although the patch lacks the Ack info, it has been Acked by Borislav)
Driver Changes:
- General DMC improves (Anusha)
- More ADL-P enabling (Vandita, Matt, Jose, Mika, Anusha, Imre, Lucas, Jani, Manasi, Ville, Stanislav)
- Introduce MBUS relative dbuf offset (Ville)
- PSR fixes and improvements (Gwan, Jose, Ville)
- Re-enable LTTPR non-transparent LT mode for DPCD_REV < 1.4 (Ville)
- Remove duplicated declarations (Shaokun, Wan)
- Check HDMI sink deep color capabilities during .mode_valid (Ville)
- Fix display flicker screan related to console and FBC (Chris)
- Remaining conversions of GRAPHICS_VER (Lucas)
- Drop invalid FIXME (Jose)
- Fix bigjoiner check in dsc_disable (Vandita)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMEy2Ew82BeL/hDK@intel.com
|
|
This was done by the following semantic patch:
@@ expression i915; @@
- INTEL_GEN(i915)
+ GRAPHICS_VER(i915)
@@ expression i915; expression E; @@
- INTEL_GEN(i915) >= E
+ GRAPHICS_VER(i915) >= E
@@ expression dev_priv; expression E; @@
- !IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) != E
@@ expression dev_priv; expression E; @@
- IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) == E
@@
expression dev_priv;
expression from, until;
@@
- IS_GEN_RANGE(dev_priv, from, until)
+ IS_GRAPHICS_VER(dev_priv, from, until)
@def@
expression E;
identifier id =~ "^gen$";
@@
- id = GRAPHICS_VER(E)
+ ver = GRAPHICS_VER(E)
@@
identifier def.id;
@@
- id
+ ver
It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210606045050.103862-2-lucas.demarchi@intel.com
|
|
Temporarily remove the buddy allocator and related selftests
and hook up the TTM range manager for i915 regions.
Also modify the mock region selftests somewhat to account for a
fragmenting manager.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-2-thomas.hellstrom@linux.intel.com
|
|
When instantiating a tiled object on an L-shaped memory machine, we mark
the object as unshrinkable to prevent the shrinker from trying to swap
out the pages. We have to do this as we do not know the swizzling on the
individual pages, and so the data will be scrambled across swap out/in.
Not only do we need to move the object off the shrinker list, we need to
mark the object with shrink_pin so that the counter is consistent across
calls to madvise.
v2: in the madvise ioctl we need to check if the object is currently
shrinkable/purgeable, not if the object type supports shrinking
Fixes: 0175969e489a ("drm/i915/gem: Use shrinkable status for unknown swizzle quirks")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3293
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3450
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20210517084640.18862-1-matthew.auld@intel.com
|
|
We changed the locking hierarchy for both ppgtt and ggtt, so both locks
should be trylocked inside i915_gem_object_unbind().
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: bc6f80cce9ae ("drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2.")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210429120158.1105318-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #irc
|
|
The stop_machine() lock may allocate memory, but is called inside
vm->mutex, which is taken in the shrinker. This will cause a lockdep
splat, as can be seen below:
<4>[ 462.585762] ======================================================
<4>[ 462.585768] WARNING: possible circular locking dependency detected
<4>[ 462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G U
<4>[ 462.585779] ------------------------------------------------------
<4>[ 462.585783] i915_selftest/5540 is trying to acquire lock:
<4>[ 462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30
<4>[ 462.585814]
but task is already holding lock:
<4>[ 462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
<4>[ 462.586301]
which lock already depends on the new lock.
<4>[ 462.586305]
the existing dependency chain (in reverse order) is:
<4>[ 462.586309]
-> #2 (&vm->mutex/1){+.+.}-{3:3}:
<4>[ 462.586323] i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4>[ 462.586719] i915_address_space_init+0x12d/0x130 [i915]
<4>[ 462.587092] ppgtt_init+0x4e/0x80 [i915]
<4>[ 462.587467] gen8_ppgtt_create+0x3e/0x5c0 [i915]
<4>[ 462.587828] i915_ppgtt_create+0x28/0xf0 [i915]
<4>[ 462.588203] intel_gt_init+0x123/0x370 [i915]
<4>[ 462.588572] i915_gem_init+0x129/0x1f0 [i915]
<4>[ 462.588971] i915_driver_probe+0x753/0xd80 [i915]
<4>[ 462.589320] i915_pci_probe+0x43/0x1d0 [i915]
<4>[ 462.589671] pci_device_probe+0x9e/0x110
<4>[ 462.589680] really_probe+0xea/0x410
<4>[ 462.589690] driver_probe_device+0xd9/0x140
<4>[ 462.589697] device_driver_attach+0x4a/0x50
<4>[ 462.589704] __driver_attach+0x83/0x140
<4>[ 462.589711] bus_for_each_dev+0x75/0xc0
<4>[ 462.589718] bus_add_driver+0x14b/0x1f0
<4>[ 462.589724] driver_register+0x66/0xb0
<4>[ 462.589731] i915_init+0x70/0x87 [i915]
<4>[ 462.590053] do_one_initcall+0x56/0x2e0
<4>[ 462.590061] do_init_module+0x55/0x200
<4>[ 462.590068] load_module+0x2703/0x2990
<4>[ 462.590074] __do_sys_finit_module+0xad/0x110
<4>[ 462.590080] do_syscall_64+0x33/0x80
<4>[ 462.590089] entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[ 462.590096]
-> #1 (fs_reclaim){+.+.}-{0:0}:
<4>[ 462.590109] fs_reclaim_acquire+0x9f/0xd0
<4>[ 462.590118] kmem_cache_alloc_trace+0x3d/0x430
<4>[ 462.590126] intel_cpuc_prepare+0x3b/0x1b0
<4>[ 462.590133] cpuhp_invoke_callback+0x9e/0x890
<4>[ 462.590141] _cpu_up+0xa4/0x130
<4>[ 462.590147] cpu_up+0x82/0x90
<4>[ 462.590153] bringup_nonboot_cpus+0x4a/0x60
<4>[ 462.590159] smp_init+0x21/0x5c
<4>[ 462.590167] kernel_init_freeable+0x8a/0x1b7
<4>[ 462.590175] kernel_init+0x5/0xff
<4>[ 462.590181] ret_from_fork+0x22/0x30
<4>[ 462.590187]
-> #0 (cpu_hotplug_lock){++++}-{0:0}:
<4>[ 462.590199] __lock_acquire+0x1520/0x2590
<4>[ 462.590207] lock_acquire+0xd1/0x3d0
<4>[ 462.590213] cpus_read_lock+0x39/0xc0
<4>[ 462.590219] stop_machine+0x12/0x30
<4>[ 462.590226] bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
<4>[ 462.590601] ggtt_bind_vma+0x5d/0x80 [i915]
<4>[ 462.590970] i915_vma_bind+0xdc/0x1c0 [i915]
<4>[ 462.591374] i915_vma_pin_ww+0x435/0xb40 [i915]
<4>[ 462.591779] make_obj_busy+0xcb/0x330 [i915]
<4>[ 462.592170] igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
<4>[ 462.592562] __i915_subtests.cold.7+0x42/0x92 [i915]
<4>[ 462.592995] __run_selftests.part.3+0x10d/0x172 [i915]
<4>[ 462.593428] i915_live_selftests.cold.5+0x1f/0x47 [i915]
<4>[ 462.593860] i915_pci_probe+0x93/0x1d0 [i915]
<4>[ 462.594210] pci_device_probe+0x9e/0x110
<4>[ 462.594217] really_probe+0xea/0x410
<4>[ 462.594226] driver_probe_device+0xd9/0x140
<4>[ 462.594233] device_driver_attach+0x4a/0x50
<4>[ 462.594240] __driver_attach+0x83/0x140
<4>[ 462.594247] bus_for_each_dev+0x75/0xc0
<4>[ 462.594254] bus_add_driver+0x14b/0x1f0
<4>[ 462.594260] driver_register+0x66/0xb0
<4>[ 462.594267] i915_init+0x70/0x87 [i915]
<4>[ 462.594586] do_one_initcall+0x56/0x2e0
<4>[ 462.594592] do_init_module+0x55/0x200
<4>[ 462.594599] load_module+0x2703/0x2990
<4>[ 462.594605] __do_sys_finit_module+0xad/0x110
<4>[ 462.594612] do_syscall_64+0x33/0x80
<4>[ 462.594618] entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[ 462.594625]
other info that might help us debug this:
<4>[ 462.594629] Chain exists of:
cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1
<4>[ 462.594645] Possible unsafe locking scenario:
<4>[ 462.594648] CPU0 CPU1
<4>[ 462.594652] ---- ----
<4>[ 462.594655] lock(&vm->mutex/1);
<4>[ 462.594664] lock(fs_reclaim);
<4>[ 462.594671] lock(&vm->mutex/1);
<4>[ 462.594679] lock(cpu_hotplug_lock);
<4>[ 462.594686]
*** DEADLOCK ***
<4>[ 462.594690] 4 locks held by i915_selftest/5540:
<4>[ 462.594696] #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50
<4>[ 462.594715] #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915]
<4>[ 462.595118] #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915]
<4>[ 462.595519] #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
<4>[ 462.595934]
stack backtrace:
<4>[ 462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G U 5.12.0-rc5-CI-Trybot_7644+ #1
<4>[ 462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018
<4>[ 462.595952] Call Trace:
<4>[ 462.595961] dump_stack+0x7f/0xad
<4>[ 462.595974] check_noncircular+0x12e/0x150
<4>[ 462.595982] ? save_stack.isra.17+0x3f/0x70
<4>[ 462.595991] ? drm_mm_insert_node_in_range+0x34a/0x5b0
<4>[ 462.596000] ? i915_vma_pin_ww+0x9ec/0xb40 [i915]
<4>[ 462.596410] __lock_acquire+0x1520/0x2590
<4>[ 462.596419] ? do_init_module+0x55/0x200
<4>[ 462.596429] lock_acquire+0xd1/0x3d0
<4>[ 462.596435] ? stop_machine+0x12/0x30
<4>[ 462.596445] ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915]
<4>[ 462.596816] cpus_read_lock+0x39/0xc0
<4>[ 462.596824] ? stop_machine+0x12/0x30
<4>[ 462.596831] stop_machine+0x12/0x30
<4>[ 462.596839] bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
<4>[ 462.597210] ggtt_bind_vma+0x5d/0x80 [i915]
<4>[ 462.597580] i915_vma_bind+0xdc/0x1c0 [i915]
<4>[ 462.597986] i915_vma_pin_ww+0x435/0xb40 [i915]
<4>[ 462.598395] ? make_obj_busy+0xcb/0x330 [i915]
<4>[ 462.598786] make_obj_busy+0xcb/0x330 [i915]
<4>[ 462.599180] ? 0xffffffff81000000
<4>[ 462.599187] ? debug_mutex_unlock+0x50/0xa0
<4>[ 462.599198] igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
<4>[ 462.599592] __i915_subtests.cold.7+0x42/0x92 [i915]
<4>[ 462.600026] ? i915_perf_selftests+0x20/0x20 [i915]
<4>[ 462.600422] ? __i915_nop_setup+0x10/0x10 [i915]
<4>[ 462.600820] __run_selftests.part.3+0x10d/0x172 [i915]
<4>[ 462.601253] i915_live_selftests.cold.5+0x1f/0x47 [i915]
<4>[ 462.601686] i915_pci_probe+0x93/0x1d0 [i915]
<4>[ 462.602037] ? _raw_spin_unlock_irqrestore+0x3d/0x60
<4>[ 462.602047] pci_device_probe+0x9e/0x110
<4>[ 462.602057] really_probe+0xea/0x410
<4>[ 462.602067] driver_probe_device+0xd9/0x140
<4>[ 462.602075] device_driver_attach+0x4a/0x50
<4>[ 462.602084] __driver_attach+0x83/0x140
<4>[ 462.602091] ? device_driver_attach+0x50/0x50
<4>[ 462.602099] ? device_driver_attach+0x50/0x50
<4>[ 462.602107] bus_for_each_dev+0x75/0xc0
<4>[ 462.602116] bus_add_driver+0x14b/0x1f0
<4>[ 462.602124] driver_register+0x66/0xb0
<4>[ 462.602133] i915_init+0x70/0x87 [i915]
<4>[ 462.602453] ? 0xffffffffa0606000
<4>[ 462.602458] do_one_initcall+0x56/0x2e0
<4>[ 462.602466] ? kmem_cache_alloc_trace+0x374/0x430
<4>[ 462.602476] do_init_module+0x55/0x200
<4>[ 462.602484] load_module+0x2703/0x2990
<4>[ 462.602500] ? __do_sys_finit_module+0xad/0x110
<4>[ 462.602507] __do_sys_finit_module+0xad/0x110
<4>[ 462.602519] do_syscall_64+0x33/0x80
<4>[ 462.602527] entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[ 462.602535] RIP: 0033:0x7fab69d8d89d
Changes since v1:
- Add lockdep annotations during init, to ensure that lockdep is primed.
This also fixes a false positive when reading /proc/lockdep_stats
during module reload.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
With all callers and selftests fixed to use ww locking, we can now
finally remove this lock.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-62-maarten.lankhorst@linux.intel.com
|
|
We are removing obj->mm.lock, and need to take the reservation lock
before we can pin pages. Move the pinning pages into the helper, and
merge gtt pwrite/pread preparation and cleanup paths.
The fence lock is also removed; it will conflict with fence annotations,
because of memory allocations done when pagefaulting inside copy_*_user.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[danvet: Pick the older version to avoid the conflicts]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128162612.927917-31-maarten.lankhorst@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-31-maarten.lankhorst@linux.intel.com
|
|
We previously complained when ww == NULL.
This function is now only used in selftests to pin an object,
and ww locking is now fixed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[danvet: Resolve conflict because we don't have a set-domain refactor,
see
https://lore.kernel.org/intel-gfx/20210203090205.25818-8-chris@chris-wilson.co.uk/
The really worrying thing here is that the above patch had a change in
arguments for i915_gem_object_set_to_gtt_domain(), without any
explanation. I decided to just faithfully apply Maarten's change but
not the argument change which was in Maarten's context diff.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-26-maarten.lankhorst@linux.intel.com
|
|
Instead of doing what we do currently, which will never work with
PROVE_LOCKING, do the same as AMD does, and something similar to
relocation slowpath. When all locks are dropped, we acquire the
pages for pinning. When the locks are taken, we transfer those
pages in .get_pages() to the bo. As a final check before installing
the fences, we ensure that the mmu notifier was not called; if it is,
we return -EAGAIN to userspace to signal it has to start over.
Changes since v1:
- Unbinding is done in submit_init only. submit_begin() removed.
- MMU_NOTFIER -> MMU_NOTIFIER
Changes since v2:
- Make i915->mm.notifier a spinlock.
Changes since v3:
- Add WARN_ON if there are any page references left, should have been 0.
- Return 0 on success in submit_init(), bug from spinlock conversion.
- Release pvec outside of notifier_lock (Thomas).
Changes since v4:
- Mention why we're clearing eb->[i + 1].vma in the code. (Thomas)
- Actually check all invalidations in eb_move_to_gpu. (Thomas)
- Do not wait when process is exiting to fix gem_ctx_persistence.userptr.
Changes since v5:
- Clarify why check on PF_EXITING is (temporarily) required.
Changes since v6:
- Ensure userptr validity is checked in set_domain through a special path.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
[danvet: s/kfree/kvfree/ in i915_gem_object_userptr_drop_ref in the
previous review round, but which got lost. The other open questions
around page refcount are imo better discussed in a separate series,
with amdgpu folks involved].
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-17-maarten.lankhorst@linux.intel.com
|
|
Userptr should not need the kernel for a userspace memcpy, userspace
needs to call memcpy directly.
Specifically, disable i915_gem_pwrite_ioctl() and i915_gem_pread_ioctl().
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-12-maarten.lankhorst@linux.intel.com
|
|
Doesn't need the full ww lock, only checking if pages are bound.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #irc
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-7-maarten.lankhorst@linux.intel.com
|
|
The rationale for this change is roughly as follows:
1. The functionality can be done entirely in userspace with a
combination of mmap + memcpy
2. The only reason anyone in userspace is still using it is because
someone implemented bo_subdata that way in libdrm ages ago and
they're all too lazy to write the 5 lines of code to do a map.
3. This falls cleanly into the category of things which will only get
more painful with local memory support.
These ioctls aren't used much anymore by "real" userspace drivers.
Vulkan has never used them and neither has the iris GL driver. The old
i965 GL driver does use PWRITE for glBufferSubData but it only supports
up through Gen11; Gen12 was never enabled in i965. The compute driver
has never used PREAD/PWRITE. The only remaining user is the media
driver which uses it exactly twice and they're easily removed [1] so
expecting them to drop it going forward is reasonable.
IGT changes which handle this kernel change have also been submitted [2].
[1] https://github.com/intel/media-driver/pull/1160
[2] https://patchwork.freedesktop.org/series/81384/
v2 (Jason Ekstrand):
- Improved commit message with the status of all usermode drivers
- A more future-proof platform check
v3 (Jason Ekstrand):
- Drop the HAS_LMEM checks as they're already covered by the version
checks
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-4-jason@jlekstrand.net
|