aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_vma.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-10-31drm/i915: Split detaching and removing the vmaChris Wilson1-20/+17
In order to keep the assert_bind_count() valid, we need to hold the vma page reference until after we drop the bind count. However, we must also keep the drm_mm_remove_node() as the last action of i915_vma_unbind() so that it serialises with the unlocked check inside i915_vma_destroy(). So we need to split up i915_vma_remove() so that we order the detach, drop pages and remove as required during unbind. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112067 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191030192159.18404-1-chris@chris-wilson.co.uk
2019-10-21drm/i915: Lift i915_vma_parked() onto the gtChris Wilson1-16/+16
Currently even though i915_vma_parked() operates on a per-gt struct, it is called from a global pm notify. This oddity was only because the long term plan is to decouple the vma cache from the pm notification, but right now the oddity stands out like a sore thumb! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191021183236.21790-1-chris@chris-wilson.co.uk
2019-10-15drm/i915: Remove leftover vma->obj->pages_pin_count on insert/removeChris Wilson1-4/+1
We now do the page pin count upfront in vma_get_pages/vma_put_pages, so that we do the allocations before we enter the vm->mutex. Our vma page references we are tracked in vma->pages_count and the extra obj->pages_pin_count being performed later in i915_vma_insert and i915_vma_remove is redundant, and worse throws off the shrinker's logic on when it can free an object by unbinding it. Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reported-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191015100155.10376-1-chris@chris-wilson.co.uk
2019-10-15drm/i915: Drop obj.page_pin_count after a failed vma->set_pages()Chris Wilson1-1/+4
Before we attempt to set_pages on the vma, we claim a obj.pages_pin_count for it. If we subsequently fail to set the pages on the vma, we need to drop our pinning before returning the error. Reported-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191015093915.3995-1-chris@chris-wilson.co.uk
2019-10-04drm/i915: Coordinate i915_active with its own mutexChris Wilson1-2/+2
Forgo the struct_mutex serialisation for i915_active, and interpose its own mutex handling for active/retire. This is a multi-layered sleight-of-hand. First, we had to ensure that no active/retire callbacks accidentally inverted the mutex ordering rules, nor assumed that they were themselves serialised by struct_mutex. More challenging though, is the rule over updating elements of the active rbtree. Instead of the whole i915_active now being serialised by struct_mutex, allocations/rotations of the tree are serialised by the i915_active.mutex and individual nodes are serialised by the caller using the i915_timeline.mutex (we need to use nested spinlocks to interact with the dma_fence callback lists). The pain point here is that instead of a single mutex around execbuf, we now have to take a mutex for active tracker (one for each vma, context, etc) and a couple of spinlocks for each fence update. The improvement in fine grained locking allowing for multiple concurrent clients (eventually!) should be worth it in typical loads. v2: Add some comments that barely elucidate anything :( Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
2019-10-04drm/i915: Push the i915_active.retire into a workerChris Wilson1-0/+2
As we need to use a mutex to serialise i915_active activation (because we want to allow the callback to sleep), we need to push the i915_active.retire into a worker callback in case we get need to retire from an atomic context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-5-chris@chris-wilson.co.uk
2019-10-04drm/i915: Pull i915_vma_pin under the vm->mutexChris Wilson1-158/+366
Replace the struct_mutex requirement for pinning the i915_vma with the local vm->mutex instead. Note that the vm->mutex is tainted by the shrinker (we require unbinding from inside fs-reclaim) and so we cannot allocate while holding that mutex. Instead we have to preallocate workers to do allocate and apply the PTE updates after we have we reserved their slot in the drm_mm (using fences to order the PTE writes with the GPU work and with later unbind). In adding the asynchronous vma binding, one subtle requirement is to avoid coupling the binding fence into the backing object->resv. That is the asynchronous binding only applies to the vma timeline itself and not to the pages as that is a more global timeline (the binding of one vma does not need to be ordered with another vma, nor does the implicit GEM fencing depend on a vma, only on writes to the backing store). Keeping the vma binding distinct from the backing store timelines is verified by a number of async gem_exec_fence and gem_exec_schedule tests. The way we do this is quite simple, we keep the fence for the vma binding separate and only wait on it as required, and never add it to the obj->resv itself. Another consequence in reducing the locking around the vma is the destruction of the vma is no longer globally serialised by struct_mutex. A natural solution would be to add a kref to i915_vma, but that requires decoupling the reference cycles, possibly by introducing a new i915_mm_pages object that is own by both obj->mm and vma->pages. However, we have not taken that route due to the overshadowing lmem/ttm discussions, and instead play a series of complicated games with trylocks to (hopefully) ensure that only one destruction path is called! v2: Add some commentary, and some helpers to reduce patch churn. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-4-chris@chris-wilson.co.uk
2019-10-04drm/i915: Only track bound elements of the GTTChris Wilson1-10/+2
The premise here is to simply avoiding having to acquire the vm->mutex inside vma create/destroy to update the vm->unbound_lists, to avoid some nasty lock recursions later. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-2-chris@chris-wilson.co.uk
2019-10-04drm/i915: Use helpers for drm_mm_node booleansChris Wilson1-1/+1
A subset of 71724f708997 ("drm/mm: Use helpers for drm_mm_node booleans") in order to prepare drm-intel-next-queued for subsequent patches before we can backmerge 71724f708997 itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004142226.13711-1-chris@chris-wilson.co.uk
2019-09-20drm/i915: Mark i915_request.timeline as a volatile, rcu pointerChris Wilson1-4/+2
The request->timeline is only valid until the request is retired (i.e. before it is completed). Upon retiring the request, the context may be unpinned and freed, and along with it the timeline may be freed. We therefore need to be very careful when chasing rq->timeline that the pointer does not disappear beneath us. The vast majority of users are in a protected context, either during request construction or retirement, where the timeline->mutex is held and the timeline cannot disappear. It is those few off the beaten path (where we access a second timeline) that need extra scrutiny -- to be added in the next patch after first adding the warnings about dangerous access. One complication, where we cannot use the timeline->mutex itself, is during request submission onto hardware (under spinlocks). Here, we want to check on the timeline to finalize the breadcrumb, and so we need to impose a second rule to ensure that the request->timeline is indeed valid. As we are submitting the request, it's context and timeline must be pinned, as it will be used by the hardware. Since it is pinned, we know the request->timeline must still be valid, and we cannot submit the idle barrier until after we release the engine->active.lock, ergo while submitting and holding that spinlock, a second thread cannot release the timeline. v2: Don't be lazy inside selftests; hold the timeline->mutex for as long as we need it, and tidy up acquiring the timeline with a bit of refactoring (i915_active_add_request) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
2019-09-11drm/i915: Make i915_vma.flags atomic_t for mutex reductionChris Wilson1-14/+15
In preparation for reducing struct_mutex stranglehold around the vm, make the vma.flags atomic so that we can acquire a pin on the vma atomically before deciding if we need to take the mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk
2019-09-09drm/i915: cleanup cache-coloringMatthew Auld1-11/+11
Try to tidy up the cache-coloring such that we rid the code of any mm.color_adjust assumptions, this should hopefully make it more obvious in the code when we need to actually use the cache-level as the color, and as a bonus should make adding a different color-scheme simpler. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
2019-09-09drm/i915: export color_differsMatthew Auld1-7/+4
Export color_differs so that we can use it elsewhere. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-1-matthew.auld@intel.com
2019-08-22drm/i915: Replace i915_vma_put_fence()Chris Wilson1-1/+3
Avoid calling i915_vma_put_fence() by using our alternate paths that bind a secondary vma avoiding the original fenced vma. For the few instances where we need to release the fence (i.e. on binding when the GGTT range becomes invalid), replace the put_fence with a revoke_fence. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190822061557.18402-1-chris@chris-wilson.co.uk
2019-08-22drm/i915: Pull obj->userfault tracking under the ggtt->mutexChris Wilson1-1/+3
Since we want to revoke the ggtt vma from only under the ggtt->mutex, we need to move protection of the userfault tracking from the struct_mutex to the ggtt->mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190822060914.2671-2-chris@chris-wilson.co.uk
2019-08-22Merge drm/drm-next into drm-intel-next-queuedRodrigo Vivi1-3/+3
We need the rename of reservation_object to dma_resv. The solution on this merge came from linux-next: From: Stephen Rothwell <sfr@canb.auug.org.au> Date: Wed, 14 Aug 2019 12:48:39 +1000 Subject: [PATCH] drm: fix up fallout from "dma-buf: rename reservation_object to dma_resv" Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> --- drivers/gpu/drm/i915/gt/intel_engine_pool.c | 8 ++++---- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c index 03d90b49584a..4cd54c569911 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c @@ -43,12 +43,12 @@ static int pool_active(struct i915_active *ref) { struct intel_engine_pool_node *node = container_of(ref, typeof(*node), active); - struct reservation_object *resv = node->obj->base.resv; + struct dma_resv *resv = node->obj->base.resv; int err; - if (reservation_object_trylock(resv)) { - reservation_object_add_excl_fence(resv, NULL); - reservation_object_unlock(resv); + if (dma_resv_trylock(resv)) { + dma_resv_add_excl_fence(resv, NULL); + dma_resv_unlock(resv); } err = i915_gem_object_pin_pages(node->obj); which is a simplified version from a previous one which had: Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-08-21Merge tag 'drm-misc-next-2019-08-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-nextDave Airlie1-4/+4
drm-misc-next for 5.4: UAPI Changes: Cross-subsystem Changes: Core Changes: - dma-buf: add reservation_object_fences helper, relax reservation_object_add_shared_fence, remove reservation_object seq number (and then restored) - dma-fence: Shrinkage of the dma_fence structure, Merge dma_fence_signal and dma_fence_signal_locked, Store the timestamp in struct dma_fence in a union with cb_list Driver Changes: - More dt-bindings YAML conversions - More removal of drmP.h includes - dw-hdmi: Support get_eld and various i2s improvements - gm12u320: Few fixes - meson: Global cleanup - panfrost: Few refactors, Support for GPU heap allocations - sun4i: Support for DDC enable GPIO - New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02, Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1 Toppoly TD043MTEA1 Signed-off-by: Dave Airlie <airlied@redhat.com> [airlied: fixup dma_resv rename fallout] From: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819141923.7l2adietcr2pioct@flea
2019-08-20drm/i915: Be defensive when starting vma activityChris Wilson1-2/+1
Before we acquire the vma for GPU activity, ensure that the underlying object is not already in the process of being freed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190820100531.8430-1-chris@chris-wilson.co.uk
2019-08-16drm/i915: Markup expected timeline locks for i915_activeChris Wilson1-2/+2
As every i915_active_request should be serialised by a dedicated lock, i915_active consists of a tree of locks; one for each node. Markup up the i915_active_request with what lock is supposed to be guarding it so that we can verify that the serialised updated are indeed serialised. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816121000.8507-2-chris@chris-wilson.co.uk
2019-08-16drm/i915: Extract intel_frontbuffer active trackingChris Wilson1-2/+4
Move the active tracking for the frontbuffer operations out of the i915_gem_object and into its own first class (refcounted) object. In the process of detangling, we switch from low level request tracking to the easier i915_active -- with the plan that this avoids any potential atomic callbacks as the frontbuffer tracking wishes to sleep as it flushes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816074635.26062-1-chris@chris-wilson.co.uk
2019-08-13dma-buf: rename reservation_object to dma_resvChristian König1-8/+8
Be more consistent with the naming of the other DMA-buf objects. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/323401/
2019-08-12drm/i915: Forgo last_fence active request trackingChris Wilson1-13/+0
We were using the last_fence to track the last request that used this vma that might be interpreted by a fence register and forced ourselves to wait for this request before modifying any fence register that overlapped our vma. Due to requirement that we need to track any XY_BLT command, linear or tiled, this in effect meant that we have to track the vma for its active lifespan anyway, so we can forgo the explicit last_fence tracking and just use the whole vma->active. Another solution would be to pipeline the register updates, and would help resolve some long running stalls for gen3 (but only gen 2 and 3!) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190812174804.26180-1-chris@chris-wilson.co.uk
2019-08-07drm/i915: avoid including intel_drv.h via i915_drv.h->i915_trace.hJani Nikula1-0/+1
Disentangle i915_drv.h from intel_drv.h, which gets included via i915_trace.h. This necessitates including i915_trace.h wherever it's needed. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ed82bf259d3b725a1a1a3c3e9d6fb5c08bc4d489.1565085691.git.jani.nikula@intel.com
2019-08-02drm/i915: Hide unshrinkable context objects from the shrinkerChris Wilson1-0/+16
The shrinker cannot touch objects used by the contexts (logical state and ring). Currently we mark those as "pin_global" to let the shrinker skip over them, however, if we remove them from the shrinker lists entirely, we don't event have to include them in our shrink accounting. By keeping the unshrinkable objects in our shrinker tracking, we report a large number of objects available to be shrunk, and leave the shrinker deeply unsatisfied when we fail to reclaim those. The shrinker will persist in trying to reclaim the unavailable objects, forcing the system into a livelock (not even hitting the dread oomkiller). v2: Extend unshrinkable protection for perma-pinned scratch and guc allocations (Tvrtko) v3: Notice that we should be pinned when marking unshrinkable and so the link cannot be empty; merge duplicate paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190802212137.22207-1-chris@chris-wilson.co.uk
2019-08-02drm/i915: Report resv_obj allocation failureChris Wilson1-22/+9
Since commit 64d6c500a384 ("drm/i915: Generalise GPU activity tracking"), we have been prepared for i915_vma_move_to_active() to fail. We can take advantage of this to report the failure for allocating the shared-fence slot in the reservation_object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190730205805.3733-1-chris@chris-wilson.co.uk
2019-07-30drm/i915: Move aliasing_ppgtt underneath its i915_ggttChris Wilson1-1/+1
The aliasing_ppgtt provides a PIN_USER alias for the global gtt, so move it under the i915_ggtt to simplify later transformations to enable intel_context.vm. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190730143209.4549-1-chris@chris-wilson.co.uk
2019-07-03drm/i915: Mark up vma->active as safe for use inside shrinkersChris Wilson1-0/+8
Since a shrinker may be forced to wait on GPU activity, i915_active_wait(&vma->active) must be safe for use inside a shrinker, and so let's mark up the lock as being acquired by the shrinker to avoid any nasty surprises creeping in. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190703091726.11690-8-chris@chris-wilson.co.uk
2019-06-21drm/i915: Provide an i915_active.acquire callbackChris Wilson1-8/+14
If we introduce a callback for i915_active that is only called the first time we use the i915_active and is symmetrically paired with the i915_active.retire callback, we can replace the open-coded and non-atomic implementations -- which will be very fragile (i.e. broken) upon removing the struct_mutex serialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-4-chris@chris-wilson.co.uk
2019-06-21drm/i915: Throw away the active object retirement complexityChris Wilson1-43/+11
Remove the accumulated optimisations that we have for i915_vma_retire and reduce it to the bare essential of tracking the active object reference. This allows us to only use atomic operations, and so will be able to avoid the struct_mutex requirement. The principal loss here is the shrinker MRU bumping, so now if we have to shrink, we will do so in much more random order and more likely to try and shrink recently used objects. That is a nuisance, but shrinking active objects is a second step we try to avoid and will always be a system-wide performance issue. The other loss is here is in the automatic pruning of the reservation_object when idling. This is not as large an issue as upon reservation_object introduction as now adding new fences into the object replaces already signaled fences, keeping the array compact. But we do lose the auto-expiration of stale fences and unused arrays. That may be a noticeable problem for which we need to re-implement autopruning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190621183801.23252-3-chris@chris-wilson.co.uk
2019-06-21drm/i915: Convert i915_gem_flush_ggtt_writes to intel_gtTvrtko Ursulin1-1/+2
Having introduced struct intel_gt (named the anonymous structure in i915) we can start using it to compartmentalize our code better. It makes more sense logically to have the code internally like this and it will also help with future split between gt and display in i915. v2: * Keep ggtt flush before fb obj flush. (Chris) v3: * Fix refactoring fail. * Always flush ggtt writes. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-23-tvrtko.ursulin@linux.intel.com
2019-06-18drm/i915: Use drm_gem_object.resvChris Wilson1-5/+5
Since commit 1ba627148ef5 ("drm: Add reservation_object to drm_gem_object"), struct drm_gem_object grew its own builtin reservation_object rendering our own private one bloat. Remove our redundant reservation_object and point into obj->base.resv instead. References: 1ba627148ef5 ("drm: Add reservation_object to drm_gem_object") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618125858.7295-1-chris@chris-wilson.co.uk
2019-06-17drm/i915: move modesetting core code under display/Jani Nikula1-5/+5
Now that we have a new subdirectory for display code, continue by moving modesetting core code. display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this is, again, a surprisingly clean operation. v2: - don't move intel_sideband.[ch] (Ville) - use tabs for Makefile file lists and sort them Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
2019-06-14drm/i915: Remove rpm asserts that use i915Daniele Ceraolo Spurio1-1/+1
Quite a few of the call points have already switched to the version working directly on the runtime_pm structure, so let's switch over the rest and kill the i915-based asserts. v2: rebase Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-3-daniele.ceraolospurio@intel.com
2019-06-12drm/i915: Combine unbound/bound list tracking for objectsChris Wilson1-29/+5
With async binding, we don't want to manage a bound/unbound list as we may end up running before we even acquire the pages. All that is required is keeping track of shrinkable objects, so reduce it to the minimum list. Fixes: 6951e5893b48 ("drm/i915: Move GEM object domain management from struct_mutex to local") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190612105720.30310-1-chris@chris-wilson.co.uk
2019-06-10drm/i915: Promote i915->mm.obj_lock to be irqsafeChris Wilson1-6/+11
The intent is to be able to update the mm.lists from inside an irqsoff section (e.g. from a softirq rcu workqueue), ergo we need to make the i915->mm.obj_lock irqsafe. v2: can_discard_pages() ensures we are shrinkable v3: Beware shadowing of 'flags' Fixes: 3b4fa9640ccd ("drm/i915: Track the purgeable objects on a separate eviction list") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110869 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190610145430.17717-1-chris@chris-wilson.co.uk
2019-06-06drm/i915: Move object close under its own lockChris Wilson1-17/+31
Use i915_gem_object_lock() to guard the LUT and active reference to allow us to break free of struct_mutex for handling GEM_CLOSE. Testcase: igt/gem_close_race Testcase: igt/gem_exec_parallel Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190606112320.9704-1-chris@chris-wilson.co.uk
2019-05-31drm/i915: Report all objects with allocated pages to the shrinkerChris Wilson1-4/+12
Currently, we try to report to the shrinker the precise number of objects (pages) that are available to be reaped at this moment. This requires searching all objects with allocated pages to see if they fulfill the search criteria, and this count is performed quite frequently. (The shrinker tries to free ~128 pages on each invocation, before which we count all the objects; counting takes longer than unbinding the objects!) If we take the pragmatic view that with sufficient desire, all objects are eventually reapable (they become inactive, or no longer used as framebuffer etc), we can simply return the count of pinned pages maintained during get_pages/put_pages rather than walk the lists every time. The downside is that we may (slightly) over-report the number of objects/pages we could shrink and so penalize ourselves by shrinking more than required. This is mitigated by keeping the order in which we shrink objects such that we avoid penalizing active and frequently used objects, and if memory is so tight that we need to free them we would need to anyway. v2: Only expose shrinkable objects to the shrinker; a small reduction in not considering stolen and foreign objects. v3: Restore the tracking from a "backup" copy from before the gem/ split Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-2-chris@chris-wilson.co.uk
2019-05-31drm/i915: Track the purgeable objects on a separate eviction listChris Wilson1-1/+2
Currently the purgeable objects, I915_MADV_DONTNEED, are mixed in the normal bound/unbound lists. Every shrinker pass starts with an attempt to purge from this set of unneeded objects, which entails us doing a walk over both lists looking for any candidates. If there are none, and since we are shrinking we can reasonably assume that the lists are full!, this becomes a very slow futile walk. If we separate out the purgeable objects into own list, this search then becomes its own phase that is preferentially handled during shrinking. Instead the cost becomes that we then need to filter the purgeable list if we want to distinguish between bound and unbound objects. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190530203500.26272-1-chris@chris-wilson.co.uk
2019-05-28drm/i915: Drop the deferred active referenceChris Wilson1-9/+6
An old optimisation to reduce the number of atomics per batch sadly relies on struct_mutex for coordination. In order to remove struct_mutex from serialising object/context closing, always taking and releasing an active reference on first use / last use greatly simplifies the locking. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk
2019-05-28drm/i915: Move GEM object domain management from struct_mutex to localChris Wilson1-4/+4
Use the per-object local lock to control the cache domain of the individual GEM objects, not struct_mutex. This is a huge leap forward for us in terms of object-level synchronisation; execbuffers are coordinated using the ww_mutex and pread/pwrite is finally fully serialised again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-10-chris@chris-wilson.co.uk
2019-05-28Merge tag 'drm-intel-next-2019-05-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-nextDave Airlie1-8/+5
Features: - Engine discovery query (Tvrtko) - Support for DP YCbCr4:2:0 outputs (Gwan-gyeong) - HDCP revocation support, refactoring (Ramalingam) - Remove DRM_AUTH from IOCTLs which also have DRM_RENDER_ALLOW (Christian König) - Asynchronous display power disabling (Imre) - Perma-pin uC firmware and re-enable global reset (Fernando) - GTT remapping for display, for bigger fb size and stride (Ville) - Enable pipe HDR mode on ICL if only HDR planes are used (Ville) - Kconfig to tweak the busyspin durations for i915_wait_request (Chris) - Allow multiple user handles to the same VM (Chris) - GT/GEM runtime pm improvements using wakerefs (Chris) - Gen 4&5 render context support (Chris) - Allow userspace to clone contexts on creation (Chris) - SINGLE_TIMELINE flags for context creation (Chris) - Allow specification of parallel execbuf (Chris) Refactoring: - Header refactoring (Jani) - Move GraphicsTechnology files under gt/ (Chris) - Sideband code refactoring (Chris) Fixes: - ICL DSI state readout and checker fixes (Vandita) - GLK DSI picture corruption fix (Stanislav) - HDMI deep color fixes (Clinton, Aditya) - Fix driver unbinding from a device in use (Janusz) - Fix clock gating with pipe scaling (Radhakrishna) - Disable broken FBC on GLK (Daniel Drake) - Miscellaneous GuC fixes (Michal) - Fix MG PHY DP register programming (Imre) - Add missing combo PHY lane power setup (Imre) - Workarounds for early ICL VBT issues (Imre) - Fix fastset vs. pfit on/off on HSW EDP transcoder (Ville) - Add readout and state check for pch_pfit.force_thru (Ville) - Miscellaneous display fixes and refactoring (Ville) - Display workaround fixes (Ville) - Enable audio even if ELD is bogus (Ville) - Fix use-after-free in reporting create.size (Chris) - Sideband fixes to avoid BYT hard lockups (Chris) - Workaround fixes and improvements (Chris) Maintainer shortcomings: - Failure to adequately describe and give credit for all changes (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87sgt3n45z.fsf@intel.com
2019-05-20drm/i915/selftests: Add live vma selftestVille Syrjälä1-8/+0
Add a live selftest to excercise rotated/remapped vmas. We simply write through the rotated/remapped vma, and confirm that the data appears in the right page when read through the normal vma. Not sure what the fallout of making all rotated/remapped vmas mappable/fenceable would be, hence I just hacked it in the test. v2: Grab rpm reference (Chris) GEM_BUG_ON(view.type not as expected) (Chris) Allow CAN_FENCE for rotated/remapped vmas (Chris) Update intel_plane_uses_fence() to ask for a fence only for normal vmas on gen4+ v3: Deal with intel_wakeref_t v4: Rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-4-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-20drm/i915: Add a new "remapped" gtt_viewVille Syrjälä1-1/+5
To overcome display engine stride limits we'll want to remap the pages in the GTT. To that end we need a new gtt_view type which is just like the "rotated" type except not rotated. v2: Use intel_remapped_plane_info base type s/unused/unused_mbz/ (Chris) Separate BUILD_BUG_ON()s (Chris) Use I915_GTT_PAGE_SIZE (Chris) v3: Use i915_gem_object_get_dma_address() (Chris) Trim the sg (Tvrtko) v4: Actually trim this time. Limit the max length to one row of pages to keep things simple Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190509122159.24376-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2019-05-08Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-6/+45
Pull drm updates from Dave Airlie: "This has two exciting community drivers for ARM Mali accelerators. Since ARM has never been open source friendly on the GPU side of the house, the community has had to create open source drivers for the Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx series. Well done to all involved and hopefully this will help ARM head in the right direction. There is also now the ability if you don't have any of the legacy drivers enabled (pre-KMS) to remove all the pre-KMS support code from the core drm, this saves 10% or so in codesize on my machine. i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo moves out of staging. There are also some rcar-du patches which crossover with media tree but all should be acked by Mauro. Summary: uapi changes: - Colorspace connector property - fourcc - new YUV formts - timeline sync objects initially merged - expose FB_DAMAGE_CLIPS to atomic userspace new drivers: - vboxvideo: moved out of staging - aspeed: ASPEED SoC BMC chip display support - lima: ARM Mali4xx GPU acceleration driver support - panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support core: - component helper docs - unplugging fixes - devm device init - MIPI/DSI rate control - shmem backed gem objects - connector, display_info, edid_quirks cleanups - dma_buf fence chain support - 64-bit dma-fence seqno comparison fixes - move initial fb config code to core - gem fence array helpers for Lima - ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size) - lease fixes ttm: - unified DRM_FILE_PAGE_OFFSET handling - Account for kernel allocations in kernel zone only panel: - OSD070T1718-19TS panel support - panel-tpo-td028ttec1 backlight support - Ronbo RB070D30 MIPI/DSI - Feiyang FY07024DI26A30-D MIPI-DSI panel - Rocktech jh057n00900 MIPI-DSI panel i915: - Comet Lake (Gen9) PCI IDs - Updated Icelake PCI IDs - Elkhartlake (Gen11) support - DP MST property addtions - plane and watermark fixes - Icelake port sync and VEBOX disable fixes - struct_mutex usage reduction - Icelake gamma fix - GuC reset fixes - make mmap more asynchronous - sound display power well race fixes - DDI/MIPI-DSI clocks for Icelake - Icelake RPS frequency changing support - Icelake workarounds amdgpu: - Use HMM for userptr - vega20 experimental smu11 support - RAS support for vega20 - BACO support for vega12 + fixes for vega20 - reworked IH interrupt handling - amdkfd RAS support - Freesync improvements - initial timeline sync object support - DC Z ordering fixes - NV12 planes support - colorspace properties for planes= - eDP opts if eDP already initialized nouveau: - misc fixes etnaviv: - misc fixes msm: - GPU zap shader support expansion - robustness ABI addition exynos: - Logging cleanups tegra: - Shared reset fix - CPU cache maintenance fix cirrus: - driver rewritten using simple helpers meson: - G12A support vmwgfx: - Resource dirtying management improvements - Userspace logging improvements virtio: - PRIME fixes rockchip: - rk3066 hdmi support sun4i: - DSI burst mode support vc4: - load tracker to detect underflow v3d: - v3d v4.2 support malidp: - initial Mali D71 support in komeda driver tfp410: - omap related improvement omapdrm: - drm bridge/panel support - drop some omap specific panels rcar-du: - Display writeback support" * tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits) drm/msm/a6xx: No zap shader is not an error drm/cma-helper: Fix drm_gem_cma_free_object() drm: Fix timestamp docs for variable refresh properties. drm/komeda: Mark the local functions as static drm/komeda: Fixed warning: Function parameter or member not described drm/komeda: Expose bus_width to Komeda-CORE drm/komeda: Add sysfs attribute: core_id and config_id drm: add non-desktop quirk for Valve HMDs drm/panfrost: Show stored feature registers drm/panfrost: Don't scream about deferred probe drm/panfrost: Disable PM on probe failure drm/panfrost: Set DMA masks earlier drm/panfrost: Add sanity checks to submit IOCTL drm/etnaviv: initialize idle mask before querying the HW db drm: introduce a capability flag for syncobj timeline support drm: report consistent errors when checking syncobj capibility drm/nouveau/nouveau: forward error generated while resuming objects tree drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully" drm/nouveau/i2c: Disable i2c bus access after ->fini() drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition ...
2019-04-29drm: Simplify stacktrace handlingThomas Gleixner1-7/+4
Replace the indirection through struct stack_trace by using the storage array based interfaces. The original code in all printing functions is really wrong. It allocates a storage array on stack which is unused because depot_fetch_stack() does not store anything in it. It overwrites the entries pointer in the stack_trace struct so it points to the depot storage. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Cc: Andy Lutomirski <luto@kernel.org> Cc: intel-gfx@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Potapenko <glider@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: linux-mm@kvack.org Cc: David Rientjes <rientjes@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: kasan-dev@googlegroups.com Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: iommu@lists.linux-foundation.org Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: linux-btrfs@vger.kernel.org Cc: dm-devel@redhat.com Cc: Mike Snitzer <snitzer@redhat.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Tom Zanussi <tom.zanussi@linux.intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-arch@vger.kernel.org Link: https://lkml.kernel.org/r/20190425094802.622094226@linutronix.de
2019-04-24drm/i915: Move GraphicsTechnology files under gt/Chris Wilson1-1/+2
Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
2019-03-06drm/i915: Use i915_global_register()Chris Wilson1-10/+18
Rather than manually add every new global into each hook, use i915_global_register() function and keep a list of registered globals to invoke instead. However, I haven't found a way for random drivers to add an .init table to avoid having to manually add ourselves to i915_globals_init() each time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190305213830.18094-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-02-28drm/i915: Make object/vma allocation caches globalChris Wilson1-6/+37
As our allocations are not device specific, we can move our slab caches to a global scope. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
2019-02-05drm/i915: Pull i915_gem_active into the i915_active familyChris Wilson1-6/+6
Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-02-05drm/i915: Generalise GPU activity trackingChris Wilson1-145/+28
We currently track GPU memory usage inside VMA, such that we never release memory used by the GPU until after it has finished accessing it. However, we may want to track other resources aside from VMA, or we may want to split a VMA into multiple independent regions and track each separately. For this purpose, generalise our request tracking (akin to struct reservation_object) so that we can embed it into other objects. v2: Tweak error handling during selftest setup. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-2-chris@chris-wilson.co.uk