aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2021-07-16drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"Jason Ekstrand1-214/+13
This reverts 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser"). The justification for this commit in the git history was a vague comment about getting it out from under the struct_mutex. While this may improve perf for some workloads on Gen7 platforms where we rely on the command parser for features such as indirect rendering, no numbers were provided to prove such an improvement. It claims to closed two gitlab/bugzilla issues but with no explanation whatsoever as to why or what bug it's fixing. Meanwhile, by moving command parsing off to an async callback, it leaves us with a problem of what to do on error. When things were synchronous, EXECBUFFER2 would fail with an error code if parsing failed. When moving it to async, we needed another way to handle that error and the solution employed was to set an error on the dma_fence and then trust that said error gets propagated to the client eventually. Moving back to synchronous will help us untangle the fence error propagation mess. This also reverts most of 0edbb9ba1bfe ("drm/i915: Move cmd parser pinning to execbuffer") which is a refactor of some of our allocation paths for asynchronous parsing. Now that everything is synchronous, we don't need it. v2 (Daniel Vetter): - Add stabel Cc and Fixes tag Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: <stable@vger.kernel.org> # v5.6+ Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-2-jason@jlekstrand.net
2021-07-08drm/i915/gem: Return an error ptr from context_lookupJason Ekstrand1-2/+2
We're about to start doing lazy context creation which means contexts get created in i915_gem_context_lookup and we may start having more errors than -ENOENT. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-23-jason@jlekstrand.net
2021-07-08drm/i915/request: Remove the hook from await_executionJason Ekstrand1-2/+1
This was only ever used for FENCE_SUBMIT automatic engine selection which was removed in the previous commit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-12-jason@jlekstrand.net
2021-07-08drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)Jason Ekstrand1-1/+1
Even though FENCE_SUBMIT is only documented to wait until the request in the in-fence starts instead of waiting until it completes, it has a bit more magic than that. If FENCE_SUBMIT is used to submit something to a balanced engine, we would wait to assign engines until the primary request was ready to start and then attempt to assign it to a different engine than the primary. There is an IGT test (the bonded-slice subtest of gem_exec_balancer) which exercises this by submitting a primary batch to a specific VCS and then using FENCE_SUBMIT to submit a secondary which can run on any VCS and have i915 figure out which VCS to run it on such that they can run in parallel. However, this functionality has never been used in the real world. The media driver (the only user of FENCE_SUBMIT) always picks exactly two physical engines to bond and never asks us to pick which to use. v2 (Daniel Vetter): - Mention the exact IGT test this breaks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-11-jason@jlekstrand.net
2021-07-08drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)Jason Ekstrand1-0/+16
This API is entirely unnecessary and I'd love to get rid of it. If userspace wants a single timeline across multiple contexts, they can either use implicit synchronization or a syncobj, both of which existed at the time this feature landed. The justification given at the time was that it would help GL drivers which are inherently single-timeline. However, neither of our GL drivers actually wanted the feature. i965 was already in maintenance mode at the time and iris uses syncobj for everything. Unfortunately, as much as I'd love to get rid of it, it is used by the media driver so we can't do that. We can, however, do the next-best thing which is to embed a syncobj in the context and do exactly what we'd expect from userspace internally. This isn't an entirely identical implementation because it's no longer atomic if userspace races with itself by calling execbuffer2 twice simultaneously from different threads. It won't crash in that case; it just doesn't guarantee any ordering between those two submits. It also means that sync files exported from different engines on a SINGLE_TIMELINE context will have different fence contexts. This is visible to userspace if it looks at the obj_name field of sync_fence_info. Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical advantages beyond mere annoyance. One is that intel_timeline is no longer an api-visible object and can remain entirely an implementation detail. This may be advantageous as we make scheduler changes going forward. Second is that, together with deleting the CLONE_CONTEXT API, we should now have a 1:1 mapping between intel_context and intel_timeline which may help us reduce locking. v2 (Tvrtko Ursulin): - Update the comment on i915_gem_context::syncobj to mention that it's an emulation and the possible race if userspace calls execbuffer2 twice on the same context concurrently. v2 (Jason Ekstrand): - Wrap the checks for eb.gem_context->syncobj in unlikely() - Drop the dma_fence reference - Improved commit message v3 (Jason Ekstrand): - Move the dma_fence_put() to before the error exit v4 (Tvrtko Ursulin): - Add a comment about fence contexts to the commit message Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-8-jason@jlekstrand.net
2021-07-08drm/i915: Drop I915_CONTEXT_PARAM_NO_ZEROMAPJason Ekstrand1-8/+0
The idea behind this param is to support OpenCL drivers with relocations because OpenCL reserves 0x0 for NULL and, if we placed memory there, it would confuse CL kernels. It was originally sent out as part of a patch series including libdrm [1] and Beignet [2] support. However, the libdrm and Beignet patches never landed in their respective upstream projects so this API has never been used. It's never been used in Mesa or any other driver, either. Dropping this API allows us to delete a small bit of code. [1]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067030.html [2]: https://lists.freedesktop.org/archives/intel-gfx/2015-May/067031.html Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-4-jason@jlekstrand.net
2021-06-21drm/i915/eb: Fix pagefault disabling in the first slowpathDaniel Vetter1-2/+0
In commit ebc0808fa2da0548a78e715858024cb81cd732bc Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Oct 18 13:02:51 2016 +0100 drm/i915: Restrict pagefault disabling to just around copy_from_user() we entirely missed that there's a slow path call to eb_relocate_entry (or i915_gem_execbuffer_relocate_entry as it was called back then) which was left fully wrapped by pagefault_disable/enable() calls. Previously any issues with blocking calls where handled by the following code: /* we can't wait for rendering with pagefaults disabled */ if (pagefault_disabled() && !object_is_idle(obj)) return -EFAULT; Now at this point the prefaulting was still around, which means in normal applications it was very hard to hit this bug. No idea why the regressions in igts weren't caught. Now this all changed big time with 2 patches merged closely together. First commit 2889caa9232109afc8881f29a2205abeb5709d0c Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jun 16 15:05:19 2017 +0100 drm/i915: Eliminate lots of iterations over the execobjects array removes the prefaulting from the first relocation path, pushing it into the first slowpath (of which this patch added a total of 3 escalation levels). This would have really quickly uncovered the above bug, were it not for immediate adding a duct-tape on top with commit 7dd4f6729f9243bd7046c6f04c107a456bda38eb Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jun 16 15:05:24 2017 +0100 drm/i915: Async GPU relocation processing by pushing all all the relocation patching to the gpu if the buffer was busy, which avoided all the possible blocking calls. The entire slowpath was then furthermore ditched in commit 7dc8f1143778a35b190f9413f228b3cf28f67f8d Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Mar 11 16:03:10 2020 +0000 drm/i915/gem: Drop relocation slowpath and resurrected in commit fd1500fcd4420eee06e2c7f3aa6067b78ac05871 Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Wed Aug 19 16:08:43 2020 +0200 Revert "drm/i915/gem: Drop relocation slowpath". but this did not further impact what's going on. Since pagefault_disable/enable is an atomic section, any sleeping in there is prohibited, and we definitely do that without gpu relocations since we have to wait for the gpu usage to finish before we can patch up the relocations. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210618214503.1773805-1-daniel.vetter@ffwll.ch
2021-06-17drm/i915: Perform execbuffer object locking as a separate stepThomas Hellström1-4/+21
To help avoid evicting already resident buffers from the batch we're processing, perform locking as a separate step. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210615113600.30660-1-thomas.hellstrom@linux.intel.com
2021-06-14dma-buf: add dma_fence_chain_alloc/free v3Christian König1-4/+2
Add a common allocation helper. Cleaning up the mix of kzalloc/kmalloc and some unused code in the selftest. v2: polish kernel doc a bit v3: polish kernel doc even a bit more Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210611120301.10595-3-christian.koenig@amd.com
2021-06-14drm/i915: Simplify userptr lockingThomas Hellström1-13/+8
Use an rwlock instead of spinlock for the global notifier lock to reduce risk of contention in execbuf. Protect object state with the object lock whenever possible rather than with the global notifier lock Don't take an explicit page_ref in userptr_submit_init() but rather call get_pages() after obtaining the page list so that get_pages() holds the page_ref. This means we don't need to call userptr_submit_fini(), which is needed to avoid awkward locking in our upcoming VM_BIND code. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210610143525.624677-1-thomas.hellstrom@linux.intel.com
2021-06-11Merge tag 'drm-intel-gt-next-2021-06-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-nextDave Airlie1-5/+5
UAPI Changes: - Disable mmap ioctl for gen12+ (excl. TGL-LP) - Start enabling HuC loading by default for upcoming Gen12+ platforms (excludes TGL and RKL) Core Changes: - Backmerge of drm-next Driver Changes: - Revert "i915: use io_mapping_map_user" (Eero, Matt A) - Initialize the TTM device and memory managers (Thomas) - Major rework to the GuC submission backend to prepare for enabling on new platforms (Michal Wa., Daniele, Matt B, Rodrigo) - Fix i915_sg_page_sizes to record dma segments rather than physical pages (Thomas) - Locking rework to prep for TTM conversion (Thomas) - Replace IS_GEN and friends with GRAPHICS_VER (Lucas) - Use DEVICE_ATTR_RO macro (Yue) - Static code checker fixes (Zhihao) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YMHeDxg9VLiFtyn3@jlahtine-mobl.ger.corp.intel.com
2021-06-06dma-buf: drop the _rcu postfix on function names v3Christian König1-1/+1
The functions can be called both in _rcu context as well as while holding the lock. v2: add some kerneldoc as suggested by Daniel v3: fix indentation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com
2021-06-05drm/i915/gem: replace IS_GEN and friends with GRAPHICS_VERLucas De Marchi1-5/+5
This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-4-lucas.demarchi@intel.com
2021-04-14drm/i915: eliminate remaining uses of intel_device_info->genLucas De Marchi1-11/+11
Replace gen with the new graphics_ver value and use GRAPHICS_VER() in those places. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210413051002.92589-10-lucas.demarchi@intel.com
2021-03-24drm/i915: Defer pin calls in buffer pool until first use by caller.Maarten Lankhorst1-0/+2
We need to take the obj lock to pin pages, so wait until the callers have done so, before making the object unshrinkable. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-30-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.Maarten Lankhorst1-13/+88
Instead of doing what we do currently, which will never work with PROVE_LOCKING, do the same as AMD does, and something similar to relocation slowpath. When all locks are dropped, we acquire the pages for pinning. When the locks are taken, we transfer those pages in .get_pages() to the bo. As a final check before installing the fences, we ensure that the mmu notifier was not called; if it is, we return -EAGAIN to userspace to signal it has to start over. Changes since v1: - Unbinding is done in submit_init only. submit_begin() removed. - MMU_NOTFIER -> MMU_NOTIFIER Changes since v2: - Make i915->mm.notifier a spinlock. Changes since v3: - Add WARN_ON if there are any page references left, should have been 0. - Return 0 on success in submit_init(), bug from spinlock conversion. - Release pvec outside of notifier_lock (Thomas). Changes since v4: - Mention why we're clearing eb->[i + 1].vma in the code. (Thomas) - Actually check all invalidations in eb_move_to_gpu. (Thomas) - Do not wait when process is exiting to fix gem_ctx_persistence.userptr. Changes since v5: - Clarify why check on PF_EXITING is (temporarily) required. Changes since v6: - Ensure userptr validity is checked in set_domain through a special path. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Dave Airlie <airlied@redhat.com> [danvet: s/kfree/kvfree/ in i915_gem_object_userptr_drop_ref in the previous review round, but which got lost. The other open questions around page refcount are imo better discussed in a separate series, with amdgpu folks involved]. Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-17-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: Make compilation of userptr code depend on MMU_NOTIFIER.Maarten Lankhorst1-0/+2
Now that unsynchronized mappings are removed, the only time userptr works is when the MMU notifier is enabled. Put all of the userptr code behind a mmu notifier ifdef. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-16-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: make lockdep slightly happier about execbuf.Maarten Lankhorst1-6/+18
As soon as we install fences, we should stop allocating memory in order to prevent any potential deadlocks. This is required later on, when we start adding support for dma-fence annotations. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-11-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: Add missing -EDEADLK handling to execbuf pinning, v2.Maarten Lankhorst1-11/+24
i915_vma_pin may fail with -EDEADLK when we start locking page tables, so ensure we handle this correctly. Changes since v1: - Drop -EDEADLK todo, this commit handles it. - Change eb_pin_vma from sort-of-bool + -EDEADLK to a proper int. (Matt) Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-5-maarten.lankhorst@linux.intel.com
2021-03-24drm/i915: Move cmd parser pinning to execbufferMaarten Lankhorst1-6/+68
We need to get rid of allocations in the cmd parser, because it needs to be called from a signaling context, first move all pinning to execbuf, where we already hold all locks. Allocate jump_whitelist in the execbuffer, and add annotations around intel_engine_cmd_parser(), to ensure we only call the command parser without allocating any memory, or taking any locks we're not supposed to. Because i915_gem_object_get_page() may also allocate memory, add a path to i915_gem_object_get_sg() that prevents memory allocations, and walk the sg list manually. It should be similarly fast. This has the added benefit of being able to catch all memory allocation errors before the point of no return, and return -ENOMEM safely to the execbuf submitter. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-4-maarten.lankhorst@linux.intel.com
2021-03-18drm/i915/gem: Drop relocation support on all new hardware (v6)Jason Ekstrand1-0/+7
The Vulkan driver in Mesa for Intel hardware never uses relocations if it's running on a version of i915 that supports at least softpin which all versions of i915 supporting Gen12 do. On the OpenGL side, Gen12+ is only supported by iris which never uses relocations. The older i965 driver in Mesa does use relocations but it only supports Intel hardware through Gen11 and has been deprecated for all hardware Gen9+. The compute driver also never uses relocations. This only leaves the media driver which is supposed to be switching to softpin going forward. Making softpin a requirement for all future hardware seems reasonable. There is one piece of hardware enabled by default in i915: RKL which was enabled by e22fa6f0a976 which has not yet landed in drm-next so this almost but not really a userspace API change for RKL. If it becomes a problem, we can always add !IS_ROCKETLAKE(eb->i915) to the condition. Rejecting relocations starting with newer Gen12 platforms has the benefit that we don't have to bother supporting it on platforms with local memory. Given how much CPU touching of memory is required for relocations, not having to do so on platforms where not all memory is directly CPU-accessible carries significant advantages. v2 (Jason Ekstrand): - Allow TGL-LP platforms as they've already shipped v3 (Jason Ekstrand): - WARN_ON platforms with LMEM support in case the check is wrong v4 (Jason Ekstrand): - Call out Rocket Lake in the commit message v5 (Jason Ekstrand): - Drop the HAS_LMEM check as it's already covered by the version check v6 (Jason Ekstrand): - Move the check to eb_validate_vma() with all the other exec_object validation checks. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-3-jason@jlekstrand.net
2021-03-18drm/i915/gem: Drop legacy execbuffer support (v2)Jason Ekstrand1-100/+0
libdrm has supported the newer execbuffer2 ioctl and using it by default when it exists since libdrm commit b50964027bef which landed Mar 2, 2010. The i915 and i965 drivers in Mesa at the time both used libdrm and so did the Intel X11 back-end. The SNA back-end for X11 has always used execbuffer2. v2 (Jason Ekstrand): - Add a comment saying what Linux version it's being removed in. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Keith Packard <keithp@keithp.com> Acked-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210317234014.2271006-2-jason@jlekstrand.net
2021-01-19drm/i915/pool: constrain pool objects by mapping typeMatthew Auld1-6/+7
In a few places we always end up mapping the pool object with the FORCE constraint(to prevent hitting -EBUSY) which will destroy the cached mapping if it has a different type. As a simple first step, make the mapping type part of the pool interface, where the behaviour is to only give out pool objects which match the requested mapping type. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com
2021-01-08drm/i915/gt: Disable arbitration on no-preempt requestsChris Wilson1-3/+3
If a request is submitted and known to require no preemption, disable arbitration around the batch which prevents the HW from handling a preemption request during the payload. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-7-chris@chris-wilson.co.uk
2021-01-03drm/i915: fix shift warningArnd Bergmann1-1/+1
Randconfig builds on 32-bit machines show lots of warnings for the i915 driver for passing a 32-bit value into __const_hweight64(): drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2584:9: error: shift count >= width of type [-Werror,-Wshift-count-overflow] return hweight64(VDBOX_MASK(&i915->gt)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64' #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w)) Change it to hweight_long() to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210103135158.3591442-1-arnd@kernel.org
2020-12-24drm/i915: clear the gpu reloc batchMatthew Auld1-1/+3
The reloc batch is short lived but can exist in the user visible ppGTT, and since it's backed by an internal object, which lacks page clearing, we should take care to clear it upfront. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com Cc: stable@vger.kernel.org
2020-12-16drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.hChris Wilson1-0/+1
Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control and friends to gen8_engine_cs.h Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
2020-12-16drm/i915: Fix mismatch between misplaced vma check and vma insertChris Wilson1-1/+1
When inserting a VMA, we restrict the placement to the low 4G unless the caller opts into using the full range. This was done to allow usersapce the opportunity to transition slowly from a 32b address space, and to avoid breaking inherent 32b assumptions of some commands. However, for insert we limited ourselves to 4G-4K, but on verification we allowed the full 4G. This causes some attempts to bind a new buffer to sporadically fail with -ENOSPC, but at other times be bound successfully. commit 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page") suggests that there is a genuine problem with stateless addressing that cannot utilize the last page in 4G and so we purposefully excluded it. This means that the quick pin pass may cause us to utilize a buggy placement. Reported-by: CQ Tang <cq.tang@intel.com> Testcase: igt/gem_exec_params/larger-than-life-batch Fixes: 48ea1e32c39d ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: CQ Tang <cq.tang@intel.com> Reviewed-by: CQ Tang <cq.tang@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v4.5+ Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk
2020-12-08drm/i915/gem: Drop false !i915_vma_is_closed assertionChris Wilson1-2/+0
Closed vma are protected by the GT wakeref held as we lookup the vma, so we know that the vma will not be freed as we process it for the execbuf. Instead we expect to catch the closed status of the context, and simply allow the close-race on an individual vma to be washed away. Longer term, the GT wakeref protection will be removed by explicit vma.kref tracking. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk
2020-12-04drm/i915/gem: Propagate error from cancelled submit due to context closureChris Wilson1-2/+5
In the course of discovering and closing many races with context closure and execbuf submission, since commit 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup") we started checking that the context was not closed by another userspace thread during the execbuf ioctl. In doing so we cancelled the inflight request (by telling it to be skipped), but kept reporting success since we do submit a request, albeit one that doesn't execute. As the error is known before we return from the ioctl, we can report the error we detect immediately, rather than leave it on the fence status. With the immediate propagation of the error, it is easier for userspace to handle. Fixes: 61231f6bd056 ("drm/i915/gem: Check that the context wasn't closed during setup") Testcase: igt/gem_ctx_exec/basic-close-race Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk
2020-10-19drm/i915/gem: Support parsing of oversize batchesChris Wilson1-2/+8
Matthew Auld noted that on more recent systems (such as the parser for gen9) we may have objects that are larger than expected by the GEM uAPI (i.e. greater than u32). These objects would have incorrect implicit batch lengths, causing the parser to reject them for being incomplete, or worse. Based on a patch by Matthew Auld. Reported-by: Matthew Auld <matthew.auld@intel.com> Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches") Testcase: igt/gem_exec_params/larger-than-life-batch Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@chris-wilson.co.uk (cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30drm/i915: Avoid mixing integer types during batch copiesChris Wilson1-2/+5
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-10drm/i915: Fix slightly botched merge in __reloc_entry_gpuMaarten Lankhorst1-2/+2
This function should be an int, not a bool. Presumably because we had the same 2 reverts in a slightly different way, git got confused. Thanks to Dan for reporting. :) The conflict is between the 3 reverts in drm-fixes: 4993a8a37808 ("Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"") ad5d95e4d538 ("Revert "drm/i915/gem: Async GPU relocations only"") 20561da3a2e1 ("Revert "drm/i915/gem: Delete unused code"") And the slightly different combined revert in drm-intel-gt-next, but with the same goal: 102a0a9051f4 ("Revert "drm/i915/gem: Async GPU relocations only"") In the merge commit 1f4b2aca794f ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") things went wrong, but the merge commit view now doesn't show any conflict anymore (as git tends to do when the resolution picks one or the other branch). The need to handle other than just true/false error codes in __reloc_entry_gpu was added in the dma_resv locking changes in c43ce12328df ("drm/i915: Use per object locking in execbuf, v12.") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Dave Airlie <airlied@redhat.com> [danvet: Explain this entire saga a lot better, adding tons of commit references. Also note that this was merged before full intel-gfx-CI results, only after BAT, since the breakage at the BAT run is already severe enough to block all pre-merge testing.] Fixes: 1f4b2aca794f ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200910111225.2184193-1-maarten.lankhorst@linux.intel.com
2020-09-09Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-nextDave Airlie1-482/+767
(Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added) UAPI Changes: (- Potential implicit changes from WW locking refactoring) Cross-subsystem Changes: (- WW locking changes should align the i915 locking more with others) Driver Changes: - MAJOR: Apply WW locking across the driver (Maarten) - Reverts for 5 commits to make applying WW locking faster (Maarten) - Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris) - Add missing dma_fence_put() for error case of syncobj timeline (Chris) - Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten) - Pin engine before pinning all objects (Maarten) - Rework intel_context pinning to do everything outside of pin_mutex (Maarten) - Avoid tracking GEM context until registered (Cc: stable, Chris) - Provide a fastpath for waiting on vma bindings (Chris) - Fixes to preempt-to-busy mechanism (Chris) - Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris) - Switch to object allocations for page directories (Chris) - Hold context/request reference while breadcrumbs are active (Chris) - Make sure execbuffer always passes ww state to i915_vma_pin (Maarten) - Code refactoring to facilitate use of WW locking (Maarten) - Locking refactoring to use more granular locking (Maarten, Chris) - Support for multiple pinned timelines per engine (Chris) - Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris) - Make active tracking/vma page-directory stash work preallocated (Chris) - Avoid flushing submission tasklet too often (Chris) - Reduce context termination list iteration guard to RCU (Chris) - Reductions to locking contention (Chris) - Fixes for issues found by CI (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <jlahtine@jlahtine-mobl.ger.corp.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200907130039.GA27766@jlahtine-mobl.ger.corp.intel.com
2020-09-09Backmerge drm-fixes merge into drm-nextDave Airlie1-21/+293
Commit '6f6a73c8b715d595977774d48450a734297ab21f' from Linus' tree The fixes reverts cause a bit of a conflict pain with intel next, start fixing it up here. Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08Revert "drm/i915/gem: Delete unused code"Dave Airlie1-0/+19
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 7ac2d2536dfa71c275a74813345779b1e7522c91. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08Revert "drm/i915/gem: Async GPU relocations only"Dave Airlie1-21/+274
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 9e0f9464e2ab36b864359a59b0e9058fdef0ce47. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-07drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.Maarten Lankhorst1-64/+76
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Pin engine before pinning all objects, v5.Maarten Lankhorst1-65/+149
We want to lock all gem objects, including the engine context objects, rework the throttling to ensure that we can do this. Now we only throttle once, but can take eb_pin_engine while acquiring objects. This means we will have to drop the lock to wait. If we don't have to throttle we can still take the fastpath, if not we will take the slowpath and wait for the throttle request while unlocked. The engine has to be pinned as first step, otherwise gpu relocations won't work. Changes since v1: - Only need to get a throttled request in the fastpath, no need for a global flag any more. - Always free the waited request correctly. Changes since v2: - Use intel_engine_pm_get()/put() to keeep engine pool alive during EDEADLK handling. Changes since v3: - Fix small rq leak. Changes since v4: - Use a single reloc_context, for intel_context_pin_ww(). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-13-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Nuke arguments to eb_pin_engineMaarten Lankhorst1-10/+7
Those arguments are already set as eb.file and eb.args, so kill off the extra arguments. This will allow us to move eb_pin_engine() to after we reserved all BO's. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-12-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Use per object locking in execbuf, v12.Maarten Lankhorst1-146/+217
Now that we changed execbuf submission slightly to allow us to do all pinning in one place, we can now simply add ww versions on top of struct_mutex. All we have to do is a separate path for -EDEADLK handling, which needs to unpin all gem bo's before dropping the lock, then starting over. This finally allows us to do parallel submission, but because not all of the pinning code uses the ww ctx yet, we cannot completely drop struct_mutex yet. Changes since v1: - Keep struct_mutex for now. :( Changes since v2: - Make sure we always lock the ww context in slowpath. Changes since v3: - Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be done on normal unlock path. - Unconditionally release vmas and context. Changes since v4: - Rebased on top of struct_mutex reduction. Changes since v5: - Remove training wheels. Changes since v6: - Fix accidentally broken -ENOSPC handling. Changes since v7: - Handle gt buffer pool better. Changes since v8: - Properly clear variables, to make -EDEADLK handling not BUG. Change since v9: - Fix unpinning fence on pnv and below. Changes since v10: - Make relocation gpu chaining working again. Changes since v11: - Remove relocation chaining, pain to make it work. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-9-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Parse command buffer earlier in eb_relocate(slow)Maarten Lankhorst1-31/+37
We want to introduce backoff logic, but we need to lock the pool object as well for command parsing. Because of this, we will need backoff logic for the engine pool obj, move the batch validation up slightly to eb_lookup_vmas, and the actual command parsing in a separate function which can get called from execbuf relocation fast and slowpath. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-8-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Remove locking from i915_gem_object_prepare_read/writeMaarten Lankhorst1-2/+11
Execbuffer submission will perform its own WW locking, and we cannot rely on the implicit lock there. This also makes it clear that the GVT code will get a lockdep splat when multiple batchbuffer shadows need to be performed in the same instance, fix that up. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-7-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.Maarten Lankhorst1-1/+1
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory eviction. We don't use it yet, but lets start adding the definition first. To use it, we have to pass a non-NULL ww to gem_object_lock, and don't unlock directly. It is done in i915_gem_ww_ctx_fini. Changes since v1: - Change ww_ctx and obj order in locking functions (Jonas Lahtinen) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07Revert "drm/i915/gem: Split eb_vma into its own allocation"Maarten Lankhorst1-73/+51
This reverts commit 0f1dd02295f3 ("drm/i915/gem: Split eb_vma into its own allocation") and also moves all unreserving to a single place at the end, which is a minor simplification. With the WW locking, we will drop all references only at the end when unlocking, so refcounting can now be removed. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-5-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07Revert "drm/i915/gem: Drop relocation slowpath".Maarten Lankhorst1-10/+252
This reverts commit 7dc8f1143778 ("drm/i915/gem: Drop relocation slowpath"). We need the slowpath relocation for taking ww-mutex inside the page fault handler, and we will take this mutex when pinning all objects. We also functionally revert ef398881d27d ("drm/i915/gem: Limit struct_mutex to eb_reserve"), as we need the struct_mutex in the slowpath as well, and a tiny part of 003d8b9143a6 ("drm/i915/gem: Only call eb_lookup_vma once during execbuf ioctl"). Specifically, we make the -EAGAIN handling part of fallback to slowpath again. With this, we have a proper working slowpath again, which will allow us to do fault handling with WW locks held. [mlankhorst: Adjusted for reloc_gpu_flush() changes] Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> [mlankhorst: Removed extra reloc_gpu_flush()] Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-4-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Revert relocation chaining commits.Maarten Lankhorst1-139/+32
This reverts commit 964a9b0f611ee ("drm/i915/gem: Use chained reloc batches") and commit 0e97fbb080553 ("drm/i915/gem: Use a single chained reloc batches for a single execbuf"). When adding ww locking to execbuf, it's hard enough to deal with a single BO that is part of relocation execution. Chaining is hard to get right, and with GPU relocation deprecated, it's best to drop this altogether, instead of trying to fix something we will remove. This is not a completely 1:1 revert, we reset rq_size to 0 in reloc_cache_init, this was from e3d291301f99 ("drm/i915/gem: Implement legacy MI_STORE_DATA_IMM"), because we don't want to break the selftests. (Daniel) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-3-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07Revert "drm/i915/gem: Async GPU relocations only"Maarten Lankhorst1-21/+298
This reverts commit 9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only"), and related commit 7ac2d2536dfa7 ("drm/i915/gem: Delete unused code"). Async GPU relocations are not the path forward, we want to remove GPU accelerated relocation support eventually when userspace is fixed to use VM_BIND, and this is the first step towards that. We will keep async gpu relocations around for now, until userspace is fixed. Relocation support will be disabled completely on platforms where there was never any userspace that depends on it, as the hardware doesn't require it from at least gen9+ onward. For older platforms, the plan is to use cpu relocations only. The igt side is fixed in igt commit 39e9aa1032a4e ("tests/i915: Remove subtests that rely on async relocation behavior"). Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-2-maarten.lankhorst@linux.intel.com Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915/gem: Free the fence after a fence-chain lookup failureChris Wilson1-0/+1
If dma_fence_chain_find_seqno() reports an error, it does so in its preamble before it disposes of the input fence. On handling the error, we need to drop the reference to the fence. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 13149e8bafc4 ("drm/i915: add syncobj timeline support") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200806161056.17593-1-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-07drm/i915: Export a preallocate variant of i915_active_acquire()Chris Wilson1-1/+1
Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lockfree lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>