aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_gem_object.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-05-28drm/i915: Move object->pages API to i915_gem_object.[ch]Chris Wilson1-220/+0
Currently the code for manipulating the pages on an object is still residing in i915_gem.c, move it to i915_gem_object.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-3-chris@chris-wilson.co.uk
2019-05-28drm/i915: Split GEM object type definition to its own headerChris Wilson1-292/+3
For convenience in avoiding inline spaghetti, keep the type definition as a separate header. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-1-chris@chris-wilson.co.uk
2019-03-31drm/i915: Check domains for userptr on releaseChris Wilson1-0/+4
When we return pages to the system, we release control over them and should defensively return them to the CPU write domain so that we catch any external writes on reacquiring them (e.g. to transparently swapout/swapin). While we did this defensive clflushing for ordinary shmem pages, it was forgotten for userptr. Fortunately, userptr objects are normally cache coherent and so oblivious to the forgotten domain tracking. References: a679f58d0510 ("drm/i915: Flush pages on acquisition") References: 754a25442705 ("drm/i915: Skip object locking around a no-op set-domain ioctl") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190331094620.15185-1-chris@chris-wilson.co.uk
2019-03-06drm/i915: Use i915_global_register()Chris Wilson1-4/+0
Rather than manually add every new global into each hook, use i915_global_register() function and keep a list of registered globals to invoke instead. However, I haven't found a way for random drivers to add an .init table to avoid having to manually add ourselves to i915_globals_init() each time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190305213830.18094-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-02-28drm/i915: Make object/vma allocation caches globalChris Wilson1-1/+7
As our allocations are not device specific, we can move our slab caches to a global scope. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190228102035.5857-2-chris@chris-wilson.co.uk
2019-02-05drm/i915: Pull i915_gem_active into the i915_active familyChris Wilson1-1/+1
Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-01-28drm/i915: Move vma lookup to its own lockChris Wilson1-18/+27
Remove the struct_mutex requirement for looking up the vma for an object. v2: Highlight how the race for duplicate vma creation is resolved on reacquiring the lock with a short comment. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-3-chris@chris-wilson.co.uk
2019-01-15drm/i915/userptr: Avoid struct_mutex recursion for mmu_invalidate_range_startChris Wilson1-0/+7
Since commit 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") we have been able to report failure from mmu_invalidate_range_start which allows us to use a trylock on the struct_mutex to avoid potential recursion and report -EBUSY instead. Furthermore, this allows us to pull the work into the main callback and avoid the sleight-of-hand in using a workqueue to avoid lockdep. However, not all paths to mmu_invalidate_range_start are prepared to handle failure, so instead of reporting the recursion, deal with it by propagating the failure upwards, who can decide themselves to handle it or report it. v2: Mark up the recursive lock behaviour and comment on the various weak points. v3: Follow commit 3824e41975ae ("drm/i915: Use mutex_lock_killable() from inside the shrinker") and also use mutex_lock_killable(). v3.1: No leak on EINTR. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108375 References: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190115124442.3500-1-chris@chris-wilson.co.uk
2019-01-09drm/i915: drop all drmP.h includesJani Nikula1-1/+2
Needs just a few additional includes here and there. Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190108082709.3748-1-jani.nikula@intel.com
2018-07-26drm/i915: Mark up object tiling-and-stride getters as constChris Wilson1-5/+5
For that little bit of defense against a tired programmer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180725155447.11909-1-chris@chris-wilson.co.uk
2018-07-13drm/i915/userptr: Enable read-only support on gen8+Chris Wilson1-1/+0
On gen8 and onwards, we can mark GPU accesses through the ppGTT as being read-only, that is cause any GPU write onto that page to be discarded (not triggering a fault). This is all that we need to finally support the read-only flag for userptr! v2: Check default address space for read only support as a proxy for the user context/ppgtt. Testcase: igt/gem_userptr_blits/readonly* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180712191430.9269-1-chris@chris-wilson.co.uk
2018-07-13drm/i915: Prevent writing into a read-only object via a GGTT mmapChris Wilson1-1/+12
If the user has created a read-only object, they should not be allowed to circumvent the write protection by using a GGTT mmapping. Deny it. Also most machines do not support read-only GGTT PTEs, so again we have to reject attempted writes. Fortunately, this is known a priori, so we can at least reject in the call to create the mmap (with a sanity check in the fault handler). v2: Check the vma->vm_flags during mmap() to allow readonly access. v3: Remove VM_MAYWRITE to curtail mprotect() Testcase: igt/gem_userptr_blits/readonly_mmap* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> #v1 Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-4-chris@chris-wilson.co.uk
2018-06-28drm/i915: Replace drm_gem_object_unreference_unlocked with put functionThomas Zimmermann1-3/+0
This patch unifies the naming of DRM functions for reference counting of struct drm_gem_object. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180618110154.30462-5-tdz@users.sourceforge.net
2018-06-28drm/i915: Replace __drm_gem_object_unreference with __drm_gem_object_putThomas Zimmermann1-1/+1
This patch unifies the naming of DRM functions for reference counting of struct drm_gem_object. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180618110154.30462-4-tdz@users.sourceforge.net
2018-06-28drm/i915: Replace drm_gem_object_{un/reference} with {put,get} functionsThomas Zimmermann1-7/+1
This patch unifies the naming of DRM functions for reference counting of struct drm_gem_object. The resulting code is more aligned with the rest of the Linux kernel interfaces. Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180618110154.30462-3-tdz@users.sourceforge.net
2018-02-21drm/i915: Rename drm_i915_gem_request to i915_requestChris Wilson1-1/+1
We want to de-emphasize the link between the request (dependency, execution and fence tracking) from GEM and so rename the struct from drm_i915_gem_request to i915_request. That is we may implement the GEM user interface on top of requests, but they are an abstraction for tracking execution rather than an implementation detail of GEM. (Since they are not tied to HW, we keep the i915 prefix as opposed to intel.) In short, the spatch: @@ @@ - struct drm_i915_gem_request + struct i915_request A corollary to contracting the type name, we also harmonise on using 'rq' shorthand for local variables where space if of the essence and repetition makes 'request' unwieldy. For globals and struct members, 'request' is still much preferred for its clarity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-02-16drm: move read_domains and write_domain into i915Christian König1-0/+15
i915 is the only driver using those fields in the drm_gem_object structure, so they only waste memory for all other drivers. Move the fields into drm_i915_gem_object instead and patch the i915 code with the following sed commands: sed -i "s/obj->base.read_domains/obj->read_domains/g" drivers/gpu/drm/i915/*.c drivers/gpu/drm/i915/*/*.c sed -i "s/obj->base.write_domain/obj->write_domain/g" drivers/gpu/drm/i915/*.c drivers/gpu/drm/i915/*/*.c Change is only compile tested. v2: move fields around as suggested by Chris. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180216124338.9087-1-christian.koenig@amd.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-12-04drm/i915/gvt: Dmabuf support for GVT-gTina Zhang1-0/+2
This patch introduces a guest's framebuffer sharing mechanism based on dma-buf subsystem. With this sharing mechanism, guest's framebuffer can be shared between guest VM and host. v17: - modify VFIO_DEVICE_GET_GFX_DMABUF interface. (Alex) v16: - add x_hot and y_hot. (Gerd) - add flag validation for VFIO_DEVICE_GET_GFX_DMABUF. (Alex) - rebase 4.14.0-rc6. v15: - add VFIO_DEVICE_GET_GFX_DMABUF ABI. (Gerd) - add intel_vgpu_dmabuf_cleanup() to clean up the vGPU's dmabuf. (Gerd) v14: - add PROBE, DMABUF and REGION flags. (Alex) v12: - refine the lifecycle of dmabuf. v9: - remove dma-buf management. (Alex) - track the dma-buf create and release in kernel mode. (Gerd) (Daniel) v8: - refine the dma-buf ioctl definition.(Alex) - add a lock to protect the dmabuf list. (Alex) v7: - release dma-buf related allocations in dma-buf's associated release function. (Alex) - refine ioctl interface for querying plane info or create dma-buf. (Alex) v6: - align the dma-buf life cycle with the vfio device. (Alex) - add the dma-buf related operations in a separate patch. (Gerd) - i915 related changes. (Chris) v5: - fix bug while checking whether the gem obj is gvt's dma-buf when user change caching mode or domains. Add a helper function to do it. (Xiaoguang) - add definition for the query plane and create dma-buf. (Xiaoguang) v4: - fix bug while checking whether the gem obj is gvt's dma-buf when set caching mode or doamins. (Xiaoguang) v3: - declare a new flag I915_GEM_OBJECT_IS_GVT_DMABUF in drm_i915_gem_object to represent the gem obj for gvt's dma-buf. The tiling mode, caching mode and domains can not be changed for this kind of gem object. (Alex) - change dma-buf related information to be more generic. So other vendor can use the same interface. (Alex) v2: - create a management fd for dma-buf operations. (Alex) - alloc gem object's backing storage in gem obj's get_pages() callback. (Chris) Signed-off-by: Tina Zhang <tina.zhang@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-11-14drm/i915: Introduce GEM proxyTina Zhang1-2/+9
GEM proxy is a kind of GEM, whose backing physical memory is pinned and produced by guest VM and is used by host as read only. With GEM proxy, host is able to access guest physical memory through GEM object interface. As GEM proxy is such a special kind of GEM, a new flag I915_GEM_OBJECT_IS_PROXY is introduced to ban host from changing the backing storage of GEM proxy. v3: - update "Reviewed-by". (Joonas) v2: - return -ENXIO when pin and map pages of GEM proxy to kernel space. (Chris) Here are the histories of this patch in "Dma-buf support for Gvt-g" patch-set: v14: - return -ENXIO when gem proxy object is banned by ioctl. (Chris) (Daniel) v13: - add comments to GEM proxy. (Chris) - don't ban GEM proxy in i915_gem_sw_finish_ioctl. (Chris) - check GEM proxy bar after finishing i915_gem_object_wait. (Chris) - remove GEM proxy bar in i915_gem_madvise_ioctl. v6: - add gem proxy barrier in the following ioctls. (Chris) i915_gem_set_caching_ioctl i915_gem_set_domain_ioctl i915_gem_sw_finish_ioctl i915_gem_set_tiling_ioctl i915_gem_madvise_ioctl Signed-off-by: Tina Zhang <tina.zhang@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/1510555798-21079-2-git-send-email-tina.zhang@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171114102513.22269-2-chris@chris-wilson.co.uk
2017-10-16drm/i915: Move dev_priv->mm.[un]bound_list to its own lockChris Wilson1-1/+6
Remove the struct_mutex requirement around dev_priv->mm.bound_list and dev_priv->mm.unbound_list by giving it its own spinlock. This reduces one more requirement for struct_mutex and in the process gives us slightly more accurate unbound_list tracking, which should improve the shrinker - but the drawback is that we drop the retirement before counting so i915_gem_object_is_active() may be stale and lead us to underestimate the number of objects that may be shrunk (see commit bed50aea61df ("drm/i915/shrinker: Flush active on objects before counting")). v2: Crosslink the spinlock to the lists it protects, and btw this changes s/obj->global_link/obj->mm.link/ v3: Fix decoupling of old links in i915_gem_object_attach_phys() v3.1: Fix the fix, only unlink if it was linked v3.2: Use a local for to_i915(obj->base.dev)->mm.obj_lock Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171016114037.5556-1-chris@chris-wilson.co.uk
2017-10-16drm/i915: Rename obj->pin_display to obj->pin_globalChris Wilson1-1/+2
In the next patch, we want to extend use of the global pin counter for semi-permanent pinning of context/ring objects. Given that we plan to extend the usage to encompass a disparate set of objects, we want a name that reflects both and should entail less confusion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171013202621.7276-2-chris@chris-wilson.co.uk
2017-10-09drm/i915: Track user GTT faulting per-vmaChris Wilson1-0/+1
We don't wish to refault the entire object (other vma) when unbinding one partial vma. To do this track which vma have been faulted into the user's address space. v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171009084401.29090-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-07drm/i915/selftests: huge page testsMatthew Auld1-0/+2
v2: mock test page support configurations and add MI_STORE_DWORD test v3: run all mockable huge page tests on all platforms via the mock_device v4: add pin_update regression test various improvements suggested by Chris v5: fix issues reported by kbuild test single sg spanning multiple page sizes don't explode when running the live-tests through the appgtt v6: lots of improvements from Chris v7: run on each engine for igt_write_huge add simple tmpfs fallback test v8: size_t is bad don't break the i386 build Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-18-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-17-chris@chris-wilson.co.uk
2017-10-07drm/i915: accurate page size tracking for the ppgttMatthew Auld1-0/+10
Now that we support multiple page sizes for the ppgtt, it would be useful to track the real usage for debugging purposes. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-16-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-15-chris@chris-wilson.co.uk
2017-10-07drm/i915: introduce page_size membersMatthew Auld1-0/+17
In preparation for supporting huge gtt pages for the ppgtt, we introduce page size members for gem objects. We fill in the page sizes by scanning the sg table. v2: pass the sg_mask to set_pages v3: calculate the sg_mask inline with populating the sg_table where possible, and pass to set_pages along with the pages. v4: bunch of improvements from Joonas v5: fix num_pages blunder introduce i915_sg_page_sizes helper v6: prefer GEM_BUG_ON(sizes == 0) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-7-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-6-chris@chris-wilson.co.uk
2017-10-07drm/i915: push set_pages down to the callersMatthew Auld1-1/+1
Each backend is now responsible for calling __i915_gem_object_set_pages upon successfully gathering its backing storage. This eliminates the inconsistency between the async and sync paths, which stands out even more when we start throwing around an sg_mask in a later patch. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006145041.21673-6-matthew.auld@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171006221833.32439-5-chris@chris-wilson.co.uk
2017-08-18drm/i915: Replace execbuf vma ht with an idrChris Wilson1-1/+22
This was the competing idea long ago, but it was only with the rewrite of the idr as an radixtree and using the radixtree directly ourselves, along with the realisation that we can store the vma directly in the radixtree and only need a list for the reverse mapping, that made the patch performant enough to displace using a hashtable. Though the vma ht is fast and doesn't require any extra allocation (as we can embed the node inside the vma), it does require a thread for resizing and serialization and will have the occasional slow lookup. That is hairy enough to investigate alternatives and favour them if equivalent in peak performance. One advantage of allocating an indirection entry is that we can support a single shared bo between many clients, something that was done on a first-come first-serve basis for shared GGTT vma previously. To offset the extra allocations, we create yet another kmem_cache for them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170816085210.4199-5-chris@chris-wilson.co.uk
2017-08-15drm/i915: Split obj->cache_coherent to track r/wChris Wilson1-1/+8
Another month, another story in the cache coherency saga. This time, we come to the realisation that i915_gem_object_is_coherent() has been reporting whether we can read from the target without requiring a cache invalidate; but we were using it in places for testing whether we could write into the object without requiring a cache flush. So split the tracking into two, one to decide before reads, one after writes. See commit e27ab73d17ef ("drm/i915: Mark CPU cache as dirty on every transition for CPU writes") for the previous entry in this saga. v2: Be verbose v3: Remove unused function (i915_gem_object_is_coherent) v4: Fix inverted coherency check prior to execbuf (from v2) v5: Add comment for nasty code where we are optimising on gcc's behalf. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101109 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101555 Testcase: igt/kms_mmap_write_crc Testcase: igt/kms_pwrite_crc Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170811111116.10373-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16drm/i915: Store a direct lookup from object handle to vmaChris Wilson1-3/+1
The advent of full-ppgtt lead to an extra indirection between the object and its binding. That extra indirection has a noticeable impact on how fast we can convert from the user handles to our internal vma for execbuffer. In order to bypass the extra indirection, we use a resizable hashtable to jump from the object to the per-ctx vma. rhashtable was considered but we don't need the online resizing feature and the extra complexity proved to undermine its usefulness. Instead, we simply reallocate the hastable on demand in a background task and serialize it before iterating. In non-full-ppgtt modes, multiple files and multiple contexts can share the same vma. This leads to having multiple possible handle->vma links, so we only use the first to establish the fast path. The majority of buffers are not shared and so we should still be able to realise speedups with multiple clients. v2: Prettier names, more magic. v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirtyChris Wilson1-0/+1
For ease of use (i.e. avoiding a few checks and function calls), store the object's cache coherency next to the cache is dirty bit. Specifically this patch aims to reduce the frequency of no-op calls to i915_gem_object_clflush() to counter-act the increase of such calls for GPU only objects in the previous patch. v2: Replace cache_dirty & ~cache_coherent with cache_dirty && !cache_coherent as gcc generates much better code for the latter (Tvrtko) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Tested-by: Dongwon Kim <dongwon.kim@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170616105455.16977-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-05-30drm/i915: Add kerneldoc to describe i915_gem_object.vma_listChris Wilson1-1/+16
Add kerneldoc for the vma_list stored on the i915_gem_object, in particular, documenting the expected ordering of elements -- i.e. that we do expect GGTT VMA first followed by the ppGTT VMA. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170525204818.12044-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-05-24drm/i915: Convert i915_gem_object_ops->flags values to use BIT()Chris Wilson1-2/+2
Having just watched someone add a new value, 0x3, without realising that the flags were bit values, I have come to appreciate the value in using BIT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170523103116.32239-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-03-08drm/i915: Purge i915_gem_object_is_dead()Chris Wilson1-6/+0
i915_gem_object_is_dead() was a temporary lockdep aide whilst transitioning to a new locking structure for obj->mm. Since commit 1233e2db199d ("drm/i915: Move object backing storage manipulation to its own locking") it is now unused and should be removed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170308132629.7987-2-chris@chris-wilson.co.uk
2017-03-08Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queuedDaniel Vetter1-1/+1
Backmerge drm-next to get at all the good stuff in drm-misc. We need that because: - drm_connector_list_iter conversion for i915 needs the core patches. - Maarten's patches to use the new atomic state iterators also need the core patches. - We need the new link status property to complete the DP retraining work, merging through 2 branches wasn't a good idea and we had to partially backtrack. - Chris needs reservation_object_trylock and we want to roll out kref_read everywhere. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-03-08Merge tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel into drm-nextDave Airlie1-8/+33
4 weeks worth of stuff since I was traveling&lazy: - lspcon improvements (Imre) - proper atomic state for cdclk handling (Ville) - gpu reset improvements (Chris) - lots and lots of polish around fences, requests, waiting and everything related all over (both gem and modeset code), from Chris - atomic by default on gen5+ minus byt/bsw (Maarten did the patch to flip the default, really this is a massive joint team effort) - moar power domains, now 64bit (Ander) - big pile of in-kernel unit tests for various gem subsystems (Chris), including simple mock objects for i915 device and and the ggtt manager. - i915_gpu_info in debugfs, for taking a snapshot of the current gpu state. Same thing as i915_error_state, but useful if the kernel didn't notice something is stick. From Chris. - bxt dsi fixes (Umar Shankar) - bxt w/a updates (Jani) - no more struct_mutex for gem object unreference (Chris) - some execlist refactoring (Tvrtko) - color manager support for glk (Ander) - improve the power-well sync code to better take over from the firmware (Imre) - gem tracepoint polish (Tvrtko) - lots of glk fixes all around (Ander) - ctx switch improvements (Chris) - glk dsi support&fixes (Deepak M) - dsi fixes for vlv and clanups, lots of them (Hans de Goede) - switch to i915.ko types in lots of our internal modeset code (Ander) - byt/bsw atomic wm update code, yay (Ville) * tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel: (432 commits) drm/i915: Update DRIVER_DATE to 20170306 drm/i915: Don't use enums for hardware engine id drm/i915: Split breadcrumbs spinlock into two drm/i915: Refactor wakeup of the next breadcrumb waiter drm/i915: Take reference for signaling the request from hardirq drm/i915: Add FIFO underrun tracepoints drm/i915: Add cxsr toggle tracepoint drm/i915: Add VLV/CHV watermark/FIFO programming tracepoints drm/i915: Add plane update/disable tracepoints drm/i915: Kill level 0 wm hack for VLV/CHV drm/i915: Workaround VLV/CHV sprite1->sprite0 enable underrun drm/i915: Sanitize VLV/CHV watermarks properly drm/i915: Only use update_wm_{pre,post} for pre-ilk platforms drm/i915: Nuke crtc->wm.cxsr_allowed drm/i915: Compute proper intermediate wms for vlv/cvh drm/i915: Skip useless watermark/FIFO related work on VLV/CHV when not needed drm/i915: Compute vlv/chv wms the atomic way drm/i915: Compute VLV/CHV FIFO sizes based on the PM2 watermarks drm/i915: Plop vlv/chv fifo sizes into crtc state drm/i915: Plop vlv wm state into crtc_state ...
2017-03-07drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctlChris Wilson1-0/+3
Before we instantiate/pin the backing store for our use, we can prepopulate the shmemfs filp efficiently using a write into the pagecache. We avoid the penalty of instantiating all the pages, important if the user is just writing to a few and never uses the object on the GPU, and using a direct write into shmemfs allows it to avoid the cost of retrieving a page (mostly the clear-before-use, but in theory we could curtail swapin) before it is overwritten. This can be extended later to provide additional specialisation for other backends (other than shmemfs). For now it provides a defense against very large write-only allocations from exhausting all of system memory. v2: Smelling fixes. Fixes: fe115628d567 ("drm/i915: Implement pwrite without struct-mutex") References: https://bugs.freedesktop.org/show_bug.cgi?id=99107 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.10+ Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170307120338.7277-2-chris@chris-wilson.co.uk
2017-03-01drm/i915: Prevent concurrent tiling/framebuffer modificationsChris Wilson1-1/+17
Reintroduce a lock around tiling vs framebuffer creation to prevent modification of the obj->tiling_and_stride whilst the framebuffer is being created. Rather than use struct_mutex once again, use the per-object lock - this will also be required in future to prevent changing the tiling whilst submitting rendering. Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: 24dbf51a5517 ("drm/i915: struct_mutex is not required for allocating the framebuffer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170301154128.2841-2-chris@chris-wilson.co.uk
2017-02-23Merge tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds1-0/+23
Pull drm updates from Dave Airlie: "This is the main drm pull request for v4.11. Nothing too major, the tinydrm and mmu-less support should make writing smaller drivers easier for some of the simpler platforms, and there are a bunch of documentation updates. Intel grew displayport MST audio support which is hopefully useful to people, and FBC is on by default for GEN9+ (so people know where to look for regressions). AMDGPU has a lot of fixes that would like new firmware files installed for some GPUs. Other than that it's pretty scattered all over. I may have a follow up pull request as I know BenH has a bunch of AST rework and fixes and I'd like to get those in once they've been tested by AST, and I've got at least one pull request I'm just trying to get the author to fix up. Core: - drm_mm reworked - Connector list locking and iterators - Documentation updates - Format handling rework - MMU-less support for fbdev helpers - drm_crtc_from_index helper - Core CRC API - Remove drm_framebuffer_unregister_private - Debugfs cleanup - EDID/Infoframe fixes - Release callback - Tinydrm support (smaller drivers for simple hw) panel: - Add support for some new simple panels i915: - FBC by default for gen9+ - Shared dpll cleanups and docs - GEN8 powerdomain cleanup - DMC support on GLK - DP MST audio support - HuC loading support - GVT init ordering fixes - GVT IOMMU workaround fix amdgpu/radeon: - Power/clockgating improvements - Preliminary SR-IOV support - TTM buffer priority and eviction fixes - SI DPM quirks removed due to firmware fixes - Powerplay improvements - VCE/UVD powergating fixes - Cleanup SI GFX code to match CI/VI - Support for > 2 displays on 3/5 crtc asics - SI headless fixes nouveau: - Rework securre boot code in prep for GP10x secure boot - Channel recovery improvements - Initial power budget code - MMU rework preperation vmwgfx: - Bunch of fixes and cleanups exynos: - Runtime PM support for MIC driver - Cleanups to use atomic helpers - UHD Support for TM2/TM2E boards - Trigger mode fix for Rinato board etnaviv: - Shader performance fix - Command stream validator fixes - Command buffer suballocator rockchip: - CDN DisplayPort support - IOMMU support for arm64 platform imx-drm: - Fix i.MX5 TV encoder probing - Remove lower fb size limits msm: - Support for HW cursor on MDP5 devices - DSI encoder cleanup - GPU DT bindings cleanup sti: - stih410 cleanups - Create fbdev at binding - HQVDP fixes - Remove stih416 chip functionality - DVI/HDMI mode selection fixes - FPS statistic reporting omapdrm: - IRQ code cleanup dwi-hdmi bridge: - Cleanups and fixes adv-bridge: - Updates for nexus sii8520 bridge: - Add interlace mode support - Rework HDMI and lots of fixes qxl: - probing/teardown cleanups ZTE drm: - HDMI audio via SPDIF interface - Video Layer overlay plane support - Add TV encoder output device atmel-hlcdc: - Rework fbdev creation logic tegra: - OF node fix fsl-dcu: - Minor fixes mali-dp: - Assorted fixes sunxi: - Minor fix" [ This was the "fixed" pull, that still had build warnings due to people not even having build tested the result. I'm not a happy camper I've fixed the things I noticed up in this merge. - Linus ] * tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux: (1177 commits) lib/Kconfig: make PRIME_NUMBERS not user selectable drm/tinydrm: helpers: Properly fix backlight dependency drm/tinydrm: mipi-dbi: Fix field width specifier warning drm/tinydrm: mipi-dbi: Silence: ‘cmd’ may be used uninitialized drm/sti: fix build warnings in sti_drv.c and sti_vtg.c files drm/amd/powerplay: fix PSI feature on Polars12 drm/amdgpu: refuse to reserve io mem for split VRAM buffers drm/ttm: fix use-after-free races in vm fault handling drm/tinydrm: Add support for Multi-Inno MI0283QT display dt-bindings: Add Multi-Inno MI0283QT binding dt-bindings: display/panel: Add common rotation property of: Add vendor prefix for Multi-Inno drm/tinydrm: Add MIPI DBI support drm/tinydrm: Add helper functions drm: Add DRM support for tiny LCD displays drm/amd/amdgpu: post card if there is real hw resetting performed drm/nouveau/tmr: provide backtrace when a timeout is hit drm/nouveau/pci/g92: Fix rearm drm/nouveau/drm/therm/fan: add a fallback if no fan control is specified in the vbios drm/nouveau/hwmon: expose power_max and power_crit ..
2017-02-22drm/i915: Amalgamate flushing of display objectsChris Wilson1-0/+2
We have three different paths by which userspace wants to flush the display plane (i.e. objects with obj->pin_display). Use a common helper to identify those paths and to simplify a later change. v2: Include the conditional in the name, i915_gem_object_flush_if_display Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-3-chris@chris-wilson.co.uk
2017-02-16drm/i915: Remove struct_mutex for destroying framebuffersChris Wilson1-1/+1
We do not need to hold struct_mutex for destroying drm_i915_gem_objects any longer, and with a little care taken over tracking obj->framebuffer_references, we can relinquish BKL locking around the destroy of intel_framebuffer. v2: Use atomic check for WARN_ON framebuffer miscounting Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170216094621.3426-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-02-13drm/i915: Exercise filling the top/bottom portions of the ppgttChris Wilson1-0/+3
Allocate objects with varying number of pages (which should hopefully consist of a mixture of contiguous page chunks and so coalesced sg lists) and check that the sg walkers in insert_pages cope. v2: Check both small <-> large Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-28-chris@chris-wilson.co.uk
2017-02-13drm/i915: Create a fake object for testing huge allocationsChris Wilson1-8/+12
We would like to be able to exercise huge allocations even on memory constrained devices. To do this we create an object that allocates only a few pages and remaps them across its whole range - each page is reused multiple times. We can therefore pretend we are rendering into a much larger object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-9-chris@chris-wilson.co.uk
2017-01-14locking/atomic, kref: Add kref_read()Peter Zijlstra1-1/+1
Since we need to change the implementation, stop exposing internals. Provide kref_read() to read the current reference count; typically used for debug messages. Kills two anti-patterns: atomic_read(&kref->refcount) kref->refcount.counter Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-10drm/i915: Split out i915_gem_object_set_tiling()Chris Wilson1-0/+3
Expose an interface for changing the tiling and stride on an object, that includes the complexity of checking for conflicting bindings and fence registers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170110121045.27144-2-chris@chris-wilson.co.uk
2017-01-10drm/i915: Extract tile_row_size for fencingChris Wilson1-0/+20
Computing the tile row size of a tiled object (for use with fence registers) is repeated, so extract it to a common helper. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170109161613.11881-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-18drm/i915: Move frontbuffer CS write tracking from ggtt vma to objectChris Wilson1-0/+1
I tried to avoid having to track the write for every VMA by only tracking writes to the ggtt. However, for the purposes of frontbuffer tracking this is insufficient as we need to invalidate around writes not just to the the ggtt but all aliased ppgtt views of the framebuffer. By moving the critical section to the object and only doing so for framebuffer writes we can reduce the tracking even further by only watching framebuffers and not vma. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161116190704.5293-1-chris@chris-wilson.co.uk Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-11drm/i915: Split out i915_vma.cJoonas Lahtinen1-0/+337
As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com