aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/gem (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva3-5/+5
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-12mm/gup: remove task_struct pointer for all gup codePeter Xu1-1/+1
After the cleanup of page fault accounting, gup does not need to pass task_struct around any more. Remove that parameter in the whole gup stack. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-08drm/i915: Remove i915_gem_object_get_dirty_page()Chris Wilson2-18/+0
Last user removed, remove the get_dirty_page convenience function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200708173748.32734-4-chris@chris-wilson.co.uk
2020-07-08drm/i915: Release shortlived maps of longlived objectsChris Wilson2-0/+17
Some objects we map once during their construction, and then never access their mappings again, even if they are kept around for the duration of the driver. Keeping those pages mapped, often vmapped, is therefore wasteful and we should release the maps as soon as we no longer need them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200708173748.32734-3-chris@chris-wilson.co.uk
2020-07-08drm/i915/gem: Unpin idle contexts from kswapd reclaimChris Wilson1-9/+16
We removed retiring requests from the shrinker in order to decouple the mutexes from reclaim in preparation for unravelling the struct_mutex. The impact of not retiring is that we are much less agressive in making global objects available for shrinking, as such objects remain pinned until they are flushed by a heartbeat pulse following the last retired request along their timeline. In order to ensure that pulse occurs in time for memory reclamation, we should kick it from kswapd. The catch is that we have added some flush_work() into the retirement phase (to ensure that we reach a global idle in a timely manner), but these flush_work() are not eligible (i.e do not belong to WQ_MEM_RELCAIM) for use from inside kswapd. To avoid flushing those workqueues, we teach the retirer not to do so unless we are actually waiting, and only do the plain retire from inside the shrinker. Note that for execlists, we already retire completed contexts as they are scheduled out, so it should not be keeping global state unnecessarily pinned. The legacy ringbuffer however... References: 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200708173748.32734-1-chris@chris-wilson.co.uk
2020-07-08drm/i915/sseu: Move sseu_info under gt_infoVenkata Sandeep Dhanalakota3-5/+9
SSEUs are a GT capability, so track them under gt_info. Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-8-daniele.ceraolospurio@intel.com
2020-07-08drm/i915: Move the engine mask to intel_gt_infoDaniele Ceraolo Spurio1-2/+1
Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
2020-07-03drm/i915: Export ppgtt_bind_vmaChris Wilson1-4/+5
Reuse the ppgtt_bind_vma() for aliasing_ppgtt_bind_vma() so we can reduce some code near-duplication. The catch is that we need to then pass along the i915_address_space and not rely on vma->vm, as they differ with the aliasing-ppgtt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200703102519.26539-1-chris@chris-wilson.co.uk
2020-07-03drm/i915/gem: Split the context's obj:vma lut into its own mutexChris Wilson5-13/+26
Rather than reuse the common ctx->mutex for locking the execbuffer LUT, split it into its own lock to avoid being taken [as part of ctx->mutex] at inappropriate times. In particular to avoid the inversion from taking the timeline->mutex for the whole execbuf submission in the next patch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200703004306.11117-1-chris@chris-wilson.co.uk
2020-07-02drm/i915/gem: Drop forced struct_mutex from shrinker_taints_mutexChris Wilson1-11/+0
Since we no longer always take struct_mutex around everything, and want the freedom to create GEM objects, actually taking struct_mutex inside the lock creation ends up pulling the mutex inside other looks. Since we don't use generally use struct_mutex, we can relax the tainting. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200702083225.20044-3-chris@chris-wilson.co.uk
2020-07-02drm/i915/gem: Only revoke mmap handlers if activeChris Wilson3-28/+24
Avoid waking up the device and taking stale locks if we know that the object is not currently mmapped. This is particularly useful as not many object are actually mmapped and so we can destroy them without waking the device up, and gives us a little more freedom of workqueue ordering during shutdown. v2: Pull the release_mmap() into its single user in freeing the objects, where there can not be any race with a concurrent user of the freed object. Or so one hopes! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>, Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>, Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-2-chris@chris-wilson.co.uk
2020-07-02drm/i915/gem: Only revoke the GGTT mmappings on aperture detiling changesChris Wilson3-3/+6
Only a GGTT mmapping will use the aperture detiling registers, so on a tiling change for an object, we only need to revoke those mmappings and not the CPU mmappings (which are always linear irrespective of the tiling). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-1-chris@chris-wilson.co.uk
2020-07-01drm/i915/gem: Move obj->lut_list under its own lockChris Wilson4-12/+20
The obj->lut_list is traversed when the object is closed as the file table is destroyed during process termination. As this occurs before we kill any outstanding context if, due to some bug or another, the closure is blocked, then we fail to shootdown any inflight operations potentially leaving the GPU spinning forever. As we only need to guard the list against concurrent closures and insertions, the hold is short and merits being treated as a simple spinlock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
2020-06-30drm/i915/gem: Avoid kmalloc under i915->mm_lockChris Wilson1-67/+64
Rearrange the allocation of the mm_struct registration to avoid allocating underneath the i915->mm_lock, so that we avoid tainting the lock (and in turn many other locks that may be held as i915->mm_lock is taken, and those locks we may want on the free [shrinker] paths). In doing so, we convert the lookup to be RCU protected by courtesy of converting the free-worker to be an rcu_work. v2: Remember to use hash_rcu variants to protect the list iteration from concurrent add/del. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200619194038.5088-1-chris@chris-wilson.co.uk
2020-06-25Merge drm/drm-next into drm-intel-next-queuedJani Nikula6-28/+34
Catch up with upstream, in particular to get c1e8d7c6a7a6 ("mmap locking API: convert mmap_sem comments"). Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-06-22drm/i915/params: switch to device specific parametersJani Nikula1-2/+2
Start using device specific parameters instead of module parameters for most things. The module parameters become the immutable initial values for i915 parameters. The device specific parameters in i915->params start life as a copy of i915_modparams. Any later changes are only reflected in the debugfs. The stragglers are: * i915.force_probe and i915.modeset. Needed before dev_priv is available. This is fine because the parameters are read-only and never modified. * i915.verbose_state_checks. Passing dev_priv to I915_STATE_WARN and I915_STATE_WARN_ON would result in massive and ugly churn. This is handled by not exposing the parameter via debugfs, and leaving the parameter writable in sysfs. This may be fixed up in follow-up work. * i915.inject_probe_failure. Only makes sense in terms of the module, not the device. This is handled by not exposing the parameter via debugfs. v2: Fix uc i915 lookup code (Michał Winiarski) Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20200618150402.14022-1-jani.nikula@intel.com
2020-06-16drm/i915: Remove redundant i915_request_await_object in blit clearsTvrtko Ursulin1-31/+21
One i915_request_await_object is enough and we keep the one under the object lock so it is final. At the same time move async clflushing setup under the same locked section and consolidate common code into a helper function. v2: * Emit initial breadcrumbs after aways are set up. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200615151449.32605-1-tvrtko.ursulin@linux.intel.com
2020-06-11Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-8/+48
Pull drm fixes from Dave Airlie: "One sun4i fix and a connector hotplug race The ast fix is for a regression in 5.6, and one of the i915 ones fixes an oops reported by dhowells. core: - fix race in connectors sending hotplug i915: - Avoid use after free in cmdparser - Avoid NULL dereference when probing all display encoders - Fixup to module parameter type sun4i: - clock divider fix ast: - 24/32 bpp mode setting fix" * tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm: drm/ast: fix missing break in switch statement for format->cpp[0] case 4 drm/sun4i: hdmi ddc clk: Fix size of m divider drm/i915/display: Only query DP state of a DDI encoder drm/i915/params: fix i915.reset module param type drm/i915/gem: Mark the buffer pool as active for the cmdparser drm/connector: notify userspace on hotplug after register complete
2020-06-10Merge branch 'uaccess.i915' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-5/+0
Pull i915 uaccess updates from Al Viro: "Low-hanging fruit in i915; there are several trickier followups, but that'll wait for the next cycle" * 'uaccess.i915' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: i915:get_engines(): get rid of pointless access_ok() i915: alloc_oa_regs(): get rid of pointless access_ok() i915 compat ioctl(): just use drm_ioctl_kernel() i915: switch copy_perf_config_registers_or_number() to unsafe_put_user() i915: switch query_{topology,engine}_info() to copy_to_user()
2020-06-09mmap locking API: convert mmap_sem commentsMichel Lespinasse1-3/+3
Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mmap locking API: use coccinelle to convert mmap_sem rwsem call sitesMichel Lespinasse2-6/+6
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08Merge tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2-8/+11
Pull drm fixes from Dave Airlie: "These are the fixes from last week for the stuff merged in the merge window. It got a bunch of nouveau fixes for HDA audio on some new GPUs, some i915 and some amdpgu fixes. i915: - gvt: Fix one clang warning on debug only function - Use ARRAY_SIZE for coccicheck warning - Use after free fix for display global state. - Whitelisting context-local timestamp on Gen9 and two scheduler fixes with deps (Cc: stable) - Removal of write flag from sysfs files where ineffective nouveau: - HDMI/DP audio HDA fixes - display hang fix for Volta/Turing - GK20A regression fix. amdgpu: - Prevent hwmon accesses while GPU is in reset - CTF interrupt fix - Backlight fix for renoir - Fix for display sync groups - Display bandwidth validation workaround" * tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (28 commits) drm/nouveau/kms/nv50-: clear SW state of disabled windows harder drm/nouveau: gr/gk20a: Use firmware version 0 drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs drm/nouveau/disp/gp100: split SOR implementation from gm200 drm/nouveau/disp: modify OR allocation policy to account for HDA requirements drm/nouveau/disp: split part of OR allocation logic into a function drm/nouveau/disp: provide hint to OR allocation about HDA requirements drm/amd/display: Revalidate bandwidth before commiting DC updates drm/amdgpu/display: use blanked rather than plane state for sync groups drm/i915/params: fix i915.fake_lmem_start module param sysfs permissions drm/i915/params: don't expose inject_probe_failure in debugfs drm/i915: Whitelist context-local timestamp in the gen9 cmdparser drm/i915: Fix global state use-after-frees with a refcount drm/i915: Check for awaits on still currently executing requests drm/i915/gt: Do not schedule normal requests immediately along virtual drm/i915: Reorder await_execution before await_request drm/nouveau/kms/gt215-: fix race with audio driver runpm drm/nouveau/disp/gm200-: fix NV_PDISP_SOR_HDMI2_CTRL(n) selection Revert "drm/amd/display: disable dcn20 abm feature for bring up" drm/amd/powerplay: ack the SMUToHost interrupt on receive V2 ...
2020-06-08drm/i915/gem: Mark the buffer pool as active for the cmdparserChris Wilson1-8/+48
If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser") Fixes: 16e87459673a ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk (cherry picked from commit 57a78ca4eceab1ecb0299fba8a10211289329889) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-06-08Merge tag 'drm-intel-next-fixes-2020-06-04' of git://anongit.freedesktop.org/drm/drm-intel into drm-nextDave Airlie2-8/+11
- Includes gvt-next-fixes-2020-05-28 - Use after free fix for display global state. - Whitelisting context-local timestamp on Gen9 and two scheduler fixes with deps (Cc: stable) - Removal of write flag from sysfs files where ineffective Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604150454.GA59322@jlahtine-desk.ger.corp.intel.com
2020-06-05drm/i915/gem: Delete unused codeChris Wilson1-19/+0
Unused as of commit 9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only"), but left behind. >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:933:21: error: unused function 'unmask_page' [-Werror,-Wunused-function] static inline void *unmask_page(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:938:28: error: unused function 'unmask_flags' [-Werror,-Wunused-function] static inline unsigned int unmask_flags(unsigned long p) ^ >> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:945:33: error: unused function 'cache_to_ggtt' [-Werror,-Wunused-function] static inline struct i915_ggtt *cache_to_ggtt(struct reloc_cache *cache) Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200605200357.13069-1-chris@chris-wilson.co.uk
2020-06-05drm/i915/gem: Async GPU relocations onlyChris Wilson2-289/+27
Reduce the 3 relocation paths down to the single path that accommodates all. The primary motivation for this is to guard the relocations with a natural fence (derived from the i915_request used to write the relocation from the GPU). The tradeoff in using async gpu relocations is that it increases latency over using direct CPU relocations, for the cases where the target is idle and accessible by the CPU. The benefit is greatly reduced lock contention and improved concurrency by pipelining. Note that forcing the async gpu relocations does reveal a few issues they have. Firstly, is that they are visible as writes to gem_busy, causing to mark some buffers are being to written to by the GPU even though userspace only reads. Secondly is that, in combination with the cmdparser, they can cause priority inversions. This should be the case where the work is being put into a common workqueue losing our priority information and so being executed in FIFO from the worker, denying us the opportunity to reorder the requests afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk
2020-06-04drm/i915/gem: Mark the buffer pool as active for the cmdparserChris Wilson1-8/+48
If the execbuf is interrupted after building the cmdparser pipeline, and before we commit to submitting the request to HW, we would attempt to clean up the cmdparser early. While we held active references to the vma being parsed and constructed, we did not hold an active reference for the buffer pool itself. The result was that an interrupted execbuf could still have run the cmdparser pipeline, but since the buffer pool was idle, its target vma could have been recycled. Note this problem only occurs if the cmdparser is running async due to pipelined waits on busy fences, and the execbuf is interrupted. Fixes: 686c7c35abc2 ("drm/i915/gem: Asynchronous cmdparser") Fixes: 16e87459673a ("drm/i915/gt: Move the batch buffer pool from the engine to the gt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
2020-06-04drm/i915/selftests: Exercise all copy engines with the blt routinesChris Wilson4-24/+75
Just to remove an obnoxious HAS_ENGINES(), and in the process make the code agnostic to the availabilty of any particular engine by making it exercise any and all such engines declared on the system. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604123641.767-1-chris@chris-wilson.co.uk
2020-06-03drm/i915: convert get_user_pages() --> pin_user_pages()John Hubbard1-9/+13
This code was using get_user_pages*(), in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages*() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: "Joonas Lahtinen" <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: http://lkml.kernel.org/r/20200519002124.2025955-5-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03drm/i915: Drop i915_request.i915 backpointerChris Wilson1-2/+2
We infrequently use the direct i915 backpointer from the i915_request, so do we really need to waste the space in the struct for it? 8 bytes from the most frequently allocated struct vs an 3 bytes and pointer chasing in using rq->engine->i915? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
2020-06-02Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drmLinus Torvalds25-510/+1570
Pull drm updates from Dave Airlie: "Highlights: - Core DRM had a lot of refactoring around managed drm resources to make drivers simpler. - Intel Tigerlake support is on by default - amdgpu now support p2p PCI buffer sharing and encrypted GPU memory Details: core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling" * tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits) drm/amd/display: Fix potential integer wraparound resulting in a hang drm/amd/display: drop cursor position check in atomic test drm/amdgpu: fix device attribute node create failed with multi gpu drm/nouveau: use correct conflicting framebuffer API drm/vblank: Fix -Wformat compile warnings on some arches drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode drm/amd/display: Handle GPU reset for DC block drm/amdgpu: add apu flags (v2) drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven drm/amdgpu: fix pm sysfs node handling (v2) drm/amdgpu: move gpu_info parsing after common early init drm/amdgpu: move discovery gfx config fetching drm/nouveau/dispnv50: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau/debugfs: fix runtime pm imbalance on error drm/nouveau/nouveau/hmm: fix migrate zero page to GPU drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes() ...
2020-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-1/+1
Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ...
2020-06-02mm: remove the prot argument from vm_map_ramChristoph Hellwig1-1/+1
This is always PAGE_KERNEL - for long term mappings with other properties vmap should be used. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-19-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02gup: document and work around "COW can break either way" issueLinus Torvalds1-0/+8
Doing a "get_user_pages()" on a copy-on-write page for reading can be ambiguous: the page can be COW'ed at any time afterwards, and the direction of a COW event isn't defined. Yes, whoever writes to it will generally do the COW, but if the thread that did the get_user_pages() unmapped the page before the write (and that could happen due to memory pressure in addition to any outright action), the writer could also just take over the old page instead. End result: the get_user_pages() call might result in a page pointer that is no longer associated with the original VM, and is associated with - and controlled by - another VM having taken it over instead. So when doing a get_user_pages() on a COW mapping, the only really safe thing to do would be to break the COW when getting the page, even when only getting it for reading. At the same time, some users simply don't even care. For example, the perf code wants to look up the page not because it cares about the page, but because the code simply wants to look up the physical address of the access for informational purposes, and doesn't really care about races when a page might be unmapped and remapped elsewhere. This adds logic to force a COW event by setting FOLL_WRITE on any copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page pointer as a result. The current semantics end up being: - __get_user_pages_fast(): no change. If you don't ask for a write, you won't break COW. You'd better know what you're doing. - get_user_pages_fast(): the fast-case "look it up in the page tables without anything getting mmap_sem" now refuses to follow a read-only page, since it might need COW breaking. Which happens in the slow path - the fast path doesn't know if the memory might be COW or not. - get_user_pages() (including the slow-path fallback for gup_fast()): for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with very similar semantics to FOLL_FORCE. If it turns out that we want finer granularity (ie "only break COW when it might actually matter" - things like the zero page are special and don't need to be broken) we might need to push these semantics deeper into the lookup fault path. So if people care enough, it's possible that we might end up adding a new internal FOLL_BREAK_COW flag to go with the internal FOLL_COW flag we already have for tracking "I had a COW". Alternatively, if it turns out that different callers might want to explicitly control the forced COW break behavior, we might even want to make such a flag visible to the users of get_user_pages() instead of using the above default semantics. But for now, this is mostly commentary on the issue (this commit message being a lot bigger than the patch, and that patch in turn is almost all comments), with that minimal "enable COW breaking early" logic using the existing FOLL_WRITE behavior. [ It might be worth noting that we've always had this ambiguity, and it could arguably be seen as a user-space issue. You only get private COW mappings that could break either way in situations where user space is doing cooperative things (ie fork() before an execve() etc), but it _is_ surprising and very subtle, and fork() is supposed to give you independent address spaces. So let's treat this as a kernel issue and make the semantics of get_user_pages() easier to understand. Note that obviously a true shared mapping will still get a page that can change under us, so this does _not_ mean that get_user_pages() somehow returns any "stable" page ] Reported-by: Jann Horn <jannh@google.com> Tested-by: Christoph Hellwig <hch@lst.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kirill Shutemov <kirill@shutemov.name> Acked-by: Jan Kara <jack@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-01Merge branch 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-2/+3
Pull uaccess/readdir updates from Al Viro: "Finishing the conversion of readdir.c to unsafe_... API. This includes the uaccess_{read,write}_begin series by Christophe Leroy" * 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: readdir.c: get rid of the last __put_user(), drop now-useless access_ok() readdir.c: get compat_filldir() more or less in sync with filldir() switch readdir(2) to unsafe_copy_dirent_name() drm/i915/gem: Replace user_access_begin by user_write_access_begin uaccess: Selectively open read or write user access uaccess: Add user_read_access_begin/end and user_write_access_begin/end
2020-05-29drm/i915/gem: Give each object class a friendly nameChris Wilson11-1/+14
Name the object classes and their offspring for easier lockdep debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-2-chris@chris-wilson.co.uk
2020-05-29drm/i915/gem: Taint all shrinkable object locksChris Wilson1-0/+4
If we declare that an object type is shrinkable (any that we can reclaim to recover system pages), make sure we taint the object mutex so that lockdep expects us to use it within fs_reclaim. lockdep will then complain the first time we try to allocate while holding the plain mutex, as doing so invites potential recursion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-1-chris@chris-wilson.co.uk
2020-05-25drm/i915/gem: Suppress some random warningsChris Wilson4-7/+4
Leave the error propagation in place, but limit the warnings to only show up in CI if the unlikely errors are hit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200525141957.3061-2-chris@chris-wilson.co.uk
2020-05-25drm/i915/gem: Avoid iterating an empty listChris Wilson1-6/+9
Our __sgt_iter assumes that the scattergather list has at least one element. But during construction we may fail in allocating the first page, and so mark the first element as the terminator. This is unexpected! [22555.524752] RIP: 0010:shmem_get_pages+0x506/0x710 [i915] [22555.524759] Code: 49 8b 2c 24 31 c0 66 89 44 24 40 48 85 ed 0f 84 62 01 00 00 4c 8b 75 00 8b 5d 08 44 8b 7d 0c 48 8b 0d 7e 34 07 e2 49 83 e6 fc <49> 8b 16 41 01 df 48 89 cf 48 89 d0 48 c1 e8 2d 48 85 c9 0f 84 c8 [22555.524765] RSP: 0018:ffffc9000053f9d0 EFLAGS: 00010246 [22555.524770] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8881ffffa000 [22555.524774] RDX: fffffffffffffff4 RSI: ffffffffffffffff RDI: ffffffff821efe00 [22555.524778] RBP: ffff8881b099ab00 R08: 0000000000000000 R09: 00000000fffffff4 [22555.524782] R10: 0000000000000002 R11: 00000000ffec0a02 R12: ffff8881cd3c8d60 [22555.524786] R13: 00000000fffffff4 R14: 0000000000000000 R15: 0000000000000000 [22555.524790] FS: 00007f4fbeb9b9c0(0000) GS:ffff8881f8580000(0000) knlGS:0000000000000000 [22555.524795] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [22555.524799] CR2: 0000000000000000 CR3: 00000001ec7f0004 CR4: 00000000001606e0 [22555.524803] Call Trace: [22555.524919] __i915_gem_object_get_pages+0x4f/0x60 [i915] Fixes: 85d1225ec066 ("drm/i915: Introduce & use new lightweight SGL iterators") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v4.8+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200522132706.5133-1-chris@chris-wilson.co.uk (cherry picked from commit 957ad9a02be6faa87594c58ac09460cd3d190d0e) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-05-25drm/i915: Disable semaphore inter-engine sync without timeslicingChris Wilson1-2/+2
Since the removal of the no-semaphore boosting, we rely on timeslicing to reorder passed inter-dependency hogs across the engines. However, we require preemption to support timeslicing into user payloads, and not all machine support preemption so we do not universally enable timeslicing, even when it would correctly preempt our own inter-engine semaphores. Since timeslicing and semaphore priority deboosting is now disabled on Broadwell/Braswell, we have to follow suite and not use semaphores. Testcase: igt/gem_exec_schedule/semaphore-codependency # bdw/bsw Fixes: 18e4af04d218 ("drm/i915: Drop no-semaphore boosting") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-1-chris@chris-wilson.co.uk (cherry picked from commit 0eb670aac27b1d615004c29efec595616e3e091a) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-05-22drm/i915/gem: Avoid iterating an empty listChris Wilson1-6/+9
Our __sgt_iter assumes that the scattergather list has at least one element. But during construction we may fail in allocating the first page, and so mark the first element as the terminator. This is unexpected! [22555.524752] RIP: 0010:shmem_get_pages+0x506/0x710 [i915] [22555.524759] Code: 49 8b 2c 24 31 c0 66 89 44 24 40 48 85 ed 0f 84 62 01 00 00 4c 8b 75 00 8b 5d 08 44 8b 7d 0c 48 8b 0d 7e 34 07 e2 49 83 e6 fc <49> 8b 16 41 01 df 48 89 cf 48 89 d0 48 c1 e8 2d 48 85 c9 0f 84 c8 [22555.524765] RSP: 0018:ffffc9000053f9d0 EFLAGS: 00010246 [22555.524770] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8881ffffa000 [22555.524774] RDX: fffffffffffffff4 RSI: ffffffffffffffff RDI: ffffffff821efe00 [22555.524778] RBP: ffff8881b099ab00 R08: 0000000000000000 R09: 00000000fffffff4 [22555.524782] R10: 0000000000000002 R11: 00000000ffec0a02 R12: ffff8881cd3c8d60 [22555.524786] R13: 00000000fffffff4 R14: 0000000000000000 R15: 0000000000000000 [22555.524790] FS: 00007f4fbeb9b9c0(0000) GS:ffff8881f8580000(0000) knlGS:0000000000000000 [22555.524795] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [22555.524799] CR2: 0000000000000000 CR3: 00000001ec7f0004 CR4: 00000000001606e0 [22555.524803] Call Trace: [22555.524919] __i915_gem_object_get_pages+0x4f/0x60 [i915] Fixes: 85d1225ec066 ("drm/i915: Introduce & use new lightweight SGL iterators") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v4.8+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200522132706.5133-1-chris@chris-wilson.co.uk
2020-05-21drm/i915: Remove PIN_UPDATE for i915_vma_pinChris Wilson1-142/+0
As we no longer use PIN_UPDATE (since commit 7d0aa0db4375 ("drm/i915/gem: Unbind all current vma on changing cache-level")) we can remove PIN_UPDATE itself. The benefit is just in simplifing the vma bind. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200521144949.25357-1-chris@chris-wilson.co.uk
2020-05-21drm/i915: Disable semaphore inter-engine sync without timeslicingChris Wilson1-2/+2
Since the removal of the no-semaphore boosting, we rely on timeslicing to reorder passed inter-dependency hogs across the engines. However, we require preemption to support timeslicing into user payloads, and not all machine support preemption so we do not universally enable timeslicing, even when it would correctly preempt our own inter-engine semaphores. Since timeslicing and semaphore priority deboosting is now disabled on Broadwell/Braswell, we have to follow suite and not use semaphores. Testcase: igt/gem_exec_schedule/semaphore-codependency # bdw/bsw Fixes: 18e4af04d218 ("drm/i915: Drop no-semaphore boosting") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-1-chris@chris-wilson.co.uk
2020-05-20Merge tag 'drm-intel-next-2020-05-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-nextDave Airlie15-249/+707
UAPI Changes: - drm/i915: Show per-engine default property values in sysfs By providing the default values configured into the kernel via sysfs, it is much more convenient for userspace to restore those sane defaults, or at least know what are considered good baseline. This is useful, for example, to cleanup after any failed userspace prior to commencing new jobs. Cross-subsystem Changes: - video/hdmi: Add Unpack only function for DRM infoframe - Includes pull request gvt-next-2020-05-12 Driver Changes: - Restore Cherryview back to full-ppgtt (Chris, Mika) - Document locking guidelines for i915 (Chris, Daniel, Joonas) - Fix GitLab #1746: Handle idling during i915_gem_evict_something busy loops (Chris) - Display WA #1105: Require linear fb stride to be multiple of 512 bytes on gen9/glk (Ville) - Add Wa_14010685332 for ICP/ICL (Matt R) - Restrict w/a 1607087056 for EHL/JSL (Swathi) - Fix interrupt handling for DP AUX transactions on Tigerlake (Imre) - Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" (Mika) - Fix HDC pipeline flush hardware bit on Gen12 (Mika) - Flush L3 when flushing render on Gen12 (Mika) - Invalidate aux table entries forcibly between BB on Gen12 (Mika) - Add aux table invalidate for all engines on Gen12 (Mika) - Force pte cacheline to main memory Gen8+ (Mika) - Add and enable TGL+ SAGV support (Stanislav) - Implement vm_ops->access on i915 mmaps for GDB (Chris, Kristian) - Replace zero-length array with flexible-array (Gustavo) - Improve batch buffer pool effectiveness to mitigate soft-rc6 hit (Chris) - Remove wait priority boosting (Chris) - Keep driver module referenced when PMU is active (Chris) - Sanitize RPS interrupts upon resume (Chris) - Extend pcode read timeout to 20 ms (Chris) - Wait for ACT sent before enabling MST pipe (Ville) - Extend support to async relocations to SNB (Chris) - Remove CNL pre-prod workarounds (Ville) - Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled (Sultan) - Record the active CCID from before reset (Chris) - Mark concurrent submissions with a weak-dependency (Chris) - Peel dma-fence-chains for await to allow engine-to-engine sync (Lionel) - Prevent using semaphores to chain up to external fences (Chris) - Fix GLK watermark calculations (Ville) - Emit await(batch) before MI_BB_START (Chris) - Reset execlists registers before HWSP (Chris) - Drop no-semaphore boosting in favor of fast timeslicing (Chris) - Fix enabled infoframe states of lspcon (Gwan-gyeong) - Program DP SDPs on pipe updates (Gwan-gyeong) - Stop sending DP SDPs on ddi disable (Gwan-gyeong) - Store CS timestamp frequency in Hz (Ville) - Remove unused HAS_FWTABLE macro (Pascal) - Use batchbuffer chaining for relocations to save ring space (Chris) - Try different engines for relocs if MI ops not supported (Chris, Tvrtko) - Lazily acquire the device wakeref for freeing objects (Chris) - Streamline display code arithmetics around rounding etc. (Ville) - Use bw state for per crtc SAGV evaluation (Stanislav) - Track active_pipes in bw_state (Stanislav) - Nuke mode.vrefresh usage (Ville) - Warn if the FBC is still writing to stolen on removal (Chris) - Added new PCode commands prepping for QGV rescricting (Stansilav) - Stop holding onto the pinned_default_state (Chris) - Propagate error from completed fences (Chris) - Ignore submit-fences on the same timeline (Chris) - Pull waiting on an external dma-fence into its routine (Chris) - Replace the hardcoded I915_FENCE_TIMEOUT with Kconfig (Chris) - Mark up the racy read of execlists->context_tag (Chris) - Tidy up the return handling for completed dma-fences (Chris) - Introduce skl_plane_wm_level accessor (Stanislav) - Extract SKL SAGV checking (Stanislav) - Make active_pipes check skl specific (Stanislav) - Suspend tasklets before resume sanitization (Chris) - Remove redundant exec_fence (Chris) - Mark the addition of the initial-breadcrumb in the request (Chris) - Transfer old virtual breadcrumbs to irq_worker (Chris) - Read the DP SDPs from the video DIP (Gwan-gyeong) - Program DP SDPs with computed configs (Gwan-gyeong) - Add state readout for DP VSC and DP HDR Metadata Infoframe SDP (Gwan-gyeong) - Add compute routine for DP PSR VSC SDP (Gwan-gyeong) - Use new DP VSC SDP compute routine on PSR (Gwan-gyeong) - Restrict qgv points which don't have enough bandwidth. (Stanislav) - Nuke pointless div by 64bit (Ville) - Static checker code fixes (Nathan, Mika, Chris) - Add logging function for DP VSC SDP (Gwan-gyeong) - Include HDMI DRM infoframe, DP HDR metadata and DP VSC SDP in the crtc state dump (Gwan-gyeong) - Make timeslicing explicit engine property (Chris, Tvrtko) - Selftest and debugging improvements (Chris) - Align variable names with BSpec (Ville) - Tidy up gen8+ breadcrumb emission code (Chris) - Turn intel_digital_port_connected() in a vfunc (Ville) - Use stashed away hpd isr bits in intel_digital_port_connected() (Ville) - Extract i915_cs_timestamp_{ns_to_ticks,tick_to_ns}() (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200515160703.GA19043@jlahtine-desk.ger.corp.intel.com
2020-05-19drm/i915/gem: Prefer drm_WARN* over WARN*Pankaj Bharadiya3-3/+4
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504181600.18503-6-pankaj.laxminarayan.bharadiya@intel.com
2020-05-16drm/i915/gem: Retry faulthandlers on ENOSPCChris Wilson1-1/+1
As we no longer use the shmemfs allocation directly, we do not expect to receive -ENOSPC from a backing store allocation. The potential sources for -ENOSPC are then our own internal eviction code, so the choice is either to kill the potential application with SIGBUS or to retry the faulthandler. In this patch we retry the fault handler, but since this is a should never happen condition, it is arguable that we gather up copious debug and kill the application. At worst, we cause an interruptible busy-wait, stalling the application -- all causes should be transient and the system should eventually recover. A small stall is hopefully a better outcome than random oomkiller. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200515200031.12034-1-chris@chris-wilson.co.uk
2020-05-14drm/i915: Drop no-semaphore boostingChris Wilson1-15/+0
Now that we have fast timeslicing on semaphores, we no longer need to prioritise none-semaphore work as we will yield any work blocked on a semaphore to the next in the queue. Previously with no timeslicing, blocking on the semaphore caused extremely bad scheduling with multiple clients utilising multiple rings. Now, there is no impact and we can remove the complication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513173504.28322-1-chris@chris-wilson.co.uk
2020-05-14Merge tag 'drm-intel-next-2020-04-30' of git://anongit.freedesktop.org/drm/drm-intel into drm-nextDave Airlie8-82/+646
Driver Changes: - Fix GitLab #1698: Performance regression with Linux 5.7-rc1 on Iris Plus 655 and 4K screen (Chris) - Add Wa_14011059788 for Tigerlake (Matt A) - Add per ctx batchbuffer wa for timestamp for Gen12 (Mika) - Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption (Mika) - Enable DP Display Audio WA (Uma, Jani) - Update forcewake firmware ranges for Icelake (Radhakrishna) - Add missing deinitialization cases of load failure for display (Jose) - Implement TC cold sequences for Icelake and Tigerlake (Jose) - Unbreak enable_dpcd_backlight modparam (Lyude) - Move the late flush_submission in retire to the end (Chris) - Demote "Reducing compressed framebufer size" message to info (Peter) - Push MST link retraining to the hotplug work (Ville) - Hold obj->vma.lock over for_each_ggtt_vma() (Chris) - Fix timeout handling during TypeC AUX power well enabling for ICL (Imre) - Fix skl+ non-scaled pfit modes (Ville) - Prefer soft-rc6 over RPS DOWN_TIMEOUT (Chris) - Sanitize GT first before poisoning HWSP (Chris) - Fix up clock RPS frequency readout (Chris) - Avoid reusing the same logical CCID (Chris) - Avoid dereferencing a dead context (Chris) - Always enable busy-stats for execlists (Chris) - Apply the aggressive downclocking to parking (Chris) - Restore aggressive post-boost downclocking (Chris) - Scrub execlists state on resume (Chris) - Add debugfs attributes for LPSP (Ansuman) - Improvements to kernel selftests (Chris, Mika) - Add tiled blits selftest (Zbigniew) - Fix error handling in __live_lrc_indirect_ctx_bb() (Dan) - Add pre/post plane updates for SAGV (Stanislav) - Add ICL PG3 PW ID for EHL (Anshuman) - Fix Sphinx build duplicate label warning (Jani) - Error log non-zero audio power refcount after unbind (Jani) - Remove object_is_locked assertion from unpin_from_display_plane (Chris) - Use single set of AUX powerwell ops for gen11+ (Matt R) - Prefer drm_WARN_ON over WARN_ON (Pankaj) - Poison residual state [HWSP] across resume (Chris, Tvrtko) - Convert request-before-CS assertion to debug (Chris) - Carefully order virtual_submission_tasklet (Chris) - Check carefully for an idle engine in wait-for-idle (Chris) - Only close vma we open (Chris) - Trace RPS events (Chris) - Use the RPM config register to determine clk frequencies (Chris) - Drop rq->ring->vma peeking from error capture (Chris) - Check preempt-timeout target before submit_ports (Chris) - Check HWSP cacheline is valid before acquiring (Chris) - Use proper fault mask in interrupt postinstall too (Matt R) - Keep a no-frills swappable copy of the default context state (Chris) - Add atomic helpers for bandwidth (Stanislav) - Refactor setting dma info to a common helper from device info (Michael) - Refactor DDI transcoder code for clairty (Ville) - Extend PG3 power well ID to ICL (Anshuman) - Refactor PFIT code for readability and future extensibility (Ville) - Clarify code split between intel_ddi.c and intel_dp.c (Ville) - Move out code to return the digital_port of the aux ch (Jose) - Move rps.enabled/active and use of RPS interrupts to flags (Chris) - Remove superfluous inlines and dead code (Jani) - Re-disable -Wframe-address from top-level Makefile (Nick) - Static checker and spelling fixes (Colin, Nathan) - Split long lines (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430124904.GA100924@jlahtine-desk.ger.corp.intel.com
2020-05-13drm/i915/gem: Remove redundant exec_fenceChris Wilson1-26/+14
Since there can only be one of in_fence/exec_fence, just use the single in_fence local. v2: Consolidate lookup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513180937.28992-1-chris@chris-wilson.co.uk
2020-05-11drm/i915/selftests: Always flush before unpining after writingChris Wilson4-2/+10
Be consistent, and even when we know we had used a WC, flush the mapped object after writing into it. The flush understands the mapping type and will only clflush if !I915_MAP_WC, but will always insert a wmb [sfence] so that we can be sure that all writes are visible. v2: Add the unconditional wmb so we are know that we always flush the writes to memory/HW at that point. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200511141304.599-1-chris@chris-wilson.co.uk