linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-04-05	drm/i915: Convert i915_reset.c over to using uncore mmio	Chris Wilson	1	-52/+70
	Currently i915_reset.c mixes calls to intel_uncore, pci and our old style I915_READ mmio interfaces. Cast aside the old implicit macros, and harmonise on using uncore throughout. add/remove: 1/1 grow/shrink: 0/4 up/down: 65/-207 (-142) Function old new delta rmw_register - 65 +65 gen8_reset_engines 945 942 -3 g4x_do_reset 407 376 -31 intel_gpu_reset 545 509 -36 clear_register 63 - -63 i915_clear_error_registers 461 387 -74 A little bit of pointer dancing elimination works wonders. v2: Roll up the helpers into intel_uncore for general use With the helpers gcc was a little more eager to inline: add/remove: 0/1 grow/shrink: 1/3 up/down: 99/-133 (-34) Function old new delta i915_clear_error_registers 461 560 +99 gen8_reset_engines 945 942 -3 g4x_do_reset 407 376 -31 intel_gpu_reset 545 509 -36 clear_register 63 - -63 Total: Before=1544400, After=1544366, chg -0.00% Win some, lose some, gcc is gcc. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190405202419.3093-1-chris@chris-wilson.co.uk
2019-04-05	drm/i915: Make use of 'engine->uncore'	Chris Wilson	1	-15/+17
	The engine has a direct link to the intel_uncore mmio handler, so make use of it rather than going indirectly via &engine->i915->uncore. v2: Update gen11_lock_sfc() to use engine->uncore as well Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190405181550.7630-1-chris@chris-wilson.co.uk
2019-04-02	drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h	Chris Wilson	1	-20/+23
	We want to use intel_engine_mask_t inside i915_request.h, which means extracting it from the general header file mess and placing it inside a types.h. A knock on effect is that the compiler wants to warn about type-contraction of ALL_ENGINES into intel_engine_maskt_t, so prepare for the worst. v2: Use intel_engine_mask_t consistently v3: Move I915_NUM_ENGINES to its natural home at the end of the enum Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190401162641.10963-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-03-26	drm/i915: take a reference to uncore in the engine and use it	Daniele Ceraolo Spurio	1	-4/+9
	A few advantages: - Prepares us for the planned split of display uncore from GT uncore - Improves our engine-centric view of the world in the engine code and allows us to avoid jumping back to dev_priv. - Allows us to wrap accesses to engine register in nice macros that automatically pick the right mmio base. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-10-daniele.ceraolospurio@intel.com
2019-03-26	drm/i915: intel_wait_for_register_fw to uncore	Daniele Ceraolo Spurio	1	-16/+20
	The intel_uncore structure is the owner of register access, so subclass the function to it. While at it, use a local uncore var and switch to the new read/write functions where it makes sense. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-8-daniele.ceraolospurio@intel.com
2019-03-20	drm/i915: use intel_uncore for all forcewake get/put	Daniele Ceraolo Spurio	1	-6/+6
	Now that the internal code all works on intel_uncore, flip the external-facing interface. v2: fix GVT. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-4-daniele.ceraolospurio@intel.com
2019-03-18	drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set	Chris Wilson	1	-1/+3
	We only need to acquire a wakeref for ourselves for a few operations, as most either already acquire their own wakeref or imply a wakeref. In particular, it is i915_gem_set_wedged() that needed us to present it with a wakeref, which is incongruous with its "use anywhere" ability. Suggested-by: "Yokoyama, Caz" <caz.yokoyama@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Yokoyama, Caz" <caz.yokoyama@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190318095204.9913-7-chris@chris-wilson.co.uk
2019-03-14	drm/i915: Refactor to common helpers for prepare/finish between reset & wedge	Chris Wilson	1	-9/+11
	Since both GPU reset and declaring the device wedged suspend ongoing driver activity around a hard reset, we can reuse the same code to reduce the likelihood of forgetting details surrounding reset from either path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190314084432.3740-1-chris@chris-wilson.co.uk
2019-03-14	drm/i915/guc: Preparing for GuC reset along with engine reset	Sujaritha Sundaresan	1	-0/+2
	Adding the call to prepare for guc reset along with engine reset. intel_uc_reset_prepare() calls to disable guc communication and to sanitize. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190307184445.25895-1-sujaritha.sundaresan@intel.com
2019-03-12	drm/i915: Consolidate reset-request debug message	Chris Wilson	1	-0/+6
	Move the pair of messages to the common callsite where it makes sense to include a bit more information about which request is being reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190312111146.10662-1-chris@chris-wilson.co.uk
2019-03-05	drm/i915: Store the BIT(engine->id) as the engine's mask	Chris Wilson	1	-23/+24
	In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
2019-02-26	drm/i915: Remove access to global seqno in the HWSP	Chris Wilson	1	-1/+0
	Stop accessing the HWSP to read the global seqno, and stop tracking the mirror in the engine's execution timeline -- it is unused. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190226094922.31617-2-chris@chris-wilson.co.uk
2019-02-20	drm/i915/guc: Calling guc_disable_communication in all suspend paths	Sujaritha Sundaresan	1	-1/+1
	This aim of this patch is to call guc_disable_communication in all suspend paths. The reason to introduce this is to resolve a bug that occurred due to suspend late not being called in the hibernate devices path. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190220013927.9488-3-sujaritha.sundaresan@intel.com
2019-02-20	drm/i915: Beware temporary wedging when determining -EIO	Chris Wilson	1	-2/+27
	At a few points in our uABI, we check to see if the driver is wedged and report -EIO back to the user in that case. However, as we perform the check and reset asynchronously (where once before they were both serialised by the struct_mutex), we may instead see the temporary wedging used to cancel inflight rendering to avoid a deadlock during reset (caused by either us timing out in our reset handler, i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare for a stuck modeset). If we suspect this is the case, that is we see a wedged driver and reset in progress, then wait until the reset is resolved before reporting upon the wedged status. v2: might_sleep() (Mika) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
2019-02-19	drm/i915: Use time based guilty context banning	Chris Wilson	1	-12/+17
	Currently, we accumulate each time a context hangs the GPU, offset against the number of requests it submits, and if that score exceeds a certain threshold, we ban that context from submitting any more requests (cancelling any work in flight). In contrast, we use a simple timer on the file, that if we see more than a 9 hangs faster than 60s apart in total across all of its contexts, we will ban the client from creating any more contexts. This leads to a confusing situation where the file may be banned before the context, so lets use a simple timer scheme for each. If the context submits 3 hanging requests within a 120s period, declare it forbidden to ever send more requests. This has the advantage of not being easy to repair by simply sending empty requests, but has the disadvantage that if the context is idle then it is forgiven. However, if the context is idle, it is not disrupting the system, but a hog can evade the request counting and cause much more severe disruption to the system. Updating ban_score from request retirement is dubious as the retirement is purposely not in sync with request submission (i.e. we try and batch retirement to reduce overhead and avoid latency on submission), which leads to surprising situations where we can forgive a hang immediately due to a backlog of requests from before the hang being retired afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-2-chris@chris-wilson.co.uk
2019-02-19	drm/i915: Trim delays for wedging	Chris Wilson	1	-1/+1
	CI still reports the occasional multi-second delay for resets, in particular along the wedge+recovery paths. As the likely, and unbounded, delay here is from sync_rcu, use the expedited variant instead. Testcase: igt/gem_eio/unwedge-stress Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-7-chris@chris-wilson.co.uk
2019-02-18	drm/i915: Restore interrupt enabling after a reset	Chris Wilson	1	-0/+6
	At least on i965g and i965gm, performing a device reset clobbers the IER resulting in loss of interrupts thereafter. So, run the irq_postinstall hook to restore them. v2: Ville pointed out that he already attempted to solve this problem by reinstalling the interrupts in intel_reset_finish() (part of the display handling around reset). However, reinstalling the irq clobbers the i915->irq_mask which we need for handling MI_USER_INTERRUPTS, and does so too late to handle any interrupts generated from resuming the rings. The simple solution to both is to pull the interrupt reenabling from afterwards to around the device reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190218153106.16768-1-chris@chris-wilson.co.uk
2019-02-18	drm/i915: Optionally disable automatic recovery after a GPU reset	Chris Wilson	1	-1/+2
	Some clients, such as mesa, may only emit minimal incremental batches that rely on the logical context state from previous batches. They know that recovery is impossible after a hang as their required GPU state is lost, and that each in flight and subsequent batch will hang (resetting the context image back to default perpetuating the problem). To avoid getting into the state in the first place, we can allow clients to opt out of automatic recovery and elect to ban any guilty context following a hang. This prevents the continual stream of hangs and allows the client to recreate their context and rebuild the state from scratch. v2: Prefer calling it recoverable rather than unrecoverable. References: https://lists.freedesktop.org/archives/mesa-dev/2019-February/215431.html Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> # for mesa Link: https://patchwork.freedesktop.org/patch/msgid/20190218105821.17293-1-chris@chris-wilson.co.uk
2019-02-15	drm/i915: Defer application of request banning to submission	Chris Wilson	1	-14/+5
	As we currently do not check on submission whether the context is banned in a timely manner it is possible for some requests to escape cancellation after their parent context is banned. By moving the ban into the request submission under the engine->timeline.lock, we serialise it with the reset and setting of the context ban. References: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190213182737.12695-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2019-02-14	drm/i915: Only try to park engines after a failed reset	Chris Wilson	1	-5/+8
	Currently we try to stop the engine by programming the ring registers to be disabled before we perform the reset. Sometimes, we see the context image also have invalid ring registers, which one presumes may be actually caused by us doing so. Lets risk not doing programming the ring to zero on the first attempt to avoid preserving that corruption into the context image, leaving the w/a in place for subsequent reset attempts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190213232047.8486-1-chris@chris-wilson.co.uk
2019-02-12	drm/i915: Detect potential i915_reset_trylock() lockups	Chris Wilson	1	-0/+3
	Use lockdep to warn before we wait indefinitely in case we may be waiting indefinitely. Suggested-by: Mika Kuoppala <mika.kuoppala@intel.com> References: 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190212130831.14425-2-chris@chris-wilson.co.uk
2019-02-11	drm/i915: Use synchronize_srcu_expedited() for resets	Chris Wilson	1	-1/+1
	We impose upon ourselves a strict timeout for resets (to ensure forward progress by use of a failsafe). Prefer to use the expedited synchronisation function in this case to reduce the likelihood of a spurious delay being treated as a deadlock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190211135040.1234-2-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2019-02-11	drm/i915: Pull sync_scru for device reset outside of wedge_mutex	Chris Wilson	1	-3/+3
	We need to flush our srcu protecting resources about to be clobbered by the reset, inside of our timer failsafe but outside of the error->wedge_mutex, so that the failsafe can run in case the synchronize_srcu() takes too long (hits a shrinker deadlock?). Fixes: 72eb16df010a ("drm/i915: Serialise resets with wedging") References: https://bugs.freedesktop.org/show_bug.cgi?id=109605 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190211135040.1234-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2019-02-08	drm/i915: Serialise resets with wedging	Chris Wilson	1	-28/+40
	Prevent concurrent set-wedge with ongoing resets (and vice versa) by taking the same wedge_mutex around both operations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-6-chris@chris-wilson.co.uk
2019-02-08	drm/i915: Uninterruptibly drain the timelines on unwedging	Chris Wilson	1	-20/+8
	On wedging, we mark all executing requests as complete and all pending requests completed as soon as they are ready. Before unwedging though we wish to flush those pending requests prior to restoring default execution, and so we must wait. Do so uninterruptibly as we do not provide the EINTR gracefully back to userspace in this case but persists in keeping the permanently wedged state without restarting the syscall. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-4-chris@chris-wilson.co.uk
2019-02-08	drm/i915: Force the GPU reset upon wedging	Chris Wilson	1	-4/+4
	When declaring the GPU wedged, we do need to hit the GPU with the reset hammer so that its state matches our presumed state during cleanup. If the reset fails, it fails, and we may be unhappy but wedged. However, if we are testing our wedge/unwedged handling, the desync carries over into the next test and promptly explodes. References: https://bugs.freedesktop.org/show_bug.cgi?id=106702 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-3-chris@chris-wilson.co.uk
2019-02-08	drm/i915: Revoke mmaps and prevent access to fence registers across reset	Chris Wilson	1	-40/+70
	Previously, we were able to rely on the recursive properties of struct_mutex to allow us to serialise revoking mmaps and reacquiring the FENCE registers with them being clobbered over a global device reset. I then proceeded to throw out the baby with the bath water in order to pursue a struct_mutex-less reset. Perusing LWN for alternative strategies, the dilemma on how to serialise access to a global resource on one side was answered by https://lwn.net/Articles/202847/ -- Sleepable RCU: 1 int readside(void) { 2 int idx; 3 rcu_read_lock(); 4 if (nomoresrcu) { 5 rcu_read_unlock(); 6 return -EINVAL; 7 } 8 idx = srcu_read_lock(&ss); 9 rcu_read_unlock(); 10 /* SRCU read-side critical section. */ 11 srcu_read_unlock(&ss, idx); 12 return 0; 13 } 14 15 void cleanup(void) 16 { 17 nomoresrcu = 1; 18 synchronize_rcu(); 19 synchronize_srcu(&ss); 20 cleanup_srcu_struct(&ss); 21 } No more worrying about stop_machine, just an uber-complex mutex, optimised for reads, with the overhead pushed to the rare reset path. However, we do run the risk of a deadlock as we allocate underneath the SRCU read lock, and the allocation may require a GPU reset, causing a dependency cycle via the in-flight requests. We resolve that by declaring the driver wedged and cancelling all in-flight rendering. v2: Use expedited rcu barriers to match our earlier timing characteristics. v3: Try to annotate locking contexts for sparse v4: Reduce selftest lock duration to avoid a reset deadlock with fences v5: s/srcu/reset_backoff_srcu/ v6: Remove more stale comments Testcase: igt/gem_mmap_gtt/hang Fixes: eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208153708.20023-2-chris@chris-wilson.co.uk
2019-02-05	drm/i915: Pull i915_gem_active into the i915_active family	Chris Wilson	1	-1/+1
	Looking forward, we need to break the struct_mutex dependency on i915_gem_active. In the meantime, external use of i915_gem_active is quite beguiling, little do new users suspect that it implies a barrier as each request it tracks must be ordered wrt the previous one. As one of many, it can be used to track activity across multiple timelines, a shared fence, which fits our unordered request submission much better. We need to steer external users away from the singular, exclusive fence imposed by i915_gem_active to i915_active instead. As part of that process, we move i915_gem_active out of i915_request.c into i915_active.c to start separating the two concepts, and rename it to i915_active_request (both to tie it to the concept of tracking just one request, and to give it a longer, less appealing name). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205130005.2807-5-chris@chris-wilson.co.uk
2019-01-29	drm/i915: Replace global breadcrumbs with per-context interrupt tracking	Chris Wilson	1	-7/+9
	A few years ago, see commit 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd"), the issue of handling multiple clients waiting in parallel was brought to our attention. The requirement was that every client should be woken immediately upon its request being signaled, without incurring any cpu overhead. To handle certain fragility of our hw meant that we could not do a simple check inside the irq handler (some generations required almost unbounded delays before we could be sure of seqno coherency) and so request completion checking required delegation. Before commit 688e6c725816, the solution was simple. Every client waiting on a request would be woken on every interrupt and each would do a heavyweight check to see if their request was complete. Commit 688e6c725816 introduced an rbtree so that only the earliest waiter on the global timeline would woken, and would wake the next and so on. (Along with various complications to handle requests being reordered along the global timeline, and also a requirement for kthread to provide a delegate for fence signaling that had no process context.) The global rbtree depends on knowing the execution timeline (and global seqno). Without knowing that order, we must instead check all contexts queued to the HW to see which may have advanced. We trim that list by only checking queued contexts that are being waited on, but still we keep a list of all active contexts and their active signalers that we inspect from inside the irq handler. By moving the waiters onto the fence signal list, we can combine the client wakeup with the dma_fence signaling (a dramatic reduction in complexity, but does require the HW being coherent, the seqno must be visible from the cpu before the interrupt is raised - we keep a timer backup just in case). Having previously fixed all the issues with irq-seqno serialisation (by inserting delays onto the GPU after each request instead of random delays on the CPU after each interrupt), we can rely on the seqno state to perfom direct wakeups from the interrupt handler. This allows us to preserve our single context switch behaviour of the current routine, with the only downside that we lose the RT priority sorting of wakeups. In general, direct wakeup latency of multiple clients is about the same (about 10% better in most cases) with a reduction in total CPU time spent in the waiter (about 20-50% depending on gen). Average herd behaviour is improved, but at the cost of not delegating wakeups on task_prio. v2: Capture fence signaling state for error state and add comments to warm even the most cold of hearts. v3: Check if the request is still active before busywaiting v4: Reduce the amount of pointer misdirection with list_for_each_safe and using a local i915_request variable inside the loops v5: Add a missing pluralisation to a purely informative selftest message. References: 688e6c725816 ("drm/i915: Slaughter the thundering i915_wait_request herd") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
2019-01-28	drm/i915: Track active timelines	Chris Wilson	1	-1/+1
	Now that we pin timelines around use, we have a clearly defined lifetime and convenient points at which we can track only the active timelines. This allows us to reduce the list iteration to only consider those active timelines and not all. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-6-chris@chris-wilson.co.uk
2019-01-28	drm/i915: Track the context's seqno in its own timeline HWSP	Chris Wilson	1	-0/+1
	Now that we have allocated ourselves a cacheline to store a breadcrumb, we can emit a write from the GPU into the timeline's HWSP of the per-context seqno as we complete each request. This drops the mirroring of the per-engine HWSP and allows each context to operate independently. We do not need to unwind the per-context timeline, and so requests are always consistent with the timeline breadcrumb, greatly simplifying the completion checks as we no longer need to be concerned about the global_seqno changing mid check. One complication though is that we have to be wary that the request may outlive the HWSP and so avoid touching the potentially danging pointer after we have retired the fence. We also have to guard our access of the HWSP with RCU, the release of the obj->mm.pages should already be RCU-safe. At this point, we are emitting both per-context and global seqno and still using the single per-engine execution timeline for resolving interrupts. v2: s/fake_complete/mark_complete/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128181812.22804-5-chris@chris-wilson.co.uk
2019-01-28	drm/i915: Move list of timelines under its own lock	Chris Wilson	1	-2/+6
	Currently, the list of timelines is serialised by the struct_mutex, but to alleviate difficulties with using that mutex in future, move the list management under its own dedicated mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190128102356.15037-5-chris@chris-wilson.co.uk
2019-01-25	drm/i915: Issue engine resets onto idle engines	Chris Wilson	1	-4/+0
	Always perform the requested reset, even if we believe the engine is idle. Presumably there was a reason the caller wanted the reset, and in the near future we lose the easy tracking for whether the engine is idle. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-5-chris@chris-wilson.co.uk
2019-01-25	drm/i915: Remove GPU reset dependence on struct_mutex	Chris Wilson	1	-213/+179
	Now that the submission backends are controlled via their own spinlocks, with a wave of a magic wand we can lift the struct_mutex requirement around GPU reset. That is we allow the submission frontend (userspace) to keep on submitting while we process the GPU reset as we can suspend the backend independently. The major change is around the backoff/handoff strategy for performing the reset. With no mutex deadlock, we no longer have to coordinate with any waiter, and just perform the reset immediately. Testcase: igt/gem_mmap_gtt/hang # regresses Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-3-chris@chris-wilson.co.uk
2019-01-25	drm/i915/guc: Disable global reset	Chris Wilson	1	-0/+3
	The guc (and huc) currently inexcruitably depend on struct_mutex for device reinitialisation from inside the reset, and indeed taking any mutex here is verboten (as we must be able to reset from underneath any of our mutexes). That makes recovering the guc unviable without, for example, reserving contiguous vma space and pages for it to use. The plan to re-enable global reset for the GuC centres around reusing the WOPM reserved space at the top of the aperture (that we know we can populate a contiguous range large enough to dma xfer the fw image). In the meantime, hopefully no one even notices as the device-reset is only used as a backup to the per-engine resets for handling GPU hangs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-2-chris@chris-wilson.co.uk
2019-01-25	drm/i915: Make all GPU resets atomic	Chris Wilson	1	-51/+39
	In preparation for the next few commits, make resetting the GPU atomic. Currently, we have prepared gen6+ for atomic resetting of individual engines, but now there is a requirement to perform the whole device level reset (just the register poking) from inside an atomic context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125132230.22221-1-chris@chris-wilson.co.uk
2019-01-16	drm/i915: Pull all the reset functionality together into i915_reset.c	Chris Wilson	1	-0/+1389
	Currently the code to reset the GPU and our state is spread widely across a few files. Pull the logic together into a common file. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190116153304.787-1-chris@chris-wilson.co.uk