aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2015-02-09drm/i915: Insert a command barrier on BLT/BSD cache flushesChris Wilson1-4/+19
This looked like an odd regression from commit ec5cc0f9b019af95e4571a9fa162d94294c8d90b Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Jun 12 10:28:55 2014 +0100 drm/i915: Restrict GPU boost to the RCS engine but in reality it undercovered a much older coherency bug. The issue that boosting the GPU frequency on the BCS ring was masking was that we could wake the CPU up after completion of a BCS batch and inspect memory prior to the write cache being fully evicted. In order to serialise the breadcrumb interrupt (and so ensure that the CPU's view of memory is coherent) we need to perform a post-sync operation in the MI_FLUSH_DW. v2: Fix all the MI_FLUSH_DW (bsd plus the duplication in execlists). Also fix the invalidate_domains mask in gen8_emit_flush() for ring != VCS. Testcase: gpuX-rcs-gpu-read-after-write Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-27drm/i915: Change CHV WIZ hashing mode to 16x4Ville Syrjälä1-0/+12
I ran a few tests with xonotic and synmark2 trying out the different WIZ hashing modes on CHV. The results seem to match the results I got with IVB/HSW when I did the similar tests on them in the past. That is 16x4 is generally the fastest mode, 8x8 comes next and finally 8x4. On CHV the difference between the modes is at most ~1% in most tests. IIRC on IVB/HSW the difference was a little bigger, but as there doesn't seem to be any real downside to 16x4 let's use it by default. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27drm/i915: Implement Wa4x4STCOptimizationDisable:chvVille Syrjälä1-0/+4
Wa4x4STCOptimizationDisable got only implemented for BDW, but according to the w/a database CHV needs it too, so add it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27drm/i915: Rename the forcewake get/put functionsMika Kuoppala1-2/+2
We have multiple forcewake domains now on recent gens. Change the function naming to reflect this. v2: More verbose names (Chris) v3: Rebase v4: Rebase v5: Add documentation for forcewake_get/put Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27drm/i915: Removed duplicate members from submit_requestNick Hoath1-1/+1
Where there were duplicate variables for the tail, context and ring (engine) in the gem request and the execlist queue item, use the one from the request and remove the duplicate from the execlist queue item. Issue: VIZ-4274 v1: Rebase v2: Fixed build issues. Keep separate postfix & tail pointers as these are used in different ways. Reinserted missing full tail pointer update. Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-17drm/i915: Ensure the HiZ RAW Stall Optimization is on for Cherryview.Kenneth Graunke1-0/+5
This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Increases performance in OglVSInstancing by about 2.7x on Braswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-17drm/i915: Enable the HiZ RAW Stall Optimization on Broadwell.Kenneth Graunke1-0/+10
This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e (Iris Pro 6200). Thanks to Jesse Barnes and Ben Widawsky for their help in tracking this down. Thanks to Chris Wilson for showing me the new workarounds system. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-13drm/i915: Improve HiZ throughput on Cherryview.Kenneth Graunke1-0/+3
Found by reading the HIZ_CHICKEN documentation. Improves performance in a HiZ microbenchmark by around 50%. Improves performance in OglZBuffer by around 18%. Thanks to Chris Wilson for helping me figure out where to put this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12Merge tag 'topic/i915-hda-componentized-2015-01-12' into drm-intel-next-queuedDaniel Vetter1-10/+17
Conflicts: drivers/gpu/drm/i915/intel_runtime_pm.c Separate branch so that Takashi can also pull just this refactoring into sound-next. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2014-12-16drm/i915: Force the CS stall for invalidate flushesChris Wilson1-0/+2
In order to act as a full command barrier by itself, we need to tell the pipecontrol to actually stall the command streamer while the flush runs. We require the full command barrier before operations like MI_SET_CONTEXT, which currently rely on a prior invalidate flush. References: https://bugs.freedesktop.org/show_bug.cgi?id=83677 Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-16drm/i915: Invalidate media caches on gen7Chris Wilson1-0/+1
In the gen7 pipe control there is an extra bit to flush the media caches, so let's set it during cache invalidation flushes. v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec. Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-16drm/i915: Warn about missing context state workarounds only onceMichel Thierry1-1/+1
Otherwise, new platforms without workarounds will hit this warning for every new context created. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-10drm/i915/bdw: Add WaForceEnableNonCoherent labelMichel Thierry1-0/+1
We already implement this workaround, but it was missing its name. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-10drm/i915: Remove '& 0xffff' from the mask given to WA_REG()Damien Lespiau1-2/+2
We may be hidding bugs by doing that, so let remove it and have the actual mask value shine through, for better or worse. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-10drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()Damien Lespiau1-9/+9
While trying to unify the order of those arguments throughout the driver, Daniel noticed what we were inverting them in this part of the code. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-10drm/i915/bdw: Fix the write setting up the WIZ hashing modeDamien Lespiau1-2/+6
I was playing with clang and oh surprise! a warning trigerred by -Wshift-overflow (gcc doesn't have this one): WA_SET_BIT_MASKED(GEN7_GT_MODE, GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4); drivers/gpu/drm/i915/intel_ringbuffer.c:786:2: warning: signed shift result (0x28002000000) requires 43 bits to represent, but 'int' only has 32 bits [-Wshift-overflow] WA_SET_BIT_MASKED(GEN7_GT_MODE, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/intel_ringbuffer.c:737:15: note: expanded from macro 'WA_SET_BIT_MASKED' WA_REG(addr, _MASKED_BIT_ENABLE(mask), (mask) & 0xffff) Turned out GEN6_WIZ_HASHING_MASK was already shifted by 16, and we were trying to shift it a bit more. The other thing is that it's not the usual case of setting WA bits here, we need to have separate mask and value. To fix this, I've introduced a new _MASKED_FIELD() macro that takes both the (unshifted) mask and the desired value and the rest of the patch ripples through from it. This bug was introduced when reworking the WA emission in: Commit 7225342ab501befdb64bcec76ded41f5897c0855 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Oct 7 17:21:26 2014 +0300 drm/i915: Build workaround list in ring initialization v2: Invert the order of the mask and value arguments (Daniel Vetter) Rewrite _MASKED_BIT_ENABLE() and _MASKED_BIT_DISABLE() with _MASKED_FIELD() (Jani Nikula) Make sure we only evaluate 'a' once in _MASKED_BIT_ENABLE() (Dave Gordon) Add check to ensure the value is within the mask boundaries (Chris Wilson) v3: Ensure the the value and mask are 16 bits (Dave Gordon) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-08drm/i915: Move golden context init into ->init_contextDaniel Vetter1-1/+17
Similar to a patch from Thomas Daniel for lrc contexts. This keeps both sides somewhat in sync and should make Dave Gordon happy. Note that both the wa and the golden context init code suffer a bit from an inssuficient split into driver load and hw init code. Which means we have a bunch of tests all over the place to check whether the one-time initialization has been done already or not. All that one-tim code should be moved into the one-time ring setup code, but that's work for later. Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-06drm/i915: Add unique id to the request structure for debuggingJohn Harrison1-0/+2
For debugging purposes, it is useful to be able to uniquely identify a given request structure as it works its way through the system. This becomes especially tricky once the seqno value is lazily allocated as then the request has nothing but its pointer to identify it for much of its life. Change-Id: Ie76b2268b940467f4cdf5a4ba6f5a54cbb96445d For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-06drm/i915: Zero fill the request structureJohn Harrison1-1/+1
There is a general theory that kzmalloc is better/safer than kmalloc, especially for interesting data structures. This change updates the request structure allocation to be zero filled. This also fixes crashes in the reset code. Quoting Mika's patch: "Clean the request structure on alloc. Otherwise we might end up referencing uninitialized fields. This is apparent when we try to cleanup the preallocated request on ring reset, before any request has been submitted to the ring. The request->ctx is foobar and we end up freeing the foobarness." Note that this fixes a regression introduced in commit 9eba5d4a1d79d5094321469479b4dbe418f60110 Author: John Harrison <John.C.Harrison@Intel.com> Date: Mon Nov 24 18:49:23 2014 +0000 drm/i915: Ensure OLS & PLR are always in sync References: https://bugs.freedesktop.org/show_bug.cgi?id=86959 References: https://bugs.freedesktop.org/show_bug.cgi?id=86962 References: https://bugs.freedesktop.org/show_bug.cgi?id=86992 Change-Id: I68715ef758025fab8db763941ef63bf60d7031e2 For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-05drm/i915/bdw: Add WaHdcDisableFetchWhenMaskedMichel Thierry1-0/+2
We already have it for chv, but was missing for bdw. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Flatten engine init control flowDaniel Vetter1-23/+22
Now that sanity prevails and we have the clean split between software init and starting the engines we can drop all the "have we allocate this struct already?" nonsense. Execlist code could benefit quite a bit more still, but that's for another patch. Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2014-12-03drm/i915: Only init engines onceDaniel Vetter1-4/+0
We can do this. And now there's finally the clean split between software setup and hardware setup I kinda wanted since multi-ring support was merged aeons ago. It only took almost 5 years. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Move intel_init_pipe_control out of engine->init_hwDaniel Vetter1-7/+11
With this all the ->init_hw hooks really only set up hw state needed to start the ring, all the software state setup and memory/buffer allocations happen beforehand. v2: We need to call intel_init_pipe_control after the ring init since otherwise engine->dev is NULL and it falls over. Currently that's now after the hw ring is enabled but a) we'll be fine as long as no one submits a batch b) this will change soon. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: s/init()/init_hw()/ in intel_engine_csDaniel Vetter1-6/+6
This is (mostly, some exceptions that need fixing) the hw setup function which starts the ring. And not the function which allocates all the resources. Make this clear by giving it a better name. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Consolidate ring freespace calculationsDave Gordon1-20/+22
There are numerous places in the code where the driver's idea of how much space is left in a ring is updated using the driver's latest notions of the positions of 'head' and 'tail' for the ring. Among them are some that update one or both of these values before (re)doing the calculation. In particular, there are four different places in the code where 'last_retired_head' is copied to 'head' and then set to -1; and two of these do not have a guard to check that it has actually been updated since last time it was consumed, leaving the possibility that the dummy -1 can be transferred from 'last_retired_head' to 'head', causing the space calculation to produce 'impossible' results (previously seen on Android/VLV). This code therefore consolidates all the calculation and updating of these values, such that there is only one place where the ring space is updated, and it ALWAYS uses (and consumes) 'last_retired_head' if (and ONLY if) it has been updated since the last call. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Make ring freespace calculation more robustDave Gordon1-3/+3
The used space in a ring is given by the cyclic distance from the consumer (HEAD) to the producer (TAIL), i.e. ((tail-head) MOD size); conversely, the available space in a ring is the cyclic distance from the producer to the consumer, MINUS the amount reserved for a "gap" that is supposed to guarantee that the producer never catches up with or overruns the consumer. Note that some GEN h/w requires that TAIL never approach to within one cacheline of HEAD, so the gap is usually set to twice the cacheline size to ensure this. While the existing code gives the correct answer for correct inputs, if the producer HAS overrun into the reserved space, the result can be a value larger than the maximum valid value (size-reserved). We can improve this by reorganising the calculation, so that in the event of overrun the result will be negative rather than over-large. This means that the commonly-used test (available >= required) will then reject further writes into the ring after an overrun, giving some chance that we can recover from or at least diagnose the original problem; whereas allowing more writes would likely both confuse the h/w and destroy the evidence of what went wrong. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Connect requests to rings at creation not submissionJohn Harrison1-0/+1
It makes a lot more sense (and makes future seqno -> request conversion patches simpler) to fill in the 'ring' field of the request structure at the point of creation rather than submission. Given that the request structure is assigned by ring specific code and thus is locked to a ring from the start, there really is no reason to defer this assignment. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Remove obsolete seqno parameter from 'i915_add_request'John Harrison1-1/+1
There is no longer any need to retrieve a seqno value from an i915_add_request() call. The calling code already knows which request structure is being processed (it can only be ring->OLR). And as the request itself is now used in preference to the basic seqno value, the latter is now redundant in this situation. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Convert i915_wait_seqno to i915_wait_requestDaniel Vetter1-8/+6
Updated i915_wait_seqno() to take a request structure instead of a seqno value and renamed it accordingly. Internally, it just pulls the seqno out of the request and calls on to __wait_seqno() as before. However, all the code further up the stack is now simplified as it can just pass the request object straight through without having to peek inside. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> [danvet: Squash in hunk from an earlier patch which was rebased wrongly.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Remove 'outstanding_lazy_seqno'John Harrison1-25/+27
The OLS value is now obsolete. Exactly the same value is guarateed to be always available as PLR->seqno. Thus it is safe to remove the OLS completely. And also to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention valid. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Add reference count to request structureJohn Harrison1-1/+3
The plan is to use request structures everywhere that seqno values were previously used. This means saving pointers to structures in places that used to be simple integers. In turn, that means that the target structure now needs much more stringent lifetime tracking. That is, it must not be freed while some other random object still holds a pointer to it. To achieve this tracking, a reference count needs to be added. Whenever a pointer to the structure is saved away, the count must be incremented and the free must only occur when all references have been released. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Ensure OLS & PLR are always in syncJohn Harrison1-8/+21
The aim is to replace seqno values with request structures. A step along the way is to switch to using the PLR in preference to the OLS. That requires the PLR to only be valid when and only when the OLS is also valid. I.e., the two must be kept in lock step. Then, code which was using the OLS can be safely switched over to using the PLR instead. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-19drm/i915: Remove DRI1 ring accessors and APIChris Wilson1-100/+4
With the deprecation of UMS, and by association DRI1, we have a tough choice when updating the ring access routines. We either rewrite the DRI1 routines blindly without testing (so likely to be broken) or take the liberty of declaring them no longer supported and remove them entirely. This takes the latter approach. v2: Also remove the DRI1 sarea updates Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Fix rebase conflicts.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-19drm/i915/bdw: Pin the ringbuffer backing object to GGTT on-demandThomas Daniel1-35/+50
Same as with the context, pinning to GGTT regardless is harmful (it badly fragments the GGTT and can even exhaust it). Unfortunately, this case is also more complex than the previous one because we need to map and access the ringbuffer in several places along the execbuffer path (and we cannot make do by leaving the default ringbuffer pinned, as before). Also, the context object itself contains a pointer to the ringbuffer address that we have to keep updated if we are going to allow the ringbuffer to move around. v2: Same as with the context pinning, we cannot really do it during an interrupt. Also, pin the default ringbuffers objects regardless (makes error capture a lot easier). v3: Rebased. Take a pin reference of the ringbuffer for each item in the execlist request queue because the hardware may still be using the ringbuffer after the MI_USER_INTERRUPT to notify the seqno update is executed. The ringbuffer must remain pinned until the context save is complete. No longer pin and unpin ringbuffer in populate_lr_context() - this transient address is meaningless and the pinning can cause a sleep while atomic. v4: Moved ringbuffer pin and unpin into the lr_context_pin functions. Downgraded pinning check BUG_ONs to WARN_ONs. v5: Reinstated WARN_ONs for unexpected execlist states. Removed unused variable. Issue: VIZ-4277 Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Reviewed-by: Akash Goel <akash.goels@gmail.com> Reviewed-by: Deepak S<deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-14drm/i915: Initialize workarounds in logical ring mode tooMichel Thierry1-2/+3
Following the legacy ring submission example, update the ring->init_context() hook to support the execlist submission mode. v2: update to use the new workaround macros and cleanup unused code. This takes care of both bdw and chv workarounds. v2.1: Add missing call to init_context() during deferred context creation. v3: Split init_context (emit) in legacy/lrc modes. For lrc, get the ringbuf from the context (Mika/Daniel). v4: Merge init_context interfaces back, the legacy mode only needs the ring, but the lrc mode needs the ring and context (Mika). Issue: VIZ-4092 Issue: GMIN-3475 Change-Id: Ie3d093b2542ab0e2a44b90460533e2f979788d6c Cc: Deepak S <deepak.s@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Align function paramater lists properly.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-14drm/i915/chv: Add new workarounds for chvArun Siluvery1-0/+10
+WaForceEnableNonCoherent:chv +WaHdcDisableFetchWhenMasked:chv For: VIZ-4090 Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-14drm/i915/chv: Combine GEN8_ROW_CHICKEN w/aArun Siluvery1-4/+2
WaDisablePartialInstShootdown:chv and WaDisableThreadStallDopClockGating:chv are related to the same register so combine them. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-14drm/i915/chv: Remove pre-production workaroundsArun Siluvery1-8/+0
-WaDisableDopClockGating:chv -WaDisableSamplerPowerBypass:chv -WaDisableGunitClockGating:chv -WaDisableFfDopClockGating:chv -WaDisableDopClockGating:chv v2: Remove pre-production WA instead of restricting them based on revision id (Ville) For: VIZ-4090 Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-04drm/i915: Fix null pointer dereference in ring cleanup codeJohn Harrison1-2/+5
If a ring failed to initialise for any reason then the error path would try to clean up all rings including those that had not yet been allocated. The ring clean up code did a check that the ring was valid before starting its work. Unfortunately, that was after it had already dereferenced the ring to obtain a dev_private pointer. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-04Merge tag 'drm-intel-next-2014-10-24' of git://anongit.freedesktop.org/drm-intel into drm-nextDave Airlie1-83/+107
- suspend/resume/freeze/thaw unification from Imre - wa list improvements from Mika&Arun - display pll precomputation from Ander Conselvan, this removed the last ->mode_set callbacks, a big step towards implementing atomic modesets - more kerneldoc for the interrupt code - 180 rotation for cursors (Ville&Sonika) - ULT/ULX feature check macros cleaned up thanks to Damien - piles and piles of fixes all over, bug team seems to work! * tag 'drm-intel-next-2014-10-24' of git://anongit.freedesktop.org/drm-intel: (61 commits) drm/i915: Update DRIVER_DATE to 20141024 drm/i915: add comments on what stage a given PM handler is called drm/i915: unify switcheroo and legacy suspend/resume handlers drm/i915: add poweroff_late handler drm/i915: sanitize suspend/resume helper function names drm/i915: unify S3 and S4 suspend/resume handlers drm/i915: disable/re-enable PCI device around S4 freeze/thaw drm/i915: enable output polling during S4 thaw drm/i915: check for GT faults in all resume handlers and driver load time drm/i915: remove unused restore_gtt_mappings optimization during suspend drm/i915: fix S4 suspend while switcheroo state is off drm/i915: vlv: fix switcheroo/legacy suspend/resume drm/i915: propagate error from legacy resume handler drm/i915: unify legacy S3 suspend and S4 freeze handlers drm/i915: factor out i915_drm_suspend_late drm/i915: Emit even number of dwords when emitting LRIs drm/i915: Add rotation support for cursor plane (v5) drm/i915: Correctly reject invalid flags for wait_ioctl drm/i915: use macros to assign mmio access functions drm/i915: only run hsw_power_well_post_enable when really needed ...
2014-10-28Merge tag 'drm-intel-next-2014-10-03-no-ppgtt' of git://anongit.freedesktop.org/drm-intel into drm-nextDave Airlie1-14/+20
Ok, new attempt, this time around with full ppgtt disabled again. drm-intel-next-2014-10-03: - first batch of skl stage 1 enabling - fixes from Rodrigo to the PSR, fbc and sink crc code - kerneldoc for the frontbuffer tracking code, runtime pm code and the basic interrupt enable/disable functions - smaller stuff all over drm-intel-next-2014-09-19: - bunch more i830M fixes from Ville - full ppgtt now again enabled by default - more ppgtt fixes from Michel Thierry and Chris Wilson - plane config work from Gustavo Padovan - spinlock clarifications - piles of smaller improvements all over, as usual * tag 'drm-intel-next-2014-10-03-no-ppgtt' of git://anongit.freedesktop.org/drm-intel: (114 commits) Revert "drm/i915: Enable full PPGTT on gen7" drm/i915: Update DRIVER_DATE to 20141003 drm/i915: Remove the duplicated logic between the two shrink phases drm/i915: kerneldoc for interrupt enable/disable functions drm/i915: Use dev_priv instead of dev in irq setup functions drm/i915: s/pm._irqs_disabled/pm.irqs_enabled/ drm/i915: Clear TX FIFO reset master override bits on chv drm/i915: Make sure hardware uses the correct swing margin/deemph bits on chv drm/i915: make sink_crc return -EIO on aux read/write failure drm/i915: Constify send buffer for intel_dp_aux_ch drm/i915: De-magic the PSR AUX message drm/i915: Reinstate error level message for non-simulated gpu hangs drm/i915: Kerneldoc for intel_runtime_pm.c drm/i915: Call runtime_pm_disable directly drm/i915: Move intel_display_set_init_power to intel_runtime_pm.c drm/i915: Bikeshed rpm functions name a bit. drm/i915: Extract intel_runtime_pm.c drm/i915: Remove intel_modeset_suspend_hw drm/i915: spelling fixes for frontbuffer tracking kerneldoc drm/i915: Tighting frontbuffer tracking around flips ...
2014-10-24drm/i915: Emit even number of dwords when emitting LRIsArun Siluvery1-2/+3
The number of DWords should be even when doing ring emits as command sequences require QWord alignment. There was some discussion about the maximum length of the MI_LRI command. Quoting Mika "I did some test with bdw: "The maximum is 128 writes, resulting the 8 bit length field of the command being 0xff, thus following the spec. The 128'th write went through. "Perhaps the max command length is then less in older gens? "Perhaps WARN_ON(x > 128) in MI_LOAD_REGISTER_IMM would be in place but one needs minor tweak to command parser a bit also then. #define I915_MAX_WA_REGS 16 keeps us safe for now atleast." Ville commented that on pre-gen6 the length field seems to be restricted to 0x3f though. So for all cases we should be ok. v2: user LRI variant that can write multiple regs in one go (Damien). We can simply insert one NOP at the end instead of one per register write. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Add a summary of the MI_LRI length discussion.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-10-24drm/i915: Build workaround list in ring initializationMika Kuoppala1-80/+104
If we build the workaround list in ring initialization and decouple it from the actual writing of values, we gain the ability to decide where and how we want to apply the values. The advantage of this will become more clear when we need to initialize workarounds on older gens where it is not possible to write all the registers through ring LRIs. v2: rebase on newest bdw workarounds Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> [danvet: Resolve tiny conflict in comments and ocd alignments a bit.] [danvet2: Remove bogus force_wake_get call spotted by Paulo and QA.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-10-24drm/i915/bdw: Remove BDW preproduction W/As until C stepping.Rodrigo Vivi1-3/+2
Let's clean this a bit v2: Rebase after other Mika's patch that removed some BDW production workarounds. v3: Removed stepping info. Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-10-21Merge branch 'drm-intel-next-fixes' into drm-intel-nextDaniel Vetter1-13/+2
So I've sent the first pull request to Dave and I expect his request for a merge tree any second now ;-) More seriously I have some pending patches for 3.19 that depend upon both trees, hence backmerge. Conflicts are all trivial. Conflicts: drivers/gpu/drm/i915/i915_irq.c drivers/gpu/drm/i915/intel_display.c v2: Of course I've forgotten the fixup script for the silent conflict. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2014-10-14Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds1-41/+215
Pull drm updates from Dave Airlie: "This is the main git pull for the drm, I pretty much froze major pulls at -rc5/6 time, and haven't had much fallout, so will probably continue doing that. Lots of changes all over, big internal header cleanup to make it clear drm features are legacy things and what are things that modern KMS drivers should be using. Also big move to use the new generic fences in all the TTM drivers. core: atomic prep work, vblank rework changes, allows immediate vblank disables major header reworking and cleanups to better delinate legacy interfaces from what KMS drivers should be using. cursor planes locking fixes ttm: move to generic fences (affects all TTM drivers) ppc64 caching fixes radeon: userptr support, uvd for old asics, reset rework for fence changes better buffer placement changes, dpm feature enablement hdmi audio support fixes intel: Cherryview work, 180 degree rotation, skylake prep work, execlist command submission full ppgtt prep work cursor improvements edid caching, vdd handling improvements nouveau: fence reworking kepler memory clock work gt21x clock work fan control improvements hdmi infoframe fixes DP audio ast: ppc64 fixes caching fix rcar: rcar-du DT support ipuv3: prep work for capture support msm: LVDS support for mdp4, new panel, gpu refactoring exynos: exynos3250 SoC support, drop bad mmap interface, mipi dsi changes, and component match support" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (640 commits) drm/mst: rework payload table allocation to conform better. drm/ast: Fix HW cursor image drm/radeon/kv: add uvd/vce info to dpm debugfs output drm/radeon/ci: add uvd/vce info to dpm debugfs output drm/radeon: export reservation_object from dmabuf to ttm drm/radeon: cope with foreign fences inside the reservation object drm/radeon: cope with foreign fences inside display drm/core: use helper to check driver features drm/radeon/cik: write gfx ucode version to ucode addr reg drm/radeon/si: print full CS when we hit a packet 0 drm/radeon: remove unecessary includes drm/radeon/combios: declare legacy_connector_convert as static drm/radeon/atombios: declare connector convert tables as static drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table drm/radeon/dpm: drop clk/voltage dependency filters for BTC drm/radeon/dpm: drop clk/voltage dependency filters for CI drm/radeon/dpm: drop clk/voltage dependency filters for SI drm/radeon/dpm: drop clk/voltage dependency filters for NI drm/radeon: disable audio when we disable hdmi (v2) drm/radeon: split audio enable between eg and r600 (v2) ...
2014-09-30Merge branch 'topic/skl-stage1' into drm-intel-next-queuedDaniel Vetter1-1/+1
SKL stage 1 patches still need polish so will likely miss the 3.18 merge window. We've decided to postpone to 3.19 so let's pull this in to make patch merging and conflict handling easier. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2014-09-30drm/i915/bdw: WaDisableFenceDestinationToSLMRodrigo Vivi1-1/+5
This WA affect BDW GT3 pre-production steppings. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Don't mention steppings ...] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-29drm/i915: Minimize the huge amount of unecessary fbc sw cache clean.Rodrigo Vivi1-2/+7
The sw cache clean on BDW is a tempoorary workaround because we cannot set cache clean on blt ring with risk of hungs. So we are doing the cache clean on sw. However we are doing much more than needed. Not only when using blt ring. So, with this extra w/a we minimize the ammount of cache cleans and call it only on same cases that it was being called on gen7. The traditional FBC Cache clean happens over LRI on BLT ring when there is a frontbuffer touch happening. frontbuffer tracking set fbc_dirty variable to let BLT flush that it must clean FBC cache. fbc.need_sw_cache_clean works in the opposite information direction of ring->fbc_dirty telling software on frontbuffer tracking to perform the cache clean on sw side. v2: Clean it a little bit and fully check for Broadwell instead of gen8. v3: Rebase after frontbuffer organization. v4: Wiggle confused me. So fixing v3! Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-09-24drm/i915/skl: don't set the AsyncFlip performance mode for Gen9+Imre Deak1-1/+1
The following sets the AsyncFlip performance mode for everything above Gen6: commit 4790cb36b3eede8fb0cca529dc1d31b9936fa24b Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sun Jan 20 16:11:20 2013 +0000 drm/i915: Disable AsyncFlip performance optimisations Starting from Gen9 the MI_MODE register layout changes and doesn't include the above bit. Reviewed-by: Thomas Wood <thomas.wood@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>