linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-03-08	drm/i915: expose rcs topology through query uAPI	Lionel Landwerlin	2	-0/+137
	With the introduction of asymmetric slices in CNL, we cannot rely on the previous SUBSLICE_MASK getparam to tell userspace what subslices are available. Here we introduce a more detailed way of querying the Gen's GPU topology that doesn't aggregate numbers. This is essential for monitoring parts of the GPU with the OA unit, because counters need to be normalized to the number of EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do not gives us sufficient information. The Mesa series making use of this API is : https://patchwork.freedesktop.org/series/38795/ As a bonus we can draw representations of the GPU : https://imgur.com/a/vuqpa v2: Rename uapi struct s/_mask/_info/ (Tvrtko) Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko) Add uapi macros to read data from *_info structs (Tvrtko) v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko) v4: factorize query item writting (Tvrtko) tweak uapi struct/define names (Tvrtko) v5: Replace ALIGN() macro (Chris) v6: Updated uapi comments (Tvrtko) Moved flags != 0 checks into vfuncs (Tvrtko) v7: Use access_ok() before copying anything, to avoid overflows (Chris) Switch BUG_ON() to GEM_WARN_ON() (Tvrtko) v8: Tweak uapi comments style to match the coding style (Lionel) v9: Fix error in comment about computation of enabled subslice (Tvrtko) v10: Fix/update comments in uAPI (Sagar) v11: Drop drm_i915_query_(slice\|subslice\|eu)_info in favor of a single drm_i915_query_topology_info (Joonas) v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas) v13: Fix comment in uAPI (Joonas) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-7-lionel.g.landwerlin@intel.com
2018-03-08	drm/i915: add query uAPI	Lionel Landwerlin	5	-3/+111
	There are a number of information that are readable from hardware registers and that we would like to make accessible to userspace. One particular example is the topology of the execution units (how are execution units grouped in subslices and slices and also which ones have been fused off for die recovery). At the moment the GET_PARAM ioctl covers some basic needs, but generally is only able to return a single value for each defined parameter. This is a bit problematic with topology descriptions which are array/maps of available units. This change introduces a new ioctl that can deal with requests to fill structures of potentially variable lengths. The user is expected fill a query with length fields set at 0 on the first call, the kernel then sets the length fields to the their expected values. A second call to the kernel with length fields at their expected values will trigger a copy of the data to the pointed memory locations. The scope of this uAPI is only to provide information to userspace, not to allow configuration of the device. v2: Simplify dispatcher code iteration (Tvrtko) Tweak uapi drm_i915_query_item structure (Tvrtko) v3: Rename pad fields into flags (Chris) Return error on flags field != 0 (Chris) Only copy length back to userspace in drm_i915_query_item (Chris) v4: Use array of functions instead of switch (Chris) v5: More comments in uapi (Tvrtko) Return query item errors in length field (All) v6: Tweak uapi comments style to match the coding style (Lionel) v7: Add i915_query.h (Joonas) v8: (Lionel) Change the behavior of the item iterator to report invalid queries into the query item rather than stopping the iteration. This enables userspace applications to query newer items on older kernels and only have failure on the items that are not supported. v9: Edit copyright headers (Joonas) v10: Typos & comments in uapi (Joonas) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-6-lionel.g.landwerlin@intel.com
2018-03-08	drm/i915: add rcs topology to error state	Lionel Landwerlin	1	-0/+1
	This might be useful information for developers looking at an error state. v2: Place topology towards the end of the error state (Chris) v3: Reuse common printing code (Michal) v4: Make this a one-liner (Chris) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-5-lionel.g.landwerlin@intel.com
2018-03-08	drm/i915/debugfs: add rcs topology entry	Lionel Landwerlin	3	-0/+37
	While the end goal is to make this information available to userspace through a new ioctl, there is no reason we can't display it in a human readable fashion through debugfs. slice0: 3 subslice(s) (0x7): subslice0: 8 EUs (0xff) subslice1: 8 EUs (0xff) subslice2: 8 EUs (0xff) subslice3: 0 EUs (0x0) slice1: 3 subslice(s) (0x7): subslice0: 8 EUs (0xff) subslice1: 8 EUs (0xff) subslice2: 8 EUs (0xff) subslice3: 0 EUs (0x0) slice2: 3 subslice(s) (0x7): subslice0: 8 EUs (0xff) subslice1: 8 EUs (0xff) subslice2: 8 EUs (0xff) subslice3: 0 EUs (0x0) v2: Reformat debugfs printing (Tvrtko) Use the new EU mask helper (Tvrtko) v3: Move printing code to intel_device_info.c to be shared with error state (Michal) v4: Bump u8 to u16 when using sseu_get_eus() (Lionel) Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-4-lionel.g.landwerlin@intel.com
2018-03-08	drm/i915/debugfs: reuse max slice/subslices already stored in sseu	Lionel Landwerlin	2	-17/+12
	Now that we have that information in topology fields, let's just reuse it. v2: Style tweaks (Tvrtko) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-3-lionel.g.landwerlin@intel.com
2018-03-08	drm/i915: store all subslice masks	Lionel Landwerlin	6	-67/+237
	Up to now, subslice mask was assumed to be uniform across slices. But starting with Cannonlake, slices can be asymmetric (for example slice0 has different number of subslices as slice1+). This change stores all subslices masks for all slices rather than having a single mask that applies to all slices. v2: Rework how we store total numbers in sseu_dev_info (Tvrtko) Fix CHV eu masks, was reading disabled as enabled (Tvrtko) Readability changes (Tvrtko) Add EU index helper (Tvrtko) v3: Turn ALIGN(v, 8) / 8 into DIV_ROUND_UP(v, BITS_PER_BYTE) (Tvrtko) Reuse sseu_eu_idx() for setting eu_mask on CHV (Tvrtko) Reformat debug prints for subslices (Tvrtko) v4: Change eu_mask helper into sseu_set_eus() (Tvrtko) v5: With Haswell reporting masks & counts, bump sseu_*_eus() functions to use u16 (Lionel) v6: Fix sseu_get_eus() for > 8 EUs per subslice (Lionel) v7: Change debugfs enabels for number of subslices per slice, will need a small igt/pm_sseu change (Lionel) Drop subslice_total field from sseu_dev_info, rely on sseu_subslice_total() to recompute the value instead (Lionel) v8: Remove unused function compute_subslice_total() (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-2-lionel.g.landwerlin@intel.com
2018-03-08	drm/i915/guc: work around gcc-4.4.4 union initializer issue	Andrew Morton	1	-2/+4
	gcc-4.4.4 has problems with initalizers of anon unions. drivers/gpu/drm/i915/intel_guc_log.c: In function 'guc_log_control': drivers/gpu/drm/i915/intel_guc_log.c:64: error: unknown field 'logging_enabled' specified in initializer Work around this. Fixes: 35fe703c3161 ("drm/i915/guc: Change values for i915_guc_log_control") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180308001333.rI2vrNRTY%akpm@linux-foundation.org
2018-03-07	drm/i915/cnl: Add Wa_2201832410	Rodrigo Vivi	2	-0/+8
	"Clock gating bug in GWL may not clear barrier state when an EOT is received, causing a hang the next time that barrier is used." HSDES: 2201832410 Cc: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307220912.3681-1-rodrigo.vivi@intel.com
2018-03-07	drm/i915/icl: Gen11 forcewake support	Daniele Ceraolo Spurio	4	-26/+189
	The main difference with previous GENs is that starting from Gen11 each VCS and VECS engine has its own power well, which only exist if the related engine exists in the HW. The fallback forcewake request workaround is only needed on gen9 according to the HSDES WA entry (1604254524), so we can go back to using the simpler fw_domains_get/put functions. BSpec: 18331 v2: fix fwtable, use array to test shadow tables, create new accessors to avoid check on every access (Tvrtko) v3 (from Paulo): Rebase. v4: - Range 09400-097FF should be FORCEWAKE_ALL (Daniele) - Use the BIT macro for forcewake domains (Daniele) - Add a comment about the range ordering (Oscar) - Updated commit message (Oscar) v5: Rebased v6: Use I915_MAX_VCS/VECS (Michal) v7: translate FORCEWAKE_ALL to available domains v8: rebase, add clarification on fallback ack in commit message. v9: fix rebase issue, change check in fw_domains_init from IS_GEN11 to GEN >= 11 v10: Generate is_genX_shadowed with a macro (Daniele) Include gen11_fw_ranges in the selftest (Michel) v11: Simplify FORCEWAKE_ALL, new line between NEEDS_FORCEWAKEs (Tvrtko) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Acked-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-6-mika.kuoppala@linux.intel.com Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-03-07	drm/i915/icl: Add Indirect Context Offset for Gen11	Michel Thierry	2	-0/+5
	v2: rebased to intel_lr_indirect_ctx_offset v3: rebase, move define to intel_lrc_reg.h BSpec: 11740 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-5-mika.kuoppala@linux.intel.com Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-03-07	drm/i915/icl: Enhanced execution list support	Thomas Daniel	6	-17/+62
	Enhanced Execlists is an upgraded version of execlists which supports up to 8 ports. The lrcs to be submitted are written to a submit queue (the ExecLists Submission Queue - ELSQ), which is then loaded on the HW. When writing to the ELSP register, the lrcs are written cyclically in the queue from position 0 to position 7. Alternatively, it is possible to write directly in the individual positions of the queue using the ELSQC registers. To be able to re-use all the existing code we're using the latter method and we're currently limiting ourself to only using 2 elements. v2: Rebase. v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio). v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio) v5: Reword commit, rename regs to be closer to specs, turn off preemption (Daniele), reuse engine->execlists.elsp (Chris) v6: use has_logical_ring_elsq to differentiate the new paths v7: add preemption support, rename els to submit_reg (Chris) v8: save the ctrl register inside the execlists struct, drop CSB handling updates (superseded by preempt_complete_status) (Chris) v9: s/drm_i915_gem_request/i915_request (Mika) v10: resolved conflict in inject_preempt_context (Mika) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-4-mika.kuoppala@linux.intel.com
2018-03-07	drm/i915/icl: new context descriptor support	Daniele Ceraolo Spurio	5	-4/+53
	Starting from Gen11 the context descriptor format has been updated in the HW. The hw_id field has been considerably reduced in size and engine class and instance fields have been added. There is a slight name clashing issue because the field that we call hw_id is actually called SW Context ID in the specs for Gen11+. With the current size of the hw_id field we can have a maximum of 2k contexts at any time, but we could use the sw_counter field (which is sw defined) to increase that because the HW requirement is that engine_id + sw id + sw_counter is a unique number. GuC uses a similar method to support more contexts but does its tracking at lrc level. To avoid doing an implementation that will need to be reworked once GuC support lands, defer it for now and mark it as TODO. v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT v3: rebased, bring back lost code from i915_gem_context.c v4: make TODO comment more generic v5: be consistent with bit ordering, add extra checks (Chris) Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.com Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-03-07	drm/i915/icl: Correctly initialize the Gen11 engines	Oscar Mateo	2	-1/+49
	Gen11 has up to 4 VCS and up to 2 VECS engines, this patch adds mmio base definitions for all of them. Bspec: 20944 Bspec: 7021 v2: Set the correct mmio_base in intel_engines_init_mmio; updating the base mmio values any later would cause incorrect reads in i915_gem_sanitize (Michel). Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-2-mika.kuoppala@linux.intel.com Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-03-07	drm/i915: Assert that the request is indeed complete when signaled from irq	Chris Wilson	1	-0/+1
	After we call dma_fence_signal(), confirm that the request was indeed complete. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180305104105.8296-1-chris@chris-wilson.co.uk
2018-03-07	drm/i915: Handle changing enable_fbc parameter at runtime better.	Maarten Lankhorst	1	-26/+36
	If i915.enable_fbc is cleared at runtime, but FBC was previously enabled then we don't disable FBC until the next time the crtc is disabled. Make sure that if the module param is changed, we disable FBC in intel_fbc_post_update so we never have to worry about disabling. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180305123608.20665-1-maarten.lankhorst@linux.intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-03-06	drm/i915: Track whether the DP link is trained or not	Ville Syrjälä	4	-3/+12
	LSPCON likes to throw short HPDs during the enable seqeunce prior to the link being trained. These obviously result in the channel CR/EQ check failing and thus we schedule a pointless hotplug work to retrain the link. Avoid that by ignoring the bad CR/EQ status until we've actually initially trained the link. I've not actually investigated to see what LSPCON is trying to signal with the short pulse. But as long as it signals anything I think we're supposed to check the link status anyway, so I don't really see other good ways to solve this. I've not seen these short pulses being generated by normal DP sinks. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-5-ville.syrjala@linux.intel.com
2018-03-06	drm/i915: Nuke intel_dp->channel_eq_status	Ville Syrjälä	2	-4/+3
	intel_dp->channel_eq_status is used in exactly one function, and we don't need it to persist between calls. So just go back to using a local variable instead. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-4-ville.syrjala@linux.intel.com
2018-03-06	drm/i915: Move SST DP link retraining into the ->post_hotplug() hook	Ville Syrjälä	3	-88/+121
	Doing link retraining from the short pulse handler is problematic since that might introduce deadlocks with MST sideband processing. Currently we don't retrain MST links from this code, but we want to change that. So better to move the entire thing to the hotplug work. We can utilize the new encoder->hotplug() hook for this. The only thing we leave in the short pulse handler is the link status check. That one still depends on the link parameters stored under intel_dp, so no locking around that but races should be mostly harmless as the actual retraining code will recheck the link state if we end up there by mistake. v2: Rebase due to ->post_hotplug() now being just ->hotplug() Check the connector type to figure out if we should do the HDMI thing or the DP think for DDI [pushed with whitespace changes for sparse] Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-3-ville.syrjala@linux.intel.com
2018-03-06	drm/i915: Reinitialize sink scrambling/TMDS clock ratio on HPD	Ville Syrjälä	6	-15/+168
	The LG 4k TV I have doesn't deassert HPD when I turn the TV off, but when I turn it back on it will pulse the HPD line. By that time it has forgotten everything we told it about scrambling and the clock ratio. Hence if we want to get a picture out if it again we have to tell it whether we're currently sending scrambled data or not. Implement that via the encoder->hotplug() hook. v2: Force a full modeset to not follow the HDMI 2.0 spec more closely (Shashank) [pushed with whitespace fixes to make sparse happy] Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com
2018-03-06	drm/i915: Convert intel_hpd_irq_event() into an encoder hotplug hook	Ville Syrjälä	1	-2/+10
	Allow encoders to customize their hotplug processing by moving the intel_hpd_irq_event() code into an encoder hotplug vfunc. Currently only SDVO needs this to re-enable hotplug signalling in the SDVO chip. We'll use this same hook for DP/HDMI link management later. Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com
2018-03-06	drm/i915/cnp: Document WaSouthDisplayDisablePWMCGEGating	Rodrigo Vivi	1	-1/+1
	No functional change since WA is already applied. But since it has different names on different databases, let's document it here to avoid future confusion. Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306012812.19779-1-rodrigo.vivi@intel.com
2018-03-06	drm/i915/cnl: document WaVFUnitClockGatingDisable	Rodrigo Vivi	1	-0/+1
	No functional change. WA is already properly applied. but in different databases it has different names. Let's document all of them to avoid future confusion. Cc: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306012000.18928-1-rodrigo.vivi@intel.com
2018-03-06	drm/i915/psr: Update PSR2 resolution check for Cannonlake	Dhinakaran Pandiyan	1	-6/+15
	In fact, apply the Cannonlake resolution check for all >= Gen-10 platforms to be safe. v3: Update GLK too. (Ville) Longer variable names. if-else in place of ternary operator. v2: Use local variables for resolution limits and print them (Ville) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Elio Martinez Monroy <elio.martinez.monroy@intel.com> Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306203355.29292-1-dhinakaran.pandiyan@intel.com
2018-03-06	drm/i915: Flush waiters on seqno wraparound	Chris Wilson	3	-24/+2
	Previously, we would spin waiting for all waiters to wake up and notice their request had completed before we would reset the seqno upon wraparound. However, we can mark their waits as complete and wake them up directly using the existing machinery for handling the flushing of missed wakeups when idling. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-2-chris@chris-wilson.co.uk
2018-03-06	drm/i915: Stop kicking the signaling thread on seqno wraparound	Chris Wilson	2	-5/+2
	Since commit fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers"), we cancel the signaler when retiring the request and so upon wraparound, where we wait for all requests to be retired, we no longer need to spin waiting for the signaling thread to release its references to the in-flight requests, and so we can assert that the signaler is idle. References: fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-1-chris@chris-wilson.co.uk
2018-03-06	drm/i915/breadcrumbs: Assert all missed breadcrumbs were signaled	Chris Wilson	1	-0/+2
	When parking the engines and their breadcrumbs, if we have waiters left then they missed their wakeup. Verify that each waiter's seqno did complete. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-2-chris@chris-wilson.co.uk
2018-03-06	drm/i915/breadcrumbs: Reduce signaler rbtree to a sorted list	Chris Wilson	3	-151/+116
	The goal here is to try and reduce the latency of signaling additional requests following the wakeup from interrupt by reducing the list of to-be-signaled requests from an rbtree to a sorted linked list. The original choice of using an rbtree was to facilitate random insertions of request into the signaler while maintaining a sorted list. However, if we assume that most new requests are added when they are submitted, we see those new requests in execution order making a insertion sort fast, and the reduction in overhead of each signaler iteration significant. Since commit 56299fb7d904 ("drm/i915: Signal first fence from irq handler if complete"), we signal most fences directly from notify_ring() in the interrupt handler greatly reducing the amount of work that actually needs to be done by the signaler kthread. All the thread is then required to do is operate as the bottom-half, cleaning up after the interrupt handler and preparing the next waiter. This includes signaling all later completed fences in a saturated system, but on a mostly idle system we only have to rebuild the wait rbtree in time for the next interrupt. With this de-emphasis of the signaler's role, we want to rejig it's datastructures to reduce the amount of work we require to both setup the signal tree and maintain it on every interrupt. References: 56299fb7d904 ("drm/i915: Signal first fence from irq handler if complete") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-1-chris@chris-wilson.co.uk
2018-03-06	drm/i915/error: capture uc_state after gen_state	Daniele Ceraolo Spurio	1	-1/+1
	error->device_info.has_guc, which we check in capture_uc_state, is set in capture_gen_state, so the latter needs to be performed first. v2: rebased Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Fixes: 7d41ef3479a6 (drm/i915: Add Guc/HuC firmware details to error state) Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-3-daniele.ceraolospurio@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-06	drm/i915/error: standardize function style in error capture	Daniele Ceraolo Spurio	1	-45/+39
	some of the static functions used from capture() have the "i915_" prefix while other don't; most of them take i915 as a parameter, but one of them derives it internally from error->i915. Let's be consistent by avoiding prefix for static functions and by getting i915 from error->i915. While at it, s/dev_priv/i915 in functions that don't perform register reads. v2: take i915 from error->i915 (Michal), s/dev_priv/i915, update commit message Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-2-daniele.ceraolospurio@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-06	drm/i915/error: remove unused gen8_engine_sync_index	Daniele Ceraolo Spurio	1	-21/+0
	Leftover from Gen8 ringbuffer support removal Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-1-daniele.ceraolospurio@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-06	drm/i915/gvt: Return error at the failure of finding page_track	Xiong Zhang	1	-1/+3
	In XenGT, ioreq copy is used to trap mmio write and ppgtt write. Both of them are memory write, ioreq handler couldn't distinguish them. So ioreq handler probe the ppgtt write handler, if it is succuess, this ioreq is ppgtt write, otherwise it is mmio write. So ppgtt write handler should return an error at the failure of finding page track, it is fatal to implement ioreq handler in XenGT. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Release gvt->lock at the failure of finding page track	Xiong Zhang	1	-1/+2
	page_track_handler take lock at the beginning, the lock should be released at the failure of finding page track. Otherwise deadlock will happen. Fixes: e502a2af4c35 ("drm/i915/gvt: Provide generic page_track infrastructure for write-protected page") Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/kvmgt: Add kvmgt debugfs entry nr_cache_entries under vgpu	Changbin Du	2	-0/+16
	Add a new debugfs entry kvmgt_nr_cache_entries under vgpu which shows the number of entry in dma cache. $ cat /sys/kernel/debug/gvt/vgpu1/kvmgt_nr_cache_entries 10101 v3: fix compiling error for some configuration. (Xiong Zhang <xiong.y.zhang@intel.com>) v2: keep debugfs layout flat. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix guest vGPU hang caused by very high dma setup overhead	Changbin Du	5	-134/+246
	The implementation of current kvmgt implicitly setup dma mapping at MPT API gfn_to_mfn. First this design against the API's original purpose. Second, there is no unmap hit in this design. The result is that the dma mapping keep growing larger and larger. For mutl-vm case, they will consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree entries crated in the IOMMU IOVA allocator. Finally, single IOVA allocation can take as long as ~70ms. Such latency is intolerable. To address both above issues, this patch introduced two new MPT API: o dma_map_guest_page - setup dma map for guest page o dma_unmap_guest_page - cancel dma map for guest page The kvmgt implements these 2 API. And to reduce dma setup overhead for duplicated pages (eg. scratch pages), two caches are used: one is for mapping gfn to struct gvt_dma, another is for mapping dma addr to struct gvt_dma. With these 2 new API, the gtt now is able to cancel dma mapping when page table is invalidated. The dma mapping is not in a gradual increase now. v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point. Cc: Hang Yuan <hang.yuan@intel.com> Cc: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix check error on hws_pga_write() fail message	Zhenyu Wang	1	-4/+4
	Fix below check error by using proper failure message output. drivers/gpu/drm/i915//gvt/handlers.c:1392 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR() drivers/gpu/drm/i915//gvt/handlers.c:1402 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR() Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix one indent error	Zhenyu Wang	1	-1/+1
	Fix below warning: drivers/gpu/drm/i915//gvt/handlers.c:323 gdrst_mmio_write() warn: inconsistent indenting Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix check error on fence mmio handler	Zhenyu Wang	1	-2/+4
	Fix below error with minor code refactor. CHECK drivers/gpu/drm/i915//gvt/handlers.c drivers/gpu/drm/i915//gvt/handlers.c:203 sanitize_fence_mmio_access() error: 'vgpu' dereferencing possible ERR_PTR() Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix check error of vgpu create failure message	Zhenyu Wang	1	-1/+1
	Fix check error at CHECK drivers/gpu/drm/i915//gvt/kvmgt.c drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454) For failed vgpu create, just show error return in failure message. Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix vGPU sched timeslice calculation warning	Zhenyu Wang	1	-3/+2
	Fix below warning by using proper ktime helper to calculate timeslice. CHECK drivers/gpu/drm/i915//gvt/sched_policy.c drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1 drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1 Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: remove gvt max port definition	Zhenyu Wang	1	-3/+1
	Remove GVT-g private max port definition but use i915 one. Fix error caused by: drivers/gpu/drm/i915//gvt/handlers.c:871 dp_aux_ch_ctl_mmio_write() error: buffer overflow 'display->ports' 5 <= 5 Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Fix one gvt_vgpu_error() use in dmabuf.c	Zhenyu Wang	1	-1/+1
	Fix below warning with proper usage. CHECK drivers/gpu/drm/i915//gvt/dmabuf.c drivers/gpu/drm/i915//gvt/dmabuf.c:462 intel_vgpu_get_dmabuf() error: 'vgpu' dereferencing possible ERR_PTR() Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: init mmio by lri command in vgpu inhibit context	Weinan Li	4	-4/+181
	There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g, vgpu can't get the correct default context by updating the registers before inhibit context submission. It always get back the hardware default value unless the inhibit context submission happened before the 1st time forcewake put. With this wrong default context, vgpu will run with incorrect state and meet unknown issues. The solution is initialize these mmios by adding lri command in ring buffer of the inhibit context, then gpu hardware has no chance to go down RC6 when lri commands are right being executed, and then vgpu can get correct default context for further use. v3: - fix code fault, use 'for' to loop through mmio render list(Zhenyu) v4: - save the count of engine mmio need to be restored for inhibit context and refine some comments. (Kevin) v5: - code rebase Cc: Kevin Tian <kevin.tian@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: add interface to check if context is inhibit	Weinan Li	2	-10/+16
	No functional change, just for easy to use. v4: - refine comment (Kevin) Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: add define GEN9_MOCS_SIZE	Weinan Li	1	-6/+8
	No functional change. This defination will also be used in future patchesi. v4: - refine patch description (Kevin) Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Define PTE addr mask with GENMASK_ULL	Changbin Du	1	-3/+3
	Define the masks better. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Manage shadow pages with radix tree	Changbin Du	2	-27/+27
	We don't know how many page tables will be shadowed. It varies considerably corresponding to guest load. Radix tree is a better choice for us. Since Page Frame Number is used as key so most of the bits are common. Here is some performance data (duration in us) of looking up a element: Before: (aka. ppgtt_find_shadow_page) 0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325 After: (aka. intel_vgpu_find_spt_by_mfn) 0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108 This time I didn't get the early data of hash table. The data is measured when desktop is shown. As last change, the overall benchmark almost is not changed, but we get better scalability. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Provide generic page_track infrastructure for write-protected page	Changbin Du	8	-111/+266
	This patch provide generic page_track infrastructure for write-protected guest page. The old page_track logic gets rewrote and now stays in a new standalone page_track.c. This page track infrastructure can be both used by vGUC and GTT shadowing. The important change is that it uses radix tree instead of hash table. We don't have a predictable number of pages that will be tracked. Here is some performance data (duration in us) of looking up a element: Before: (aka. intel_vgpu_find_tracked_page) 0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291 After: (aka. intel_vgpu_find_page_track) 0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105 The hash table has good performance at beginning, but turns bad with more pages being tracked even no 3D applications are running. As expected, radix tree has stable duration and very quick. The overall benchmark (tested with Heaven Benchmark) marginally improved since this is not the bottleneck. What we benefit more from this change is scalability. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Don't extend page_track to mpt layer	Changbin Du	2	-52/+36
	Don't extend page_track to mpt layer. Keep MPT simple and clean. Meanwhile remove gtt.n_tracked_guest_page which doesn't make much sense. v2: clean up gtt.n_tracked_guest_page. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Rename mpt api {set, unset}_wp_page to {enable, disable}_page_track	Changbin Du	3	-11/+10
	The kvmgt's implementation of mpt api {set,unset}_wp_page is not real write-protection - the data get written before invoke this two api. As discussed, change the mpt api to match the real behavior. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2018-03-06	drm/i915/gvt: Rename shadow_page to short name spt	Changbin Du	2	-29/+29
	The target structure of some functions is struct intel_vgpu_ppgtt_spt and their names are xxx_shadow_page. It should be xxx_shadow_page_table. Let's use short name 'spt' instead to reduce the length. As well as the hash table name. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>