path: root/drivers/gpu/drm/i915/gt
author    Linus Torvalds <torvalds@linux-foundation.org>  2021-09-01 11:26:46 -0700
committer Linus Torvalds <torvalds@linux-foundation.org>  2021-09-01 11:26:46 -0700
commit    477f70cd2a67904e04c2c2b9bd0fa2e95222f2f6 (patch)
tree      1897dd1de49e1ea24897163533e2d8ead5dad0ad /drivers/gpu/drm/i915/gt
parent    Merge tag 'media/v5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media (diff)
parent    Merge tag 'amd-drm-next-5.15-2021-08-27' of https://gitlab.freedesktop.org/agd5f/linux into drm-next (diff)
Merge tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "Highlights:

   - i915 has seen a lot of refactoring and uAPI cleanups due to a
     change in the upstream direction going forward

     This has all been audited with known userspace, but there may be
     some pitfalls that were missed.

   - i915 now uses common TTM to enable discrete memory on DG1/2 GPUs

   - i915 enables Jasper and Elkhart Lake by default and has
     preliminary XeHP/DG2 support

   - amdgpu adds support for Cyan Skillfish

   - lots of implicit fencing rules documented and fixed up in drivers

   - msm now uses the core scheduler

   - the irq midlayer has been removed for non-legacy drivers

   - the sysfb code now works on more than x86.

  Otherwise the usual smattering of stuff everywhere, panels, bridges,
  refactorings.

  Detailed summary:

  core:
   - extract i915 eDP backlight into core
   - DP aux bus support
   - drm_device.irq_enabled removed
   - port drivers to native irq interfaces
   - export gem shadow plane handling for vgem
   - print proper driver name in framebuffer registration
   - driver fixes for implicit fencing rules
   - ARM fixed rate compression modifier added
   - updated fb damage handling
   - rmfb ioctl logging/docs
   - drop drm_gem_object_put_locked
   - define DRM_FORMAT_MAX_PLANES
   - add gem fb vmap/vunmap helpers
   - add lockdep_assert(once) helpers
   - mark drm irq midlayer as legacy
   - use offset adjusted bo mapping conversion

  vgaarb:
   - cleanups

  fbdev:
   - extend efifb handling to all arches
   - div by 0 fixes for multiple drivers

  udmabuf:
   - add hugepage mapping support

  dma-buf:
   - non-dynamic exporter fixups
   - document implicit fencing rules

  amdgpu:
   - Initial Cyan Skillfish support
   - switch virtual DCE over to vkms based atomic
   - VCN/JPEG power down fixes
   - NAVI PCIE link handling fixes
   - AMD HDMI freesync fixes
   - Yellow Carp + Beige Goby fixes
   - Clockgating/S0ix/SMU/EEPROM fixes
   - embed hw fence in job
   - rework dma-resv handling
   - ensure eviction to system ram

  amdkfd:
   - uapi: SVM address range query added
   - sysfs leak fix
   - GPUVM TLB optimizations
   - vmfault/migration counters

  i915:
   - Enable JSL and EHL by default
   - preliminary XeHP/DG2 support
   - remove all CNL support (never shipped)
   - move to TTM for discrete memory support
   - allow mixed object mmap handling
   - GEM uAPI spring cleaning
   - add I915_MMAP_OBJECT_FIXED
   - reinstate ADL-P mmap ioctls
   - drop a bunch of unused by userspace features
   - disable and remove GPU relocations
   - revert some i915 misfeatures
   - major refactoring of GuC for Gen11+
   - execbuffer object locking separate step
   - reject caching/set-domain on discrete
   - Enable pipe DMC loading on XE-LPD and ADL-P
   - add PSF GV point support
   - Refactor and fix DDI buffer translations
   - Clean up FBC CFB allocation code
   - Finish INTEL_GEN() and friends macro conversions

  nouveau:
   - add eDP backlight support
   - implicit fence fix

  msm:
   - a680/7c3 support
   - drm/scheduler conversion

  panfrost:
   - rework GPU reset

  virtio:
   - fix fencing for planes

  ast:
   - add detect support

  bochs:
   - move to tiny GPU driver

  vc4:
   - use hotplug irqs
   - HDMI codec support

  vmwgfx:
   - use internal vmware device headers

  ingenic:
   - demidlayering irq

  rcar-du:
   - shutdown fixes
   - convert to bridge connector helpers

  zynqmp-dsub:
   - misc fixes

  mgag200:
   - convert PLL handling to atomic

  mediatek:
   - MT8133 AAL support
   - gem mmap object support
   - MT8167 support

  etnaviv:
   - NXP Layerscape LS1028A SoC support
   - GEM mmap cleanups

  tegra:
   - new user API

  exynos:
   - missing unlock fix
   - build warning fix
   - use refcount_t"

* tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm: (1318 commits)
  drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box
  drm/amd/display: Remove duplicate dml init
  drm/amd/display: Update bounding box states (v2)
  drm/amd/display: Update number of DCN3 clock states
  drm/amdgpu: disable GFX CGCG in aldebaran
  drm/amdgpu: Clear RAS interrupt status on aldebaran
  drm/amdgpu: Add support for RAS XGMI err query
  drm/amdkfd: Account for SH/SE count when setting up cu masks.
  drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain
  drm/amdgpu: drop redundant cancel_delayed_work_sync call
  drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
  drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
  drm/amdkfd: map SVM range with correct access permission
  drm/amdkfd: check access permisson to restore retry fault
  drm/amdgpu: Update RAS XGMI Error Query
  drm/amdgpu: Add driver infrastructure for MCA RAS
  drm/amd/display: Add Logging for HDMI color depth information
  drm/amd/amdgpu: consolidate PSP TA init shared buf functions
  drm/amd/amdgpu: add name field back to ras_common_if
  drm/amdgpu: Fix build with missing pm_suspend_target_state module export
  ...
Diffstat (limited to 'drivers/gpu/drm/i915/gt')
-rw-r--r-- drivers/gpu/drm/i915/gt/debugfs_gt_pm.c | 10
-rw-r--r-- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 17
-rw-r--r-- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 68
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 44
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_breadcrumbs.h | 16
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h | 7
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_context.c | 88
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_context.h | 56
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_context_param.c | 63
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_context_param.h | 6
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_context_types.h | 64
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine.h | 87
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 420
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c | 74
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h | 4
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 4
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine_types.h | 93
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_engine_user.c | 6
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 604
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_execlists_submission.h | 12
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_ggtt.c | 6
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 2
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt.c | 197
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt.h | 10
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c | 10
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt_pm.c | 11
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt_requests.c | 21
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt_requests.h | 9
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gt_types.h | 37
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gtt.c | 20
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_gtt.h | 18
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_lrc.c | 117
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 3
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_migrate.c | 688
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_migrate.h | 65
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_migrate_types.h | 15
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_mocs.c | 2
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_rc6.c | 49
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_region_lmem.c | 7
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_renderstate.h | 1
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_reset.c | 56
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_ring.h | 1
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_ring_submission.c | 70
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_rps.c | 209
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_rps.h | 10
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_sseu.c | 126
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_sseu.h | 10
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 8
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_workarounds.c | 498
-rw-r--r-- drivers/gpu/drm/i915/gt/intel_workarounds_types.h | 1
-rw-r--r-- drivers/gpu/drm/i915/gt/mock_engine.c | 51
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_context.c | 10
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 22
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.h | 2
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 4
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_execlists.c | 307
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 330
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_lrc.c | 6
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_migrate.c | 669
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_mocs.c | 52
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_reset.c | 2
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_slpc.c | 311
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_timeline.c | 2
-rw-r--r-- drivers/gpu/drm/i915/gt/selftest_workarounds.c | 162
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h | 129
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h | 235
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h | 127
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h | 65
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h | 213
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc.c | 206
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 119
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 487
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h | 4
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 703
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 36
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c | 47
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 167
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 29
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_log.h | 6
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c | 80
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_rc.h | 31
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 626
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h | 42
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h | 29
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 2753
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h | 18
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_uc.c | 126
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_uc.h | 15
-rw-r--r-- drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 27
90 files changed, 9846 insertions, 2437 deletions
diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
index 4270b5a34a83..d6f5836396f8 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
@@ -437,20 +437,20 @@ static int frequency_show(struct seq_file *m, void *unused)
max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 0 :
rp_state_cap >> 16) & 0xff;
max_freq *= (IS_GEN9_BC(i915) ||
- GRAPHICS_VER(i915) >= 10 ? GEN9_FREQ_SCALER : 1);
+ GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Lowest (RPN) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
max_freq = (rp_state_cap & 0xff00) >> 8;
max_freq *= (IS_GEN9_BC(i915) ||
- GRAPHICS_VER(i915) >= 10 ? GEN9_FREQ_SCALER : 1);
+ GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Nominal (RP1) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 16 :
rp_state_cap >> 0) & 0xff;
max_freq *= (IS_GEN9_BC(i915) ||
- GRAPHICS_VER(i915) >= 10 ? GEN9_FREQ_SCALER : 1);
+ GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
seq_printf(m, "Max overclocked frequency: %dMHz\n",
@@ -500,7 +500,7 @@ static int llc_show(struct seq_file *m, void *data)
min_gpu_freq = rps->min_freq;
max_gpu_freq = rps->max_freq;
- if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 10) {
+ if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
/* Convert GT frequency to 50 HZ units */
min_gpu_freq /= GEN9_FREQ_SCALER;
max_gpu_freq /= GEN9_FREQ_SCALER;
@@ -518,7 +518,7 @@ static int llc_show(struct seq_file *m, void *data)
intel_gpu_freq(rps,
(gpu_freq *
(IS_GEN9_BC(i915) ||
- GRAPHICS_VER(i915) >= 10 ?
+ GRAPHICS_VER(i915) >= 11 ?
GEN9_FREQ_SCALER : 1))),
((ia_freq >> 0) & 0xff) * 100,
((ia_freq >> 8) & 0xff) * 100);
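The hunks above move the GEN9_FREQ_SCALER cutoff from graphics version 10 to 11, consistent with the removal of Cannon Lake (the only version 10 part) elsewhere in this pull. A minimal sketch of the scaling rule being gated here; scale_rp_field is an illustrative helper, not part of the patch, and the scaler value of 3 is taken from the i915 headers:

static unsigned int scale_rp_field(unsigned int raw, bool is_gen9_bc, int ver)
{
	/* GEN9_FREQ_SCALER is 3: these parts report RP values in
	 * 16.667 MHz steps rather than 50 MHz steps, so the raw field
	 * is scaled up before intel_gpu_freq() converts it to MHz.
	 * With version 10 (CNL) gone, 11 is the new lower bound. */
	return raw * ((is_gen9_bc || ver >= 11) ? 3 : 1);
}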
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 94e0a5669f90..461844dffd7e 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -42,7 +42,7 @@ int gen8_emit_flush_rcs(struct i915_request *rq, u32 mode)
vf_flush_wa = true;
/* WaForGAMHang:kbl */
- if (IS_KBL_GT_STEP(rq->engine->i915, 0, STEP_B0))
+ if (IS_KBL_GT_STEP(rq->engine->i915, 0, STEP_C0))
dc_flush_wa = true;
}
@@ -208,7 +208,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
flags |= PIPE_CONTROL_FLUSH_L3;
flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
- /* Wa_1409600907:tgl */
+ /* Wa_1409600907:tgl,adl-p */
flags |= PIPE_CONTROL_DEPTH_STALL;
flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
flags |= PIPE_CONTROL_FLUSH_ENABLE;
@@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE)
aux_inv = rq->engine->mask & ~BIT(BCS0);
if (aux_inv)
- cmd += 2 * hweight8(aux_inv) + 2;
+ cmd += 2 * hweight32(aux_inv) + 2;
cs = intel_ring_begin(rq, cmd);
if (IS_ERR(cs))
@@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
struct intel_engine_cs *engine;
unsigned int tmp;
- *cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));
- for_each_engine_masked(engine, rq->engine->gt,
- aux_inv, tmp) {
+ *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
+ for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
*cs++ = AUX_INV;
}
@@ -506,7 +505,8 @@ gen8_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
*cs++ = MI_USER_INTERRUPT;
*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
- if (intel_engine_has_semaphores(rq->engine))
+ if (intel_engine_has_semaphores(rq->engine) &&
+ !intel_uc_uses_guc_submission(&rq->engine->gt->uc))
cs = emit_preempt_busywait(rq, cs);
rq->tail = intel_ring_offset(rq, cs);
@@ -598,7 +598,8 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs)
*cs++ = MI_USER_INTERRUPT;
*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
- if (intel_engine_has_semaphores(rq->engine))
+ if (intel_engine_has_semaphores(rq->engine) &&
+ !intel_uc_uses_guc_submission(&rq->engine->gt->uc))
cs = gen12_emit_preempt_busywait(rq, cs);
rq->tail = intel_ring_offset(rq, cs);
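The hweight8() to hweight32() changes above are sizing fixes: with the XeHP engines added later in this series (VCS4-7, VECS2-3), an engine mask no longer fits in 8 bits, so an 8-bit population count could undersize the AUX invalidation command stream. A sketch of the sizing rule; aux_inv_cmd_dwords is a hypothetical name, and the meaning of the trailing +2 is an assumption about the surrounding emission code:

#include <linux/bitops.h>

static u32 aux_inv_cmd_dwords(u32 aux_inv)
{
	/* one register/value pair per engine bit set in the mask,
	 * plus two dwords of surrounding command overhead */
	return 2 * hweight32(aux_inv) + 2;
}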
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index da4f5eb43ac2..6e0e52eeb87a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -358,6 +358,54 @@ static void gen8_ppgtt_alloc(struct i915_address_space *vm,
&start, start + length, vm->top);
}
+static void __gen8_ppgtt_foreach(struct i915_address_space *vm,
+ struct i915_page_directory *pd,
+ u64 *start, u64 end, int lvl,
+ void (*fn)(struct i915_address_space *vm,
+ struct i915_page_table *pt,
+ void *data),
+ void *data)
+{
+ unsigned int idx, len;
+
+ len = gen8_pd_range(*start, end, lvl--, &idx);
+
+ spin_lock(&pd->lock);
+ do {
+ struct i915_page_table *pt = pd->entry[idx];
+
+ atomic_inc(&pt->used);
+ spin_unlock(&pd->lock);
+
+ if (lvl) {
+ __gen8_ppgtt_foreach(vm, as_pd(pt), start, end, lvl,
+ fn, data);
+ } else {
+ fn(vm, pt, data);
+ *start += gen8_pt_count(*start, end);
+ }
+
+ spin_lock(&pd->lock);
+ atomic_dec(&pt->used);
+ } while (idx++, --len);
+ spin_unlock(&pd->lock);
+}
+
+static void gen8_ppgtt_foreach(struct i915_address_space *vm,
+ u64 start, u64 length,
+ void (*fn)(struct i915_address_space *vm,
+ struct i915_page_table *pt,
+ void *data),
+ void *data)
+{
+ start >>= GEN8_PTE_SHIFT;
+ length >>= GEN8_PTE_SHIFT;
+
+ __gen8_ppgtt_foreach(vm, i915_vm_to_ppgtt(vm)->pd,
+ &start, start + length, vm->top,
+ fn, data);
+}
+
static __always_inline u64
gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
struct i915_page_directory *pdp,
@@ -552,6 +600,24 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
}
}
+static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
+ dma_addr_t addr,
+ u64 offset,
+ enum i915_cache_level level,
+ u32 flags)
+{
+ u64 idx = offset >> GEN8_PTE_SHIFT;
+ struct i915_page_directory * const pdp =
+ gen8_pdp_for_page_index(vm, idx);
+ struct i915_page_directory *pd =
+ i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+ gen8_pte_t *vaddr;
+
+ vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+ vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+ clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
+}
+
static int gen8_init_scratch(struct i915_address_space *vm)
{
u32 pte_flags;
@@ -731,8 +797,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries = gen8_ppgtt_insert;
+ ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
ppgtt->vm.clear_range = gen8_ppgtt_clear;
+ ppgtt->vm.foreach = gen8_ppgtt_foreach;
ppgtt->vm.pte_encode = gen8_pte_encode;
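gen8_ppgtt_foreach() above gives the new vm->foreach hook a depth-first walk over the live page tables in a range, holding pt->used across each level while the callback runs. A hypothetical callback matching the signature introduced by this patch (count_pts is illustrative only):

static void count_pts(struct i915_address_space *vm,
		      struct i915_page_table *pt, void *data)
{
	(*(unsigned int *)data)++;	/* one live leaf page table */
}

/* usage (sketch): vm->foreach(vm, start, length, count_pts, &n); */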
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 38cc42783dfb..209cf265bf74 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -15,28 +15,14 @@
#include "intel_gt_pm.h"
#include "intel_gt_requests.h"
-static bool irq_enable(struct intel_engine_cs *engine)
+static bool irq_enable(struct intel_breadcrumbs *b)
{
- if (!engine->irq_enable)
- return false;
-
- /* Caller disables interrupts */
- spin_lock(&engine->gt->irq_lock);
- engine->irq_enable(engine);
- spin_unlock(&engine->gt->irq_lock);
-
- return true;
+ return intel_engine_irq_enable(b->irq_engine);
}
-static void irq_disable(struct intel_engine_cs *engine)
+static void irq_disable(struct intel_breadcrumbs *b)
{
- if (!engine->irq_disable)
- return;
-
- /* Caller disables interrupts */
- spin_lock(&engine->gt->irq_lock);
- engine->irq_disable(engine);
- spin_unlock(&engine->gt->irq_lock);
+ intel_engine_irq_disable(b->irq_engine);
}
static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
@@ -57,7 +43,7 @@ static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
WRITE_ONCE(b->irq_armed, true);
/* Requests may have completed before we could enable the interrupt. */
- if (!b->irq_enabled++ && irq_enable(b->irq_engine))
+ if (!b->irq_enabled++ && b->irq_enable(b))
irq_work_queue(&b->irq_work);
}
@@ -76,7 +62,7 @@ static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b)
{
GEM_BUG_ON(!b->irq_enabled);
if (!--b->irq_enabled)
- irq_disable(b->irq_engine);
+ b->irq_disable(b);
WRITE_ONCE(b->irq_armed, false);
intel_gt_pm_put_async(b->irq_engine->gt);
@@ -259,6 +245,9 @@ static void signal_irq_work(struct irq_work *work)
llist_entry(signal, typeof(*rq), signal_node);
struct list_head cb_list;
+ if (rq->engine->sched_engine->retire_inflight_request_prio)
+ rq->engine->sched_engine->retire_inflight_request_prio(rq);
+
spin_lock(&rq->lock);
list_replace(&rq->fence.cb_list, &cb_list);
__dma_fence_signal__timestamp(&rq->fence, timestamp);
@@ -281,7 +270,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
if (!b)
return NULL;
- b->irq_engine = irq_engine;
+ kref_init(&b->ref);
spin_lock_init(&b->signalers_lock);
INIT_LIST_HEAD(&b->signalers);
@@ -290,6 +279,10 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
spin_lock_init(&b->irq_lock);
init_irq_work(&b->irq_work, signal_irq_work);
+ b->irq_engine = irq_engine;
+ b->irq_enable = irq_enable;
+ b->irq_disable = irq_disable;
+
return b;
}
@@ -303,9 +296,9 @@ void intel_breadcrumbs_reset(struct intel_breadcrumbs *b)
spin_lock_irqsave(&b->irq_lock, flags);
if (b->irq_enabled)
- irq_enable(b->irq_engine);
+ b->irq_enable(b);
else
- irq_disable(b->irq_engine);
+ b->irq_disable(b);
spin_unlock_irqrestore(&b->irq_lock, flags);
}
@@ -325,11 +318,14 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b)
}
}
-void intel_breadcrumbs_free(struct intel_breadcrumbs *b)
+void intel_breadcrumbs_free(struct kref *kref)
{
+ struct intel_breadcrumbs *b = container_of(kref, typeof(*b), ref);
+
irq_work_sync(&b->irq_work);
GEM_BUG_ON(!list_empty(&b->signalers));
GEM_BUG_ON(b->irq_armed);
+
kfree(b);
}

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
index 3ce5ce270b04..be0d4f379a85 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
@@ -9,7 +9,7 @@
#include <linux/atomic.h>
#include <linux/irq_work.h>
-#include "intel_engine_types.h"
+#include "intel_breadcrumbs_types.h"
struct drm_printer;
struct i915_request;
@@ -17,7 +17,7 @@ struct intel_breadcrumbs;
struct intel_breadcrumbs *
intel_breadcrumbs_create(struct intel_engine_cs *irq_engine);
-void intel_breadcrumbs_free(struct intel_breadcrumbs *b);
+void intel_breadcrumbs_free(struct kref *kref);
void intel_breadcrumbs_reset(struct intel_breadcrumbs *b);
void __intel_breadcrumbs_park(struct intel_breadcrumbs *b);
@@ -48,4 +48,16 @@ void i915_request_cancel_breadcrumb(struct i915_request *request);
void intel_context_remove_breadcrumbs(struct intel_context *ce,
struct intel_breadcrumbs *b);
+static inline struct intel_breadcrumbs *
+intel_breadcrumbs_get(struct intel_breadcrumbs *b)
+{
+ kref_get(&b->ref);
+ return b;
+}
+
+static inline void intel_breadcrumbs_put(struct intel_breadcrumbs *b)
+{
+ kref_put(&b->ref, intel_breadcrumbs_free);
+}
+
#endif /* __INTEL_BREADCRUMBS__ */
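intel_breadcrumbs_free() now takes a kref so it can serve directly as the kref_put() release callback, and the get/put helpers above let multiple owners (for example the GuC virtual engines later in this series) share one breadcrumbs object. A self-contained userspace analogue of the pattern, with illustrative names:

#include <stdatomic.h>
#include <stdlib.h>

struct obj { atomic_int ref; };

static struct obj *obj_get(struct obj *o)
{
	atomic_fetch_add(&o->ref, 1);	/* kref_get() */
	return o;
}

static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->ref, 1) == 1)
		free(o);	/* last reference: run the release callback */
}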
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
index 3a084ce8ff5e..72dfd3748c4c 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
@@ -7,10 +7,13 @@
#define __INTEL_BREADCRUMBS_TYPES__
#include <linux/irq_work.h>
+#include <linux/kref.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/types.h>
+#include "intel_engine_types.h"
+
/*
* Rather than have every client wait upon all user interrupts,
* with the herd waking after every interrupt and each doing the
@@ -29,6 +32,7 @@
* the overhead of waking that client is much preferred.
*/
struct intel_breadcrumbs {
+ struct kref ref;
atomic_t active;
spinlock_t signalers_lock; /* protects the list of signalers */
@@ -42,7 +46,10 @@ struct intel_breadcrumbs {
bool irq_armed;
/* Not all breadcrumbs are attached to physical HW */
+ intel_engine_mask_t engine_mask;
struct intel_engine_cs *irq_engine;
+ bool (*irq_enable)(struct intel_breadcrumbs *b);
+ void (*irq_disable)(struct intel_breadcrumbs *b);
};
#endif /* __INTEL_BREADCRUMBS_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 4033184f13b9..745e84c72c90 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -7,28 +7,26 @@
#include "gem/i915_gem_pm.h"
#include "i915_drv.h"
-#include "i915_globals.h"
+#include "i915_trace.h"
#include "intel_context.h"
#include "intel_engine.h"
#include "intel_engine_pm.h"
#include "intel_ring.h"
-static struct i915_global_context {
- struct i915_global base;
- struct kmem_cache *slab_ce;
-} global;
+static struct kmem_cache *slab_ce;
static struct intel_context *intel_context_alloc(void)
{
- return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
+ return kmem_cache_zalloc(slab_ce, GFP_KERNEL);
}
static void rcu_context_free(struct rcu_head *rcu)
{
struct intel_context *ce = container_of(rcu, typeof(*ce), rcu);
- kmem_cache_free(global.slab_ce, ce);
+ trace_intel_context_free(ce);
+ kmem_cache_free(slab_ce, ce);
}
void intel_context_free(struct intel_context *ce)
@@ -46,6 +44,7 @@ intel_context_create(struct intel_engine_cs *engine)
return ERR_PTR(-ENOMEM);
intel_context_init(ce, engine);
+ trace_intel_context_create(ce);
return ce;
}
@@ -80,7 +79,7 @@ static int intel_context_active_acquire(struct intel_context *ce)
__i915_active_acquire(&ce->active);
- if (intel_context_is_barrier(ce))
+ if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
return 0;
/* Preallocate tracking nodes */
@@ -268,6 +267,8 @@ int __intel_context_do_pin_ww(struct intel_context *ce,
GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
+ trace_intel_context_do_pin(ce);
+
err_unlock:
mutex_unlock(&ce->pin_mutex);
err_post_unpin:
@@ -306,9 +307,9 @@ retry:
return err;
}
-void intel_context_unpin(struct intel_context *ce)
+void __intel_context_do_unpin(struct intel_context *ce, int sub)
{
- if (!atomic_dec_and_test(&ce->pin_count))
+ if (!atomic_sub_and_test(sub, &ce->pin_count))
return;
CE_TRACE(ce, "unpin\n");
@@ -323,6 +324,7 @@ void intel_context_unpin(struct intel_context *ce)
*/
intel_context_get(ce);
intel_context_active_release(ce);
+ trace_intel_context_do_unpin(ce);
intel_context_put(ce);
}
@@ -360,6 +362,12 @@ static int __intel_context_active(struct i915_active *active)
return 0;
}
+static int sw_fence_dummy_notify(struct i915_sw_fence *sf,
+ enum i915_sw_fence_notify state)
+{
+ return NOTIFY_DONE;
+}
+
void
intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
{
@@ -371,7 +379,8 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
ce->engine = engine;
ce->ops = engine->cops;
ce->sseu = engine->sseu;
- ce->ring = __intel_context_ring_size(SZ_4K);
+ ce->ring = NULL;
+ ce->ring_size = SZ_4K;
ewma_runtime_init(&ce->runtime.avg);
@@ -383,6 +392,22 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
mutex_init(&ce->pin_mutex);
+ spin_lock_init(&ce->guc_state.lock);
+ INIT_LIST_HEAD(&ce->guc_state.fences);
+
+ spin_lock_init(&ce->guc_active.lock);
+ INIT_LIST_HEAD(&ce->guc_active.requests);
+
+ ce->guc_id = GUC_INVALID_LRC_ID;
+ INIT_LIST_HEAD(&ce->guc_id_link);
+
+ /*
+ * Initialize fence to be complete as this is expected to be complete
+ * unless there is a pending schedule disable outstanding.
+ */
+ i915_sw_fence_init(&ce->guc_blocked, sw_fence_dummy_notify);
+ i915_sw_fence_commit(&ce->guc_blocked);
+
i915_active_init(&ce->active,
__intel_context_active, __intel_context_retire, 0);
}
@@ -397,28 +422,17 @@ void intel_context_fini(struct intel_context *ce)
i915_active_fini(&ce->active);
}
-static void i915_global_context_shrink(void)
-{
- kmem_cache_shrink(global.slab_ce);
-}
-
-static void i915_global_context_exit(void)
+void i915_context_module_exit(void)
{
- kmem_cache_destroy(global.slab_ce);
+ kmem_cache_destroy(slab_ce);
}
-static struct i915_global_context global = { {
- .shrink = i915_global_context_shrink,
- .exit = i915_global_context_exit,
-} };
-
-int __init i915_global_context_init(void)
+int __init i915_context_module_init(void)
{
- global.slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN);
- if (!global.slab_ce)
+ slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN);
+ if (!slab_ce)
return -ENOMEM;
- i915_global_register(&global.base);
return 0;
}
@@ -499,6 +513,26 @@ retry:
return rq;
}
+struct i915_request *intel_context_find_active_request(struct intel_context *ce)
+{
+ struct i915_request *rq, *active = NULL;
+ unsigned long flags;
+
+ GEM_BUG_ON(!intel_engine_uses_guc(ce->engine));
+
+ spin_lock_irqsave(&ce->guc_active.lock, flags);
+ list_for_each_entry_reverse(rq, &ce->guc_active.requests,
+ sched.link) {
+ if (i915_request_completed(rq))
+ break;
+
+ active = rq;
+ }
+ spin_unlock_irqrestore(&ce->guc_active.lock, flags);
+
+ return active;
+}
+
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_context.c"
#endif
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index f83a73a2b39f..c41098950746 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -16,6 +16,7 @@
#include "intel_engine_types.h"
#include "intel_ring_types.h"
#include "intel_timeline_types.h"
+#include "i915_trace.h"
#define CE_TRACE(ce, fmt, ...) do { \
const struct intel_context *ce__ = (ce); \
@@ -30,6 +31,9 @@ void intel_context_init(struct intel_context *ce,
struct intel_engine_cs *engine);
void intel_context_fini(struct intel_context *ce);
+void i915_context_module_exit(void);
+int i915_context_module_init(void);
+
struct intel_context *
intel_context_create(struct intel_engine_cs *engine);
@@ -69,6 +73,13 @@ intel_context_is_pinned(struct intel_context *ce)
return atomic_read(&ce->pin_count);
}
+static inline void intel_context_cancel_request(struct intel_context *ce,
+ struct i915_request *rq)
+{
+ GEM_BUG_ON(!ce->ops->cancel_request);
+ return ce->ops->cancel_request(ce, rq);
+}
+
/**
* intel_context_unlock_pinned - Releases the earlier locking of 'pinned' status
* @ce - the context
@@ -113,7 +124,32 @@ static inline void __intel_context_pin(struct intel_context *ce)
atomic_inc(&ce->pin_count);
}
-void intel_context_unpin(struct intel_context *ce);
+void __intel_context_do_unpin(struct intel_context *ce, int sub);
+
+static inline void intel_context_sched_disable_unpin(struct intel_context *ce)
+{
+ __intel_context_do_unpin(ce, 2);
+}
+
+static inline void intel_context_unpin(struct intel_context *ce)
+{
+ if (!ce->ops->sched_disable) {
+ __intel_context_do_unpin(ce, 1);
+ } else {
+ /*
+ * Move ownership of this pin to the scheduling disable which is
+ * an async operation. When that operation completes the above
+ * intel_context_sched_disable_unpin is called potentially
+ * unpinning the context.
+ */
+ while (!atomic_add_unless(&ce->pin_count, -1, 1)) {
+ if (atomic_cmpxchg(&ce->pin_count, 1, 2) == 1) {
+ ce->ops->sched_disable(ce);
+ break;
+ }
+ }
+ }
+}
void intel_context_enter_engine(struct intel_context *ce);
void intel_context_exit_engine(struct intel_context *ce);
@@ -175,10 +211,8 @@ int intel_context_prepare_remote_request(struct intel_context *ce,
struct i915_request *intel_context_create_request(struct intel_context *ce);
-static inline struct intel_ring *__intel_context_ring_size(u64 sz)
-{
- return u64_to_ptr(struct intel_ring, sz);
-}
+struct i915_request *
+intel_context_find_active_request(struct intel_context *ce);
static inline bool intel_context_is_barrier(const struct intel_context *ce)
{
@@ -220,6 +254,18 @@ static inline bool intel_context_set_banned(struct intel_context *ce)
return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
}
+static inline bool intel_context_ban(struct intel_context *ce,
+ struct i915_request *rq)
+{
+ bool ret = intel_context_set_banned(ce);
+
+ trace_intel_context_ban(ce);
+ if (ce->ops->ban)
+ ce->ops->ban(ce, rq);
+
+ return ret;
+}
+
static inline bool
intel_context_force_single_submission(const struct intel_context *ce)
{
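The open-coded loop in intel_context_unpin() above transfers ownership of the final pin to the asynchronous schedule-disable: an ordinary unpin just decrements, but when the count would reach zero it is bumped from 1 to 2 instead, and both references are later dropped in one step by intel_context_sched_disable_unpin(). A userspace analogue of that hand-off (a sketch, not driver code):

#include <stdatomic.h>

static void unpin(atomic_int *pin_count, void (*sched_disable)(void))
{
	int old = atomic_load(pin_count);

	for (;;) {
		if (old > 1) {
			/* not the last pin: plain decrement */
			if (atomic_compare_exchange_weak(pin_count, &old, old - 1))
				return;
		} else {
			int one = 1;

			/* last pin: convert 1 -> 2 and hand both
			 * references to the async disable path */
			if (atomic_compare_exchange_strong(pin_count, &one, 2)) {
				sched_disable();
				return;
			}
			old = atomic_load(pin_count);
		}
	}
}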
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.c b/drivers/gpu/drm/i915/gt/intel_context_param.c
deleted file mode 100644
index 65dcd090245d..000000000000
--- a/drivers/gpu/drm/i915/gt/intel_context_param.c
+++ /dev/null
@@ -1,63 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_active.h"
-#include "intel_context.h"
-#include "intel_context_param.h"
-#include "intel_ring.h"
-
-int intel_context_set_ring_size(struct intel_context *ce, long sz)
-{
- int err;
-
- if (intel_context_lock_pinned(ce))
- return -EINTR;
-
- err = i915_active_wait(&ce->active);
- if (err < 0)
- goto unlock;
-
- if (intel_context_is_pinned(ce)) {
- err = -EBUSY; /* In active use, come back later! */
- goto unlock;
- }
-
- if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
- struct intel_ring *ring;
-
- /* Replace the existing ringbuffer */
- ring = intel_engine_create_ring(ce->engine, sz);
- if (IS_ERR(ring)) {
- err = PTR_ERR(ring);
- goto unlock;
- }
-
- intel_ring_put(ce->ring);
- ce->ring = ring;
-
- /* Context image will be updated on next pin */
- } else {
- ce->ring = __intel_context_ring_size(sz);
- }
-
-unlock:
- intel_context_unlock_pinned(ce);
- return err;
-}
-
-long intel_context_get_ring_size(struct intel_context *ce)
-{
- long sz = (unsigned long)READ_ONCE(ce->ring);
-
- if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
- if (intel_context_lock_pinned(ce))
- return -EINTR;
-
- sz = ce->ring->size;
- intel_context_unlock_pinned(ce);
- }
-
- return sz;
-}
diff --git a/drivers/gpu/drm/i915/gt/intel_context_param.h b/drivers/gpu/drm/i915/gt/intel_context_param.h
index 3ecacc675f41..0c69cb42d075 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_param.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_param.h
@@ -10,14 +10,10 @@
#include "intel_context.h"
-int intel_context_set_ring_size(struct intel_context *ce, long sz);
-long intel_context_get_ring_size(struct intel_context *ce);
-
-static inline int
+static inline void
intel_context_set_watchdog_us(struct intel_context *ce, u64 timeout_us)
{
ce->watchdog.timeout_us = timeout_us;
- return 0;
}
#endif /* INTEL_CONTEXT_PARAM_H */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index ed8c447a7346..e54351a170e2 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -13,12 +13,14 @@
#include <linux/types.h>
#include "i915_active_types.h"
+#include "i915_sw_fence.h"
#include "i915_utils.h"
#include "intel_engine_types.h"
#include "intel_sseu.h"
-#define CONTEXT_REDZONE POISON_INUSE
+#include "uc/intel_guc_fwif.h"
+#define CONTEXT_REDZONE POISON_INUSE
DECLARE_EWMA(runtime, 3, 8);
struct i915_gem_context;
@@ -35,16 +37,29 @@ struct intel_context_ops {
int (*alloc)(struct intel_context *ce);
+ void (*ban)(struct intel_context *ce, struct i915_request *rq);
+
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **vaddr);
int (*pin)(struct intel_context *ce, void *vaddr);
void (*unpin)(struct intel_context *ce);
void (*post_unpin)(struct intel_context *ce);
+ void (*cancel_request)(struct intel_context *ce,
+ struct i915_request *rq);
+
void (*enter)(struct intel_context *ce);
void (*exit)(struct intel_context *ce);
+ void (*sched_disable)(struct intel_context *ce);
+
void (*reset)(struct intel_context *ce);
void (*destroy)(struct kref *kref);
+
+ /* virtual engine/context interface */
+ struct intel_context *(*create_virtual)(struct intel_engine_cs **engine,
+ unsigned int count);
+ struct intel_engine_cs *(*get_sibling)(struct intel_engine_cs *engine,
+ unsigned int sibling);
};
struct intel_context {
@@ -82,6 +97,7 @@ struct intel_context {
spinlock_t signal_lock; /* protects signals, the list of requests */
struct i915_vma *state;
+ u32 ring_size;
struct intel_ring *ring;
struct intel_timeline *timeline;
@@ -95,6 +111,7 @@ struct intel_context {
#define CONTEXT_BANNED 6
#define CONTEXT_FORCE_SINGLE_SUBMISSION 7
#define CONTEXT_NOPREEMPT 8
+#define CONTEXT_LRCA_DIRTY 9
struct {
u64 timeout_us;
@@ -136,6 +153,51 @@ struct intel_context {
struct intel_sseu sseu;
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
+
+ struct {
+ /** lock: protects everything in guc_state */
+ spinlock_t lock;
+ /**
+ * sched_state: scheduling state of this context using GuC
+ * submission
+ */
+ u16 sched_state;
+ /*
+ * fences: maintains a list of requests that have a submit
+ * fence related to GuC submission
+ */
+ struct list_head fences;
+ } guc_state;
+
+ struct {
+ /** lock: protects everything in guc_active */
+ spinlock_t lock;
+ /** requests: active requests on this context */
+ struct list_head requests;
+ } guc_active;
+
+ /* GuC scheduling state flags that do not require a lock. */
+ atomic_t guc_sched_state_no_lock;
+
+ /* GuC LRC descriptor ID */
+ u16 guc_id;
+
+ /* GuC LRC descriptor reference count */
+ atomic_t guc_id_ref;
+
+ /*
+ * GuC ID link - in list when unpinned but guc_id still valid in GuC
+ */
+ struct list_head guc_id_link;
+
+ /* GuC context blocked fence */
+ struct i915_sw_fence guc_blocked;
+
+ /*
+ * GuC priority management
+ */
+ u8 guc_prio;
+ u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
};
#endif /* __INTEL_CONTEXT_TYPES__ */
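The new per-context GuC fields fall into three consistency classes: guc_state and guc_active each carry a spinlock for the fields they document, while guc_sched_state_no_lock is explicitly lock-free. A sketch of the access rules the field comments imply; SCHED_STATE_EXAMPLE is a hypothetical flag, not defined in this patch:

	unsigned long flags;

	/* fields documented under guc_state are touched under its lock */
	spin_lock_irqsave(&ce->guc_state.lock, flags);
	ce->guc_state.sched_state |= SCHED_STATE_EXAMPLE;
	spin_unlock_irqrestore(&ce->guc_state.lock, flags);

	/* lock-free scheduling flags go through the atomic instead */
	atomic_or(SCHED_STATE_EXAMPLE, &ce->guc_sched_state_no_lock);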
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 8d9184920c51..87579affb952 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -19,7 +19,9 @@
#include "intel_workarounds.h"
struct drm_printer;
+struct intel_context;
struct intel_gt;
+struct lock_class_key;
/* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill,
* but keeps the logic simple. Indeed, the whole purpose of this macro is just
@@ -123,20 +125,6 @@ execlists_active(const struct intel_engine_execlists *execlists)
return active;
}
-static inline void
-execlists_active_lock_bh(struct intel_engine_execlists *execlists)
-{
- local_bh_disable(); /* prevent local softirq and lock recursion */
- tasklet_lock(&execlists->tasklet);
-}
-
-static inline void
-execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
-{
- tasklet_unlock(&execlists->tasklet);
- local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
-}
-
struct i915_request *
execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
@@ -186,11 +174,12 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
#define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT * sizeof(u32))
#define I915_GEM_HWS_SEQNO 0x40
#define I915_GEM_HWS_SEQNO_ADDR (I915_GEM_HWS_SEQNO * sizeof(u32))
+#define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32))
#define I915_GEM_HWS_SCRATCH 0x80
#define I915_HWS_CSB_BUF0_INDEX 0x10
#define I915_HWS_CSB_WRITE_INDEX 0x1f
-#define CNL_HWS_CSB_WRITE_INDEX 0x2f
+#define ICL_HWS_CSB_WRITE_INDEX 0x2f
void intel_engine_stop(struct intel_engine_cs *engine);
void intel_engine_cleanup(struct intel_engine_cs *engine);
@@ -223,6 +212,9 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
void intel_engine_init_execlists(struct intel_engine_cs *engine);
+bool intel_engine_irq_enable(struct intel_engine_cs *engine);
+void intel_engine_irq_disable(struct intel_engine_cs *engine);
+
static inline void __intel_engine_reset(struct intel_engine_cs *engine,
bool stalled)
{
@@ -248,17 +240,27 @@ __printf(3, 4)
void intel_engine_dump(struct intel_engine_cs *engine,
struct drm_printer *m,
const char *header, ...);
+void intel_engine_dump_active_requests(struct list_head *requests,
+ struct i915_request *hung_rq,
+ struct drm_printer *m);
ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine,
ktime_t *now);
struct i915_request *
-intel_engine_find_active_request(struct intel_engine_cs *engine);
+intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine);
u32 intel_engine_context_size(struct intel_gt *gt, u8 class);
+struct intel_context *
+intel_engine_create_pinned_context(struct intel_engine_cs *engine,
+ struct i915_address_space *vm,
+ unsigned int ring_size,
+ unsigned int hwsp,
+ struct lock_class_key *key,
+ const char *name);
+
+void intel_engine_destroy_pinned_context(struct intel_context *ce);
-void intel_engine_init_active(struct intel_engine_cs *engine,
- unsigned int subclass);
#define ENGINE_PHYSICAL 0
#define ENGINE_MOCK 1
#define ENGINE_VIRTUAL 2
@@ -277,13 +279,60 @@ intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
return intel_engine_has_preemption(engine);
}
+struct intel_context *
+intel_engine_create_virtual(struct intel_engine_cs **siblings,
+ unsigned int count);
+
+static inline bool
+intel_virtual_engine_has_heartbeat(const struct intel_engine_cs *engine)
+{
+ /*
+ * For non-GuC submission we expect the back-end to look at the
+ * heartbeat status of the actual physical engine that the work
+ * has been (or is being) scheduled on, so we should only reach
+ * here with GuC submission enabled.
+ */
+ GEM_BUG_ON(!intel_engine_uses_guc(engine));
+
+ return intel_guc_virtual_engine_has_heartbeat(engine);
+}
+
static inline bool
intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
return false;
- return READ_ONCE(engine->props.heartbeat_interval_ms);
+ if (intel_engine_is_virtual(engine))
+ return intel_virtual_engine_has_heartbeat(engine);
+ else
+ return READ_ONCE(engine->props.heartbeat_interval_ms);
+}
+
+static inline struct intel_engine_cs *
+intel_engine_get_sibling(struct intel_engine_cs *engine, unsigned int sibling)
+{
+ GEM_BUG_ON(!intel_engine_is_virtual(engine));
+ return engine->cops->get_sibling(engine, sibling);
+}
+
+static inline void
+intel_engine_set_hung_context(struct intel_engine_cs *engine,
+ struct intel_context *ce)
+{
+ engine->hung_ce = ce;
+}
+
+static inline void
+intel_engine_clear_hung_context(struct intel_engine_cs *engine)
+{
+ intel_engine_set_hung_context(engine, NULL);
+}
+
+static inline struct intel_context *
+intel_engine_get_hung_context(struct intel_engine_cs *engine)
+{
+ return engine->hung_ce;
}
#endif /* _INTEL_RINGBUFFER_H_ */
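The hung-context accessors added above exist because, with GuC submission, the driver can no longer derive the guilty request from the execlists runqueue; the reset path records the context and error capture reads it back. The intended flow, pieced together from the helpers in this header and intel_context_find_active_request() (a sketch, not verbatim driver code):

	/* in the reset notification handler */
	intel_engine_set_hung_context(engine, ce);

	/* later, in error capture */
	ce = intel_engine_get_hung_context(engine);
	if (ce)
		rq = intel_context_find_active_request(ce);
	intel_engine_clear_hung_context(engine);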
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 7f03df236613..0d9105a31d84 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -35,14 +35,12 @@
#define DEFAULT_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
-#define GEN10_LR_CONTEXT_RENDER_SIZE (18 * PAGE_SIZE)
#define GEN11_LR_CONTEXT_RENDER_SIZE (14 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_OTHER_SIZE ( 2 * PAGE_SIZE)
#define MAX_MMIO_BASES 3
struct engine_info {
- unsigned int hw_id;
u8 class;
u8 instance;
/* mmio bases table *must* be sorted in reverse graphics_ver order */
@@ -54,7 +52,6 @@ struct engine_info {
static const struct engine_info intel_engines[] = {
[RCS0] = {
- .hw_id = RCS0_HW,
.class = RENDER_CLASS,
.instance = 0,
.mmio_bases = {
@@ -62,7 +59,6 @@ static const struct engine_info intel_engines[] = {
},
},
[BCS0] = {
- .hw_id = BCS0_HW,
.class = COPY_ENGINE_CLASS,
.instance = 0,
.mmio_bases = {
@@ -70,7 +66,6 @@ static const struct engine_info intel_engines[] = {
},
},
[VCS0] = {
- .hw_id = VCS0_HW,
.class = VIDEO_DECODE_CLASS,
.instance = 0,
.mmio_bases = {
@@ -80,7 +75,6 @@ static const struct engine_info intel_engines[] = {
},
},
[VCS1] = {
- .hw_id = VCS1_HW,
.class = VIDEO_DECODE_CLASS,
.instance = 1,
.mmio_bases = {
@@ -89,7 +83,6 @@ static const struct engine_info intel_engines[] = {
},
},
[VCS2] = {
- .hw_id = VCS2_HW,
.class = VIDEO_DECODE_CLASS,
.instance = 2,
.mmio_bases = {
@@ -97,15 +90,41 @@ static const struct engine_info intel_engines[] = {
},
},
[VCS3] = {
- .hw_id = VCS3_HW,
.class = VIDEO_DECODE_CLASS,
.instance = 3,
.mmio_bases = {
{ .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE }
},
},
+ [VCS4] = {
+ .class = VIDEO_DECODE_CLASS,
+ .instance = 4,
+ .mmio_bases = {
+ { .graphics_ver = 12, .base = XEHP_BSD5_RING_BASE }
+ },
+ },
+ [VCS5] = {
+ .class = VIDEO_DECODE_CLASS,
+ .instance = 5,
+ .mmio_bases = {
+ { .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE }
+ },
+ },
+ [VCS6] = {
+ .class = VIDEO_DECODE_CLASS,
+ .instance = 6,
+ .mmio_bases = {
+ { .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE }
+ },
+ },
+ [VCS7] = {
+ .class = VIDEO_DECODE_CLASS,
+ .instance = 7,
+ .mmio_bases = {
+ { .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE }
+ },
+ },
[VECS0] = {
- .hw_id = VECS0_HW,
.class = VIDEO_ENHANCEMENT_CLASS,
.instance = 0,
.mmio_bases = {
@@ -114,13 +133,26 @@ static const struct engine_info intel_engines[] = {
},
},
[VECS1] = {
- .hw_id = VECS1_HW,
.class = VIDEO_ENHANCEMENT_CLASS,
.instance = 1,
.mmio_bases = {
{ .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE }
},
},
+ [VECS2] = {
+ .class = VIDEO_ENHANCEMENT_CLASS,
+ .instance = 2,
+ .mmio_bases = {
+ { .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE }
+ },
+ },
+ [VECS3] = {
+ .class = VIDEO_ENHANCEMENT_CLASS,
+ .instance = 3,
+ .mmio_bases = {
+ { .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE }
+ },
+ },
};
/**
@@ -153,8 +185,6 @@ u32 intel_engine_context_size(struct intel_gt *gt, u8 class)
case 12:
case 11:
return GEN11_LR_CONTEXT_RENDER_SIZE;
- case 10:
- return GEN10_LR_CONTEXT_RENDER_SIZE;
case 9:
return GEN9_LR_CONTEXT_RENDER_SIZE;
case 8:
@@ -269,6 +299,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH));
BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
+ BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1));
+ BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1));
if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine)))
return -EINVAL;
@@ -294,7 +326,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
engine->i915 = i915;
engine->gt = gt;
engine->uncore = gt->uncore;
- engine->hw_id = info->hw_id;
guc_class = engine_class_to_guc_class(info->class);
engine->guc_id = MAKE_GUC_ID(guc_class, info->instance);
engine->mmio_base = __engine_mmio_base(i915, info->mmio_bases);
@@ -328,9 +359,6 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
if (engine->context_size)
DRIVER_CAPS(i915)->has_logical_contexts = true;
- /* Nothing to do here, execute in order of dependencies */
- engine->schedule = NULL;
-
ewma__engine_latency_init(&engine->latency);
seqcount_init(&engine->stats.lock);
@@ -445,6 +473,28 @@ void intel_engines_free(struct intel_gt *gt)
}
}
+static
+bool gen11_vdbox_has_sfc(struct drm_i915_private *i915,
+ unsigned int physical_vdbox,
+ unsigned int logical_vdbox, u16 vdbox_mask)
+{
+ /*
+ * In Gen11, only even numbered logical VDBOXes are hooked
+ * up to an SFC (Scaler & Format Converter) unit.
+ * In Gen12, Even numbered physical instance always are connected
+ * to an SFC. Odd numbered physical instances have SFC only if
+ * previous even instance is fused off.
+ */
+ if (GRAPHICS_VER(i915) == 12)
+ return (physical_vdbox % 2 == 0) ||
+ !(BIT(physical_vdbox - 1) & vdbox_mask);
+ else if (GRAPHICS_VER(i915) == 11)
+ return logical_vdbox % 2 == 0;
+
+ MISSING_CASE(GRAPHICS_VER(i915));
+ return false;
+}
+
/*
* Determine which engines are fused off in our particular hardware.
* Note that we have a catch-22 situation where we need to be able to access
@@ -471,7 +521,14 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
if (GRAPHICS_VER(i915) < 11)
return info->engine_mask;
- media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
+ /*
+ * On newer platforms the fusing register is called 'enable' and has
+ * enable semantics, while on older platforms it is called 'disable'
+ * and bits have disable semantics.
+ */
+ media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
+ if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
+ media_fuse = ~media_fuse;
vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
@@ -489,13 +546,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
continue;
}
- /*
- * In Gen11, only even numbered logical VDBOXes are
- * hooked up to an SFC (Scaler & Format Converter) unit.
- * In TGL each VDBOX has access to an SFC.
- */
- if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
+ if (gen11_vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
gt->info.vdbox_sfc_access |= BIT(i);
+ logical_vdbox++;
}
drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n",
vdbox_mask, VDBOX_MASK(gt));
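gen11_vdbox_has_sfc() encodes the fusing rule described in its comment. A standalone check of the Gen12 branch; gen12_has_sfc is a test-only restatement of that branch, not driver code:

#include <assert.h>
#include <stdbool.h>

static bool gen12_has_sfc(unsigned int phys, unsigned int vdbox_mask)
{
	return (phys % 2 == 0) || !((1u << (phys - 1)) & vdbox_mask);
}

int main(void)
{
	assert(gen12_has_sfc(0, 0x5));	/* even instances always have SFC */
	assert(!gen12_has_sfc(1, 0x3));	/* VCS0 present, so VCS1 has none */
	assert(gen12_has_sfc(1, 0x2));	/* VCS0 fused off: VCS1 gets its SFC */
	return 0;
}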
@@ -585,9 +638,6 @@ void intel_engine_init_execlists(struct intel_engine_cs *engine)
memset(execlists->pending, 0, sizeof(execlists->pending));
execlists->active =
memset(execlists->inflight, 0, sizeof(execlists->inflight));
-
- execlists->queue_priority_hint = INT_MIN;
- execlists->queue = RB_ROOT_CACHED;
}
static void cleanup_status_page(struct intel_engine_cs *engine)
@@ -714,11 +764,17 @@ static int engine_setup_common(struct intel_engine_cs *engine)
goto err_status;
}
+ engine->sched_engine = i915_sched_engine_create(ENGINE_PHYSICAL);
+ if (!engine->sched_engine) {
+ err = -ENOMEM;
+ goto err_sched_engine;
+ }
+ engine->sched_engine->private_data = engine;
+
err = intel_engine_init_cmd_parser(engine);
if (err)
goto err_cmd_parser;
- intel_engine_init_active(engine, ENGINE_PHYSICAL);
intel_engine_init_execlists(engine);
intel_engine_init__pm(engine);
intel_engine_init_retire(engine);
@@ -737,7 +793,9 @@ static int engine_setup_common(struct intel_engine_cs *engine)
return 0;
err_cmd_parser:
- intel_breadcrumbs_free(engine->breadcrumbs);
+ i915_sched_engine_put(engine->sched_engine);
+err_sched_engine:
+ intel_breadcrumbs_put(engine->breadcrumbs);
err_status:
cleanup_status_page(engine);
return err;
@@ -775,11 +833,11 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
frame->rq.ring = &frame->ring;
mutex_lock(&ce->timeline->mutex);
- spin_lock_irq(&engine->active.lock);
+ spin_lock_irq(&engine->sched_engine->lock);
dw = engine->emit_fini_breadcrumb(&frame->rq, frame->cs) - frame->cs;
- spin_unlock_irq(&engine->active.lock);
+ spin_unlock_irq(&engine->sched_engine->lock);
mutex_unlock(&ce->timeline->mutex);
GEM_BUG_ON(dw & 1); /* RING_TAIL must be qword aligned */
@@ -788,33 +846,13 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
return dw;
}
-void
-intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass)
-{
- INIT_LIST_HEAD(&engine->active.requests);
- INIT_LIST_HEAD(&engine->active.hold);
-
- spin_lock_init(&engine->active.lock);
- lockdep_set_subclass(&engine->active.lock, subclass);
-
- /*
- * Due to an interesting quirk in lockdep's internal debug tracking,
- * after setting a subclass we must ensure the lock is used. Otherwise,
- * nr_unused_locks is incremented once too often.
- */
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
- local_irq_disable();
- lock_map_acquire(&engine->active.lock.dep_map);
- lock_map_release(&engine->active.lock.dep_map);
- local_irq_enable();
-#endif
-}
-
-static struct intel_context *
-create_pinned_context(struct intel_engine_cs *engine,
- unsigned int hwsp,
- struct lock_class_key *key,
- const char *name)
+struct intel_context *
+intel_engine_create_pinned_context(struct intel_engine_cs *engine,
+ struct i915_address_space *vm,
+ unsigned int ring_size,
+ unsigned int hwsp,
+ struct lock_class_key *key,
+ const char *name)
{
struct intel_context *ce;
int err;
@@ -825,6 +863,11 @@ create_pinned_context(struct intel_engine_cs *engine,
__set_bit(CONTEXT_BARRIER_BIT, &ce->flags);
ce->timeline = page_pack_bits(NULL, hwsp);
+ ce->ring = NULL;
+ ce->ring_size = ring_size;
+
+ i915_vm_put(ce->vm);
+ ce->vm = i915_vm_get(vm);
err = intel_context_pin(ce); /* perma-pin so it is always available */
if (err) {
@@ -843,7 +886,7 @@ create_pinned_context(struct intel_engine_cs *engine,
return ce;
}
-static void destroy_pinned_context(struct intel_context *ce)
+void intel_engine_destroy_pinned_context(struct intel_context *ce)
{
struct intel_engine_cs *engine = ce->engine;
struct i915_vma *hwsp = engine->status_page.vma;
@@ -863,8 +906,9 @@ create_kernel_context(struct intel_engine_cs *engine)
{
static struct lock_class_key kernel;
- return create_pinned_context(engine, I915_GEM_HWS_SEQNO_ADDR,
- &kernel, "kernel_context");
+ return intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K,
+ I915_GEM_HWS_SEQNO_ADDR,
+ &kernel, "kernel_context");
}
/**
@@ -907,7 +951,7 @@ static int engine_init_common(struct intel_engine_cs *engine)
return 0;
err_context:
- destroy_pinned_context(ce);
+ intel_engine_destroy_pinned_context(ce);
return ret;
}
@@ -957,10 +1001,10 @@ int intel_engines_init(struct intel_gt *gt)
*/
void intel_engine_cleanup_common(struct intel_engine_cs *engine)
{
- GEM_BUG_ON(!list_empty(&engine->active.requests));
- tasklet_kill(&engine->execlists.tasklet); /* flush the callback */
+ GEM_BUG_ON(!list_empty(&engine->sched_engine->requests));
- intel_breadcrumbs_free(engine->breadcrumbs);
+ i915_sched_engine_put(engine->sched_engine);
+ intel_breadcrumbs_put(engine->breadcrumbs);
intel_engine_fini_retire(engine);
intel_engine_cleanup_cmd_parser(engine);
@@ -969,7 +1013,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
fput(engine->default_state);
if (engine->kernel_context)
- destroy_pinned_context(engine->kernel_context);
+ intel_engine_destroy_pinned_context(engine->kernel_context);
GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
cleanup_status_page(engine);
@@ -1105,45 +1149,8 @@ static u32
read_subslice_reg(const struct intel_engine_cs *engine,
int slice, int subslice, i915_reg_t reg)
{
- struct drm_i915_private *i915 = engine->i915;
- struct intel_uncore *uncore = engine->uncore;
- u32 mcr_mask, mcr_ss, mcr, old_mcr, val;
- enum forcewake_domains fw_domains;
-
- if (GRAPHICS_VER(i915) >= 11) {
- mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
- mcr_ss = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
- } else {
- mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
- mcr_ss = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
- }
-
- fw_domains = intel_uncore_forcewake_for_reg(uncore, reg,
- FW_REG_READ);
- fw_domains |= intel_uncore_forcewake_for_reg(uncore,
- GEN8_MCR_SELECTOR,
- FW_REG_READ | FW_REG_WRITE);
-
- spin_lock_irq(&uncore->lock);
- intel_uncore_forcewake_get__locked(uncore, fw_domains);
-
- old_mcr = mcr = intel_uncore_read_fw(uncore, GEN8_MCR_SELECTOR);
-
- mcr &= ~mcr_mask;
- mcr |= mcr_ss;
- intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
-
- val = intel_uncore_read_fw(uncore, reg);
-
- mcr &= ~mcr_mask;
- mcr |= old_mcr & mcr_mask;
-
- intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, mcr);
-
- intel_uncore_forcewake_put__locked(uncore, fw_domains);
- spin_unlock_irq(&uncore->lock);
-
- return val;
+ return intel_uncore_read_with_mcr_steering(engine->uncore, reg,
+ slice, subslice);
}
/* NB: please notice the memset */
@@ -1243,7 +1250,7 @@ static bool ring_is_idle(struct intel_engine_cs *engine)
void __intel_engine_flush_submission(struct intel_engine_cs *engine, bool sync)
{
- struct tasklet_struct *t = &engine->execlists.tasklet;
+ struct tasklet_struct *t = &engine->sched_engine->tasklet;
if (!t->callback)
return;
@@ -1283,7 +1290,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
intel_engine_flush_submission(engine);
/* ELSP is empty, but there are ready requests? E.g. after reset */
- if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root))
+ if (!i915_sched_engine_is_empty(engine->sched_engine))
return false;
/* Ring stopped? */
@@ -1314,6 +1321,30 @@ bool intel_engines_are_idle(struct intel_gt *gt)
return true;
}
+bool intel_engine_irq_enable(struct intel_engine_cs *engine)
+{
+ if (!engine->irq_enable)
+ return false;
+
+ /* Caller disables interrupts */
+ spin_lock(&engine->gt->irq_lock);
+ engine->irq_enable(engine);
+ spin_unlock(&engine->gt->irq_lock);
+
+ return true;
+}
+
+void intel_engine_irq_disable(struct intel_engine_cs *engine)
+{
+ if (!engine->irq_disable)
+ return;
+
+ /* Caller disables interrupts */
+ spin_lock(&engine->gt->irq_lock);
+ engine->irq_disable(engine);
+ spin_unlock(&engine->gt->irq_lock);
+}
+
void intel_engines_reset_default_submission(struct intel_gt *gt)
{
struct intel_engine_cs *engine;
@@ -1349,7 +1380,7 @@ static struct intel_timeline *get_timeline(struct i915_request *rq)
struct intel_timeline *tl;
/*
- * Even though we are holding the engine->active.lock here, there
+ * Even though we are holding the engine->sched_engine->lock here, there
* is no control over the submission queue per-se and we are
* inspecting the active state at a random point in time, with an
* unknown queue. Play safe and make sure the timeline remains valid.
@@ -1504,8 +1535,8 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
yesno(test_bit(TASKLET_STATE_SCHED,
- &engine->execlists.tasklet.state)),
- enableddisabled(!atomic_read(&engine->execlists.tasklet.count)),
+ &engine->sched_engine->tasklet.state)),
+ enableddisabled(!atomic_read(&engine->sched_engine->tasklet.count)),
repr_timer(&engine->execlists.preempt),
repr_timer(&engine->execlists.timer));
@@ -1529,7 +1560,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
idx, hws[idx * 2], hws[idx * 2 + 1]);
}
- execlists_active_lock_bh(execlists);
+ i915_sched_engine_active_lock_bh(engine->sched_engine);
rcu_read_lock();
for (port = execlists->active; (rq = *port); port++) {
char hdr[160];
@@ -1560,7 +1591,7 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
i915_request_show(m, rq, hdr, 0);
}
rcu_read_unlock();
- execlists_active_unlock_bh(execlists);
+ i915_sched_engine_active_unlock_bh(engine->sched_engine);
} else if (GRAPHICS_VER(dev_priv) > 6) {
drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
ENGINE_READ(engine, RING_PP_DIR_BASE));
@@ -1650,6 +1681,98 @@ static void print_properties(struct intel_engine_cs *engine,
read_ul(&engine->defaults, p->offset));
}
+static void engine_dump_request(struct i915_request *rq, struct drm_printer *m, const char *msg)
+{
+ struct intel_timeline *tl = get_timeline(rq);
+
+ i915_request_show(m, rq, msg, 0);
+
+ drm_printf(m, "\t\tring->start: 0x%08x\n",
+ i915_ggtt_offset(rq->ring->vma));
+ drm_printf(m, "\t\tring->head: 0x%08x\n",
+ rq->ring->head);
+ drm_printf(m, "\t\tring->tail: 0x%08x\n",
+ rq->ring->tail);
+ drm_printf(m, "\t\tring->emit: 0x%08x\n",
+ rq->ring->emit);
+ drm_printf(m, "\t\tring->space: 0x%08x\n",
+ rq->ring->space);
+
+ if (tl) {
+ drm_printf(m, "\t\tring->hwsp: 0x%08x\n",
+ tl->hwsp_offset);
+ intel_timeline_put(tl);
+ }
+
+ print_request_ring(m, rq);
+
+ if (rq->context->lrc_reg_state) {
+ drm_printf(m, "Logical Ring Context:\n");
+ hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
+ }
+}
+
+void intel_engine_dump_active_requests(struct list_head *requests,
+ struct i915_request *hung_rq,
+ struct drm_printer *m)
+{
+ struct i915_request *rq;
+ const char *msg;
+ enum i915_request_state state;
+
+ list_for_each_entry(rq, requests, sched.link) {
+ if (rq == hung_rq)
+ continue;
+
+ state = i915_test_request_state(rq);
+ if (state < I915_REQUEST_QUEUED)
+ continue;
+
+ if (state == I915_REQUEST_ACTIVE)
+ msg = "\t\tactive on engine";
+ else
+ msg = "\t\tactive in queue";
+
+ engine_dump_request(rq, m, msg);
+ }
+}
+
+static void engine_dump_active_requests(struct intel_engine_cs *engine, struct drm_printer *m)
+{
+ struct i915_request *hung_rq = NULL;
+ struct intel_context *ce;
+ bool guc;
+
+ /*
+ * No need for an engine->irq_seqno_barrier() before the seqno reads.
+ * The GPU is still running so requests are still executing and any
+ * hardware reads will be out of date by the time they are reported.
+ * But the intention here is just to report an instantaneous snapshot
+ * so that's fine.
+ */
+ lockdep_assert_held(&engine->sched_engine->lock);
+
+ drm_printf(m, "\tRequests:\n");
+
+ guc = intel_uc_uses_guc_submission(&engine->gt->uc);
+ if (guc) {
+ ce = intel_engine_get_hung_context(engine);
+ if (ce)
+ hung_rq = intel_context_find_active_request(ce);
+ } else {
+ hung_rq = intel_engine_execlist_find_hung_request(engine);
+ }
+
+ if (hung_rq)
+ engine_dump_request(hung_rq, m, "\t\thung");
+
+ if (guc)
+ intel_guc_dump_active_requests(engine, hung_rq, m);
+ else
+ intel_engine_dump_active_requests(&engine->sched_engine->requests,
+ hung_rq, m);
+}
+
void intel_engine_dump(struct intel_engine_cs *engine,
struct drm_printer *m,
const char *header, ...)
@@ -1694,41 +1817,12 @@ void intel_engine_dump(struct intel_engine_cs *engine,
i915_reset_count(error));
print_properties(engine, m);
- drm_printf(m, "\tRequests:\n");
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
+ engine_dump_active_requests(engine, m);
- spin_lock_irqsave(&engine->active.lock, flags);
- rq = intel_engine_find_active_request(engine);
- if (rq) {
- struct intel_timeline *tl = get_timeline(rq);
-
- i915_request_show(m, rq, "\t\tactive ", 0);
-
- drm_printf(m, "\t\tring->start: 0x%08x\n",
- i915_ggtt_offset(rq->ring->vma));
- drm_printf(m, "\t\tring->head: 0x%08x\n",
- rq->ring->head);
- drm_printf(m, "\t\tring->tail: 0x%08x\n",
- rq->ring->tail);
- drm_printf(m, "\t\tring->emit: 0x%08x\n",
- rq->ring->emit);
- drm_printf(m, "\t\tring->space: 0x%08x\n",
- rq->ring->space);
-
- if (tl) {
- drm_printf(m, "\t\tring->hwsp: 0x%08x\n",
- tl->hwsp_offset);
- intel_timeline_put(tl);
- }
-
- print_request_ring(m, rq);
-
- if (rq->context->lrc_reg_state) {
- drm_printf(m, "Logical Ring Context:\n");
- hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
- }
- }
- drm_printf(m, "\tOn hold?: %lu\n", list_count(&engine->active.hold));
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ drm_printf(m, "\tOn hold?: %lu\n",
+ list_count(&engine->sched_engine->hold));
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
drm_printf(m, "\tMMIO base: 0x%08x\n", engine->mmio_base);
wakeref = intel_runtime_pm_get_if_in_use(engine->uncore->rpm);
@@ -1785,19 +1879,33 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine, ktime_t *now)
return total;
}
-static bool match_ring(struct i915_request *rq)
+struct intel_context *
+intel_engine_create_virtual(struct intel_engine_cs **siblings,
+ unsigned int count)
{
- u32 ring = ENGINE_READ(rq->engine, RING_START);
+ if (count == 0)
+ return ERR_PTR(-EINVAL);
+
+ if (count == 1)
+ return intel_context_create(siblings[0]);
- return ring == i915_ggtt_offset(rq->ring->vma);
+ GEM_BUG_ON(!siblings[0]->cops->create_virtual);
+ return siblings[0]->cops->create_virtual(siblings, count);
}
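A hedged usage sketch, with vcs0/vcs1 as hypothetical stand-ins for two physical video engines: count == 1 degenerates to a plain per-engine context, anything more dispatches to the backend's create_virtual hook.

	/* Sketch: build a context load-balanced across two sibling engines. */
	struct intel_engine_cs *siblings[] = { vcs0, vcs1 };	/* hypothetical */
	struct intel_context *ce;

	ce = intel_engine_create_virtual(siblings, ARRAY_SIZE(siblings));
	if (IS_ERR(ce))
		return PTR_ERR(ce);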
struct i915_request *
-intel_engine_find_active_request(struct intel_engine_cs *engine)
+intel_engine_execlist_find_hung_request(struct intel_engine_cs *engine)
{
struct i915_request *request, *active = NULL;
/*
+ * This search does not work in GuC submission mode. However, the GuC
+ * will report the hanging context directly to the driver itself. So
+ * the driver should never get here when in GuC mode.
+ */
+ GEM_BUG_ON(intel_uc_uses_guc_submission(&engine->gt->uc));
+
+ /*
* We are called by the error capture, reset and to dump engine
* state at random points in time. In particular, note that neither is
* crucially ordered with an interrupt. After a hang, the GPU is dead
@@ -1808,7 +1916,7 @@ intel_engine_find_active_request(struct intel_engine_cs *engine)
* At all other times, we must assume the GPU is still running, but
* we only care about the snapshot of this moment.
*/
- lockdep_assert_held(&engine->active.lock);
+ lockdep_assert_held(&engine->sched_engine->lock);
rcu_read_lock();
request = execlists_active(&engine->execlists);
@@ -1826,15 +1934,9 @@ intel_engine_find_active_request(struct intel_engine_cs *engine)
if (active)
return active;
- list_for_each_entry(request, &engine->active.requests, sched.link) {
- if (__i915_request_is_complete(request))
- continue;
-
- if (!__i915_request_has_started(request))
- continue;
-
- /* More than one preemptible request may match! */
- if (!match_ring(request))
+ list_for_each_entry(request, &engine->sched_engine->requests,
+ sched.link) {
+ if (i915_test_request_state(request) != I915_REQUEST_ACTIVE)
continue;
active = request;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index b99ac41695f3..74775ae961b2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -70,12 +70,38 @@ static void show_heartbeat(const struct i915_request *rq,
{
struct drm_printer p = drm_debug_printer("heartbeat");
- intel_engine_dump(engine, &p,
- "%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n",
- engine->name,
- rq->fence.context,
- rq->fence.seqno,
- rq->sched.attr.priority);
+ if (!rq) {
+ intel_engine_dump(engine, &p,
+ "%s heartbeat not ticking\n",
+ engine->name);
+ } else {
+ intel_engine_dump(engine, &p,
+ "%s heartbeat {seqno:%llx:%lld, prio:%d} not ticking\n",
+ engine->name,
+ rq->fence.context,
+ rq->fence.seqno,
+ rq->sched.attr.priority);
+ }
+}
+
+static void
+reset_engine(struct intel_engine_cs *engine, struct i915_request *rq)
+{
+ if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
+ show_heartbeat(rq, engine);
+
+ if (intel_engine_uses_guc(engine))
+ /*
+ * GuC itself is toast or GuC's hang detection
+ * is disabled. Either way, need to find the
+ * hang culprit manually.
+ */
+ intel_guc_find_hung_context(engine);
+
+ intel_gt_handle_error(engine->gt, engine->mask,
+ I915_ERROR_CAPTURE,
+ "stopped heartbeat on %s",
+ engine->name);
}
static void heartbeat(struct work_struct *wrk)
@@ -102,6 +128,11 @@ static void heartbeat(struct work_struct *wrk)
if (intel_gt_is_wedged(engine->gt))
goto out;
+ if (i915_sched_engine_disabled(engine->sched_engine)) {
+ reset_engine(engine, engine->heartbeat.systole);
+ goto out;
+ }
+
if (engine->heartbeat.systole) {
long delay = READ_ONCE(engine->props.heartbeat_interval_ms);
@@ -121,7 +152,7 @@ static void heartbeat(struct work_struct *wrk)
* but all other contexts, including the kernel
* context are stuck waiting for the signal.
*/
- } else if (engine->schedule &&
+ } else if (engine->sched_engine->schedule &&
rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
/*
* Gradually raise the priority of the heartbeat to
@@ -136,16 +167,10 @@ static void heartbeat(struct work_struct *wrk)
attr.priority = I915_PRIORITY_BARRIER;
local_bh_disable();
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
local_bh_enable();
} else {
- if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
- show_heartbeat(rq, engine);
-
- intel_gt_handle_error(engine->gt, engine->mask,
- I915_ERROR_CAPTURE,
- "stopped heartbeat on %s",
- engine->name);
+ reset_engine(engine, rq);
}
rq->emitted_jiffies = jiffies;
@@ -194,6 +219,25 @@ void intel_engine_park_heartbeat(struct intel_engine_cs *engine)
i915_request_put(fetch_and_zero(&engine->heartbeat.systole));
}
+void intel_gt_unpark_heartbeats(struct intel_gt *gt)
+{
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+
+ for_each_engine(engine, gt, id)
+ if (intel_engine_pm_is_awake(engine))
+ intel_engine_unpark_heartbeat(engine);
+}
+
+void intel_gt_park_heartbeats(struct intel_gt *gt)
+{
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+
+ for_each_engine(engine, gt, id)
+ intel_engine_park_heartbeat(engine);
+}
+
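These GT-wide helpers pair naturally with full-GT quiesce paths; note the unpark side deliberately skips parked engines. A hedged sketch of the assumed usage around a submission-backend reset:

	/* Sketch: silence all heartbeats, do the disruptive work, then rearm
	 * heartbeats only on engines that are still awake. */
	intel_gt_park_heartbeats(gt);
	/* ... reset or reload the submission backend ... */
	intel_gt_unpark_heartbeats(gt);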
void intel_engine_init_heartbeat(struct intel_engine_cs *engine)
{
INIT_DELAYED_WORK(&engine->heartbeat.work, heartbeat);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
index a488ea3e84a3..5da6d809a87a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
@@ -7,6 +7,7 @@
#define INTEL_ENGINE_HEARTBEAT_H
struct intel_engine_cs;
+struct intel_gt;
void intel_engine_init_heartbeat(struct intel_engine_cs *engine);
@@ -16,6 +17,9 @@ int intel_engine_set_heartbeat(struct intel_engine_cs *engine,
void intel_engine_park_heartbeat(struct intel_engine_cs *engine);
void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine);
+void intel_gt_park_heartbeats(struct intel_gt *gt);
+void intel_gt_unpark_heartbeats(struct intel_gt *gt);
+
int intel_engine_pulse(struct intel_engine_cs *engine);
int intel_engine_flush_barriers(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 47f4397095e5..1f07ac4e0672 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -275,13 +275,11 @@ static int __engine_park(struct intel_wakeref *wf)
intel_breadcrumbs_park(engine->breadcrumbs);
/* Must be reset upon idling, or we may miss the busy wakeup. */
- GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN);
+ GEM_BUG_ON(engine->sched_engine->queue_priority_hint != INT_MIN);
if (engine->park)
engine->park(engine);
- engine->execlists.no_priolist = false;
-
/* While gt calls i915_vma_parked(), we have to break the lock cycle */
intel_gt_pm_put_async(engine->gt);
return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e113f93b3274..ed91bcff20eb 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -21,32 +21,20 @@
#include "i915_pmu.h"
#include "i915_priolist_types.h"
#include "i915_selftest.h"
-#include "intel_breadcrumbs_types.h"
#include "intel_sseu.h"
#include "intel_timeline_types.h"
#include "intel_uncore.h"
#include "intel_wakeref.h"
#include "intel_workarounds_types.h"
-/* Legacy HW Engine ID */
-
-#define RCS0_HW 0
-#define VCS0_HW 1
-#define BCS0_HW 2
-#define VECS0_HW 3
-#define VCS1_HW 4
-#define VCS2_HW 6
-#define VCS3_HW 7
-#define VECS1_HW 12
-
-/* Gen11+ HW Engine class + instance */
+/* HW Engine class + instance */
#define RENDER_CLASS 0
#define VIDEO_DECODE_CLASS 1
#define VIDEO_ENHANCEMENT_CLASS 2
#define COPY_ENGINE_CLASS 3
#define OTHER_CLASS 4
#define MAX_ENGINE_CLASS 4
-#define MAX_ENGINE_INSTANCE 3
+#define MAX_ENGINE_INSTANCE 7
#define I915_MAX_SLICES 3
#define I915_MAX_SUBSLICES 8
@@ -59,11 +47,13 @@ struct drm_i915_reg_table;
struct i915_gem_context;
struct i915_request;
struct i915_sched_attr;
+struct i915_sched_engine;
struct intel_gt;
struct intel_ring;
struct intel_uncore;
+struct intel_breadcrumbs;
-typedef u8 intel_engine_mask_t;
+typedef u32 intel_engine_mask_t;
#define ALL_ENGINES ((intel_engine_mask_t)~0ul)
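The widening to u32 is forced by the new engine ids below: with VCS0-7 and VECS0-3 the engine count exceeds the 8 bits a u8 mask could offer. An illustrative compile-time check of that invariant (not part of the patch) might look like:

	/* Sketch: one mask bit per possible engine id. */
	BUILD_BUG_ON(I915_NUM_ENGINES > BITS_PER_TYPE(intel_engine_mask_t));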
struct intel_hw_status_page {
@@ -100,8 +90,8 @@ struct i915_ctx_workarounds {
struct i915_vma *vma;
};
-#define I915_MAX_VCS 4
-#define I915_MAX_VECS 2
+#define I915_MAX_VCS 8
+#define I915_MAX_VECS 4
/*
* Engine IDs definitions.
@@ -114,9 +104,15 @@ enum intel_engine_id {
VCS1,
VCS2,
VCS3,
+ VCS4,
+ VCS5,
+ VCS6,
+ VCS7,
#define _VCS(n) (VCS0 + (n))
VECS0,
VECS1,
+ VECS2,
+ VECS3,
#define _VECS(n) (VECS0 + (n))
I915_NUM_ENGINES
#define INVALID_ENGINE ((enum intel_engine_id)-1)
@@ -138,11 +134,6 @@ struct st_preempt_hang {
*/
struct intel_engine_execlists {
/**
- * @tasklet: softirq tasklet for bottom handler
- */
- struct tasklet_struct tasklet;
-
- /**
* @timer: kick the current context if its timeslice expires
*/
struct timer_list timer;
@@ -153,11 +144,6 @@ struct intel_engine_execlists {
struct timer_list preempt;
/**
- * @default_priolist: priority list for I915_PRIORITY_NORMAL
- */
- struct i915_priolist default_priolist;
-
- /**
* @ccid: identifier for contexts submitted to this engine
*/
u32 ccid;
@@ -192,11 +178,6 @@ struct intel_engine_execlists {
u32 reset_ccid;
/**
- * @no_priolist: priority lists disabled
- */
- bool no_priolist;
-
- /**
* @submit_reg: gen-specific execlist submission register
* set to the ExecList Submission Port (elsp) register pre-Gen11 and to
* the ExecList Submission Queue Contents register array for Gen11+
@@ -238,23 +219,10 @@ struct intel_engine_execlists {
unsigned int port_mask;
/**
- * @queue_priority_hint: Highest pending priority.
- *
- * When we add requests into the queue, or adjust the priority of
- * executing requests, we compute the maximum priority of those
- * pending requests. We can then use this value to determine if
- * we need to preempt the executing requests to service the queue.
- * However, since we may have recorded the priority of an inflight
- * request we wanted to preempt but which has since completed, by the
- * time of dequeuing the priority hint may no longer match the highest
- * available request priority.
+ * @virtual: Queue of requests on a virtual engine, sorted by priority.
+ * Each RB entry is a struct i915_priolist containing a list of requests
+ * of the same priority.
*/
- int queue_priority_hint;
-
- /**
- * @queue: queue of requests, in priority lists
- */
- struct rb_root_cached queue;
struct rb_root_cached virtual;
/**
@@ -295,7 +263,6 @@ struct intel_engine_cs {
enum intel_engine_id id;
enum intel_engine_id legacy_idx;
- unsigned int hw_id;
unsigned int guc_id;
intel_engine_mask_t mask;
@@ -326,15 +293,13 @@ struct intel_engine_cs {
struct intel_sseu sseu;
- struct {
- spinlock_t lock;
- struct list_head requests;
- struct list_head hold; /* ready requests, but on hold */
- } active;
+ struct i915_sched_engine *sched_engine;
/* keep a request in reserve for a [pm] barrier under oom */
struct i915_request *request_pool;
+ struct intel_context *hung_ce;
+
struct llist_head barrier_tasks;
struct intel_context *kernel_context; /* pinned */
@@ -419,6 +384,8 @@ struct intel_engine_cs {
void (*park)(struct intel_engine_cs *engine);
void (*unpark)(struct intel_engine_cs *engine);
+ void (*bump_serial)(struct intel_engine_cs *engine);
+
void (*set_default_submission)(struct intel_engine_cs *engine);
const struct intel_context_ops *cops;
@@ -447,22 +414,13 @@ struct intel_engine_cs {
*/
void (*submit_request)(struct i915_request *rq);
- /*
- * Called on signaling of a SUBMIT_FENCE, passing along the signaling
- * request down to the bonded pairs.
- */
- void (*bond_execute)(struct i915_request *rq,
- struct dma_fence *signal);
+ void (*release)(struct intel_engine_cs *engine);
/*
- * Call when the priority on a request has changed and it and its
- * dependencies may need rescheduling. Note the request itself may
- * not be ready to run!
+ * Add / remove request from engine active tracking
*/
- void (*schedule)(struct i915_request *request,
- const struct i915_sched_attr *attr);
-
- void (*release)(struct intel_engine_cs *engine);
+ void (*add_active_request)(struct i915_request *rq);
+ void (*remove_active_request)(struct i915_request *rq);
struct intel_engine_execlists execlists;
@@ -485,6 +443,7 @@ struct intel_engine_cs {
#define I915_ENGINE_IS_VIRTUAL BIT(5)
#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6)
#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
+#define I915_ENGINE_WANT_FORCED_PREEMPTION BIT(8)
unsigned int flags;
/*
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_user.c b/drivers/gpu/drm/i915/gt/intel_engine_user.c
index 3cca7ea2d6ea..8f8bea08e734 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_user.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_user.c
@@ -11,6 +11,7 @@
#include "intel_engine.h"
#include "intel_engine_user.h"
#include "intel_gt.h"
+#include "uc/intel_guc_submission.h"
struct intel_engine_cs *
intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
@@ -108,13 +109,16 @@ static void set_scheduler_caps(struct drm_i915_private *i915)
for_each_uabi_engine(engine, i915) { /* all engines must agree! */
int i;
- if (engine->schedule)
+ if (engine->sched_engine->schedule)
enabled |= (I915_SCHEDULER_CAP_ENABLED |
I915_SCHEDULER_CAP_PRIORITY);
else
disabled |= (I915_SCHEDULER_CAP_ENABLED |
I915_SCHEDULER_CAP_PRIORITY);
+ if (intel_uc_uses_guc_submission(&i915->gt.uc))
+ enabled |= I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP;
+
for (i = 0; i < ARRAY_SIZE(map); i++) {
if (engine->flags & BIT(map[i].engine))
enabled |= BIT(map[i].sched);
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index fc77592d88a9..de5f9c86b9a4 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -114,6 +114,7 @@
#include "gen8_engine_cs.h"
#include "intel_breadcrumbs.h"
#include "intel_context.h"
+#include "intel_engine_heartbeat.h"
#include "intel_engine_pm.h"
#include "intel_engine_stats.h"
#include "intel_execlists_submission.h"
@@ -153,6 +154,12 @@
#define GEN12_CSB_CTX_VALID(csb_dw) \
(FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID)
+#define XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE BIT(1) /* upper csb dword */
+#define XEHP_CSB_SW_CTX_ID_MASK GENMASK(31, 10)
+#define XEHP_IDLE_CTX_ID 0xFFFF
+#define XEHP_CSB_CTX_VALID(csb_dw) \
+ (FIELD_GET(XEHP_CSB_SW_CTX_ID_MASK, csb_dw) != XEHP_IDLE_CTX_ID)
+
/* Typical size of the average request (2 pipecontrols and a MI_BB) */
#define EXECLISTS_REQUEST_SIZE 64 /* bytes */
@@ -182,18 +189,6 @@ struct virtual_engine {
int prio;
} nodes[I915_NUM_ENGINES];
- /*
- * Keep track of bonded pairs -- restrictions upon our selection
- * of physical engines any particular request may be submitted to.
- * If we receive a submit-fence from a master engine, we will only
- * use one of sibling_mask physical engines.
- */
- struct ve_bond {
- const struct intel_engine_cs *master;
- intel_engine_mask_t sibling_mask;
- } *bonds;
- unsigned int num_bonds;
-
/* And finally, which physical engines this virtual engine maps onto. */
unsigned int num_siblings;
struct intel_engine_cs *siblings[];
@@ -205,6 +200,9 @@ static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
return container_of(engine, struct virtual_engine, base);
}
+static struct intel_context *
+execlists_create_virtual(struct intel_engine_cs **siblings, unsigned int count);
+
static struct i915_request *
__active_request(const struct intel_timeline * const tl,
struct i915_request *rq,
@@ -273,11 +271,11 @@ static int effective_prio(const struct i915_request *rq)
return prio;
}
-static int queue_prio(const struct intel_engine_execlists *execlists)
+static int queue_prio(const struct i915_sched_engine *sched_engine)
{
struct rb_node *rb;
- rb = rb_first_cached(&execlists->queue);
+ rb = rb_first_cached(&sched_engine->queue);
if (!rb)
return INT_MIN;
@@ -318,14 +316,14 @@ static bool need_preempt(const struct intel_engine_cs *engine,
* to preserve FIFO ordering of dependencies.
*/
last_prio = max(effective_prio(rq), I915_PRIORITY_NORMAL - 1);
- if (engine->execlists.queue_priority_hint <= last_prio)
+ if (engine->sched_engine->queue_priority_hint <= last_prio)
return false;
/*
* Check against the first request in ELSP[1], it will, thanks to the
* power of PI, be the highest priority of that context.
*/
- if (!list_is_last(&rq->sched.link, &engine->active.requests) &&
+ if (!list_is_last(&rq->sched.link, &engine->sched_engine->requests) &&
rq_prio(list_next_entry(rq, sched.link)) > last_prio)
return true;
@@ -340,7 +338,7 @@ static bool need_preempt(const struct intel_engine_cs *engine,
* context, its priority would not exceed ELSP[0] aka last_prio.
*/
return max(virtual_prio(&engine->execlists),
- queue_prio(&engine->execlists)) > last_prio;
+ queue_prio(engine->sched_engine)) > last_prio;
}
__maybe_unused static bool
@@ -367,10 +365,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
struct list_head *pl;
int prio = I915_PRIORITY_INVALID;
- lockdep_assert_held(&engine->active.lock);
+ lockdep_assert_held(&engine->sched_engine->lock);
list_for_each_entry_safe_reverse(rq, rn,
- &engine->active.requests,
+ &engine->sched_engine->requests,
sched.link) {
if (__i915_request_is_complete(rq)) {
list_del_init(&rq->sched.link);
@@ -382,9 +380,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
if (rq_prio(rq) != prio) {
prio = rq_prio(rq);
- pl = i915_sched_lookup_priolist(engine, prio);
+ pl = i915_sched_lookup_priolist(engine->sched_engine,
+ prio);
}
- GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+ GEM_BUG_ON(i915_sched_engine_is_empty(engine->sched_engine));
list_move(&rq->sched.link, pl);
set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
@@ -489,6 +488,16 @@ __execlists_schedule_in(struct i915_request *rq)
/* Use a fixed tag for OA and friends */
GEM_BUG_ON(ce->tag <= BITS_PER_LONG);
ce->lrc.ccid = ce->tag;
+ } else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) {
+ /* We don't need a strict matching tag, just different values */
+ unsigned int tag = ffs(READ_ONCE(engine->context_tag));
+
+ GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
+ clear_bit(tag - 1, &engine->context_tag);
+ ce->lrc.ccid = tag << (XEHP_SW_CTX_ID_SHIFT - 32);
+
+ BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
+
} else {
/* We don't need a strict matching tag, just different values */
unsigned int tag = __ffs(engine->context_tag);
@@ -534,13 +543,13 @@ resubmit_virtual_request(struct i915_request *rq, struct virtual_engine *ve)
{
struct intel_engine_cs *engine = rq->engine;
- spin_lock_irq(&engine->active.lock);
+ spin_lock_irq(&engine->sched_engine->lock);
clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
WRITE_ONCE(rq->engine, &ve->base);
ve->base.submit_request(rq);
- spin_unlock_irq(&engine->active.lock);
+ spin_unlock_irq(&engine->sched_engine->lock);
}
static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
@@ -569,7 +578,7 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
resubmit_virtual_request(rq, ve);
if (READ_ONCE(ve->request))
- tasklet_hi_schedule(&ve->base.execlists.tasklet);
+ tasklet_hi_schedule(&ve->base.sched_engine->tasklet);
}
static void __execlists_schedule_out(struct i915_request * const rq,
@@ -579,7 +588,7 @@ static void __execlists_schedule_out(struct i915_request * const rq,
unsigned int ccid;
/*
- * NB process_csb() is not under the engine->active.lock and hence
+ * NB process_csb() is not under the engine->sched_engine->lock and hence
* schedule_out can race with schedule_in meaning that we should
* refrain from doing non-trivial work here.
*/
@@ -599,8 +608,14 @@ static void __execlists_schedule_out(struct i915_request * const rq,
intel_engine_add_retire(engine, ce->timeline);
ccid = ce->lrc.ccid;
- ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
- ccid &= GEN12_MAX_CONTEXT_HW_ID;
+ if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) {
+ ccid >>= XEHP_SW_CTX_ID_SHIFT - 32;
+ ccid &= XEHP_MAX_CONTEXT_HW_ID;
+ } else {
+ ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
+ ccid &= GEN12_MAX_CONTEXT_HW_ID;
+ }
+
if (ccid < BITS_PER_LONG) {
GEM_BUG_ON(ccid == 0);
GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
@@ -738,9 +753,9 @@ trace_ports(const struct intel_engine_execlists *execlists,
}
static bool
-reset_in_progress(const struct intel_engine_execlists *execlists)
+reset_in_progress(const struct intel_engine_cs *engine)
{
- return unlikely(!__tasklet_is_enabled(&execlists->tasklet));
+ return unlikely(!__tasklet_is_enabled(&engine->sched_engine->tasklet));
}
static __maybe_unused noinline bool
@@ -756,7 +771,7 @@ assert_pending_valid(const struct intel_engine_execlists *execlists,
trace_ports(execlists, msg, execlists->pending);
/* We may be messing around with the lists during reset, lalala */
- if (reset_in_progress(execlists))
+ if (reset_in_progress(engine))
return true;
if (!execlists->pending[0]) {
@@ -1096,7 +1111,8 @@ static void defer_active(struct intel_engine_cs *engine)
if (!rq)
return;
- defer_request(rq, i915_sched_lookup_priolist(engine, rq_prio(rq)));
+ defer_request(rq, i915_sched_lookup_priolist(engine->sched_engine,
+ rq_prio(rq)));
}
static bool
@@ -1133,13 +1149,14 @@ static bool needs_timeslice(const struct intel_engine_cs *engine,
return false;
/* If ELSP[1] is occupied, always check to see if worth slicing */
- if (!list_is_last_rcu(&rq->sched.link, &engine->active.requests)) {
+ if (!list_is_last_rcu(&rq->sched.link,
+ &engine->sched_engine->requests)) {
ENGINE_TRACE(engine, "timeslice required for second inflight context\n");
return true;
}
/* Otherwise, ELSP[0] is by itself, but may be waiting in the queue */
- if (!RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)) {
+ if (!i915_sched_engine_is_empty(engine->sched_engine)) {
ENGINE_TRACE(engine, "timeslice required for queue\n");
return true;
}
@@ -1187,7 +1204,7 @@ static void start_timeslice(struct intel_engine_cs *engine)
* its timeslice, so recheck.
*/
if (!timer_pending(&el->timer))
- tasklet_hi_schedule(&el->tasklet);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
return;
}
@@ -1236,6 +1253,7 @@ static bool completed(const struct i915_request *rq)
static void execlists_dequeue(struct intel_engine_cs *engine)
{
struct intel_engine_execlists * const execlists = &engine->execlists;
+ struct i915_sched_engine * const sched_engine = engine->sched_engine;
struct i915_request **port = execlists->pending;
struct i915_request ** const last_port = port + execlists->port_mask;
struct i915_request *last, * const *active;
@@ -1265,7 +1283,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
* and context switches) submission.
*/
- spin_lock(&engine->active.lock);
+ spin_lock(&sched_engine->lock);
/*
* If the queue is higher priority than the last
@@ -1287,7 +1305,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
last->fence.context,
last->fence.seqno,
last->sched.attr.priority,
- execlists->queue_priority_hint);
+ sched_engine->queue_priority_hint);
record_preemption(execlists);
/*
@@ -1313,7 +1331,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
yesno(timer_expired(&execlists->timer)),
last->fence.context, last->fence.seqno,
rq_prio(last),
- execlists->queue_priority_hint,
+ sched_engine->queue_priority_hint,
yesno(timeslice_yield(execlists, last)));
/*
@@ -1365,7 +1383,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
* Even if ELSP[1] is occupied and not worthy
* of timeslices, our queue might be.
*/
- spin_unlock(&engine->active.lock);
+ spin_unlock(&sched_engine->lock);
return;
}
}
@@ -1375,7 +1393,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
while ((ve = first_virtual_engine(engine))) {
struct i915_request *rq;
- spin_lock(&ve->base.active.lock);
+ spin_lock(&ve->base.sched_engine->lock);
rq = ve->request;
if (unlikely(!virtual_matches(ve, rq, engine)))
@@ -1384,14 +1402,14 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
GEM_BUG_ON(rq->engine != &ve->base);
GEM_BUG_ON(rq->context != &ve->context);
- if (unlikely(rq_prio(rq) < queue_prio(execlists))) {
- spin_unlock(&ve->base.active.lock);
+ if (unlikely(rq_prio(rq) < queue_prio(sched_engine))) {
+ spin_unlock(&ve->base.sched_engine->lock);
break;
}
if (last && !can_merge_rq(last, rq)) {
- spin_unlock(&ve->base.active.lock);
- spin_unlock(&engine->active.lock);
+ spin_unlock(&ve->base.sched_engine->lock);
+ spin_unlock(&engine->sched_engine->lock);
return; /* leave this for another sibling */
}
@@ -1405,7 +1423,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
yesno(engine != ve->siblings[0]));
WRITE_ONCE(ve->request, NULL);
- WRITE_ONCE(ve->base.execlists.queue_priority_hint, INT_MIN);
+ WRITE_ONCE(ve->base.sched_engine->queue_priority_hint, INT_MIN);
rb = &ve->nodes[engine->id].rb;
rb_erase_cached(rb, &execlists->virtual);
@@ -1437,7 +1455,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
i915_request_put(rq);
unlock:
- spin_unlock(&ve->base.active.lock);
+ spin_unlock(&ve->base.sched_engine->lock);
/*
* Hmm, we have a bunch of virtual engine requests,
@@ -1450,7 +1468,7 @@ unlock:
break;
}
- while ((rb = rb_first_cached(&execlists->queue))) {
+ while ((rb = rb_first_cached(&sched_engine->queue))) {
struct i915_priolist *p = to_priolist(rb);
struct i915_request *rq, *rn;
@@ -1529,7 +1547,7 @@ unlock:
}
}
- rb_erase_cached(&p->node, &execlists->queue);
+ rb_erase_cached(&p->node, &sched_engine->queue);
i915_priolist_free(p);
}
done:
@@ -1551,8 +1569,9 @@ done:
* request triggering preemption on the next dequeue (or subsequent
* interrupt for secondary ports).
*/
- execlists->queue_priority_hint = queue_prio(execlists);
- spin_unlock(&engine->active.lock);
+ sched_engine->queue_priority_hint = queue_prio(sched_engine);
+ i915_sched_engine_reset_on_empty(sched_engine);
+ spin_unlock(&sched_engine->lock);
/*
* We can skip poking the HW if we ended up with exactly the same set
@@ -1655,13 +1674,24 @@ static void invalidate_csb_entries(const u64 *first, const u64 *last)
* bits 44-46: reserved
* bits 47-57: sw context id of the lrc the GT switched away from
* bits 58-63: sw counter of the lrc the GT switched away from
+ *
+ * Xe_HP csb shuffles things around compared to TGL:
+ *
+ * bits 0-3: context switch detail (same possible values as TGL)
+ * bits 4-9: engine instance
+ * bits 10-25: sw context id of the lrc the GT switched to
+ * bits 26-31: sw counter of the lrc the GT switched to
+ * bit 32: semaphore wait mode (poll or signal); only valid when
+ * switch detail is set to "wait on semaphore"
+ * bit 33: switched to new queue
+ * bits 34-41: wait detail (for switch detail 1 to 4)
+ * bits 42-57: sw context id of the lrc the GT switched away from
+ * bits 58-63: sw counter of the lrc the GT switched away from
*/
-static bool gen12_csb_parse(const u64 csb)
+static inline bool
+__gen12_csb_parse(bool ctx_to_valid, bool ctx_away_valid, bool new_queue,
+ u8 switch_detail)
{
- bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(csb));
- bool new_queue =
- lower_32_bits(csb) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
-
/*
* The context switch detail is not guaranteed to be 5 when a preemption
* occurs, so we can't just check for that. The check below works for
@@ -1670,7 +1700,7 @@ static bool gen12_csb_parse(const u64 csb)
* would require some extra handling, but we don't support that.
*/
if (!ctx_away_valid || new_queue) {
- GEM_BUG_ON(!GEN12_CSB_CTX_VALID(lower_32_bits(csb)));
+ GEM_BUG_ON(!ctx_to_valid);
return true;
}
@@ -1679,10 +1709,26 @@ static bool gen12_csb_parse(const u64 csb)
* context switch on an unsuccessful wait instruction since we always
* use polling mode.
*/
- GEM_BUG_ON(GEN12_CTX_SWITCH_DETAIL(upper_32_bits(csb)));
+ GEM_BUG_ON(switch_detail);
return false;
}
+static bool xehp_csb_parse(const u64 csb)
+{
+ return __gen12_csb_parse(XEHP_CSB_CTX_VALID(lower_32_bits(csb)), /* ctx to */
+ XEHP_CSB_CTX_VALID(upper_32_bits(csb)), /* ctx away */
+ upper_32_bits(csb) & XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE,
+ GEN12_CTX_SWITCH_DETAIL(lower_32_bits(csb)));
+}
+
+static bool gen12_csb_parse(const u64 csb)
+{
+ return __gen12_csb_parse(GEN12_CSB_CTX_VALID(lower_32_bits(csb)), /* ctx to */
+ GEN12_CSB_CTX_VALID(upper_32_bits(csb)), /* ctx away */
+ lower_32_bits(csb) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE,
+ GEN12_CTX_SWITCH_DETAIL(upper_32_bits(csb)));
+}
+
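With the masks above, decoding a raw Xe_HP CSB entry is a pair of FIELD_GET/bit tests; a minimal sketch mirroring exactly what xehp_csb_parse() feeds into the common helper:

	/* Sketch: split a 64-bit Xe_HP CSB entry into the parse inputs. */
	bool to_valid   = XEHP_CSB_CTX_VALID(lower_32_bits(csb));
	bool away_valid = XEHP_CSB_CTX_VALID(upper_32_bits(csb));
	bool new_queue  = upper_32_bits(csb) &
			  XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
	u8 detail       = GEN12_CTX_SWITCH_DETAIL(lower_32_bits(csb));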
static bool gen8_csb_parse(const u64 csb)
{
return csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED);
@@ -1767,8 +1813,8 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
* access. Either we are inside the tasklet, or the tasklet is disabled
* and we assume that is only inside the reset paths and so serialised.
*/
- GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
- !reset_in_progress(execlists));
+ GEM_BUG_ON(!tasklet_is_locked(&engine->sched_engine->tasklet) &&
+ !reset_in_progress(engine));
/*
* Note that csb_write, csb_status may be either in HWSP or mmio.
@@ -1847,7 +1893,9 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive)
ENGINE_TRACE(engine, "csb[%d]: status=0x%08x:0x%08x\n",
head, upper_32_bits(csb), lower_32_bits(csb));
- if (GRAPHICS_VER(engine->i915) >= 12)
+ if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+ promote = xehp_csb_parse(csb);
+ else if (GRAPHICS_VER(engine->i915) >= 12)
promote = gen12_csb_parse(csb);
else
promote = gen8_csb_parse(csb);
@@ -1979,7 +2027,8 @@ static void __execlists_hold(struct i915_request *rq)
__i915_request_unsubmit(rq);
clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
- list_move_tail(&rq->sched.link, &rq->engine->active.hold);
+ list_move_tail(&rq->sched.link,
+ &rq->engine->sched_engine->hold);
i915_request_set_hold(rq);
RQ_TRACE(rq, "on hold\n");
@@ -2016,7 +2065,7 @@ static bool execlists_hold(struct intel_engine_cs *engine,
if (i915_request_on_hold(rq))
return false;
- spin_lock_irq(&engine->active.lock);
+ spin_lock_irq(&engine->sched_engine->lock);
if (__i915_request_is_complete(rq)) { /* too late! */
rq = NULL;
@@ -2032,10 +2081,10 @@ static bool execlists_hold(struct intel_engine_cs *engine,
GEM_BUG_ON(i915_request_on_hold(rq));
GEM_BUG_ON(rq->engine != engine);
__execlists_hold(rq);
- GEM_BUG_ON(list_empty(&engine->active.hold));
+ GEM_BUG_ON(list_empty(&engine->sched_engine->hold));
unlock:
- spin_unlock_irq(&engine->active.lock);
+ spin_unlock_irq(&engine->sched_engine->lock);
return rq;
}
@@ -2079,7 +2128,7 @@ static void __execlists_unhold(struct i915_request *rq)
i915_request_clear_hold(rq);
list_move_tail(&rq->sched.link,
- i915_sched_lookup_priolist(rq->engine,
+ i915_sched_lookup_priolist(rq->engine->sched_engine,
rq_prio(rq)));
set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
@@ -2115,7 +2164,7 @@ static void __execlists_unhold(struct i915_request *rq)
static void execlists_unhold(struct intel_engine_cs *engine,
struct i915_request *rq)
{
- spin_lock_irq(&engine->active.lock);
+ spin_lock_irq(&engine->sched_engine->lock);
/*
* Move this request back to the priority queue, and all of its
@@ -2123,12 +2172,12 @@ static void execlists_unhold(struct intel_engine_cs *engine,
*/
__execlists_unhold(rq);
- if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
- engine->execlists.queue_priority_hint = rq_prio(rq);
- tasklet_hi_schedule(&engine->execlists.tasklet);
+ if (rq_prio(rq) > engine->sched_engine->queue_priority_hint) {
+ engine->sched_engine->queue_priority_hint = rq_prio(rq);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
}
- spin_unlock_irq(&engine->active.lock);
+ spin_unlock_irq(&engine->sched_engine->lock);
}
struct execlists_capture {
@@ -2258,13 +2307,13 @@ static void execlists_capture(struct intel_engine_cs *engine)
if (!cap)
return;
- spin_lock_irq(&engine->active.lock);
+ spin_lock_irq(&engine->sched_engine->lock);
cap->rq = active_context(engine, active_ccid(engine));
if (cap->rq) {
cap->rq = active_request(cap->rq->context->timeline, cap->rq);
cap->rq = i915_request_get_rcu(cap->rq);
}
- spin_unlock_irq(&engine->active.lock);
+ spin_unlock_irq(&engine->sched_engine->lock);
if (!cap->rq)
goto err_free;
@@ -2316,13 +2365,13 @@ static void execlists_reset(struct intel_engine_cs *engine, const char *msg)
ENGINE_TRACE(engine, "reset for %s\n", msg);
/* Mark this tasklet as disabled to avoid waiting for it to complete */
- tasklet_disable_nosync(&engine->execlists.tasklet);
+ tasklet_disable_nosync(&engine->sched_engine->tasklet);
ring_set_paused(engine, 1); /* Freeze the current request in place */
execlists_capture(engine);
intel_engine_reset(engine, msg);
- tasklet_enable(&engine->execlists.tasklet);
+ tasklet_enable(&engine->sched_engine->tasklet);
clear_and_wake_up_bit(bit, lock);
}
@@ -2345,8 +2394,9 @@ static bool preempt_timeout(const struct intel_engine_cs *const engine)
*/
static void execlists_submission_tasklet(struct tasklet_struct *t)
{
- struct intel_engine_cs * const engine =
- from_tasklet(engine, t, execlists.tasklet);
+ struct i915_sched_engine *sched_engine =
+ from_tasklet(sched_engine, t, tasklet);
+ struct intel_engine_cs * const engine = sched_engine->private_data;
struct i915_request *post[2 * EXECLIST_MAX_PORTS];
struct i915_request **inactive;
@@ -2421,13 +2471,16 @@ static void execlists_irq_handler(struct intel_engine_cs *engine, u16 iir)
intel_engine_signal_breadcrumbs(engine);
if (tasklet)
- tasklet_hi_schedule(&engine->execlists.tasklet);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
}
static void __execlists_kick(struct intel_engine_execlists *execlists)
{
+ struct intel_engine_cs *engine =
+ container_of(execlists, typeof(*engine), execlists);
+
/* Kick the tasklet for some interrupt coalescing and reset handling */
- tasklet_hi_schedule(&execlists->tasklet);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
}
#define execlists_kick(t, member) \
@@ -2448,19 +2501,20 @@ static void queue_request(struct intel_engine_cs *engine,
{
GEM_BUG_ON(!list_empty(&rq->sched.link));
list_add_tail(&rq->sched.link,
- i915_sched_lookup_priolist(engine, rq_prio(rq)));
+ i915_sched_lookup_priolist(engine->sched_engine,
+ rq_prio(rq)));
set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
}
static bool submit_queue(struct intel_engine_cs *engine,
const struct i915_request *rq)
{
- struct intel_engine_execlists *execlists = &engine->execlists;
+ struct i915_sched_engine *sched_engine = engine->sched_engine;
- if (rq_prio(rq) <= execlists->queue_priority_hint)
+ if (rq_prio(rq) <= sched_engine->queue_priority_hint)
return false;
- execlists->queue_priority_hint = rq_prio(rq);
+ sched_engine->queue_priority_hint = rq_prio(rq);
return true;
}
@@ -2468,7 +2522,7 @@ static bool ancestor_on_hold(const struct intel_engine_cs *engine,
const struct i915_request *rq)
{
GEM_BUG_ON(i915_request_on_hold(rq));
- return !list_empty(&engine->active.hold) && hold_request(rq);
+ return !list_empty(&engine->sched_engine->hold) && hold_request(rq);
}
static void execlists_submit_request(struct i915_request *request)
@@ -2477,23 +2531,24 @@ static void execlists_submit_request(struct i915_request *request)
unsigned long flags;
/* Will be called from irq-context when using foreign fences. */
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
if (unlikely(ancestor_on_hold(engine, request))) {
RQ_TRACE(request, "ancestor on hold\n");
- list_add_tail(&request->sched.link, &engine->active.hold);
+ list_add_tail(&request->sched.link,
+ &engine->sched_engine->hold);
i915_request_set_hold(request);
} else {
queue_request(engine, request);
- GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+ GEM_BUG_ON(i915_sched_engine_is_empty(engine->sched_engine));
GEM_BUG_ON(list_empty(&request->sched.link));
if (submit_queue(engine, request))
__execlists_kick(&engine->execlists);
}
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
}
static int
@@ -2533,11 +2588,26 @@ static int execlists_context_alloc(struct intel_context *ce)
return lrc_alloc(ce, ce->engine);
}
+static void execlists_context_cancel_request(struct intel_context *ce,
+ struct i915_request *rq)
+{
+ struct intel_engine_cs *engine = NULL;
+
+ i915_request_active_engine(rq, &engine);
+
+ if (engine && intel_engine_pulse(engine))
+ intel_gt_handle_error(engine->gt, engine->mask, 0,
+ "request cancellation by %s",
+ current->comm);
+}
+
static const struct intel_context_ops execlists_context_ops = {
.flags = COPS_HAS_INFLIGHT,
.alloc = execlists_context_alloc,
+ .cancel_request = execlists_context_cancel_request,
+
.pre_pin = execlists_context_pre_pin,
.pin = execlists_context_pin,
.unpin = lrc_unpin,
@@ -2548,6 +2618,8 @@ static const struct intel_context_ops execlists_context_ops = {
.reset = lrc_reset,
.destroy = lrc_destroy,
+
+ .create_virtual = execlists_create_virtual,
};
static int emit_pdps(struct i915_request *rq)
@@ -2800,10 +2872,8 @@ static int execlists_resume(struct intel_engine_cs *engine)
static void execlists_reset_prepare(struct intel_engine_cs *engine)
{
- struct intel_engine_execlists * const execlists = &engine->execlists;
-
ENGINE_TRACE(engine, "depth<-%d\n",
- atomic_read(&execlists->tasklet.count));
+ atomic_read(&engine->sched_engine->tasklet.count));
/*
* Prevent request submission to the hardware until we have
@@ -2814,8 +2884,8 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
* Turning off the execlists->tasklet until the reset is over
* prevents the race.
*/
- __tasklet_disable_sync_once(&execlists->tasklet);
- GEM_BUG_ON(!reset_in_progress(execlists));
+ __tasklet_disable_sync_once(&engine->sched_engine->tasklet);
+ GEM_BUG_ON(!reset_in_progress(engine));
/*
* We stop engines, otherwise we might get failed reset and a
@@ -2957,24 +3027,26 @@ static void execlists_reset_rewind(struct intel_engine_cs *engine, bool stalled)
/* Push back any incomplete requests for replay after the reset. */
rcu_read_lock();
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
__unwind_incomplete_requests(engine);
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
rcu_read_unlock();
}
static void nop_submission_tasklet(struct tasklet_struct *t)
{
- struct intel_engine_cs * const engine =
- from_tasklet(engine, t, execlists.tasklet);
+ struct i915_sched_engine *sched_engine =
+ from_tasklet(sched_engine, t, tasklet);
+ struct intel_engine_cs * const engine = sched_engine->private_data;
/* The driver is wedged; don't process any more events. */
- WRITE_ONCE(engine->execlists.queue_priority_hint, INT_MIN);
+ WRITE_ONCE(engine->sched_engine->queue_priority_hint, INT_MIN);
}
static void execlists_reset_cancel(struct intel_engine_cs *engine)
{
struct intel_engine_execlists * const execlists = &engine->execlists;
+ struct i915_sched_engine * const sched_engine = engine->sched_engine;
struct i915_request *rq, *rn;
struct rb_node *rb;
unsigned long flags;
@@ -2998,15 +3070,15 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
execlists_reset_csb(engine, true);
rcu_read_lock();
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
/* Mark all executing requests as skipped. */
- list_for_each_entry(rq, &engine->active.requests, sched.link)
+ list_for_each_entry(rq, &engine->sched_engine->requests, sched.link)
i915_request_put(i915_request_mark_eio(rq));
intel_engine_signal_breadcrumbs(engine);
/* Flush the queued requests to the timeline list (for retiring). */
- while ((rb = rb_first_cached(&execlists->queue))) {
+ while ((rb = rb_first_cached(&sched_engine->queue))) {
struct i915_priolist *p = to_priolist(rb);
priolist_for_each_request_consume(rq, rn, p) {
@@ -3016,12 +3088,12 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
}
}
- rb_erase_cached(&p->node, &execlists->queue);
+ rb_erase_cached(&p->node, &sched_engine->queue);
i915_priolist_free(p);
}
/* On-hold requests will be flushed to timeline upon their release */
- list_for_each_entry(rq, &engine->active.hold, sched.link)
+ list_for_each_entry(rq, &sched_engine->hold, sched.link)
i915_request_put(i915_request_mark_eio(rq));
/* Cancel all attached virtual engines */
@@ -3032,7 +3104,7 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
rb_erase_cached(rb, &execlists->virtual);
RB_CLEAR_NODE(rb);
- spin_lock(&ve->base.active.lock);
+ spin_lock(&ve->base.sched_engine->lock);
rq = fetch_and_zero(&ve->request);
if (rq) {
if (i915_request_mark_eio(rq)) {
@@ -3042,20 +3114,20 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
}
i915_request_put(rq);
- ve->base.execlists.queue_priority_hint = INT_MIN;
+ ve->base.sched_engine->queue_priority_hint = INT_MIN;
}
- spin_unlock(&ve->base.active.lock);
+ spin_unlock(&ve->base.sched_engine->lock);
}
/* Remaining _unready_ requests will be nop'ed when submitted */
- execlists->queue_priority_hint = INT_MIN;
- execlists->queue = RB_ROOT_CACHED;
+ sched_engine->queue_priority_hint = INT_MIN;
+ sched_engine->queue = RB_ROOT_CACHED;
- GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet));
- execlists->tasklet.callback = nop_submission_tasklet;
+ GEM_BUG_ON(__tasklet_is_enabled(&engine->sched_engine->tasklet));
+ engine->sched_engine->tasklet.callback = nop_submission_tasklet;
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
rcu_read_unlock();
}
@@ -3073,14 +3145,14 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
* reset as the next level of recovery, and as a final resort we
* will declare the device wedged.
*/
- GEM_BUG_ON(!reset_in_progress(execlists));
+ GEM_BUG_ON(!reset_in_progress(engine));
/* And kick in case we missed a new request submission. */
- if (__tasklet_enable(&execlists->tasklet))
+ if (__tasklet_enable(&engine->sched_engine->tasklet))
__execlists_kick(execlists);
ENGINE_TRACE(engine, "depth->%d\n",
- atomic_read(&execlists->tasklet.count));
+ atomic_read(&engine->sched_engine->tasklet.count));
}
static void gen8_logical_ring_enable_irq(struct intel_engine_cs *engine)
@@ -3101,6 +3173,42 @@ static void execlists_park(struct intel_engine_cs *engine)
cancel_timer(&engine->execlists.preempt);
}
+static void add_to_engine(struct i915_request *rq)
+{
+ lockdep_assert_held(&rq->engine->sched_engine->lock);
+ list_move_tail(&rq->sched.link, &rq->engine->sched_engine->requests);
+}
+
+static void remove_from_engine(struct i915_request *rq)
+{
+ struct intel_engine_cs *engine, *locked;
+
+ /*
+ * Virtual engines complicate acquiring the engine timeline lock,
+ * as their rq->engine pointer is not stable until under that
+ * engine lock. The simple ploy we use is to take the lock then
+ * check that the rq still belongs to the newly locked engine.
+ */
+ locked = READ_ONCE(rq->engine);
+ spin_lock_irq(&locked->sched_engine->lock);
+ while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
+ spin_unlock(&locked->sched_engine->lock);
+ spin_lock(&engine->sched_engine->lock);
+ locked = engine;
+ }
+ list_del_init(&rq->sched.link);
+
+ clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+ clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
+
+ /* Prevent further __await_execution() registering a cb, then flush */
+ set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
+
+ spin_unlock_irq(&locked->sched_engine->lock);
+
+ i915_request_notify_execute_cb_imm(rq);
+}
+
static bool can_preempt(struct intel_engine_cs *engine)
{
if (GRAPHICS_VER(engine->i915) > 8)
@@ -3110,11 +3218,62 @@ static bool can_preempt(struct intel_engine_cs *engine)
return engine->class != RENDER_CLASS;
}
+static void kick_execlists(const struct i915_request *rq, int prio)
+{
+ struct intel_engine_cs *engine = rq->engine;
+ struct i915_sched_engine *sched_engine = engine->sched_engine;
+ const struct i915_request *inflight;
+
+ /*
+ * We only need to kick the tasklet once for the high priority
+ * new context we add into the queue.
+ */
+ if (prio <= sched_engine->queue_priority_hint)
+ return;
+
+ rcu_read_lock();
+
+ /* Nothing currently active? We're overdue for a submission! */
+ inflight = execlists_active(&engine->execlists);
+ if (!inflight)
+ goto unlock;
+
+ /*
+ * If we are already the currently executing context, don't
+ * bother evaluating if we should preempt ourselves.
+ */
+ if (inflight->context == rq->context)
+ goto unlock;
+
+ ENGINE_TRACE(engine,
+ "bumping queue-priority-hint:%d for rq:%llx:%lld, inflight:%llx:%lld prio %d\n",
+ prio,
+ rq->fence.context, rq->fence.seqno,
+ inflight->fence.context, inflight->fence.seqno,
+ inflight->sched.attr.priority);
+
+ sched_engine->queue_priority_hint = prio;
+
+ /*
+ * Allow preemption of low -> normal -> high, but we do
+ * not allow low priority tasks to preempt other low priority
+ * tasks under the impression that latency for low priority
+ * tasks does not matter (as much as background throughput),
+ * so kiss.
+ */
+ if (prio >= max(I915_PRIORITY_NORMAL, rq_prio(inflight)))
+ tasklet_hi_schedule(&sched_engine->tasklet);
+
+unlock:
+ rcu_read_unlock();
+}
+
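Net effect of the gate above: a request only schedules the tasklet when its priority beats both the recorded queue hint and the inflight work; anything below I915_PRIORITY_NORMAL merely raises the hint. A condensed restatement using the names from the hunk:

	/* Sketch: low prio updates the hint only; NORMAL-or-better that also
	 * beats the inflight request kicks the submission tasklet. */
	sched_engine->queue_priority_hint = prio;
	if (prio >= max(I915_PRIORITY_NORMAL, rq_prio(inflight)))
		tasklet_hi_schedule(&sched_engine->tasklet);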
static void execlists_set_default_submission(struct intel_engine_cs *engine)
{
engine->submit_request = execlists_submit_request;
- engine->schedule = i915_schedule;
- engine->execlists.tasklet.callback = execlists_submission_tasklet;
+ engine->sched_engine->schedule = i915_schedule;
+ engine->sched_engine->kick_backend = kick_execlists;
+ engine->sched_engine->tasklet.callback = execlists_submission_tasklet;
}
static void execlists_shutdown(struct intel_engine_cs *engine)
@@ -3122,7 +3281,7 @@ static void execlists_shutdown(struct intel_engine_cs *engine)
/* Synchronise with residual timers and any softirq they raise */
del_timer_sync(&engine->execlists.timer);
del_timer_sync(&engine->execlists.preempt);
- tasklet_kill(&engine->execlists.tasklet);
+ tasklet_kill(&engine->sched_engine->tasklet);
}
static void execlists_release(struct intel_engine_cs *engine)
@@ -3144,6 +3303,8 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
engine->cops = &execlists_context_ops;
engine->request_alloc = execlists_request_alloc;
+ engine->add_active_request = add_to_engine;
+ engine->remove_active_request = remove_from_engine;
engine->reset.prepare = execlists_reset_prepare;
engine->reset.rewind = execlists_reset_rewind;
@@ -3238,7 +3399,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
struct intel_uncore *uncore = engine->uncore;
u32 base = engine->mmio_base;
- tasklet_setup(&engine->execlists.tasklet, execlists_submission_tasklet);
+ tasklet_setup(&engine->sched_engine->tasklet, execlists_submission_tasklet);
timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
@@ -3255,6 +3416,10 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
execlists->ctrl_reg = uncore->regs +
i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base));
+
+ engine->fw_domain = intel_uncore_forcewake_for_reg(engine->uncore,
+ RING_EXECLIST_CONTROL(engine->mmio_base),
+ FW_REG_WRITE);
} else {
execlists->submit_reg = uncore->regs +
i915_mmio_reg_offset(RING_ELSP(base));
@@ -3272,7 +3437,8 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
execlists->csb_size = GEN11_CSB_ENTRIES;
engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0);
- if (GRAPHICS_VER(engine->i915) >= 11) {
+ if (GRAPHICS_VER(engine->i915) >= 11 &&
+ GRAPHICS_VER_FULL(engine->i915) < IP_VER(12, 50)) {
execlists->ccid |= engine->instance << (GEN11_ENGINE_INSTANCE_SHIFT - 32);
execlists->ccid |= engine->class << (GEN11_ENGINE_CLASS_SHIFT - 32);
}
@@ -3286,7 +3452,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
static struct list_head *virtual_queue(struct virtual_engine *ve)
{
- return &ve->base.execlists.default_priolist.requests;
+ return &ve->base.sched_engine->default_priolist.requests;
}
static void rcu_virtual_context_destroy(struct work_struct *wrk)
@@ -3301,7 +3467,7 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
if (unlikely(ve->request)) {
struct i915_request *old;
- spin_lock_irq(&ve->base.active.lock);
+ spin_lock_irq(&ve->base.sched_engine->lock);
old = fetch_and_zero(&ve->request);
if (old) {
@@ -3310,7 +3476,7 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
i915_request_put(old);
}
- spin_unlock_irq(&ve->base.active.lock);
+ spin_unlock_irq(&ve->base.sched_engine->lock);
}
/*
@@ -3320,7 +3486,7 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
* rbtrees as in the case it is running in parallel, it may reinsert
* the rb_node into a sibling.
*/
- tasklet_kill(&ve->base.execlists.tasklet);
+ tasklet_kill(&ve->base.sched_engine->tasklet);
/* Decouple ourselves from the siblings, no more access allowed. */
for (n = 0; n < ve->num_siblings; n++) {
@@ -3330,24 +3496,26 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk)
if (RB_EMPTY_NODE(node))
continue;
- spin_lock_irq(&sibling->active.lock);
+ spin_lock_irq(&sibling->sched_engine->lock);
- /* Detachment is lazily performed in the execlists tasklet */
+ /* Detachment is lazily performed in the sched_engine->tasklet */
if (!RB_EMPTY_NODE(node))
rb_erase_cached(node, &sibling->execlists.virtual);
- spin_unlock_irq(&sibling->active.lock);
+ spin_unlock_irq(&sibling->sched_engine->lock);
}
- GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
+ GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.sched_engine->tasklet));
GEM_BUG_ON(!list_empty(virtual_queue(ve)));
lrc_fini(&ve->context);
intel_context_fini(&ve->context);
- intel_breadcrumbs_free(ve->base.breadcrumbs);
+ if (ve->base.breadcrumbs)
+ intel_breadcrumbs_put(ve->base.breadcrumbs);
+ if (ve->base.sched_engine)
+ i915_sched_engine_put(ve->base.sched_engine);
intel_engine_free_request_pool(&ve->base);
- kfree(ve->bonds);
kfree(ve);
}
@@ -3440,11 +3608,24 @@ static void virtual_context_exit(struct intel_context *ce)
intel_engine_pm_put(ve->siblings[n]);
}
+static struct intel_engine_cs *
+virtual_get_sibling(struct intel_engine_cs *engine, unsigned int sibling)
+{
+ struct virtual_engine *ve = to_virtual_engine(engine);
+
+ if (sibling >= ve->num_siblings)
+ return NULL;
+
+ return ve->siblings[sibling];
+}
+
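The new hook is also exported through the intel_context_ops.get_sibling entry below; a hypothetical enumeration sketch:

	/* Sketch: walk the physical engines backing a virtual engine. */
	struct intel_engine_cs *sibling;
	unsigned int n;

	for (n = 0; (sibling = virtual_get_sibling(engine, n)); n++)
		drm_dbg(&sibling->i915->drm, "sibling %u: %s\n",
			n, sibling->name);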
static const struct intel_context_ops virtual_context_ops = {
.flags = COPS_HAS_INFLIGHT,
.alloc = virtual_context_alloc,
+ .cancel_request = execlists_context_cancel_request,
+
.pre_pin = virtual_context_pre_pin,
.pin = virtual_context_pin,
.unpin = lrc_unpin,
@@ -3454,6 +3635,8 @@ static const struct intel_context_ops virtual_context_ops = {
.exit = virtual_context_exit,
.destroy = virtual_context_destroy,
+
+ .get_sibling = virtual_get_sibling,
};
static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
@@ -3475,16 +3658,18 @@ static intel_engine_mask_t virtual_submission_mask(struct virtual_engine *ve)
ENGINE_TRACE(&ve->base, "rq=%llx:%lld, mask=%x, prio=%d\n",
rq->fence.context, rq->fence.seqno,
- mask, ve->base.execlists.queue_priority_hint);
+ mask, ve->base.sched_engine->queue_priority_hint);
return mask;
}
static void virtual_submission_tasklet(struct tasklet_struct *t)
{
+ struct i915_sched_engine *sched_engine =
+ from_tasklet(sched_engine, t, tasklet);
struct virtual_engine * const ve =
- from_tasklet(ve, t, base.execlists.tasklet);
- const int prio = READ_ONCE(ve->base.execlists.queue_priority_hint);
+ (struct virtual_engine *)sched_engine->private_data;
+ const int prio = READ_ONCE(sched_engine->queue_priority_hint);
intel_engine_mask_t mask;
unsigned int n;
@@ -3503,7 +3688,7 @@ static void virtual_submission_tasklet(struct tasklet_struct *t)
if (!READ_ONCE(ve->request))
break; /* already handled by a sibling's tasklet */
- spin_lock_irq(&sibling->active.lock);
+ spin_lock_irq(&sibling->sched_engine->lock);
if (unlikely(!(mask & sibling->mask))) {
if (!RB_EMPTY_NODE(&node->rb)) {
@@ -3552,11 +3737,11 @@ static void virtual_submission_tasklet(struct tasklet_struct *t)
submit_engine:
GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
node->prio = prio;
- if (first && prio > sibling->execlists.queue_priority_hint)
- tasklet_hi_schedule(&sibling->execlists.tasklet);
+ if (first && prio > sibling->sched_engine->queue_priority_hint)
+ tasklet_hi_schedule(&sibling->sched_engine->tasklet);
unlock_engine:
- spin_unlock_irq(&sibling->active.lock);
+ spin_unlock_irq(&sibling->sched_engine->lock);
if (intel_context_inflight(&ve->context))
break;
@@ -3574,7 +3759,7 @@ static void virtual_submit_request(struct i915_request *rq)
GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
- spin_lock_irqsave(&ve->base.active.lock, flags);
+ spin_lock_irqsave(&ve->base.sched_engine->lock, flags);
/* By the time we resubmit a request, it may be completed */
if (__i915_request_is_complete(rq)) {
@@ -3588,68 +3773,25 @@ static void virtual_submit_request(struct i915_request *rq)
i915_request_put(ve->request);
}
- ve->base.execlists.queue_priority_hint = rq_prio(rq);
+ ve->base.sched_engine->queue_priority_hint = rq_prio(rq);
ve->request = i915_request_get(rq);
GEM_BUG_ON(!list_empty(virtual_queue(ve)));
list_move_tail(&rq->sched.link, virtual_queue(ve));
- tasklet_hi_schedule(&ve->base.execlists.tasklet);
+ tasklet_hi_schedule(&ve->base.sched_engine->tasklet);
unlock:
- spin_unlock_irqrestore(&ve->base.active.lock, flags);
+ spin_unlock_irqrestore(&ve->base.sched_engine->lock, flags);
}
-static struct ve_bond *
-virtual_find_bond(struct virtual_engine *ve,
- const struct intel_engine_cs *master)
-{
- int i;
-
- for (i = 0; i < ve->num_bonds; i++) {
- if (ve->bonds[i].master == master)
- return &ve->bonds[i];
- }
-
- return NULL;
-}
-
-static void
-virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
-{
- struct virtual_engine *ve = to_virtual_engine(rq->engine);
- intel_engine_mask_t allowed, exec;
- struct ve_bond *bond;
-
- allowed = ~to_request(signal)->engine->mask;
-
- bond = virtual_find_bond(ve, to_request(signal)->engine);
- if (bond)
- allowed &= bond->sibling_mask;
-
- /* Restrict the bonded request to run on only the available engines */
- exec = READ_ONCE(rq->execution_mask);
- while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
- ;
-
- /* Prevent the master from being re-run on the bonded engines */
- to_request(signal)->execution_mask &= ~allowed;
-}
-
-struct intel_context *
-intel_execlists_create_virtual(struct intel_engine_cs **siblings,
- unsigned int count)
+static struct intel_context *
+execlists_create_virtual(struct intel_engine_cs **siblings, unsigned int count)
{
struct virtual_engine *ve;
unsigned int n;
int err;
- if (count == 0)
- return ERR_PTR(-EINVAL);
-
- if (count == 1)
- return intel_context_create(siblings[0]);
-
ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
if (!ve)
return ERR_PTR(-ENOMEM);
@@ -3681,19 +3823,24 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
- intel_engine_init_active(&ve->base, ENGINE_VIRTUAL);
intel_engine_init_execlists(&ve->base);
+ ve->base.sched_engine = i915_sched_engine_create(ENGINE_VIRTUAL);
+ if (!ve->base.sched_engine) {
+ err = -ENOMEM;
+ goto err_put;
+ }
+ ve->base.sched_engine->private_data = &ve->base;
+
ve->base.cops = &virtual_context_ops;
ve->base.request_alloc = execlists_request_alloc;
- ve->base.schedule = i915_schedule;
+ ve->base.sched_engine->schedule = i915_schedule;
+ ve->base.sched_engine->kick_backend = kick_execlists;
ve->base.submit_request = virtual_submit_request;
- ve->base.bond_execute = virtual_bond_execute;
INIT_LIST_HEAD(virtual_queue(ve));
- ve->base.execlists.queue_priority_hint = INT_MIN;
- tasklet_setup(&ve->base.execlists.tasklet, virtual_submission_tasklet);
+ tasklet_setup(&ve->base.sched_engine->tasklet, virtual_submission_tasklet);
intel_context_init(&ve->context, &ve->base);
@@ -3721,7 +3868,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
* layering if we handle cloning of the requests and
* submitting a copy into each backend.
*/
- if (sibling->execlists.tasklet.callback !=
+ if (sibling->sched_engine->tasklet.callback !=
execlists_submission_tasklet) {
err = -ENODEV;
goto err_put;
@@ -3756,6 +3903,8 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
"v%dx%d", ve->base.class, count);
ve->base.context_size = sibling->context_size;
+ ve->base.add_active_request = sibling->add_active_request;
+ ve->base.remove_active_request = sibling->remove_active_request;
ve->base.emit_bb_start = sibling->emit_bb_start;
ve->base.emit_flush = sibling->emit_flush;
ve->base.emit_init_breadcrumb = sibling->emit_init_breadcrumb;
@@ -3776,70 +3925,6 @@ err_put:
return ERR_PTR(err);
}
-struct intel_context *
-intel_execlists_clone_virtual(struct intel_engine_cs *src)
-{
- struct virtual_engine *se = to_virtual_engine(src);
- struct intel_context *dst;
-
- dst = intel_execlists_create_virtual(se->siblings,
- se->num_siblings);
- if (IS_ERR(dst))
- return dst;
-
- if (se->num_bonds) {
- struct virtual_engine *de = to_virtual_engine(dst->engine);
-
- de->bonds = kmemdup(se->bonds,
- sizeof(*se->bonds) * se->num_bonds,
- GFP_KERNEL);
- if (!de->bonds) {
- intel_context_put(dst);
- return ERR_PTR(-ENOMEM);
- }
-
- de->num_bonds = se->num_bonds;
- }
-
- return dst;
-}
-
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
- const struct intel_engine_cs *master,
- const struct intel_engine_cs *sibling)
-{
- struct virtual_engine *ve = to_virtual_engine(engine);
- struct ve_bond *bond;
- int n;
-
- /* Sanity check the sibling is part of the virtual engine */
- for (n = 0; n < ve->num_siblings; n++)
- if (sibling == ve->siblings[n])
- break;
- if (n == ve->num_siblings)
- return -EINVAL;
-
- bond = virtual_find_bond(ve, master);
- if (bond) {
- bond->sibling_mask |= sibling->mask;
- return 0;
- }
-
- bond = krealloc(ve->bonds,
- sizeof(*bond) * (ve->num_bonds + 1),
- GFP_KERNEL);
- if (!bond)
- return -ENOMEM;
-
- bond[ve->num_bonds].master = master;
- bond[ve->num_bonds].sibling_mask = sibling->mask;
-
- ve->bonds = bond;
- ve->num_bonds++;
-
- return 0;
-}
-
void intel_execlists_show_requests(struct intel_engine_cs *engine,
struct drm_printer *m,
void (*show_request)(struct drm_printer *m,
@@ -3849,16 +3934,17 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
unsigned int max)
{
const struct intel_engine_execlists *execlists = &engine->execlists;
+ struct i915_sched_engine *sched_engine = engine->sched_engine;
struct i915_request *rq, *last;
unsigned long flags;
unsigned int count;
struct rb_node *rb;
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&sched_engine->lock, flags);
last = NULL;
count = 0;
- list_for_each_entry(rq, &engine->active.requests, sched.link) {
+ list_for_each_entry(rq, &sched_engine->requests, sched.link) {
if (count++ < max - 1)
show_request(m, rq, "\t\t", 0);
else
@@ -3873,13 +3959,13 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
show_request(m, last, "\t\t", 0);
}
- if (execlists->queue_priority_hint != INT_MIN)
+ if (sched_engine->queue_priority_hint != INT_MIN)
drm_printf(m, "\t\tQueue priority hint: %d\n",
- READ_ONCE(execlists->queue_priority_hint));
+ READ_ONCE(sched_engine->queue_priority_hint));
last = NULL;
count = 0;
- for (rb = rb_first_cached(&execlists->queue); rb; rb = rb_next(rb)) {
+ for (rb = rb_first_cached(&sched_engine->queue); rb; rb = rb_next(rb)) {
struct i915_priolist *p = rb_entry(rb, typeof(*p), node);
priolist_for_each_request(rq, p) {
@@ -3921,7 +4007,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
show_request(m, last, "\t\t", 0);
}
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
index 4ca9b475e252..a1aa92c983a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -32,15 +32,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
int indent),
unsigned int max);
-struct intel_context *
-intel_execlists_create_virtual(struct intel_engine_cs **siblings,
- unsigned int count);
-
-struct intel_context *
-intel_execlists_clone_virtual(struct intel_engine_cs *src);
-
-int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
- const struct intel_engine_cs *master,
- const struct intel_engine_cs *sibling);
+bool
+intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
#endif /* __INTEL_EXECLISTS_SUBMISSION_H__ */
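With the virtual-engine create/clone/bond exports gone, the only remaining cross-module hook in this header is the submission-mode predicate above. A minimal caller-side sketch (the guard function is hypothetical; only intel_engine_in_execlists_submission_mode() comes from this header):

static int guard_execlists_only_path(struct intel_engine_cs *engine)
{
	/* e.g. GuC submission may own the engine's tasklet instead */
	if (!intel_engine_in_execlists_submission_mode(engine))
		return -ENODEV;

	/* ... execlists-specific work ... */
	return 0;
}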
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 20e46b843324..de3ac58fceec 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -826,13 +826,13 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
phys_addr = pci_resource_start(pdev, 0) + pci_resource_len(pdev, 0) / 2;
/*
- * On BXT+/CNL+ writes larger than 64 bit to the GTT pagetable range
+ * On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
* will be dropped. For WC mappings in general we have 64 byte burst
* writes when the WC buffer is flushed, so we can't use it, but have to
* resort to an uncached mapping. The WC issue is easily caught by the
* readback check when writing GTT PTE entries.
*/
- if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 10)
+ if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
ggtt->gsm = ioremap(phys_addr, size);
else
ggtt->gsm = ioremap_wc(phys_addr, size);
@@ -1494,7 +1494,7 @@ intel_partial_pages(const struct i915_ggtt_view *view,
if (ret)
goto err_sg_alloc;
- iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset, true);
+ iter = i915_gem_object_get_sg_dma(obj, view->partial.offset, &offset);
GEM_BUG_ON(!iter);
sg = st->sgl;
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 2694dbb9967e..1c3af0fc0456 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -123,8 +123,10 @@
#define MI_SEMAPHORE_SAD_NEQ_SDD (5 << 12)
#define MI_SEMAPHORE_TOKEN_MASK REG_GENMASK(9, 5)
#define MI_SEMAPHORE_TOKEN_SHIFT 5
+#define MI_STORE_DATA_IMM MI_INSTR(0x20, 0)
#define MI_STORE_DWORD_IMM MI_INSTR(0x20, 1)
#define MI_STORE_DWORD_IMM_GEN4 MI_INSTR(0x20, 2)
+#define MI_STORE_QWORD_IMM_GEN8 (MI_INSTR(0x20, 3) | REG_BIT(21))
#define MI_MEM_VIRTUAL (1 << 22) /* 945,g33,965 */
#define MI_USE_GGTT (1 << 22) /* g4x+ */
#define MI_STORE_DWORD_INDEX MI_INSTR(0x21, 1)
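The two new opcodes above are what the migration code added later in this series uses to write PTEs inline from the ring. A minimal sketch of a single qword store, assuming cs, addr and value come from the usual intel_ring_begin() flow; bit 21 selects qword elements, matching how emit_pte() in intel_migrate.c builds its packets:

	*cs++ = MI_STORE_DATA_IMM | REG_BIT(21) | (5 - 2); /* DW length = total - 2 */
	*cs++ = lower_32_bits(addr);	/* 64-bit GTT offset of the PTE */
	*cs++ = upper_32_bits(addr);
	*cs++ = lower_32_bits(value);	/* the PTE itself, as one qword */
	*cs++ = upper_32_bits(value);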
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 2161bf01ef8b..62d40c986642 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -13,6 +13,7 @@
#include "intel_gt_clock_utils.h"
#include "intel_gt_pm.h"
#include "intel_gt_requests.h"
+#include "intel_migrate.h"
#include "intel_mocs.h"
#include "intel_rc6.h"
#include "intel_renderstate.h"
@@ -40,8 +41,8 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
intel_gt_init_timelines(gt);
intel_gt_pm_init_early(gt);
- intel_rps_init_early(&gt->rps);
intel_uc_init_early(&gt->uc);
+ intel_rps_init_early(&gt->rps);
}
int intel_gt_probe_lmem(struct intel_gt *gt)
@@ -83,13 +84,73 @@ void intel_gt_init_hw_early(struct intel_gt *gt, struct i915_ggtt *ggtt)
gt->ggtt = ggtt;
}
+static const struct intel_mmio_range icl_l3bank_steering_table[] = {
+ { 0x00B100, 0x00B3FF },
+ {},
+};
+
+static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
+ { 0x004000, 0x004AFF },
+ { 0x00C800, 0x00CFFF },
+ { 0x00DD00, 0x00DDFF },
+ { 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
+ {},
+};
+
+static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
+ { 0x00B000, 0x00B0FF },
+ { 0x00D800, 0x00D8FF },
+ {},
+};
+
+static const struct intel_mmio_range dg2_lncf_steering_table[] = {
+ { 0x00B000, 0x00B0FF },
+ { 0x00D880, 0x00D8FF },
+ {},
+};
+
+static u16 slicemask(struct intel_gt *gt, int count)
+{
+ u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
+
+ return intel_slicemask_from_dssmask(dss_mask, count);
+}
+
int intel_gt_init_mmio(struct intel_gt *gt)
{
+ struct drm_i915_private *i915 = gt->i915;
+
intel_gt_init_clock_frequency(gt);
intel_uc_init_mmio(&gt->uc);
intel_sseu_info_init(gt);
+ /*
+ * An mslice is unavailable only if both the meml3 for the slice is
+ * disabled *and* all of the DSS in the slice (quadrant) are disabled.
+ */
+ if (HAS_MSLICES(i915))
+ gt->info.mslice_mask =
+ slicemask(gt, GEN_DSS_PER_MSLICE) |
+ (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
+ GEN12_MEML3_EN_MASK);
+
+ if (IS_DG2(i915)) {
+ gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+ gt->steering_table[LNCF] = dg2_lncf_steering_table;
+ } else if (IS_XEHPSDV(i915)) {
+ gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+ gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
+ } else if (GRAPHICS_VER(i915) >= 11 &&
+ GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
+ gt->steering_table[L3BANK] = icl_l3bank_steering_table;
+ gt->info.l3bank_mask =
+ ~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
+ GEN10_L3BANK_MASK;
+ } else if (HAS_MSLICES(i915)) {
+ MISSING_CASE(INTEL_INFO(i915)->platform);
+ }
+
return intel_engines_init_mmio(gt);
}
@@ -192,7 +253,7 @@ static void clear_register(struct intel_uncore *uncore, i915_reg_t reg)
intel_uncore_rmw(uncore, reg, 0, 0);
}
-static void gen8_clear_engine_error_register(struct intel_engine_cs *engine)
+static void gen6_clear_engine_error_register(struct intel_engine_cs *engine)
{
GEN6_RING_FAULT_REG_RMW(engine, RING_FAULT_VALID, 0);
GEN6_RING_FAULT_REG_POSTING_READ(engine);
@@ -238,7 +299,7 @@ intel_gt_clear_error_registers(struct intel_gt *gt,
enum intel_engine_id id;
for_each_engine_masked(engine, gt, engine_mask, id)
- gen8_clear_engine_error_register(engine);
+ gen6_clear_engine_error_register(engine);
}
}
@@ -572,6 +633,25 @@ static void __intel_gt_disable(struct intel_gt *gt)
GEM_BUG_ON(intel_gt_pm_is_awake(gt));
}
+int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout)
+{
+ long remaining_timeout;
+
+ /* If the device is asleep, we have no requests outstanding */
+ if (!intel_gt_pm_is_awake(gt))
+ return 0;
+
+ while ((timeout = intel_gt_retire_requests_timeout(gt, timeout,
+ &remaining_timeout)) > 0) {
+ cond_resched();
+ if (signal_pending(current))
+ return -EINTR;
+ }
+
+ return timeout ? timeout : intel_uc_wait_for_idle(&gt->uc,
+ remaining_timeout);
+}
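A hedged usage sketch of the relocated helper: it returns 0 once the GT (and, via intel_uc_wait_for_idle(), the GuC) has idled or the device was already asleep, and -EINTR if a signal arrives while retiring. The call site below is illustrative, not from the patch:

	long err = intel_gt_wait_for_idle(gt, msecs_to_jiffies(200));
	if (err == -EINTR)
		return err;	/* a signal arrived while retiring requests */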
+
int intel_gt_init(struct intel_gt *gt)
{
int err;
@@ -622,10 +702,14 @@ int intel_gt_init(struct intel_gt *gt)
if (err)
goto err_gt;
+ intel_uc_init_late(&gt->uc);
+
err = i915_inject_probe_error(gt->i915, -EIO);
if (err)
goto err_gt;
+ intel_migrate_init(&gt->migrate, gt);
+
goto out_fw;
err_gt:
__intel_gt_disable(gt);
@@ -649,6 +733,7 @@ void intel_gt_driver_remove(struct intel_gt *gt)
{
__intel_gt_disable(gt);
+ intel_migrate_fini(&gt->migrate);
intel_uc_driver_remove(&gt->uc);
intel_engines_release(gt);
@@ -697,6 +782,112 @@ void intel_gt_driver_late_release(struct intel_gt *gt)
intel_engines_free(gt);
}
+/**
+ * intel_gt_reg_needs_read_steering - determine whether a register read
+ * requires explicit steering
+ * @gt: GT structure
+ * @reg: the register to check steering requirements for
+ * @type: type of multicast steering to check
+ *
+ * Determines whether @reg needs explicit steering of a specific type for
+ * reads.
+ *
+ * Returns false if @reg does not belong to a register range of the given
+ * steering type, or if the default (subslice-based) steering IDs are suitable
+ * for @type steering too.
+ */
+static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
+ i915_reg_t reg,
+ enum intel_steering_type type)
+{
+ const u32 offset = i915_mmio_reg_offset(reg);
+ const struct intel_mmio_range *entry;
+
+ if (likely(!intel_gt_needs_read_steering(gt, type)))
+ return false;
+
+ for (entry = gt->steering_table[type]; entry->end; entry++) {
+ if (offset >= entry->start && offset <= entry->end)
+ return true;
+ }
+
+ return false;
+}
+
+/**
+ * intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
+ * @gt: GT structure
+ * @type: multicast register type
+ * @sliceid: Slice ID returned
+ * @subsliceid: Subslice ID returned
+ *
+ * Determines sliceid and subsliceid values that will steer reads
+ * of a specific multicast register class to a valid value.
+ */
+static void intel_gt_get_valid_steering(struct intel_gt *gt,
+ enum intel_steering_type type,
+ u8 *sliceid, u8 *subsliceid)
+{
+ switch (type) {
+ case L3BANK:
+ GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
+
+ *sliceid = 0; /* unused */
+ *subsliceid = __ffs(gt->info.l3bank_mask);
+ break;
+ case MSLICE:
+ GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
+
+ *sliceid = __ffs(gt->info.mslice_mask);
+ *subsliceid = 0; /* unused */
+ break;
+ case LNCF:
+ GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
+
+ /*
+ * An LNCF is always present if its mslice is present, so we
+ * can safely just steer to LNCF 0 in all cases.
+ */
+ *sliceid = __ffs(gt->info.mslice_mask) << 1;
+ *subsliceid = 0; /* unused */
+ break;
+ default:
+ MISSING_CASE(type);
+ *sliceid = 0;
+ *subsliceid = 0;
+ }
+}
+
+/**
+ * intel_gt_read_register_fw - reads a GT register with support for multicast
+ * @gt: GT structure
+ * @reg: register to read
+ *
+ * This function will read a GT register. If the register is a multicast
+ * register, the read will be steered to a valid instance (i.e., one that
+ * isn't fused off or powered down by power gating).
+ *
+ * Returns the value from a valid instance of @reg.
+ */
+u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
+{
+ int type;
+ u8 sliceid, subsliceid;
+
+ for (type = 0; type < NUM_STEERING_TYPES; type++) {
+ if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
+ intel_gt_get_valid_steering(gt, type, &sliceid,
+ &subsliceid);
+ return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
+ reg,
+ sliceid,
+ subsliceid);
+ }
+ }
+
+ return intel_uncore_read_fw(gt->uncore, reg);
+}
+
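A sketch of the intended call pattern (the register name is a placeholder, not a real definition): multicast registers read through the helper so a fused-off mslice or L3 bank never answers, and everything else falls through to a plain unsteered read:

	/* SOME_MCR_REG is hypothetical; any register inside a steering-table
	 * range gets an explicitly steered read, others do not. */
	u32 val = intel_gt_read_register_fw(gt, SOME_MCR_REG);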
void intel_gt_info_print(const struct intel_gt_info *info,
struct drm_printer *p)
{
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h
index 7ec395cace69..74e771871a9b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt.h
@@ -48,6 +48,8 @@ void intel_gt_driver_release(struct intel_gt *gt);
void intel_gt_driver_late_release(struct intel_gt *gt);
+int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
+
void intel_gt_check_and_clear_faults(struct intel_gt *gt);
void intel_gt_clear_error_registers(struct intel_gt *gt,
intel_engine_mask_t engine_mask);
@@ -75,6 +77,14 @@ static inline bool intel_gt_is_wedged(const struct intel_gt *gt)
return unlikely(test_bit(I915_WEDGED, &gt->reset.flags));
}
+static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
+ enum intel_steering_type type)
+{
+ return gt->steering_table[type];
+}
+
+u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
+
void intel_gt_info_print(const struct intel_gt_info *info,
struct drm_printer *p);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c b/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
index 9f0e729d2d15..3513d6f90747 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c
@@ -24,8 +24,8 @@ static u32 read_reference_ts_freq(struct intel_uncore *uncore)
return base_freq + frac_freq;
}
-static u32 gen10_get_crystal_clock_freq(struct intel_uncore *uncore,
- u32 rpm_config_reg)
+static u32 gen9_get_crystal_clock_freq(struct intel_uncore *uncore,
+ u32 rpm_config_reg)
{
u32 f19_2_mhz = 19200000;
u32 f24_mhz = 24000000;
@@ -128,10 +128,10 @@ static u32 read_clock_frequency(struct intel_uncore *uncore)
} else {
u32 c0 = intel_uncore_read(uncore, RPM_CONFIG0);
- if (GRAPHICS_VER(uncore->i915) <= 10)
- freq = gen10_get_crystal_clock_freq(uncore, c0);
- else
+ if (GRAPHICS_VER(uncore->i915) >= 11)
freq = gen11_get_crystal_clock_freq(uncore, c0);
+ else
+ freq = gen9_get_crystal_clock_freq(uncore, c0);
/*
* Now figure out how the command stream's timestamp
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index c13462274fe8..b2de83be4d97 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0);
+ if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
+ intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0);
+ if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
+ intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0);
+ if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
+ intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
@@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
+ if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
+ intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask);
+ if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
+ intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);
-
+ if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
+ intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask);
/*
* RPS interrupts will get enabled/disabled on demand when RPS itself
* is enabled/disabled.
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index aef3084e8b16..dea8e2479897 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -6,7 +6,6 @@
#include <linux/suspend.h>
#include "i915_drv.h"
-#include "i915_globals.h"
#include "i915_params.h"
#include "intel_context.h"
#include "intel_engine_pm.h"
@@ -67,8 +66,6 @@ static int __gt_unpark(struct intel_wakeref *wf)
GT_TRACE(gt, "\n");
- i915_globals_unpark();
-
/*
* It seems that the DMC likes to transition between the DC states a lot
* when there are no connected displays (no active power domains) during
@@ -116,8 +113,6 @@ static int __gt_park(struct intel_wakeref *wf)
GEM_BUG_ON(!wakeref);
intel_display_power_put_async(i915, POWER_DOMAIN_GT_IRQ, wakeref);
- i915_globals_park();
-
return 0;
}
@@ -174,8 +169,6 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
if (intel_gt_is_wedged(gt))
intel_gt_unset_wedged(gt);
- intel_uc_sanitize(&gt->uc);
-
for_each_engine(engine, gt, id)
if (engine->reset.prepare)
engine->reset.prepare(engine);
@@ -191,6 +184,8 @@ static void gt_sanitize(struct intel_gt *gt, bool force)
__intel_engine_reset(engine, false);
}
+ intel_uc_reset(&gt->uc, false);
+
for_each_engine(engine, gt, id)
if (engine->reset.finish)
engine->reset.finish(engine);
@@ -243,6 +238,8 @@ int intel_gt_resume(struct intel_gt *gt)
goto err_wedged;
}
+ intel_uc_reset_finish(&gt->uc);
+
intel_rps_enable(&gt->rps);
intel_llc_enable(&gt->llc);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 647eca9d867a..edb881d75630 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -130,7 +130,8 @@ void intel_engine_fini_retire(struct intel_engine_cs *engine)
GEM_BUG_ON(engine->retire);
}
-long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
+long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
+ long *remaining_timeout)
{
struct intel_gt_timelines *timelines = &gt->timelines;
struct intel_timeline *tl, *tn;
@@ -195,22 +196,10 @@ out_active: spin_lock(&timelines->lock);
if (flush_submission(gt, timeout)) /* Wait, there's more! */
active_count++;
- return active_count ? timeout : 0;
-}
-
-int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout)
-{
- /* If the device is asleep, we have no requests outstanding */
- if (!intel_gt_pm_is_awake(gt))
- return 0;
-
- while ((timeout = intel_gt_retire_requests_timeout(gt, timeout)) > 0) {
- cond_resched();
- if (signal_pending(current))
- return -EINTR;
- }
+ if (remaining_timeout)
+ *remaining_timeout = timeout;
- return timeout;
+ return active_count ? timeout : 0;
}
static void retire_work_handler(struct work_struct *work)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.h b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
index fcc30a6e4fe9..51dbe0e3294e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.h
@@ -6,14 +6,17 @@
#ifndef INTEL_GT_REQUESTS_H
#define INTEL_GT_REQUESTS_H
+#include <stddef.h>
+
struct intel_engine_cs;
struct intel_gt;
struct intel_timeline;
-long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout);
+long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout,
+ long *remaining_timeout);
static inline void intel_gt_retire_requests(struct intel_gt *gt)
{
- intel_gt_retire_requests_timeout(gt, 0);
+ intel_gt_retire_requests_timeout(gt, 0, NULL);
}
void intel_engine_init_retire(struct intel_engine_cs *engine);
@@ -21,8 +24,6 @@ void intel_engine_add_retire(struct intel_engine_cs *engine,
struct intel_timeline *tl);
void intel_engine_fini_retire(struct intel_engine_cs *engine);
-int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout);
-
void intel_gt_init_requests(struct intel_gt *gt);
void intel_gt_park_requests(struct intel_gt *gt);
void intel_gt_unpark_requests(struct intel_gt *gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index fecfacf551d5..a81e21bf1bd1 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -24,6 +24,7 @@
#include "intel_reset_types.h"
#include "intel_rc6_types.h"
#include "intel_rps_types.h"
+#include "intel_migrate_types.h"
#include "intel_wakeref.h"
struct drm_i915_private;
@@ -31,6 +32,33 @@ struct i915_ggtt;
struct intel_engine_cs;
struct intel_uncore;
+struct intel_mmio_range {
+ u32 start;
+ u32 end;
+};
+
+/*
+ * The hardware has multiple kinds of multicast register ranges that need
+ * special register steering (and future platforms are expected to add
+ * additional types).
+ *
+ * During driver startup, we initialize the steering control register to
+ * direct reads to a slice/subslice that are valid for the 'subslice' class
+ * of multicast registers. If another type of steering does not have any
+ * overlap in valid steering targets with 'subslice' style registers, we will
+ * need to explicitly re-steer reads of registers of the other type.
+ *
+ * Only the replication types that may need additional non-default steering
+ * are listed here.
+ */
+enum intel_steering_type {
+ L3BANK,
+ MSLICE,
+ LNCF,
+
+ NUM_STEERING_TYPES
+};
+
enum intel_submission_method {
INTEL_SUBMISSION_RING,
INTEL_SUBMISSION_ELSP,
@@ -145,8 +173,15 @@ struct intel_gt {
struct i915_vma *scratch;
+ struct intel_migrate migrate;
+
+ const struct intel_mmio_range *steering_table[NUM_STEERING_TYPES];
+
struct intel_gt_info {
intel_engine_mask_t engine_mask;
+
+ u32 l3bank_mask;
+
u8 num_engines;
/* Media engine access to SFC per instance */
@@ -154,6 +189,8 @@ struct intel_gt {
/* Slice/subslice/EU info */
struct sseu_dev_info sseu;
+
+ unsigned long mslice_mask;
} info;
};
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 084ea65d59c0..e137dd32b5b8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -16,7 +16,19 @@ struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
{
struct drm_i915_gem_object *obj;
- obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+ /*
+ * To avoid severe over-allocation when dealing with min_page_size
+ * restrictions, we override that behaviour here by allowing an object
+ * size and page layout which can be smaller. In practice this should be
+ * totally fine, since GTT paging structures are not typically inserted
+ * into the GTT.
+ *
+ * Note that we also hit this path for the scratch page, and for this
+ * case it might need to be 64K, but that should work fine here since we
+ * used the passed in size for the page size, which should ensure it
+ * also has the same alignment.
+ */
+ obj = __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz, 0);
/*
* Ensure all paging structures for this vm share the same dma-resv
* object underneath, with the idea that one object_lock() will lock
@@ -414,7 +426,7 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore)
intel_uncore_write(uncore, GEN12_PAT_INDEX(7), GEN8_PPAT_WB);
}
-static void cnl_setup_private_ppat(struct intel_uncore *uncore)
+static void icl_setup_private_ppat(struct intel_uncore *uncore)
{
intel_uncore_write(uncore,
GEN10_PAT_INDEX(0),
@@ -514,8 +526,8 @@ void setup_private_pat(struct intel_uncore *uncore)
if (GRAPHICS_VER(i915) >= 12)
tgl_setup_private_ppat(uncore);
- else if (GRAPHICS_VER(i915) >= 10)
- cnl_setup_private_ppat(uncore);
+ else if (GRAPHICS_VER(i915) >= 11)
+ icl_setup_private_ppat(uncore);
else if (IS_CHERRYVIEW(i915) || IS_GEN9_LP(i915))
chv_setup_private_ppat(uncore);
else
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index edea95b97c36..bc7153018ebd 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -140,7 +140,6 @@ typedef u64 gen8_pte_t;
enum i915_cache_level;
-struct drm_i915_file_private;
struct drm_i915_gem_object;
struct i915_fence_reg;
struct i915_vma;
@@ -220,16 +219,6 @@ struct i915_address_space {
struct intel_gt *gt;
struct drm_i915_private *i915;
struct device *dma;
- /*
- * Every address space belongs to a struct file - except for the global
- * GTT that is owned by the driver (and so @file is set to NULL). In
- * principle, no information should leak from one context to another
- * (or between files/processes etc) unless explicitly shared by the
- * owner. Tracking the owner is important in order to free up per-file
- * objects along with the file, to aide resource tracking, and to
- * assign blame.
- */
- struct drm_i915_file_private *file;
u64 total; /* size addr space maps (ex. 2GB for ggtt) */
u64 reserved; /* size addr space reserved */
@@ -296,6 +285,13 @@ struct i915_address_space {
u32 flags);
void (*cleanup)(struct i915_address_space *vm);
+ void (*foreach)(struct i915_address_space *vm,
+ u64 start, u64 length,
+ void (*fn)(struct i915_address_space *vm,
+ struct i915_page_table *pt,
+ void *data),
+ void *data);
+
struct i915_vma_ops vma_ops;
I915_SELFTEST_DECLARE(struct fault_attr fault_attr);
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index a27bac0a4bfb..bb4af4977920 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -70,7 +70,7 @@ static void set_offsets(u32 *regs,
if (close) {
/* Close the batch; used mainly by live_lrc_layout() */
*regs = MI_BATCH_BUFFER_END;
- if (GRAPHICS_VER(engine->i915) >= 10)
+ if (GRAPHICS_VER(engine->i915) >= 11)
*regs |= BIT(0);
}
}
@@ -484,6 +484,47 @@ static const u8 gen12_rcs_offsets[] = {
END
};
+static const u8 xehp_rcs_offsets[] = {
+ NOP(1),
+ LRI(13, POSTED),
+ REG16(0x244),
+ REG(0x034),
+ REG(0x030),
+ REG(0x038),
+ REG(0x03c),
+ REG(0x168),
+ REG(0x140),
+ REG(0x110),
+ REG(0x1c0),
+ REG(0x1c4),
+ REG(0x1c8),
+ REG(0x180),
+ REG16(0x2b4),
+
+ NOP(5),
+ LRI(9, POSTED),
+ REG16(0x3a8),
+ REG16(0x28c),
+ REG16(0x288),
+ REG16(0x284),
+ REG16(0x280),
+ REG16(0x27c),
+ REG16(0x278),
+ REG16(0x274),
+ REG16(0x270),
+
+ LRI(3, POSTED),
+ REG(0x1b0),
+ REG16(0x5a8),
+ REG16(0x5ac),
+
+ NOP(6),
+ LRI(1, 0),
+ REG(0x0c8),
+
+ END
+};
+
#undef END
#undef REG16
#undef REG
@@ -502,7 +543,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
!intel_engine_has_relative_mmio(engine));
if (engine->class == RENDER_CLASS) {
- if (GRAPHICS_VER(engine->i915) >= 12)
+ if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+ return xehp_rcs_offsets;
+ else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_rcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 11)
return gen11_rcs_offsets;
@@ -522,7 +565,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
{
- if (GRAPHICS_VER(engine->i915) >= 12)
+ if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+ return 0x70;
+ else if (GRAPHICS_VER(engine->i915) >= 12)
return 0x60;
else if (GRAPHICS_VER(engine->i915) >= 9)
return 0x54;
@@ -534,7 +579,9 @@ static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
static int lrc_ring_gpr0(const struct intel_engine_cs *engine)
{
- if (GRAPHICS_VER(engine->i915) >= 12)
+ if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+ return 0x84;
+ else if (GRAPHICS_VER(engine->i915) >= 12)
return 0x74;
else if (GRAPHICS_VER(engine->i915) >= 9)
return 0x68;
@@ -578,10 +625,16 @@ static int lrc_ring_indirect_offset(const struct intel_engine_cs *engine)
static int lrc_ring_cmd_buf_cctl(const struct intel_engine_cs *engine)
{
- if (engine->class != RENDER_CLASS)
- return -1;
- if (GRAPHICS_VER(engine->i915) >= 12)
+ if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+ /*
+ * Note that the CSFE context has a dummy slot for CMD_BUF_CCTL
+ * simply to match the RCS context image layout.
+ */
+ return 0xc6;
+ else if (engine->class != RENDER_CLASS)
+ return -1;
+ else if (GRAPHICS_VER(engine->i915) >= 12)
return 0xb6;
else if (GRAPHICS_VER(engine->i915) >= 11)
return 0xaa;
@@ -600,8 +653,6 @@ lrc_ring_indirect_offset_default(const struct intel_engine_cs *engine)
return GEN12_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
case 11:
return GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
- case 10:
- return GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
case 9:
return GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT;
case 8:
@@ -845,7 +896,7 @@ int lrc_alloc(struct intel_context *ce, struct intel_engine_cs *engine)
if (IS_ERR(vma))
return PTR_ERR(vma);
- ring = intel_engine_create_ring(engine, (unsigned long)ce->ring);
+ ring = intel_engine_create_ring(engine, ce->ring_size);
if (IS_ERR(ring)) {
err = PTR_ERR(ring);
goto err_vma;
@@ -1101,6 +1152,14 @@ setup_indirect_ctx_bb(const struct intel_context *ce,
* bits 55-60: SW counter
* bits 61-63: engine class
*
+ * On Xe_HP, the upper dword of the descriptor has a new format:
+ *
+ * bits 32-37: virtual function number
+ * bit 38: mbz, reserved for use by hardware
+ * bits 39-54: SW context ID
+ * bits 55-57: reserved
+ * bits 58-63: SW counter
+ *
* engine info, SW context ID and SW counter need to form a unique number
* (Context ID) per lrc.
*/
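For illustration, the Xe_HP upper-dword layout above can be packed as below; the mask names are invented for this sketch (the driver's real definitions may differ), with descriptor bits 32-63 mapping to dword bits 0-31:

#define XEHP_SW_CTX_ID_SKETCH	GENMASK(22, 7)	/* descriptor bits 39-54 */
#define XEHP_SW_COUNTER_SKETCH	GENMASK(31, 26)	/* descriptor bits 58-63 */

static u32 xehp_upper_dword_sketch(u32 sw_ctx_id, u32 sw_counter)
{
	/* bit 6 (descriptor bit 38) stays mbz; bits 23-25 stay reserved */
	return FIELD_PREP(XEHP_SW_CTX_ID_SKETCH, sw_ctx_id) |
	       FIELD_PREP(XEHP_SW_COUNTER_SKETCH, sw_counter);
}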
@@ -1387,40 +1446,6 @@ static u32 *gen9_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
return batch;
}
-static u32 *
-gen10_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
-{
- int i;
-
- /*
- * WaPipeControlBefore3DStateSamplePattern: cnl
- *
- * Ensure the engine is idle prior to programming a
- * 3DSTATE_SAMPLE_PATTERN during a context restore.
- */
- batch = gen8_emit_pipe_control(batch,
- PIPE_CONTROL_CS_STALL,
- 0);
- /*
- * WaPipeControlBefore3DStateSamplePattern says we need 4 dwords for
- * the PIPE_CONTROL followed by 12 dwords of 0x0, so 16 dwords in
- * total. However, a PIPE_CONTROL is 6 dwords long, not 4, which is
- * confusing. Since gen8_emit_pipe_control() already advances the
- * batch by 6 dwords, we advance the other 10 here, completing a
- * cacheline. It's not clear if the workaround requires this padding
- * before other commands, or if it's just the regular padding we would
- * already have for the workaround bb, so leave it here for now.
- */
- for (i = 0; i < 10; i++)
- *batch++ = MI_NOOP;
-
- /* Pad to end of cacheline */
- while ((unsigned long)batch % CACHELINE_BYTES)
- *batch++ = MI_NOOP;
-
- return batch;
-}
-
#define CTX_WA_BB_SIZE (PAGE_SIZE)
static int lrc_create_wa_ctx(struct intel_engine_cs *engine)
@@ -1473,10 +1498,6 @@ void lrc_init_wa_ctx(struct intel_engine_cs *engine)
case 12:
case 11:
return;
- case 10:
- wa_bb_fn[0] = gen10_init_indirectctx_bb;
- wa_bb_fn[1] = NULL;
- break;
case 9:
wa_bb_fn[0] = gen9_init_indirectctx_bb;
wa_bb_fn[1] = NULL;
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
index 41e5350a7a05..f785d0ed238f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
+++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
@@ -87,9 +87,10 @@
#define GEN11_CSB_WRITE_PTR_MASK (GEN11_CSB_PTR_MASK << 0)
#define MAX_CONTEXT_HW_ID (1 << 21) /* exclusive */
-#define MAX_GUC_CONTEXT_HW_ID (1 << 20) /* exclusive */
#define GEN11_MAX_CONTEXT_HW_ID (1 << 11) /* exclusive */
/* in Gen12 ID 0x7FF is reserved to indicate idle */
#define GEN12_MAX_CONTEXT_HW_ID (GEN11_MAX_CONTEXT_HW_ID - 1)
+/* in Xe_HP ID 0xFFFF is reserved to indicate "invalid context" */
+#define XEHP_MAX_CONTEXT_HW_ID 0xFFFF
#endif /* _INTEL_LRC_REG_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
new file mode 100644
index 000000000000..1dac21aa7e5c
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -0,0 +1,688 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_context.h"
+#include "intel_gpu_commands.h"
+#include "intel_gt.h"
+#include "intel_gtt.h"
+#include "intel_migrate.h"
+#include "intel_ring.h"
+
+struct insert_pte_data {
+ u64 offset;
+ bool is_lmem;
+};
+
+#define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
+
+static bool engine_supports_migration(struct intel_engine_cs *engine)
+{
+ if (!engine)
+ return false;
+
+ /*
+ * We need the ability to prevent arbitration (MI_ARB_ON_OFF),
+ * the ability to write PTE using inline data (MI_STORE_DATA)
+ * and of course the ability to do the block transfer (blits).
+ */
+ GEM_BUG_ON(engine->class != COPY_ENGINE_CLASS);
+
+ return true;
+}
+
+static void insert_pte(struct i915_address_space *vm,
+ struct i915_page_table *pt,
+ void *data)
+{
+ struct insert_pte_data *d = data;
+
+ vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
+ d->is_lmem ? PTE_LM : 0);
+ d->offset += PAGE_SIZE;
+}
+
+static struct i915_address_space *migrate_vm(struct intel_gt *gt)
+{
+ struct i915_vm_pt_stash stash = {};
+ struct i915_ppgtt *vm;
+ int err;
+ int i;
+
+ /*
+ * We construct a very special VM for use by all migration contexts,
+ * it is kept pinned so that it can be used at any time. As we need
+ * to pre-allocate the page directories for the migration VM, this
+ * limits us to only using a small number of prepared vma.
+ *
+ * To be able to pipeline and reschedule migration operations while
+ * avoiding unnecessary contention on the vm itself, the PTE updates
+ * are inline with the blits. All the blits use the same fixed
+ * addresses, with the backing store redirection being updated on the
+ * fly. Only 2 implicit vma are used for all migration operations.
+ *
+ * We lay the ppGTT out as:
+ *
+ * [0, CHUNK_SZ) -> first object
+ * [CHUNK_SZ, 2 * CHUNK_SZ) -> second object
+ * [2 * CHUNK_SZ, 2 * CHUNK_SZ + 2 * CHUNK_SZ >> 9] -> PTE
+ *
+ * By exposing the dma addresses of the page directories themselves
+ * within the ppGTT, we are then able to rewrite the PTE prior to use.
+ * But the PTE update and subsequent migration operation must be atomic,
+ * i.e. within the same non-preemptible window so that we do not switch
+ * to another migration context that overwrites the PTE.
+ *
+ * TODO: Add support for huge LMEM PTEs
+ */
+
+ vm = i915_ppgtt_create(gt);
+ if (IS_ERR(vm))
+ return ERR_CAST(vm);
+
+ if (!vm->vm.allocate_va_range || !vm->vm.foreach) {
+ err = -ENODEV;
+ goto err_vm;
+ }
+
+ /*
+ * Each engine instance is assigned its own chunk in the VM, so
+ * that we can run multiple instances concurrently
+ */
+ for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
+ struct intel_engine_cs *engine;
+ u64 base = (u64)i << 32;
+ struct insert_pte_data d = {};
+ struct i915_gem_ww_ctx ww;
+ u64 sz;
+
+ engine = gt->engine_class[COPY_ENGINE_CLASS][i];
+ if (!engine_supports_migration(engine))
+ continue;
+
+ /*
+ * We copy in 8MiB chunks. Each PDE covers 2MiB, so we need
+ * 4x2 page directories for source/destination.
+ */
+ sz = 2 * CHUNK_SZ;
+ d.offset = base + sz;
+
+ /*
+ * We need another page directory setup so that we can write
+ * the 8x512 PTE in each chunk.
+ */
+ sz += (sz >> 12) * sizeof(u64);
+
+ err = i915_vm_alloc_pt_stash(&vm->vm, &stash, sz);
+ if (err)
+ goto err_vm;
+
+ for_i915_gem_ww(&ww, err, true) {
+ err = i915_vm_lock_objects(&vm->vm, &ww);
+ if (err)
+ continue;
+ err = i915_vm_map_pt_stash(&vm->vm, &stash);
+ if (err)
+ continue;
+
+ vm->vm.allocate_va_range(&vm->vm, &stash, base, sz);
+ }
+ i915_vm_free_pt_stash(&vm->vm, &stash);
+ if (err)
+ goto err_vm;
+
+ /* Now allow the GPU to rewrite the PTE via its own ppGTT */
+ d.is_lmem = i915_gem_object_is_lmem(vm->vm.scratch[0]);
+ vm->vm.foreach(&vm->vm, base, base + sz, insert_pte, &d);
+ }
+
+ return &vm->vm;
+
+err_vm:
+ i915_vm_put(&vm->vm);
+ return ERR_PTR(err);
+}
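Worked numbers for the window layout (a sketch restating the math in the function above, not additional driver code): with CHUNK_SZ = SZ_8M, each copy-engine instance i gets a private 4GiB slice of the ppGTT:

	u64 base = (u64)i << 32;	/* per-instance window */
	u64 data = 2 * CHUNK_SZ;	/* 16 MiB: [0, 8M) src + [8M, 16M) dst */
	u64 ptes = (data >> 12) * sizeof(u64);	/* 4096 PTEs -> 32 KiB */

	/* d.offset == base + data: the PTE window follows the two data
	 * windows, so the blitter can rewrite its own mappings inline. */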
+
+static struct intel_engine_cs *first_copy_engine(struct intel_gt *gt)
+{
+ struct intel_engine_cs *engine;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
+ engine = gt->engine_class[COPY_ENGINE_CLASS][i];
+ if (engine_supports_migration(engine))
+ return engine;
+ }
+
+ return NULL;
+}
+
+static struct intel_context *pinned_context(struct intel_gt *gt)
+{
+ static struct lock_class_key key;
+ struct intel_engine_cs *engine;
+ struct i915_address_space *vm;
+ struct intel_context *ce;
+
+ engine = first_copy_engine(gt);
+ if (!engine)
+ return ERR_PTR(-ENODEV);
+
+ vm = migrate_vm(gt);
+ if (IS_ERR(vm))
+ return ERR_CAST(vm);
+
+ ce = intel_engine_create_pinned_context(engine, vm, SZ_512K,
+ I915_GEM_HWS_MIGRATE,
+ &key, "migrate");
+ i915_vm_put(vm);
+ return ce;
+}
+
+int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt)
+{
+ struct intel_context *ce;
+
+ memset(m, 0, sizeof(*m));
+
+ ce = pinned_context(gt);
+ if (IS_ERR(ce))
+ return PTR_ERR(ce);
+
+ m->context = ce;
+ return 0;
+}
+
+static int random_index(unsigned int max)
+{
+ return upper_32_bits(mul_u32_u32(get_random_u32(), max));
+}
+
+static struct intel_context *__migrate_engines(struct intel_gt *gt)
+{
+ struct intel_engine_cs *engines[MAX_ENGINE_INSTANCE];
+ struct intel_engine_cs *engine;
+ unsigned int count, i;
+
+ count = 0;
+ for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
+ engine = gt->engine_class[COPY_ENGINE_CLASS][i];
+ if (engine_supports_migration(engine))
+ engines[count++] = engine;
+ }
+
+ return intel_context_create(engines[random_index(count)]);
+}
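random_index() above is the multiply-shift trick: for x uniform in [0, 2^32), (x * max) >> 32 lands in [0, max) without a modulo, so engine selection stays cheap and branch-free. Equivalent sketch:

	u32 x = get_random_u32();
	/* upper_32_bits(x * count) == floor(x * count / 2^32), in [0, count) */
	int idx = upper_32_bits(mul_u32_u32(x, count));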
+
+struct intel_context *intel_migrate_create_context(struct intel_migrate *m)
+{
+ struct intel_context *ce;
+
+ /*
+ * We randomly distribute contexts across the engines upon construction,
+ * as they all share the same pinned vm, and so in order to allow
+ * multiple blits to run in parallel, we must construct each blit
+ * to use a different range of the vm for its GTT. This has to be
+ * known at construction, so we cannot use the late greedy load
+ * balancing of the virtual-engine.
+ */
+ ce = __migrate_engines(m->context->engine->gt);
+ if (IS_ERR(ce))
+ return ce;
+
+ ce->ring = NULL;
+ ce->ring_size = SZ_256K;
+
+ i915_vm_put(ce->vm);
+ ce->vm = i915_vm_get(m->context->vm);
+
+ return ce;
+}
+
+static inline struct sgt_dma sg_sgt(struct scatterlist *sg)
+{
+ dma_addr_t addr = sg_dma_address(sg);
+
+ return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
+}
+
+static int emit_no_arbitration(struct i915_request *rq)
+{
+ u32 *cs;
+
+ cs = intel_ring_begin(rq, 2);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ /* Explicitly disable preemption for this request. */
+ *cs++ = MI_ARB_ON_OFF;
+ *cs++ = MI_NOOP;
+ intel_ring_advance(rq, cs);
+
+ return 0;
+}
+
+static int emit_pte(struct i915_request *rq,
+ struct sgt_dma *it,
+ enum i915_cache_level cache_level,
+ bool is_lmem,
+ u64 offset,
+ int length)
+{
+ const u64 encode = rq->context->vm->pte_encode(0, cache_level,
+ is_lmem ? PTE_LM : 0);
+ struct intel_ring *ring = rq->ring;
+ int total = 0;
+ u32 *hdr, *cs;
+ int pkt;
+
+ GEM_BUG_ON(GRAPHICS_VER(rq->engine->i915) < 8);
+
+ /* Compute the page directory offset for the target address range */
+ offset += (u64)rq->engine->instance << 32;
+ offset >>= 12;
+ offset *= sizeof(u64);
+ offset += 2 * CHUNK_SZ;
+
+ cs = intel_ring_begin(rq, 6);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ /* Pack as many PTE updates as possible into a single MI command */
+ pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5);
+ pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5);
+
+ hdr = cs;
+ *cs++ = MI_STORE_DATA_IMM | REG_BIT(21); /* as qword elements */
+ *cs++ = lower_32_bits(offset);
+ *cs++ = upper_32_bits(offset);
+
+ do {
+ if (cs - hdr >= pkt) {
+ *hdr += cs - hdr - 2;
+ *cs++ = MI_NOOP;
+
+ ring->emit = (void *)cs - ring->vaddr;
+ intel_ring_advance(rq, cs);
+ intel_ring_update_space(ring);
+
+ cs = intel_ring_begin(rq, 6);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5);
+ pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5);
+
+ hdr = cs;
+ *cs++ = MI_STORE_DATA_IMM | REG_BIT(21);
+ *cs++ = lower_32_bits(offset);
+ *cs++ = upper_32_bits(offset);
+ }
+
+ *cs++ = lower_32_bits(encode | it->dma);
+ *cs++ = upper_32_bits(encode | it->dma);
+
+ offset += 8;
+ total += I915_GTT_PAGE_SIZE;
+
+ it->dma += I915_GTT_PAGE_SIZE;
+ if (it->dma >= it->max) {
+ it->sg = __sg_next(it->sg);
+ if (!it->sg || sg_dma_len(it->sg) == 0)
+ break;
+
+ it->dma = sg_dma_address(it->sg);
+ it->max = it->dma + sg_dma_len(it->sg);
+ }
+ } while (total < length);
+
+ *hdr += cs - hdr - 2;
+ *cs++ = MI_NOOP;
+
+ ring->emit = (void *)cs - ring->vaddr;
+ intel_ring_advance(rq, cs);
+ intel_ring_update_space(ring);
+
+ return total;
+}
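A note on the length fixup in emit_pte() above (editorial, restating the arithmetic rather than adding code): a packet that wrote N PTEs spans 3 header dwords (opcode plus 64-bit offset) and 2*N payload dwords, and an MI command encodes total dwords minus 2 in its length field. With hdr pointing at the opcode, cs - hdr == 3 + 2*N, so *hdr += cs - hdr - 2 patches in exactly 1 + 2*N; the 0x400 clamp on pkt keeps each packet within the maximum MI command length.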
+
+static bool wa_1209644611_applies(int ver, u32 size)
+{
+ u32 height = size >> PAGE_SHIFT;
+
+ if (ver != 11)
+ return false;
+
+ return height % 4 == 3 && height <= 8;
+}
+
+static int emit_copy(struct i915_request *rq, int size)
+{
+ const int ver = GRAPHICS_VER(rq->engine->i915);
+ u32 instance = rq->engine->instance;
+ u32 *cs;
+
+ cs = intel_ring_begin(rq, ver >= 8 ? 10 : 6);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ if (ver >= 9 && !wa_1209644611_applies(ver, size)) {
+ *cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
+ *cs++ = BLT_DEPTH_32 | PAGE_SIZE;
+ *cs++ = 0;
+ *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+ *cs++ = CHUNK_SZ; /* dst offset */
+ *cs++ = instance;
+ *cs++ = 0;
+ *cs++ = PAGE_SIZE;
+ *cs++ = 0; /* src offset */
+ *cs++ = instance;
+ } else if (ver >= 8) {
+ *cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
+ *cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+ *cs++ = 0;
+ *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+ *cs++ = CHUNK_SZ; /* dst offset */
+ *cs++ = instance;
+ *cs++ = 0;
+ *cs++ = PAGE_SIZE;
+ *cs++ = 0; /* src offset */
+ *cs++ = instance;
+ } else {
+ GEM_BUG_ON(instance);
+ *cs++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
+ *cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+ *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
+ *cs++ = CHUNK_SZ; /* dst offset */
+ *cs++ = PAGE_SIZE;
+ *cs++ = 0; /* src offset */
+ }
+
+ intel_ring_advance(rq, cs);
+ return 0;
+}
+
+int
+intel_context_migrate_copy(struct intel_context *ce,
+ struct dma_fence *await,
+ struct scatterlist *src,
+ enum i915_cache_level src_cache_level,
+ bool src_is_lmem,
+ struct scatterlist *dst,
+ enum i915_cache_level dst_cache_level,
+ bool dst_is_lmem,
+ struct i915_request **out)
+{
+ struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst);
+ struct i915_request *rq;
+ int err;
+
+ GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
+ *out = NULL;
+
+ GEM_BUG_ON(ce->ring->size < SZ_64K);
+
+ do {
+ int len;
+
+ rq = i915_request_create(ce);
+ if (IS_ERR(rq)) {
+ err = PTR_ERR(rq);
+ goto out_ce;
+ }
+
+ if (await) {
+ err = i915_request_await_dma_fence(rq, await);
+ if (err)
+ goto out_rq;
+
+ if (rq->engine->emit_init_breadcrumb) {
+ err = rq->engine->emit_init_breadcrumb(rq);
+ if (err)
+ goto out_rq;
+ }
+
+ await = NULL;
+ }
+
+ /* The PTE updates + copy must not be interrupted. */
+ err = emit_no_arbitration(rq);
+ if (err)
+ goto out_rq;
+
+ len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, 0,
+ CHUNK_SZ);
+ if (len <= 0) {
+ err = len;
+ goto out_rq;
+ }
+
+ err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem,
+ CHUNK_SZ, len);
+ if (err < 0)
+ goto out_rq;
+ if (err < len) {
+ err = -EINVAL;
+ goto out_rq;
+ }
+
+ err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
+ if (err)
+ goto out_rq;
+
+ err = emit_copy(rq, len);
+
+ /* Arbitration is re-enabled between requests. */
+out_rq:
+ if (*out)
+ i915_request_put(*out);
+ *out = i915_request_get(rq);
+ i915_request_add(rq);
+ if (err || !it_src.sg || !sg_dma_len(it_src.sg))
+ break;
+
+ cond_resched();
+ } while (1);
+
+out_ce:
+ return err;
+}
+
+static int emit_clear(struct i915_request *rq, int size, u32 value)
+{
+ const int ver = GRAPHICS_VER(rq->engine->i915);
+ u32 instance = rq->engine->instance;
+ u32 *cs;
+
+ GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
+
+ cs = intel_ring_begin(rq, ver >= 8 ? 8 : 6);
+ if (IS_ERR(cs))
+ return PTR_ERR(cs);
+
+ if (ver >= 8) {
+ *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
+ *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
+ *cs++ = 0;
+ *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+ *cs++ = 0; /* offset */
+ *cs++ = instance;
+ *cs++ = value;
+ *cs++ = MI_NOOP;
+ } else {
+ GEM_BUG_ON(instance);
+ *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
+ *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
+ *cs++ = 0;
+ *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+ *cs++ = 0;
+ *cs++ = value;
+ }
+
+ intel_ring_advance(rq, cs);
+ return 0;
+}
+
+int
+intel_context_migrate_clear(struct intel_context *ce,
+ struct dma_fence *await,
+ struct scatterlist *sg,
+ enum i915_cache_level cache_level,
+ bool is_lmem,
+ u32 value,
+ struct i915_request **out)
+{
+ struct sgt_dma it = sg_sgt(sg);
+ struct i915_request *rq;
+ int err;
+
+ GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
+ *out = NULL;
+
+ GEM_BUG_ON(ce->ring->size < SZ_64K);
+
+ do {
+ int len;
+
+ rq = i915_request_create(ce);
+ if (IS_ERR(rq)) {
+ err = PTR_ERR(rq);
+ goto out_ce;
+ }
+
+ if (await) {
+ err = i915_request_await_dma_fence(rq, await);
+ if (err)
+ goto out_rq;
+
+ if (rq->engine->emit_init_breadcrumb) {
+ err = rq->engine->emit_init_breadcrumb(rq);
+ if (err)
+ goto out_rq;
+ }
+
+ await = NULL;
+ }
+
+ /* The PTE updates + clear must not be interrupted. */
+ err = emit_no_arbitration(rq);
+ if (err)
+ goto out_rq;
+
+ len = emit_pte(rq, &it, cache_level, is_lmem, 0, CHUNK_SZ);
+ if (len <= 0) {
+ err = len;
+ goto out_rq;
+ }
+
+ err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
+ if (err)
+ goto out_rq;
+
+ err = emit_clear(rq, len, value);
+
+ /* Arbitration is re-enabled between requests. */
+out_rq:
+ if (*out)
+ i915_request_put(*out);
+ *out = i915_request_get(rq);
+ i915_request_add(rq);
+ if (err || !it.sg || !sg_dma_len(it.sg))
+ break;
+
+ cond_resched();
+ } while (1);
+
+out_ce:
+ return err;
+}
+
+int intel_migrate_copy(struct intel_migrate *m,
+ struct i915_gem_ww_ctx *ww,
+ struct dma_fence *await,
+ struct scatterlist *src,
+ enum i915_cache_level src_cache_level,
+ bool src_is_lmem,
+ struct scatterlist *dst,
+ enum i915_cache_level dst_cache_level,
+ bool dst_is_lmem,
+ struct i915_request **out)
+{
+ struct intel_context *ce;
+ int err;
+
+ *out = NULL;
+ if (!m->context)
+ return -ENODEV;
+
+ ce = intel_migrate_create_context(m);
+ if (IS_ERR(ce))
+ ce = intel_context_get(m->context);
+ GEM_BUG_ON(IS_ERR(ce));
+
+ err = intel_context_pin_ww(ce, ww);
+ if (err)
+ goto out;
+
+ err = intel_context_migrate_copy(ce, await,
+ src, src_cache_level, src_is_lmem,
+ dst, dst_cache_level, dst_is_lmem,
+ out);
+
+ intel_context_unpin(ce);
+out:
+ intel_context_put(ce);
+ return err;
+}
+
+int
+intel_migrate_clear(struct intel_migrate *m,
+ struct i915_gem_ww_ctx *ww,
+ struct dma_fence *await,
+ struct scatterlist *sg,
+ enum i915_cache_level cache_level,
+ bool is_lmem,
+ u32 value,
+ struct i915_request **out)
+{
+ struct intel_context *ce;
+ int err;
+
+ *out = NULL;
+ if (!m->context)
+ return -ENODEV;
+
+ ce = intel_migrate_create_context(m);
+ if (IS_ERR(ce))
+ ce = intel_context_get(m->context);
+ GEM_BUG_ON(IS_ERR(ce));
+
+ err = intel_context_pin_ww(ce, ww);
+ if (err)
+ goto out;
+
+ err = intel_context_migrate_clear(ce, await, sg, cache_level,
+ is_lmem, value, out);
+
+ intel_context_unpin(ce);
+out:
+ intel_context_put(ce);
+ return err;
+}
+
+void intel_migrate_fini(struct intel_migrate *m)
+{
+ struct intel_context *ce;
+
+ ce = fetch_and_zero(&m->context);
+ if (!ce)
+ return;
+
+ intel_engine_destroy_pinned_context(ce);
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_migrate.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h
new file mode 100644
index 000000000000..4e18e755a00b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __INTEL_MIGRATE__
+#define __INTEL_MIGRATE__
+
+#include <linux/types.h>
+
+#include "intel_migrate_types.h"
+
+struct dma_fence;
+struct i915_request;
+struct i915_gem_ww_ctx;
+struct intel_gt;
+struct scatterlist;
+enum i915_cache_level;
+
+int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
+
+struct intel_context *intel_migrate_create_context(struct intel_migrate *m);
+
+int intel_migrate_copy(struct intel_migrate *m,
+ struct i915_gem_ww_ctx *ww,
+ struct dma_fence *await,
+ struct scatterlist *src,
+ enum i915_cache_level src_cache_level,
+ bool src_is_lmem,
+ struct scatterlist *dst,
+ enum i915_cache_level dst_cache_level,
+ bool dst_is_lmem,
+ struct i915_request **out);
+
+int intel_context_migrate_copy(struct intel_context *ce,
+ struct dma_fence *await,
+ struct scatterlist *src,
+ enum i915_cache_level src_cache_level,
+ bool src_is_lmem,
+ struct scatterlist *dst,
+ enum i915_cache_level dst_cache_level,
+ bool dst_is_lmem,
+ struct i915_request **out);
+
+int
+intel_migrate_clear(struct intel_migrate *m,
+ struct i915_gem_ww_ctx *ww,
+ struct dma_fence *await,
+ struct scatterlist *sg,
+ enum i915_cache_level cache_level,
+ bool is_lmem,
+ u32 value,
+ struct i915_request **out);
+int
+intel_context_migrate_clear(struct intel_context *ce,
+ struct dma_fence *await,
+ struct scatterlist *sg,
+ enum i915_cache_level cache_level,
+ bool is_lmem,
+ u32 value,
+ struct i915_request **out);
+
+void intel_migrate_fini(struct intel_migrate *m);
+
+#endif /* __INTEL_MIGRATE__ */
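A hedged usage sketch of the exported entry points (the caller-side code is hypothetical; object locking and error handling are elided):

	struct i915_request *rq = NULL;
	int err;

	/* Blit-clear an object's backing store via the migration context */
	err = intel_migrate_clear(&gt->migrate, &ww, NULL /* no await fence */,
				  obj->mm.pages->sgl, obj->cache_level,
				  i915_gem_object_is_lmem(obj),
				  0 /* clear value */, &rq);
	if (rq) {	/* wait on the last request of the chunked chain */
		i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
		i915_request_put(rq);
	}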
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate_types.h b/drivers/gpu/drm/i915/gt/intel_migrate_types.h
new file mode 100644
index 000000000000..d98230597f42
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_migrate_types.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __INTEL_MIGRATE_TYPES__
+#define __INTEL_MIGRATE_TYPES__
+
+struct intel_context;
+
+struct intel_migrate {
+ struct intel_context *context;
+};
+
+#endif /* __INTEL_MIGRATE_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 17848807f111..582c4423b95d 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -352,7 +352,7 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
table->size = ARRAY_SIZE(icl_mocs_table);
table->table = icl_mocs_table;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
- } else if (IS_GEN9_BC(i915) || IS_CANNONLAKE(i915)) {
+ } else if (IS_GEN9_BC(i915)) {
table->size = ARRAY_SIZE(skl_mocs_table);
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
table->table = skl_mocs_table;
diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c
index 259d7eb4e165..799d382eea79 100644
--- a/drivers/gpu/drm/i915/gt/intel_rc6.c
+++ b/drivers/gpu/drm/i915/gt/intel_rc6.c
@@ -62,20 +62,25 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6)
u32 pg_enable;
int i;
- /* 2b: Program RC6 thresholds.*/
- set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85);
- set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150);
+ /*
+ * With GuCRC, these parameters are set by GuC
+ */
+ if (!intel_uc_uses_guc_rc(&gt->uc)) {
+ /* 2b: Program RC6 thresholds.*/
+ set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85);
+ set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150);
- set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
- set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
- for_each_engine(engine, rc6_to_gt(rc6), id)
- set(uncore, RING_MAX_IDLE(engine->mmio_base), 10);
+ set(uncore, GEN6_RC_EVALUATION_INTERVAL, 125000); /* 12500 * 1280ns */
+ set(uncore, GEN6_RC_IDLE_HYSTERSIS, 25); /* 25 * 1280ns */
+ for_each_engine(engine, rc6_to_gt(rc6), id)
+ set(uncore, RING_MAX_IDLE(engine->mmio_base), 10);
- set(uncore, GUC_MAX_IDLE_COUNT, 0xA);
+ set(uncore, GUC_MAX_IDLE_COUNT, 0xA);
- set(uncore, GEN6_RC_SLEEP, 0);
+ set(uncore, GEN6_RC_SLEEP, 0);
- set(uncore, GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
+ set(uncore, GEN6_RC6_THRESHOLD, 50000); /* 50/125ms per EI */
+ }
/*
* 2c: Program Coarse Power Gating Policies.
@@ -98,11 +103,19 @@ static void gen11_rc6_enable(struct intel_rc6 *rc6)
set(uncore, GEN9_MEDIA_PG_IDLE_HYSTERESIS, 60);
set(uncore, GEN9_RENDER_PG_IDLE_HYSTERESIS, 60);
- /* 3a: Enable RC6 */
- rc6->ctl_enable =
- GEN6_RC_CTL_HW_ENABLE |
- GEN6_RC_CTL_RC6_ENABLE |
- GEN6_RC_CTL_EI_MODE(1);
+ /* 3a: Enable RC6
+ *
+ * With GuCRC, we do not enable bit 31 of RC_CTL,
+ * thus allowing GuC to control RC6 entry/exit fully instead.
+ * We will not set the HW ENABLE and EI bits
+ */
+ if (!intel_guc_rc_enable(&gt->uc.guc))
+ rc6->ctl_enable = GEN6_RC_CTL_RC6_ENABLE;
+ else
+ rc6->ctl_enable =
+ GEN6_RC_CTL_HW_ENABLE |
+ GEN6_RC_CTL_RC6_ENABLE |
+ GEN6_RC_CTL_EI_MODE(1);
pg_enable =
GEN9_RENDER_PG_ENABLE |
@@ -126,7 +139,7 @@ static void gen9_rc6_enable(struct intel_rc6 *rc6)
enum intel_engine_id id;
/* 2b: Program RC6 thresholds.*/
- if (GRAPHICS_VER(rc6_to_i915(rc6)) >= 10) {
+ if (GRAPHICS_VER(rc6_to_i915(rc6)) >= 11) {
set(uncore, GEN6_RC6_WAKE_RATE_LIMIT, 54 << 16 | 85);
set(uncore, GEN10_MEDIA_WAKE_RATE_LIMIT, 150);
} else if (IS_SKYLAKE(rc6_to_i915(rc6))) {
@@ -513,6 +526,10 @@ static void __intel_rc6_disable(struct intel_rc6 *rc6)
{
struct drm_i915_private *i915 = rc6_to_i915(rc6);
struct intel_uncore *uncore = rc6_to_uncore(rc6);
+ struct intel_gt *gt = rc6_to_gt(rc6);
+
+ /* Take control of RC6 back from GuC */
+ intel_guc_rc_disable(&gt->uc.guc);
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
if (GRAPHICS_VER(i915) >= 9)
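
The control-register selection above is the crux of the GuCRC handoff: when intel_guc_rc_enable() succeeds, the host keeps only the RC6 enable bit and leaves entry/exit policy to GuC. A standalone sketch of that bit selection (bit positions copied from the driver's register definitions, shown purely for illustration):

#include <stdint.h>
#include <stdio.h>

#define GEN6_RC_CTL_HW_ENABLE	(1u << 31)
#define GEN6_RC_CTL_RC6_ENABLE	(1u << 18)
#define GEN6_RC_CTL_EI_MODE(x)	((x) << 27)

static uint32_t rc6_ctl(int guc_rc)
{
	if (guc_rc)	/* GuC owns entry/exit: RC6 enable bit only */
		return GEN6_RC_CTL_RC6_ENABLE;

	return GEN6_RC_CTL_HW_ENABLE |
	       GEN6_RC_CTL_RC6_ENABLE |
	       GEN6_RC_CTL_EI_MODE(1);
}

int main(void)
{
	printf("guc rc: %#x\n", (unsigned int)rc6_ctl(1));
	printf("host:   %#x\n", (unsigned int)rc6_ctl(0));
	return 0;
}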
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index f7366b054f8e..a74b72f50cc9 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -9,7 +9,8 @@
#include "intel_region_ttm.h"
#include "gem/i915_gem_lmem.h"
#include "gem/i915_gem_region.h"
-#include "intel_region_lmem.h"
+#include "gem/i915_gem_ttm.h"
+#include "gt/intel_gt.h"
static int init_fake_lmem_bar(struct intel_memory_region *mem)
{
@@ -107,7 +108,7 @@ out_no_io:
static const struct intel_memory_region_ops intel_region_lmem_ops = {
.init = region_lmem_init,
.release = region_lmem_release,
- .init_object = __i915_gem_lmem_object_init,
+ .init_object = __i915_gem_ttm_object_init,
};
struct intel_memory_region *
@@ -157,7 +158,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)
static bool get_legacy_lowmem_region(struct intel_uncore *uncore,
u64 *start, u32 *size)
{
- if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
+ if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_C0))
return false;
*start = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.h b/drivers/gpu/drm/i915/gt/intel_renderstate.h
index 48f009203917..4da4c5234ef0 100644
--- a/drivers/gpu/drm/i915/gt/intel_renderstate.h
+++ b/drivers/gpu/drm/i915/gt/intel_renderstate.h
@@ -8,6 +8,7 @@
#include <linux/types.h>
#include "i915_gem.h"
+#include "i915_gem_ww.h"
struct i915_request;
struct intel_context;
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 72251638d4ea..91200c43951f 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -22,7 +22,6 @@
#include "intel_reset.h"
#include "uc/intel_guc.h"
-#include "uc/intel_guc_submission.h"
#define RESET_MAX_RETRIES 3
@@ -39,21 +38,6 @@ static void rmw_clear_fw(struct intel_uncore *uncore, i915_reg_t reg, u32 clr)
intel_uncore_rmw_fw(uncore, reg, clr, 0);
}
-static void skip_context(struct i915_request *rq)
-{
- struct intel_context *hung_ctx = rq->context;
-
- list_for_each_entry_from_rcu(rq, &hung_ctx->timeline->requests, link) {
- if (!i915_request_is_active(rq))
- return;
-
- if (rq->context == hung_ctx) {
- i915_request_set_error_once(rq, -EIO);
- __i915_request_skip(rq);
- }
- }
-}
-
static void client_mark_guilty(struct i915_gem_context *ctx, bool banned)
{
struct drm_i915_file_private *file_priv = ctx->file_priv;
@@ -88,10 +72,8 @@ static bool mark_guilty(struct i915_request *rq)
bool banned;
int i;
- if (intel_context_is_closed(rq->context)) {
- intel_context_set_banned(rq->context);
+ if (intel_context_is_closed(rq->context))
return true;
- }
rcu_read_lock();
ctx = rcu_dereference(rq->context->gem_context);
@@ -123,11 +105,9 @@ static bool mark_guilty(struct i915_request *rq)
banned = !i915_gem_context_is_recoverable(ctx);
if (time_before(jiffies, prev_hang + CONTEXT_FAST_HANG_JIFFIES))
banned = true;
- if (banned) {
+ if (banned)
drm_dbg(&ctx->i915->drm, "context %s: guilty %d, banned\n",
ctx->name, atomic_read(&ctx->guilty_count));
- intel_context_set_banned(rq->context);
- }
client_mark_guilty(ctx, banned);
@@ -149,6 +129,8 @@ static void mark_innocent(struct i915_request *rq)
void __i915_request_reset(struct i915_request *rq, bool guilty)
{
+ bool banned = false;
+
RQ_TRACE(rq, "guilty? %s\n", yesno(guilty));
GEM_BUG_ON(__i915_request_is_complete(rq));
@@ -156,13 +138,15 @@ void __i915_request_reset(struct i915_request *rq, bool guilty)
if (guilty) {
i915_request_set_error_once(rq, -EIO);
__i915_request_skip(rq);
- if (mark_guilty(rq))
- skip_context(rq);
+ banned = mark_guilty(rq);
} else {
i915_request_set_error_once(rq, -EAGAIN);
mark_innocent(rq);
}
rcu_read_unlock();
+
+ if (banned)
+ intel_context_ban(rq->context, rq);
}
static bool i915_in_reset(struct pci_dev *pdev)
@@ -515,8 +499,14 @@ static int gen11_reset_engines(struct intel_gt *gt,
[VCS1] = GEN11_GRDOM_MEDIA2,
[VCS2] = GEN11_GRDOM_MEDIA3,
[VCS3] = GEN11_GRDOM_MEDIA4,
+ [VCS4] = GEN11_GRDOM_MEDIA5,
+ [VCS5] = GEN11_GRDOM_MEDIA6,
+ [VCS6] = GEN11_GRDOM_MEDIA7,
+ [VCS7] = GEN11_GRDOM_MEDIA8,
[VECS0] = GEN11_GRDOM_VECS,
[VECS1] = GEN11_GRDOM_VECS2,
+ [VECS2] = GEN11_GRDOM_VECS3,
+ [VECS3] = GEN11_GRDOM_VECS4,
};
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
@@ -826,6 +816,8 @@ static int gt_reset(struct intel_gt *gt, intel_engine_mask_t stalled_mask)
__intel_engine_reset(engine, stalled_mask & engine->mask);
local_bh_enable();
+ intel_uc_reset(&gt->uc, true);
+
intel_ggtt_restore_fences(gt->ggtt);
return err;
@@ -850,6 +842,8 @@ static void reset_finish(struct intel_gt *gt, intel_engine_mask_t awake)
if (awake & engine->mask)
intel_engine_pm_put(engine);
}
+
+ intel_uc_reset_finish(&gt->uc);
}
static void nop_submit_request(struct i915_request *request)
@@ -903,6 +897,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
for_each_engine(engine, gt, id)
if (engine->reset.cancel)
engine->reset.cancel(engine);
+ intel_uc_cancel_requests(&gt->uc);
local_bh_enable();
reset_finish(gt, awake);
@@ -1191,6 +1186,9 @@ int __intel_engine_reset_bh(struct intel_engine_cs *engine, const char *msg)
ENGINE_TRACE(engine, "flags=%lx\n", gt->reset.flags);
GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &gt->reset.flags));
+ if (intel_engine_uses_guc(engine))
+ return -ENODEV;
+
if (!intel_engine_pm_get_if_awake(engine))
return 0;
@@ -1201,13 +1199,10 @@ int __intel_engine_reset_bh(struct intel_engine_cs *engine, const char *msg)
"Resetting %s for %s\n", engine->name, msg);
atomic_inc(&engine->i915->gpu_error.reset_engine_count[engine->uabi_class]);
- if (intel_engine_uses_guc(engine))
- ret = intel_guc_reset_engine(&engine->gt->uc.guc, engine);
- else
- ret = intel_gt_reset_engine(engine);
+ ret = intel_gt_reset_engine(engine);
if (ret) {
/* If we fail here, we expect to fallback to a global reset */
- ENGINE_TRACE(engine, "Failed to reset, err: %d\n", ret);
+ ENGINE_TRACE(engine, "Failed to reset %s, err: %d\n", engine->name, ret);
goto out;
}
@@ -1341,7 +1336,8 @@ void intel_gt_handle_error(struct intel_gt *gt,
* Try engine reset when available. We fall back to full reset if
* single reset fails.
*/
- if (intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) {
+ if (!intel_uc_uses_guc_submission(&gt->uc) &&
+ intel_has_reset_engine(gt) && !intel_gt_is_wedged(gt)) {
local_bh_disable();
for_each_engine_masked(engine, gt, engine_mask, tmp) {
BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
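
The hw_engine_mask extension above scales per-engine resets to the larger Gen11+ media engine counts; the hardware reset mask is built by translating each requested engine bit through the table. A standalone sketch of that translation (domain values are hypothetical, not the real GRDOM encodings):

#include <stdint.h>
#include <stdio.h>

enum { VCS0, VCS1, VCS2, VCS3, NUM_ENGINES };

/* Hypothetical per-engine reset domains (not the real GRDOM values) */
static const uint32_t grdom[NUM_ENGINES] = {
	[VCS0] = 1u << 5,
	[VCS1] = 1u << 6,
	[VCS2] = 1u << 7,
	[VCS3] = 1u << 8,
};

static uint32_t reset_mask(uint32_t engine_mask)
{
	uint32_t hw_mask = 0;
	int id;

	for (id = 0; id < NUM_ENGINES; id++)
		if (engine_mask & (1u << id))
			hw_mask |= grdom[id];
	return hw_mask;
}

int main(void)
{
	/* VCS1 + VCS3 -> 0x140 */
	printf("%#x\n", (unsigned int)reset_mask((1u << VCS1) | (1u << VCS3)));
	return 0;
}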
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.h b/drivers/gpu/drm/i915/gt/intel_ring.h
index dbf5f14a136f..1b32dadfb8c3 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.h
+++ b/drivers/gpu/drm/i915/gt/intel_ring.h
@@ -49,6 +49,7 @@ static inline void intel_ring_advance(struct i915_request *rq, u32 *cs)
* intel_ring_begin()).
*/
GEM_BUG_ON((rq->ring->vaddr + rq->ring->emit) != cs);
+ GEM_BUG_ON(!IS_ALIGNED(rq->ring->emit, 8)); /* RING_TAIL qword align */
}
static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
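
The new assert above guards the hardware requirement that RING_TAIL only advances in qwords; IS_ALIGNED() is a simple power-of-two mask test. A standalone check:

#include <stdint.h>
#include <stdio.h>

#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

int main(void)
{
	uint32_t emit = 0x128;	/* qword aligned */

	printf("%d %d\n", (int)IS_ALIGNED(emit, 8),
	       (int)IS_ALIGNED(emit + 4, 8));	/* prints "1 0" */
	return 0;
}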
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 37d74d4ed59b..2958e2fae380 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -16,6 +16,7 @@
#include "intel_reset.h"
#include "intel_ring.h"
#include "shmem_utils.h"
+#include "intel_engine_heartbeat.h"
/* Rough estimate of the typical request size, performing a flush,
* set-context and then emitting the batch.
@@ -342,9 +343,9 @@ static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
u32 head;
rq = NULL;
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
rcu_read_lock();
- list_for_each_entry(pos, &engine->active.requests, sched.link) {
+ list_for_each_entry(pos, &engine->sched_engine->requests, sched.link) {
if (!__i915_request_is_complete(pos)) {
rq = pos;
break;
@@ -399,7 +400,7 @@ static void reset_rewind(struct intel_engine_cs *engine, bool stalled)
}
engine->legacy.ring->head = intel_ring_wrap(engine->legacy.ring, head);
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
}
static void reset_finish(struct intel_engine_cs *engine)
@@ -411,16 +412,16 @@ static void reset_cancel(struct intel_engine_cs *engine)
struct i915_request *request;
unsigned long flags;
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
/* Mark all submitted requests as skipped. */
- list_for_each_entry(request, &engine->active.requests, sched.link)
+ list_for_each_entry(request, &engine->sched_engine->requests, sched.link)
i915_request_put(i915_request_mark_eio(request));
intel_engine_signal_breadcrumbs(engine);
/* Remaining _unready_ requests will be nop'ed when submitted */
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
}
static void i9xx_submit_request(struct i915_request *request)
@@ -586,9 +587,44 @@ static void ring_context_reset(struct intel_context *ce)
clear_bit(CONTEXT_VALID_BIT, &ce->flags);
}
+static void ring_context_ban(struct intel_context *ce,
+ struct i915_request *rq)
+{
+ struct intel_engine_cs *engine;
+
+ if (!rq || !i915_request_is_active(rq))
+ return;
+
+ engine = rq->engine;
+ lockdep_assert_held(&engine->sched_engine->lock);
+ list_for_each_entry_continue(rq, &engine->sched_engine->requests,
+ sched.link)
+ if (rq->context == ce) {
+ i915_request_set_error_once(rq, -EIO);
+ __i915_request_skip(rq);
+ }
+}
+
+static void ring_context_cancel_request(struct intel_context *ce,
+ struct i915_request *rq)
+{
+ struct intel_engine_cs *engine = NULL;
+
+ i915_request_active_engine(rq, &engine);
+
+ if (engine && intel_engine_pulse(engine))
+ intel_gt_handle_error(engine->gt, engine->mask, 0,
+ "request cancellation by %s",
+ current->comm);
+}
+
static const struct intel_context_ops ring_context_ops = {
.alloc = ring_context_alloc,
+ .cancel_request = ring_context_cancel_request,
+
+ .ban = ring_context_ban,
+
.pre_pin = ring_context_pre_pin,
.pin = ring_context_pin,
.unpin = ring_context_unpin,
@@ -1047,6 +1083,25 @@ static void setup_irq(struct intel_engine_cs *engine)
}
}
+static void add_to_engine(struct i915_request *rq)
+{
+ lockdep_assert_held(&rq->engine->sched_engine->lock);
+ list_move_tail(&rq->sched.link, &rq->engine->sched_engine->requests);
+}
+
+static void remove_from_engine(struct i915_request *rq)
+{
+ spin_lock_irq(&rq->engine->sched_engine->lock);
+ list_del_init(&rq->sched.link);
+
+ /* Prevent further __await_execution() registering a cb, then flush */
+ set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
+
+ spin_unlock_irq(&rq->engine->sched_engine->lock);
+
+ i915_request_notify_execute_cb_imm(rq);
+}
+
static void setup_common(struct intel_engine_cs *engine)
{
struct drm_i915_private *i915 = engine->i915;
@@ -1064,6 +1119,9 @@ static void setup_common(struct intel_engine_cs *engine)
engine->reset.cancel = reset_cancel;
engine->reset.finish = reset_finish;
+ engine->add_active_request = add_to_engine;
+ engine->remove_active_request = remove_from_engine;
+
engine->cops = &ring_context_ops;
engine->request_alloc = ring_request_alloc;
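
ring_context_ban() above walks the submitted-request list from the hung request onward and poisons every later request belonging to the same context. A standalone sketch with the request list modelled as an array (the error value mirrors the driver's -EIO):

#include <stdio.h>

struct req { int ctx; int error; };

/* Error out every request queued after the hung one that belongs to
 * the banned context; mirrors list_for_each_entry_continue() above. */
static void ban_following(struct req *reqs, int n, int hung, int banned_ctx)
{
	int i;

	for (i = hung + 1; i < n; i++)
		if (reqs[i].ctx == banned_ctx)
			reqs[i].error = -5;	/* -EIO */
}

int main(void)
{
	struct req q[] = { {1, 0}, {2, 0}, {1, 0}, {1, 0}, {2, 0} };
	int i;

	ban_following(q, 5, 2, 1);	/* hang detected at index 2, ctx 1 */
	for (i = 0; i < 5; i++)
		printf("req %d: ctx %d err %d\n", i, q[i].ctx, q[i].error);
	return 0;
}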
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 06e9a8ed4e03..d812b27835f8 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -37,6 +37,20 @@ static struct intel_uncore *rps_to_uncore(struct intel_rps *rps)
return rps_to_gt(rps)->uncore;
}
+static struct intel_guc_slpc *rps_to_slpc(struct intel_rps *rps)
+{
+ struct intel_gt *gt = rps_to_gt(rps);
+
+ return &gt->uc.guc.slpc;
+}
+
+static bool rps_uses_slpc(struct intel_rps *rps)
+{
+ struct intel_gt *gt = rps_to_gt(rps);
+
+ return intel_uc_uses_guc_slpc(&gt->uc);
+}
+
static u32 rps_pm_sanitize_mask(struct intel_rps *rps, u32 mask)
{
return mask & ~rps->pm_intrmsk_mbz;
@@ -167,6 +181,8 @@ static void rps_enable_interrupts(struct intel_rps *rps)
{
struct intel_gt *gt = rps_to_gt(rps);
+ GEM_BUG_ON(rps_uses_slpc(rps));
+
GT_TRACE(gt, "interrupts:on rps->pm_events: %x, rps_pm_mask:%x\n",
rps->pm_events, rps_pm_mask(rps, rps->last_freq));
@@ -771,6 +787,8 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val)
struct drm_i915_private *i915 = rps_to_i915(rps);
u32 swreq;
+ GEM_BUG_ON(rps_uses_slpc(rps));
+
if (GRAPHICS_VER(i915) >= 9)
swreq = GEN9_FREQUENCY(val);
else if (IS_HASWELL(i915) || IS_BROADWELL(i915))
@@ -861,6 +879,9 @@ void intel_rps_park(struct intel_rps *rps)
{
int adj;
+ if (!intel_rps_is_enabled(rps))
+ return;
+
GEM_BUG_ON(atomic_read(&rps->num_waiters));
if (!intel_rps_clear_active(rps))
@@ -999,7 +1020,7 @@ static void gen6_rps_init(struct intel_rps *rps)
rps->efficient_freq = rps->rp1_freq;
if (IS_HASWELL(i915) || IS_BROADWELL(i915) ||
- IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 10) {
+ IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
u32 ddcc_status = 0;
if (sandybridge_pcode_read(i915,
@@ -1012,7 +1033,7 @@ static void gen6_rps_init(struct intel_rps *rps)
rps->max_freq);
}
- if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 10) {
+ if (IS_GEN9_BC(i915) || GRAPHICS_VER(i915) >= 11) {
/* Store the frequency values in 16.66 MHz units, which is
* the natural hardware unit for SKL
*/
@@ -1356,6 +1377,9 @@ void intel_rps_enable(struct intel_rps *rps)
if (!HAS_RPS(i915))
return;
+ if (rps_uses_slpc(rps))
+ return;
+
intel_gt_check_clock_frequency(rps_to_gt(rps));
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
@@ -1829,6 +1853,9 @@ void intel_rps_init(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
+ if (rps_uses_slpc(rps))
+ return;
+
if (IS_CHERRYVIEW(i915))
chv_rps_init(rps);
else if (IS_VALLEYVIEW(i915))
@@ -1877,10 +1904,17 @@ void intel_rps_init(struct intel_rps *rps)
if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) < 11)
rps->pm_intrmsk_mbz |= GEN8_PMINTR_DISABLE_REDIRECT_TO_GUC;
+
+ /* GuC needs ARAT expired interrupt unmasked */
+ if (intel_uc_uses_guc_submission(&rps_to_gt(rps)->uc))
+ rps->pm_intrmsk_mbz |= ARAT_EXPIRED_INTRMSK;
}
void intel_rps_sanitize(struct intel_rps *rps)
{
+ if (rps_uses_slpc(rps))
+ return;
+
if (GRAPHICS_VER(rps_to_i915(rps)) >= 6)
rps_disable_interrupts(rps);
}
@@ -1936,6 +1970,176 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
return freq;
}
+u32 intel_rps_read_punit_req(struct intel_rps *rps)
+{
+ struct intel_uncore *uncore = rps_to_uncore(rps);
+
+ return intel_uncore_read(uncore, GEN6_RPNSWREQ);
+}
+
+static u32 intel_rps_get_req(u32 pureq)
+{
+ u32 req = pureq >> GEN9_SW_REQ_UNSLICE_RATIO_SHIFT;
+
+ return req;
+}
+
+u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps)
+{
+ u32 freq = intel_rps_get_req(intel_rps_read_punit_req(rps));
+
+ return intel_gpu_freq(rps, freq);
+}
+
+u32 intel_rps_get_requested_frequency(struct intel_rps *rps)
+{
+ if (rps_uses_slpc(rps))
+ return intel_rps_read_punit_req_frequency(rps);
+ else
+ return intel_gpu_freq(rps, rps->cur_freq);
+}
+
+u32 intel_rps_get_max_frequency(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return slpc->max_freq_softlimit;
+ else
+ return intel_gpu_freq(rps, rps->max_freq_softlimit);
+}
+
+u32 intel_rps_get_rp0_frequency(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return slpc->rp0_freq;
+ else
+ return intel_gpu_freq(rps, rps->rp0_freq);
+}
+
+u32 intel_rps_get_rp1_frequency(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return slpc->rp1_freq;
+ else
+ return intel_gpu_freq(rps, rps->rp1_freq);
+}
+
+u32 intel_rps_get_rpn_frequency(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return slpc->min_freq;
+ else
+ return intel_gpu_freq(rps, rps->min_freq);
+}
+
+static int set_max_freq(struct intel_rps *rps, u32 val)
+{
+ struct drm_i915_private *i915 = rps_to_i915(rps);
+ int ret = 0;
+
+ mutex_lock(&rps->lock);
+
+ val = intel_freq_opcode(rps, val);
+ if (val < rps->min_freq ||
+ val > rps->max_freq ||
+ val < rps->min_freq_softlimit) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ if (val > rps->rp0_freq)
+ drm_dbg(&i915->drm, "User requested overclocking to %d\n",
+ intel_gpu_freq(rps, val));
+
+ rps->max_freq_softlimit = val;
+
+ val = clamp_t(int, rps->cur_freq,
+ rps->min_freq_softlimit,
+ rps->max_freq_softlimit);
+
+ /*
+ * We still need *_set_rps to process the new max_delay and
+ * update the interrupt limits and PMINTRMSK even though
+ * the frequency request may be unchanged.
+ */
+ intel_rps_set(rps, val);
+
+unlock:
+ mutex_unlock(&rps->lock);
+
+ return ret;
+}
+
+int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return intel_guc_slpc_set_max_freq(slpc, val);
+ else
+ return set_max_freq(rps, val);
+}
+
+u32 intel_rps_get_min_frequency(struct intel_rps *rps)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return slpc->min_freq_softlimit;
+ else
+ return intel_gpu_freq(rps, rps->min_freq_softlimit);
+}
+
+static int set_min_freq(struct intel_rps *rps, u32 val)
+{
+ int ret = 0;
+
+ mutex_lock(&rps->lock);
+
+ val = intel_freq_opcode(rps, val);
+ if (val < rps->min_freq ||
+ val > rps->max_freq ||
+ val > rps->max_freq_softlimit) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ rps->min_freq_softlimit = val;
+
+ val = clamp_t(int, rps->cur_freq,
+ rps->min_freq_softlimit,
+ rps->max_freq_softlimit);
+
+ /*
+ * We still need *_set_rps to process the new min_delay and
+ * update the interrupt limits and PMINTRMSK even though
+ * the frequency request may be unchanged.
+ */
+ intel_rps_set(rps, val);
+
+unlock:
+ mutex_unlock(&rps->lock);
+
+ return ret;
+}
+
+int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val)
+{
+ struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+ if (rps_uses_slpc(rps))
+ return intel_guc_slpc_set_min_freq(slpc, val);
+ else
+ return set_min_freq(rps, val);
+}
+
/* External interface for intel_ips.ko */
static struct drm_i915_private __rcu *ips_mchdev;
@@ -2129,4 +2333,5 @@ EXPORT_SYMBOL_GPL(i915_gpu_turbo_disable);
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_rps.c"
+#include "selftest_slpc.c"
#endif
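
The setters above share one pattern: validate the new soft limit against the hardware bounds and the opposite limit, store it, then re-clamp the current request. A standalone sketch of set_max_freq()'s arithmetic (frequencies as opaque opcode units, -1 standing in for -EINVAL):

#include <stdio.h>

struct rps {
	int min_freq, max_freq;		/* hardware bounds */
	int min_soft, max_soft;		/* user soft limits */
	int cur;			/* current request */
};

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : v > hi ? hi : v;
}

static int set_max(struct rps *rps, int val)
{
	if (val < rps->min_freq || val > rps->max_freq ||
	    val < rps->min_soft)
		return -1;		/* -EINVAL in the driver */

	rps->max_soft = val;
	/* keep pushing a request even if it is numerically unchanged */
	rps->cur = clamp_int(rps->cur, rps->min_soft, rps->max_soft);
	return 0;
}

int main(void)
{
	struct rps rps = { 300, 1100, 300, 1100, 1100 };

	if (!set_max(&rps, 800))
		printf("cur re-clamped to %d\n", rps.cur);	/* 800 */
	return 0;
}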
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h
index 1d2cfc98b510..4213bcce1667 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -31,6 +31,16 @@ int intel_gpu_freq(struct intel_rps *rps, int val);
int intel_freq_opcode(struct intel_rps *rps, int val);
u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
+u32 intel_rps_get_requested_frequency(struct intel_rps *rps);
+u32 intel_rps_get_min_frequency(struct intel_rps *rps);
+int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val);
+u32 intel_rps_get_max_frequency(struct intel_rps *rps);
+int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val);
+u32 intel_rps_get_rp0_frequency(struct intel_rps *rps);
+u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
+u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
+u32 intel_rps_read_punit_req(struct intel_rps *rps);
+u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
void gen5_rps_irq_handler(struct intel_rps *rps);
void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 367fd44b81c8..bbd272943c3f 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -139,17 +139,36 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
* Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS.
* Instead of splitting these, provide userspace with an array
* of DSS to more closely represent the hardware resource.
+ *
+ * In addition, the concept of a slice has been removed in Xe_HP.
+ * To be compatible with prior generations, assume a single slice
+ * across the entire device. Then calculate the DSS for each
+ * workload type within that software slice.
*/
- intel_sseu_set_info(sseu, 1, 6, 16);
+ if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915))
+ intel_sseu_set_info(sseu, 1, 32, 16);
+ else
+ intel_sseu_set_info(sseu, 1, 6, 16);
- s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
- GEN11_GT_S_ENA_MASK;
+ /*
+ * As mentioned above, Xe_HP does not have the concept of a slice.
+ * Enable one for software backwards compatibility.
+ */
+ if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+ s_en = 0x1;
+ else
+ s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
+ GEN11_GT_S_ENA_MASK;
dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
/* one bit per pair of EUs */
- eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
- GEN11_EU_DIS_MASK);
+ if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+ eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK;
+ else
+ eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
+ GEN11_EU_DIS_MASK);
+
for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
if (eu_en_fuse & BIT(eu))
eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
@@ -188,83 +207,6 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
sseu->has_eu_pg = 1;
}
-static void gen10_sseu_info_init(struct intel_gt *gt)
-{
- struct intel_uncore *uncore = gt->uncore;
- struct sseu_dev_info *sseu = &gt->info.sseu;
- const u32 fuse2 = intel_uncore_read(uncore, GEN8_FUSE2);
- const int eu_mask = 0xff;
- u32 subslice_mask, eu_en;
- int s, ss;
-
- intel_sseu_set_info(sseu, 6, 4, 8);
-
- sseu->slice_mask = (fuse2 & GEN10_F2_S_ENA_MASK) >>
- GEN10_F2_S_ENA_SHIFT;
-
- /* Slice0 */
- eu_en = ~intel_uncore_read(uncore, GEN8_EU_DISABLE0);
- for (ss = 0; ss < sseu->max_subslices; ss++)
- sseu_set_eus(sseu, 0, ss, (eu_en >> (8 * ss)) & eu_mask);
- /* Slice1 */
- sseu_set_eus(sseu, 1, 0, (eu_en >> 24) & eu_mask);
- eu_en = ~intel_uncore_read(uncore, GEN8_EU_DISABLE1);
- sseu_set_eus(sseu, 1, 1, eu_en & eu_mask);
- /* Slice2 */
- sseu_set_eus(sseu, 2, 0, (eu_en >> 8) & eu_mask);
- sseu_set_eus(sseu, 2, 1, (eu_en >> 16) & eu_mask);
- /* Slice3 */
- sseu_set_eus(sseu, 3, 0, (eu_en >> 24) & eu_mask);
- eu_en = ~intel_uncore_read(uncore, GEN8_EU_DISABLE2);
- sseu_set_eus(sseu, 3, 1, eu_en & eu_mask);
- /* Slice4 */
- sseu_set_eus(sseu, 4, 0, (eu_en >> 8) & eu_mask);
- sseu_set_eus(sseu, 4, 1, (eu_en >> 16) & eu_mask);
- /* Slice5 */
- sseu_set_eus(sseu, 5, 0, (eu_en >> 24) & eu_mask);
- eu_en = ~intel_uncore_read(uncore, GEN10_EU_DISABLE3);
- sseu_set_eus(sseu, 5, 1, eu_en & eu_mask);
-
- subslice_mask = (1 << 4) - 1;
- subslice_mask &= ~((fuse2 & GEN10_F2_SS_DIS_MASK) >>
- GEN10_F2_SS_DIS_SHIFT);
-
- for (s = 0; s < sseu->max_slices; s++) {
- u32 subslice_mask_with_eus = subslice_mask;
-
- for (ss = 0; ss < sseu->max_subslices; ss++) {
- if (sseu_get_eus(sseu, s, ss) == 0)
- subslice_mask_with_eus &= ~BIT(ss);
- }
-
- /*
- * Slice0 can have up to 3 subslices, but there are only 2 in
- * slice1/2.
- */
- intel_sseu_set_subslices(sseu, s, s == 0 ?
- subslice_mask_with_eus :
- subslice_mask_with_eus & 0x3);
- }
-
- sseu->eu_total = compute_eu_total(sseu);
-
- /*
- * CNL is expected to always have a uniform distribution
- * of EU across subslices with the exception that any one
- * EU in any one subslice may be fused off for die
- * recovery.
- */
- sseu->eu_per_subslice =
- intel_sseu_subslice_total(sseu) ?
- DIV_ROUND_UP(sseu->eu_total, intel_sseu_subslice_total(sseu)) :
- 0;
-
- /* No restrictions on Power Gating */
- sseu->has_slice_pg = 1;
- sseu->has_subslice_pg = 1;
- sseu->has_eu_pg = 1;
-}
-
static void cherryview_sseu_info_init(struct intel_gt *gt)
{
struct sseu_dev_info *sseu = &gt->info.sseu;
@@ -592,8 +534,6 @@ void intel_sseu_info_init(struct intel_gt *gt)
bdw_sseu_info_init(gt);
else if (GRAPHICS_VER(i915) == 9)
gen9_sseu_info_init(gt);
- else if (GRAPHICS_VER(i915) == 10)
- gen10_sseu_info_init(gt);
else if (GRAPHICS_VER(i915) == 11)
gen11_sseu_info_init(gt);
else if (GRAPHICS_VER(i915) >= 12)
@@ -759,3 +699,21 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
}
}
}
+
+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice)
+{
+ u16 slice_mask = 0;
+ int i;
+
+ WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask));
+
+ for (i = 0; dss_mask; i++) {
+ if (dss_mask & GENMASK(dss_per_slice - 1, 0))
+ slice_mask |= BIT(i);
+
+ dss_mask >>= dss_per_slice;
+ }
+
+ return slice_mask;
+}
+
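
intel_slicemask_from_dssmask() above compresses a DSS bitmask into a slice bitmask: any group of dss_per_slice bits with at least one DSS enabled marks that slice as present. A standalone walk-through:

#include <stdint.h>
#include <stdio.h>

static uint16_t slicemask_from_dssmask(uint64_t dss_mask, int dss_per_slice)
{
	uint64_t group = (1ull << dss_per_slice) - 1;	/* GENMASK(n - 1, 0) */
	uint16_t slice_mask = 0;
	int i;

	for (i = 0; dss_mask; i++) {
		if (dss_mask & group)
			slice_mask |= (uint16_t)(1u << i);
		dss_mask >>= dss_per_slice;
	}
	return slice_mask;
}

int main(void)
{
	/* DSS 0-3 and 9 enabled, 4 DSS per slice -> slices 0 and 2 (0x5) */
	printf("%#x\n", (unsigned int)slicemask_from_dssmask(0x20f, 4));
	return 0;
}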
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 4cd1a8a7298a..22fef98887c0 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -15,13 +15,17 @@ struct drm_i915_private;
struct intel_gt;
struct drm_printer;
-#define GEN_MAX_SLICES (6) /* CNL upper bound */
-#define GEN_MAX_SUBSLICES (8) /* ICL upper bound */
+#define GEN_MAX_SLICES (3) /* SKL upper bound */
+#define GEN_MAX_SUBSLICES (32) /* XEHPSDV upper bound */
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
#define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
#define GEN_MAX_EUS (16) /* TGL upper bound */
#define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
+#define GEN_DSS_PER_GSLICE 4
+#define GEN_DSS_PER_CSLICE 8
+#define GEN_DSS_PER_MSLICE 8
+
struct sseu_dev_info {
u8 slice_mask;
u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
@@ -104,4 +108,6 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p);
void intel_sseu_print_topology(const struct sseu_dev_info *sseu,
struct drm_printer *p);
+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
+
#endif /* __INTEL_SSEU_H__ */
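
The bumped GEN_MAX_SUBSLICES above ripples into the mask storage through GEN_SSEU_STRIDE, which rounds a bit count up to whole bytes. A standalone check:

#include <stdio.h>

#define BITS_PER_BYTE 8
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define GEN_SSEU_STRIDE(max) DIV_ROUND_UP(max, BITS_PER_BYTE)

int main(void)
{
	/* 32 subslices need 4 mask bytes, 16 need 2 */
	printf("%d %d\n", GEN_SSEU_STRIDE(32), GEN_SSEU_STRIDE(16));
	return 0;
}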
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
index 714fe8495775..1ba8b7da9d37 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
@@ -50,10 +50,10 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
#undef SS_MAX
}
-static void gen10_sseu_device_status(struct intel_gt *gt,
+static void gen11_sseu_device_status(struct intel_gt *gt,
struct sseu_dev_info *sseu)
{
-#define SS_MAX 6
+#define SS_MAX 8
struct intel_uncore *uncore = gt->uncore;
const struct intel_gt_info *info = &gt->info;
u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
@@ -267,8 +267,8 @@ int intel_sseu_status(struct seq_file *m, struct intel_gt *gt)
bdw_sseu_device_status(gt, &sseu);
else if (GRAPHICS_VER(i915) == 9)
gen9_sseu_device_status(gt, &sseu);
- else if (GRAPHICS_VER(i915) >= 10)
- gen10_sseu_device_status(gt, &sseu);
+ else if (GRAPHICS_VER(i915) >= 11)
+ gen11_sseu_device_status(gt, &sseu);
}
i915_print_sseu_info(m, false, HAS_POOLED_EU(i915), &sseu);
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index b62d1e31a645..aae609d7d85d 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -150,13 +150,14 @@ static void _wa_add(struct i915_wa_list *wal, const struct i915_wa *wa)
}
static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
- u32 clear, u32 set, u32 read_mask)
+ u32 clear, u32 set, u32 read_mask, bool masked_reg)
{
struct i915_wa wa = {
.reg = reg,
.clr = clear,
.set = set,
.read = read_mask,
+ .masked_reg = masked_reg,
};
_wa_add(wal, &wa);
@@ -165,7 +166,7 @@ static void wa_add(struct i915_wa_list *wal, i915_reg_t reg,
static void
wa_write_clr_set(struct i915_wa_list *wal, i915_reg_t reg, u32 clear, u32 set)
{
- wa_add(wal, reg, clear, set, clear);
+ wa_add(wal, reg, clear, set, clear, false);
}
static void
@@ -200,20 +201,20 @@ wa_write_clr(struct i915_wa_list *wal, i915_reg_t reg, u32 clr)
static void
wa_masked_en(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
{
- wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val);
+ wa_add(wal, reg, 0, _MASKED_BIT_ENABLE(val), val, true);
}
static void
wa_masked_dis(struct i915_wa_list *wal, i915_reg_t reg, u32 val)
{
- wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val);
+ wa_add(wal, reg, 0, _MASKED_BIT_DISABLE(val), val, true);
}
static void
wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
u32 mask, u32 val)
{
- wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask);
+ wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), mask, true);
}
static void gen6_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -514,53 +515,15 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine,
GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
}
-static void cnl_ctx_workarounds_init(struct intel_engine_cs *engine,
- struct i915_wa_list *wal)
-{
- /* WaForceContextSaveRestoreNonCoherent:cnl */
- wa_masked_en(wal, CNL_HDC_CHICKEN0,
- HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
- /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
- wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
- GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
- /* WaPushConstantDereferenceHoldDisable:cnl */
- wa_masked_en(wal, GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
- /* FtrEnableFastAnisoL1BankingFix:cnl */
- wa_masked_en(wal, HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
- /* WaDisable3DMidCmdPreemption:cnl */
- wa_masked_dis(wal, GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
- /* WaDisableGPGPUMidCmdPreemption:cnl */
- wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
- GEN9_PREEMPT_GPGPU_LEVEL_MASK,
- GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
- /* WaDisableEarlyEOT:cnl */
- wa_masked_en(wal, GEN8_ROW_CHICKEN, DISABLE_EARLY_EOT);
-}
-
static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
struct i915_wa_list *wal)
{
- struct drm_i915_private *i915 = engine->i915;
-
- /* WaDisableBankHangMode:icl */
+ /* Wa_1406697149 (WaDisableBankHangMode:icl) */
wa_write(wal,
GEN8_L3CNTLREG,
intel_uncore_read(engine->uncore, GEN8_L3CNTLREG) |
GEN8_ERRDETBCTRL);
- /* Wa_1604370585:icl (pre-prod)
- * Formerly known as WaPushConstantDereferenceHoldDisable
- */
- if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
- wa_masked_en(wal, GEN7_ROW_CHICKEN2,
- PUSH_CONSTANT_DEREF_DISABLE);
-
/* WaForceEnableNonCoherent:icl
* This is not the same workaround as in early Gen9 platforms, where
* lacking this could cause system hangs, but coherency performance
@@ -570,23 +533,11 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
*/
wa_masked_en(wal, ICL_HDC_MODE, HDC_FORCE_NON_COHERENT);
- /* Wa_2006611047:icl (pre-prod)
- * Formerly known as WaDisableImprovedTdlClkGating
- */
- if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
- wa_masked_en(wal, GEN7_ROW_CHICKEN2,
- GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
-
- /* Wa_2006665173:icl (pre-prod) */
- if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
- wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
- GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
-
/* WaEnableFloatBlendOptimization:icl */
- wa_write_clr_set(wal,
- GEN10_CACHE_MODE_SS,
- 0, /* write-only, so skip validation */
- _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE));
+ wa_add(wal, GEN10_CACHE_MODE_SS, 0,
+ _MASKED_BIT_ENABLE(FLOAT_BLEND_OPTIMIZATION_ENABLE),
+ 0 /* write-only, so skip validation */,
+ true);
/* WaDisableGPGPUMidThreadPreemption:icl */
wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
@@ -631,7 +582,7 @@ static void gen12_ctx_gt_tuning_init(struct intel_engine_cs *engine,
FF_MODE2,
FF_MODE2_TDS_TIMER_MASK,
FF_MODE2_TDS_TIMER_128,
- 0);
+ 0, false);
}
static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -640,15 +591,16 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
gen12_ctx_gt_tuning_init(engine, wal);
/*
- * Wa_1409142259:tgl
- * Wa_1409347922:tgl
- * Wa_1409252684:tgl
- * Wa_1409217633:tgl
- * Wa_1409207793:tgl
- * Wa_1409178076:tgl
- * Wa_1408979724:tgl
- * Wa_14010443199:rkl
- * Wa_14010698770:rkl
+ * Wa_1409142259:tgl,dg1,adl-p
+ * Wa_1409347922:tgl,dg1,adl-p
+ * Wa_1409252684:tgl,dg1,adl-p
+ * Wa_1409217633:tgl,dg1,adl-p
+ * Wa_1409207793:tgl,dg1,adl-p
+ * Wa_1409178076:tgl,dg1,adl-p
+ * Wa_1408979724:tgl,dg1,adl-p
+ * Wa_14010443199:tgl,rkl,dg1,adl-p
+ * Wa_14010698770:tgl,rkl,dg1,adl-s,adl-p
+ * Wa_1409342910:tgl,rkl,dg1,adl-s,adl-p
*/
wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
@@ -668,7 +620,14 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine,
FF_MODE2,
FF_MODE2_GS_TIMER_MASK,
FF_MODE2_GS_TIMER_224,
- 0);
+ 0, false);
+
+ /*
+ * Wa_14012131227:dg1
+ * Wa_1508744258:tgl,rkl,dg1,adl-s,adl-p
+ */
+ wa_masked_en(wal, GEN7_COMMON_SLICE_CHICKEN1,
+ GEN9_RHWO_OPTIMIZATION_DISABLE);
}
static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
@@ -703,8 +662,6 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
gen12_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_ctx_workarounds_init(engine, wal);
- else if (IS_CANNONLAKE(i915))
- cnl_ctx_workarounds_init(engine, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_ctx_workarounds_init(engine, wal);
else if (IS_GEMINILAKE(i915))
@@ -839,7 +796,7 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
wa_add(wal,
HSW_ROW_CHICKEN3, 0,
_MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE),
- 0 /* XXX does this reg exist? */);
+ 0 /* XXX does this reg exist? */, true);
/* WaVSRefCountFullforceMissDisable:hsw */
wa_write_clr(wal, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
@@ -882,30 +839,19 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
/* WaInPlaceDecompressionHang:skl */
- if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
+ if (IS_SKL_GT_STEP(i915, STEP_A0, STEP_H0))
wa_write_or(wal,
GEN9_GAMT_ECO_REG_RW_IA,
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
}
static void
-bxt_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
-{
- gen9_gt_workarounds_init(i915, wal);
-
- /* WaInPlaceDecompressionHang:bxt */
- wa_write_or(wal,
- GEN9_GAMT_ECO_REG_RW_IA,
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
-}
-
-static void
kbl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
{
gen9_gt_workarounds_init(i915, wal);
/* WaDisableDynamicCreditSharing:kbl */
- if (IS_KBL_GT_STEP(i915, 0, STEP_B0))
+ if (IS_KBL_GT_STEP(i915, 0, STEP_C0))
wa_write_or(wal,
GAMT_CHKN_BIT_REG,
GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING);
@@ -943,98 +889,144 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
}
+static void __set_mcr_steering(struct i915_wa_list *wal,
+ i915_reg_t steering_reg,
+ unsigned int slice, unsigned int subslice)
+{
+ u32 mcr, mcr_mask;
+
+ mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
+ mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
+
+ wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
+}
+
+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
+ unsigned int slice, unsigned int subslice)
+{
+ drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
+
+ __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
+}
+
static void
-wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
+icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
{
const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
unsigned int slice, subslice;
- u32 l3_en, mcr, mcr_mask;
- GEM_BUG_ON(GRAPHICS_VER(i915) < 10);
+ GEM_BUG_ON(GRAPHICS_VER(i915) < 11);
+ GEM_BUG_ON(hweight8(sseu->slice_mask) > 1);
+ slice = 0;
/*
- * WaProgramMgsrForL3BankSpecificMmioReads: cnl,icl
- * L3Banks could be fused off in single slice scenario. If that is
- * the case, we might need to program MCR select to a valid L3Bank
- * by default, to make sure we correctly read certain registers
- * later on (in the range 0xB100 - 0xB3FF).
+ * Although a platform may have subslices, we need to always steer
+ * reads to the lowest instance that isn't fused off. When Render
+ * Power Gating is enabled, grabbing forcewake will only power up a
+ * single subslice (the "minconfig") if there isn't a real workload
+ * that needs to be run; this means that if we steer register reads to
+ * one of the higher subslices, we run the risk of reading back 0's or
+ * random garbage.
+ */
+ subslice = __ffs(intel_sseu_get_subslices(sseu, slice));
+
+ /*
+ * If the subslice we picked above also steers us to a valid L3 bank,
+ * then we can just rely on the default steering and won't need to
+ * worry about explicitly re-steering L3BANK reads later.
+ */
+ if (i915->gt.info.l3bank_mask & BIT(subslice))
+ i915->gt.steering_table[L3BANK] = NULL;
+
+ __add_mcr_wa(i915, wal, slice, subslice);
+}
+
+static void
+xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+ struct drm_i915_private *i915 = gt->i915;
+ const struct sseu_dev_info *sseu = &gt->info.sseu;
+ unsigned long slice, subslice = 0, slice_mask = 0;
+ u64 dss_mask = 0;
+ u32 lncf_mask = 0;
+ int i;
+
+ /*
+ * On Xe_HP the steering increases in complexity. There are now several
+ * more units that require steering and we're not guaranteed to be able
+ * to find a common setting for all of them. These are:
+ * - GSLICE (fusable)
+ * - DSS (sub-unit within gslice; fusable)
+ * - L3 Bank (fusable)
+ * - MSLICE (fusable)
+ * - LNCF (sub-unit within mslice; always present if mslice is present)
*
- * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
- * Before any MMIO read into slice/subslice specific registers, MCR
- * packet control register needs to be programmed to point to any
- * enabled s/ss pair. Otherwise, incorrect values will be returned.
- * This means each subsequent MMIO read will be forwarded to an
- * specific s/ss combination, but this is OK since these registers
- * are consistent across s/ss in almost all cases. In the rare
- * occasions, such as INSTDONE, where this value is dependent
- * on s/ss combo, the read should be done with read_subslice_reg.
+ * We'll do our default/implicit steering based on GSLICE (in the
+ * sliceid field) and DSS (in the subsliceid field). If we can
+ * find overlap between the valid MSLICE and/or LNCF values and
+ * a suitable GSLICE, then we can just re-use the default value and
+ * skip explicit steering at runtime.
*
- * Since GEN8_MCR_SELECTOR contains dual-purpose bits which select both
- * to which subslice, or to which L3 bank, the respective mmio reads
- * will go, we have to find a common index which works for both
- * accesses.
+ * We only need to look for overlap between GSLICE/MSLICE/LNCF to find
+ * a valid sliceid value. DSS steering is the only type of steering
+ * that utilizes the 'subsliceid' bits.
*
- * Case where we cannot find a common index fortunately should not
- * happen in production hardware, so we only emit a warning instead of
- * implementing something more complex that requires checking the range
- * of every MMIO read.
+ * Also note that, even though the steering domain is called "GSlice"
+ * and it is encoded in the register using the gslice format, the spec
+ * says that the combined (geometry | compute) fuse should be used to
+ * select the steering.
*/
- if (GRAPHICS_VER(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
- u32 l3_fuse =
- intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3) &
- GEN10_L3BANK_MASK;
+ /* Find the potential gslice candidates */
+ dss_mask = intel_sseu_get_subslices(sseu, 0);
+ slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE);
- drm_dbg(&i915->drm, "L3 fuse = %x\n", l3_fuse);
- l3_en = ~(l3_fuse << GEN10_L3BANK_PAIR_COUNT | l3_fuse);
- } else {
- l3_en = ~0;
- }
+ /*
+ * Find the potential LNCF candidates. Either LNCF within a valid
+ * mslice is fine.
+ */
+ for_each_set_bit(i, &gt->info.mslice_mask, GEN12_MAX_MSLICES)
+ lncf_mask |= (0x3 << (i * 2));
- slice = fls(sseu->slice_mask) - 1;
- subslice = fls(l3_en & intel_sseu_get_subslices(sseu, slice));
- if (!subslice) {
- drm_warn(&i915->drm,
- "No common index found between subslice mask %x and L3 bank mask %x!\n",
- intel_sseu_get_subslices(sseu, slice), l3_en);
- subslice = fls(l3_en);
- drm_WARN_ON(&i915->drm, !subslice);
- }
- subslice--;
-
- if (GRAPHICS_VER(i915) >= 11) {
- mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
- mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
- } else {
- mcr = GEN8_MCR_SLICE(slice) | GEN8_MCR_SUBSLICE(subslice);
- mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
+ /*
+ * Are there any sliceid values that work for both GSLICE and LNCF
+ * steering?
+ */
+ if (slice_mask & lncf_mask) {
+ slice_mask &= lncf_mask;
+ gt->steering_table[LNCF] = NULL;
}
- drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
+ /* How about sliceid values that also work for MSLICE steering? */
+ if (slice_mask & gt->info.mslice_mask) {
+ slice_mask &= gt->info.mslice_mask;
+ gt->steering_table[MSLICE] = NULL;
+ }
- wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
-}
+ slice = __ffs(slice_mask);
+ subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE));
+ WARN_ON(subslice > GEN_DSS_PER_GSLICE);
+ WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
-static void
-cnl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
-{
- wa_init_mcr(i915, wal);
+ __add_mcr_wa(i915, wal, slice, subslice);
- /* WaInPlaceDecompressionHang:cnl */
- wa_write_or(wal,
- GEN9_GAMT_ECO_REG_RW_IA,
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+ /*
+ * SQIDI ranges are special because they use different steering
+ * registers than everything else we work with. On XeHP SDV and
+ * DG2-G10, any value in the steering registers will work fine since
+ * all instances are present, but DG2-G11 only has SQIDI instances at
+ * IDs 2 and 3, so we need to steer to one of those. For simplicity
+ * we'll just steer to a hardcoded "2" since that value will work
+ * everywhere.
+ */
+ __set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
+ __set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
}
static void
icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
{
- wa_init_mcr(i915, wal);
-
- /* WaInPlaceDecompressionHang:icl */
- wa_write_or(wal,
- GEN9_GAMT_ECO_REG_RW_IA,
- GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
+ icl_wa_init_mcr(i915, wal);
/* WaModifyGamTlbPartitioning:icl */
wa_write_clr_set(wal,
@@ -1057,18 +1049,6 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
GEN8_GAMW_ECO_DEV_RW_IA,
GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
- /* Wa_1405779004:icl (pre-prod) */
- if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
- wa_write_or(wal,
- SLICE_UNIT_LEVEL_CLKGATE,
- MSCUNIT_CLKGATE_DIS);
-
- /* Wa_1406838659:icl (pre-prod) */
- if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
- wa_write_or(wal,
- INF_UNIT_LEVEL_CLKGATE,
- CGPSF_CLKGATE_DIS);
-
/* Wa_1406463099:icl
* Formerly known as WaGamTlbPendError
*/
@@ -1078,10 +1058,16 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
/* Wa_1607087056:icl,ehl,jsl */
if (IS_ICELAKE(i915) ||
- IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0))
+ IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
+
+ /*
+ * This is not a documented workaround, but rather an optimization
+ * to reduce sampler power.
+ */
+ wa_write_clr(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
}
/*
@@ -1111,10 +1097,13 @@ static void
gen12_gt_workarounds_init(struct drm_i915_private *i915,
struct i915_wa_list *wal)
{
- wa_init_mcr(i915, wal);
+ icl_wa_init_mcr(i915, wal);
- /* Wa_14011060649:tgl,rkl,dg1,adls */
+ /* Wa_14011060649:tgl,rkl,dg1,adl-s,adl-p */
wa_14011060649(i915, wal);
+
+ /* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */
+ wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
}
static void
@@ -1123,19 +1112,19 @@ tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
/* Wa_1409420604:tgl */
- if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0))
+ if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
SUBSLICE_UNIT_LEVEL_CLKGATE2,
CPSSUNIT_CLKGATE_DIS);
/* Wa_1607087056:tgl also known as BUG:1409180338 */
- if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0))
+ if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
/* Wa_1408615072:tgl[a0] */
- if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0))
+ if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2,
VSUNIT_CLKGATE_DIS_TGL);
}
@@ -1146,7 +1135,7 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
/* Wa_1607087056:dg1 */
- if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
+ if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
@@ -1165,9 +1154,17 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
+xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
+{
+ xehp_init_mcr(&i915->gt, wal);
+}
+
+static void
gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
{
- if (IS_DG1(i915))
+ if (IS_XEHPSDV(i915))
+ xehpsdv_gt_workarounds_init(i915, wal);
+ else if (IS_DG1(i915))
dg1_gt_workarounds_init(i915, wal);
else if (IS_TIGERLAKE(i915))
tgl_gt_workarounds_init(i915, wal);
@@ -1175,8 +1172,6 @@ gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_gt_workarounds_init(i915, wal);
- else if (IS_CANNONLAKE(i915))
- cnl_gt_workarounds_init(i915, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_gt_workarounds_init(i915, wal);
else if (IS_GEMINILAKE(i915))
@@ -1184,7 +1179,7 @@ gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
else if (IS_KABYLAKE(i915))
kbl_gt_workarounds_init(i915, wal);
else if (IS_BROXTON(i915))
- bxt_gt_workarounds_init(i915, wal);
+ gen9_gt_workarounds_init(i915, wal);
else if (IS_SKYLAKE(i915))
skl_gt_workarounds_init(i915, wal);
else if (IS_HASWELL(i915))
@@ -1247,8 +1242,9 @@ wa_verify(const struct i915_wa *wa, u32 cur, const char *name, const char *from)
}
static void
-wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
+wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
{
+ struct intel_uncore *uncore = gt->uncore;
enum forcewake_domains fw;
unsigned long flags;
struct i915_wa *wa;
@@ -1263,13 +1259,16 @@ wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
intel_uncore_forcewake_get__locked(uncore, fw);
for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
- if (wa->clr)
- intel_uncore_rmw_fw(uncore, wa->reg, wa->clr, wa->set);
- else
- intel_uncore_write_fw(uncore, wa->reg, wa->set);
+ u32 val, old = 0;
+
+ /* open-coded rmw due to steering */
+ old = wa->clr ? intel_gt_read_register_fw(gt, wa->reg) : 0;
+ val = (old & ~wa->clr) | wa->set;
+ if (val != old || !wa->clr)
+ intel_uncore_write_fw(uncore, wa->reg, val);
+
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
- wa_verify(wa,
- intel_uncore_read_fw(uncore, wa->reg),
+ wa_verify(wa, intel_gt_read_register_fw(gt, wa->reg),
wal->name, "application");
}
@@ -1279,28 +1278,39 @@ wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal)
void intel_gt_apply_workarounds(struct intel_gt *gt)
{
- wa_list_apply(gt->uncore, &gt->i915->gt_wa_list);
+ wa_list_apply(gt, &gt->i915->gt_wa_list);
}
-static bool wa_list_verify(struct intel_uncore *uncore,
+static bool wa_list_verify(struct intel_gt *gt,
const struct i915_wa_list *wal,
const char *from)
{
+ struct intel_uncore *uncore = gt->uncore;
struct i915_wa *wa;
+ enum forcewake_domains fw;
+ unsigned long flags;
unsigned int i;
bool ok = true;
+ fw = wal_get_fw_for_rmw(uncore, wal);
+
+ spin_lock_irqsave(&uncore->lock, flags);
+ intel_uncore_forcewake_get__locked(uncore, fw);
+
for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
ok &= wa_verify(wa,
- intel_uncore_read(uncore, wa->reg),
+ intel_gt_read_register_fw(gt, wa->reg),
wal->name, from);
+ intel_uncore_forcewake_put__locked(uncore, fw);
+ spin_unlock_irqrestore(&uncore->lock, flags);
+
return ok;
}
bool intel_gt_verify_workarounds(struct intel_gt *gt, const char *from)
{
- return wa_list_verify(gt->uncore, &gt->i915->gt_wa_list, from);
+ return wa_list_verify(gt, &gt->i915->gt_wa_list, from);
}
__maybe_unused
@@ -1438,17 +1448,6 @@ static void cml_whitelist_build(struct intel_engine_cs *engine)
cfl_whitelist_build(engine);
}
-static void cnl_whitelist_build(struct intel_engine_cs *engine)
-{
- struct i915_wa_list *w = &engine->whitelist;
-
- if (engine->class != RENDER_CLASS)
- return;
-
- /* WaEnablePreemptionGranularityControlByUMD:cnl */
- whitelist_reg(w, GEN8_CS_CHICKEN1);
-}
-
static void icl_whitelist_build(struct intel_engine_cs *engine)
{
struct i915_wa_list *w = &engine->whitelist;
@@ -1542,7 +1541,7 @@ static void dg1_whitelist_build(struct intel_engine_cs *engine)
tgl_whitelist_build(engine);
/* GEN:BUG:1409280441:dg1 */
- if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
+ if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_B0) &&
(engine->class == RENDER_CLASS ||
engine->class == COPY_ENGINE_CLASS))
whitelist_reg_ext(w, RING_ID(engine->mmio_base),
@@ -1562,8 +1561,6 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine)
tgl_whitelist_build(engine);
else if (GRAPHICS_VER(i915) == 11)
icl_whitelist_build(engine);
- else if (IS_CANNONLAKE(i915))
- cnl_whitelist_build(engine);
else if (IS_COMETLAKE(i915))
cml_whitelist_build(engine);
else if (IS_COFFEELAKE(i915))
@@ -1612,8 +1609,8 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
{
struct drm_i915_private *i915 = engine->i915;
- if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
- IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
+ if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0) ||
+ IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0)) {
/*
* Wa_1607138336:tgl[a0],dg1[a0]
* Wa_1607063988:tgl[a0],dg1[a0]
@@ -1623,7 +1620,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
GEN12_DISABLE_POSH_BUSY_FF_DOP_CG);
}
- if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
+ if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0)) {
/*
* Wa_1606679103:tgl
* (see also Wa_1606682166:icl)
@@ -1633,44 +1630,46 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
GEN7_DISABLE_SAMPLER_PREFETCH);
}
- if (IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
+ if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || IS_DG1(i915) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
- /* Wa_1606931601:tgl,rkl,dg1,adl-s */
+ /* Wa_1606931601:tgl,rkl,dg1,adl-s,adl-p */
wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN12_DISABLE_EARLY_READ);
/*
* Wa_1407928979:tgl A*
* Wa_18011464164:tgl[B0+],dg1[B0+]
* Wa_22010931296:tgl[B0+],dg1[B0+]
- * Wa_14010919138:rkl,dg1,adl-s
+ * Wa_14010919138:rkl,dg1,adl-s,adl-p
*/
wa_write_or(wal, GEN7_FF_THREAD_MODE,
GEN12_FF_TESSELATION_DOP_GATE_DISABLE);
/*
- * Wa_1606700617:tgl,dg1
- * Wa_22010271021:tgl,rkl,dg1, adl-s
+ * Wa_1606700617:tgl,dg1,adl-p
+ * Wa_22010271021:tgl,rkl,dg1,adl-s,adl-p
+ * Wa_14010826681:tgl,dg1,rkl,adl-p
*/
wa_masked_en(wal,
GEN9_CS_DEBUG_MODE1,
FF_DOP_CLOCK_GATE_DISABLE);
}
- if (IS_ALDERLAKE_S(i915) || IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+ if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
+ IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
- /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s */
+ /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
GEN12_PUSH_CONST_DEREF_HOLD_DIS);
/*
* Wa_1409085225:tgl
- * Wa_14010229206:tgl,rkl,dg1[a0],adl-s
+ * Wa_14010229206:tgl,rkl,dg1[a0],adl-s,adl-p
*/
wa_masked_en(wal, GEN9_ROW_CHICKEN4, GEN12_DISABLE_TDL_PUSH);
}
- if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+ if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/*
* Wa_1607030317:tgl
@@ -1688,8 +1687,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
GEN8_RC_SEMA_IDLE_MSG_DISABLE);
}
- if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
- /* Wa_1406941453:tgl,rkl,dg1 */
+ if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) ||
+ IS_ALDERLAKE_S(i915) || IS_ALDERLAKE_P(i915)) {
+ /* Wa_1406941453:tgl,rkl,dg1,adl-s,adl-p */
wa_masked_en(wal,
GEN10_SAMPLER_MODE,
ENABLE_SMALLPL);
@@ -1701,11 +1701,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
_3D_CHICKEN3,
_3D_CHICKEN3_AA_LINE_QUALITY_FIX_ENABLE);
- /* WaPipelineFlushCoherentLines:icl */
- wa_write_or(wal,
- GEN8_L3SQCREG4,
- GEN8_LQSC_FLUSH_COHERENT_LINES);
-
/*
* Wa_1405543622:icl
* Formerly known as WaGAPZPriorityScheme
@@ -1735,19 +1730,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
GEN8_L3SQCREG4,
GEN11_LQSC_CLEAN_EVICT_DISABLE);
- /* WaForwardProgressSoftReset:icl */
- wa_write_or(wal,
- GEN10_SCRATCH_LNCF2,
- PMFLUSHDONE_LNICRSDROP |
- PMFLUSH_GAPL3UNBLOCK |
- PMFLUSHDONE_LNEBLK);
-
- /* Wa_1406609255:icl (pre-prod) */
- if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
- wa_write_or(wal,
- GEN7_SARCHKMD,
- GEN7_DISABLE_DEMAND_PREFETCH);
-
/* Wa_1606682166:icl */
wa_write_or(wal,
GEN7_SARCHKMD,
@@ -1947,10 +1929,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
* disable bit, which we don't touch here, but it's good
* to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
*/
- wa_add(wal, GEN7_GT_MODE, 0,
- _MASKED_FIELD(GEN6_WIZ_HASHING_MASK,
- GEN6_WIZ_HASHING_16x4),
- GEN6_WIZ_HASHING_16x4);
+ wa_masked_field_set(wal,
+ GEN7_GT_MODE,
+ GEN6_WIZ_HASHING_MASK,
+ GEN6_WIZ_HASHING_16x4);
}
if (IS_GRAPHICS_VER(i915, 6, 7))
@@ -2000,10 +1982,10 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
* disable bit, which we don't touch here, but it's good
* to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
*/
- wa_add(wal,
- GEN6_GT_MODE, 0,
- _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4),
- GEN6_WIZ_HASHING_16x4);
+ wa_masked_field_set(wal,
+ GEN6_GT_MODE,
+ GEN6_WIZ_HASHING_MASK,
+ GEN6_WIZ_HASHING_16x4);
/* WaDisable_RenderCache_OperationalFlush:snb */
wa_masked_dis(wal, CACHE_MODE_0, RC_OP_FLUSH_ENABLE);
@@ -2024,7 +2006,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
wa_add(wal, MI_MODE,
0, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH),
/* XXX bit doesn't stick on Broadwater */
- IS_I965G(i915) ? 0 : VS_TIMER_DISPATCH);
+ IS_I965G(i915) ? 0 : VS_TIMER_DISPATCH, true);
if (GRAPHICS_VER(i915) == 4)
/*
@@ -2039,7 +2021,8 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
*/
wa_add(wal, ECOSKPD,
0, _MASKED_BIT_ENABLE(ECO_CONSTANT_BUFFER_SR_DISABLE),
- 0 /* XXX bit doesn't stick on Broadwater */);
+ 0 /* XXX bit doesn't stick on Broadwater */,
+ true);
}
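/*
 * For illustration: a plausible definition of the wa_masked_field_set()
 * helper adopted in the hunks above, inferred from the open-coded
 * wa_add(..., _MASKED_FIELD(...)) form it replaces plus the new trailing
 * masked_reg flag seen in the remaining wa_add() calls; the exact
 * in-tree version may differ.
 */
static void
wa_masked_field_set(struct i915_wa_list *wal, i915_reg_t reg,
		    u32 mask, u32 val)
{
	wa_add(wal, reg, 0, _MASKED_FIELD(mask, val), val, true);
}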
static void
@@ -2048,7 +2031,7 @@ xcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
struct drm_i915_private *i915 = engine->i915;
/* WaKBLVECSSemaphoreWaitPoll:kbl */
- if (IS_KBL_GT_STEP(i915, STEP_A0, STEP_E0)) {
+ if (IS_KBL_GT_STEP(i915, STEP_A0, STEP_F0)) {
wa_write(wal,
RING_SEMA_WAIT_POLL(engine->mmio_base),
1);
@@ -2081,7 +2064,7 @@ void intel_engine_init_workarounds(struct intel_engine_cs *engine)
void intel_engine_apply_workarounds(struct intel_engine_cs *engine)
{
- wa_list_apply(engine->uncore, &engine->wa_list);
+ wa_list_apply(engine->gt, &engine->wa_list);
}
struct mcr_range {
@@ -2107,12 +2090,31 @@ static const struct mcr_range mcr_ranges_gen12[] = {
{},
};
+static const struct mcr_range mcr_ranges_xehp[] = {
+ { .start = 0x4000, .end = 0x4aff },
+ { .start = 0x5200, .end = 0x52ff },
+ { .start = 0x5400, .end = 0x7fff },
+ { .start = 0x8140, .end = 0x815f },
+ { .start = 0x8c80, .end = 0x8dff },
+ { .start = 0x94d0, .end = 0x955f },
+ { .start = 0x9680, .end = 0x96ff },
+ { .start = 0xb000, .end = 0xb3ff },
+ { .start = 0xc800, .end = 0xcfff },
+ { .start = 0xd800, .end = 0xd8ff },
+ { .start = 0xdc00, .end = 0xffff },
+ { .start = 0x17000, .end = 0x17fff },
+ { .start = 0x24a00, .end = 0x24a7f },
+ {},
+};
+
static bool mcr_range(struct drm_i915_private *i915, u32 offset)
{
const struct mcr_range *mcr_ranges;
int i;
- if (GRAPHICS_VER(i915) >= 12)
+ if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+ mcr_ranges = mcr_ranges_xehp;
+ else if (GRAPHICS_VER(i915) >= 12)
mcr_ranges = mcr_ranges_gen12;
else if (GRAPHICS_VER(i915) >= 8)
mcr_ranges = mcr_ranges_gen8;
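/*
 * For context, the remainder of mcr_range() (unchanged by this patch)
 * is a linear scan over whichever sentinel-terminated table was
 * selected above. Sketched here from the table layout; the in-tree
 * body may differ slightly:
 *
 *	for (i = 0; mcr_ranges[i].start; i++)
 *		if (offset >= mcr_ranges[i].start &&
 *		    offset <= mcr_ranges[i].end)
 *			return true;
 *
 *	return false;
 */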
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
index c214111ea367..1e873681795d 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h
@@ -15,6 +15,7 @@ struct i915_wa {
u32 clr;
u32 set;
u32 read;
+ bool masked_reg;
};
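/*
 * Context for the new masked_reg flag: i915 "masked" registers carry a
 * write-enable mask in their top 16 bits, so writes are self-masking
 * (no read-modify-write needed) and the top half does not read back as
 * written. The flag lets the workaround machinery treat such registers
 * specially. Simplified expansion of the macros used with them, based
 * on i915_reg.h (the real definitions add type-safety wrappers):
 *
 *	#define _MASKED_FIELD(mask, value)	(((mask) << 16) | (value))
 *	#define _MASKED_BIT_ENABLE(a)		_MASKED_FIELD((a), (a))
 *	#define _MASKED_BIT_DISABLE(a)		_MASKED_FIELD((a), 0)
 */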
struct i915_wa_list {
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index 32589c6625e1..2c1af030310c 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -235,6 +235,34 @@ static void mock_submit_request(struct i915_request *request)
spin_unlock_irqrestore(&engine->hw_lock, flags);
}
+static void mock_add_to_engine(struct i915_request *rq)
+{
+ lockdep_assert_held(&rq->engine->sched_engine->lock);
+ list_move_tail(&rq->sched.link, &rq->engine->sched_engine->requests);
+}
+
+static void mock_remove_from_engine(struct i915_request *rq)
+{
+ struct intel_engine_cs *engine, *locked;
+
+ /*
+ * Virtual engines complicate acquiring the engine timeline lock,
+ * as their rq->engine pointer is not stable until under that
+ * engine lock. The simple ploy we use is to take the lock then
+ * check that the rq still belongs to the newly locked engine.
+ */
+
+ locked = READ_ONCE(rq->engine);
+ spin_lock_irq(&locked->sched_engine->lock);
+ while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
+ spin_unlock(&locked->sched_engine->lock);
+ spin_lock(&engine->sched_engine->lock);
+ locked = engine;
+ }
+ list_del_init(&rq->sched.link);
+ spin_unlock_irq(&locked->sched_engine->lock);
+}
+
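/*
 * The lock-then-recheck loop above is the standard pattern for fields
 * that may change until their owner's lock is held: lock the engine
 * you believe owns the request, re-read rq->engine under that lock,
 * and chase it if it moved. Once the two agree, rq->engine is stable
 * for the rest of the critical section.
 */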
static void mock_reset_prepare(struct intel_engine_cs *engine)
{
}
@@ -253,10 +281,10 @@ static void mock_reset_cancel(struct intel_engine_cs *engine)
del_timer_sync(&mock->hw_delay);
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&engine->sched_engine->lock, flags);
/* Mark all submitted requests as skipped. */
- list_for_each_entry(rq, &engine->active.requests, sched.link)
+ list_for_each_entry(rq, &engine->sched_engine->requests, sched.link)
i915_request_put(i915_request_mark_eio(rq));
intel_engine_signal_breadcrumbs(engine);
@@ -269,7 +297,7 @@ static void mock_reset_cancel(struct intel_engine_cs *engine)
}
INIT_LIST_HEAD(&mock->hw_queue);
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
}
static void mock_reset_finish(struct intel_engine_cs *engine)
@@ -283,7 +311,8 @@ static void mock_engine_release(struct intel_engine_cs *engine)
GEM_BUG_ON(timer_pending(&mock->hw_delay));
- intel_breadcrumbs_free(engine->breadcrumbs);
+ i915_sched_engine_put(engine->sched_engine);
+ intel_breadcrumbs_put(engine->breadcrumbs);
intel_context_unpin(engine->kernel_context);
intel_context_put(engine->kernel_context);
@@ -320,6 +349,8 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
engine->base.emit_flush = mock_emit_flush;
engine->base.emit_fini_breadcrumb = mock_emit_breadcrumb;
engine->base.submit_request = mock_submit_request;
+ engine->base.add_active_request = mock_add_to_engine;
+ engine->base.remove_active_request = mock_remove_from_engine;
engine->base.reset.prepare = mock_reset_prepare;
engine->base.reset.rewind = mock_reset_rewind;
@@ -345,14 +376,18 @@ int mock_engine_init(struct intel_engine_cs *engine)
{
struct intel_context *ce;
- intel_engine_init_active(engine, ENGINE_MOCK);
+ engine->sched_engine = i915_sched_engine_create(ENGINE_MOCK);
+ if (!engine->sched_engine)
+ return -ENOMEM;
+ engine->sched_engine->private_data = engine;
+
intel_engine_init_execlists(engine);
intel_engine_init__pm(engine);
intel_engine_init_retire(engine);
engine->breadcrumbs = intel_breadcrumbs_create(NULL);
if (!engine->breadcrumbs)
- return -ENOMEM;
+ goto err_schedule;
ce = create_kernel_context(engine);
if (IS_ERR(ce))
@@ -365,7 +400,9 @@ int mock_engine_init(struct intel_engine_cs *engine)
return 0;
err_breadcrumbs:
- intel_breadcrumbs_free(engine->breadcrumbs);
+ intel_breadcrumbs_put(engine->breadcrumbs);
+err_schedule:
+ i915_sched_engine_put(engine->sched_engine);
return -ENOMEM;
}
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index 26685b927169..fa7b99a671dd 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -209,7 +209,13 @@ static int __live_active_context(struct intel_engine_cs *engine)
* This test makes sure that the context is kept alive until a
* subsequent idle-barrier (emitted when the engine wakeref hits 0
* with no more outstanding requests).
+ *
+ * In GuC submission mode we don't use idle barriers and we instead
+ * get a message from the GuC to signal that it is safe to unpin the
+ * context from memory.
*/
+ if (intel_engine_uses_guc(engine))
+ return 0;
if (intel_engine_pm_is_awake(engine)) {
pr_err("%s is awake before starting %s!\n",
@@ -357,7 +363,11 @@ static int __live_remote_context(struct intel_engine_cs *engine)
* on the context image remotely (intel_context_prepare_remote_request),
* which inserts foreign fences into intel_context.active, does not
* clobber the idle-barrier.
+ *
+ * In GuC submission mode we don't use idle barriers.
*/
+ if (intel_engine_uses_guc(engine))
+ return 0;
if (intel_engine_pm_is_awake(engine)) {
pr_err("%s is awake before starting %s!\n",
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
index 4896e4ccad50..317eebf086c3 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
@@ -405,3 +405,25 @@ void st_engine_heartbeat_enable(struct intel_engine_cs *engine)
engine->props.heartbeat_interval_ms =
engine->defaults.heartbeat_interval_ms;
}
+
+void st_engine_heartbeat_disable_no_pm(struct intel_engine_cs *engine)
+{
+ engine->props.heartbeat_interval_ms = 0;
+
+ /*
+ * Park the heartbeat but without holding the PM lock as that
+ * makes the engines appear not-idle. Note that if/when unpark
+ * is called due to the PM lock being acquired later the
+ * heartbeat still won't be enabled because of the above = 0.
+ */
+ if (intel_engine_pm_get_if_awake(engine)) {
+ intel_engine_park_heartbeat(engine);
+ intel_engine_pm_put(engine);
+ }
+}
+
+void st_engine_heartbeat_enable_no_pm(struct intel_engine_cs *engine)
+{
+ engine->props.heartbeat_interval_ms =
+ engine->defaults.heartbeat_interval_ms;
+}
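/*
 * Expected pairing for the _no_pm variants above (a sketch; the
 * __igt_reset_engines() changes later in this patch use exactly this
 * bracket): tests that must not perturb engine PM wrap their body as
 *
 *	st_engine_heartbeat_disable_no_pm(engine);
 *	... test body that relies on the engine appearing idle ...
 *	st_engine_heartbeat_enable_no_pm(engine);
 *
 * mirroring the existing disable/enable pair but without holding a PM
 * reference for the duration.
 */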
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.h b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.h
index cd27113d5400..81da2cd8e406 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.h
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.h
@@ -9,6 +9,8 @@
struct intel_engine_cs;
void st_engine_heartbeat_disable(struct intel_engine_cs *engine);
+void st_engine_heartbeat_disable_no_pm(struct intel_engine_cs *engine);
void st_engine_heartbeat_enable(struct intel_engine_cs *engine);
+void st_engine_heartbeat_enable_no_pm(struct intel_engine_cs *engine);
#endif /* SELFTEST_ENGINE_HEARTBEAT_H */
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
index 72cca3f0da21..75569666105d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
@@ -173,8 +173,8 @@ static int __live_engine_timestamps(struct intel_engine_cs *engine)
d_ctx = trifilter(s_ctx);
d_ctx *= engine->gt->clock_frequency;
- if (IS_ICELAKE(engine->i915))
- d_ring *= 12500000; /* Fixed 80ns for icl ctx timestamp? */
+ if (GRAPHICS_VER(engine->i915) == 11)
+ d_ring *= 12500000; /* Fixed 80ns for GEN11 ctx timestamp? */
else
d_ring *= engine->gt->clock_frequency;
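/*
 * Arithmetic behind the constant above: a fixed 80 ns tick means
 * 1 s / 80 ns = 12,500,000 ticks per second (a 12.5 MHz timestamp),
 * hence the 12500000 multiplier used in place of the GT clock
 * frequency.
 */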
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 1c8108d30b85..f12ffe797639 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -43,7 +43,7 @@ static int wait_for_submit(struct intel_engine_cs *engine,
unsigned long timeout)
{
/* Ignore our own attempts to suppress excess tasklets */
- tasklet_hi_schedule(&engine->execlists.tasklet);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
timeout += jiffies;
do {
@@ -273,7 +273,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
};
/* Alternatively preempt the spinner with ce[1] */
- engine->schedule(rq[1], &attr);
+ engine->sched_engine->schedule(rq[1], &attr);
}
/* And switch back to ce[0] for good measure */
@@ -553,13 +553,13 @@ static int live_pin_rewind(void *arg)
static int engine_lock_reset_tasklet(struct intel_engine_cs *engine)
{
- tasklet_disable(&engine->execlists.tasklet);
+ tasklet_disable(&engine->sched_engine->tasklet);
local_bh_disable();
if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
&engine->gt->reset.flags)) {
local_bh_enable();
- tasklet_enable(&engine->execlists.tasklet);
+ tasklet_enable(&engine->sched_engine->tasklet);
intel_gt_set_wedged(engine->gt);
return -EBUSY;
@@ -574,7 +574,7 @@ static void engine_unlock_reset_tasklet(struct intel_engine_cs *engine)
&engine->gt->reset.flags);
local_bh_enable();
- tasklet_enable(&engine->execlists.tasklet);
+ tasklet_enable(&engine->sched_engine->tasklet);
}
static int live_hold_reset(void *arg)
@@ -628,7 +628,7 @@ static int live_hold_reset(void *arg)
if (err)
goto out;
- engine->execlists.tasklet.callback(&engine->execlists.tasklet);
+ engine->sched_engine->tasklet.callback(&engine->sched_engine->tasklet);
GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
i915_request_get(rq);
@@ -917,7 +917,7 @@ release_queue(struct intel_engine_cs *engine,
i915_request_add(rq);
local_bh_disable();
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
local_bh_enable(); /* kick tasklet */
i915_request_put(rq);
@@ -1200,7 +1200,7 @@ static int live_timeslice_rewind(void *arg)
while (i915_request_is_active(rq[A2])) { /* semaphore yield! */
/* Wait for the timeslice to kick in */
del_timer(&engine->execlists.timer);
- tasklet_hi_schedule(&engine->execlists.tasklet);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
intel_engine_flush_submission(engine);
}
/* -> ELSP[] = { { A:rq1 }, { B:rq1 } } */
@@ -1342,7 +1342,7 @@ static int live_timeslice_queue(void *arg)
err = PTR_ERR(rq);
goto err_heartbeat;
}
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
err = wait_for_submit(engine, rq, HZ / 2);
if (err) {
pr_err("%s: Timed out trying to submit semaphores\n",
@@ -1539,12 +1539,12 @@ static int live_busywait_preempt(void *arg)
* preempt the busywaits used to synchronise between rings.
*/
- ctx_hi = kernel_context(gt->i915);
+ ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
return -ENOMEM;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
- ctx_lo = kernel_context(gt->i915);
+ ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -1741,12 +1741,12 @@ static int live_preempt(void *arg)
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
- ctx_hi = kernel_context(gt->i915);
+ ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
- ctx_lo = kernel_context(gt->i915);
+ ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -1833,11 +1833,11 @@ static int live_late_preempt(void *arg)
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
- ctx_hi = kernel_context(gt->i915);
+ ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
- ctx_lo = kernel_context(gt->i915);
+ ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
@@ -1884,7 +1884,7 @@ static int live_late_preempt(void *arg)
}
attr.priority = I915_PRIORITY_MAX;
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
if (!igt_wait_for_spinner(&spin_hi, rq)) {
pr_err("High priority context failed to preempt the low priority context\n");
@@ -1927,7 +1927,7 @@ struct preempt_client {
static int preempt_client_init(struct intel_gt *gt, struct preempt_client *c)
{
- c->ctx = kernel_context(gt->i915);
+ c->ctx = kernel_context(gt->i915, NULL);
if (!c->ctx)
return -ENOMEM;
@@ -2497,7 +2497,7 @@ static int live_suppress_self_preempt(void *arg)
i915_request_add(rq_b);
GEM_BUG_ON(i915_request_completed(rq_a));
- engine->schedule(rq_a, &attr);
+ engine->sched_engine->schedule(rq_a, &attr);
igt_spinner_end(&a.spin);
if (!igt_wait_for_spinner(&b.spin, rq_b)) {
@@ -2629,7 +2629,7 @@ static int live_chain_preempt(void *arg)
i915_request_get(rq);
i915_request_add(rq);
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
igt_spinner_end(&hi.spin);
if (i915_request_wait(rq, 0, HZ / 5) < 0) {
@@ -2810,7 +2810,7 @@ static int __live_preempt_ring(struct intel_engine_cs *engine,
goto err_ce;
}
- tmp->ring = __intel_context_ring_size(ring_sz);
+ tmp->ring_size = ring_sz;
err = intel_context_pin(tmp);
if (err) {
@@ -2988,7 +2988,7 @@ static int live_preempt_gang(void *arg)
break;
/* Submit each spinner at increasing priority */
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
} while (prio <= I915_PRIORITY_MAX &&
!__igt_timeout(end_time, NULL));
pr_debug("%s: Preempt chain of %d requests\n",
@@ -3236,7 +3236,7 @@ static int preempt_user(struct intel_engine_cs *engine,
i915_request_get(rq);
i915_request_add(rq);
- engine->schedule(rq, &attr);
+ engine->sched_engine->schedule(rq, &attr);
if (i915_request_wait(rq, 0, HZ / 2) < 0)
err = -ETIME;
@@ -3384,12 +3384,12 @@ static int live_preempt_timeout(void *arg)
if (igt_spinner_init(&spin_lo, gt))
return -ENOMEM;
- ctx_hi = kernel_context(gt->i915);
+ ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
- ctx_lo = kernel_context(gt->i915);
+ ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
#define BATCH BIT(0)
{
struct task_struct *tsk[I915_NUM_ENGINES] = {};
- struct preempt_smoke arg[I915_NUM_ENGINES];
+ struct preempt_smoke *arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
unsigned long count;
int err = 0;
+ arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL);
+ if (!arg)
+ return -ENOMEM;
+
for_each_engine(engine, smoke->gt, id) {
arg[id] = *smoke;
arg[id].engine = engine;
@@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
arg[id].batch = NULL;
arg[id].count = 0;
- tsk[id] = kthread_run(smoke_crescendo_thread, &arg,
+ tsk[id] = kthread_run(smoke_crescendo_thread, &arg[id],
"igt/smoke:%d", id);
if (IS_ERR(tsk[id])) {
err = PTR_ERR(tsk[id]);
@@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n",
count, flags, smoke->gt->info.num_engines, smoke->ncontext);
+
+ kfree(arg);
return 0;
}
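/*
 * Presumably the motivation for the conversion above: 'struct
 * preempt_smoke' is large enough that an I915_NUM_ENGINES-element
 * array on the stack risks tripping the kernel's frame-size warning,
 * so the per-engine argument block now lives on the heap and is freed
 * once the worker threads have been reaped.
 */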
@@ -3676,7 +3682,7 @@ static int live_preempt_smoke(void *arg)
}
for (n = 0; n < smoke.ncontext; n++) {
- smoke.contexts[n] = kernel_context(smoke.gt->i915);
+ smoke.contexts[n] = kernel_context(smoke.gt->i915, NULL);
if (!smoke.contexts[n])
goto err_ctx;
}
@@ -3727,7 +3733,7 @@ static int nop_virtual_engine(struct intel_gt *gt,
GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ve));
for (n = 0; n < nctx; n++) {
- ve[n] = intel_execlists_create_virtual(siblings, nsibling);
+ ve[n] = intel_engine_create_virtual(siblings, nsibling);
if (IS_ERR(ve[n])) {
err = PTR_ERR(ve[n]);
nctx = n;
@@ -3923,7 +3929,7 @@ static int mask_virtual_engine(struct intel_gt *gt,
* restrict it to our desired engine within the virtual engine.
*/
- ve = intel_execlists_create_virtual(siblings, nsibling);
+ ve = intel_engine_create_virtual(siblings, nsibling);
if (IS_ERR(ve)) {
err = PTR_ERR(ve);
goto out_close;
@@ -4054,7 +4060,7 @@ static int slicein_virtual_engine(struct intel_gt *gt,
i915_request_add(rq);
}
- ce = intel_execlists_create_virtual(siblings, nsibling);
+ ce = intel_engine_create_virtual(siblings, nsibling);
if (IS_ERR(ce)) {
err = PTR_ERR(ce);
goto out;
@@ -4106,7 +4112,7 @@ static int sliceout_virtual_engine(struct intel_gt *gt,
/* XXX We do not handle oversubscription and fairness with normal rq */
for (n = 0; n < nsibling; n++) {
- ce = intel_execlists_create_virtual(siblings, nsibling);
+ ce = intel_engine_create_virtual(siblings, nsibling);
if (IS_ERR(ce)) {
err = PTR_ERR(ce);
goto out;
@@ -4208,7 +4214,7 @@ static int preserved_virtual_engine(struct intel_gt *gt,
if (err)
goto out_scratch;
- ve = intel_execlists_create_virtual(siblings, nsibling);
+ ve = intel_engine_create_virtual(siblings, nsibling);
if (IS_ERR(ve)) {
err = PTR_ERR(ve);
goto out_scratch;
@@ -4328,234 +4334,6 @@ static int live_virtual_preserved(void *arg)
return 0;
}
-static int bond_virtual_engine(struct intel_gt *gt,
- unsigned int class,
- struct intel_engine_cs **siblings,
- unsigned int nsibling,
- unsigned int flags)
-#define BOND_SCHEDULE BIT(0)
-{
- struct intel_engine_cs *master;
- struct i915_request *rq[16];
- enum intel_engine_id id;
- struct igt_spinner spin;
- unsigned long n;
- int err;
-
- /*
- * A set of bonded requests is intended to be run concurrently
- * across a number of engines. We use one request per-engine
- * and a magic fence to schedule each of the bonded requests
- * at the same time. A consequence of our current scheduler is that
- * we only move requests to the HW ready queue when the request
- * becomes ready, that is when all of its prerequisite fences have
- * been signaled. As one of those fences is the master submit fence,
- * there is a delay on all secondary fences as the HW may be
- * currently busy. Equally, as all the requests are independent,
- * they may have other fences that delay individual request
- * submission to HW. Ergo, we do not guarantee that all requests are
- * immediately submitted to HW at the same time, just that if the
- * rules are abided by, they are ready at the same time as the
- * first is submitted. Userspace can embed semaphores in its batch
- * to ensure parallel execution of its phases as it requires.
- * Though naturally it gets requested that perhaps the scheduler should
- * take care of parallel execution, even across preemption events on
- * different HW. (The proper answer is of course "lalalala".)
- *
- * With the submit-fence, we have identified three possible phases
- * of synchronisation depending on the master fence: queued (not
- * ready), executing, and signaled. The first two are quite simple
- * and checked below. However, the signaled master fence handling is
- * contentious. Currently we do not distinguish between a signaled
- * fence and an expired fence, as once signaled it does not convey
- * any information about the previous execution. It may even be freed
- * and hence checking later it may not exist at all. Ergo we currently
- * do not apply the bonding constraint for an already signaled fence,
- * as our expectation is that it should not constrain the secondaries
- * and is outside of the scope of the bonded request API (i.e. all
- * userspace requests are meant to be running in parallel). As
- * it imposes no constraint, and is effectively a no-op, we do not
- * check below as normal execution flows are checked extensively above.
- *
- * XXX Is the degenerate handling of signaled submit fences the
- * expected behaviour for userspace?
- */
-
- GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
-
- if (igt_spinner_init(&spin, gt))
- return -ENOMEM;
-
- err = 0;
- rq[0] = ERR_PTR(-ENOMEM);
- for_each_engine(master, gt, id) {
- struct i915_sw_fence fence = {};
- struct intel_context *ce;
-
- if (master->class == class)
- continue;
-
- ce = intel_context_create(master);
- if (IS_ERR(ce)) {
- err = PTR_ERR(ce);
- goto out;
- }
-
- memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
-
- rq[0] = igt_spinner_create_request(&spin, ce, MI_NOOP);
- intel_context_put(ce);
- if (IS_ERR(rq[0])) {
- err = PTR_ERR(rq[0]);
- goto out;
- }
- i915_request_get(rq[0]);
-
- if (flags & BOND_SCHEDULE) {
- onstack_fence_init(&fence);
- err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
- &fence,
- GFP_KERNEL);
- }
-
- i915_request_add(rq[0]);
- if (err < 0)
- goto out;
-
- if (!(flags & BOND_SCHEDULE) &&
- !igt_wait_for_spinner(&spin, rq[0])) {
- err = -EIO;
- goto out;
- }
-
- for (n = 0; n < nsibling; n++) {
- struct intel_context *ve;
-
- ve = intel_execlists_create_virtual(siblings, nsibling);
- if (IS_ERR(ve)) {
- err = PTR_ERR(ve);
- onstack_fence_fini(&fence);
- goto out;
- }
-
- err = intel_virtual_engine_attach_bond(ve->engine,
- master,
- siblings[n]);
- if (err) {
- intel_context_put(ve);
- onstack_fence_fini(&fence);
- goto out;
- }
-
- err = intel_context_pin(ve);
- intel_context_put(ve);
- if (err) {
- onstack_fence_fini(&fence);
- goto out;
- }
-
- rq[n + 1] = i915_request_create(ve);
- intel_context_unpin(ve);
- if (IS_ERR(rq[n + 1])) {
- err = PTR_ERR(rq[n + 1]);
- onstack_fence_fini(&fence);
- goto out;
- }
- i915_request_get(rq[n + 1]);
-
- err = i915_request_await_execution(rq[n + 1],
- &rq[0]->fence,
- ve->engine->bond_execute);
- i915_request_add(rq[n + 1]);
- if (err < 0) {
- onstack_fence_fini(&fence);
- goto out;
- }
- }
- onstack_fence_fini(&fence);
- intel_engine_flush_submission(master);
- igt_spinner_end(&spin);
-
- if (i915_request_wait(rq[0], 0, HZ / 10) < 0) {
- pr_err("Master request did not execute (on %s)!\n",
- rq[0]->engine->name);
- err = -EIO;
- goto out;
- }
-
- for (n = 0; n < nsibling; n++) {
- if (i915_request_wait(rq[n + 1], 0,
- MAX_SCHEDULE_TIMEOUT) < 0) {
- err = -EIO;
- goto out;
- }
-
- if (rq[n + 1]->engine != siblings[n]) {
- pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
- siblings[n]->name,
- rq[n + 1]->engine->name,
- rq[0]->engine->name);
- err = -EINVAL;
- goto out;
- }
- }
-
- for (n = 0; !IS_ERR(rq[n]); n++)
- i915_request_put(rq[n]);
- rq[0] = ERR_PTR(-ENOMEM);
- }
-
-out:
- for (n = 0; !IS_ERR(rq[n]); n++)
- i915_request_put(rq[n]);
- if (igt_flush_test(gt->i915))
- err = -EIO;
-
- igt_spinner_fini(&spin);
- return err;
-}
-
-static int live_virtual_bond(void *arg)
-{
- static const struct phase {
- const char *name;
- unsigned int flags;
- } phases[] = {
- { "", 0 },
- { "schedule", BOND_SCHEDULE },
- { },
- };
- struct intel_gt *gt = arg;
- struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
- unsigned int class;
- int err;
-
- if (intel_uc_uses_guc_submission(&gt->uc))
- return 0;
-
- for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
- const struct phase *p;
- int nsibling;
-
- nsibling = select_siblings(gt, class, siblings);
- if (nsibling < 2)
- continue;
-
- for (p = phases; p->name; p++) {
- err = bond_virtual_engine(gt,
- class, siblings, nsibling,
- p->flags);
- if (err) {
- pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
- __func__, p->name, class, nsibling, err);
- return err;
- }
- }
- }
-
- return 0;
-}
-
static int reset_virtual_engine(struct intel_gt *gt,
struct intel_engine_cs **siblings,
unsigned int nsibling)
@@ -4576,7 +4354,7 @@ static int reset_virtual_engine(struct intel_gt *gt,
if (igt_spinner_init(&spin, gt))
return -ENOMEM;
- ve = intel_execlists_create_virtual(siblings, nsibling);
+ ve = intel_engine_create_virtual(siblings, nsibling);
if (IS_ERR(ve)) {
err = PTR_ERR(ve);
goto out_spin;
@@ -4606,13 +4384,13 @@ static int reset_virtual_engine(struct intel_gt *gt,
if (err)
goto out_heartbeat;
- engine->execlists.tasklet.callback(&engine->execlists.tasklet);
+ engine->sched_engine->tasklet.callback(&engine->sched_engine->tasklet);
GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
/* Fake a preemption event; failed of course */
- spin_lock_irq(&engine->active.lock);
+ spin_lock_irq(&engine->sched_engine->lock);
__unwind_incomplete_requests(engine);
- spin_unlock_irq(&engine->active.lock);
+ spin_unlock_irq(&engine->sched_engine->lock);
GEM_BUG_ON(rq->engine != engine);
/* Reset the engine while keeping our active request on hold */
@@ -4721,7 +4499,6 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
SUBTEST(live_virtual_mask),
SUBTEST(live_virtual_preserved),
SUBTEST(live_virtual_slice),
- SUBTEST(live_virtual_bond),
SUBTEST(live_virtual_reset),
};
diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
index 853246fad05f..2c1ed32ca5ac 100644
--- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
+++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c
@@ -17,6 +17,8 @@
#include "selftests/igt_flush_test.h"
#include "selftests/igt_reset.h"
#include "selftests/igt_atomic.h"
+#include "selftests/igt_spinner.h"
+#include "selftests/intel_scheduler_helpers.h"
#include "selftests/mock_drm.h"
@@ -42,7 +44,7 @@ static int hang_init(struct hang *h, struct intel_gt *gt)
memset(h, 0, sizeof(*h));
h->gt = gt;
- h->ctx = kernel_context(gt->i915);
+ h->ctx = kernel_context(gt->i915, NULL);
if (IS_ERR(h->ctx))
return PTR_ERR(h->ctx);
@@ -378,6 +380,7 @@ static int igt_reset_nop(void *arg)
ce = intel_context_create(engine);
if (IS_ERR(ce)) {
err = PTR_ERR(ce);
+ pr_err("[%s] Create context failed: %d!\n", engine->name, err);
break;
}
@@ -387,6 +390,8 @@ static int igt_reset_nop(void *arg)
rq = intel_context_create_request(ce);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
+ pr_err("[%s] Create request failed: %d!\n",
+ engine->name, err);
break;
}
@@ -401,24 +406,31 @@ static int igt_reset_nop(void *arg)
igt_global_reset_unlock(gt);
if (intel_gt_is_wedged(gt)) {
+ pr_err("[%s] GT is wedged!\n", engine->name);
err = -EIO;
break;
}
if (i915_reset_count(global) != reset_count + ++count) {
- pr_err("Full GPU reset not recorded!\n");
+ pr_err("[%s] Reset not recorded: %d vs %d + %d!\n",
+ engine->name, i915_reset_count(global), reset_count, count);
err = -EINVAL;
break;
}
err = igt_flush_test(gt->i915);
- if (err)
+ if (err) {
+ pr_err("[%s] Flush failed: %d!\n", engine->name, err);
break;
+ }
} while (time_before(jiffies, end_time));
pr_info("%s: %d resets\n", __func__, count);
- if (igt_flush_test(gt->i915))
+ if (igt_flush_test(gt->i915)) {
+ pr_err("Post flush failed: %d!\n", err);
err = -EIO;
+ }
+
return err;
}
@@ -440,9 +452,19 @@ static int igt_reset_nop_engine(void *arg)
IGT_TIMEOUT(end_time);
int err;
+ if (intel_engine_uses_guc(engine)) {
+ /* Engine level resets are triggered by GuC when a hang
+ * is detected. They can't be triggered by the KMD any
+ * more. Thus a nop batch cannot be used as a reset test
+ */
+ continue;
+ }
+
ce = intel_context_create(engine);
- if (IS_ERR(ce))
+ if (IS_ERR(ce)) {
+ pr_err("[%s] Create context failed: %pe!\n", engine->name, ce);
return PTR_ERR(ce);
+ }
reset_count = i915_reset_count(global);
reset_engine_count = i915_reset_engine_count(global, engine);
@@ -549,9 +571,15 @@ static int igt_reset_fail_engine(void *arg)
IGT_TIMEOUT(end_time);
int err;
+ /* Can't manually break the reset if i915 doesn't perform it */
+ if (intel_engine_uses_guc(engine))
+ continue;
+
ce = intel_context_create(engine);
- if (IS_ERR(ce))
+ if (IS_ERR(ce)) {
+ pr_err("[%s] Create context failed: %pe!\n", engine->name, ce);
return PTR_ERR(ce);
+ }
st_engine_heartbeat_disable(engine);
set_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
@@ -686,8 +714,12 @@ static int __igt_reset_engine(struct intel_gt *gt, bool active)
for_each_engine(engine, gt, id) {
unsigned int reset_count, reset_engine_count;
unsigned long count;
+ bool using_guc = intel_engine_uses_guc(engine);
IGT_TIMEOUT(end_time);
+ if (using_guc && !active)
+ continue;
+
if (active && !intel_engine_can_store_dword(engine))
continue;
@@ -705,13 +737,24 @@ static int __igt_reset_engine(struct intel_gt *gt, bool active)
set_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
count = 0;
do {
- if (active) {
- struct i915_request *rq;
+ struct i915_request *rq = NULL;
+ struct intel_selftest_saved_policy saved;
+ int err2;
+
+ err = intel_selftest_modify_policy(engine, &saved,
+ SELFTEST_SCHEDULER_MODIFY_FAST_RESET);
+ if (err) {
+ pr_err("[%s] Modify policy failed: %d!\n", engine->name, err);
+ break;
+ }
+ if (active) {
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
- break;
+ pr_err("[%s] Create hang request failed: %d!\n",
+ engine->name, err);
+ goto restore;
}
i915_request_get(rq);
@@ -727,34 +770,59 @@ static int __igt_reset_engine(struct intel_gt *gt, bool active)
i915_request_put(rq);
err = -EIO;
- break;
+ goto restore;
}
+ }
- i915_request_put(rq);
+ if (!using_guc) {
+ err = intel_engine_reset(engine, NULL);
+ if (err) {
+ pr_err("intel_engine_reset(%s) failed, err:%d\n",
+ engine->name, err);
+ goto skip;
+ }
}
- err = intel_engine_reset(engine, NULL);
- if (err) {
- pr_err("intel_engine_reset(%s) failed, err:%d\n",
- engine->name, err);
- break;
+ if (rq) {
+ /* Ensure the reset happens and kills the engine */
+ err = intel_selftest_wait_for_rq(rq);
+ if (err)
+ pr_err("[%s] Wait for request %lld:%lld [0x%04X] failed: %d!\n",
+ engine->name, rq->fence.context,
+ rq->fence.seqno, rq->context->guc_id, err);
}
+skip:
+ if (rq)
+ i915_request_put(rq);
+
if (i915_reset_count(global) != reset_count) {
pr_err("Full GPU reset recorded! (engine reset expected)\n");
err = -EINVAL;
- break;
+ goto restore;
}
- if (i915_reset_engine_count(global, engine) !=
- ++reset_engine_count) {
- pr_err("%s engine reset not recorded!\n",
- engine->name);
- err = -EINVAL;
- break;
+ /* GuC based resets are not logged per engine */
+ if (!using_guc) {
+ if (i915_reset_engine_count(global, engine) !=
+ ++reset_engine_count) {
+ pr_err("%s engine reset not recorded!\n",
+ engine->name);
+ err = -EINVAL;
+ goto restore;
+ }
}
count++;
+
+restore:
+ err2 = intel_selftest_restore_policy(engine, &saved);
+ if (err2)
+ pr_err("[%s] Restore policy failed: %d!\n", engine->name, err);
+ if (err == 0)
+ err = err2;
+ if (err)
+ break;
} while (time_before(jiffies, end_time));
clear_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
st_engine_heartbeat_enable(engine);
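/*
 * Shape of the scheduler-policy bracket woven into the loop above,
 * shown in one piece for clarity. run_reset_test() is a hypothetical
 * stand-in for the hang/reset body; the helpers are the ones this
 * series provides in selftests/intel_scheduler_helpers.h:
 *
 *	struct intel_selftest_saved_policy saved;
 *	int err, err2;
 *
 *	err = intel_selftest_modify_policy(engine, &saved,
 *			SELFTEST_SCHEDULER_MODIFY_FAST_RESET);
 *	if (err)
 *		return err;
 *
 *	err = run_reset_test(engine);
 *
 *	err2 = intel_selftest_restore_policy(engine, &saved);
 *	if (err == 0)
 *		err = err2;	(keep the first error)
 */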
@@ -765,12 +833,16 @@ static int __igt_reset_engine(struct intel_gt *gt, bool active)
break;
err = igt_flush_test(gt->i915);
- if (err)
+ if (err) {
+ pr_err("[%s] Flush failed: %d!\n", engine->name, err);
break;
+ }
}
- if (intel_gt_is_wedged(gt))
+ if (intel_gt_is_wedged(gt)) {
+ pr_err("GT is wedged!\n");
err = -EIO;
+ }
if (active)
hang_fini(&h);
@@ -807,7 +879,7 @@ static int active_request_put(struct i915_request *rq)
if (!rq)
return 0;
- if (i915_request_wait(rq, 0, 5 * HZ) < 0) {
+ if (i915_request_wait(rq, 0, 10 * HZ) < 0) {
GEM_TRACE("%s timed out waiting for completion of fence %llx:%lld\n",
rq->engine->name,
rq->fence.context,
@@ -837,6 +909,7 @@ static int active_engine(void *data)
ce[count] = intel_context_create(engine);
if (IS_ERR(ce[count])) {
err = PTR_ERR(ce[count]);
+ pr_err("[%s] Create context #%ld failed: %d!\n", engine->name, count, err);
while (--count)
intel_context_put(ce[count]);
return err;
@@ -852,23 +925,26 @@ static int active_engine(void *data)
new = intel_context_create_request(ce[idx]);
if (IS_ERR(new)) {
err = PTR_ERR(new);
+ pr_err("[%s] Create request #%d failed: %d!\n", engine->name, idx, err);
break;
}
rq[idx] = i915_request_get(new);
i915_request_add(new);
- if (engine->schedule && arg->flags & TEST_PRIORITY) {
+ if (engine->sched_engine->schedule && arg->flags & TEST_PRIORITY) {
struct i915_sched_attr attr = {
.priority =
i915_prandom_u32_max_state(512, &prng),
};
- engine->schedule(rq[idx], &attr);
+ engine->sched_engine->schedule(rq[idx], &attr);
}
err = active_request_put(old);
- if (err)
+ if (err) {
+ pr_err("[%s] Request put failed: %d!\n", engine->name, err);
break;
+ }
cond_resched();
}
@@ -876,6 +952,9 @@ static int active_engine(void *data)
for (count = 0; count < ARRAY_SIZE(rq); count++) {
int err__ = active_request_put(rq[count]);
+ if (err__)
+ pr_err("[%s] Request put #%ld failed: %d!\n", engine->name, count, err__);
+
/* Keep the first error */
if (!err)
err = err__;
@@ -916,10 +995,13 @@ static int __igt_reset_engines(struct intel_gt *gt,
struct active_engine threads[I915_NUM_ENGINES] = {};
unsigned long device = i915_reset_count(global);
unsigned long count = 0, reported;
+ bool using_guc = intel_engine_uses_guc(engine);
IGT_TIMEOUT(end_time);
- if (flags & TEST_ACTIVE &&
- !intel_engine_can_store_dword(engine))
+ if (flags & TEST_ACTIVE) {
+ if (!intel_engine_can_store_dword(engine))
+ continue;
+ } else if (using_guc)
continue;
if (!wait_for_idle(engine)) {
@@ -949,6 +1031,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
"igt/%s", other->name);
if (IS_ERR(tsk)) {
err = PTR_ERR(tsk);
+ pr_err("[%s] Thread spawn failed: %d!\n", engine->name, err);
goto unwind;
}
@@ -958,16 +1041,27 @@ static int __igt_reset_engines(struct intel_gt *gt,
yield(); /* start all threads before we begin */
- st_engine_heartbeat_disable(engine);
+ st_engine_heartbeat_disable_no_pm(engine);
set_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
do {
struct i915_request *rq = NULL;
+ struct intel_selftest_saved_policy saved;
+ int err2;
+
+ err = intel_selftest_modify_policy(engine, &saved,
+ SELFTEST_SCHEDULER_MODIFY_FAST_RESET);
+ if (err) {
+ pr_err("[%s] Modify policy failed: %d!\n", engine->name, err);
+ break;
+ }
if (flags & TEST_ACTIVE) {
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
- break;
+ pr_err("[%s] Create hang request failed: %d!\n",
+ engine->name, err);
+ goto restore;
}
i915_request_get(rq);
@@ -983,32 +1077,44 @@ static int __igt_reset_engines(struct intel_gt *gt,
i915_request_put(rq);
err = -EIO;
- break;
+ goto restore;
}
+ } else {
+ intel_engine_pm_get(engine);
}
- err = intel_engine_reset(engine, NULL);
- if (err) {
- pr_err("i915_reset_engine(%s:%s): failed, err=%d\n",
- engine->name, test_name, err);
- break;
+ if (!using_guc) {
+ err = intel_engine_reset(engine, NULL);
+ if (err) {
+ pr_err("i915_reset_engine(%s:%s): failed, err=%d\n",
+ engine->name, test_name, err);
+ goto restore;
+ }
+ }
+
+ if (rq) {
+ /* Ensure the reset happens and kills the engine */
+ err = intel_selftest_wait_for_rq(rq);
+ if (err)
+ pr_err("[%s] Wait for request %lld:%lld [0x%04X] failed: %d!\n",
+ engine->name, rq->fence.context,
+ rq->fence.seqno, rq->context->guc_id, err);
}
count++;
if (rq) {
if (rq->fence.error != -EIO) {
- pr_err("i915_reset_engine(%s:%s):"
- " failed to reset request %llx:%lld\n",
+ pr_err("i915_reset_engine(%s:%s): failed to reset request %lld:%lld [0x%04X]\n",
engine->name, test_name,
rq->fence.context,
- rq->fence.seqno);
+ rq->fence.seqno, rq->context->guc_id);
i915_request_put(rq);
GEM_TRACE_DUMP();
intel_gt_set_wedged(gt);
err = -EIO;
- break;
+ goto restore;
}
if (i915_request_wait(rq, 0, HZ / 5) < 0) {
@@ -1027,12 +1133,15 @@ static int __igt_reset_engines(struct intel_gt *gt,
GEM_TRACE_DUMP();
intel_gt_set_wedged(gt);
err = -EIO;
- break;
+ goto restore;
}
i915_request_put(rq);
}
+ if (!(flags & TEST_ACTIVE))
+ intel_engine_pm_put(engine);
+
if (!(flags & TEST_SELF) && !wait_for_idle(engine)) {
struct drm_printer p =
drm_info_printer(gt->i915->drm.dev);
@@ -1044,22 +1153,34 @@ static int __igt_reset_engines(struct intel_gt *gt,
"%s\n", engine->name);
err = -EIO;
- break;
+ goto restore;
}
+
+restore:
+ err2 = intel_selftest_restore_policy(engine, &saved);
+ if (err2)
+ pr_err("[%s] Restore policy failed: %d!\n", engine->name, err2);
+ if (err == 0)
+ err = err2;
+ if (err)
+ break;
} while (time_before(jiffies, end_time));
clear_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
- st_engine_heartbeat_enable(engine);
+ st_engine_heartbeat_enable_no_pm(engine);
pr_info("i915_reset_engine(%s:%s): %lu resets\n",
engine->name, test_name, count);
- reported = i915_reset_engine_count(global, engine);
- reported -= threads[engine->id].resets;
- if (reported != count) {
- pr_err("i915_reset_engine(%s:%s): reset %lu times, but reported %lu\n",
- engine->name, test_name, count, reported);
- if (!err)
- err = -EINVAL;
+ /* GuC based resets are not logged per engine */
+ if (!using_guc) {
+ reported = i915_reset_engine_count(global, engine);
+ reported -= threads[engine->id].resets;
+ if (reported != count) {
+ pr_err("i915_reset_engine(%s:%s): reset %lu times, but reported %lu\n",
+ engine->name, test_name, count, reported);
+ if (!err)
+ err = -EINVAL;
+ }
}
unwind:
@@ -1078,15 +1199,18 @@ unwind:
}
put_task_struct(threads[tmp].task);
- if (other->uabi_class != engine->uabi_class &&
- threads[tmp].resets !=
- i915_reset_engine_count(global, other)) {
- pr_err("Innocent engine %s was reset (count=%ld)\n",
- other->name,
- i915_reset_engine_count(global, other) -
- threads[tmp].resets);
- if (!err)
- err = -EINVAL;
+ /* GuC based resets are not logged per engine */
+ if (!using_guc) {
+ if (other->uabi_class != engine->uabi_class &&
+ threads[tmp].resets !=
+ i915_reset_engine_count(global, other)) {
+ pr_err("Innocent engine %s was reset (count=%ld)\n",
+ other->name,
+ i915_reset_engine_count(global, other) -
+ threads[tmp].resets);
+ if (!err)
+ err = -EINVAL;
+ }
}
}
@@ -1101,8 +1225,10 @@ unwind:
break;
err = igt_flush_test(gt->i915);
- if (err)
+ if (err) {
+ pr_err("[%s] Flush failed: %d!\n", engine->name, err);
break;
+ }
}
if (intel_gt_is_wedged(gt))
@@ -1180,12 +1306,15 @@ static int igt_reset_wait(void *arg)
igt_global_reset_lock(gt);
err = hang_init(&h, gt);
- if (err)
+ if (err) {
+ pr_err("[%s] Hang init failed: %d!\n", engine->name, err);
goto unlock;
+ }
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
+ pr_err("[%s] Create hang request failed: %d!\n", engine->name, err);
goto fini;
}
@@ -1310,12 +1439,15 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
/* Check that we can recover an unbind stuck on a hanging request */
err = hang_init(&h, gt);
- if (err)
+ if (err) {
+ pr_err("[%s] Hang init failed: %d!\n", engine->name, err);
return err;
+ }
obj = i915_gem_object_create_internal(gt->i915, SZ_1M);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
+ pr_err("[%s] Create object failed: %d!\n", engine->name, err);
goto fini;
}
@@ -1330,12 +1462,14 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
arg.vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(arg.vma)) {
err = PTR_ERR(arg.vma);
+ pr_err("[%s] VMA instance failed: %d!\n", engine->name, err);
goto out_obj;
}
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
+ pr_err("[%s] Create hang request failed: %d!\n", engine->name, err);
goto out_obj;
}
@@ -1347,6 +1481,7 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
err = i915_vma_pin(arg.vma, 0, 0, pin_flags);
if (err) {
i915_request_add(rq);
+ pr_err("[%s] VMA pin failed: %d!\n", engine->name, err);
goto out_obj;
}
@@ -1363,8 +1498,14 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
i915_vma_lock(arg.vma);
err = i915_request_await_object(rq, arg.vma->obj,
flags & EXEC_OBJECT_WRITE);
- if (err == 0)
+ if (err == 0) {
err = i915_vma_move_to_active(arg.vma, rq, flags);
+ if (err)
+ pr_err("[%s] Move to active failed: %d!\n", engine->name, err);
+ } else {
+ pr_err("[%s] Request await failed: %d!\n", engine->name, err);
+ }
+
i915_vma_unlock(arg.vma);
if (flags & EXEC_OBJECT_NEEDS_FENCE)
@@ -1392,6 +1533,7 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
tsk = kthread_run(fn, &arg, "igt/evict_vma");
if (IS_ERR(tsk)) {
err = PTR_ERR(tsk);
+ pr_err("[%s] Thread spawn failed: %d!\n", engine->name, err);
tsk = NULL;
goto out_reset;
}
@@ -1508,17 +1650,29 @@ static int igt_reset_queue(void *arg)
goto unlock;
for_each_engine(engine, gt, id) {
+ struct intel_selftest_saved_policy saved;
struct i915_request *prev;
IGT_TIMEOUT(end_time);
unsigned int count;
+ bool using_guc = intel_engine_uses_guc(engine);
if (!intel_engine_can_store_dword(engine))
continue;
+ if (using_guc) {
+ err = intel_selftest_modify_policy(engine, &saved,
+ SELFTEST_SCHEDULER_MODIFY_NO_HANGCHECK);
+ if (err) {
+ pr_err("[%s] Modify policy failed: %d!\n", engine->name, err);
+ goto fini;
+ }
+ }
+
prev = hang_create_request(&h, engine);
if (IS_ERR(prev)) {
err = PTR_ERR(prev);
- goto fini;
+ pr_err("[%s] Create 'prev' hang request failed: %d!\n", engine->name, err);
+ goto restore;
}
i915_request_get(prev);
@@ -1532,7 +1686,8 @@ static int igt_reset_queue(void *arg)
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
- goto fini;
+ pr_err("[%s] Create hang request failed: %d!\n", engine->name, err);
+ goto restore;
}
i915_request_get(rq);
@@ -1557,7 +1712,7 @@ static int igt_reset_queue(void *arg)
GEM_TRACE_DUMP();
intel_gt_set_wedged(gt);
- goto fini;
+ goto restore;
}
if (!wait_until_running(&h, prev)) {
@@ -1575,7 +1730,7 @@ static int igt_reset_queue(void *arg)
intel_gt_set_wedged(gt);
err = -EIO;
- goto fini;
+ goto restore;
}
reset_count = fake_hangcheck(gt, BIT(id));
@@ -1586,7 +1741,7 @@ static int igt_reset_queue(void *arg)
i915_request_put(rq);
i915_request_put(prev);
err = -EINVAL;
- goto fini;
+ goto restore;
}
if (rq->fence.error) {
@@ -1595,7 +1750,7 @@ static int igt_reset_queue(void *arg)
i915_request_put(rq);
i915_request_put(prev);
err = -EINVAL;
- goto fini;
+ goto restore;
}
if (i915_reset_count(global) == reset_count) {
@@ -1603,7 +1758,7 @@ static int igt_reset_queue(void *arg)
i915_request_put(rq);
i915_request_put(prev);
err = -EINVAL;
- goto fini;
+ goto restore;
}
i915_request_put(prev);
@@ -1618,9 +1773,24 @@ static int igt_reset_queue(void *arg)
i915_request_put(prev);
- err = igt_flush_test(gt->i915);
+restore:
+ if (using_guc) {
+ int err2 = intel_selftest_restore_policy(engine, &saved);
+
+ if (err2)
+ pr_err("%s:%d> [%s] Restore policy failed: %d!\n",
+ __func__, __LINE__, engine->name, err2);
+ if (err == 0)
+ err = err2;
+ }
if (err)
+ goto fini;
+
+ err = igt_flush_test(gt->i915);
+ if (err) {
+ pr_err("[%s] Flush failed: %d!\n", engine->name, err);
break;
+ }
}
fini:
@@ -1653,12 +1823,15 @@ static int igt_handle_error(void *arg)
return 0;
err = hang_init(&h, gt);
- if (err)
+ if (err) {
+ pr_err("[%s] Hang init failed: %d!\n", engine->name, err);
return err;
+ }
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
+ pr_err("[%s] Create hang request failed: %d!\n", engine->name, err);
goto err_fini;
}
@@ -1702,7 +1875,7 @@ static int __igt_atomic_reset_engine(struct intel_engine_cs *engine,
const struct igt_atomic_section *p,
const char *mode)
{
- struct tasklet_struct * const t = &engine->execlists.tasklet;
+ struct tasklet_struct * const t = &engine->sched_engine->tasklet;
int err;
GEM_TRACE("i915_reset_engine(%s:%s) under %s\n",
@@ -1743,12 +1916,15 @@ static int igt_atomic_reset_engine(struct intel_engine_cs *engine,
return err;
err = hang_init(&h, engine->gt);
- if (err)
+ if (err) {
+ pr_err("[%s] Hang init failed: %d!\n", engine->name, err);
return err;
+ }
rq = hang_create_request(&h, engine);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
+ pr_err("[%s] Create hang request failed: %d!\n", engine->name, err);
goto out;
}
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 3119016d9910..b0977a3b699b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -49,7 +49,7 @@ static int wait_for_submit(struct intel_engine_cs *engine,
unsigned long timeout)
{
/* Ignore our own attempts to suppress excess tasklets */
- tasklet_hi_schedule(&engine->execlists.tasklet);
+ tasklet_hi_schedule(&engine->sched_engine->tasklet);
timeout += jiffies;
do {
@@ -1613,12 +1613,12 @@ static void garbage_reset(struct intel_engine_cs *engine,
local_bh_disable();
if (!test_and_set_bit(bit, lock)) {
- tasklet_disable(&engine->execlists.tasklet);
+ tasklet_disable(&engine->sched_engine->tasklet);
if (!rq->fence.error)
__intel_engine_reset_bh(engine, NULL);
- tasklet_enable(&engine->execlists.tasklet);
+ tasklet_enable(&engine->sched_engine->tasklet);
clear_and_wake_up_bit(bit, lock);
}
local_bh_enable();
diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
new file mode 100644
index 000000000000..12ef2837c89b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -0,0 +1,669 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include <linux/sort.h>
+
+#include "selftests/i915_random.h"
+
+static const unsigned int sizes[] = {
+ SZ_4K,
+ SZ_64K,
+ SZ_2M,
+ CHUNK_SZ - SZ_4K,
+ CHUNK_SZ,
+ CHUNK_SZ + SZ_4K,
+ SZ_64M,
+};
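/*
 * The three CHUNK_SZ-relative entries presumably straddle the
 * migration engine's per-chunk boundary on purpose, so the sub-chunk,
 * exact-chunk and multi-chunk paths of the copy and clear routines
 * all get exercised.
 */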
+
+static struct drm_i915_gem_object *
+create_lmem_or_internal(struct drm_i915_private *i915, size_t size)
+{
+ struct drm_i915_gem_object *obj;
+
+ obj = i915_gem_object_create_lmem(i915, size, 0);
+ if (!IS_ERR(obj))
+ return obj;
+
+ return i915_gem_object_create_internal(i915, size);
+}
+
+static int copy(struct intel_migrate *migrate,
+ int (*fn)(struct intel_migrate *migrate,
+ struct i915_gem_ww_ctx *ww,
+ struct drm_i915_gem_object *src,
+ struct drm_i915_gem_object *dst,
+ struct i915_request **out),
+ u32 sz, struct rnd_state *prng)
+{
+ struct drm_i915_private *i915 = migrate->context->engine->i915;
+ struct drm_i915_gem_object *src, *dst;
+ struct i915_request *rq;
+ struct i915_gem_ww_ctx ww;
+ u32 *vaddr;
+ int err = 0;
+ int i;
+
+ src = create_lmem_or_internal(i915, sz);
+ if (IS_ERR(src))
+ return 0;
+
+ dst = i915_gem_object_create_internal(i915, sz);
+ if (IS_ERR(dst))
+ goto err_free_src;
+
+ for_i915_gem_ww(&ww, err, true) {
+ err = i915_gem_object_lock(src, &ww);
+ if (err)
+ continue;
+
+ err = i915_gem_object_lock(dst, &ww);
+ if (err)
+ continue;
+
+ vaddr = i915_gem_object_pin_map(src, I915_MAP_WC);
+ if (IS_ERR(vaddr)) {
+ err = PTR_ERR(vaddr);
+ continue;
+ }
+
+ for (i = 0; i < sz / sizeof(u32); i++)
+ vaddr[i] = i;
+ i915_gem_object_flush_map(src);
+
+ vaddr = i915_gem_object_pin_map(dst, I915_MAP_WC);
+ if (IS_ERR(vaddr)) {
+ err = PTR_ERR(vaddr);
+ goto unpin_src;
+ }
+
+ for (i = 0; i < sz / sizeof(u32); i++)
+ vaddr[i] = ~i;
+ i915_gem_object_flush_map(dst);
+
+ err = fn(migrate, &ww, src, dst, &rq);
+ if (!err)
+ continue;
+
+ if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS)
+ pr_err("%ps failed, size: %u\n", fn, sz);
+ if (rq) {
+ i915_request_wait(rq, 0, HZ);
+ i915_request_put(rq);
+ }
+ i915_gem_object_unpin_map(dst);
+unpin_src:
+ i915_gem_object_unpin_map(src);
+ }
+ if (err)
+ goto err_out;
+
+ if (rq) {
+ if (i915_request_wait(rq, 0, HZ) < 0) {
+ pr_err("%ps timed out, size: %u\n", fn, sz);
+ err = -ETIME;
+ }
+ i915_request_put(rq);
+ }
+
+ for (i = 0; !err && i < sz / PAGE_SIZE; i++) {
+ int x = i * 1024 + i915_prandom_u32_max_state(1024, prng);
+
+ if (vaddr[x] != x) {
+ pr_err("%ps failed, size: %u, offset: %zu\n",
+ fn, sz, x * sizeof(u32));
+ igt_hexdump(vaddr + i * 1024, 4096);
+ err = -EINVAL;
+ }
+ }
+
+ i915_gem_object_unpin_map(dst);
+ i915_gem_object_unpin_map(src);
+
+err_out:
+ i915_gem_object_put(dst);
+err_free_src:
+ i915_gem_object_put(src);
+
+ return err;
+}
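/*
 * A note on the for_i915_gem_ww() loops used in copy() and clear():
 * the body is a ww-mutex transaction. If any lock attempt returns
 * -EDEADLK the context backs off and the body re-runs, so 'continue'
 * serves both as "retry on contention" and as "commit and leave".
 * Minimal sketch of the idiom, assuming the standard i915 helpers:
 *
 *	struct i915_gem_ww_ctx ww;
 *	int err;
 *
 *	for_i915_gem_ww(&ww, err, true) {
 *		err = i915_gem_object_lock(obj, &ww);
 *		if (err)
 *			continue;	// -EDEADLK: back off and retry
 *		// ... work while the object is locked ...
 *	}
 *	// err now holds 0 or the first non-retryable error
 */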
+
+static int clear(struct intel_migrate *migrate,
+ int (*fn)(struct intel_migrate *migrate,
+ struct i915_gem_ww_ctx *ww,
+ struct drm_i915_gem_object *obj,
+ u32 value,
+ struct i915_request **out),
+ u32 sz, struct rnd_state *prng)
+{
+ struct drm_i915_private *i915 = migrate->context->engine->i915;
+ struct drm_i915_gem_object *obj;
+ struct i915_request *rq;
+ struct i915_gem_ww_ctx ww;
+ u32 *vaddr;
+ int err = 0;
+ int i;
+
+ obj = create_lmem_or_internal(i915, sz);
+ if (IS_ERR(obj))
+ return 0;
+
+ for_i915_gem_ww(&ww, err, true) {
+ err = i915_gem_object_lock(obj, &ww);
+ if (err)
+ continue;
+
+ vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+ if (IS_ERR(vaddr)) {
+ err = PTR_ERR(vaddr);
+ continue;
+ }
+
+ for (i = 0; i < sz / sizeof(u32); i++)
+ vaddr[i] = ~i;
+ i915_gem_object_flush_map(obj);
+
+ err = fn(migrate, &ww, obj, sz, &rq);
+ if (!err)
+ continue;
+
+ if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS)
+ pr_err("%ps failed, size: %u\n", fn, sz);
+ if (rq) {
+ i915_request_wait(rq, 0, HZ);
+ i915_request_put(rq);
+ }
+ i915_gem_object_unpin_map(obj);
+ }
+ if (err)
+ goto err_out;
+
+ if (rq) {
+ if (i915_request_wait(rq, 0, HZ) < 0) {
+ pr_err("%ps timed out, size: %u\n", fn, sz);
+ err = -ETIME;
+ }
+ i915_request_put(rq);
+ }
+
+ for (i = 0; !err && i < sz / PAGE_SIZE; i++) {
+ int x = i * 1024 + i915_prandom_u32_max_state(1024, prng);
+
+ if (vaddr[x] != sz) {
+ pr_err("%ps failed, size: %u, offset: %zu\n",
+ fn, sz, x * sizeof(u32));
+ igt_hexdump(vaddr + i * 1024, 4096);
+ err = -EINVAL;
+ }
+ }
+
+ i915_gem_object_unpin_map(obj);
+err_out:
+ i915_gem_object_put(obj);
+
+ return err;
+}
+
+static int __migrate_copy(struct intel_migrate *migrate,
+ struct i915_gem_ww_ctx *ww,
+ struct drm_i915_gem_object *src,
+ struct drm_i915_gem_object *dst,
+ struct i915_request **out)
+{
+ return intel_migrate_copy(migrate, ww, NULL,
+ src->mm.pages->sgl, src->cache_level,
+ i915_gem_object_is_lmem(src),
+ dst->mm.pages->sgl, dst->cache_level,
+ i915_gem_object_is_lmem(dst),
+ out);
+}
+
+static int __global_copy(struct intel_migrate *migrate,
+ struct i915_gem_ww_ctx *ww,
+ struct drm_i915_gem_object *src,
+ struct drm_i915_gem_object *dst,
+ struct i915_request **out)
+{
+ return intel_context_migrate_copy(migrate->context, NULL,
+ src->mm.pages->sgl, src->cache_level,
+ i915_gem_object_is_lmem(src),
+ dst->mm.pages->sgl, dst->cache_level,
+ i915_gem_object_is_lmem(dst),
+ out);
+}
+
+static int
+migrate_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+ return copy(migrate, __migrate_copy, sz, prng);
+}
+
+static int
+global_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+ return copy(migrate, __global_copy, sz, prng);
+}
+
+static int __migrate_clear(struct intel_migrate *migrate,
+ struct i915_gem_ww_ctx *ww,
+ struct drm_i915_gem_object *obj,
+ u32 value,
+ struct i915_request **out)
+{
+ return intel_migrate_clear(migrate, ww, NULL,
+ obj->mm.pages->sgl,
+ obj->cache_level,
+ i915_gem_object_is_lmem(obj),
+ value, out);
+}
+
+static int __global_clear(struct intel_migrate *migrate,
+ struct i915_gem_ww_ctx *ww,
+ struct drm_i915_gem_object *obj,
+ u32 value,
+ struct i915_request **out)
+{
+ return intel_context_migrate_clear(migrate->context, NULL,
+ obj->mm.pages->sgl,
+ obj->cache_level,
+ i915_gem_object_is_lmem(obj),
+ value, out);
+}
+
+static int
+migrate_clear(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+ return clear(migrate, __migrate_clear, sz, prng);
+}
+
+static int
+global_clear(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+ return clear(migrate, __global_clear, sz, prng);
+}
+
+static int live_migrate_copy(void *arg)
+{
+ struct intel_migrate *migrate = arg;
+ struct drm_i915_private *i915 = migrate->context->engine->i915;
+ I915_RND_STATE(prng);
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+ int err;
+
+ err = migrate_copy(migrate, sizes[i], &prng);
+ if (err == 0)
+ err = global_copy(migrate, sizes[i], &prng);
+ i915_gem_drain_freed_objects(i915);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int live_migrate_clear(void *arg)
+{
+ struct intel_migrate *migrate = arg;
+ struct drm_i915_private *i915 = migrate->context->engine->i915;
+ I915_RND_STATE(prng);
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+ int err;
+
+ err = migrate_clear(migrate, sizes[i], &prng);
+ if (err == 0)
+ err = global_clear(migrate, sizes[i], &prng);
+
+ i915_gem_drain_freed_objects(i915);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+struct threaded_migrate {
+ struct intel_migrate *migrate;
+ struct task_struct *tsk;
+ struct rnd_state prng;
+};
+
+static int threaded_migrate(struct intel_migrate *migrate,
+ int (*fn)(void *arg),
+ unsigned int flags)
+{
+ const unsigned int n_cpus = num_online_cpus() + 1;
+ struct threaded_migrate *thread;
+ I915_RND_STATE(prng);
+ unsigned int i;
+ int err = 0;
+
+ thread = kcalloc(n_cpus, sizeof(*thread), GFP_KERNEL);
+ if (!thread)
+ return 0;
+
+ for (i = 0; i < n_cpus; ++i) {
+ struct task_struct *tsk;
+
+ thread[i].migrate = migrate;
+ thread[i].prng =
+ I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
+
+ tsk = kthread_run(fn, &thread[i], "igt-%d", i);
+ if (IS_ERR(tsk)) {
+ err = PTR_ERR(tsk);
+ break;
+ }
+
+ get_task_struct(tsk);
+ thread[i].tsk = tsk;
+ }
+
+ msleep(10); /* start all threads before we kthread_stop() */
+
+ for (i = 0; i < n_cpus; ++i) {
+ struct task_struct *tsk = thread[i].tsk;
+ int status;
+
+ if (IS_ERR_OR_NULL(tsk))
+ continue;
+
+ status = kthread_stop(tsk);
+ if (status && !err)
+ err = status;
+
+ put_task_struct(tsk);
+ }
+
+ kfree(thread);
+ return err;
+}
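/*
 * Design note on threaded_migrate(): it spawns num_online_cpus() + 1
 * workers to oversubscribe the CPUs, sleeps briefly so all threads get
 * going, then relies on kthread_stop() both to end each worker and to
 * collect its status (kthread_stop() returns the threadfn's result),
 * keeping only the first error.
 */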
+
+static int __thread_migrate_copy(void *arg)
+{
+ struct threaded_migrate *tm = arg;
+
+ return migrate_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int thread_migrate_copy(void *arg)
+{
+ return threaded_migrate(arg, __thread_migrate_copy, 0);
+}
+
+static int __thread_global_copy(void *arg)
+{
+ struct threaded_migrate *tm = arg;
+
+ return global_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int thread_global_copy(void *arg)
+{
+ return threaded_migrate(arg, __thread_global_copy, 0);
+}
+
+static int __thread_migrate_clear(void *arg)
+{
+ struct threaded_migrate *tm = arg;
+
+ return migrate_clear(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int __thread_global_clear(void *arg)
+{
+ struct threaded_migrate *tm = arg;
+
+ return global_clear(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int thread_migrate_clear(void *arg)
+{
+ return threaded_migrate(arg, __thread_migrate_clear, 0);
+}
+
+static int thread_global_clear(void *arg)
+{
+ return threaded_migrate(arg, __thread_global_clear, 0);
+}
+
+int intel_migrate_live_selftests(struct drm_i915_private *i915)
+{
+ static const struct i915_subtest tests[] = {
+ SUBTEST(live_migrate_copy),
+ SUBTEST(live_migrate_clear),
+ SUBTEST(thread_migrate_copy),
+ SUBTEST(thread_migrate_clear),
+ SUBTEST(thread_global_copy),
+ SUBTEST(thread_global_clear),
+ };
+ struct intel_gt *gt = &i915->gt;
+
+ if (!gt->migrate.context)
+ return 0;
+
+ return i915_subtests(tests, &gt->migrate);
+}
+
+static struct drm_i915_gem_object *
+create_init_lmem_internal(struct intel_gt *gt, size_t sz, bool try_lmem)
+{
+ struct drm_i915_gem_object *obj = NULL;
+ int err;
+
+ if (try_lmem)
+ obj = i915_gem_object_create_lmem(gt->i915, sz, 0);
+
+ if (IS_ERR_OR_NULL(obj)) {
+ obj = i915_gem_object_create_internal(gt->i915, sz);
+ if (IS_ERR(obj))
+ return obj;
+ }
+
+ i915_gem_object_trylock(obj); /* new object: the lock cannot be contended */
+ err = i915_gem_object_pin_pages(obj);
+ if (err) {
+ i915_gem_object_unlock(obj);
+ i915_gem_object_put(obj);
+ return ERR_PTR(err);
+ }
+
+ return obj;
+}
+
+static int wrap_ktime_compare(const void *A, const void *B)
+{
+ const ktime_t *a = A, *b = B;
+
+ return ktime_compare(*a, *b);
+}
+
+static int __perf_clear_blt(struct intel_context *ce,
+ struct scatterlist *sg,
+ enum i915_cache_level cache_level,
+ bool is_lmem,
+ size_t sz)
+{
+ ktime_t t[5];
+ int pass;
+ int err = 0;
+
+ for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
+ struct i915_request *rq;
+ ktime_t t0, t1;
+
+ t0 = ktime_get();
+
+ err = intel_context_migrate_clear(ce, NULL, sg, cache_level,
+ is_lmem, 0, &rq);
+ if (rq) {
+ if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
+ err = -EIO;
+ i915_request_put(rq);
+ }
+ if (err)
+ break;
+
+ t1 = ktime_get();
+ t[pass] = ktime_sub(t1, t0);
+ }
+ if (err)
+ return err;
+
+ sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
+ pr_info("%s: %zd KiB fill: %lld MiB/s\n",
+ ce->engine->name, sz >> 10,
+ div64_u64(mul_u32_u32(4 * sz,
+ 1000 * 1000 * 1000),
+ t[1] + 2 * t[2] + t[3]) >> 20);
+ return 0;
+}
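For reference, the bandwidth computation above sorts the five per-pass times, drops the fastest and slowest, and weights the remainder as t[1] + 2*t[2] + t[3] — four passes' worth of time, which is why the byte count is 4 * sz. Multiplying by 10^9 converts the ktime nanoseconds into bytes per second, and the final >> 20 scales to MiB/s. As a worked example, clearing sz = 64 MiB in about 100 ms per pass gives 4 * 64 MiB * 10^9 ns/s / (4 * 10^8 ns) = 640 MiB/s.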
+
+static int perf_clear_blt(void *arg)
+{
+ struct intel_gt *gt = arg;
+ static const unsigned long sizes[] = {
+ SZ_4K,
+ SZ_64K,
+ SZ_2M,
+ SZ_64M
+ };
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+ struct drm_i915_gem_object *dst;
+ int err;
+
+ dst = create_init_lmem_internal(gt, sizes[i], true);
+ if (IS_ERR(dst))
+ return PTR_ERR(dst);
+
+ err = __perf_clear_blt(gt->migrate.context,
+ dst->mm.pages->sgl,
+ I915_CACHE_NONE,
+ i915_gem_object_is_lmem(dst),
+ sizes[i]);
+
+ i915_gem_object_unlock(dst);
+ i915_gem_object_put(dst);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static int __perf_copy_blt(struct intel_context *ce,
+ struct scatterlist *src,
+ enum i915_cache_level src_cache_level,
+ bool src_is_lmem,
+ struct scatterlist *dst,
+ enum i915_cache_level dst_cache_level,
+ bool dst_is_lmem,
+ size_t sz)
+{
+ ktime_t t[5];
+ int pass;
+ int err = 0;
+
+ for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
+ struct i915_request *rq;
+ ktime_t t0, t1;
+
+ t0 = ktime_get();
+
+ err = intel_context_migrate_copy(ce, NULL,
+ src, src_cache_level,
+ src_is_lmem,
+ dst, dst_cache_level,
+ dst_is_lmem,
+ &rq);
+ if (rq) {
+ if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
+ err = -EIO;
+ i915_request_put(rq);
+ }
+ if (err)
+ break;
+
+ t1 = ktime_get();
+ t[pass] = ktime_sub(t1, t0);
+ }
+ if (err)
+ return err;
+
+ sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
+ pr_info("%s: %zd KiB copy: %lld MiB/s\n",
+ ce->engine->name, sz >> 10,
+ div64_u64(mul_u32_u32(4 * sz,
+ 1000 * 1000 * 1000),
+ t[1] + 2 * t[2] + t[3]) >> 20);
+ return 0;
+}
+
+static int perf_copy_blt(void *arg)
+{
+ struct intel_gt *gt = arg;
+ static const unsigned long sizes[] = {
+ SZ_4K,
+ SZ_64K,
+ SZ_2M,
+ SZ_64M
+ };
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+ struct drm_i915_gem_object *src, *dst;
+ int err;
+
+ src = create_init_lmem_internal(gt, sizes[i], true);
+ if (IS_ERR(src))
+ return PTR_ERR(src);
+
+ dst = create_init_lmem_internal(gt, sizes[i], false);
+ if (IS_ERR(dst)) {
+ err = PTR_ERR(dst);
+ goto err_src;
+ }
+
+ err = __perf_copy_blt(gt->migrate.context,
+ src->mm.pages->sgl,
+ I915_CACHE_NONE,
+ i915_gem_object_is_lmem(src),
+ dst->mm.pages->sgl,
+ I915_CACHE_NONE,
+ i915_gem_object_is_lmem(dst),
+ sizes[i]);
+
+ i915_gem_object_unlock(dst);
+ i915_gem_object_put(dst);
+err_src:
+ i915_gem_object_unlock(src);
+ i915_gem_object_put(src);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+int intel_migrate_perf_selftests(struct drm_i915_private *i915)
+{
+ static const struct i915_subtest tests[] = {
+ SUBTEST(perf_clear_blt),
+ SUBTEST(perf_copy_blt),
+ };
+ struct intel_gt *gt = &i915->gt;
+
+ if (intel_gt_is_wedged(gt))
+ return 0;
+
+ if (!gt->migrate.context)
+ return 0;
+
+ return intel_gt_live_subtests(tests, gt);
+}
diff --git a/drivers/gpu/drm/i915/gt/selftest_mocs.c b/drivers/gpu/drm/i915/gt/selftest_mocs.c
index b9bb0e6e97f7..13d25bf2a94a 100644
--- a/drivers/gpu/drm/i915/gt/selftest_mocs.c
+++ b/drivers/gpu/drm/i915/gt/selftest_mocs.c
@@ -10,6 +10,7 @@
#include "gem/selftests/mock_context.h"
#include "selftests/igt_reset.h"
#include "selftests/igt_spinner.h"
+#include "selftests/intel_scheduler_helpers.h"
struct live_mocs {
struct drm_i915_mocs_table table;
@@ -28,7 +29,7 @@ static struct intel_context *mocs_context_create(struct intel_engine_cs *engine)
return ce;
/* We build large requests to read the registers from the ring */
- ce->ring = __intel_context_ring_size(SZ_16K);
+ ce->ring_size = SZ_16K;
return ce;
}
@@ -318,7 +319,8 @@ static int live_mocs_clean(void *arg)
}
static int active_engine_reset(struct intel_context *ce,
- const char *reason)
+ const char *reason,
+ bool using_guc)
{
struct igt_spinner spin;
struct i915_request *rq;
@@ -335,9 +337,13 @@ static int active_engine_reset(struct intel_context *ce,
}
err = request_add_spin(rq, &spin);
- if (err == 0)
+ if (err == 0 && !using_guc)
err = intel_engine_reset(ce->engine, reason);
+ /* Ensure the reset happens and kills the engine */
+ if (err == 0)
+ err = intel_selftest_wait_for_rq(rq);
+
igt_spinner_end(&spin);
igt_spinner_fini(&spin);
@@ -345,21 +351,23 @@ static int active_engine_reset(struct intel_context *ce,
}
static int __live_mocs_reset(struct live_mocs *mocs,
- struct intel_context *ce)
+ struct intel_context *ce, bool using_guc)
{
struct intel_gt *gt = ce->engine->gt;
int err;
if (intel_has_reset_engine(gt)) {
- err = intel_engine_reset(ce->engine, "mocs");
- if (err)
- return err;
-
- err = check_mocs_engine(mocs, ce);
- if (err)
- return err;
+ if (!using_guc) {
+ err = intel_engine_reset(ce->engine, "mocs");
+ if (err)
+ return err;
+
+ err = check_mocs_engine(mocs, ce);
+ if (err)
+ return err;
+ }
- err = active_engine_reset(ce, "mocs");
+ err = active_engine_reset(ce, "mocs", using_guc);
if (err)
return err;
@@ -395,19 +403,33 @@ static int live_mocs_reset(void *arg)
igt_global_reset_lock(gt);
for_each_engine(engine, gt, id) {
+ bool using_guc = intel_engine_uses_guc(engine);
+ struct intel_selftest_saved_policy saved;
struct intel_context *ce;
+ int err2;
+
+ err = intel_selftest_modify_policy(engine, &saved,
+ SELFTEST_SCHEDULER_MODIFY_FAST_RESET);
+ if (err)
+ break;
ce = mocs_context_create(engine);
if (IS_ERR(ce)) {
err = PTR_ERR(ce);
- break;
+ goto restore;
}
intel_engine_pm_get(engine);
- err = __live_mocs_reset(&mocs, ce);
- intel_engine_pm_put(engine);
+ err = __live_mocs_reset(&mocs, ce, using_guc);
+
+ intel_engine_pm_put(engine);
intel_context_put(ce);
+
+restore:
+ err2 = intel_selftest_restore_policy(engine, &saved);
+ if (err == 0)
+ err = err2;
if (err)
break;
}
diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c
index 8784257ec808..7a50c9f4071b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_reset.c
+++ b/drivers/gpu/drm/i915/gt/selftest_reset.c
@@ -321,7 +321,7 @@ static int igt_atomic_engine_reset(void *arg)
goto out_unlock;
for_each_engine(engine, gt, id) {
- struct tasklet_struct *t = &engine->execlists.tasklet;
+ struct tasklet_struct *t = &engine->sched_engine->tasklet;
if (t->func)
tasklet_disable(t);
diff --git a/drivers/gpu/drm/i915/gt/selftest_slpc.c b/drivers/gpu/drm/i915/gt/selftest_slpc.c
new file mode 100644
index 000000000000..9334bad131a2
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_slpc.c
@@ -0,0 +1,311 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#define NUM_STEPS 5
+#define H2G_DELAY 50000
+#define delay_for_h2g() usleep_range(H2G_DELAY, H2G_DELAY + 10000)
+#define FREQUENCY_REQ_UNIT DIV_ROUND_CLOSEST(GT_FREQUENCY_MULTIPLIER, \
+ GEN9_FREQ_SCALER)
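Assuming the usual i915_reg.h values (GT_FREQUENCY_MULTIPLIER = 50, GEN9_FREQ_SCALER = 3), FREQUENCY_REQ_UNIT evaluates to DIV_ROUND_CLOSEST(50, 3) = 17 MHz — one step of the 50/3 MHz granularity in which GuC requests frequencies, and the slack allowed in the SWReq checks below.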
+
+static int slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 freq)
+{
+ int ret;
+
+ ret = intel_guc_slpc_set_min_freq(slpc, freq);
+ if (ret)
+ pr_err("Could not set min frequency to [%u]\n", freq);
+ else /* Delay to ensure h2g completes */
+ delay_for_h2g();
+
+ return ret;
+}
+
+static int slpc_set_max_freq(struct intel_guc_slpc *slpc, u32 freq)
+{
+ int ret;
+
+ ret = intel_guc_slpc_set_max_freq(slpc, freq);
+ if (ret)
+ pr_err("Could not set maximum frequency [%u]\n",
+ freq);
+ else /* Delay to ensure h2g completes */
+ delay_for_h2g();
+
+ return ret;
+}
+
+static int live_slpc_clamp_min(void *arg)
+{
+ struct drm_i915_private *i915 = arg;
+ struct intel_gt *gt = &i915->gt;
+ struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
+ struct intel_rps *rps = &gt->rps;
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+ struct igt_spinner spin;
+ u32 slpc_min_freq, slpc_max_freq;
+ int err = 0;
+
+ if (!intel_uc_uses_guc_slpc(&gt->uc))
+ return 0;
+
+ if (igt_spinner_init(&spin, gt))
+ return -ENOMEM;
+
+ if (intel_guc_slpc_get_max_freq(slpc, &slpc_max_freq)) {
+ pr_err("Could not get SLPC max freq\n");
+ igt_spinner_fini(&spin);
+ return -EIO;
+ }
+
+ if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq)) {
+ pr_err("Could not get SLPC min freq\n");
+ igt_spinner_fini(&spin);
+ return -EIO;
+ }
+
+ if (slpc_min_freq == slpc_max_freq) {
+ pr_err("Min/Max are fused to the same value\n");
+ igt_spinner_fini(&spin);
+ return -EINVAL;
+ }
+
+ intel_gt_pm_wait_for_idle(gt);
+ intel_gt_pm_get(gt);
+ for_each_engine(engine, gt, id) {
+ struct i915_request *rq;
+ u32 step, min_freq, req_freq;
+ u32 act_freq, max_act_freq;
+
+ if (!intel_engine_can_store_dword(engine))
+ continue;
+
+ /* Go from min to max in 5 steps */
+ step = (slpc_max_freq - slpc_min_freq) / NUM_STEPS;
+ max_act_freq = slpc_min_freq;
+ for (min_freq = slpc_min_freq; min_freq < slpc_max_freq;
+ min_freq += step) {
+ err = slpc_set_min_freq(slpc, min_freq);
+ if (err)
+ break;
+
+ st_engine_heartbeat_disable(engine);
+
+ rq = igt_spinner_create_request(&spin,
+ engine->kernel_context,
+ MI_NOOP);
+ if (IS_ERR(rq)) {
+ err = PTR_ERR(rq);
+ st_engine_heartbeat_enable(engine);
+ break;
+ }
+
+ i915_request_add(rq);
+
+ if (!igt_wait_for_spinner(&spin, rq)) {
+ pr_err("%s: Spinner did not start\n",
+ engine->name);
+ igt_spinner_end(&spin);
+ st_engine_heartbeat_enable(engine);
+ intel_gt_set_wedged(engine->gt);
+ err = -EIO;
+ break;
+ }
+
+ /*
+ * Wait for GuC to detect busyness and raise the
+ * requested frequency if necessary.
+ */
+ delay_for_h2g();
+
+ req_freq = intel_rps_read_punit_req_frequency(rps);
+
+ /* GuC requests freq in multiples of 50/3 MHz */
+ if (req_freq < (min_freq - FREQUENCY_REQ_UNIT)) {
+ pr_err("SWReq is %d, should be at least %d\n", req_freq,
+ min_freq - FREQUENCY_REQ_UNIT);
+ igt_spinner_end(&spin);
+ st_engine_heartbeat_enable(engine);
+ err = -EINVAL;
+ break;
+ }
+
+ act_freq = intel_rps_read_actual_frequency(rps);
+ if (act_freq > max_act_freq)
+ max_act_freq = act_freq;
+
+ igt_spinner_end(&spin);
+ st_engine_heartbeat_enable(engine);
+ }
+
+ pr_info("Max actual frequency for %s was %d\n",
+ engine->name, max_act_freq);
+
+ /* Actual frequency should rise above min */
+ if (max_act_freq == slpc_min_freq) {
+ pr_err("Actual freq did not rise above min\n");
+ err = -EINVAL;
+ }
+
+ if (err)
+ break;
+ }
+
+ /* Restore min/max frequencies */
+ slpc_set_max_freq(slpc, slpc_max_freq);
+ slpc_set_min_freq(slpc, slpc_min_freq);
+
+ if (igt_flush_test(gt->i915))
+ err = -EIO;
+
+ intel_gt_pm_put(gt);
+ igt_spinner_fini(&spin);
+ intel_gt_pm_wait_for_idle(gt);
+
+ return err;
+}
+
+static int live_slpc_clamp_max(void *arg)
+{
+ struct drm_i915_private *i915 = arg;
+ struct intel_gt *gt = &i915->gt;
+ struct intel_guc_slpc *slpc;
+ struct intel_rps *rps;
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+ struct igt_spinner spin;
+ int err = 0;
+ u32 slpc_min_freq, slpc_max_freq;
+
+ slpc = &gt->uc.guc.slpc;
+ rps = &gt->rps;
+
+ if (!intel_uc_uses_guc_slpc(&gt->uc))
+ return 0;
+
+ if (igt_spinner_init(&spin, gt))
+ return -ENOMEM;
+
+ if (intel_guc_slpc_get_max_freq(slpc, &slpc_max_freq)) {
+ pr_err("Could not get SLPC max freq\n");
+ igt_spinner_fini(&spin);
+ return -EIO;
+ }
+
+ if (intel_guc_slpc_get_min_freq(slpc, &slpc_min_freq)) {
+ pr_err("Could not get SLPC min freq\n");
+ igt_spinner_fini(&spin);
+ return -EIO;
+ }
+
+ if (slpc_min_freq == slpc_max_freq) {
+ pr_err("Min/Max are fused to the same value\n");
+ igt_spinner_fini(&spin);
+ return -EINVAL;
+ }
+
+ intel_gt_pm_wait_for_idle(gt);
+ intel_gt_pm_get(gt);
+ for_each_engine(engine, gt, id) {
+ struct i915_request *rq;
+ u32 max_freq, req_freq;
+ u32 act_freq, max_act_freq;
+ u32 step;
+
+ if (!intel_engine_can_store_dword(engine))
+ continue;
+
+ /* Go from max to min in 5 steps */
+ step = (slpc_max_freq - slpc_min_freq) / NUM_STEPS;
+ max_act_freq = slpc_min_freq;
+ for (max_freq = slpc_max_freq; max_freq > slpc_min_freq;
+ max_freq -= step) {
+ err = slpc_set_max_freq(slpc, max_freq);
+ if (err)
+ break;
+
+ st_engine_heartbeat_disable(engine);
+
+ rq = igt_spinner_create_request(&spin,
+ engine->kernel_context,
+ MI_NOOP);
+ if (IS_ERR(rq)) {
+ st_engine_heartbeat_enable(engine);
+ err = PTR_ERR(rq);
+ break;
+ }
+
+ i915_request_add(rq);
+
+ if (!igt_wait_for_spinner(&spin, rq)) {
+ pr_err("%s: SLPC spinner did not start\n",
+ engine->name);
+ igt_spinner_end(&spin);
+ st_engine_heartbeat_enable(engine);
+ intel_gt_set_wedged(engine->gt);
+ err = -EIO;
+ break;
+ }
+
+ delay_for_h2g();
+
+ /* Verify that SWREQ was indeed limited to the new max */
+ req_freq = intel_rps_read_punit_req_frequency(rps);
+
+ /* GuC requests freq in multiples of 50/3 MHz */
+ if (req_freq > (max_freq + FREQUENCY_REQ_UNIT)) {
+ pr_err("SWReq is %d, should be at most %d\n", req_freq,
+ max_freq + FREQUENCY_REQ_UNIT);
+ igt_spinner_end(&spin);
+ st_engine_heartbeat_enable(engine);
+ err = -EINVAL;
+ break;
+ }
+
+ act_freq = intel_rps_read_actual_frequency(rps);
+ if (act_freq > max_act_freq)
+ max_act_freq = act_freq;
+
+ st_engine_heartbeat_enable(engine);
+ igt_spinner_end(&spin);
+
+ if (err)
+ break;
+ }
+
+ pr_info("Max actual frequency for %s was %d\n",
+ engine->name, max_act_freq);
+
+ /* Actual frequency should rise above min */
+ if (max_act_freq == slpc_min_freq) {
+ pr_err("Actual freq did not rise above min\n");
+ err = -EINVAL;
+ }
+
+ if (igt_flush_test(gt->i915)) {
+ err = -EIO;
+ break;
+ }
+
+ if (err)
+ break;
+ }
+
+ /* Restore min/max freq */
+ slpc_set_max_freq(slpc, slpc_max_freq);
+ slpc_set_min_freq(slpc, slpc_min_freq);
+
+ intel_gt_pm_put(gt);
+ igt_spinner_fini(&spin);
+ intel_gt_pm_wait_for_idle(gt);
+
+ return err;
+}
+
+int intel_slpc_live_selftests(struct drm_i915_private *i915)
+{
+ static const struct i915_subtest tests[] = {
+ SUBTEST(live_slpc_clamp_max),
+ SUBTEST(live_slpc_clamp_min),
+ };
+
+ if (intel_gt_is_wedged(&i915->gt))
+ return 0;
+
+ return i915_live_subtests(tests, i915);
+}
diff --git a/drivers/gpu/drm/i915/gt/selftest_timeline.c b/drivers/gpu/drm/i915/gt/selftest_timeline.c
index 64da0c91dec1..d0b6a3afcf44 100644
--- a/drivers/gpu/drm/i915/gt/selftest_timeline.c
+++ b/drivers/gpu/drm/i915/gt/selftest_timeline.c
@@ -874,7 +874,7 @@ static int create_watcher(struct hwsp_watcher *w,
if (IS_ERR(ce))
return PTR_ERR(ce);
- ce->ring = __intel_context_ring_size(ringsz);
+ ce->ring_size = ringsz;
w->rq = intel_context_create_request(ce);
intel_context_put(ce);
if (IS_ERR(w->rq))
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index c30754daf4b1..e623ac45f4aa 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -12,6 +12,7 @@
#include "selftests/igt_flush_test.h"
#include "selftests/igt_reset.h"
#include "selftests/igt_spinner.h"
+#include "selftests/intel_scheduler_helpers.h"
#include "selftests/mock_drm.h"
#include "gem/selftests/igt_gem_utils.h"
@@ -261,28 +262,34 @@ static int do_engine_reset(struct intel_engine_cs *engine)
return intel_engine_reset(engine, "live_workarounds");
}
+static int do_guc_reset(struct intel_engine_cs *engine)
+{
+ /* Currently a no-op as the reset is handled by GuC */
+ return 0;
+}
+
static int
switch_to_scratch_context(struct intel_engine_cs *engine,
- struct igt_spinner *spin)
+ struct igt_spinner *spin,
+ struct i915_request **rq)
{
struct intel_context *ce;
- struct i915_request *rq;
int err = 0;
ce = intel_context_create(engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
- rq = igt_spinner_create_request(spin, ce, MI_NOOP);
+ *rq = igt_spinner_create_request(spin, ce, MI_NOOP);
intel_context_put(ce);
- if (IS_ERR(rq)) {
+ if (IS_ERR(*rq)) {
spin = NULL;
- err = PTR_ERR(rq);
+ err = PTR_ERR(*rq);
goto err;
}
- err = request_add_spin(rq, spin);
+ err = request_add_spin(*rq, spin);
err:
if (err && spin)
igt_spinner_end(spin);
@@ -296,6 +303,7 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine,
{
struct intel_context *ce, *tmp;
struct igt_spinner spin;
+ struct i915_request *rq;
intel_wakeref_t wakeref;
int err;
@@ -316,13 +324,24 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine,
goto out_spin;
}
- err = switch_to_scratch_context(engine, &spin);
+ err = switch_to_scratch_context(engine, &spin, &rq);
if (err)
goto out_spin;
+ /* Ensure the spinner hasn't aborted */
+ if (i915_request_completed(rq)) {
+ pr_err("%s spinner failed to start\n", name);
+ err = -ETIMEDOUT;
+ goto out_spin;
+ }
+
with_intel_runtime_pm(engine->uncore->rpm, wakeref)
err = reset(engine);
+ /* Ensure the reset happens and kills the engine */
+ if (err == 0)
+ err = intel_selftest_wait_for_rq(rq);
+
igt_spinner_end(&spin);
if (err) {
@@ -787,9 +806,28 @@ static int live_reset_whitelist(void *arg)
continue;
if (intel_has_reset_engine(gt)) {
- err = check_whitelist_across_reset(engine,
- do_engine_reset,
- "engine");
+ if (intel_engine_uses_guc(engine)) {
+ struct intel_selftest_saved_policy saved;
+ int err2;
+
+ err = intel_selftest_modify_policy(engine, &saved,
+ SELFTEST_SCHEDULER_MODIFY_FAST_RESET);
+ if (err)
+ goto out;
+
+ err = check_whitelist_across_reset(engine,
+ do_guc_reset,
+ "guc");
+
+ err2 = intel_selftest_restore_policy(engine, &saved);
+ if (err == 0)
+ err = err2;
+ } else {
+ err = check_whitelist_across_reset(engine,
+ do_engine_reset,
+ "engine");
+ }
+
if (err)
goto out;
}
@@ -1147,7 +1185,7 @@ verify_wa_lists(struct intel_gt *gt, struct wa_lists *lists,
enum intel_engine_id id;
bool ok = true;
- ok &= wa_list_verify(gt->uncore, &lists->gt_wa_list, str);
+ ok &= wa_list_verify(gt, &lists->gt_wa_list, str);
for_each_engine(engine, gt, id) {
struct intel_context *ce;
@@ -1175,31 +1213,36 @@ live_gpu_reset_workarounds(void *arg)
{
struct intel_gt *gt = arg;
intel_wakeref_t wakeref;
- struct wa_lists lists;
+ struct wa_lists *lists;
bool ok;
if (!intel_has_gpu_reset(gt))
return 0;
+ lists = kzalloc(sizeof(*lists), GFP_KERNEL);
+ if (!lists)
+ return -ENOMEM;
+
pr_info("Verifying after GPU reset...\n");
igt_global_reset_lock(gt);
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
- reference_lists_init(gt, &lists);
+ reference_lists_init(gt, lists);
- ok = verify_wa_lists(gt, &lists, "before reset");
+ ok = verify_wa_lists(gt, lists, "before reset");
if (!ok)
goto out;
intel_gt_reset(gt, ALL_ENGINES, "live_workarounds");
- ok = verify_wa_lists(gt, &lists, "after reset");
+ ok = verify_wa_lists(gt, lists, "after reset");
out:
- reference_lists_fini(gt, &lists);
+ reference_lists_fini(gt, lists);
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
igt_global_reset_unlock(gt);
+ kfree(lists);
return ok ? 0 : -ESRCH;
}
@@ -1214,43 +1257,57 @@ live_engine_reset_workarounds(void *arg)
struct igt_spinner spin;
struct i915_request *rq;
intel_wakeref_t wakeref;
- struct wa_lists lists;
+ struct wa_lists *lists;
int ret = 0;
if (!intel_has_reset_engine(gt))
return 0;
+ lists = kzalloc(sizeof(*lists), GFP_KERNEL);
+ if (!lists)
+ return -ENOMEM;
+
igt_global_reset_lock(gt);
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
- reference_lists_init(gt, &lists);
+ reference_lists_init(gt, lists);
for_each_engine(engine, gt, id) {
+ struct intel_selftest_saved_policy saved;
+ bool using_guc = intel_engine_uses_guc(engine);
bool ok;
+ int ret2;
pr_info("Verifying after %s reset...\n", engine->name);
+ ret = intel_selftest_modify_policy(engine, &saved,
+ SELFTEST_SCHEDULER_MODIFY_FAST_RESET);
+ if (ret)
+ break;
+
ce = intel_context_create(engine);
if (IS_ERR(ce)) {
ret = PTR_ERR(ce);
- break;
+ goto restore;
}
- ok = verify_wa_lists(gt, &lists, "before reset");
- if (!ok) {
- ret = -ESRCH;
- goto err;
- }
+ if (!using_guc) {
+ ok = verify_wa_lists(gt, lists, "before reset");
+ if (!ok) {
+ ret = -ESRCH;
+ goto err;
+ }
- ret = intel_engine_reset(engine, "live_workarounds:idle");
- if (ret) {
- pr_err("%s: Reset failed while idle\n", engine->name);
- goto err;
- }
+ ret = intel_engine_reset(engine, "live_workarounds:idle");
+ if (ret) {
+ pr_err("%s: Reset failed while idle\n", engine->name);
+ goto err;
+ }
- ok = verify_wa_lists(gt, &lists, "after idle reset");
- if (!ok) {
- ret = -ESRCH;
- goto err;
+ ok = verify_wa_lists(gt, lists, "after idle reset");
+ if (!ok) {
+ ret = -ESRCH;
+ goto err;
+ }
}
ret = igt_spinner_init(&spin, engine->gt);
@@ -1271,32 +1328,49 @@ live_engine_reset_workarounds(void *arg)
goto err;
}
- ret = intel_engine_reset(engine, "live_workarounds:active");
- if (ret) {
- pr_err("%s: Reset failed on an active spinner\n",
- engine->name);
- igt_spinner_fini(&spin);
- goto err;
+ /* Ensure the spinner hasn't aborted */
+ if (i915_request_completed(rq)) {
+ ret = -ETIMEDOUT;
+ goto skip;
}
+ if (!using_guc) {
+ ret = intel_engine_reset(engine, "live_workarounds:active");
+ if (ret) {
+ pr_err("%s: Reset failed on an active spinner\n",
+ engine->name);
+ igt_spinner_fini(&spin);
+ goto err;
+ }
+ }
+
+ /* Ensure the reset happens and kills the engine */
+ if (ret == 0)
+ ret = intel_selftest_wait_for_rq(rq);
+
+skip:
igt_spinner_end(&spin);
igt_spinner_fini(&spin);
- ok = verify_wa_lists(gt, &lists, "after busy reset");
- if (!ok) {
+ ok = verify_wa_lists(gt, lists, "after busy reset");
+ if (!ok)
ret = -ESRCH;
- goto err;
- }
err:
intel_context_put(ce);
+
+restore:
+ ret2 = intel_selftest_restore_policy(engine, &saved);
+ if (ret == 0)
+ ret = ret2;
if (ret)
break;
}
- reference_lists_fini(gt, &lists);
+ reference_lists_fini(gt, lists);
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
igt_global_reset_unlock(gt);
+ kfree(lists);
igt_flush_test(gt->i915);
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
index 90efef8a73e4..8ff582222aff 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_abi.h
@@ -6,6 +6,113 @@
#ifndef _ABI_GUC_ACTIONS_ABI_H
#define _ABI_GUC_ACTIONS_ABI_H
+/**
+ * DOC: HOST2GUC_REGISTER_CTB
+ *
+ * This message is used as part of the `CTB based communication`_ setup.
+ *
+ * This message must be sent as `MMIO HXG Message`_.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_HOST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:16 | DATA0 = MBZ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:0 | ACTION = _`GUC_ACTION_HOST2GUC_REGISTER_CTB` = 0x4505 |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:12 | RESERVED = MBZ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 11:8 | **TYPE** - type for the `CT Buffer`_ |
+ * | | | |
+ * | | | - _`GUC_CTB_TYPE_HOST2GUC` = 0 |
+ * | | | - _`GUC_CTB_TYPE_GUC2HOST` = 1 |
+ * | +-------+--------------------------------------------------------------+
+ * | | 7:0 | **SIZE** - size of the `CT Buffer`_ in 4K units minus 1 |
+ * +---+-------+--------------------------------------------------------------+
+ * | 2 | 31:0 | **DESC_ADDR** - GGTT address of the `CTB Descriptor`_ |
+ * +---+-------+--------------------------------------------------------------+
+ * | 3 | 31:0 | **BUFF_ADDR** - GGTT address of the `CT Buffer`_ |
+ * +---+-------+--------------------------------------------------------------+
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_GUC_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:0 | DATA0 = MBZ |
+ * +---+-------+--------------------------------------------------------------+
+ */
+#define GUC_ACTION_HOST2GUC_REGISTER_CTB 0x4505
+
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_LEN (GUC_HXG_REQUEST_MSG_MIN_LEN + 3u)
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_0_MBZ GUC_HXG_REQUEST_MSG_0_DATA0
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_MBZ (0xfffff << 12)
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_TYPE (0xf << 8)
+#define GUC_CTB_TYPE_HOST2GUC 0u
+#define GUC_CTB_TYPE_GUC2HOST 1u
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_SIZE (0xff << 0)
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_2_DESC_ADDR GUC_HXG_REQUEST_MSG_n_DATAn
+#define HOST2GUC_REGISTER_CTB_REQUEST_MSG_3_BUFF_ADDR GUC_HXG_REQUEST_MSG_n_DATAn
+
+#define HOST2GUC_REGISTER_CTB_RESPONSE_MSG_LEN GUC_HXG_RESPONSE_MSG_MIN_LEN
+#define HOST2GUC_REGISTER_CTB_RESPONSE_MSG_0_MBZ GUC_HXG_RESPONSE_MSG_0_DATA0
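To make the tables above concrete, here is a hedged sketch — not part of this patch — of how a sender could pack the four request dwords with FIELD_PREP(), combining these masks with the HXG header masks defined in guc_messages_abi.h; desc_addr and buff_addr stand for hypothetical GGTT offsets:

    #include <linux/bitfield.h>

    /* Illustrative only: pack a HOST2GUC_REGISTER_CTB request. */
    static void pack_register_ctb(u32 *msg, u32 type, u32 size_in_4k_minus_1,
                                  u32 desc_addr, u32 buff_addr)
    {
            msg[0] = FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
                     FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
                     FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION,
                                GUC_ACTION_HOST2GUC_REGISTER_CTB);
            msg[1] = FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_TYPE, type) |
                     FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_SIZE,
                                size_in_4k_minus_1);
            msg[2] = desc_addr;     /* GGTT address of the CTB Descriptor */
            msg[3] = buff_addr;     /* GGTT address of the CT Buffer */
    }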
+
+/**
+ * DOC: HOST2GUC_DEREGISTER_CTB
+ *
+ * This message is used as part of the `CTB based communication`_ teardown.
+ *
+ * This message must be sent as `MMIO HXG Message`_.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_HOST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:16 | DATA0 = MBZ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:0 | ACTION = _`GUC_ACTION_HOST2GUC_DEREGISTER_CTB` = 0x4506 |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:12 | RESERVED = MBZ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 11:8 | **TYPE** - type of the `CT Buffer`_ |
+ * | | | |
+ * | | | see `GUC_ACTION_HOST2GUC_REGISTER_CTB`_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 7:0 | RESERVED = MBZ |
+ * +---+-------+--------------------------------------------------------------+
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_GUC_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:0 | DATA0 = MBZ |
+ * +---+-------+--------------------------------------------------------------+
+ */
+#define GUC_ACTION_HOST2GUC_DEREGISTER_CTB 0x4506
+
+#define HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_LEN (GUC_HXG_REQUEST_MSG_MIN_LEN + 1u)
+#define HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_0_MBZ GUC_HXG_REQUEST_MSG_0_DATA0
+#define HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_1_MBZ (0xfffff << 12)
+#define HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_1_TYPE (0xf << 8)
+#define HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_1_MBZ2 (0xff << 0)
+
+#define HOST2GUC_DEREGISTER_CTB_RESPONSE_MSG_LEN GUC_HXG_RESPONSE_MSG_MIN_LEN
+#define HOST2GUC_DEREGISTER_CTB_RESPONSE_MSG_0_MBZ GUC_HXG_RESPONSE_MSG_0_DATA0
+
+/* legacy definitions */
+
enum intel_guc_action {
INTEL_GUC_ACTION_DEFAULT = 0x0,
INTEL_GUC_ACTION_REQUEST_PREEMPTION = 0x2,
@@ -17,13 +124,33 @@ enum intel_guc_action {
INTEL_GUC_ACTION_FORCE_LOG_BUFFER_FLUSH = 0x302,
INTEL_GUC_ACTION_ENTER_S_STATE = 0x501,
INTEL_GUC_ACTION_EXIT_S_STATE = 0x502,
- INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003,
+ INTEL_GUC_ACTION_GLOBAL_SCHED_POLICY_CHANGE = 0x506,
+ INTEL_GUC_ACTION_SCHED_CONTEXT = 0x1000,
+ INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET = 0x1001,
+ INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE = 0x1002,
+ INTEL_GUC_ACTION_SCHED_ENGINE_MODE_SET = 0x1003,
+ INTEL_GUC_ACTION_SCHED_ENGINE_MODE_DONE = 0x1004,
+ INTEL_GUC_ACTION_SET_CONTEXT_PRIORITY = 0x1005,
+ INTEL_GUC_ACTION_SET_CONTEXT_EXECUTION_QUANTUM = 0x1006,
+ INTEL_GUC_ACTION_SET_CONTEXT_PREEMPTION_TIMEOUT = 0x1007,
+ INTEL_GUC_ACTION_CONTEXT_RESET_NOTIFICATION = 0x1008,
+ INTEL_GUC_ACTION_ENGINE_FAILURE_NOTIFICATION = 0x1009,
+ INTEL_GUC_ACTION_SETUP_PC_GUCRC = 0x3004,
INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
+ INTEL_GUC_ACTION_REGISTER_CONTEXT = 0x4502,
+ INTEL_GUC_ACTION_DEREGISTER_CONTEXT = 0x4503,
INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER = 0x4505,
INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER = 0x4506,
+ INTEL_GUC_ACTION_DEREGISTER_CONTEXT_DONE = 0x4600,
+ INTEL_GUC_ACTION_RESET_CLIENT = 0x5507,
INTEL_GUC_ACTION_LIMIT
};
+enum intel_guc_rc_options {
+ INTEL_GUCRC_HOST_CONTROL,
+ INTEL_GUCRC_FIRMWARE_CONTROL,
+};
+
enum intel_guc_preempt_options {
INTEL_GUC_PREEMPT_OPTION_DROP_WORK_Q = 0x4,
INTEL_GUC_PREEMPT_OPTION_DROP_SUBMIT_Q = 0x8,
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h
new file mode 100644
index 000000000000..7a8d4bfc5f6a
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_actions_slpc_abi.h
@@ -0,0 +1,235 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef _GUC_ACTIONS_SLPC_ABI_H_
+#define _GUC_ACTIONS_SLPC_ABI_H_
+
+#include <linux/types.h>
+#include "i915_reg.h"
+
+/**
+ * DOC: SLPC SHARED DATA STRUCTURE
+ *
+ * +----+------+--------------------------------------------------------------+
+ * | CL | Bytes| Description |
+ * +====+======+==============================================================+
+ * | 1 | 0-3 | SHARED DATA SIZE |
+ * | +------+--------------------------------------------------------------+
+ * | | 4-7 | GLOBAL STATE |
+ * | +------+--------------------------------------------------------------+
+ * | | 8-11 | DISPLAY DATA ADDRESS |
+ * | +------+--------------------------------------------------------------+
+ * | | 12:63| PADDING |
+ * +----+------+--------------------------------------------------------------+
+ * | 2 | 0:63 | PADDING(PLATFORM INFO) |
+ * +----+------+--------------------------------------------------------------+
+ * | 3 | 0-3 | TASK STATE DATA |
+ * + +------+--------------------------------------------------------------+
+ * | | 4:63 | PADDING |
+ * +----+------+--------------------------------------------------------------+
+ * |4-21|0:1087| OVERRIDE PARAMS AND BIT FIELDS |
+ * +----+------+--------------------------------------------------------------+
+ * | | | PADDING + EXTRA RESERVED PAGE |
+ * +----+------+--------------------------------------------------------------+
+ */
+
+/*
+ * SLPC exposes certain parameters for global configuration by the host.
+ * These are referred to as override parameters, because in most cases
+ * the host will not need to modify the default values used by SLPC.
+ * SLPC remembers the default values which allows the host to easily restore
+ * them by simply unsetting the override. The host can set or unset override
+ * parameters during SLPC (re-)initialization using the SLPC Reset event.
+ * The host can also set or unset override parameters on the fly using the
+ * Parameter Set and Parameter Unset events.
+ */
+
+#define SLPC_MAX_OVERRIDE_PARAMETERS 256
+#define SLPC_OVERRIDE_BITFIELD_SIZE \
+ (SLPC_MAX_OVERRIDE_PARAMETERS / 32)
+
+#define SLPC_PAGE_SIZE_BYTES 4096
+#define SLPC_CACHELINE_SIZE_BYTES 64
+#define SLPC_SHARED_DATA_SIZE_BYTE_HEADER SLPC_CACHELINE_SIZE_BYTES
+#define SLPC_SHARED_DATA_SIZE_BYTE_PLATFORM_INFO SLPC_CACHELINE_SIZE_BYTES
+#define SLPC_SHARED_DATA_SIZE_BYTE_TASK_STATE SLPC_CACHELINE_SIZE_BYTES
+#define SLPC_SHARED_DATA_MODE_DEFN_TABLE_SIZE SLPC_PAGE_SIZE_BYTES
+#define SLPC_SHARED_DATA_SIZE_BYTE_MAX (2 * SLPC_PAGE_SIZE_BYTES)
+
+/*
+ * Cacheline size aligned (total size needed for
+ * SLPC_MAX_OVERRIDE_PARAMETERS = 256 is 1088 bytes:
+ * 1024 bytes of values + 32 bytes of bitfield, rounded
+ * up to a whole number of 64-byte cachelines)
+ */
+#define SLPC_OVERRIDE_PARAMS_TOTAL_BYTES (((((SLPC_MAX_OVERRIDE_PARAMETERS * 4) \
+ + ((SLPC_MAX_OVERRIDE_PARAMETERS / 32) * 4)) \
+ + (SLPC_CACHELINE_SIZE_BYTES - 1)) / SLPC_CACHELINE_SIZE_BYTES) * \
+ SLPC_CACHELINE_SIZE_BYTES)
+
+#define SLPC_SHARED_DATA_SIZE_BYTE_OTHER (SLPC_SHARED_DATA_SIZE_BYTE_MAX - \
+ (SLPC_SHARED_DATA_SIZE_BYTE_HEADER \
+ + SLPC_SHARED_DATA_SIZE_BYTE_PLATFORM_INFO \
+ + SLPC_SHARED_DATA_SIZE_BYTE_TASK_STATE \
+ + SLPC_OVERRIDE_PARAMS_TOTAL_BYTES \
+ + SLPC_SHARED_DATA_MODE_DEFN_TABLE_SIZE))
+
+enum slpc_task_enable {
+ SLPC_PARAM_TASK_DEFAULT = 0,
+ SLPC_PARAM_TASK_ENABLED,
+ SLPC_PARAM_TASK_DISABLED,
+ SLPC_PARAM_TASK_UNKNOWN
+};
+
+enum slpc_global_state {
+ SLPC_GLOBAL_STATE_NOT_RUNNING = 0,
+ SLPC_GLOBAL_STATE_INITIALIZING = 1,
+ SLPC_GLOBAL_STATE_RESETTING = 2,
+ SLPC_GLOBAL_STATE_RUNNING = 3,
+ SLPC_GLOBAL_STATE_SHUTTING_DOWN = 4,
+ SLPC_GLOBAL_STATE_ERROR = 5
+};
+
+enum slpc_param_id {
+ SLPC_PARAM_TASK_ENABLE_GTPERF = 0,
+ SLPC_PARAM_TASK_DISABLE_GTPERF = 1,
+ SLPC_PARAM_TASK_ENABLE_BALANCER = 2,
+ SLPC_PARAM_TASK_DISABLE_BALANCER = 3,
+ SLPC_PARAM_TASK_ENABLE_DCC = 4,
+ SLPC_PARAM_TASK_DISABLE_DCC = 5,
+ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ = 6,
+ SLPC_PARAM_GLOBAL_MAX_GT_UNSLICE_FREQ_MHZ = 7,
+ SLPC_PARAM_GLOBAL_MIN_GT_SLICE_FREQ_MHZ = 8,
+ SLPC_PARAM_GLOBAL_MAX_GT_SLICE_FREQ_MHZ = 9,
+ SLPC_PARAM_GTPERF_THRESHOLD_MAX_FPS = 10,
+ SLPC_PARAM_GLOBAL_DISABLE_GT_FREQ_MANAGEMENT = 11,
+ SLPC_PARAM_GTPERF_ENABLE_FRAMERATE_STALLING = 12,
+ SLPC_PARAM_GLOBAL_DISABLE_RC6_MODE_CHANGE = 13,
+ SLPC_PARAM_GLOBAL_OC_UNSLICE_FREQ_MHZ = 14,
+ SLPC_PARAM_GLOBAL_OC_SLICE_FREQ_MHZ = 15,
+ SLPC_PARAM_GLOBAL_ENABLE_IA_GT_BALANCING = 16,
+ SLPC_PARAM_GLOBAL_ENABLE_ADAPTIVE_BURST_TURBO = 17,
+ SLPC_PARAM_GLOBAL_ENABLE_EVAL_MODE = 18,
+ SLPC_PARAM_GLOBAL_ENABLE_BALANCER_IN_NON_GAMING_MODE = 19,
+ SLPC_PARAM_GLOBAL_RT_MODE_TURBO_FREQ_DELTA_MHZ = 20,
+ SLPC_PARAM_PWRGATE_RC_MODE = 21,
+ SLPC_PARAM_EDR_MODE_COMPUTE_TIMEOUT_MS = 22,
+ SLPC_PARAM_EDR_QOS_FREQ_MHZ = 23,
+ SLPC_PARAM_MEDIA_FF_RATIO_MODE = 24,
+ SLPC_PARAM_ENABLE_IA_FREQ_LIMITING = 25,
+ SLPC_PARAM_STRATEGIES = 26,
+ SLPC_PARAM_POWER_PROFILE = 27,
+ SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY = 28,
+ SLPC_MAX_PARAM = 32,
+};
+
+enum slpc_event_id {
+ SLPC_EVENT_RESET = 0,
+ SLPC_EVENT_SHUTDOWN = 1,
+ SLPC_EVENT_PLATFORM_INFO_CHANGE = 2,
+ SLPC_EVENT_DISPLAY_MODE_CHANGE = 3,
+ SLPC_EVENT_FLIP_COMPLETE = 4,
+ SLPC_EVENT_QUERY_TASK_STATE = 5,
+ SLPC_EVENT_PARAMETER_SET = 6,
+ SLPC_EVENT_PARAMETER_UNSET = 7,
+};
+
+struct slpc_task_state_data {
+ union {
+ u32 task_status_padding;
+ struct {
+ u32 status;
+#define SLPC_GTPERF_TASK_ENABLED REG_BIT(0)
+#define SLPC_DCC_TASK_ENABLED REG_BIT(11)
+#define SLPC_IN_DCC REG_BIT(12)
+#define SLPC_BALANCER_ENABLED REG_BIT(15)
+#define SLPC_IBC_TASK_ENABLED REG_BIT(16)
+#define SLPC_BALANCER_IA_LMT_ENABLED REG_BIT(17)
+#define SLPC_BALANCER_IA_LMT_ACTIVE REG_BIT(18)
+ };
+ };
+ union {
+ u32 freq_padding;
+ struct {
+#define SLPC_MAX_UNSLICE_FREQ_MASK REG_GENMASK(7, 0)
+#define SLPC_MIN_UNSLICE_FREQ_MASK REG_GENMASK(15, 8)
+#define SLPC_MAX_SLICE_FREQ_MASK REG_GENMASK(23, 16)
+#define SLPC_MIN_SLICE_FREQ_MASK REG_GENMASK(31, 24)
+ u32 freq;
+ };
+ };
+} __packed;
+
+struct slpc_shared_data_header {
+ /* Total size in bytes of this shared buffer. */
+ u32 size;
+ u32 global_state;
+ u32 display_data_addr;
+} __packed;
+
+struct slpc_override_params {
+ u32 bits[SLPC_OVERRIDE_BITFIELD_SIZE];
+ u32 values[SLPC_MAX_OVERRIDE_PARAMETERS];
+} __packed;
+
+struct slpc_shared_data {
+ struct slpc_shared_data_header header;
+ u8 shared_data_header_pad[SLPC_SHARED_DATA_SIZE_BYTE_HEADER -
+ sizeof(struct slpc_shared_data_header)];
+
+ u8 platform_info_pad[SLPC_SHARED_DATA_SIZE_BYTE_PLATFORM_INFO];
+
+ struct slpc_task_state_data task_state_data;
+ u8 task_state_data_pad[SLPC_SHARED_DATA_SIZE_BYTE_TASK_STATE -
+ sizeof(struct slpc_task_state_data)];
+
+ struct slpc_override_params override_params;
+ u8 override_params_pad[SLPC_OVERRIDE_PARAMS_TOTAL_BYTES -
+ sizeof(struct slpc_override_params)];
+
+ u8 shared_data_pad[SLPC_SHARED_DATA_SIZE_BYTE_OTHER];
+
+ /* PAGE 2 (4096 bytes), mode based parameter will be removed soon */
+ u8 reserved_mode_definition[4096];
+} __packed;
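A quick size check: the pad arrays are dimensioned so the structure comes out at exactly SLPC_SHARED_DATA_SIZE_BYTE_MAX — 64 (header) + 64 (platform info) + 64 (task state) + 1088 (override params) + 2816 (other) + 4096 (mode definition page) = 8192 bytes, i.e. the two 4 KiB pages described in the table above.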
+
+/**
+ * DOC: SLPC H2G MESSAGE FORMAT
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN = GUC_HXG_ORIGIN_HOST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:16 | DATA0 = MBZ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:0 | ACTION = _`GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST` = 0x3003 |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:8 | **EVENT_ID** |
+ * + +-------+--------------------------------------------------------------+
+ * | | 7:0 | **EVENT_ARGC** - number of data arguments |
+ * +---+-------+--------------------------------------------------------------+
+ * | 2 | 31:0 | **EVENT_DATA1** |
+ * +---+-------+--------------------------------------------------------------+
+ * |...| 31:0 | ... |
+ * +---+-------+--------------------------------------------------------------+
+ * |2+n| 31:0 | **EVENT_DATAn** |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST 0x3003
+
+#define HOST2GUC_PC_SLPC_REQUEST_MSG_MIN_LEN \
+ (GUC_HXG_REQUEST_MSG_MIN_LEN + 1u)
+#define HOST2GUC_PC_SLPC_EVENT_MAX_INPUT_ARGS 9
+#define HOST2GUC_PC_SLPC_REQUEST_MSG_MAX_LEN \
+ (HOST2GUC_PC_SLPC_REQUEST_MSG_MIN_LEN + \
+ HOST2GUC_PC_SLPC_EVENT_MAX_INPUT_ARGS)
+#define HOST2GUC_PC_SLPC_REQUEST_MSG_0_MBZ GUC_HXG_REQUEST_MSG_0_DATA0
+#define HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ID (0xff << 8)
+#define HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ARGC (0xff << 0)
+#define HOST2GUC_PC_SLPC_REQUEST_MSG_N_EVENT_DATA_N GUC_HXG_REQUEST_MSG_n_DATAn
+
+#endif
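As a rough usage sketch — an illustration of the layout, not code from this patch — a two-argument SLPC event such as Parameter Set would be packed per the table above roughly as follows; the parameter id and the 300 MHz value are made-up arguments:

    u32 request[] = {
            FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
            FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
            FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION,
                       GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST),
            FIELD_PREP(HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ID,
                       SLPC_EVENT_PARAMETER_SET) |
            FIELD_PREP(HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ARGC, 2),
            SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,  /* EVENT_DATA1: id */
            300,                                        /* EVENT_DATA2: MHz */
    };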
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index d38935f47ecf..99e1fad5ca20 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -7,6 +7,111 @@
#define _ABI_GUC_COMMUNICATION_CTB_ABI_H
#include <linux/types.h>
+#include <linux/build_bug.h>
+
+#include "guc_messages_abi.h"
+
+/**
+ * DOC: CT Buffer
+ *
+ * Circular buffer used to send `CTB Message`_
+ */
+
+/**
+ * DOC: CTB Descriptor
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31:0 | **HEAD** - offset (in dwords) to the last dword that was |
+ * | | | read from the `CT Buffer`_. |
+ * | | | It can only be updated by the receiver. |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | **TAIL** - offset (in dwords) to the last dword that was |
+ * | | | written to the `CT Buffer`_. |
+ * | | | It can only be updated by the sender. |
+ * +---+-------+--------------------------------------------------------------+
+ * | 2 | 31:0 | **STATUS** - status of the CTB |
+ * | | | |
+ * | | | - _`GUC_CTB_STATUS_NO_ERROR` = 0 (normal operation) |
+ * | | | - _`GUC_CTB_STATUS_OVERFLOW` = 1 (head/tail too large) |
+ * | | | - _`GUC_CTB_STATUS_UNDERFLOW` = 2 (truncated message) |
+ * | | | - _`GUC_CTB_STATUS_MISMATCH` = 4 (head/tail modified) |
+ * +---+-------+--------------------------------------------------------------+
+ * |...| | RESERVED = MBZ |
+ * +---+-------+--------------------------------------------------------------+
+ * | 15| 31:0 | RESERVED = MBZ |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+struct guc_ct_buffer_desc {
+ u32 head;
+ u32 tail;
+ u32 status;
+#define GUC_CTB_STATUS_NO_ERROR 0
+#define GUC_CTB_STATUS_OVERFLOW (1 << 0)
+#define GUC_CTB_STATUS_UNDERFLOW (1 << 1)
+#define GUC_CTB_STATUS_MISMATCH (1 << 2)
+ u32 reserved[13];
+} __packed;
+static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
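The static_assert pins the descriptor to exactly one 64-byte cacheline: three live dwords plus thirteen reserved ones make sixteen dwords, i.e. 64 bytes.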
+
+/**
+ * DOC: CTB Message
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31:16 | **FENCE** - message identifier |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:12 | **FORMAT** - format of the CTB message |
+ * | | | - _`GUC_CTB_FORMAT_HXG` = 0 - see `CTB HXG Message`_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 11:8 | **RESERVED** |
+ * | +-------+--------------------------------------------------------------+
+ * | | 7:0 | **NUM_DWORDS** - length of the CTB message (w/o header) |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | optional (depends on FORMAT) |
+ * +---+-------+ |
+ * |...| | |
+ * +---+-------+ |
+ * | n | 31:0 | |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_CTB_HDR_LEN 1u
+#define GUC_CTB_MSG_MIN_LEN GUC_CTB_HDR_LEN
+#define GUC_CTB_MSG_MAX_LEN 256u
+#define GUC_CTB_MSG_0_FENCE (0xffff << 16)
+#define GUC_CTB_MSG_0_FORMAT (0xf << 12)
+#define GUC_CTB_FORMAT_HXG 0u
+#define GUC_CTB_MSG_0_RESERVED (0xf << 8)
+#define GUC_CTB_MSG_0_NUM_DWORDS (0xff << 0)
+
+/**
+ * DOC: CTB HXG Message
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31:16 | FENCE |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:12 | FORMAT = GUC_CTB_FORMAT_HXG_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 11:8 | RESERVED = MBZ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 7:0 | NUM_DWORDS = length (in dwords) of the embedded HXG message |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | +--------------------------------------------------------+ |
+ * +---+-------+ | | |
+ * |...| | | Embedded `HXG Message`_ | |
+ * +---+-------+ | | |
+ * | n | 31:0 | +--------------------------------------------------------+ |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_CTB_HXG_MSG_MIN_LEN (GUC_CTB_MSG_MIN_LEN + GUC_HXG_MSG_MIN_LEN)
+#define GUC_CTB_HXG_MSG_MAX_LEN GUC_CTB_MSG_MAX_LEN
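A hedged sketch of forming dword 0 of a `CTB Message`_ that wraps an HXG payload, per the layout above; fence and hxg_len are hypothetical locals:

    u32 header = FIELD_PREP(GUC_CTB_MSG_0_FORMAT, GUC_CTB_FORMAT_HXG) |
                 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence) |
                 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, hxg_len); /* w/o header */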
/**
* DOC: CTB based communication
@@ -61,28 +166,6 @@
*/
/*
- * Describes single command transport buffer.
- * Used by both guc-master and clients.
- */
-struct guc_ct_buffer_desc {
- u32 addr; /* gfx address */
- u64 host_private; /* host private data */
- u32 size; /* size in bytes */
- u32 head; /* offset updated by GuC*/
- u32 tail; /* offset updated by owner */
- u32 is_in_error; /* error indicator */
- u32 reserved1;
- u32 reserved2;
- u32 owner; /* id of the channel owner */
- u32 owner_sub_id; /* owner-defined field for extra tracking */
- u32 reserved[5];
-} __packed;
-
-/* Type of command transport buffer */
-#define INTEL_GUC_CT_BUFFER_TYPE_SEND 0x0u
-#define INTEL_GUC_CT_BUFFER_TYPE_RECV 0x1u
-
-/*
* Definition of the command transport message header (DW0)
*
* bit[4..0] message len (in dwords)
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
index be066a62e9e0..bbf1ddb77434 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_mmio_abi.h
@@ -7,46 +7,43 @@
#define _ABI_GUC_COMMUNICATION_MMIO_ABI_H
/**
- * DOC: MMIO based communication
+ * DOC: GuC MMIO based communication
*
- * The MMIO based communication between Host and GuC uses software scratch
- * registers, where first register holds data treated as message header,
- * and other registers are used to hold message payload.
+ * The MMIO based communication between Host and GuC relies on special
+ * hardware registers whose format can be defined by the software
+ * (so-called scratch registers).
*
- * For Gen9+, GuC uses software scratch registers 0xC180-0xC1B8,
- * but no H2G command takes more than 8 parameters and the GuC FW
- * itself uses an 8-element array to store the H2G message.
+ * Each MMIO based message, whether Host to GuC (H2G) or GuC to Host (G2H),
+ * is written directly into those scratch registers; the maximum message
+ * length depends on the number of available registers.
*
- * +-----------+---------+---------+---------+
- * | MMIO[0] | MMIO[1] | ... | MMIO[n] |
- * +-----------+---------+---------+---------+
- * | header | optional payload |
- * +======+====+=========+=========+=========+
- * | 31:28|type| | | |
- * +------+----+ | | |
- * | 27:16|data| | | |
- * +------+----+ | | |
- * | 15:0|code| | | |
- * +------+----+---------+---------+---------+
+ * For Gen9+, there are 16 software scratch registers 0xC180-0xC1B8,
+ * but no H2G command takes more than 4 parameters and the GuC firmware
+ * itself uses a 4-element array to store the H2G message.
*
- * The message header consists of:
+ * For Gen11+, there are 4 additional registers 0x190240-0x19024C which,
+ * despite the lower count, are preferred over the legacy ones.
*
- * - **type**, indicates message type
- * - **code**, indicates message code, is specific for **type**
- * - **data**, indicates message data, optional, depends on **code**
- *
- * The following message **types** are supported:
- *
- * - **REQUEST**, indicates Host-to-GuC request, requested GuC action code
- * must be priovided in **code** field. Optional action specific parameters
- * can be provided in remaining payload registers or **data** field.
- *
- * - **RESPONSE**, indicates GuC-to-Host response from earlier GuC request,
- * action response status will be provided in **code** field. Optional
- * response data can be returned in remaining payload registers or **data**
- * field.
+ * The MMIO based communication is mainly used during the driver
+ * initialization phase to set up the `CTB based communication`_ that
+ * will be used afterwards.
*/
-#define GUC_MAX_MMIO_MSG_LEN 8
+#define GUC_MAX_MMIO_MSG_LEN 4
+
+/**
+ * DOC: MMIO HXG Message
+ *
+ * Format of the MMIO messages follows definitions of `HXG Message`_.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31:0 | +--------------------------------------------------------+ |
+ * +---+-------+ | | |
+ * |...| | | Embedded `HXG Message`_ | |
+ * +---+-------+ | | |
+ * | n | 31:0 | +--------------------------------------------------------+ |
+ * +---+-------+--------------------------------------------------------------+
+ */
#endif /* _ABI_GUC_COMMUNICATION_MMIO_ABI_H */
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
index 775e21f3058c..29ac823acd4c 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_messages_abi.h
@@ -6,6 +6,219 @@
#ifndef _ABI_GUC_MESSAGES_ABI_H
#define _ABI_GUC_MESSAGES_ABI_H
+/**
+ * DOC: HXG Message
+ *
+ * All messages exchanged with GuC are defined using 32 bit dwords.
+ * First dword is treated as a message header. Remaining dwords are optional.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | | | |
+ * | 0 | 31 | **ORIGIN** - originator of the message |
+ * | | | - _`GUC_HXG_ORIGIN_HOST` = 0 |
+ * | | | - _`GUC_HXG_ORIGIN_GUC` = 1 |
+ * | | | |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | **TYPE** - message type |
+ * | | | - _`GUC_HXG_TYPE_REQUEST` = 0 |
+ * | | | - _`GUC_HXG_TYPE_EVENT` = 1 |
+ * | | | - _`GUC_HXG_TYPE_NO_RESPONSE_BUSY` = 3 |
+ * | | | - _`GUC_HXG_TYPE_NO_RESPONSE_RETRY` = 5 |
+ * | | | - _`GUC_HXG_TYPE_RESPONSE_FAILURE` = 6 |
+ * | | | - _`GUC_HXG_TYPE_RESPONSE_SUCCESS` = 7 |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:0 | **AUX** - auxiliary data (depends on TYPE) |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | |
+ * +---+-------+ |
+ * |...| | **PAYLOAD** - optional payload (depends on TYPE) |
+ * +---+-------+ |
+ * | n | 31:0 | |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_MSG_MIN_LEN 1u
+#define GUC_HXG_MSG_0_ORIGIN (0x1 << 31)
+#define GUC_HXG_ORIGIN_HOST 0u
+#define GUC_HXG_ORIGIN_GUC 1u
+#define GUC_HXG_MSG_0_TYPE (0x7 << 28)
+#define GUC_HXG_TYPE_REQUEST 0u
+#define GUC_HXG_TYPE_EVENT 1u
+#define GUC_HXG_TYPE_NO_RESPONSE_BUSY 3u
+#define GUC_HXG_TYPE_NO_RESPONSE_RETRY 5u
+#define GUC_HXG_TYPE_RESPONSE_FAILURE 6u
+#define GUC_HXG_TYPE_RESPONSE_SUCCESS 7u
+#define GUC_HXG_MSG_0_AUX (0xfffffff << 0)
+#define GUC_HXG_MSG_n_PAYLOAD (0xffffffff << 0)
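Given these masks, a receiver could crack an incoming HXG header along these lines (illustrative, not from the patch):

    u32 hdr = msg[0];

    if (FIELD_GET(GUC_HXG_MSG_0_ORIGIN, hdr) == GUC_HXG_ORIGIN_GUC &&
        FIELD_GET(GUC_HXG_MSG_0_TYPE, hdr) == GUC_HXG_TYPE_RESPONSE_SUCCESS)
            data0 = FIELD_GET(GUC_HXG_MSG_0_AUX, hdr); /* response DATA0 */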
+
+/**
+ * DOC: HXG Request
+ *
+ * The `HXG Request`_ message should be used to initiate synchronous activity
+ * for which confirmation or return data is expected.
+ *
+ * The recipient of this message shall use `HXG Response`_, `HXG Failure`_
+ * or `HXG Retry`_ message as a definite reply, and may use `HXG Busy`_
+ * message as an intermediate reply.
+ *
+ * Format of @DATA0 and all @DATAn fields depends on the @ACTION code.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_REQUEST_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:16 | **DATA0** - request data (depends on ACTION) |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:0 | **ACTION** - requested action code |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | |
+ * +---+-------+ |
+ * |...| | **DATAn** - optional data (depends on ACTION) |
+ * +---+-------+ |
+ * | n | 31:0 | |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_REQUEST_MSG_MIN_LEN GUC_HXG_MSG_MIN_LEN
+#define GUC_HXG_REQUEST_MSG_0_DATA0 (0xfff << 16)
+#define GUC_HXG_REQUEST_MSG_0_ACTION (0xffff << 0)
+#define GUC_HXG_REQUEST_MSG_n_DATAn GUC_HXG_MSG_n_PAYLOAD
+
+/**
+ * DOC: HXG Event
+ *
+ * The `HXG Event`_ message should be used to initiate asynchronous activity
+ * that does not involve an immediate confirmation or return data.
+ *
+ * Format of @DATA0 and all @DATAn fields depends on the @ACTION code.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_EVENT_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:16 | **DATA0** - event data (depends on ACTION) |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:0 | **ACTION** - event action code |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | |
+ * +---+-------+ |
+ * |...| | **DATAn** - optional event data (depends on ACTION) |
+ * +---+-------+ |
+ * | n | 31:0 | |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_EVENT_MSG_MIN_LEN GUC_HXG_MSG_MIN_LEN
+#define GUC_HXG_EVENT_MSG_0_DATA0 (0xfff << 16)
+#define GUC_HXG_EVENT_MSG_0_ACTION (0xffff << 0)
+#define GUC_HXG_EVENT_MSG_n_DATAn GUC_HXG_MSG_n_PAYLOAD
+
+/**
+ * DOC: HXG Busy
+ *
+ * The `HXG Busy`_ message may be used to acknowledge reception of the `HXG Request`_
+ * message if the recipient expects that its processing will take longer
+ * than the default timeout.
+ *
+ * The @COUNTER field may be used as a progress indicator.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_NO_RESPONSE_BUSY_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:0 | **COUNTER** - progress indicator |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_BUSY_MSG_LEN GUC_HXG_MSG_MIN_LEN
+#define GUC_HXG_BUSY_MSG_0_COUNTER GUC_HXG_MSG_0_AUX
+
+/**
+ * DOC: HXG Retry
+ *
+ * The `HXG Retry`_ message should be used by the recipient to indicate that
+ * the `HXG Request`_ message was dropped and should be resent.
+ *
+ * The @REASON field may be used to provide additional information.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_NO_RESPONSE_RETRY_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:0 | **REASON** - reason for retry |
+ * | | | - _`GUC_HXG_RETRY_REASON_UNSPECIFIED` = 0 |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_RETRY_MSG_LEN GUC_HXG_MSG_MIN_LEN
+#define GUC_HXG_RETRY_MSG_0_REASON GUC_HXG_MSG_0_AUX
+#define GUC_HXG_RETRY_REASON_UNSPECIFIED 0u
+
+/**
+ * DOC: HXG Failure
+ *
+ * The `HXG Failure`_ message shall be used as a reply to the `HXG Request`_
+ * message that could not be processed due to an error.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_FAILURE_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:16 | **HINT** - additional error hint |
+ * | +-------+--------------------------------------------------------------+
+ * | | 15:0 | **ERROR** - error/result code |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_FAILURE_MSG_LEN GUC_HXG_MSG_MIN_LEN
+#define GUC_HXG_FAILURE_MSG_0_HINT (0xfff << 16)
+#define GUC_HXG_FAILURE_MSG_0_ERROR (0xffff << 0)
+
+/**
+ * DOC: HXG Response
+ *
+ * The `HXG Response`_ message shall be used as a reply to the `HXG Request`_
+ * message that was successfully processed without an error.
+ *
+ * +---+-------+--------------------------------------------------------------+
+ * | | Bits | Description |
+ * +===+=======+==============================================================+
+ * | 0 | 31 | ORIGIN |
+ * | +-------+--------------------------------------------------------------+
+ * | | 30:28 | TYPE = GUC_HXG_TYPE_RESPONSE_SUCCESS_ |
+ * | +-------+--------------------------------------------------------------+
+ * | | 27:0 | **DATA0** - data (depends on ACTION from `HXG Request`_) |
+ * +---+-------+--------------------------------------------------------------+
+ * | 1 | 31:0 | |
+ * +---+-------+ |
+ * |...| | **DATAn** - data (depends on ACTION from `HXG Request`_) |
+ * +---+-------+ |
+ * | n | 31:0 | |
+ * +---+-------+--------------------------------------------------------------+
+ */
+
+#define GUC_HXG_RESPONSE_MSG_MIN_LEN GUC_HXG_MSG_MIN_LEN
+#define GUC_HXG_RESPONSE_MSG_0_DATA0 GUC_HXG_MSG_0_AUX
+#define GUC_HXG_RESPONSE_MSG_n_DATAn GUC_HXG_MSG_n_PAYLOAD
+
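The TYPE field is what lets a sender classify any of the replies above. A minimal
decode sketch (illustrative only, assuming the GUC_HXG_* definitions in this file,
with msg0 being the first received dword):

	switch (FIELD_GET(GUC_HXG_MSG_0_TYPE, msg0)) {
	case GUC_HXG_TYPE_NO_RESPONSE_BUSY:
		/* still processing; COUNTER may indicate progress */
		break;
	case GUC_HXG_TYPE_NO_RESPONSE_RETRY:
		/* request was dropped; resend it, REASON says why */
		break;
	case GUC_HXG_TYPE_RESPONSE_FAILURE:
		/* ERROR and HINT describe what went wrong */
		break;
	case GUC_HXG_TYPE_RESPONSE_SUCCESS:
		/* DATA0 (and any DATAn payload) carry the result */
		break;
	}

The reworked MMIO send path in intel_guc.c below handles exactly these four cases.
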
+/* deprecated */
#define INTEL_GUC_MSG_TYPE_SHIFT 28
#define INTEL_GUC_MSG_TYPE_MASK (0xF << INTEL_GUC_MSG_TYPE_SHIFT)
#define INTEL_GUC_MSG_DATA_SHIFT 16
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index f147cb389a20..fbfcae727d7f 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -7,6 +7,7 @@
#include "gt/intel_gt_irq.h"
#include "gt/intel_gt_pm_irq.h"
#include "intel_guc.h"
+#include "intel_guc_slpc.h"
#include "intel_guc_ads.h"
#include "intel_guc_submission.h"
#include "i915_drv.h"
@@ -157,6 +158,8 @@ void intel_guc_init_early(struct intel_guc *guc)
intel_guc_ct_init_early(&guc->ct);
intel_guc_log_init_early(&guc->log);
intel_guc_submission_init_early(guc);
+ intel_guc_slpc_init_early(&guc->slpc);
+ intel_guc_rc_init_early(guc);
mutex_init(&guc->send_mutex);
spin_lock_init(&guc->irq_lock);
@@ -180,6 +183,11 @@ void intel_guc_init_early(struct intel_guc *guc)
}
}
+void intel_guc_init_late(struct intel_guc *guc)
+{
+ intel_guc_ads_init_late(guc);
+}
+
static u32 guc_ctl_debug_flags(struct intel_guc *guc)
{
u32 level = intel_guc_log_get_level(&guc->log);
@@ -201,6 +209,9 @@ static u32 guc_ctl_feature_flags(struct intel_guc *guc)
if (!intel_guc_submission_is_used(guc))
flags |= GUC_CTL_DISABLE_SCHEDULER;
+ if (intel_guc_slpc_is_used(guc))
+ flags |= GUC_CTL_ENABLE_SLPC;
+
return flags;
}
@@ -219,24 +230,19 @@ static u32 guc_ctl_log_params_flags(struct intel_guc *guc)
BUILD_BUG_ON(!CRASH_BUFFER_SIZE);
BUILD_BUG_ON(!IS_ALIGNED(CRASH_BUFFER_SIZE, UNIT));
- BUILD_BUG_ON(!DPC_BUFFER_SIZE);
- BUILD_BUG_ON(!IS_ALIGNED(DPC_BUFFER_SIZE, UNIT));
- BUILD_BUG_ON(!ISR_BUFFER_SIZE);
- BUILD_BUG_ON(!IS_ALIGNED(ISR_BUFFER_SIZE, UNIT));
+ BUILD_BUG_ON(!DEBUG_BUFFER_SIZE);
+ BUILD_BUG_ON(!IS_ALIGNED(DEBUG_BUFFER_SIZE, UNIT));
BUILD_BUG_ON((CRASH_BUFFER_SIZE / UNIT - 1) >
(GUC_LOG_CRASH_MASK >> GUC_LOG_CRASH_SHIFT));
- BUILD_BUG_ON((DPC_BUFFER_SIZE / UNIT - 1) >
- (GUC_LOG_DPC_MASK >> GUC_LOG_DPC_SHIFT));
- BUILD_BUG_ON((ISR_BUFFER_SIZE / UNIT - 1) >
- (GUC_LOG_ISR_MASK >> GUC_LOG_ISR_SHIFT));
+ BUILD_BUG_ON((DEBUG_BUFFER_SIZE / UNIT - 1) >
+ (GUC_LOG_DEBUG_MASK >> GUC_LOG_DEBUG_SHIFT));
flags = GUC_LOG_VALID |
GUC_LOG_NOTIFY_ON_HALF_FULL |
FLAG |
((CRASH_BUFFER_SIZE / UNIT - 1) << GUC_LOG_CRASH_SHIFT) |
- ((DPC_BUFFER_SIZE / UNIT - 1) << GUC_LOG_DPC_SHIFT) |
- ((ISR_BUFFER_SIZE / UNIT - 1) << GUC_LOG_ISR_SHIFT) |
+ ((DEBUG_BUFFER_SIZE / UNIT - 1) << GUC_LOG_DEBUG_SHIFT) |
(offset << GUC_LOG_BUF_ADDR_SHIFT);
#undef UNIT
@@ -331,6 +337,12 @@ int intel_guc_init(struct intel_guc *guc)
goto err_ct;
}
+ if (intel_guc_slpc_is_used(guc)) {
+ ret = intel_guc_slpc_init(&guc->slpc);
+ if (ret)
+ goto err_submission;
+ }
+
/* now that everything is perma-pinned, initialize the parameters */
guc_init_params(guc);
@@ -341,6 +353,8 @@ int intel_guc_init(struct intel_guc *guc)
return 0;
+err_submission:
+ intel_guc_submission_fini(guc);
err_ct:
intel_guc_ct_fini(&guc->ct);
err_ads:
@@ -363,6 +377,9 @@ void intel_guc_fini(struct intel_guc *guc)
i915_ggtt_disable_guc(gt->ggtt);
+ if (intel_guc_slpc_is_used(guc))
+ intel_guc_slpc_fini(&guc->slpc);
+
if (intel_guc_submission_is_used(guc))
intel_guc_submission_fini(guc);
@@ -376,29 +393,27 @@ void intel_guc_fini(struct intel_guc *guc)
/*
* This function implements the MMIO based host to GuC interface.
*/
-int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len,
+int intel_guc_send_mmio(struct intel_guc *guc, const u32 *request, u32 len,
u32 *response_buf, u32 response_buf_size)
{
+ struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
struct intel_uncore *uncore = guc_to_gt(guc)->uncore;
- u32 status;
+ u32 header;
int i;
int ret;
GEM_BUG_ON(!len);
GEM_BUG_ON(len > guc->send_regs.count);
- /* We expect only action code */
- GEM_BUG_ON(*action & ~INTEL_GUC_MSG_CODE_MASK);
-
- /* If CT is available, we expect to use MMIO only during init/fini */
- GEM_BUG_ON(*action != INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER &&
- *action != INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER);
+ GEM_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, request[0]) != GUC_HXG_ORIGIN_HOST);
+ GEM_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_TYPE, request[0]) != GUC_HXG_TYPE_REQUEST);
mutex_lock(&guc->send_mutex);
intel_uncore_forcewake_get(uncore, guc->send_regs.fw_domains);
+retry:
for (i = 0; i < len; i++)
- intel_uncore_write(uncore, guc_send_reg(guc, i), action[i]);
+ intel_uncore_write(uncore, guc_send_reg(guc, i), request[i]);
intel_uncore_posting_read(uncore, guc_send_reg(guc, i - 1));
@@ -410,30 +425,74 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len,
*/
ret = __intel_wait_for_register_fw(uncore,
guc_send_reg(guc, 0),
- INTEL_GUC_MSG_TYPE_MASK,
- INTEL_GUC_MSG_TYPE_RESPONSE <<
- INTEL_GUC_MSG_TYPE_SHIFT,
- 10, 10, &status);
- /* If GuC explicitly returned an error, convert it to -EIO */
- if (!ret && !INTEL_GUC_MSG_IS_RESPONSE_SUCCESS(status))
- ret = -EIO;
+ GUC_HXG_MSG_0_ORIGIN,
+ FIELD_PREP(GUC_HXG_MSG_0_ORIGIN,
+ GUC_HXG_ORIGIN_GUC),
+ 10, 10, &header);
+ if (unlikely(ret)) {
+timeout:
+ drm_err(&i915->drm, "mmio request %#x: no reply %x\n",
+ request[0], header);
+ goto out;
+ }
- if (ret) {
- DRM_ERROR("MMIO: GuC action %#x failed with error %d %#x\n",
- action[0], ret, status);
+ if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == GUC_HXG_TYPE_NO_RESPONSE_BUSY) {
+#define done ({ header = intel_uncore_read(uncore, guc_send_reg(guc, 0)); \
+ FIELD_GET(GUC_HXG_MSG_0_ORIGIN, header) != GUC_HXG_ORIGIN_GUC || \
+ FIELD_GET(GUC_HXG_MSG_0_TYPE, header) != GUC_HXG_TYPE_NO_RESPONSE_BUSY; })
+
+ ret = wait_for(done, 1000);
+ if (unlikely(ret))
+ goto timeout;
+ if (unlikely(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, header) !=
+ GUC_HXG_ORIGIN_GUC))
+ goto proto;
+#undef done
+ }
+
+ if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == GUC_HXG_TYPE_NO_RESPONSE_RETRY) {
+ u32 reason = FIELD_GET(GUC_HXG_RETRY_MSG_0_REASON, header);
+
+ drm_dbg(&i915->drm, "mmio request %#x: retrying, reason %u\n",
+ request[0], reason);
+ goto retry;
+ }
+
+ if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) == GUC_HXG_TYPE_RESPONSE_FAILURE) {
+ u32 hint = FIELD_GET(GUC_HXG_FAILURE_MSG_0_HINT, header);
+ u32 error = FIELD_GET(GUC_HXG_FAILURE_MSG_0_ERROR, header);
+
+ drm_err(&i915->drm, "mmio request %#x: failure %x/%u\n",
+ request[0], error, hint);
+ ret = -ENXIO;
+ goto out;
+ }
+
+ if (FIELD_GET(GUC_HXG_MSG_0_TYPE, header) != GUC_HXG_TYPE_RESPONSE_SUCCESS) {
+proto:
+ drm_err(&i915->drm, "mmio request %#x: unexpected reply %#x\n",
+ request[0], header);
+ ret = -EPROTO;
goto out;
}
if (response_buf) {
- int count = min(response_buf_size, guc->send_regs.count - 1);
+ int count = min(response_buf_size, guc->send_regs.count);
+
+ GEM_BUG_ON(!count);
- for (i = 0; i < count; i++)
+ response_buf[0] = header;
+
+ for (i = 1; i < count; i++)
response_buf[i] = intel_uncore_read(uncore,
- guc_send_reg(guc, i + 1));
- }
+ guc_send_reg(guc, i));
- /* Use data from the GuC response as our return value */
- ret = INTEL_GUC_MSG_TO_DATA(status);
+ /* Use number of copied dwords as our return value */
+ ret = count;
+ } else {
+ /* Use data from the GuC response as our return value */
+ ret = FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, header);
+ }
out:
intel_uncore_forcewake_put(uncore, guc->send_regs.fw_domains);
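With the HXG asserts above, callers are now expected to hand over a fully-formed HXG
request. A hypothetical caller sketch (HYPOTHETICAL_ACTION and param are placeholders,
not real ABI values):

	u32 request[] = {
		FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
		FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
		FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION, HYPOTHETICAL_ACTION),
		param,	/* DATA1, meaning depends on the action */
	};
	int ret = intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);

The CTB registration helpers in intel_guc_ct.c later in this patch build their
requests in exactly this shape.
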
@@ -487,65 +546,35 @@ int intel_guc_auth_huc(struct intel_guc *guc, u32 rsa_offset)
*/
int intel_guc_suspend(struct intel_guc *guc)
{
- struct intel_uncore *uncore = guc_to_gt(guc)->uncore;
int ret;
- u32 status;
u32 action[] = {
- INTEL_GUC_ACTION_ENTER_S_STATE,
- GUC_POWER_D1, /* any value greater than GUC_POWER_D0 */
+ INTEL_GUC_ACTION_RESET_CLIENT,
};
- /*
- * If GuC communication is enabled but submission is not supported,
- * we do not need to suspend the GuC.
- */
- if (!intel_guc_submission_is_used(guc) || !intel_guc_is_ready(guc))
+ if (!intel_guc_is_ready(guc))
return 0;
- /*
- * The ENTER_S_STATE action queues the save/restore operation in GuC FW
- * and then returns, so waiting on the H2G is not enough to guarantee
- * GuC is done. When all the processing is done, GuC writes
- * INTEL_GUC_SLEEP_STATE_SUCCESS to scratch register 14, so we can poll
- * on that. Note that GuC does not ensure that the value in the register
- * is different from INTEL_GUC_SLEEP_STATE_SUCCESS while the action is
- * in progress so we need to take care of that ourselves as well.
- */
-
- intel_uncore_write(uncore, SOFT_SCRATCH(14),
- INTEL_GUC_SLEEP_STATE_INVALID_MASK);
-
- ret = intel_guc_send(guc, action, ARRAY_SIZE(action));
- if (ret)
- return ret;
-
- ret = __intel_wait_for_register(uncore, SOFT_SCRATCH(14),
- INTEL_GUC_SLEEP_STATE_INVALID_MASK,
- 0, 0, 10, &status);
- if (ret)
- return ret;
-
- if (status != INTEL_GUC_SLEEP_STATE_SUCCESS) {
- DRM_ERROR("GuC failed to change sleep state. "
- "action=0x%x, err=%u\n",
- action[0], status);
- return -EIO;
+ if (intel_guc_submission_is_used(guc)) {
+ /*
+ * This H2G MMIO command tears down the GuC in two steps. First it will
+ * generate a G2H CTB for every active context indicating a reset. In
+ * practice the i915 shouldn't ever get a G2H as suspend should only be
+ * called when the GPU is idle. Next, it tears down the CTBs and this
+ * H2G MMIO command completes.
+ *
+ * Don't abort on a failure code from the GuC. Keep going and do the
+ * clean up in sanitize() and re-initialisation on resume, and hopefully
+ * the error here won't be problematic.
+ */
+ ret = intel_guc_send_mmio(guc, action, ARRAY_SIZE(action), NULL, 0);
+ if (ret)
+ DRM_ERROR("GuC suspend: RESET_CLIENT action failed with error %d!\n", ret);
}
- return 0;
-}
-
-/**
- * intel_guc_reset_engine() - ask GuC to reset an engine
- * @guc: intel_guc structure
- * @engine: engine to be reset
- */
-int intel_guc_reset_engine(struct intel_guc *guc,
- struct intel_engine_cs *engine)
-{
- /* XXX: to be implemented with submission interface rework */
+ /* Signal that the GuC isn't running. */
+ intel_guc_sanitize(guc);
- return -ENODEV;
+ return 0;
}
/**
@@ -554,7 +583,12 @@ int intel_guc_reset_engine(struct intel_guc *guc,
*/
int intel_guc_resume(struct intel_guc *guc)
{
- /* XXX: to be implemented with submission interface rework */
+ /*
+ * NB: This function can still be called even if GuC submission is
+ * disabled, e.g. if GuC is enabled for HuC authentication only. Thus,
+ * if any code is later added here, it must support doing nothing
+ * if submission is disabled (as per intel_guc_suspend).
+ */
return 0;
}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4abc59f6f3cd..2e27fe59786b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -6,12 +6,16 @@
#ifndef _INTEL_GUC_H_
#define _INTEL_GUC_H_
+#include <linux/xarray.h>
+#include <linux/delay.h>
+
#include "intel_uncore.h"
#include "intel_guc_fw.h"
#include "intel_guc_fwif.h"
#include "intel_guc_ct.h"
#include "intel_guc_log.h"
#include "intel_guc_reg.h"
+#include "intel_guc_slpc_types.h"
#include "intel_uc_fw.h"
#include "i915_utils.h"
#include "i915_vma.h"
@@ -27,24 +31,47 @@ struct intel_guc {
struct intel_uc_fw fw;
struct intel_guc_log log;
struct intel_guc_ct ct;
+ struct intel_guc_slpc slpc;
+
+ /* Global engine used to submit requests to GuC */
+ struct i915_sched_engine *sched_engine;
+ struct i915_request *stalled_request;
/* intel_guc_recv interrupt related state */
spinlock_t irq_lock;
unsigned int msg_enabled_mask;
+ atomic_t outstanding_submission_g2h;
+
struct {
void (*reset)(struct intel_guc *guc);
void (*enable)(struct intel_guc *guc);
void (*disable)(struct intel_guc *guc);
} interrupts;
+ /*
+ * contexts_lock protects the pool of free guc ids and a linked list of
+ * guc ids available to be stolen
+ */
+ spinlock_t contexts_lock;
+ struct ida guc_ids;
+ struct list_head guc_id_list;
+
+ bool submission_supported;
bool submission_selected;
+ bool rc_supported;
+ bool rc_selected;
struct i915_vma *ads_vma;
struct __guc_ads_blob *ads_blob;
+ u32 ads_regset_size;
+ u32 ads_golden_ctxt_size;
- struct i915_vma *stage_desc_pool;
- void *stage_desc_pool_vaddr;
+ struct i915_vma *lrc_desc_pool;
+ void *lrc_desc_pool_vaddr;
+
+ /* guc_id to intel_context lookup */
+ struct xarray context_lookup;
/* Control params for fw initialization */
u32 params[GUC_CTL_MAX_DWORDS];
@@ -74,7 +101,15 @@ static inline struct intel_guc *log_to_guc(struct intel_guc_log *log)
static
inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
{
- return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
+ return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
+}
+
+static
+inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len,
+ u32 g2h_len_dw)
+{
+ return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
+ MAKE_SEND_FLAGS(g2h_len_dw));
}
static inline int
@@ -82,7 +117,43 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len,
u32 *response_buf, u32 response_buf_size)
{
return intel_guc_ct_send(&guc->ct, action, len,
- response_buf, response_buf_size);
+ response_buf, response_buf_size, 0);
+}
+
+static inline int intel_guc_send_busy_loop(struct intel_guc *guc,
+ const u32 *action,
+ u32 len,
+ u32 g2h_len_dw,
+ bool loop)
+{
+ int err;
+ unsigned int sleep_period_ms = 1;
+ bool not_atomic = !in_atomic() && !irqs_disabled();
+
+ /*
+ * FIXME: Have the caller pass in whether we are in an atomic context,
+ * to avoid using in_atomic(). It is likely safe here, as we check for
+ * irqs being disabled, which essentially all the spin locks in the i915
+ * ensure, but regardless this should be cleaned up.
+ */
+
+ /* No sleeping with spin locks, just busy loop */
+ might_sleep_if(loop && not_atomic);
+
+retry:
+ err = intel_guc_send_nb(guc, action, len, g2h_len_dw);
+ if (unlikely(err == -EBUSY && loop)) {
+ if (likely(not_atomic)) {
+ if (msleep_interruptible(sleep_period_ms))
+ return -EINTR;
+ sleep_period_ms = sleep_period_ms << 1;
+ } else {
+ cpu_relax();
+ }
+ goto retry;
+ }
+
+ return err;
}
static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
@@ -118,6 +189,7 @@ static inline u32 intel_guc_ggtt_offset(struct intel_guc *guc,
}
void intel_guc_init_early(struct intel_guc *guc);
+void intel_guc_init_late(struct intel_guc *guc);
void intel_guc_init_send_regs(struct intel_guc *guc);
void intel_guc_write_params(struct intel_guc *guc);
int intel_guc_init(struct intel_guc *guc);
@@ -160,9 +232,25 @@ static inline bool intel_guc_is_ready(struct intel_guc *guc)
return intel_guc_is_fw_running(guc) && intel_guc_ct_enabled(&guc->ct);
}
+static inline void intel_guc_reset_interrupts(struct intel_guc *guc)
+{
+ guc->interrupts.reset(guc);
+}
+
+static inline void intel_guc_enable_interrupts(struct intel_guc *guc)
+{
+ guc->interrupts.enable(guc);
+}
+
+static inline void intel_guc_disable_interrupts(struct intel_guc *guc)
+{
+ guc->interrupts.disable(guc);
+}
+
static inline int intel_guc_sanitize(struct intel_guc *guc)
{
intel_uc_fw_sanitize(&guc->fw);
+ intel_guc_disable_interrupts(guc);
intel_guc_ct_sanitize(&guc->ct);
guc->mmio_msg = 0;
@@ -183,8 +271,27 @@ static inline void intel_guc_disable_msg(struct intel_guc *guc, u32 mask)
spin_unlock_irq(&guc->irq_lock);
}
-int intel_guc_reset_engine(struct intel_guc *guc,
- struct intel_engine_cs *engine);
+int intel_guc_wait_for_idle(struct intel_guc *guc, long timeout);
+
+int intel_guc_deregister_done_process_msg(struct intel_guc *guc,
+ const u32 *msg, u32 len);
+int intel_guc_sched_done_process_msg(struct intel_guc *guc,
+ const u32 *msg, u32 len);
+int intel_guc_context_reset_process_msg(struct intel_guc *guc,
+ const u32 *msg, u32 len);
+int intel_guc_engine_failure_process_msg(struct intel_guc *guc,
+ const u32 *msg, u32 len);
+
+void intel_guc_find_hung_context(struct intel_engine_cs *engine);
+
+int intel_guc_global_policies_update(struct intel_guc *guc);
+
+void intel_guc_context_ban(struct intel_context *ce, struct i915_request *rq);
+
+void intel_guc_submission_reset_prepare(struct intel_guc *guc);
+void intel_guc_submission_reset(struct intel_guc *guc, bool stalled);
+void intel_guc_submission_reset_finish(struct intel_guc *guc);
+void intel_guc_submission_cancel_requests(struct intel_guc *guc);
void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 9abfbc6edbd6..6926919bcac6 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -3,8 +3,11 @@
* Copyright © 2014-2019 Intel Corporation
*/
+#include <linux/bsearch.h>
+
#include "gt/intel_gt.h"
#include "gt/intel_lrc.h"
+#include "gt/shmem_utils.h"
#include "intel_guc_ads.h"
#include "intel_guc_fwif.h"
#include "intel_uc.h"
@@ -23,10 +26,15 @@
* | guc_policies |
* +---------------------------------------+
* | guc_gt_system_info |
- * +---------------------------------------+
- * | guc_clients_info |
- * +---------------------------------------+
- * | guc_ct_pool_entry[size] |
+ * +---------------------------------------+ <== static
+ * | guc_mmio_reg[countA] (engine 0.0) |
+ * | guc_mmio_reg[countB] (engine 0.1) |
+ * | guc_mmio_reg[countC] (engine 1.0) |
+ * | ... |
+ * +---------------------------------------+ <== dynamic
+ * | padding |
+ * +---------------------------------------+ <== 4K aligned
+ * | golden contexts |
* +---------------------------------------+
* | padding |
* +---------------------------------------+ <== 4K aligned
@@ -39,18 +47,49 @@ struct __guc_ads_blob {
struct guc_ads ads;
struct guc_policies policies;
struct guc_gt_system_info system_info;
- struct guc_clients_info clients_info;
- struct guc_ct_pool_entry ct_pool[GUC_CT_POOL_SIZE];
+ /* From here on, location is dynamic! Refer to above diagram. */
+ struct guc_mmio_reg regset[0];
} __packed;
+static u32 guc_ads_regset_size(struct intel_guc *guc)
+{
+ GEM_BUG_ON(!guc->ads_regset_size);
+ return guc->ads_regset_size;
+}
+
+static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc)
+{
+ return PAGE_ALIGN(guc->ads_golden_ctxt_size);
+}
+
static u32 guc_ads_private_data_size(struct intel_guc *guc)
{
return PAGE_ALIGN(guc->fw.private_data_size);
}
+static u32 guc_ads_regset_offset(struct intel_guc *guc)
+{
+ return offsetof(struct __guc_ads_blob, regset);
+}
+
+static u32 guc_ads_golden_ctxt_offset(struct intel_guc *guc)
+{
+ u32 offset;
+
+ offset = guc_ads_regset_offset(guc) +
+ guc_ads_regset_size(guc);
+
+ return PAGE_ALIGN(offset);
+}
+
static u32 guc_ads_private_data_offset(struct intel_guc *guc)
{
- return PAGE_ALIGN(sizeof(struct __guc_ads_blob));
+ u32 offset;
+
+ offset = guc_ads_golden_ctxt_offset(guc) +
+ guc_ads_golden_ctxt_size(guc);
+
+ return PAGE_ALIGN(offset);
}
static u32 guc_ads_blob_size(struct intel_guc *guc)
@@ -59,36 +98,66 @@ static u32 guc_ads_blob_size(struct intel_guc *guc)
guc_ads_private_data_size(guc);
}
-static void guc_policy_init(struct guc_policy *policy)
+static void guc_policies_init(struct intel_guc *guc, struct guc_policies *policies)
{
- policy->execution_quantum = POLICY_DEFAULT_EXECUTION_QUANTUM_US;
- policy->preemption_time = POLICY_DEFAULT_PREEMPTION_TIME_US;
- policy->fault_time = POLICY_DEFAULT_FAULT_TIME_US;
- policy->policy_flags = 0;
+ struct intel_gt *gt = guc_to_gt(guc);
+ struct drm_i915_private *i915 = gt->i915;
+
+ policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
+ policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI;
+
+ policies->global_flags = 0;
+ if (i915->params.reset < 2)
+ policies->global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
+
+ policies->is_valid = 1;
}
-static void guc_policies_init(struct guc_policies *policies)
+void intel_guc_ads_print_policy_info(struct intel_guc *guc,
+ struct drm_printer *dp)
{
- struct guc_policy *policy;
- u32 p, i;
+ struct __guc_ads_blob *blob = guc->ads_blob;
- policies->dpc_promote_time = POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
- policies->max_num_work_items = POLICY_MAX_NUM_WI;
+ if (unlikely(!blob))
+ return;
- for (p = 0; p < GUC_CLIENT_PRIORITY_NUM; p++) {
- for (i = 0; i < GUC_MAX_ENGINE_CLASSES; i++) {
- policy = &policies->policy[p][i];
+ drm_printf(dp, "Global scheduling policies:\n");
+ drm_printf(dp, " DPC promote time = %u\n", blob->policies.dpc_promote_time);
+ drm_printf(dp, " Max num work items = %u\n", blob->policies.max_num_work_items);
+ drm_printf(dp, " Flags = %u\n", blob->policies.global_flags);
+}
- guc_policy_init(policy);
- }
- }
+static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_GLOBAL_SCHED_POLICY_CHANGE,
+ policy_offset
+ };
- policies->is_valid = 1;
+ return intel_guc_send_busy_loop(guc, action, ARRAY_SIZE(action), 0, true);
}
-static void guc_ct_pool_entries_init(struct guc_ct_pool_entry *pool, u32 num)
+int intel_guc_global_policies_update(struct intel_guc *guc)
{
- memset(pool, 0, num * sizeof(*pool));
+ struct __guc_ads_blob *blob = guc->ads_blob;
+ struct intel_gt *gt = guc_to_gt(guc);
+ intel_wakeref_t wakeref;
+ int ret;
+
+ if (!blob)
+ return -EOPNOTSUPP;
+
+ GEM_BUG_ON(!blob->ads.scheduler_policies);
+
+ guc_policies_init(guc, &blob->policies);
+
+ if (!intel_guc_is_ready(guc))
+ return 0;
+
+ with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref)
+ ret = guc_action_policies_update(guc, blob->ads.scheduler_policies);
+
+ return ret;
}
static void guc_mapping_table_init(struct intel_gt *gt,
@@ -113,53 +182,324 @@ static void guc_mapping_table_init(struct intel_gt *gt,
}
/*
- * The first 80 dwords of the register state context, containing the
- * execlists and ppgtt registers.
+ * The save/restore register list must be pre-calculated to a temporary
+ * buffer of driver defined size before it can be generated in place
+ * inside the ADS.
*/
-#define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
+#define MAX_MMIO_REGS 128 /* Arbitrary size, increase as needed */
+struct temp_regset {
+ struct guc_mmio_reg *registers;
+ u32 used;
+ u32 size;
+};
-static void __guc_ads_init(struct intel_guc *guc)
+static int guc_mmio_reg_cmp(const void *a, const void *b)
+{
+ const struct guc_mmio_reg *ra = a;
+ const struct guc_mmio_reg *rb = b;
+
+ return (int)ra->offset - (int)rb->offset;
+}
+
+static void guc_mmio_reg_add(struct temp_regset *regset,
+ u32 offset, u32 flags)
+{
+ u32 count = regset->used;
+ struct guc_mmio_reg reg = {
+ .offset = offset,
+ .flags = flags,
+ };
+ struct guc_mmio_reg *slot;
+
+ GEM_BUG_ON(count >= regset->size);
+
+ /*
+ * The mmio list is built using separate lists within the driver.
+ * It's possible that at some point we may attempt to add the same
+ * register more than once. Do not consider this an error; silently
+ * move on if the register is already in the list.
+ */
+ if (bsearch(&reg, regset->registers, count,
+ sizeof(reg), guc_mmio_reg_cmp))
+ return;
+
+ slot = &regset->registers[count];
+ regset->used++;
+ *slot = reg;
+
+ while (slot-- > regset->registers) {
+ GEM_BUG_ON(slot[0].offset == slot[1].offset);
+ if (slot[1].offset > slot[0].offset)
+ break;
+
+ swap(slot[1], slot[0]);
+ }
+}
+
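Worked example (illustrative offsets): adding 0x2000, then 0x1000, then 0x2000 again
leaves the regset as {0x1000, 0x2000} with used == 2; the bsearch() over the
already-sorted prefix rejects the duplicate, and the swap loop bubbles each genuine
insert into place so the array stays sorted for the next lookup.
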
+#define GUC_MMIO_REG_ADD(regset, reg, masked) \
+ guc_mmio_reg_add(regset, \
+ i915_mmio_reg_offset((reg)), \
+ (masked) ? GUC_REGSET_MASKED : 0)
+
+static void guc_mmio_regset_init(struct temp_regset *regset,
+ struct intel_engine_cs *engine)
+{
+ const u32 base = engine->mmio_base;
+ struct i915_wa_list *wal = &engine->wa_list;
+ struct i915_wa *wa;
+ unsigned int i;
+
+ regset->used = 0;
+
+ GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
+ GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
+ GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
+
+ for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
+ GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
+
+ /* Be extra paranoid and include all whitelist registers. */
+ for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++)
+ GUC_MMIO_REG_ADD(regset,
+ RING_FORCE_TO_NONPRIV(base, i),
+ false);
+
+ /* add in local MOCS registers */
+ for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
+ GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false);
+}
+
+static int guc_mmio_reg_state_query(struct intel_guc *guc)
{
struct intel_gt *gt = guc_to_gt(guc);
- struct drm_i915_private *i915 = gt->i915;
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+ struct temp_regset temp_set;
+ u32 total;
+
+ /*
+ * Need to actually build the list in order to filter out
+ * duplicates and other such data-dependent constructions.
+ */
+ temp_set.size = MAX_MMIO_REGS;
+ temp_set.registers = kmalloc_array(temp_set.size,
+ sizeof(*temp_set.registers),
+ GFP_KERNEL);
+ if (!temp_set.registers)
+ return -ENOMEM;
+
+ total = 0;
+ for_each_engine(engine, gt, id) {
+ guc_mmio_regset_init(&temp_set, engine);
+ total += temp_set.used;
+ }
+
+ kfree(temp_set.registers);
+
+ return total * sizeof(struct guc_mmio_reg);
+}
+
+static void guc_mmio_reg_state_init(struct intel_guc *guc,
+ struct __guc_ads_blob *blob)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+ struct temp_regset temp_set;
+ struct guc_mmio_reg_set *ads_reg_set;
+ u32 addr_ggtt, offset;
+ u8 guc_class;
+
+ offset = guc_ads_regset_offset(guc);
+ addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
+ temp_set.registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
+ temp_set.size = guc->ads_regset_size / sizeof(temp_set.registers[0]);
+
+ for_each_engine(engine, gt, id) {
+ /* Class index is checked in class converter */
+ GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
+
+ guc_class = engine_class_to_guc_class(engine->class);
+ ads_reg_set = &blob->ads.reg_state_list[guc_class][engine->instance];
+
+ guc_mmio_regset_init(&temp_set, engine);
+ if (!temp_set.used) {
+ ads_reg_set->address = 0;
+ ads_reg_set->count = 0;
+ continue;
+ }
+
+ ads_reg_set->address = addr_ggtt;
+ ads_reg_set->count = temp_set.used;
+
+ temp_set.size -= temp_set.used;
+ temp_set.registers += temp_set.used;
+ addr_ggtt += temp_set.used * sizeof(struct guc_mmio_reg);
+ }
+
+ GEM_BUG_ON(temp_set.size);
+}
+
+static void fill_engine_enable_masks(struct intel_gt *gt,
+ struct guc_gt_system_info *info)
+{
+ info->engine_enabled_masks[GUC_RENDER_CLASS] = 1;
+ info->engine_enabled_masks[GUC_BLITTER_CLASS] = 1;
+ info->engine_enabled_masks[GUC_VIDEO_CLASS] = VDBOX_MASK(gt);
+ info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt);
+}
+
+static int guc_prep_golden_context(struct intel_guc *guc,
+ struct __guc_ads_blob *blob)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ u32 addr_ggtt, offset;
+ u32 total_size = 0, alloc_size, real_size;
+ u8 engine_class, guc_class;
+ struct guc_gt_system_info *info, local_info;
+
+ /*
+ * Reserve the memory for the golden contexts and point GuC at it but
+ * leave it empty for now. The context data will be filled in later
+ * once there is something available to put there.
+ *
+ * Note that the HWSP and ring context are not included.
+ *
+ * Note also that the storage must be pinned in the GGTT, so that the
+ * address won't change after GuC has been told where to find it. The
+ * GuC will also validate that the LRC base + size fall within the
+ * allowed GGTT range.
+ */
+ if (blob) {
+ offset = guc_ads_golden_ctxt_offset(guc);
+ addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
+ info = &blob->system_info;
+ } else {
+ memset(&local_info, 0, sizeof(local_info));
+ info = &local_info;
+ fill_engine_enable_masks(gt, info);
+ }
+
+ for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) {
+ if (engine_class == OTHER_CLASS)
+ continue;
+
+ guc_class = engine_class_to_guc_class(engine_class);
+
+ if (!info->engine_enabled_masks[guc_class])
+ continue;
+
+ real_size = intel_engine_context_size(gt, engine_class);
+ alloc_size = PAGE_ALIGN(real_size);
+ total_size += alloc_size;
+
+ if (!blob)
+ continue;
+
+ blob->ads.eng_state_size[guc_class] = real_size;
+ blob->ads.golden_context_lrca[guc_class] = addr_ggtt;
+ addr_ggtt += alloc_size;
+ }
+
+ if (!blob)
+ return total_size;
+
+ GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
+ return total_size;
+}
+
+static struct intel_engine_cs *find_engine_state(struct intel_gt *gt, u8 engine_class)
+{
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+
+ for_each_engine(engine, gt, id) {
+ if (engine->class != engine_class)
+ continue;
+
+ if (!engine->default_state)
+ continue;
+
+ return engine;
+ }
+
+ return NULL;
+}
+
+static void guc_init_golden_context(struct intel_guc *guc)
+{
struct __guc_ads_blob *blob = guc->ads_blob;
- const u32 skipped_size = LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE;
- u32 base;
+ struct intel_engine_cs *engine;
+ struct intel_gt *gt = guc_to_gt(guc);
+ u32 addr_ggtt, offset;
+ u32 total_size = 0, alloc_size, real_size;
u8 engine_class, guc_class;
+ u8 *ptr;
- /* GuC scheduling policies */
- guc_policies_init(&blob->policies);
+ /* Skip execlist and PPGTT registers + HWSP */
+ const u32 lr_hw_context_size = 80 * sizeof(u32);
+ const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
+ lr_hw_context_size;
+
+ if (!intel_uc_uses_guc_submission(&gt->uc))
+ return;
+
+ GEM_BUG_ON(!blob);
/*
- * GuC expects a per-engine-class context image and size
- * (minus hwsp and ring context). The context image will be
- * used to reinitialize engines after a reset. It must exist
- * and be pinned in the GGTT, so that the address won't change after
- * we have told GuC where to find it. The context size will be used
- * to validate that the LRC base + size fall within allowed GGTT.
+ * Go back and fill in the golden context data now that it is
+ * available.
*/
+ offset = guc_ads_golden_ctxt_offset(guc);
+ addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
+ ptr = ((u8 *)blob) + offset;
+
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) {
if (engine_class == OTHER_CLASS)
continue;
guc_class = engine_class_to_guc_class(engine_class);
- /*
- * TODO: Set context pointer to default state to allow
- * GuC to re-init guilty contexts after internal reset.
- */
- blob->ads.golden_context_lrca[guc_class] = 0;
- blob->ads.eng_state_size[guc_class] =
- intel_engine_context_size(guc_to_gt(guc),
- engine_class) -
- skipped_size;
+ if (!blob->system_info.engine_enabled_masks[guc_class])
+ continue;
+
+ real_size = intel_engine_context_size(gt, engine_class);
+ alloc_size = PAGE_ALIGN(real_size);
+ total_size += alloc_size;
+
+ engine = find_engine_state(gt, engine_class);
+ if (!engine) {
+ drm_err(&gt->i915->drm, "No engine state recorded for class %d!\n",
+ engine_class);
+ blob->ads.eng_state_size[guc_class] = 0;
+ blob->ads.golden_context_lrca[guc_class] = 0;
+ continue;
+ }
+
+ GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != real_size);
+ GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt);
+ addr_ggtt += alloc_size;
+
+ shmem_read(engine->default_state, skip_size, ptr + skip_size,
+ real_size - skip_size);
+ ptr += alloc_size;
}
+ GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
+}
+
+static void __guc_ads_init(struct intel_guc *guc)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ struct drm_i915_private *i915 = gt->i915;
+ struct __guc_ads_blob *blob = guc->ads_blob;
+ u32 base;
+
+ /* GuC scheduling policies */
+ guc_policies_init(guc, &blob->policies);
+
/* System info */
- blob->system_info.engine_enabled_masks[GUC_RENDER_CLASS] = 1;
- blob->system_info.engine_enabled_masks[GUC_BLITTER_CLASS] = 1;
- blob->system_info.engine_enabled_masks[GUC_VIDEO_CLASS] = VDBOX_MASK(gt);
- blob->system_info.engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt);
+ fill_engine_enable_masks(gt, &blob->system_info);
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] =
hweight8(gt->info.sseu.slice_mask);
@@ -174,21 +514,19 @@ static void __guc_ads_init(struct intel_guc *guc)
GEN12_DOORBELLS_PER_SQIDI) + 1;
}
+ /* Golden contexts for re-initialising after a watchdog reset */
+ guc_prep_golden_context(guc, blob);
+
guc_mapping_table_init(guc_to_gt(guc), &blob->system_info);
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
- /* Clients info */
- guc_ct_pool_entries_init(blob->ct_pool, ARRAY_SIZE(blob->ct_pool));
-
- blob->clients_info.clients_num = 1;
- blob->clients_info.ct_pool_addr = base + ptr_offset(blob, ct_pool);
- blob->clients_info.ct_pool_count = ARRAY_SIZE(blob->ct_pool);
-
/* ADS */
blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
- blob->ads.clients_info = base + ptr_offset(blob, clients_info);
+
+ /* MMIO save/restore list */
+ guc_mmio_reg_state_init(guc, blob);
/* Private Data */
blob->ads.private_data = base + guc_ads_private_data_offset(guc);
@@ -210,6 +548,19 @@ int intel_guc_ads_create(struct intel_guc *guc)
GEM_BUG_ON(guc->ads_vma);
+ /* Need to calculate the reg state size dynamically: */
+ ret = guc_mmio_reg_state_query(guc);
+ if (ret < 0)
+ return ret;
+ guc->ads_regset_size = ret;
+
+ /* Likewise the golden contexts: */
+ ret = guc_prep_golden_context(guc, NULL);
+ if (ret < 0)
+ return ret;
+ guc->ads_golden_ctxt_size = ret;
+
+ /* Now the total size can be determined: */
size = guc_ads_blob_size(guc);
ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma,
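Note the two-pass pattern for the golden contexts: the same function computes the
required size and later fills in the data, selected by whether a blob is passed in.
Reduced to its essentials (sketch only, error handling omitted):

	/* pass 1, at create time: no blob yet, just learn the total size */
	guc->ads_golden_ctxt_size = guc_prep_golden_context(guc, NULL);
	/* pass 2, from __guc_ads_init(): write sizes and LRC addresses */
	guc_prep_golden_context(guc, blob);
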
@@ -222,6 +573,18 @@ int intel_guc_ads_create(struct intel_guc *guc)
return 0;
}
+void intel_guc_ads_init_late(struct intel_guc *guc)
+{
+ /*
+ * The golden context setup requires the saved engine state from
+ * __engines_record_defaults(). However, that requires engines to be
+ * operational which means the ADS must already have been configured.
+ * Fortunately, the golden context state is not needed until a hang
+ * occurs, so it can be filled in during this late init phase.
+ */
+ guc_init_golden_context(guc);
+}
+
void intel_guc_ads_destroy(struct intel_guc *guc)
{
i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
index b00d3ae1113a..3d85051d57e4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
@@ -7,9 +7,13 @@
#define _INTEL_GUC_ADS_H_
struct intel_guc;
+struct drm_printer;
int intel_guc_ads_create(struct intel_guc *guc);
void intel_guc_ads_destroy(struct intel_guc *guc);
+void intel_guc_ads_init_late(struct intel_guc *guc);
void intel_guc_ads_reset(struct intel_guc *guc);
+void intel_guc_ads_print_policy_info(struct intel_guc *guc,
+ struct drm_printer *p);
#endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 8f7b148fef58..22b4733b55e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -3,6 +3,11 @@
* Copyright © 2016-2019 Intel Corporation
*/
+#include <linux/circ_buf.h>
+#include <linux/ktime.h>
+#include <linux/time64.h>
+#include <linux/timekeeping.h>
+
#include "i915_drv.h"
#include "intel_guc_ct.h"
#include "gt/intel_gt.h"
@@ -58,11 +63,17 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
* +--------+-----------------------------------------------+------+
*
* Size of each `CT Buffer`_ must be multiple of 4K.
- * As we don't expect too many messages, for now use minimum sizes.
+ * We don't expect too many messages in flight at any time, unless we are
+ * using GuC submission. In that case each request requires a minimum of
+ * 2 dwords, which gives us a maximum of 256 queued requests. Hopefully
+ * this is enough space to avoid backpressure on the driver. We increase
+ * the size of the receive buffer (relative to the send) to ensure a G2H
+ * response always has a landing spot.
*/
#define CTB_DESC_SIZE ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
#define CTB_H2G_BUFFER_SIZE (SZ_4K)
-#define CTB_G2H_BUFFER_SIZE (SZ_4K)
+#define CTB_G2H_BUFFER_SIZE (4 * CTB_H2G_BUFFER_SIZE)
+#define G2H_ROOM_BUFFER_SIZE (CTB_G2H_BUFFER_SIZE / 4)
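In concrete numbers (derived from the defines above, one dword being 4 bytes): the
H2G buffer is 4K bytes (1024 dwords), the G2H buffer is 4 * 4K = 16K bytes (4096
dwords), and G2H_ROOM_BUFFER_SIZE holds back 4K bytes (1024 dwords) of the latter as
headroom for unsolicited G2H traffic; guc_ct_buffer_init() below converts both
figures to dwords and records the headroom as resv_space.
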
struct ct_request {
struct list_head link;
@@ -98,66 +109,84 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct)
INIT_LIST_HEAD(&ct->requests.incoming);
INIT_WORK(&ct->requests.worker, ct_incoming_request_worker_func);
tasklet_setup(&ct->receive_tasklet, ct_receive_tasklet_func);
+ init_waitqueue_head(&ct->wq);
}
static inline const char *guc_ct_buffer_type_to_str(u32 type)
{
switch (type) {
- case INTEL_GUC_CT_BUFFER_TYPE_SEND:
+ case GUC_CTB_TYPE_HOST2GUC:
return "SEND";
- case INTEL_GUC_CT_BUFFER_TYPE_RECV:
+ case GUC_CTB_TYPE_GUC2HOST:
return "RECV";
default:
return "<invalid>";
}
}
-static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc,
- u32 cmds_addr, u32 size)
+static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc)
{
memset(desc, 0, sizeof(*desc));
- desc->addr = cmds_addr;
- desc->size = size;
- desc->owner = CTB_OWNER_HOST;
}
-static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb, u32 cmds_addr)
+static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
{
- guc_ct_buffer_desc_init(ctb->desc, cmds_addr, ctb->size);
+ u32 space;
+
+ ctb->broken = false;
+ ctb->tail = 0;
+ ctb->head = 0;
+ space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size) - ctb->resv_space;
+ atomic_set(&ctb->space, space);
+
+ guc_ct_buffer_desc_init(ctb->desc);
}
static void guc_ct_buffer_init(struct intel_guc_ct_buffer *ctb,
struct guc_ct_buffer_desc *desc,
- u32 *cmds, u32 size)
+ u32 *cmds, u32 size_in_bytes, u32 resv_space)
{
- GEM_BUG_ON(size % 4);
+ GEM_BUG_ON(size_in_bytes % 4);
ctb->desc = desc;
ctb->cmds = cmds;
- ctb->size = size;
+ ctb->size = size_in_bytes / 4;
+ ctb->resv_space = resv_space / 4;
- guc_ct_buffer_reset(ctb, 0);
+ guc_ct_buffer_reset(ctb);
}
-static int guc_action_register_ct_buffer(struct intel_guc *guc,
- u32 desc_addr,
- u32 type)
+static int guc_action_register_ct_buffer(struct intel_guc *guc, u32 type,
+ u32 desc_addr, u32 buff_addr, u32 size)
{
- u32 action[] = {
- INTEL_GUC_ACTION_REGISTER_COMMAND_TRANSPORT_BUFFER,
- desc_addr,
- sizeof(struct guc_ct_buffer_desc),
- type
+ u32 request[HOST2GUC_REGISTER_CTB_REQUEST_MSG_LEN] = {
+ FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
+ FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
+ FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION, GUC_ACTION_HOST2GUC_REGISTER_CTB),
+ FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_SIZE, size / SZ_4K - 1) |
+ FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_1_TYPE, type),
+ FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_2_DESC_ADDR, desc_addr),
+ FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_3_BUFF_ADDR, buff_addr),
};
- /* Can't use generic send(), CT registration must go over MMIO */
- return intel_guc_send_mmio(guc, action, ARRAY_SIZE(action), NULL, 0);
+ GEM_BUG_ON(type != GUC_CTB_TYPE_HOST2GUC && type != GUC_CTB_TYPE_GUC2HOST);
+ GEM_BUG_ON(size % SZ_4K);
+
+ /* CT registration must go over MMIO */
+ return intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);
}
-static int ct_register_buffer(struct intel_guc_ct *ct, u32 desc_addr, u32 type)
+static int ct_register_buffer(struct intel_guc_ct *ct, u32 type,
+ u32 desc_addr, u32 buff_addr, u32 size)
{
- int err = guc_action_register_ct_buffer(ct_to_guc(ct), desc_addr, type);
+ int err;
+
+ err = i915_inject_probe_error(guc_to_gt(ct_to_guc(ct))->i915, -ENXIO);
+ if (unlikely(err))
+ return err;
+ err = guc_action_register_ct_buffer(ct_to_guc(ct), type,
+ desc_addr, buff_addr, size);
if (unlikely(err))
CT_ERROR(ct, "Failed to register %s buffer (err=%d)\n",
guc_ct_buffer_type_to_str(type), err);
@@ -166,14 +195,17 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 desc_addr, u32 type)
static int guc_action_deregister_ct_buffer(struct intel_guc *guc, u32 type)
{
- u32 action[] = {
- INTEL_GUC_ACTION_DEREGISTER_COMMAND_TRANSPORT_BUFFER,
- CTB_OWNER_HOST,
- type
+ u32 request[HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_LEN] = {
+ FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, GUC_HXG_ORIGIN_HOST) |
+ FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
+ FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION, GUC_ACTION_HOST2GUC_DEREGISTER_CTB),
+ FIELD_PREP(HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_1_TYPE, type),
};
- /* Can't use generic send(), CT deregistration must go over MMIO */
- return intel_guc_send_mmio(guc, action, ARRAY_SIZE(action), NULL, 0);
+ GEM_BUG_ON(type != GUC_CTB_TYPE_HOST2GUC && type != GUC_CTB_TYPE_GUC2HOST);
+
+ /* CT deregistration must go over MMIO */
+ return intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);
}
static int ct_deregister_buffer(struct intel_guc_ct *ct, u32 type)
@@ -200,10 +232,15 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
struct guc_ct_buffer_desc *desc;
u32 blob_size;
u32 cmds_size;
+ u32 resv_space;
void *blob;
u32 *cmds;
int err;
+ err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO);
+ if (err)
+ return err;
+
GEM_BUG_ON(ct->vma);
blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + CTB_G2H_BUFFER_SIZE;
@@ -220,19 +257,23 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
desc = blob;
cmds = blob + 2 * CTB_DESC_SIZE;
cmds_size = CTB_H2G_BUFFER_SIZE;
- CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u\n", "send",
- ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size);
+ resv_space = 0;
+ CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u/%u\n", "send",
+ ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size,
+ resv_space);
- guc_ct_buffer_init(&ct->ctbs.send, desc, cmds, cmds_size);
+ guc_ct_buffer_init(&ct->ctbs.send, desc, cmds, cmds_size, resv_space);
/* store pointers to desc and cmds for recv ctb */
desc = blob + CTB_DESC_SIZE;
cmds = blob + 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE;
cmds_size = CTB_G2H_BUFFER_SIZE;
- CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u\n", "recv",
- ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size);
+ resv_space = G2H_ROOM_BUFFER_SIZE;
+ CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u/%u\n", "recv",
+ ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size,
+ resv_space);
- guc_ct_buffer_init(&ct->ctbs.recv, desc, cmds, cmds_size);
+ guc_ct_buffer_init(&ct->ctbs.recv, desc, cmds, cmds_size, resv_space);
return 0;
}
@@ -261,7 +302,7 @@ void intel_guc_ct_fini(struct intel_guc_ct *ct)
int intel_guc_ct_enable(struct intel_guc_ct *ct)
{
struct intel_guc *guc = ct_to_guc(ct);
- u32 base, cmds;
+ u32 base, desc, cmds;
void *blob;
int err;
@@ -277,32 +318,36 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
GEM_BUG_ON(blob != ct->ctbs.send.desc);
/* (re)initialize descriptors */
- cmds = base + ptrdiff(ct->ctbs.send.cmds, blob);
- guc_ct_buffer_reset(&ct->ctbs.send, cmds);
-
- cmds = base + ptrdiff(ct->ctbs.recv.cmds, blob);
- guc_ct_buffer_reset(&ct->ctbs.recv, cmds);
+ guc_ct_buffer_reset(&ct->ctbs.send);
+ guc_ct_buffer_reset(&ct->ctbs.recv);
/*
* Register both CT buffers starting with RECV buffer.
* Descriptors are in first half of the blob.
*/
- err = ct_register_buffer(ct, base + ptrdiff(ct->ctbs.recv.desc, blob),
- INTEL_GUC_CT_BUFFER_TYPE_RECV);
+ desc = base + ptrdiff(ct->ctbs.recv.desc, blob);
+ cmds = base + ptrdiff(ct->ctbs.recv.cmds, blob);
+ err = ct_register_buffer(ct, GUC_CTB_TYPE_GUC2HOST,
+ desc, cmds, ct->ctbs.recv.size * 4);
+
if (unlikely(err))
goto err_out;
- err = ct_register_buffer(ct, base + ptrdiff(ct->ctbs.send.desc, blob),
- INTEL_GUC_CT_BUFFER_TYPE_SEND);
+ desc = base + ptrdiff(ct->ctbs.send.desc, blob);
+ cmds = base + ptrdiff(ct->ctbs.send.cmds, blob);
+ err = ct_register_buffer(ct, GUC_CTB_TYPE_HOST2GUC,
+ desc, cmds, ct->ctbs.send.size * 4);
+
if (unlikely(err))
goto err_deregister;
ct->enabled = true;
+ ct->stall_time = KTIME_MAX;
return 0;
err_deregister:
- ct_deregister_buffer(ct, INTEL_GUC_CT_BUFFER_TYPE_RECV);
+ ct_deregister_buffer(ct, GUC_CTB_TYPE_GUC2HOST);
err_out:
CT_PROBE_ERROR(ct, "Failed to enable CTB (%pe)\n", ERR_PTR(err));
return err;
@@ -321,8 +366,8 @@ void intel_guc_ct_disable(struct intel_guc_ct *ct)
ct->enabled = false;
if (intel_guc_is_fw_running(guc)) {
- ct_deregister_buffer(ct, INTEL_GUC_CT_BUFFER_TYPE_SEND);
- ct_deregister_buffer(ct, INTEL_GUC_CT_BUFFER_TYPE_RECV);
+ ct_deregister_buffer(ct, GUC_CTB_TYPE_HOST2GUC);
+ ct_deregister_buffer(ct, GUC_CTB_TYPE_GUC2HOST);
}
}
@@ -354,81 +399,63 @@ static void write_barrier(struct intel_guc_ct *ct)
}
}
-/**
- * DOC: CTB Host to GuC request
- *
- * Format of the CTB Host to GuC request message is as follows::
- *
- * +------------+---------+---------+---------+---------+
- * | msg[0] | [1] | [2] | ... | [n-1] |
- * +------------+---------+---------+---------+---------+
- * | MESSAGE | MESSAGE PAYLOAD |
- * + HEADER +---------+---------+---------+---------+
- * | | 0 | 1 | ... | n |
- * +============+=========+=========+=========+=========+
- * | len >= 1 | FENCE | request specific data |
- * +------+-----+---------+---------+---------+---------+
- *
- * ^-----------------len-------------------^
- */
-
static int ct_write(struct intel_guc_ct *ct,
const u32 *action,
u32 len /* in dwords */,
- u32 fence)
+ u32 fence, u32 flags)
{
struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
struct guc_ct_buffer_desc *desc = ctb->desc;
- u32 head = desc->head;
- u32 tail = desc->tail;
+ u32 tail = ctb->tail;
u32 size = ctb->size;
- u32 used;
u32 header;
+ u32 hxg;
+ u32 type;
u32 *cmds = ctb->cmds;
unsigned int i;
- if (unlikely(desc->is_in_error))
- return -EPIPE;
-
- if (unlikely(!IS_ALIGNED(head | tail, 4) ||
- (tail | head) >= size))
+ if (unlikely(desc->status))
goto corrupted;
- /* later calculations will be done in dwords */
- head /= 4;
- tail /= 4;
- size /= 4;
-
- /*
- * tail == head condition indicates empty. GuC FW does not support
- * using up the entire buffer to get tail == head meaning full.
- */
- if (tail < head)
- used = (size - head) + tail;
- else
- used = tail - head;
+ GEM_BUG_ON(tail > size);
- /* make sure there is a space including extra dw for the fence */
- if (unlikely(used + len + 1 >= size))
- return -ENOSPC;
+#ifdef CONFIG_DRM_I915_DEBUG_GUC
+ if (unlikely(tail != READ_ONCE(desc->tail))) {
+ CT_ERROR(ct, "Tail was modified %u != %u\n",
+ desc->tail, tail);
+ desc->status |= GUC_CTB_STATUS_MISMATCH;
+ goto corrupted;
+ }
+ if (unlikely(READ_ONCE(desc->head) >= size)) {
+ CT_ERROR(ct, "Invalid head offset %u >= %u)\n",
+ desc->head, size);
+ desc->status |= GUC_CTB_STATUS_OVERFLOW;
+ goto corrupted;
+ }
+#endif
/*
- * Write the message. The format is the following:
- * DW0: header (including action code)
- * DW1: fence
- * DW2+: action data
+ * dw0: CT header (including fence)
+ * dw1: HXG header (including action code)
+ * dw2+: action data
*/
- header = (len << GUC_CT_MSG_LEN_SHIFT) |
- GUC_CT_MSG_SEND_STATUS |
- (action[0] << GUC_CT_MSG_ACTION_SHIFT);
+ header = FIELD_PREP(GUC_CTB_MSG_0_FORMAT, GUC_CTB_FORMAT_HXG) |
+ FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
+ FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
- CT_DEBUG(ct, "writing %*ph %*ph %*ph\n",
- 4, &header, 4, &fence, 4 * (len - 1), &action[1]);
+ type = (flags & INTEL_GUC_CT_SEND_NB) ? GUC_HXG_TYPE_EVENT :
+ GUC_HXG_TYPE_REQUEST;
+ hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, type) |
+ FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
+ GUC_HXG_EVENT_MSG_0_DATA0, action[0]);
+
+ CT_DEBUG(ct, "writing (tail %u) %*ph %*ph %*ph\n",
+ tail, 4, &header, 4, &hxg, 4 * (len - 1), &action[1]);
cmds[tail] = header;
tail = (tail + 1) % size;
- cmds[tail] = fence;
+ cmds[tail] = hxg;
tail = (tail + 1) % size;
for (i = 1; i < len; i++) {
@@ -443,14 +470,20 @@ static int ct_write(struct intel_guc_ct *ct,
*/
write_barrier(ct);
- /* now update desc tail (back in bytes) */
- desc->tail = tail * 4;
+ /* update local copies */
+ ctb->tail = tail;
+ GEM_BUG_ON(atomic_read(&ctb->space) < len + GUC_CTB_HDR_LEN);
+ atomic_sub(len + GUC_CTB_HDR_LEN, &ctb->space);
+
+ /* now update descriptor */
+ WRITE_ONCE(desc->tail, tail);
+
return 0;
corrupted:
- CT_ERROR(ct, "Corrupted descriptor addr=%#x head=%u tail=%u size=%u\n",
- desc->addr, desc->head, desc->tail, desc->size);
- desc->is_in_error = 1;
+ CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n",
+ desc->head, desc->tail, desc->status);
+ ctb->broken = true;
return -EPIPE;
}
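To make the layout concrete: a request with len == 3 (action[0] plus two parameters)
occupies four dwords in the send buffer, sketched below with placeholder values:

	cmds[tail + 0]: CT header  (FORMAT=HXG, NUM_DWORDS=3, FENCE)
	cmds[tail + 1]: HXG header (TYPE, ACTION/DATA0 taken from action[0])
	cmds[tail + 2]: action[1]
	cmds[tail + 3]: action[2]

which matches the len + GUC_CTB_HDR_LEN dwords charged against the space counter.
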
@@ -459,7 +492,7 @@ corrupted:
* @req: pointer to pending request
* @status: placeholder for status
*
- * For each sent request, Guc shall send bac CT response message.
+ * For each sent request, GuC shall send back a CT response message.
* Our message handler will update status of tracked request once
* response message with given fence is received. Wait here and
* check for valid response status value.
@@ -475,12 +508,18 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
/*
* Fast commands should complete in less than 10us, so sample quickly
* up to that length of time, then switch to a slower sleep-wait loop.
- * No GuC command should ever take longer than 10ms.
+ * No GuC command should ever take longer than 10ms, but many GuC
+ * commands can be in flight at a time, so use a 1s timeout on the slower
+ * sleep-wait loop.
*/
-#define done INTEL_GUC_MSG_IS_RESPONSE(READ_ONCE(req->status))
- err = wait_for_us(done, 10);
+#define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS 10
+#define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS 1000
+#define done \
+ (FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
+ GUC_HXG_ORIGIN_GUC)
+ err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS);
if (err)
- err = wait_for(done, 10);
+ err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
#undef done
if (unlikely(err))
@@ -490,6 +529,131 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
return err;
}
+#define GUC_CTB_TIMEOUT_MS 1500
+static inline bool ct_deadlocked(struct intel_guc_ct *ct)
+{
+ long timeout = GUC_CTB_TIMEOUT_MS;
+ bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
+
+ if (unlikely(ret)) {
+ struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
+ struct guc_ct_buffer_desc *recv = ct->ctbs.recv.desc;
+
+ CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
+ ktime_ms_delta(ktime_get(), ct->stall_time),
+ send->status, recv->status);
+ ct->ctbs.send.broken = true;
+ }
+
+ return ret;
+}
+
+static inline bool g2h_has_room(struct intel_guc_ct *ct, u32 g2h_len_dw)
+{
+ struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
+
+ /*
+ * We leave a certain amount of space in the G2H CTB for unexpected
+ * G2H messages (e.g. logging, engine hang, etc...)
+ */
+ return !g2h_len_dw || atomic_read(&ctb->space) >= g2h_len_dw;
+}
+
+static inline void g2h_reserve_space(struct intel_guc_ct *ct, u32 g2h_len_dw)
+{
+ lockdep_assert_held(&ct->ctbs.send.lock);
+
+ GEM_BUG_ON(!g2h_has_room(ct, g2h_len_dw));
+
+ if (g2h_len_dw)
+ atomic_sub(g2h_len_dw, &ct->ctbs.recv.space);
+}
+
+static inline void g2h_release_space(struct intel_guc_ct *ct, u32 g2h_len_dw)
+{
+ atomic_add(g2h_len_dw, &ct->ctbs.recv.space);
+}
+
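The credit lifecycle is symmetric: credits are reserved under the send lock before a
message is written and returned once the corresponding G2H has been accounted for. As
a sketch, the blocking path in ct_send() below holds a worst-case reservation across
its wait:

	g2h_reserve_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);	/* under ctb->lock */
	err = wait_for_ct_request_update(&request, status);
	g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
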
+static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
+{
+ struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+ struct guc_ct_buffer_desc *desc = ctb->desc;
+ u32 head;
+ u32 space;
+
+ if (atomic_read(&ctb->space) >= len_dw)
+ return true;
+
+ head = READ_ONCE(desc->head);
+ if (unlikely(head > ctb->size)) {
+ CT_ERROR(ct, "Invalid head offset %u >= %u)\n",
+ head, ctb->size);
+ desc->status |= GUC_CTB_STATUS_OVERFLOW;
+ ctb->broken = true;
+ return false;
+ }
+
+ space = CIRC_SPACE(ctb->tail, head, ctb->size);
+ atomic_set(&ctb->space, space);
+
+ return space >= len_dw;
+}
+
+static int has_room_nb(struct intel_guc_ct *ct, u32 h2g_dw, u32 g2h_dw)
+{
+ lockdep_assert_held(&ct->ctbs.send.lock);
+
+ if (unlikely(!h2g_has_room(ct, h2g_dw) || !g2h_has_room(ct, g2h_dw))) {
+ if (ct->stall_time == KTIME_MAX)
+ ct->stall_time = ktime_get();
+
+ if (unlikely(ct_deadlocked(ct)))
+ return -EPIPE;
+ else
+ return -EBUSY;
+ }
+
+ ct->stall_time = KTIME_MAX;
+ return 0;
+}
+
+#define G2H_LEN_DW(f) ({ \
+ typeof(f) f_ = (f); \
+ FIELD_GET(INTEL_GUC_CT_SEND_G2H_DW_MASK, f_) ? \
+ FIELD_GET(INTEL_GUC_CT_SEND_G2H_DW_MASK, f_) + \
+ GUC_CTB_HXG_MSG_MIN_LEN : 0; \
+})
+static int ct_send_nb(struct intel_guc_ct *ct,
+ const u32 *action,
+ u32 len,
+ u32 flags)
+{
+ struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+ unsigned long spin_flags;
+ u32 g2h_len_dw = G2H_LEN_DW(flags);
+ u32 fence;
+ int ret;
+
+ spin_lock_irqsave(&ctb->lock, spin_flags);
+
+ ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN, g2h_len_dw);
+ if (unlikely(ret))
+ goto out;
+
+ fence = ct_get_next_fence(ct);
+ ret = ct_write(ct, action, len, fence, flags);
+ if (unlikely(ret))
+ goto out;
+
+ g2h_reserve_space(ct, g2h_len_dw);
+ intel_guc_notify(ct_to_guc(ct));
+
+out:
+ spin_unlock_irqrestore(&ctb->lock, spin_flags);
+
+ return ret;
+}
+
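ct_send_nb() is the fire-and-forget path: the message goes out as an HXG Event (see
ct_write() above), no entry is added to the pending-request list, and flow control is
the only thing that can fail. Callers reach it via intel_guc_send_nb() from
intel_guc.h, sketched here with a placeholder credit count:

	/* expects this action to produce a g2h_len_dw-dword G2H payload */
	err = intel_guc_send_nb(guc, action, ARRAY_SIZE(action), g2h_len_dw);
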
static int ct_send(struct intel_guc_ct *ct,
const u32 *action,
u32 len,
@@ -497,8 +661,10 @@ static int ct_send(struct intel_guc_ct *ct,
u32 response_buf_size,
u32 *status)
{
+ struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
struct ct_request request;
unsigned long flags;
+ unsigned int sleep_period_ms = 1;
u32 fence;
int err;
@@ -506,8 +672,33 @@ static int ct_send(struct intel_guc_ct *ct,
GEM_BUG_ON(!len);
GEM_BUG_ON(len & ~GUC_CT_MSG_LEN_MASK);
GEM_BUG_ON(!response_buf && response_buf_size);
+ might_sleep();
+
+ /*
+ * We use a lazy spin wait loop here as we believe that if the CT
+ * buffers are sized correctly the flow control condition should be
+ * rare. We reserve the maximum message size in G2H credits as we don't
+ * know how big the response is going to be.
+ */
+retry:
+ spin_lock_irqsave(&ctb->lock, flags);
+ if (unlikely(!h2g_has_room(ct, len + GUC_CTB_HDR_LEN) ||
+ !g2h_has_room(ct, GUC_CTB_HXG_MSG_MAX_LEN))) {
+ if (ct->stall_time == KTIME_MAX)
+ ct->stall_time = ktime_get();
+ spin_unlock_irqrestore(&ctb->lock, flags);
+
+ if (unlikely(ct_deadlocked(ct)))
+ return -EPIPE;
+
+ if (msleep_interruptible(sleep_period_ms))
+ return -EINTR;
+ sleep_period_ms = sleep_period_ms << 1;
+
+ goto retry;
+ }
- spin_lock_irqsave(&ct->ctbs.send.lock, flags);
+ ct->stall_time = KTIME_MAX;
fence = ct_get_next_fence(ct);
request.fence = fence;
@@ -519,9 +710,10 @@ static int ct_send(struct intel_guc_ct *ct,
list_add_tail(&request.link, &ct->requests.pending);
spin_unlock(&ct->requests.lock);
- err = ct_write(ct, action, len, fence);
+ err = ct_write(ct, action, len, fence, 0);
+ g2h_reserve_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
- spin_unlock_irqrestore(&ct->ctbs.send.lock, flags);
+ spin_unlock_irqrestore(&ctb->lock, flags);
if (unlikely(err))
goto unlink;
@@ -529,24 +721,25 @@ static int ct_send(struct intel_guc_ct *ct,
intel_guc_notify(ct_to_guc(ct));
err = wait_for_ct_request_update(&request, status);
+ g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
if (unlikely(err))
goto unlink;
- if (!INTEL_GUC_MSG_IS_RESPONSE_SUCCESS(*status)) {
+ if (FIELD_GET(GUC_HXG_MSG_0_TYPE, *status) != GUC_HXG_TYPE_RESPONSE_SUCCESS) {
err = -EIO;
goto unlink;
}
if (response_buf) {
/* There shall be no data in the status */
- WARN_ON(INTEL_GUC_MSG_TO_DATA(request.status));
+ WARN_ON(FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, request.status));
/* Return actual response len */
err = request.response_len;
} else {
/* There shall be no response payload */
WARN_ON(request.response_len);
/* Return data decoded from the status dword */
- err = INTEL_GUC_MSG_TO_DATA(*status);
+ err = FIELD_GET(GUC_HXG_RESPONSE_MSG_0_DATA0, *status);
}
unlink:
@@ -561,16 +754,25 @@ unlink:
* Command Transport (CT) buffer based GuC send function.
*/
int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
- u32 *response_buf, u32 response_buf_size)
+ u32 *response_buf, u32 response_buf_size, u32 flags)
{
u32 status = ~0; /* undefined */
int ret;
if (unlikely(!ct->enabled)) {
- WARN(1, "Unexpected send: action=%#x\n", *action);
+ struct intel_guc *guc = ct_to_guc(ct);
+ struct intel_uc *uc = container_of(guc, struct intel_uc, guc);
+
+ WARN(!uc->reset_in_progress, "Unexpected send: action=%#x\n", *action);
return -ENODEV;
}
+ if (unlikely(ct->ctbs.send.broken))
+ return -EPIPE;
+
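+ /* non-blocking sends return once the H2G message is queued */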
+ if (flags & INTEL_GUC_CT_SEND_NB)
+ return ct_send_nb(ct, action, len, flags);
+
ret = ct_send(ct, action, len, response_buf, response_buf_size, &status);
if (unlikely(ret < 0)) {
CT_ERROR(ct, "Sending action %#x failed (err=%d status=%#X)\n",
@@ -583,21 +785,6 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
return ret;
}
-static inline unsigned int ct_header_get_len(u32 header)
-{
- return (header >> GUC_CT_MSG_LEN_SHIFT) & GUC_CT_MSG_LEN_MASK;
-}
-
-static inline unsigned int ct_header_get_action(u32 header)
-{
- return (header >> GUC_CT_MSG_ACTION_SHIFT) & GUC_CT_MSG_ACTION_MASK;
-}
-
-static inline bool ct_header_is_response(u32 header)
-{
- return !!(header & GUC_CT_MSG_IS_RESPONSE);
-}
-
static struct ct_incoming_msg *ct_alloc_msg(u32 num_dwords)
{
struct ct_incoming_msg *msg;
@@ -621,8 +808,8 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
{
struct intel_guc_ct_buffer *ctb = &ct->ctbs.recv;
struct guc_ct_buffer_desc *desc = ctb->desc;
- u32 head = desc->head;
- u32 tail = desc->tail;
+ u32 head = ctb->head;
+ u32 tail = READ_ONCE(desc->tail);
u32 size = ctb->size;
u32 *cmds = ctb->cmds;
s32 available;
@@ -630,17 +817,28 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
unsigned int i;
u32 header;
- if (unlikely(desc->is_in_error))
+ if (unlikely(ctb->broken))
return -EPIPE;
- if (unlikely(!IS_ALIGNED(head | tail, 4) ||
- (tail | head) >= size))
+ if (unlikely(desc->status))
goto corrupted;
- /* later calculations will be done in dwords */
- head /= 4;
- tail /= 4;
- size /= 4;
+ GEM_BUG_ON(head > size);
+
+#ifdef CONFIG_DRM_I915_DEBUG_GUC
+ if (unlikely(head != READ_ONCE(desc->head))) {
+ CT_ERROR(ct, "Head was modified %u != %u\n",
+ desc->head, head);
+ desc->status |= GUC_CTB_STATUS_MISMATCH;
+ goto corrupted;
+ }
+#endif
+ if (unlikely(tail >= size)) {
+ CT_ERROR(ct, "Invalid tail offset %u >= %u)\n",
+ tail, size);
+ desc->status |= GUC_CTB_STATUS_OVERFLOW;
+ goto corrupted;
+ }
/* tail == head condition indicates empty */
available = tail - head;
@@ -652,14 +850,14 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
/* beware of buffer wrap case */
if (unlikely(available < 0))
available += size;
- CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
+ CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
GEM_BUG_ON(available < 0);
header = cmds[head];
head = (head + 1) % size;
/* message len with header */
- len = ct_header_get_len(header) + 1;
+ len = FIELD_GET(GUC_CTB_MSG_0_NUM_DWORDS, header) + GUC_CTB_MSG_MIN_LEN;
if (unlikely(len > (u32)available)) {
CT_ERROR(ct, "Incomplete message %*ph %*ph %*ph\n",
4, &header,
@@ -667,6 +865,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
size - head : available - 1), &cmds[head],
4 * (head + available - 1 > size ?
available - 1 - size + head : 0), &cmds[0]);
+ desc->status |= GUC_CTB_STATUS_UNDERFLOW;
goto corrupted;
}
@@ -689,65 +888,39 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
}
CT_DEBUG(ct, "received %*ph\n", 4 * len, (*msg)->msg);
- desc->head = head * 4;
+ /* update local copies */
+ ctb->head = head;
+
+ /* now update descriptor */
+ WRITE_ONCE(desc->head, head);
+
return available - len;
corrupted:
- CT_ERROR(ct, "Corrupted descriptor addr=%#x head=%u tail=%u size=%u\n",
- desc->addr, desc->head, desc->tail, desc->size);
- desc->is_in_error = 1;
+ CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u status=%#x\n",
+ desc->head, desc->tail, desc->status);
+ ctb->broken = true;
return -EPIPE;
}
-/**
- * DOC: CTB GuC to Host response
- *
- * Format of the CTB GuC to Host response message is as follows::
- *
- * +------------+---------+---------+---------+---------+---------+
- * | msg[0] | [1] | [2] | [3] | ... | [n-1] |
- * +------------+---------+---------+---------+---------+---------+
- * | MESSAGE | MESSAGE PAYLOAD |
- * + HEADER +---------+---------+---------+---------+---------+
- * | | 0 | 1 | 2 | ... | n |
- * +============+=========+=========+=========+=========+=========+
- * | len >= 2 | FENCE | STATUS | response specific data |
- * +------+-----+---------+---------+---------+---------+---------+
- *
- * ^-----------------------len-----------------------^
- */
-
static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *response)
{
- u32 header = response->msg[0];
- u32 len = ct_header_get_len(header);
- u32 fence;
- u32 status;
- u32 datalen;
+ u32 len = FIELD_GET(GUC_CTB_MSG_0_NUM_DWORDS, response->msg[0]);
+ u32 fence = FIELD_GET(GUC_CTB_MSG_0_FENCE, response->msg[0]);
+ const u32 *hxg = &response->msg[GUC_CTB_MSG_MIN_LEN];
+ const u32 *data = &hxg[GUC_HXG_MSG_MIN_LEN];
+ u32 datalen = len - GUC_HXG_MSG_MIN_LEN;
struct ct_request *req;
unsigned long flags;
bool found = false;
int err = 0;
- GEM_BUG_ON(!ct_header_is_response(header));
-
- /* Response payload shall at least include fence and status */
- if (unlikely(len < 2)) {
- CT_ERROR(ct, "Corrupted response (len %u)\n", len);
- return -EPROTO;
- }
-
- fence = response->msg[1];
- status = response->msg[2];
- datalen = len - 2;
-
- /* Format of the status follows RESPONSE message */
- if (unlikely(!INTEL_GUC_MSG_IS_RESPONSE(status))) {
- CT_ERROR(ct, "Corrupted response (status %#x)\n", status);
- return -EPROTO;
- }
+ GEM_BUG_ON(len < GUC_HXG_MSG_MIN_LEN);
+ GEM_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, hxg[0]) != GUC_HXG_ORIGIN_GUC);
+ GEM_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_TYPE, hxg[0]) != GUC_HXG_TYPE_RESPONSE_SUCCESS &&
+ FIELD_GET(GUC_HXG_MSG_0_TYPE, hxg[0]) != GUC_HXG_TYPE_RESPONSE_FAILURE);
- CT_DEBUG(ct, "response fence %u status %#x\n", fence, status);
+ CT_DEBUG(ct, "response fence %u status %#x\n", fence, hxg[0]);
spin_lock_irqsave(&ct->requests.lock, flags);
list_for_each_entry(req, &ct->requests.pending, link) {
@@ -763,18 +936,22 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
err = -EMSGSIZE;
}
if (datalen)
- memcpy(req->response_buf, response->msg + 3, 4 * datalen);
+ memcpy(req->response_buf, data, 4 * datalen);
req->response_len = datalen;
- WRITE_ONCE(req->status, status);
+ WRITE_ONCE(req->status, hxg[0]);
found = true;
break;
}
- spin_unlock_irqrestore(&ct->requests.lock, flags);
-
if (!found) {
CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
- return -ENOKEY;
+ CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
+ ct->requests.last_fence);
+ list_for_each_entry(req, &ct->requests.pending, link)
+ CT_ERROR(ct, "request %u awaits response\n",
+ req->fence);
+ err = -ENOKEY;
}
+ spin_unlock_irqrestore(&ct->requests.lock, flags);
if (unlikely(err))
return err;
@@ -786,14 +963,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
static int ct_process_request(struct intel_guc_ct *ct, struct ct_incoming_msg *request)
{
struct intel_guc *guc = ct_to_guc(ct);
- u32 header, action, len;
+ const u32 *hxg;
const u32 *payload;
+ u32 hxg_len, action, len;
int ret;
- header = request->msg[0];
- payload = &request->msg[1];
- action = ct_header_get_action(header);
- len = ct_header_get_len(header);
+ hxg = &request->msg[GUC_CTB_MSG_MIN_LEN];
+ hxg_len = request->size - GUC_CTB_MSG_MIN_LEN;
+ payload = &hxg[GUC_HXG_MSG_MIN_LEN];
+ action = FIELD_GET(GUC_HXG_EVENT_MSG_0_ACTION, hxg[0]);
+ len = hxg_len - GUC_HXG_MSG_MIN_LEN;
CT_DEBUG(ct, "request %x %*ph\n", action, 4 * len, payload);
@@ -801,6 +980,19 @@ static int ct_process_request(struct intel_guc_ct *ct, struct ct_incoming_msg *r
case INTEL_GUC_ACTION_DEFAULT:
ret = intel_guc_to_host_process_recv_msg(guc, payload, len);
break;
+ case INTEL_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
+ ret = intel_guc_deregister_done_process_msg(guc, payload,
+ len);
+ break;
+ case INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE:
+ ret = intel_guc_sched_done_process_msg(guc, payload, len);
+ break;
+ case INTEL_GUC_ACTION_CONTEXT_RESET_NOTIFICATION:
+ ret = intel_guc_context_reset_process_msg(guc, payload, len);
+ break;
+ case INTEL_GUC_ACTION_ENGINE_FAILURE_NOTIFICATION:
+ ret = intel_guc_engine_failure_process_msg(guc, payload, len);
+ break;
default:
ret = -EOPNOTSUPP;
break;
@@ -855,29 +1047,24 @@ static void ct_incoming_request_worker_func(struct work_struct *w)
queue_work(system_unbound_wq, &ct->requests.worker);
}
-/**
- * DOC: CTB GuC to Host request
- *
- * Format of the CTB GuC to Host request message is as follows::
- *
- * +------------+---------+---------+---------+---------+---------+
- * | msg[0] | [1] | [2] | [3] | ... | [n-1] |
- * +------------+---------+---------+---------+---------+---------+
- * | MESSAGE | MESSAGE PAYLOAD |
- * + HEADER +---------+---------+---------+---------+---------+
- * | | 0 | 1 | 2 | ... | n |
- * +============+=========+=========+=========+=========+=========+
- * | len | request specific data |
- * +------+-----+---------+---------+---------+---------+---------+
- *
- * ^-----------------------len-----------------------^
- */
-
-static int ct_handle_request(struct intel_guc_ct *ct, struct ct_incoming_msg *request)
+static int ct_handle_event(struct intel_guc_ct *ct, struct ct_incoming_msg *request)
{
+ const u32 *hxg = &request->msg[GUC_CTB_MSG_MIN_LEN];
+ u32 action = FIELD_GET(GUC_HXG_EVENT_MSG_0_ACTION, hxg[0]);
unsigned long flags;
- GEM_BUG_ON(ct_header_is_response(request->msg[0]));
+ GEM_BUG_ON(FIELD_GET(GUC_HXG_MSG_0_TYPE, hxg[0]) != GUC_HXG_TYPE_EVENT);
+
+ /*
+ * The G2H space must be returned here, in IRQ context: the CTB
+ * processing in the workqueue below can itself send CTBs, which would
+ * create a circular dependency (deadlock) if the space were returned
+ * there instead.
+ */
+ switch (action) {
+ case INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE:
+ case INTEL_GUC_ACTION_DEREGISTER_CONTEXT_DONE:
+ g2h_release_space(ct, request->size);
+ }
spin_lock_irqsave(&ct->requests.lock, flags);
list_add_tail(&request->link, &ct->requests.incoming);
@@ -887,15 +1074,53 @@ static int ct_handle_request(struct intel_guc_ct *ct, struct ct_incoming_msg *re
return 0;
}
-static void ct_handle_msg(struct intel_guc_ct *ct, struct ct_incoming_msg *msg)
+static int ct_handle_hxg(struct intel_guc_ct *ct, struct ct_incoming_msg *msg)
{
- u32 header = msg->msg[0];
+ u32 origin, type;
+ u32 *hxg;
int err;
- if (ct_header_is_response(header))
+ if (unlikely(msg->size < GUC_CTB_HXG_MSG_MIN_LEN))
+ return -EBADMSG;
+
+ hxg = &msg->msg[GUC_CTB_MSG_MIN_LEN];
+
+ origin = FIELD_GET(GUC_HXG_MSG_0_ORIGIN, hxg[0]);
+ if (unlikely(origin != GUC_HXG_ORIGIN_GUC)) {
+ err = -EPROTO;
+ goto failed;
+ }
+
+ type = FIELD_GET(GUC_HXG_MSG_0_TYPE, hxg[0]);
+ switch (type) {
+ case GUC_HXG_TYPE_EVENT:
+ err = ct_handle_event(ct, msg);
+ break;
+ case GUC_HXG_TYPE_RESPONSE_SUCCESS:
+ case GUC_HXG_TYPE_RESPONSE_FAILURE:
err = ct_handle_response(ct, msg);
+ break;
+ default:
+ err = -EOPNOTSUPP;
+ }
+
+ if (unlikely(err)) {
+failed:
+ CT_ERROR(ct, "Failed to handle HXG message (%pe) %*ph\n",
+ ERR_PTR(err), 4 * GUC_HXG_MSG_MIN_LEN, hxg);
+ }
+ return err;
+}
+
+static void ct_handle_msg(struct intel_guc_ct *ct, struct ct_incoming_msg *msg)
+{
+ u32 format = FIELD_GET(GUC_CTB_MSG_0_FORMAT, msg->msg[0]);
+ int err;
+
+ if (format == GUC_CTB_FORMAT_HXG)
+ err = ct_handle_hxg(ct, msg);
else
- err = ct_handle_request(ct, msg);
+ err = -EOPNOTSUPP;
if (unlikely(err)) {
CT_ERROR(ct, "Failed to process CT message (%pe) %*ph\n",
@@ -958,3 +1183,25 @@ void intel_guc_ct_event_handler(struct intel_guc_ct *ct)
ct_try_receive_message(ct);
}
+
+void intel_guc_ct_print_info(struct intel_guc_ct *ct,
+ struct drm_printer *p)
+{
+ drm_printf(p, "CT %s\n", enableddisabled(ct->enabled));
+
+ if (!ct->enabled)
+ return;
+
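+ /* CTB space is tracked in dwords; report it in bytes */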
+ drm_printf(p, "H2G Space: %u\n",
+ atomic_read(&ct->ctbs.send.space) * 4);
+ drm_printf(p, "Head: %u\n",
+ ct->ctbs.send.desc->head);
+ drm_printf(p, "Tail: %u\n",
+ ct->ctbs.send.desc->tail);
+ drm_printf(p, "G2H Space: %u\n",
+ atomic_read(&ct->ctbs.recv.space) * 4);
+ drm_printf(p, "Head: %u\n",
+ ct->ctbs.recv.desc->head);
+ drm_printf(p, "Tail: %u\n",
+ ct->ctbs.recv.desc->tail);
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index cb222f202301..f709a19c7e21 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
@@ -9,11 +9,14 @@
#include <linux/interrupt.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>
+#include <linux/ktime.h>
+#include <linux/wait.h>
#include "intel_guc_fwif.h"
struct i915_vma;
struct intel_guc;
+struct drm_printer;
/**
* DOC: Command Transport (CT).
@@ -31,16 +34,25 @@ struct intel_guc;
* @lock: protects access to the commands buffer and buffer descriptor
* @desc: pointer to the buffer descriptor
* @cmds: pointer to the commands buffer
- * @size: size of the commands buffer
+ * @size: size of the commands buffer in dwords
+ * @resv_space: reserved space in buffer in dwords
+ * @head: local shadow copy of head in dwords
+ * @tail: local shadow copy of tail in dwords
+ * @space: local shadow copy of space in dwords
+ * @broken: flag to indicate if descriptor data is broken
*/
struct intel_guc_ct_buffer {
spinlock_t lock;
struct guc_ct_buffer_desc *desc;
u32 *cmds;
u32 size;
+ u32 resv_space;
+ u32 tail;
+ u32 head;
+ atomic_t space;
+ bool broken;
};
-
/** Top-level structure for Command Transport related data
*
* Includes a pair of CT buffers for bi-directional communication and tracking
@@ -58,8 +70,11 @@ struct intel_guc_ct {
struct tasklet_struct receive_tasklet;
+ /** @wq: wait queue for g2h channel */
+ wait_queue_head_t wq;
+
struct {
- u32 last_fence; /* last fence used to send request */
+ u16 last_fence; /* last fence used to send request */
spinlock_t lock; /* protects pending requests list */
struct list_head pending; /* requests waiting for response */
@@ -67,6 +82,9 @@ struct intel_guc_ct {
struct list_head incoming; /* incoming requests */
struct work_struct worker; /* handler for incoming requests */
} requests;
+
+ /** @stall_time: time at which a CTB submission first stalled */
+ ktime_t stall_time;
};
void intel_guc_ct_init_early(struct intel_guc_ct *ct);
@@ -85,8 +103,18 @@ static inline bool intel_guc_ct_enabled(struct intel_guc_ct *ct)
return ct->enabled;
}
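+/*
+ * Send flags: bit 31 requests a non-blocking send; bits [7:0] carry the
+ * number of G2H credit dwords the request is expected to consume.
+ */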
+#define INTEL_GUC_CT_SEND_NB BIT(31)
+#define INTEL_GUC_CT_SEND_G2H_DW_SHIFT 0
+#define INTEL_GUC_CT_SEND_G2H_DW_MASK (0xff << INTEL_GUC_CT_SEND_G2H_DW_SHIFT)
+#define MAKE_SEND_FLAGS(len) ({ \
+ typeof(len) len_ = (len); \
+ GEM_BUG_ON(!FIELD_FIT(INTEL_GUC_CT_SEND_G2H_DW_MASK, len_)); \
+ (FIELD_PREP(INTEL_GUC_CT_SEND_G2H_DW_MASK, len_) | INTEL_GUC_CT_SEND_NB); \
+})
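+/* e.g. flags = MAKE_SEND_FLAGS(g2h_len_dw) for a non-blocking request that consumes g2h_len_dw G2H credits */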
int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
- u32 *response_buf, u32 response_buf_size);
+ u32 *response_buf, u32 response_buf_size, u32 flags);
void intel_guc_ct_event_handler(struct intel_guc_ct *ct);
+void intel_guc_ct_print_info(struct intel_guc_ct *ct, struct drm_printer *p);
+
#endif /* _INTEL_GUC_CT_H_ */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c
index fe7cb7b29a1e..887c8c8f35db 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c
@@ -9,6 +9,10 @@
#include "intel_guc.h"
#include "intel_guc_debugfs.h"
#include "intel_guc_log_debugfs.h"
+#include "gt/uc/intel_guc_ct.h"
+#include "gt/uc/intel_guc_ads.h"
+#include "gt/uc/intel_guc_submission.h"
+#include "gt/uc/intel_guc_slpc.h"
static int guc_info_show(struct seq_file *m, void *data)
{
@@ -22,16 +26,57 @@ static int guc_info_show(struct seq_file *m, void *data)
drm_puts(&p, "\n");
intel_guc_log_info(&guc->log, &p);
- /* Add more as required ... */
+ if (!intel_guc_submission_is_used(guc))
+ return 0;
+
+ intel_guc_ct_print_info(&guc->ct, &p);
+ intel_guc_submission_print_info(guc, &p);
+ intel_guc_ads_print_policy_info(guc, &p);
return 0;
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_info);
+static int guc_registered_contexts_show(struct seq_file *m, void *data)
+{
+ struct intel_guc *guc = m->private;
+ struct drm_printer p = drm_seq_file_printer(m);
+
+ if (!intel_guc_submission_is_used(guc))
+ return -ENODEV;
+
+ intel_guc_submission_print_context_info(guc, &p);
+
+ return 0;
+}
+DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_registered_contexts);
+
+static int guc_slpc_info_show(struct seq_file *m, void *unused)
+{
+ struct intel_guc *guc = m->private;
+ struct intel_guc_slpc *slpc = &guc->slpc;
+ struct drm_printer p = drm_seq_file_printer(m);
+
+ if (!intel_guc_slpc_is_used(guc))
+ return -ENODEV;
+
+ return intel_guc_slpc_print_info(slpc, &p);
+}
+DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_slpc_info);
+
+static bool intel_eval_slpc_support(void *data)
+{
+ struct intel_guc *guc = (struct intel_guc *)data;
+
+ return intel_guc_slpc_is_used(guc);
+}
+
void intel_guc_debugfs_register(struct intel_guc *guc, struct dentry *root)
{
static const struct debugfs_gt_file files[] = {
{ "guc_info", &guc_info_fops, NULL },
+ { "guc_registered_contexts", &guc_registered_contexts_fops, NULL },
+ { "guc_slpc_info", &guc_slpc_info_fops, &intel_eval_slpc_support},
};
if (!intel_guc_is_supported(guc))
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index e9a9d85e2aa3..fa4be13c8854 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -12,19 +12,27 @@
#include "gt/intel_engine_types.h"
#include "abi/guc_actions_abi.h"
+#include "abi/guc_actions_slpc_abi.h"
#include "abi/guc_errors_abi.h"
#include "abi/guc_communication_mmio_abi.h"
#include "abi/guc_communication_ctb_abi.h"
#include "abi/guc_messages_abi.h"
+/* Payload length only, i.e. doesn't include the G2H header length */
+#define G2H_LEN_DW_SCHED_CONTEXT_MODE_SET 2
+#define G2H_LEN_DW_DEREGISTER_CONTEXT 1
+
+#define GUC_CONTEXT_DISABLE 0
+#define GUC_CONTEXT_ENABLE 1
+
#define GUC_CLIENT_PRIORITY_KMD_HIGH 0
#define GUC_CLIENT_PRIORITY_HIGH 1
#define GUC_CLIENT_PRIORITY_KMD_NORMAL 2
#define GUC_CLIENT_PRIORITY_NORMAL 3
#define GUC_CLIENT_PRIORITY_NUM 4
-#define GUC_MAX_STAGE_DESCRIPTORS 1024
-#define GUC_INVALID_STAGE_ID GUC_MAX_STAGE_DESCRIPTORS
+#define GUC_MAX_LRC_DESCRIPTORS 65535
+#define GUC_INVALID_LRC_ID GUC_MAX_LRC_DESCRIPTORS
#define GUC_RENDER_ENGINE 0
#define GUC_VIDEO_ENGINE 1
@@ -81,15 +89,14 @@
#define GUC_LOG_ALLOC_IN_MEGABYTE (1 << 3)
#define GUC_LOG_CRASH_SHIFT 4
#define GUC_LOG_CRASH_MASK (0x3 << GUC_LOG_CRASH_SHIFT)
-#define GUC_LOG_DPC_SHIFT 6
-#define GUC_LOG_DPC_MASK (0x7 << GUC_LOG_DPC_SHIFT)
-#define GUC_LOG_ISR_SHIFT 9
-#define GUC_LOG_ISR_MASK (0x7 << GUC_LOG_ISR_SHIFT)
+#define GUC_LOG_DEBUG_SHIFT 6
+#define GUC_LOG_DEBUG_MASK (0xF << GUC_LOG_DEBUG_SHIFT)
#define GUC_LOG_BUF_ADDR_SHIFT 12
#define GUC_CTL_WA 1
#define GUC_CTL_FEATURE 2
#define GUC_CTL_DISABLE_SCHEDULER (1 << 14)
+#define GUC_CTL_ENABLE_SLPC BIT(2)
#define GUC_CTL_DEBUG 3
#define GUC_LOG_VERBOSITY_SHIFT 0
@@ -136,6 +143,11 @@
#define GUC_ID_TO_ENGINE_INSTANCE(guc_id) \
(((guc_id) & GUC_ENGINE_INSTANCE_MASK) >> GUC_ENGINE_INSTANCE_SHIFT)
+#define SLPC_EVENT(id, c) (\
+FIELD_PREP(HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ID, id) | \
+FIELD_PREP(HOST2GUC_PC_SLPC_REQUEST_MSG_1_EVENT_ARGC, c) \
+)
+
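+/* e.g. SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2) encodes dword 1 of a SLPC H2G request */
+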
static inline u8 engine_class_to_guc_class(u8 class)
{
BUILD_BUG_ON(GUC_RENDER_CLASS != RENDER_CLASS);
@@ -177,66 +189,40 @@ struct guc_process_desc {
u32 reserved[30];
} __packed;
-/* engine id and context id is packed into guc_execlist_context.context_id*/
-#define GUC_ELC_CTXID_OFFSET 0
-#define GUC_ELC_ENGINE_OFFSET 29
+#define CONTEXT_REGISTRATION_FLAG_KMD BIT(0)
-/* The execlist context including software and HW information */
-struct guc_execlist_context {
- u32 context_desc;
- u32 context_id;
- u32 ring_status;
- u32 ring_lrca;
- u32 ring_begin;
- u32 ring_end;
- u32 ring_next_free_location;
- u32 ring_current_tail_pointer_value;
- u8 engine_state_submit_value;
- u8 engine_state_wait_value;
- u16 pagefault_count;
- u16 engine_submit_queue_count;
-} __packed;
+#define CONTEXT_POLICY_DEFAULT_EXECUTION_QUANTUM_US 1000000
+#define CONTEXT_POLICY_DEFAULT_PREEMPTION_TIME_US 500000
+
+/* Preempt to idle on quantum expiry */
+#define CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE BIT(0)
/*
- * This structure describes a stage set arranged for a particular communication
- * between uKernel (GuC) and Driver (KMD). Technically, this is known as a
- * "GuC Context descriptor" in the specs, but we use the term "stage descriptor"
- * to avoid confusion with all the other things already named "context" in the
- * driver. A static pool of these descriptors are stored inside a GEM object
- * (stage_desc_pool) which is held for the entire lifetime of our interaction
- * with the GuC, being allocated before the GuC is loaded with its firmware.
+ * GuC Context registration descriptor.
+ * FIXME: This is only required to exist during context registration.
+ * The current 1:1 mapping between guc_lrc_desc and LRCs for the lifetime of
+ * the LRC is not required.
*/
-struct guc_stage_desc {
- u32 sched_common_area;
- u32 stage_id;
- u32 pas_id;
- u8 engines_used;
- u64 db_trigger_cpu;
- u32 db_trigger_uk;
- u64 db_trigger_phy;
- u16 db_id;
-
- struct guc_execlist_context lrc[GUC_MAX_ENGINES_NUM];
-
- u8 attribute;
-
+struct guc_lrc_desc {
+ u32 hw_context_desc;
+ u32 slpm_perf_mode_hint; /* SLPC v1 only */
+ u32 slpm_freq_hint;
+ u32 engine_submit_mask; /* In logical space */
+ u8 engine_class;
+ u8 reserved0[3];
u32 priority;
-
- u32 wq_sampled_tail_offset;
- u32 wq_total_submit_enqueues;
-
u32 process_desc;
u32 wq_addr;
u32 wq_size;
-
- u32 engine_presence;
-
- u8 engine_suspended;
-
- u8 reserved0[3];
- u64 reserved1[1];
-
- u64 desc_private;
+ u32 context_flags; /* CONTEXT_REGISTRATION_* */
+ /* Time for one workload to execute. (in microseconds) */
+ u32 execution_quantum;
+ /* Time to wait for a preemption request to complete before issuing a
+ * reset. (in microseconds).
+ */
+ u32 preemption_timeout;
+ u32 policy_flags; /* CONTEXT_POLICY_* */
+ u32 reserved1[19];
} __packed;
#define GUC_POWER_UNSPECIFIED 0
@@ -247,32 +233,14 @@ struct guc_stage_desc {
/* Scheduling policy settings */
-/* Reset engine upon preempt failure */
-#define POLICY_RESET_ENGINE (1<<0)
-/* Preempt to idle on quantum expiry */
-#define POLICY_PREEMPT_TO_IDLE (1<<1)
+#define GLOBAL_POLICY_MAX_NUM_WI 15
-#define POLICY_MAX_NUM_WI 15
-#define POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
-#define POLICY_DEFAULT_EXECUTION_QUANTUM_US 1000000
-#define POLICY_DEFAULT_PREEMPTION_TIME_US 500000
-#define POLICY_DEFAULT_FAULT_TIME_US 250000
+/* Don't reset an engine upon preemption failure */
+#define GLOBAL_POLICY_DISABLE_ENGINE_RESET BIT(0)
-struct guc_policy {
- /* Time for one workload to execute. (in micro seconds) */
- u32 execution_quantum;
- /* Time to wait for a preemption request to completed before issuing a
- * reset. (in micro seconds). */
- u32 preemption_time;
- /* How much time to allow to run after the first fault is observed.
- * Then preempt afterwards. (in micro seconds) */
- u32 fault_time;
- u32 policy_flags;
- u32 reserved[8];
-} __packed;
+#define GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US 500000
struct guc_policies {
- struct guc_policy policy[GUC_CLIENT_PRIORITY_NUM][GUC_MAX_ENGINE_CLASSES];
u32 submission_queue_depth[GUC_MAX_ENGINE_CLASSES];
/* In micro seconds. How much time to allow before DPC processing is
* called back via interrupt (to prevent DPC queue drain starving).
@@ -286,6 +254,7 @@ struct guc_policies {
* idle. */
u32 max_num_work_items;
+ u32 global_flags;
u32 reserved[4];
} __packed;
@@ -311,29 +280,13 @@ struct guc_gt_system_info {
u32 generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_MAX];
} __packed;
-/* Clients info */
-struct guc_ct_pool_entry {
- struct guc_ct_buffer_desc desc;
- u32 reserved[7];
-} __packed;
-
-#define GUC_CT_POOL_SIZE 2
-
-struct guc_clients_info {
- u32 clients_num;
- u32 reserved0[13];
- u32 ct_pool_addr;
- u32 ct_pool_count;
- u32 reserved[4];
-} __packed;
-
/* GuC Additional Data Struct */
struct guc_ads {
struct guc_mmio_reg_set reg_state_list[GUC_MAX_ENGINE_CLASSES][GUC_MAX_INSTANCES_PER_CLASS];
u32 reserved0;
u32 scheduler_policies;
u32 gt_system_info;
- u32 clients_info;
+ u32 reserved1;
u32 control_data;
u32 golden_context_lrca[GUC_MAX_ENGINE_CLASSES];
u32 eng_state_size[GUC_MAX_ENGINE_CLASSES];
@@ -344,8 +297,7 @@ struct guc_ads {
/* GuC logging structures */
enum guc_log_buffer_type {
- GUC_ISR_LOG_BUFFER,
- GUC_DPC_LOG_BUFFER,
+ GUC_DEBUG_LOG_BUFFER,
GUC_CRASH_DUMP_LOG_BUFFER,
GUC_MAX_LOG_BUFFER
};
@@ -414,23 +366,6 @@ struct guc_shared_ctx_data {
struct guc_ctx_report preempt_ctx_report[GUC_MAX_ENGINES_NUM];
} __packed;
-#define __INTEL_GUC_MSG_GET(T, m) \
- (((m) & INTEL_GUC_MSG_ ## T ## _MASK) >> INTEL_GUC_MSG_ ## T ## _SHIFT)
-#define INTEL_GUC_MSG_TO_TYPE(m) __INTEL_GUC_MSG_GET(TYPE, m)
-#define INTEL_GUC_MSG_TO_DATA(m) __INTEL_GUC_MSG_GET(DATA, m)
-#define INTEL_GUC_MSG_TO_CODE(m) __INTEL_GUC_MSG_GET(CODE, m)
-
-#define __INTEL_GUC_MSG_TYPE_IS(T, m) \
- (INTEL_GUC_MSG_TO_TYPE(m) == INTEL_GUC_MSG_TYPE_ ## T)
-#define INTEL_GUC_MSG_IS_REQUEST(m) __INTEL_GUC_MSG_TYPE_IS(REQUEST, m)
-#define INTEL_GUC_MSG_IS_RESPONSE(m) __INTEL_GUC_MSG_TYPE_IS(RESPONSE, m)
-
-#define INTEL_GUC_MSG_IS_RESPONSE_SUCCESS(m) \
- (typecheck(u32, (m)) && \
- ((m) & (INTEL_GUC_MSG_TYPE_MASK | INTEL_GUC_MSG_CODE_MASK)) == \
- ((INTEL_GUC_MSG_TYPE_RESPONSE << INTEL_GUC_MSG_TYPE_SHIFT) | \
- (INTEL_GUC_RESPONSE_STATUS_SUCCESS << INTEL_GUC_MSG_CODE_SHIFT)))
-
/* This action will be programmed in C1BC - SOFT_SCRATCH_15_REG */
enum intel_guc_recv_message {
INTEL_GUC_RECV_MSG_CRASH_DUMP_POSTED = BIT(1),
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
index c36d5eb5bbb9..ac0931f0374b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c
@@ -197,10 +197,8 @@ static bool guc_check_log_buf_overflow(struct intel_guc_log *log,
static unsigned int guc_get_log_buffer_size(enum guc_log_buffer_type type)
{
switch (type) {
- case GUC_ISR_LOG_BUFFER:
- return ISR_BUFFER_SIZE;
- case GUC_DPC_LOG_BUFFER:
- return DPC_BUFFER_SIZE;
+ case GUC_DEBUG_LOG_BUFFER:
+ return DEBUG_BUFFER_SIZE;
case GUC_CRASH_DUMP_LOG_BUFFER:
return CRASH_BUFFER_SIZE;
default:
@@ -245,7 +243,7 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log)
src_data += PAGE_SIZE;
dst_data += PAGE_SIZE;
- for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+ for (type = GUC_DEBUG_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
/*
* Make a copy of the state structure, inside GuC log buffer
* (which is uncached mapped), on the stack to avoid reading
@@ -463,21 +461,16 @@ int intel_guc_log_create(struct intel_guc_log *log)
* +===============================+ 00B
* | Crash dump state header |
* +-------------------------------+ 32B
- * | DPC state header |
+ * | Debug state header |
* +-------------------------------+ 64B
- * | ISR state header |
- * +-------------------------------+ 96B
* | |
* +===============================+ PAGE_SIZE (4KB)
* | Crash Dump logs |
* +===============================+ + CRASH_SIZE
- * | DPC logs |
- * +===============================+ + DPC_SIZE
- * | ISR logs |
- * +===============================+ + ISR_SIZE
+ * | Debug logs |
+ * +===============================+ + DEBUG_SIZE
*/
- guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DPC_BUFFER_SIZE +
- ISR_BUFFER_SIZE;
+ guc_log_size = PAGE_SIZE + CRASH_BUFFER_SIZE + DEBUG_BUFFER_SIZE;
vma = intel_guc_allocate_vma(guc, guc_log_size);
if (IS_ERR(vma)) {
@@ -675,10 +668,8 @@ static const char *
stringify_guc_log_type(enum guc_log_buffer_type type)
{
switch (type) {
- case GUC_ISR_LOG_BUFFER:
- return "ISR";
- case GUC_DPC_LOG_BUFFER:
- return "DPC";
+ case GUC_DEBUG_LOG_BUFFER:
+ return "DEBUG";
case GUC_CRASH_DUMP_LOG_BUFFER:
return "CRASH";
default:
@@ -708,7 +699,7 @@ void intel_guc_log_info(struct intel_guc_log *log, struct drm_printer *p)
drm_printf(p, "\tRelay full count: %u\n", log->relay.full_count);
- for (type = GUC_ISR_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
+ for (type = GUC_DEBUG_LOG_BUFFER; type < GUC_MAX_LOG_BUFFER; type++) {
drm_printf(p, "\t%s:\tflush count %10u, overflow count %10u\n",
stringify_guc_log_type(type),
log->stats[type].flush,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
index 11fccd0b2294..ac1ee1d5ce10 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.h
@@ -17,12 +17,10 @@ struct intel_guc;
#ifdef CONFIG_DRM_I915_DEBUG_GUC
#define CRASH_BUFFER_SIZE SZ_2M
-#define DPC_BUFFER_SIZE SZ_8M
-#define ISR_BUFFER_SIZE SZ_8M
+#define DEBUG_BUFFER_SIZE SZ_16M
#else
#define CRASH_BUFFER_SIZE SZ_8K
-#define DPC_BUFFER_SIZE SZ_32K
-#define ISR_BUFFER_SIZE SZ_32K
+#define DEBUG_BUFFER_SIZE SZ_64K
#endif
/*
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c
new file mode 100644
index 000000000000..fc805d466d99
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_rc.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include "intel_guc_rc.h"
+#include "gt/intel_gt.h"
+#include "i915_drv.h"
+
+static bool __guc_rc_supported(struct intel_guc *guc)
+{
+ /* GuC RC is unavailable for pre-Gen12 */
+ return guc->submission_supported &&
+ GRAPHICS_VER(guc_to_gt(guc)->i915) >= 12;
+}
+
+static bool __guc_rc_selected(struct intel_guc *guc)
+{
+ if (!intel_guc_rc_is_supported(guc))
+ return false;
+
+ return guc->submission_selected;
+}
+
+void intel_guc_rc_init_early(struct intel_guc *guc)
+{
+ guc->rc_supported = __guc_rc_supported(guc);
+ guc->rc_selected = __guc_rc_selected(guc);
+}
+
+static int guc_action_control_gucrc(struct intel_guc *guc, bool enable)
+{
+ u32 rc_mode = enable ? INTEL_GUCRC_FIRMWARE_CONTROL :
+ INTEL_GUCRC_HOST_CONTROL;
+ u32 action[] = {
+ INTEL_GUC_ACTION_SETUP_PC_GUCRC,
+ rc_mode
+ };
+ int ret;
+
+ ret = intel_guc_send(guc, action, ARRAY_SIZE(action));
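+ /* a positive value means GuC returned unexpected data; treat as protocol error */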
+ ret = ret > 0 ? -EPROTO : ret;
+
+ return ret;
+}
+
+static int __guc_rc_control(struct intel_guc *guc, bool enable)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ struct drm_device *drm = &guc_to_gt(guc)->i915->drm;
+ int ret;
+
+ if (!intel_uc_uses_guc_rc(&gt->uc))
+ return -EOPNOTSUPP;
+
+ if (!intel_guc_is_ready(guc))
+ return -EINVAL;
+
+ ret = guc_action_control_gucrc(guc, enable);
+ if (ret) {
+ drm_err(drm, "Failed to %s GuC RC (%pe)\n",
+ enabledisable(enable), ERR_PTR(ret));
+ return ret;
+ }
+
+ drm_info(&gt->i915->drm, "GuC RC: %s\n",
+ enableddisabled(enable));
+
+ return 0;
+}
+
+int intel_guc_rc_enable(struct intel_guc *guc)
+{
+ return __guc_rc_control(guc, true);
+}
+
+int intel_guc_rc_disable(struct intel_guc *guc)
+{
+ return __guc_rc_control(guc, false);
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_rc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_rc.h
new file mode 100644
index 000000000000..57e86c337838
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_rc.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef _INTEL_GUC_RC_H_
+#define _INTEL_GUC_RC_H_
+
+#include "intel_guc_submission.h"
+
+void intel_guc_rc_init_early(struct intel_guc *guc);
+
+static inline bool intel_guc_rc_is_supported(struct intel_guc *guc)
+{
+ return guc->rc_supported;
+}
+
+static inline bool intel_guc_rc_is_wanted(struct intel_guc *guc)
+{
+ return guc->submission_selected && intel_guc_rc_is_supported(guc);
+}
+
+static inline bool intel_guc_rc_is_used(struct intel_guc *guc)
+{
+ return intel_guc_submission_is_used(guc) && intel_guc_rc_is_wanted(guc);
+}
+
+int intel_guc_rc_enable(struct intel_guc *guc);
+int intel_guc_rc_disable(struct intel_guc *guc);
+
+#endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
new file mode 100644
index 000000000000..65a3e7fdb2b2
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -0,0 +1,626 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_guc_slpc.h"
+#include "gt/intel_gt.h"
+
+static inline struct intel_guc *slpc_to_guc(struct intel_guc_slpc *slpc)
+{
+ return container_of(slpc, struct intel_guc, slpc);
+}
+
+static inline struct intel_gt *slpc_to_gt(struct intel_guc_slpc *slpc)
+{
+ return guc_to_gt(slpc_to_guc(slpc));
+}
+
+static inline struct drm_i915_private *slpc_to_i915(struct intel_guc_slpc *slpc)
+{
+ return slpc_to_gt(slpc)->i915;
+}
+
+static bool __detect_slpc_supported(struct intel_guc *guc)
+{
+ /* GuC SLPC is unavailable for pre-Gen12 */
+ return guc->submission_supported &&
+ GRAPHICS_VER(guc_to_gt(guc)->i915) >= 12;
+}
+
+static bool __guc_slpc_selected(struct intel_guc *guc)
+{
+ if (!intel_guc_slpc_is_supported(guc))
+ return false;
+
+ return guc->submission_selected;
+}
+
+void intel_guc_slpc_init_early(struct intel_guc_slpc *slpc)
+{
+ struct intel_guc *guc = slpc_to_guc(slpc);
+
+ slpc->supported = __detect_slpc_supported(guc);
+ slpc->selected = __guc_slpc_selected(guc);
+}
+
+static void slpc_mem_set_param(struct slpc_shared_data *data,
+ u32 id, u32 value)
+{
+ GEM_BUG_ON(id >= SLPC_MAX_OVERRIDE_PARAMETERS);
+ /*
+ * When the flag bit is set, the corresponding value will be read
+ * and applied by SLPC.
+ */
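+ /* bits[] is an array of u32 words: id >> 5 picks the word, id % 32 the bit */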
+ data->override_params.bits[id >> 5] |= (1 << (id % 32));
+ data->override_params.values[id] = value;
+}
+
+static void slpc_mem_set_enabled(struct slpc_shared_data *data,
+ u8 enable_id, u8 disable_id)
+{
+ /*
+ * Enabling a param involves setting the enable_id
+ * to 1 and disable_id to 0.
+ */
+ slpc_mem_set_param(data, enable_id, 1);
+ slpc_mem_set_param(data, disable_id, 0);
+}
+
+static void slpc_mem_set_disabled(struct slpc_shared_data *data,
+ u8 enable_id, u8 disable_id)
+{
+ /*
+ * Disabling a param involves setting the enable_id
+ * to 0 and disable_id to 1.
+ */
+ slpc_mem_set_param(data, disable_id, 1);
+ slpc_mem_set_param(data, enable_id, 0);
+}
+
+int intel_guc_slpc_init(struct intel_guc_slpc *slpc)
+{
+ struct intel_guc *guc = slpc_to_guc(slpc);
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ u32 size = PAGE_ALIGN(sizeof(struct slpc_shared_data));
+ int err;
+
+ GEM_BUG_ON(slpc->vma);
+
+ err = intel_guc_allocate_and_map_vma(guc, size, &slpc->vma, (void **)&slpc->vaddr);
+ if (unlikely(err)) {
+ drm_err(&i915->drm,
+ "Failed to allocate SLPC struct (err=%pe)\n",
+ ERR_PTR(err));
+ return err;
+ }
+
+ slpc->max_freq_softlimit = 0;
+ slpc->min_freq_softlimit = 0;
+
+ return err;
+}
+
+static u32 slpc_get_state(struct intel_guc_slpc *slpc)
+{
+ struct slpc_shared_data *data;
+
+ GEM_BUG_ON(!slpc->vma);
+
+ drm_clflush_virt_range(slpc->vaddr, sizeof(u32));
+ data = slpc->vaddr;
+
+ return data->header.global_state;
+}
+
+static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value)
+{
+ u32 request[] = {
+ GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
+ SLPC_EVENT(SLPC_EVENT_PARAMETER_SET, 2),
+ id,
+ value,
+ };
+ int ret;
+
+ ret = intel_guc_send(guc, request, ARRAY_SIZE(request));
+
+ return ret > 0 ? -EPROTO : ret;
+}
+
+static int guc_action_slpc_unset_param(struct intel_guc *guc, u8 id)
+{
+ u32 request[] = {
+ GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
+ SLPC_EVENT(SLPC_EVENT_PARAMETER_UNSET, 2),
+ id,
+ };
+
+ return intel_guc_send(guc, request, ARRAY_SIZE(request));
+}
+
+static bool slpc_is_running(struct intel_guc_slpc *slpc)
+{
+ return slpc_get_state(slpc) == SLPC_GLOBAL_STATE_RUNNING;
+}
+
+static int guc_action_slpc_query(struct intel_guc *guc, u32 offset)
+{
+ u32 request[] = {
+ GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
+ SLPC_EVENT(SLPC_EVENT_QUERY_TASK_STATE, 2),
+ offset,
+ 0,
+ };
+ int ret;
+
+ ret = intel_guc_send(guc, request, ARRAY_SIZE(request));
+
+ return ret > 0 ? -EPROTO : ret;
+}
+
+static int slpc_query_task_state(struct intel_guc_slpc *slpc)
+{
+ struct intel_guc *guc = slpc_to_guc(slpc);
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ u32 offset = intel_guc_ggtt_offset(guc, slpc->vma);
+ int ret;
+
+ ret = guc_action_slpc_query(guc, offset);
+ if (unlikely(ret))
+ drm_err(&i915->drm, "Failed to query task state (%pe)\n",
+ ERR_PTR(ret));
+
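+ /* flush CPU caches so we read the state GuC just wrote to shared memory */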
+ drm_clflush_virt_range(slpc->vaddr, SLPC_PAGE_SIZE_BYTES);
+
+ return ret;
+}
+
+static int slpc_set_param(struct intel_guc_slpc *slpc, u8 id, u32 value)
+{
+ struct intel_guc *guc = slpc_to_guc(slpc);
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ int ret;
+
+ GEM_BUG_ON(id >= SLPC_MAX_PARAM);
+
+ ret = guc_action_slpc_set_param(guc, id, value);
+ if (ret)
+ drm_err(&i915->drm, "Failed to set param %d to %u (%pe)\n",
+ id, value, ERR_PTR(ret));
+
+ return ret;
+}
+
+static int slpc_unset_param(struct intel_guc_slpc *slpc,
+ u8 id)
+{
+ struct intel_guc *guc = slpc_to_guc(slpc);
+
+ GEM_BUG_ON(id >= SLPC_MAX_PARAM);
+
+ return guc_action_slpc_unset_param(guc, id);
+}
+
+static const char *slpc_global_state_to_string(enum slpc_global_state state)
+{
+ switch (state) {
+ case SLPC_GLOBAL_STATE_NOT_RUNNING:
+ return "not running";
+ case SLPC_GLOBAL_STATE_INITIALIZING:
+ return "initializing";
+ case SLPC_GLOBAL_STATE_RESETTING:
+ return "resetting";
+ case SLPC_GLOBAL_STATE_RUNNING:
+ return "running";
+ case SLPC_GLOBAL_STATE_SHUTTING_DOWN:
+ return "shutting down";
+ case SLPC_GLOBAL_STATE_ERROR:
+ return "error";
+ default:
+ return "unknown";
+ }
+}
+
+static const char *slpc_get_state_string(struct intel_guc_slpc *slpc)
+{
+ return slpc_global_state_to_string(slpc_get_state(slpc));
+}
+
+static int guc_action_slpc_reset(struct intel_guc *guc, u32 offset)
+{
+ u32 request[] = {
+ GUC_ACTION_HOST2GUC_PC_SLPC_REQUEST,
+ SLPC_EVENT(SLPC_EVENT_RESET, 2),
+ offset,
+ 0,
+ };
+ int ret;
+
+ ret = intel_guc_send(guc, request, ARRAY_SIZE(request));
+
+ return ret > 0 ? -EPROTO : ret;
+}
+
+static int slpc_reset(struct intel_guc_slpc *slpc)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ struct intel_guc *guc = slpc_to_guc(slpc);
+ u32 offset = intel_guc_ggtt_offset(guc, slpc->vma);
+ int ret;
+
+ ret = guc_action_slpc_reset(guc, offset);
+
+ if (unlikely(ret < 0)) {
+ drm_err(&i915->drm, "SLPC reset action failed (%pe)\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ if (!ret) {
+ if (wait_for(slpc_is_running(slpc), SLPC_RESET_TIMEOUT_MS)) {
+ drm_err(&i915->drm, "SLPC not enabled! State = %s\n",
+ slpc_get_state_string(slpc));
+ return -EIO;
+ }
+ }
+
+ return 0;
+}
+
+static u32 slpc_decode_min_freq(struct intel_guc_slpc *slpc)
+{
+ struct slpc_shared_data *data = slpc->vaddr;
+
+ GEM_BUG_ON(!slpc->vma);
+
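+ /* scale the encoded frequency field to MHz */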
+ return DIV_ROUND_CLOSEST(REG_FIELD_GET(SLPC_MIN_UNSLICE_FREQ_MASK,
+ data->task_state_data.freq) *
+ GT_FREQUENCY_MULTIPLIER, GEN9_FREQ_SCALER);
+}
+
+static u32 slpc_decode_max_freq(struct intel_guc_slpc *slpc)
+{
+ struct slpc_shared_data *data = slpc->vaddr;
+
+ GEM_BUG_ON(!slpc->vma);
+
+ return DIV_ROUND_CLOSEST(REG_FIELD_GET(SLPC_MAX_UNSLICE_FREQ_MASK,
+ data->task_state_data.freq) *
+ GT_FREQUENCY_MULTIPLIER, GEN9_FREQ_SCALER);
+}
+
+static void slpc_shared_data_reset(struct slpc_shared_data *data)
+{
+ memset(data, 0, sizeof(struct slpc_shared_data));
+
+ data->header.size = sizeof(struct slpc_shared_data);
+
+ /* Enable only GTPERF task, disable others */
+ slpc_mem_set_enabled(data, SLPC_PARAM_TASK_ENABLE_GTPERF,
+ SLPC_PARAM_TASK_DISABLE_GTPERF);
+
+ slpc_mem_set_disabled(data, SLPC_PARAM_TASK_ENABLE_BALANCER,
+ SLPC_PARAM_TASK_DISABLE_BALANCER);
+
+ slpc_mem_set_disabled(data, SLPC_PARAM_TASK_ENABLE_DCC,
+ SLPC_PARAM_TASK_DISABLE_DCC);
+}
+
+/**
+ * intel_guc_slpc_set_max_freq() - Set max frequency limit for SLPC.
+ * @slpc: pointer to intel_guc_slpc.
+ * @val: frequency (MHz)
+ *
+ * This function will invoke GuC SLPC action to update the max frequency
+ * limit for unslice.
+ *
+ * Return: 0 on success, non-zero error code on failure.
+ */
+int intel_guc_slpc_set_max_freq(struct intel_guc_slpc *slpc, u32 val)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ intel_wakeref_t wakeref;
+ int ret;
+
+ if (val < slpc->min_freq ||
+ val > slpc->rp0_freq ||
+ val < slpc->min_freq_softlimit)
+ return -EINVAL;
+
+ with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+ ret = slpc_set_param(slpc,
+ SLPC_PARAM_GLOBAL_MAX_GT_UNSLICE_FREQ_MHZ,
+ val);
+
+ /* Return standardized err code for sysfs calls */
+ if (ret)
+ ret = -EIO;
+ }
+
+ if (!ret)
+ slpc->max_freq_softlimit = val;
+
+ return ret;
+}
+
+/**
+ * intel_guc_slpc_get_max_freq() - Get max frequency limit for SLPC.
+ * @slpc: pointer to intel_guc_slpc.
+ * @val: pointer to val which will hold max frequency (MHz)
+ *
+ * This function will invoke GuC SLPC action to read the max frequency
+ * limit for unslice.
+ *
+ * Return: 0 on success, non-zero error code on failure.
+ */
+int intel_guc_slpc_get_max_freq(struct intel_guc_slpc *slpc, u32 *val)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ intel_wakeref_t wakeref;
+ int ret = 0;
+
+ with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+ /* Force GuC to update task data */
+ ret = slpc_query_task_state(slpc);
+
+ if (!ret)
+ *val = slpc_decode_max_freq(slpc);
+ }
+
+ return ret;
+}
+
+/**
+ * intel_guc_slpc_set_min_freq() - Set min frequency limit for SLPC.
+ * @slpc: pointer to intel_guc_slpc.
+ * @val: frequency (MHz)
+ *
+ * This function will invoke GuC SLPC action to update the min unslice
+ * frequency.
+ *
+ * Return: 0 on success, non-zero error code on failure.
+ */
+int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ intel_wakeref_t wakeref;
+ int ret;
+
+ if (val < slpc->min_freq ||
+ val > slpc->rp0_freq ||
+ val > slpc->max_freq_softlimit)
+ return -EINVAL;
+
+ with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+ ret = slpc_set_param(slpc,
+ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ val);
+
+ /* Return standardized err code for sysfs calls */
+ if (ret)
+ ret = -EIO;
+ }
+
+ if (!ret)
+ slpc->min_freq_softlimit = val;
+
+ return ret;
+}
+
+/**
+ * intel_guc_slpc_get_min_freq() - Get min frequency limit for SLPC.
+ * @slpc: pointer to intel_guc_slpc.
+ * @val: pointer to val which will hold min frequency (MHz)
+ *
+ * This function will invoke GuC SLPC action to read the min frequency
+ * limit for unslice.
+ *
+ * Return: 0 on success, non-zero error code on failure.
+ */
+int intel_guc_slpc_get_min_freq(struct intel_guc_slpc *slpc, u32 *val)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ intel_wakeref_t wakeref;
+ int ret = 0;
+
+ with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+ /* Force GuC to update task data */
+ ret = slpc_query_task_state(slpc);
+
+ if (!ret)
+ *val = slpc_decode_min_freq(slpc);
+ }
+
+ return ret;
+}
+
+void intel_guc_pm_intrmsk_enable(struct intel_gt *gt)
+{
+ u32 pm_intrmsk_mbz = 0;
+
+ /*
+ * Allow GuC to receive the ARAT timer expiry event.
+ * This interrupt register is set up by the RPS code
+ * when host-based turbo is enabled.
+ */
+ pm_intrmsk_mbz |= ARAT_EXPIRED_INTRMSK;
+
+ intel_uncore_rmw(gt->uncore,
+ GEN6_PMINTRMSK, pm_intrmsk_mbz, 0);
+}
+
+static int slpc_set_softlimits(struct intel_guc_slpc *slpc)
+{
+ int ret = 0;
+
+ /*
+ * Softlimits are initially equivalent to the platform limits
+ * unless they have deviated from the defaults, in which case
+ * we retain the values and set min/max accordingly.
+ */
+ if (!slpc->max_freq_softlimit)
+ slpc->max_freq_softlimit = slpc->rp0_freq;
+ else if (slpc->max_freq_softlimit != slpc->rp0_freq)
+ ret = intel_guc_slpc_set_max_freq(slpc,
+ slpc->max_freq_softlimit);
+
+ if (unlikely(ret))
+ return ret;
+
+ if (!slpc->min_freq_softlimit)
+ slpc->min_freq_softlimit = slpc->min_freq;
+ else if (slpc->min_freq_softlimit != slpc->min_freq)
+ return intel_guc_slpc_set_min_freq(slpc,
+ slpc->min_freq_softlimit);
+
+ return 0;
+}
+
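+/*
+ * When the efficient frequency is ignored, the min softlimit is also pinned
+ * to the platform min so SLPC has an explicit floor to honour; unsetting
+ * both params restores SLPC's defaults.
+ */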
+static int slpc_ignore_eff_freq(struct intel_guc_slpc *slpc, bool ignore)
+{
+ int ret = 0;
+
+ if (ignore) {
+ ret = slpc_set_param(slpc,
+ SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY,
+ ignore);
+ if (!ret)
+ return slpc_set_param(slpc,
+ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ slpc->min_freq);
+ } else {
+ ret = slpc_unset_param(slpc,
+ SLPC_PARAM_IGNORE_EFFICIENT_FREQUENCY);
+ if (!ret)
+ return slpc_unset_param(slpc,
+ SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ);
+ }
+
+ return ret;
+}
+
+static int slpc_use_fused_rp0(struct intel_guc_slpc *slpc)
+{
+ /* Force SLPC to use the fused platform RP0 */
+ return slpc_set_param(slpc,
+ SLPC_PARAM_GLOBAL_MAX_GT_UNSLICE_FREQ_MHZ,
+ slpc->rp0_freq);
+}
+
+static void slpc_get_rp_values(struct intel_guc_slpc *slpc)
+{
+ u32 rp_state_cap;
+
+ rp_state_cap = intel_uncore_read(slpc_to_gt(slpc)->uncore,
+ GEN6_RP_STATE_CAP);
+
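+ /* RP0/RP1/RPn are the fused platform frequency caps */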
+ slpc->rp0_freq = REG_FIELD_GET(RP0_CAP_MASK, rp_state_cap) *
+ GT_FREQUENCY_MULTIPLIER;
+ slpc->rp1_freq = REG_FIELD_GET(RP1_CAP_MASK, rp_state_cap) *
+ GT_FREQUENCY_MULTIPLIER;
+ slpc->min_freq = REG_FIELD_GET(RPN_CAP_MASK, rp_state_cap) *
+ GT_FREQUENCY_MULTIPLIER;
+}
+
+/**
+ * intel_guc_slpc_enable() - Start SLPC
+ * @slpc: pointer to intel_guc_slpc.
+ *
+ * SLPC is enabled by setting up the shared data structure and
+ * sending a reset event to GuC SLPC. Initial data is set up in
+ * intel_guc_slpc_init. Here we send the reset event. We do
+ * not currently need a slpc_disable since this is taken care
+ * of automatically when a reset/suspend occurs and the GuC
+ * CTB is destroyed.
+ *
+ * Return: 0 on success, non-zero error code on failure.
+ */
+int intel_guc_slpc_enable(struct intel_guc_slpc *slpc)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ int ret;
+
+ GEM_BUG_ON(!slpc->vma);
+
+ slpc_shared_data_reset(slpc->vaddr);
+
+ ret = slpc_reset(slpc);
+ if (unlikely(ret < 0)) {
+ drm_err(&i915->drm, "SLPC Reset event returned (%pe)\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ ret = slpc_query_task_state(slpc);
+ if (unlikely(ret < 0))
+ return ret;
+
+ intel_guc_pm_intrmsk_enable(&i915->gt);
+
+ slpc_get_rp_values(slpc);
+
+ /* Ignore efficient freq and set min to platform min */
+ ret = slpc_ignore_eff_freq(slpc, true);
+ if (unlikely(ret)) {
+ drm_err(&i915->drm, "Failed to set SLPC min to RPn (%pe)\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ /* Set SLPC max limit to RP0 */
+ ret = slpc_use_fused_rp0(slpc);
+ if (unlikely(ret)) {
+ drm_err(&i915->drm, "Failed to set SLPC max to RP0 (%pe)\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ /* Revert SLPC min/max to softlimits if necessary */
+ ret = slpc_set_softlimits(slpc);
+ if (unlikely(ret)) {
+ drm_err(&i915->drm, "Failed to set SLPC softlimits (%pe)\n",
+ ERR_PTR(ret));
+ return ret;
+ }
+
+ return 0;
+}
+
+int intel_guc_slpc_print_info(struct intel_guc_slpc *slpc, struct drm_printer *p)
+{
+ struct drm_i915_private *i915 = slpc_to_i915(slpc);
+ struct slpc_shared_data *data = slpc->vaddr;
+ struct slpc_task_state_data *slpc_tasks;
+ intel_wakeref_t wakeref;
+ int ret = 0;
+
+ GEM_BUG_ON(!slpc->vma);
+
+ with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
+ ret = slpc_query_task_state(slpc);
+
+ if (!ret) {
+ slpc_tasks = &data->task_state_data;
+
+ drm_printf(p, "\tSLPC state: %s\n", slpc_get_state_string(slpc));
+ drm_printf(p, "\tGTPERF task active: %s\n",
+ yesno(slpc_tasks->status & SLPC_GTPERF_TASK_ENABLED));
+ drm_printf(p, "\tMax freq: %u MHz\n",
+ slpc_decode_max_freq(slpc));
+ drm_printf(p, "\tMin freq: %u MHz\n",
+ slpc_decode_min_freq(slpc));
+ }
+ }
+
+ return ret;
+}
+
+void intel_guc_slpc_fini(struct intel_guc_slpc *slpc)
+{
+ if (!slpc->vma)
+ return;
+
+ i915_vma_unpin_and_release(&slpc->vma, I915_VMA_RELEASE_MAP);
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
new file mode 100644
index 000000000000..e45054d5b9b4
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef _INTEL_GUC_SLPC_H_
+#define _INTEL_GUC_SLPC_H_
+
+#include "intel_guc_submission.h"
+#include "intel_guc_slpc_types.h"
+
+struct intel_gt;
+struct drm_printer;
+
+static inline bool intel_guc_slpc_is_supported(struct intel_guc *guc)
+{
+ return guc->slpc.supported;
+}
+
+static inline bool intel_guc_slpc_is_wanted(struct intel_guc *guc)
+{
+ return guc->slpc.selected;
+}
+
+static inline bool intel_guc_slpc_is_used(struct intel_guc *guc)
+{
+ return intel_guc_submission_is_used(guc) && intel_guc_slpc_is_wanted(guc);
+}
+
+void intel_guc_slpc_init_early(struct intel_guc_slpc *slpc);
+
+int intel_guc_slpc_init(struct intel_guc_slpc *slpc);
+int intel_guc_slpc_enable(struct intel_guc_slpc *slpc);
+void intel_guc_slpc_fini(struct intel_guc_slpc *slpc);
+int intel_guc_slpc_set_max_freq(struct intel_guc_slpc *slpc, u32 val);
+int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val);
+int intel_guc_slpc_get_max_freq(struct intel_guc_slpc *slpc, u32 *val);
+int intel_guc_slpc_get_min_freq(struct intel_guc_slpc *slpc, u32 *val);
+int intel_guc_slpc_print_info(struct intel_guc_slpc *slpc, struct drm_printer *p);
+void intel_guc_pm_intrmsk_enable(struct intel_gt *gt);
+
+#endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
new file mode 100644
index 000000000000..41d13527666f
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc_types.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef _INTEL_GUC_SLPC_TYPES_H_
+#define _INTEL_GUC_SLPC_TYPES_H_
+
+#include <linux/types.h>
+
+#define SLPC_RESET_TIMEOUT_MS 5
+
+struct intel_guc_slpc {
+ struct i915_vma *vma;
+ struct slpc_shared_data *vaddr;
+ bool supported;
+ bool selected;
+
+ /* platform frequency limits */
+ u32 min_freq;
+ u32 rp0_freq;
+ u32 rp1_freq;
+
+ /* frequency softlimits */
+ u32 min_freq_softlimit;
+ u32 max_freq_softlimit;
+};
+
+#endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 7c8ff9792f7b..87d8dc8f51b9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -10,10 +10,13 @@
#include "gt/intel_breadcrumbs.h"
#include "gt/intel_context.h"
#include "gt/intel_engine_pm.h"
+#include "gt/intel_engine_heartbeat.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_irq.h"
#include "gt/intel_gt_pm.h"
+#include "gt/intel_gt_requests.h"
#include "gt/intel_lrc.h"
+#include "gt/intel_lrc_reg.h"
#include "gt/intel_mocs.h"
#include "gt/intel_ring.h"
@@ -58,244 +61,705 @@
*
*/
+/* GuC Virtual Engine */
+struct guc_virtual_engine {
+ struct intel_engine_cs base;
+ struct intel_context context;
+};
+
+static struct intel_context *
+guc_create_virtual(struct intel_engine_cs **siblings, unsigned int count);
+
#define GUC_REQUEST_SIZE 64 /* bytes */
-static inline struct i915_priolist *to_priolist(struct rb_node *rb)
+/*
+ * Below is a set of functions which control the GuC scheduling state which do
+ * not require a lock as all state transitions are mutually exclusive, i.e. it
+ * is not possible for the context pinning code and submission, for the same
+ * context, to be executing simultaneously. We still need an atomic as it is
+ * possible for some of the bits to change at the same time though.
+ */
+#define SCHED_STATE_NO_LOCK_ENABLED BIT(0)
+#define SCHED_STATE_NO_LOCK_PENDING_ENABLE BIT(1)
+#define SCHED_STATE_NO_LOCK_REGISTERED BIT(2)
+static inline bool context_enabled(struct intel_context *ce)
{
- return rb_entry(rb, struct i915_priolist, node);
+ return (atomic_read(&ce->guc_sched_state_no_lock) &
+ SCHED_STATE_NO_LOCK_ENABLED);
}
-static struct guc_stage_desc *__get_stage_desc(struct intel_guc *guc, u32 id)
+static inline void set_context_enabled(struct intel_context *ce)
{
- struct guc_stage_desc *base = guc->stage_desc_pool_vaddr;
+ atomic_or(SCHED_STATE_NO_LOCK_ENABLED, &ce->guc_sched_state_no_lock);
+}
- return &base[id];
+static inline void clr_context_enabled(struct intel_context *ce)
+{
+ atomic_and((u32)~SCHED_STATE_NO_LOCK_ENABLED,
+ &ce->guc_sched_state_no_lock);
}
-static int guc_stage_desc_pool_create(struct intel_guc *guc)
+static inline bool context_pending_enable(struct intel_context *ce)
{
- u32 size = PAGE_ALIGN(sizeof(struct guc_stage_desc) *
- GUC_MAX_STAGE_DESCRIPTORS);
+ return (atomic_read(&ce->guc_sched_state_no_lock) &
+ SCHED_STATE_NO_LOCK_PENDING_ENABLE);
+}
- return intel_guc_allocate_and_map_vma(guc, size, &guc->stage_desc_pool,
- &guc->stage_desc_pool_vaddr);
+static inline void set_context_pending_enable(struct intel_context *ce)
+{
+ atomic_or(SCHED_STATE_NO_LOCK_PENDING_ENABLE,
+ &ce->guc_sched_state_no_lock);
+}
+
+static inline void clr_context_pending_enable(struct intel_context *ce)
+{
+ atomic_and((u32)~SCHED_STATE_NO_LOCK_PENDING_ENABLE,
+ &ce->guc_sched_state_no_lock);
}
-static void guc_stage_desc_pool_destroy(struct intel_guc *guc)
+static inline bool context_registered(struct intel_context *ce)
{
- i915_vma_unpin_and_release(&guc->stage_desc_pool, I915_VMA_RELEASE_MAP);
+ return (atomic_read(&ce->guc_sched_state_no_lock) &
+ SCHED_STATE_NO_LOCK_REGISTERED);
+}
+
+static inline void set_context_registered(struct intel_context *ce)
+{
+ atomic_or(SCHED_STATE_NO_LOCK_REGISTERED,
+ &ce->guc_sched_state_no_lock);
+}
+
+static inline void clr_context_registered(struct intel_context *ce)
+{
+ atomic_and((u32)~SCHED_STATE_NO_LOCK_REGISTERED,
+ &ce->guc_sched_state_no_lock);
}
/*
- * Initialise/clear the stage descriptor shared with the GuC firmware.
- *
- * This descriptor tells the GuC where (in GGTT space) to find the important
- * data structures related to work submission (process descriptor, write queue,
- * etc).
+ * Below is a set of functions which control the GuC scheduling state which
+ * require a lock, aside from the special case where the functions are called
+ * from guc_lrc_desc_pin(). In that case it isn't possible for any other code
+ * path to be executing on the context.
*/
-static void guc_stage_desc_init(struct intel_guc *guc)
+#define SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTER BIT(0)
+#define SCHED_STATE_DESTROYED BIT(1)
+#define SCHED_STATE_PENDING_DISABLE BIT(2)
+#define SCHED_STATE_BANNED BIT(3)
+#define SCHED_STATE_BLOCKED_SHIFT 4
+#define SCHED_STATE_BLOCKED BIT(SCHED_STATE_BLOCKED_SHIFT)
+#define SCHED_STATE_BLOCKED_MASK (0xfff << SCHED_STATE_BLOCKED_SHIFT)
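+/* BLOCKED is a 12-bit nesting counter stored above the flag bits */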
+static inline void init_sched_state(struct intel_context *ce)
+{
+ /* Should only be called from guc_lrc_desc_pin() */
+ atomic_set(&ce->guc_sched_state_no_lock, 0);
+ ce->guc_state.sched_state = 0;
+}
+
+static inline bool
+context_wait_for_deregister_to_register(struct intel_context *ce)
+{
+ return ce->guc_state.sched_state &
+ SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTER;
+}
+
+static inline void
+set_context_wait_for_deregister_to_register(struct intel_context *ce)
{
- struct guc_stage_desc *desc;
+ /* Should only be called from guc_lrc_desc_pin() without the lock held */
+ ce->guc_state.sched_state |=
+ SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTER;
+}
- /* we only use 1 stage desc, so hardcode it to 0 */
- desc = __get_stage_desc(guc, 0);
- memset(desc, 0, sizeof(*desc));
+static inline void
+clr_context_wait_for_deregister_to_register(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_state.lock);
+ ce->guc_state.sched_state &=
+ ~SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTER;
+}
- desc->attribute = GUC_STAGE_DESC_ATTR_ACTIVE |
- GUC_STAGE_DESC_ATTR_KERNEL;
+static inline bool
+context_destroyed(struct intel_context *ce)
+{
+ return ce->guc_state.sched_state & SCHED_STATE_DESTROYED;
+}
- desc->stage_id = 0;
- desc->priority = GUC_CLIENT_PRIORITY_KMD_NORMAL;
+static inline void
+set_context_destroyed(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_state.lock);
+ ce->guc_state.sched_state |= SCHED_STATE_DESTROYED;
+}
- desc->wq_size = GUC_WQ_SIZE;
+static inline bool context_pending_disable(struct intel_context *ce)
+{
+ return ce->guc_state.sched_state & SCHED_STATE_PENDING_DISABLE;
}
-static void guc_stage_desc_fini(struct intel_guc *guc)
+static inline void set_context_pending_disable(struct intel_context *ce)
{
- struct guc_stage_desc *desc;
+ lockdep_assert_held(&ce->guc_state.lock);
+ ce->guc_state.sched_state |= SCHED_STATE_PENDING_DISABLE;
+}
- desc = __get_stage_desc(guc, 0);
- memset(desc, 0, sizeof(*desc));
+static inline void clr_context_pending_disable(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_state.lock);
+ ce->guc_state.sched_state &= ~SCHED_STATE_PENDING_DISABLE;
}
-static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
+static inline bool context_banned(struct intel_context *ce)
{
- /* Leaving stub as this function will be used in future patches */
+ return ce->guc_state.sched_state & SCHED_STATE_BANNED;
}
-/*
- * When we're doing submissions using regular execlists backend, writing to
- * ELSP from CPU side is enough to make sure that writes to ringbuffer pages
- * pinned in mappable aperture portion of GGTT are visible to command streamer.
- * Writes done by GuC on our behalf are not guaranteeing such ordering,
- * therefore, to ensure the flush, we're issuing a POSTING READ.
- */
-static void flush_ggtt_writes(struct i915_vma *vma)
+static inline void set_context_banned(struct intel_context *ce)
{
- if (i915_vma_is_map_and_fenceable(vma))
- intel_uncore_posting_read_fw(vma->vm->gt->uncore,
- GUC_STATUS);
+ lockdep_assert_held(&ce->guc_state.lock);
+ ce->guc_state.sched_state |= SCHED_STATE_BANNED;
}
-static void guc_submit(struct intel_engine_cs *engine,
- struct i915_request **out,
- struct i915_request **end)
+static inline void clr_context_banned(struct intel_context *ce)
{
- struct intel_guc *guc = &engine->gt->uc.guc;
+ lockdep_assert_held(&ce->guc_state.lock);
+ ce->guc_state.sched_state &= ~SCHED_STATE_BANNED;
+}
- do {
- struct i915_request *rq = *out++;
+static inline u32 context_blocked(struct intel_context *ce)
+{
+ return (ce->guc_state.sched_state & SCHED_STATE_BLOCKED_MASK) >>
+ SCHED_STATE_BLOCKED_SHIFT;
+}
- flush_ggtt_writes(rq->ring->vma);
- guc_add_request(guc, rq);
- } while (out != end);
+static inline void incr_context_blocked(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->engine->sched_engine->lock);
+ lockdep_assert_held(&ce->guc_state.lock);
+
+ ce->guc_state.sched_state += SCHED_STATE_BLOCKED;
+
+ GEM_BUG_ON(!context_blocked(ce)); /* Overflow check */
}
-static inline int rq_prio(const struct i915_request *rq)
+static inline void decr_context_blocked(struct intel_context *ce)
{
- return rq->sched.attr.priority;
+ lockdep_assert_held(&ce->engine->sched_engine->lock);
+ lockdep_assert_held(&ce->guc_state.lock);
+
+ GEM_BUG_ON(!context_blocked(ce)); /* Underflow check */
+
+ ce->guc_state.sched_state -= SCHED_STATE_BLOCKED;
}
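
The BLOCKED bits above behave as a counter rather than a single flag, so nested block/unblock pairs balance out. A minimal userspace sketch of that packing (names and values are illustrative, not driver code):

	#include <assert.h>
	#include <stdint.h>

	#define BLOCKED_SHIFT	4			/* bits 0-3 are boolean flags */
	#define BLOCKED		(1u << BLOCKED_SHIFT)
	#define BLOCKED_MASK	(0xfffu << BLOCKED_SHIFT)	/* 12-bit counter */

	static uint32_t blocked_count(uint32_t sched_state)
	{
		return (sched_state & BLOCKED_MASK) >> BLOCKED_SHIFT;
	}

	int main(void)
	{
		uint32_t state = 0;

		state += BLOCKED;	/* models incr_context_blocked() */
		state += BLOCKED;
		assert(blocked_count(state) == 2);
		state -= BLOCKED;	/* models decr_context_blocked() */
		assert(blocked_count(state) == 1);
		return 0;
	}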
-static struct i915_request *schedule_in(struct i915_request *rq, int idx)
+static inline bool context_guc_id_invalid(struct intel_context *ce)
{
- trace_i915_request_in(rq, idx);
+ return ce->guc_id == GUC_INVALID_LRC_ID;
+}
+
+static inline void set_context_guc_id_invalid(struct intel_context *ce)
+{
+ ce->guc_id = GUC_INVALID_LRC_ID;
+}
+
+static inline struct intel_guc *ce_to_guc(struct intel_context *ce)
+{
+ return &ce->engine->gt->uc.guc;
+}
+
+static inline struct i915_priolist *to_priolist(struct rb_node *rb)
+{
+ return rb_entry(rb, struct i915_priolist, node);
+}
+
+static struct guc_lrc_desc *__get_lrc_desc(struct intel_guc *guc, u32 index)
+{
+ struct guc_lrc_desc *base = guc->lrc_desc_pool_vaddr;
+
+ GEM_BUG_ON(index >= GUC_MAX_LRC_DESCRIPTORS);
+
+ return &base[index];
+}
+
+static inline struct intel_context *__get_context(struct intel_guc *guc, u32 id)
+{
+ struct intel_context *ce = xa_load(&guc->context_lookup, id);
+
+ GEM_BUG_ON(id >= GUC_MAX_LRC_DESCRIPTORS);
+
+ return ce;
+}
+
+static int guc_lrc_desc_pool_create(struct intel_guc *guc)
+{
+ u32 size;
+ int ret;
+
+ size = PAGE_ALIGN(sizeof(struct guc_lrc_desc) *
+ GUC_MAX_LRC_DESCRIPTORS);
+ ret = intel_guc_allocate_and_map_vma(guc, size, &guc->lrc_desc_pool,
+ (void **)&guc->lrc_desc_pool_vaddr);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static void guc_lrc_desc_pool_destroy(struct intel_guc *guc)
+{
+ guc->lrc_desc_pool_vaddr = NULL;
+ i915_vma_unpin_and_release(&guc->lrc_desc_pool, I915_VMA_RELEASE_MAP);
+}
+
+static inline bool guc_submission_initialized(struct intel_guc *guc)
+{
+ return !!guc->lrc_desc_pool_vaddr;
+}
+
+static inline void reset_lrc_desc(struct intel_guc *guc, u32 id)
+{
+ if (likely(guc_submission_initialized(guc))) {
+ struct guc_lrc_desc *desc = __get_lrc_desc(guc, id);
+ unsigned long flags;
+
+ memset(desc, 0, sizeof(*desc));
+
+ /*
+ * xarray API doesn't have an xa_erase_irqsave wrapper, so call the
+ * lower-level functions directly.
+ */
+ xa_lock_irqsave(&guc->context_lookup, flags);
+ __xa_erase(&guc->context_lookup, id);
+ xa_unlock_irqrestore(&guc->context_lookup, flags);
+ }
+}
+
+static inline bool lrc_desc_registered(struct intel_guc *guc, u32 id)
+{
+ return __get_context(guc, id);
+}
+
+static inline void set_lrc_desc_registered(struct intel_guc *guc, u32 id,
+ struct intel_context *ce)
+{
+ unsigned long flags;
/*
- * Currently we are not tracking the rq->context being inflight
- * (ce->inflight = rq->engine). It is only used by the execlists
- * backend at the moment, a similar counting strategy would be
- * required if we generalise the inflight tracking.
+ * xarray API doesn't have an xa_store_irqsave wrapper, so call the
+ * lower-level functions directly.
*/
+ xa_lock_irqsave(&guc->context_lookup, flags);
+ __xa_store(&guc->context_lookup, id, ce, GFP_ATOMIC);
+ xa_unlock_irqrestore(&guc->context_lookup, flags);
+}
+
+static int guc_submission_send_busy_loop(struct intel_guc *guc,
+ const u32 *action,
+ u32 len,
+ u32 g2h_len_dw,
+ bool loop)
+{
+ int err;
+
+ err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
+
+ if (!err && g2h_len_dw)
+ atomic_inc(&guc->outstanding_submission_g2h);
+
+ return err;
+}
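
As a reading aid: guc_submission_send_busy_loop() bumps outstanding_submission_g2h only when a reply is expected (g2h_len_dw != 0). The matching decrement-and-wake happens on the G2H side, sketched below; the handler name is illustrative and the actual wake site lives in the CT code:

	/* Illustrative sketch only, not the driver's actual handler. */
	static void g2h_reply_done(struct intel_guc *guc)
	{
		/* ...process the G2H CTB message for this context... */
		atomic_dec(&guc->outstanding_submission_g2h);
		wake_up_all(&guc->ct.wq);	/* releases intel_guc_wait_for_pending_msg() */
	}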
+
+int intel_guc_wait_for_pending_msg(struct intel_guc *guc,
+ atomic_t *wait_var,
+ bool interruptible,
+ long timeout)
+{
+ const int state = interruptible ?
+ TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
+ DEFINE_WAIT(wait);
- __intel_gt_pm_get(rq->engine->gt);
- return i915_request_get(rq);
+ might_sleep();
+ GEM_BUG_ON(timeout < 0);
+
+ if (!atomic_read(wait_var))
+ return 0;
+
+ if (!timeout)
+ return -ETIME;
+
+ for (;;) {
+ prepare_to_wait(&guc->ct.wq, &wait, state);
+
+ if (!atomic_read(wait_var))
+ break;
+
+ if (signal_pending_state(state, current)) {
+ timeout = -EINTR;
+ break;
+ }
+
+ if (!timeout) {
+ timeout = -ETIME;
+ break;
+ }
+
+ timeout = io_schedule_timeout(timeout);
+ }
+ finish_wait(&guc->ct.wq, &wait);
+
+ return (timeout < 0) ? timeout : 0;
}
-static void schedule_out(struct i915_request *rq)
+int intel_guc_wait_for_idle(struct intel_guc *guc, long timeout)
{
- trace_i915_request_out(rq);
+ if (!intel_uc_uses_guc_submission(&guc_to_gt(guc)->uc))
+ return 0;
- intel_gt_pm_put_async(rq->engine->gt);
- i915_request_put(rq);
+ return intel_guc_wait_for_pending_msg(guc,
+ &guc->outstanding_submission_g2h,
+ true, timeout);
}
-static void __guc_dequeue(struct intel_engine_cs *engine)
+static int guc_lrc_desc_pin(struct intel_context *ce, bool loop);
+
+static int guc_add_request(struct intel_guc *guc, struct i915_request *rq)
{
- struct intel_engine_execlists * const execlists = &engine->execlists;
- struct i915_request **first = execlists->inflight;
- struct i915_request ** const last_port = first + execlists->port_mask;
- struct i915_request *last = first[0];
- struct i915_request **port;
- bool submit = false;
- struct rb_node *rb;
+ int err = 0;
+ struct intel_context *ce = rq->context;
+ u32 action[3];
+ int len = 0;
+ u32 g2h_len_dw = 0;
+ bool enabled;
- lockdep_assert_held(&engine->active.lock);
+ /*
+ * Corner case where requests were sitting in the priority list or a
+ * request was resubmitted after the context was banned.
+ */
+ if (unlikely(intel_context_is_banned(ce))) {
+ i915_request_put(i915_request_mark_eio(rq));
+ intel_engine_signal_breadcrumbs(ce->engine);
+ goto out;
+ }
- if (last) {
- if (*++first)
- return;
+ GEM_BUG_ON(!atomic_read(&ce->guc_id_ref));
+ GEM_BUG_ON(context_guc_id_invalid(ce));
- last = NULL;
+ /*
+ * Corner case where the GuC firmware was blown away and reloaded while
+ * this context was pinned.
+ */
+ if (unlikely(!lrc_desc_registered(guc, ce->guc_id))) {
+ err = guc_lrc_desc_pin(ce, false);
+ if (unlikely(err))
+ goto out;
}
/*
- * We write directly into the execlists->inflight queue and don't use
- * the execlists->pending queue, as we don't have a distinct switch
- * event.
+ * The request / context will be run on the hardware when scheduling
+ * gets re-enabled in the unblock path.
*/
- port = first;
- while ((rb = rb_first_cached(&execlists->queue))) {
+ if (unlikely(context_blocked(ce)))
+ goto out;
+
+ enabled = context_enabled(ce);
+
+ if (!enabled) {
+ action[len++] = INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET;
+ action[len++] = ce->guc_id;
+ action[len++] = GUC_CONTEXT_ENABLE;
+ set_context_pending_enable(ce);
+ intel_context_get(ce);
+ g2h_len_dw = G2H_LEN_DW_SCHED_CONTEXT_MODE_SET;
+ } else {
+ action[len++] = INTEL_GUC_ACTION_SCHED_CONTEXT;
+ action[len++] = ce->guc_id;
+ }
+
+ err = intel_guc_send_nb(guc, action, len, g2h_len_dw);
+ if (!enabled && !err) {
+ trace_intel_context_sched_enable(ce);
+ atomic_inc(&guc->outstanding_submission_g2h);
+ set_context_enabled(ce);
+ } else if (!enabled) {
+ clr_context_pending_enable(ce);
+ intel_context_put(ce);
+ }
+ if (likely(!err))
+ trace_i915_request_guc_submit(rq);
+
+out:
+ return err;
+}
+
+static inline void guc_set_lrc_tail(struct i915_request *rq)
+{
+ rq->context->lrc_reg_state[CTX_RING_TAIL] =
+ intel_ring_set_tail(rq->ring, rq->tail);
+}
+
+static inline int rq_prio(const struct i915_request *rq)
+{
+ return rq->sched.attr.priority;
+}
+
+static int guc_dequeue_one_context(struct intel_guc *guc)
+{
+ struct i915_sched_engine * const sched_engine = guc->sched_engine;
+ struct i915_request *last = NULL;
+ bool submit = false;
+ struct rb_node *rb;
+ int ret;
+
+ lockdep_assert_held(&sched_engine->lock);
+
+ if (guc->stalled_request) {
+ submit = true;
+ last = guc->stalled_request;
+ goto resubmit;
+ }
+
+ while ((rb = rb_first_cached(&sched_engine->queue))) {
struct i915_priolist *p = to_priolist(rb);
struct i915_request *rq, *rn;
priolist_for_each_request_consume(rq, rn, p) {
- if (last && rq->context != last->context) {
- if (port == last_port)
- goto done;
-
- *port = schedule_in(last,
- port - execlists->inflight);
- port++;
- }
+ if (last && rq->context != last->context)
+ goto done;
list_del_init(&rq->sched.link);
+
__i915_request_submit(rq);
- submit = true;
+
+ trace_i915_request_in(rq, 0);
last = rq;
+ submit = true;
}
- rb_erase_cached(&p->node, &execlists->queue);
+ rb_erase_cached(&p->node, &sched_engine->queue);
i915_priolist_free(p);
}
done:
- execlists->queue_priority_hint =
- rb ? to_priolist(rb)->priority : INT_MIN;
if (submit) {
- *port = schedule_in(last, port - execlists->inflight);
- *++port = NULL;
- guc_submit(engine, first, port);
+ guc_set_lrc_tail(last);
+resubmit:
+ ret = guc_add_request(guc, last);
+ if (unlikely(ret == -EPIPE))
+ goto deadlk;
+ else if (ret == -EBUSY) {
+ tasklet_schedule(&sched_engine->tasklet);
+ guc->stalled_request = last;
+ return false;
+ }
}
- execlists->active = execlists->inflight;
+
+ guc->stalled_request = NULL;
+ return submit;
+
+deadlk:
+ sched_engine->tasklet.callback = NULL;
+ tasklet_disable_nosync(&sched_engine->tasklet);
+ return false;
}
static void guc_submission_tasklet(struct tasklet_struct *t)
{
- struct intel_engine_cs * const engine =
- from_tasklet(engine, t, execlists.tasklet);
- struct intel_engine_execlists * const execlists = &engine->execlists;
- struct i915_request **port, *rq;
+ struct i915_sched_engine *sched_engine =
+ from_tasklet(sched_engine, t, tasklet);
unsigned long flags;
+ bool loop;
- spin_lock_irqsave(&engine->active.lock, flags);
-
- for (port = execlists->inflight; (rq = *port); port++) {
- if (!i915_request_completed(rq))
- break;
+ spin_lock_irqsave(&sched_engine->lock, flags);
- schedule_out(rq);
- }
- if (port != execlists->inflight) {
- int idx = port - execlists->inflight;
- int rem = ARRAY_SIZE(execlists->inflight) - idx;
- memmove(execlists->inflight, port, rem * sizeof(*port));
- }
+ do {
+ loop = guc_dequeue_one_context(sched_engine->private_data);
+ } while (loop);
- __guc_dequeue(engine);
+ i915_sched_engine_reset_on_empty(sched_engine);
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
}
static void cs_irq_handler(struct intel_engine_cs *engine, u16 iir)
{
- if (iir & GT_RENDER_USER_INTERRUPT) {
+ if (iir & GT_RENDER_USER_INTERRUPT)
intel_engine_signal_breadcrumbs(engine);
- tasklet_hi_schedule(&engine->execlists.tasklet);
+}
+
+static void __guc_context_destroy(struct intel_context *ce);
+static void release_guc_id(struct intel_guc *guc, struct intel_context *ce);
+static void guc_signal_context_fence(struct intel_context *ce);
+static void guc_cancel_context_requests(struct intel_context *ce);
+static void guc_blocked_fence_complete(struct intel_context *ce);
+
+static void scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
+{
+ struct intel_context *ce;
+ unsigned long index, flags;
+ bool pending_disable, pending_enable, deregister, destroyed, banned;
+
+ xa_for_each(&guc->context_lookup, index, ce) {
+ /* Flush context */
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ /*
+ * Once we are at this point, submission_disabled() is guaranteed to be
+ * visible to all callers who set the below flags (see the flush above
+ * and the flushes in reset_prepare). If submission_disabled() is set,
+ * the callers shouldn't set these flags.
+ */
+
+ destroyed = context_destroyed(ce);
+ pending_enable = context_pending_enable(ce);
+ pending_disable = context_pending_disable(ce);
+ deregister = context_wait_for_deregister_to_register(ce);
+ banned = context_banned(ce);
+ init_sched_state(ce);
+
+ if (pending_enable || destroyed || deregister) {
+ atomic_dec(&guc->outstanding_submission_g2h);
+ if (deregister)
+ guc_signal_context_fence(ce);
+ if (destroyed) {
+ release_guc_id(guc, ce);
+ __guc_context_destroy(ce);
+ }
+ if (pending_enable || deregister)
+ intel_context_put(ce);
+ }
+
+ /* Not mutually exclusive with the above if statement. */
+ if (pending_disable) {
+ guc_signal_context_fence(ce);
+ if (banned) {
+ guc_cancel_context_requests(ce);
+ intel_engine_signal_breadcrumbs(ce->engine);
+ }
+ intel_context_sched_disable_unpin(ce);
+ atomic_dec(&guc->outstanding_submission_g2h);
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ guc_blocked_fence_complete(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ intel_context_put(ce);
+ }
}
}
-static void guc_reset_prepare(struct intel_engine_cs *engine)
+static inline bool
+submission_disabled(struct intel_guc *guc)
+{
+ struct i915_sched_engine * const sched_engine = guc->sched_engine;
+
+ return unlikely(!sched_engine ||
+ !__tasklet_is_enabled(&sched_engine->tasklet));
+}
+
+static void disable_submission(struct intel_guc *guc)
{
- struct intel_engine_execlists * const execlists = &engine->execlists;
+ struct i915_sched_engine * const sched_engine = guc->sched_engine;
+
+ if (__tasklet_is_enabled(&sched_engine->tasklet)) {
+ GEM_BUG_ON(!guc->ct.enabled);
+ __tasklet_disable_sync_once(&sched_engine->tasklet);
+ sched_engine->tasklet.callback = NULL;
+ }
+}
+
+static void enable_submission(struct intel_guc *guc)
+{
+ struct i915_sched_engine * const sched_engine = guc->sched_engine;
+ unsigned long flags;
- ENGINE_TRACE(engine, "\n");
+ spin_lock_irqsave(&guc->sched_engine->lock, flags);
+ sched_engine->tasklet.callback = guc_submission_tasklet;
+ wmb(); /* Make sure callback visible */
+ if (!__tasklet_is_enabled(&sched_engine->tasklet) &&
+ __tasklet_enable(&sched_engine->tasklet)) {
+ GEM_BUG_ON(!guc->ct.enabled);
+
+ /* And kick in case we missed a new request submission. */
+ tasklet_hi_schedule(&sched_engine->tasklet);
+ }
+ spin_unlock_irqrestore(&guc->sched_engine->lock, flags);
+}
+
+static void guc_flush_submissions(struct intel_guc *guc)
+{
+ struct i915_sched_engine * const sched_engine = guc->sched_engine;
+ unsigned long flags;
+
+ spin_lock_irqsave(&sched_engine->lock, flags);
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
+}
+
+void intel_guc_submission_reset_prepare(struct intel_guc *guc)
+{
+ int i;
+
+ if (unlikely(!guc_submission_initialized(guc))) {
+ /* Reset called during driver load? GuC not yet initialised! */
+ return;
+ }
+
+ intel_gt_park_heartbeats(guc_to_gt(guc));
+ disable_submission(guc);
+ guc->interrupts.disable(guc);
+
+ /* Flush IRQ handler */
+ spin_lock_irq(&guc_to_gt(guc)->irq_lock);
+ spin_unlock_irq(&guc_to_gt(guc)->irq_lock);
+
+ guc_flush_submissions(guc);
/*
- * Prevent request submission to the hardware until we have
- * completed the reset in i915_gem_reset_finish(). If a request
- * is completed by one engine, it may then queue a request
- * to a second via its execlists->tasklet *just* as we are
- * calling engine->init_hw() and also writing the ELSP.
- * Turning off the execlists->tasklet until the reset is over
- * prevents the race.
+ * Handle any outstanding G2Hs before reset. Call the IRQ handler directly
+ * on each pass as interrupts have been disabled. We always scrub for
+ * outstanding G2H as it is possible for outstanding_submission_g2h to
+ * be incremented after the context state update.
*/
- __tasklet_disable_sync_once(&execlists->tasklet);
+ for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); ++i) {
+ intel_guc_to_host_event_handler(guc);
+#define wait_for_reset(guc, wait_var) \
+ intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
+ do {
+ wait_for_reset(guc, &guc->outstanding_submission_g2h);
+ } while (!list_empty(&guc->ct.requests.incoming));
+ }
+ scrub_guc_desc_for_outstanding_g2h(guc);
+}
+
+static struct intel_engine_cs *
+guc_virtual_get_sibling(struct intel_engine_cs *ve, unsigned int sibling)
+{
+ struct intel_engine_cs *engine;
+ intel_engine_mask_t tmp, mask = ve->mask;
+ unsigned int num_siblings = 0;
+
+ for_each_engine_masked(engine, ve->gt, mask, tmp)
+ if (num_siblings++ == sibling)
+ return engine;
+
+ return NULL;
+}
+
+static inline struct intel_engine_cs *
+__context_to_physical_engine(struct intel_context *ce)
+{
+ struct intel_engine_cs *engine = ce->engine;
+
+ if (intel_engine_is_virtual(engine))
+ engine = guc_virtual_get_sibling(engine, 0);
+
+ return engine;
}
-static void guc_reset_state(struct intel_context *ce,
- struct intel_engine_cs *engine,
- u32 head,
- bool scrub)
+static void guc_reset_state(struct intel_context *ce, u32 head, bool scrub)
{
+ struct intel_engine_cs *engine = __context_to_physical_engine(ce);
+
+ if (intel_context_is_banned(ce))
+ return;
+
GEM_BUG_ON(!intel_context_is_pinned(ce));
/*
@@ -313,37 +777,132 @@ static void guc_reset_state(struct intel_context *ce,
lrc_update_regs(ce, engine, head);
}
-static void guc_reset_rewind(struct intel_engine_cs *engine, bool stalled)
+static void guc_reset_nop(struct intel_engine_cs *engine)
{
- struct intel_engine_execlists * const execlists = &engine->execlists;
- struct i915_request *rq;
+}
+
+static void guc_rewind_nop(struct intel_engine_cs *engine, bool stalled)
+{
+}
+
+static void
+__unwind_incomplete_requests(struct intel_context *ce)
+{
+ struct i915_request *rq, *rn;
+ struct list_head *pl;
+ int prio = I915_PRIORITY_INVALID;
+ struct i915_sched_engine * const sched_engine =
+ ce->engine->sched_engine;
unsigned long flags;
- spin_lock_irqsave(&engine->active.lock, flags);
+ spin_lock_irqsave(&sched_engine->lock, flags);
+ spin_lock(&ce->guc_active.lock);
+ list_for_each_entry_safe(rq, rn,
+ &ce->guc_active.requests,
+ sched.link) {
+ if (i915_request_completed(rq))
+ continue;
- /* Push back any incomplete requests for replay after the reset. */
- rq = execlists_unwind_incomplete_requests(execlists);
- if (!rq)
- goto out_unlock;
+ list_del_init(&rq->sched.link);
+ spin_unlock(&ce->guc_active.lock);
+
+ __i915_request_unsubmit(rq);
+
+ /* Push the request back into the queue for later resubmission. */
+ GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+ if (rq_prio(rq) != prio) {
+ prio = rq_prio(rq);
+ pl = i915_sched_lookup_priolist(sched_engine, prio);
+ }
+ GEM_BUG_ON(i915_sched_engine_is_empty(sched_engine));
+
+ list_add_tail(&rq->sched.link, pl);
+ set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+
+ spin_lock(&ce->guc_active.lock);
+ }
+ spin_unlock(&ce->guc_active.lock);
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
+}
+
+static void __guc_reset_context(struct intel_context *ce, bool stalled)
+{
+ struct i915_request *rq;
+ u32 head;
+
+ intel_context_get(ce);
+
+ /*
+ * GuC will implicitly mark the context as non-schedulable
+ * when it sends the reset notification. Make sure our state
+ * reflects this change. The context will be marked enabled
+ * on resubmission.
+ */
+ clr_context_enabled(ce);
+
+ rq = intel_context_find_active_request(ce);
+ if (!rq) {
+ head = ce->ring->tail;
+ stalled = false;
+ goto out_replay;
+ }
if (!i915_request_started(rq))
stalled = false;
+ GEM_BUG_ON(i915_active_is_idle(&ce->active));
+ head = intel_ring_wrap(ce->ring, rq->head);
__i915_request_reset(rq, stalled);
- guc_reset_state(rq->context, engine, rq->head, stalled);
-out_unlock:
- spin_unlock_irqrestore(&engine->active.lock, flags);
+out_replay:
+ guc_reset_state(ce, head, stalled);
+ __unwind_incomplete_requests(ce);
+ intel_context_put(ce);
+}
+
+void intel_guc_submission_reset(struct intel_guc *guc, bool stalled)
+{
+ struct intel_context *ce;
+ unsigned long index;
+
+ if (unlikely(!guc_submission_initialized(guc))) {
+ /* Reset called during driver load? GuC not yet initialised! */
+ return;
+ }
+
+ xa_for_each(&guc->context_lookup, index, ce)
+ if (intel_context_is_pinned(ce))
+ __guc_reset_context(ce, stalled);
+
+ /* GuC is blown away, drop all references to contexts */
+ xa_destroy(&guc->context_lookup);
}
-static void guc_reset_cancel(struct intel_engine_cs *engine)
+static void guc_cancel_context_requests(struct intel_context *ce)
+{
+ struct i915_sched_engine *sched_engine = ce_to_guc(ce)->sched_engine;
+ struct i915_request *rq;
+ unsigned long flags;
+
+ /* Mark all executing requests as skipped. */
+ spin_lock_irqsave(&sched_engine->lock, flags);
+ spin_lock(&ce->guc_active.lock);
+ list_for_each_entry(rq, &ce->guc_active.requests, sched.link)
+ i915_request_put(i915_request_mark_eio(rq));
+ spin_unlock(&ce->guc_active.lock);
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
+}
+
+static void
+guc_cancel_sched_engine_requests(struct i915_sched_engine *sched_engine)
{
- struct intel_engine_execlists * const execlists = &engine->execlists;
struct i915_request *rq, *rn;
struct rb_node *rb;
unsigned long flags;
- ENGINE_TRACE(engine, "\n");
+ /* Can be called during boot if GuC fails to load */
+ if (!sched_engine)
+ return;
/*
* Before we call engine->cancel_requests(), we should have exclusive
@@ -359,47 +918,67 @@ static void guc_reset_cancel(struct intel_engine_cs *engine)
* submission's irq state, we also wish to remind ourselves that
* it is irq state.)
*/
- spin_lock_irqsave(&engine->active.lock, flags);
-
- /* Mark all executing requests as skipped. */
- list_for_each_entry(rq, &engine->active.requests, sched.link) {
- i915_request_set_error_once(rq, -EIO);
- i915_request_mark_complete(rq);
- }
+ spin_lock_irqsave(&sched_engine->lock, flags);
/* Flush the queued requests to the timeline list (for retiring). */
- while ((rb = rb_first_cached(&execlists->queue))) {
+ while ((rb = rb_first_cached(&sched_engine->queue))) {
struct i915_priolist *p = to_priolist(rb);
priolist_for_each_request_consume(rq, rn, p) {
list_del_init(&rq->sched.link);
+
__i915_request_submit(rq);
- dma_fence_set_error(&rq->fence, -EIO);
- i915_request_mark_complete(rq);
+
+ i915_request_put(i915_request_mark_eio(rq));
}
- rb_erase_cached(&p->node, &execlists->queue);
+ rb_erase_cached(&p->node, &sched_engine->queue);
i915_priolist_free(p);
}
/* Remaining _unready_ requests will be nop'ed when submitted */
- execlists->queue_priority_hint = INT_MIN;
- execlists->queue = RB_ROOT_CACHED;
+ sched_engine->queue_priority_hint = INT_MIN;
+ sched_engine->queue = RB_ROOT_CACHED;
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
}
-static void guc_reset_finish(struct intel_engine_cs *engine)
+void intel_guc_submission_cancel_requests(struct intel_guc *guc)
{
- struct intel_engine_execlists * const execlists = &engine->execlists;
+ struct intel_context *ce;
+ unsigned long index;
- if (__tasklet_enable(&execlists->tasklet))
- /* And kick in case we missed a new request submission. */
- tasklet_hi_schedule(&execlists->tasklet);
+ xa_for_each(&guc->context_lookup, index, ce)
+ if (intel_context_is_pinned(ce))
+ guc_cancel_context_requests(ce);
+
+ guc_cancel_sched_engine_requests(guc->sched_engine);
- ENGINE_TRACE(engine, "depth->%d\n",
- atomic_read(&execlists->tasklet.count));
+ /* GuC is blown away, drop all references to contexts */
+ xa_destroy(&guc->context_lookup);
+}
+
+void intel_guc_submission_reset_finish(struct intel_guc *guc)
+{
+ /* Reset called during driver load or during wedge? */
+ if (unlikely(!guc_submission_initialized(guc) ||
+ test_bit(I915_WEDGED, &guc_to_gt(guc)->reset.flags))) {
+ return;
+ }
+
+ /*
+ * Technically it is possible for either of these values to be non-zero
+ * here, but that is very unlikely and harmless. Regardless, let's add a
+ * warning so we can see in CI if this happens frequently or is a precursor
+ * to taking down the machine.
+ */
+ GEM_WARN_ON(atomic_read(&guc->outstanding_submission_g2h));
+ atomic_set(&guc->outstanding_submission_g2h, 0);
+
+ intel_guc_global_policies_update(guc);
+ enable_submission(guc);
+ intel_gt_unpark_heartbeats(guc_to_gt(guc));
}
/*
@@ -410,43 +989,986 @@ int intel_guc_submission_init(struct intel_guc *guc)
{
int ret;
- if (guc->stage_desc_pool)
+ if (guc->lrc_desc_pool)
return 0;
- ret = guc_stage_desc_pool_create(guc);
+ ret = guc_lrc_desc_pool_create(guc);
if (ret)
return ret;
/*
* Keep static analysers happy, let them know that we allocated the
* vma after testing that it didn't exist earlier.
*/
- GEM_BUG_ON(!guc->stage_desc_pool);
+ GEM_BUG_ON(!guc->lrc_desc_pool);
+
+ xa_init_flags(&guc->context_lookup, XA_FLAGS_LOCK_IRQ);
+
+ spin_lock_init(&guc->contexts_lock);
+ INIT_LIST_HEAD(&guc->guc_id_list);
+ ida_init(&guc->guc_ids);
return 0;
}
void intel_guc_submission_fini(struct intel_guc *guc)
{
- if (guc->stage_desc_pool) {
- guc_stage_desc_pool_destroy(guc);
+ if (!guc->lrc_desc_pool)
+ return;
+
+ guc_lrc_desc_pool_destroy(guc);
+ i915_sched_engine_put(guc->sched_engine);
+}
+
+static inline void queue_request(struct i915_sched_engine *sched_engine,
+ struct i915_request *rq,
+ int prio)
+{
+ GEM_BUG_ON(!list_empty(&rq->sched.link));
+ list_add_tail(&rq->sched.link,
+ i915_sched_lookup_priolist(sched_engine, prio));
+ set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+}
+
+static int guc_bypass_tasklet_submit(struct intel_guc *guc,
+ struct i915_request *rq)
+{
+ int ret;
+
+ __i915_request_submit(rq);
+
+ trace_i915_request_in(rq, 0);
+
+ guc_set_lrc_tail(rq);
+ ret = guc_add_request(guc, rq);
+ if (ret == -EBUSY)
+ guc->stalled_request = rq;
+
+ if (unlikely(ret == -EPIPE))
+ disable_submission(guc);
+
+ return ret;
+}
+
+static void guc_submit_request(struct i915_request *rq)
+{
+ struct i915_sched_engine *sched_engine = rq->engine->sched_engine;
+ struct intel_guc *guc = &rq->engine->gt->uc.guc;
+ unsigned long flags;
+
+ /* Will be called from irq-context when using foreign fences. */
+ spin_lock_irqsave(&sched_engine->lock, flags);
+
+ if (submission_disabled(guc) || guc->stalled_request ||
+ !i915_sched_engine_is_empty(sched_engine))
+ queue_request(sched_engine, rq, rq_prio(rq));
+ else if (guc_bypass_tasklet_submit(guc, rq) == -EBUSY)
+ tasklet_hi_schedule(&sched_engine->tasklet);
+
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
+}
+
+static int new_guc_id(struct intel_guc *guc)
+{
+ return ida_simple_get(&guc->guc_ids, 0,
+ GUC_MAX_LRC_DESCRIPTORS, GFP_KERNEL |
+ __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+}
+
+static void __release_guc_id(struct intel_guc *guc, struct intel_context *ce)
+{
+ if (!context_guc_id_invalid(ce)) {
+ ida_simple_remove(&guc->guc_ids, ce->guc_id);
+ reset_lrc_desc(guc, ce->guc_id);
+ set_context_guc_id_invalid(ce);
}
+ if (!list_empty(&ce->guc_id_link))
+ list_del_init(&ce->guc_id_link);
}
-static int guc_context_alloc(struct intel_context *ce)
+static void release_guc_id(struct intel_guc *guc, struct intel_context *ce)
{
- return lrc_alloc(ce, ce->engine);
+ unsigned long flags;
+
+ spin_lock_irqsave(&guc->contexts_lock, flags);
+ __release_guc_id(guc, ce);
+ spin_unlock_irqrestore(&guc->contexts_lock, flags);
+}
+
+static int steal_guc_id(struct intel_guc *guc)
+{
+ struct intel_context *ce;
+ int guc_id;
+
+ lockdep_assert_held(&guc->contexts_lock);
+
+ if (!list_empty(&guc->guc_id_list)) {
+ ce = list_first_entry(&guc->guc_id_list,
+ struct intel_context,
+ guc_id_link);
+
+ GEM_BUG_ON(atomic_read(&ce->guc_id_ref));
+ GEM_BUG_ON(context_guc_id_invalid(ce));
+
+ list_del_init(&ce->guc_id_link);
+ guc_id = ce->guc_id;
+ clr_context_registered(ce);
+ set_context_guc_id_invalid(ce);
+ return guc_id;
+ } else {
+ return -EAGAIN;
+ }
+}
+
+static int assign_guc_id(struct intel_guc *guc, u16 *out)
+{
+ int ret;
+
+ lockdep_assert_held(&guc->contexts_lock);
+
+ ret = new_guc_id(guc);
+ if (unlikely(ret < 0)) {
+ ret = steal_guc_id(guc);
+ if (ret < 0)
+ return ret;
+ }
+
+ *out = ret;
+ return 0;
+}
+
+#define PIN_GUC_ID_TRIES 4
+static int pin_guc_id(struct intel_guc *guc, struct intel_context *ce)
+{
+ int ret = 0;
+ unsigned long flags, tries = PIN_GUC_ID_TRIES;
+
+ GEM_BUG_ON(atomic_read(&ce->guc_id_ref));
+
+try_again:
+ spin_lock_irqsave(&guc->contexts_lock, flags);
+
+ if (context_guc_id_invalid(ce)) {
+ ret = assign_guc_id(guc, &ce->guc_id);
+ if (ret)
+ goto out_unlock;
+ ret = 1; /* Indicates a newly assigned guc_id */
+ }
+ if (!list_empty(&ce->guc_id_link))
+ list_del_init(&ce->guc_id_link);
+ atomic_inc(&ce->guc_id_ref);
+
+out_unlock:
+ spin_unlock_irqrestore(&guc->contexts_lock, flags);
+
+ /*
+ * -EAGAIN indicates no guc_ids are available, so let's retire any
+ * outstanding requests to see if that frees up a guc_id. If the first
+ * retire didn't help, insert a sleep of the timeslice duration before
+ * attempting to retire more requests. Double the sleep period each
+ * subsequent pass before finally giving up. The sleep period is capped
+ * at 100ms and has a floor of 1ms.
+ */
+ if (ret == -EAGAIN && --tries) {
+ if (PIN_GUC_ID_TRIES - tries > 1) {
+ unsigned int timeslice_shifted =
+ ce->engine->props.timeslice_duration_ms <<
+ (PIN_GUC_ID_TRIES - tries - 2);
+ unsigned int max = min_t(unsigned int, 100,
+ timeslice_shifted);
+
+ msleep(max_t(unsigned int, max, 1));
+ }
+ intel_gt_retire_requests(guc_to_gt(guc));
+ goto try_again;
+ }
+
+ return ret;
+}
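
To make the backoff comment concrete, the retry arithmetic above yields the following sleep series; a standalone model, assuming a hypothetical 5 ms timeslice:

	#include <stdio.h>

	#define PIN_GUC_ID_TRIES 4

	int main(void)
	{
		unsigned int timeslice_ms = 5;	/* assumed engine timeslice */
		unsigned int tries = PIN_GUC_ID_TRIES;

		while (--tries) {	/* one iteration per -EAGAIN */
			unsigned int pass = PIN_GUC_ID_TRIES - tries;
			unsigned int sleep_ms = 0;

			if (pass > 1) {
				sleep_ms = timeslice_ms << (pass - 2);
				if (sleep_ms > 100)	/* 100 ms cap */
					sleep_ms = 100;
				if (sleep_ms < 1)	/* 1 ms floor */
					sleep_ms = 1;
			}
			printf("pass %u: sleep %u ms, retire, retry\n", pass, sleep_ms);
		}
		return 0;	/* prints 0 ms, 5 ms, 10 ms for the three retries */
	}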
+
+static void unpin_guc_id(struct intel_guc *guc, struct intel_context *ce)
+{
+ unsigned long flags;
+
+ GEM_BUG_ON(atomic_read(&ce->guc_id_ref) < 0);
+
+ if (unlikely(context_guc_id_invalid(ce)))
+ return;
+
+ spin_lock_irqsave(&guc->contexts_lock, flags);
+ if (!context_guc_id_invalid(ce) && list_empty(&ce->guc_id_link) &&
+ !atomic_read(&ce->guc_id_ref))
+ list_add_tail(&ce->guc_id_link, &guc->guc_id_list);
+ spin_unlock_irqrestore(&guc->contexts_lock, flags);
+}
+
+static int __guc_action_register_context(struct intel_guc *guc,
+ u32 guc_id,
+ u32 offset,
+ bool loop)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_REGISTER_CONTEXT,
+ guc_id,
+ offset,
+ };
+
+ return guc_submission_send_busy_loop(guc, action, ARRAY_SIZE(action),
+ 0, loop);
+}
+
+static int register_context(struct intel_context *ce, bool loop)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+ u32 offset = intel_guc_ggtt_offset(guc, guc->lrc_desc_pool) +
+ ce->guc_id * sizeof(struct guc_lrc_desc);
+ int ret;
+
+ trace_intel_context_register(ce);
+
+ ret = __guc_action_register_context(guc, ce->guc_id, offset, loop);
+ if (likely(!ret))
+ set_context_registered(ce);
+
+ return ret;
+}
+
+static int __guc_action_deregister_context(struct intel_guc *guc,
+ u32 guc_id,
+ bool loop)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_DEREGISTER_CONTEXT,
+ guc_id,
+ };
+
+ return guc_submission_send_busy_loop(guc, action, ARRAY_SIZE(action),
+ G2H_LEN_DW_DEREGISTER_CONTEXT,
+ loop);
+}
+
+static int deregister_context(struct intel_context *ce, u32 guc_id, bool loop)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+
+ trace_intel_context_deregister(ce);
+
+ return __guc_action_deregister_context(guc, guc_id, loop);
+}
+
+static intel_engine_mask_t adjust_engine_mask(u8 class, intel_engine_mask_t mask)
+{
+ switch (class) {
+ case RENDER_CLASS:
+ return mask >> RCS0;
+ case VIDEO_ENHANCEMENT_CLASS:
+ return mask >> VECS0;
+ case VIDEO_DECODE_CLASS:
+ return mask >> VCS0;
+ case COPY_ENGINE_CLASS:
+ return mask >> BCS0;
+ default:
+ MISSING_CASE(class);
+ return 0;
+ }
+}
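
adjust_engine_mask() turns the GT-wide engine mask into a class-relative submit mask by shifting out the class's first instance. A worked example, assuming the VCS0..VCS3 bits are consecutive as in the engine id enum:

	#include <assert.h>

	enum { RCS0, BCS0, VCS0, VCS1, VCS2, VCS3, VECS0 };	/* assumed layout */
	#define BIT(n) (1u << (n))

	int main(void)
	{
		unsigned int gt_mask = BIT(VCS1) | BIT(VCS3);

		/* VIDEO_DECODE_CLASS: instances 1 and 3 of the class remain */
		assert((gt_mask >> VCS0) == (BIT(1) | BIT(3)));
		return 0;
	}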
+
+static void guc_context_policy_init(struct intel_engine_cs *engine,
+ struct guc_lrc_desc *desc)
+{
+ desc->policy_flags = 0;
+
+ if (engine->flags & I915_ENGINE_WANT_FORCED_PREEMPTION)
+ desc->policy_flags |= CONTEXT_POLICY_FLAG_PREEMPT_TO_IDLE;
+
+ /* NB: For both of these, zero means disabled. */
+ desc->execution_quantum = engine->props.timeslice_duration_ms * 1000;
+ desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
+}
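
A quick worked instance of the conversion above, assuming (as the multiplication by 1000 suggests) that the GuC consumes these fields in microseconds; the values are hypothetical:

	/*
	 * e.g. (hypothetical values):
	 *   engine->props.timeslice_duration_ms = 5    ->  execution_quantum  = 5000
	 *   engine->props.preempt_timeout_ms    = 640  ->  preemption_timeout = 640000
	 */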
+
+static inline u8 map_i915_prio_to_guc_prio(int prio);
+
+static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
+{
+ struct intel_engine_cs *engine = ce->engine;
+ struct intel_runtime_pm *runtime_pm = engine->uncore->rpm;
+ struct intel_guc *guc = &engine->gt->uc.guc;
+ u32 desc_idx = ce->guc_id;
+ struct guc_lrc_desc *desc;
+ const struct i915_gem_context *ctx;
+ int prio = I915_CONTEXT_DEFAULT_PRIORITY;
+ bool context_registered;
+ intel_wakeref_t wakeref;
+ int ret = 0;
+
+ GEM_BUG_ON(!engine->mask);
+
+ /*
+ * Ensure the LRC + CT vmas are in the same region, as the write barrier
+ * is done based on the CT vma region.
+ */
+ GEM_BUG_ON(i915_gem_object_is_lmem(guc->ct.vma->obj) !=
+ i915_gem_object_is_lmem(ce->ring->vma->obj));
+
+ context_registered = lrc_desc_registered(guc, desc_idx);
+
+ rcu_read_lock();
+ ctx = rcu_dereference(ce->gem_context);
+ if (ctx)
+ prio = ctx->sched.priority;
+ rcu_read_unlock();
+
+ reset_lrc_desc(guc, desc_idx);
+ set_lrc_desc_registered(guc, desc_idx, ce);
+
+ desc = __get_lrc_desc(guc, desc_idx);
+ desc->engine_class = engine_class_to_guc_class(engine->class);
+ desc->engine_submit_mask = adjust_engine_mask(engine->class,
+ engine->mask);
+ desc->hw_context_desc = ce->lrc.lrca;
+ ce->guc_prio = map_i915_prio_to_guc_prio(prio);
+ desc->priority = ce->guc_prio;
+ desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
+ guc_context_policy_init(engine, desc);
+ init_sched_state(ce);
+
+ /*
+ * The context_lookup xarray is used to determine if the hardware
+ * context is currently registered. There are two cases in which it
+ * could be registered: either the guc_id has been stolen from another
+ * context, or the LRC descriptor address of this context has changed. In
+ * either case the context needs to be deregistered with the GuC before
+ * registering this context.
+ */
+ if (context_registered) {
+ trace_intel_context_steal_guc_id(ce);
+ if (!loop) {
+ set_context_wait_for_deregister_to_register(ce);
+ intel_context_get(ce);
+ } else {
+ bool disabled;
+ unsigned long flags;
+
+ /* Seal race with Reset */
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ disabled = submission_disabled(guc);
+ if (likely(!disabled)) {
+ set_context_wait_for_deregister_to_register(ce);
+ intel_context_get(ce);
+ }
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+ if (unlikely(disabled)) {
+ reset_lrc_desc(guc, desc_idx);
+ return 0; /* Will get registered later */
+ }
+ }
+
+ /*
+ * If stealing the guc_id, this ce has the same guc_id as the
+ * context whose guc_id was stolen.
+ */
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ ret = deregister_context(ce, ce->guc_id, loop);
+ if (unlikely(ret == -EBUSY)) {
+ clr_context_wait_for_deregister_to_register(ce);
+ intel_context_put(ce);
+ } else if (unlikely(ret == -ENODEV)) {
+ ret = 0; /* Will get registered later */
+ }
+ } else {
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ ret = register_context(ce, loop);
+ if (unlikely(ret == -EBUSY))
+ reset_lrc_desc(guc, desc_idx);
+ else if (unlikely(ret == -ENODEV))
+ ret = 0; /* Will get registered later */
+ }
+
+ return ret;
+}
+
+static int __guc_context_pre_pin(struct intel_context *ce,
+ struct intel_engine_cs *engine,
+ struct i915_gem_ww_ctx *ww,
+ void **vaddr)
+{
+ return lrc_pre_pin(ce, engine, ww, vaddr);
+}
+
+static int __guc_context_pin(struct intel_context *ce,
+ struct intel_engine_cs *engine,
+ void *vaddr)
+{
+ if (i915_ggtt_offset(ce->state) !=
+ (ce->lrc.lrca & CTX_GTT_ADDRESS_MASK))
+ set_bit(CONTEXT_LRCA_DIRTY, &ce->flags);
+
+ /*
+ * GuC context gets pinned in guc_request_alloc. See that function for
+ * an explanation of why.
+ */
+
+ return lrc_pin(ce, engine, vaddr);
}
static int guc_context_pre_pin(struct intel_context *ce,
struct i915_gem_ww_ctx *ww,
void **vaddr)
{
- return lrc_pre_pin(ce, ce->engine, ww, vaddr);
+ return __guc_context_pre_pin(ce, ce->engine, ww, vaddr);
}
static int guc_context_pin(struct intel_context *ce, void *vaddr)
{
- return lrc_pin(ce, ce->engine, vaddr);
+ return __guc_context_pin(ce, ce->engine, vaddr);
+}
+
+static void guc_context_unpin(struct intel_context *ce)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+
+ unpin_guc_id(guc, ce);
+ lrc_unpin(ce);
+}
+
+static void guc_context_post_unpin(struct intel_context *ce)
+{
+ lrc_post_unpin(ce);
+}
+
+static void __guc_context_sched_enable(struct intel_guc *guc,
+ struct intel_context *ce)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET,
+ ce->guc_id,
+ GUC_CONTEXT_ENABLE
+ };
+
+ trace_intel_context_sched_enable(ce);
+
+ guc_submission_send_busy_loop(guc, action, ARRAY_SIZE(action),
+ G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, true);
+}
+
+static void __guc_context_sched_disable(struct intel_guc *guc,
+ struct intel_context *ce,
+ u16 guc_id)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET,
+ guc_id, /* ce->guc_id not stable */
+ GUC_CONTEXT_DISABLE
+ };
+
+ GEM_BUG_ON(guc_id == GUC_INVALID_LRC_ID);
+
+ trace_intel_context_sched_disable(ce);
+
+ guc_submission_send_busy_loop(guc, action, ARRAY_SIZE(action),
+ G2H_LEN_DW_SCHED_CONTEXT_MODE_SET, true);
+}
+
+static void guc_blocked_fence_complete(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_state.lock);
+
+ if (!i915_sw_fence_done(&ce->guc_blocked))
+ i915_sw_fence_complete(&ce->guc_blocked);
+}
+
+static void guc_blocked_fence_reinit(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_state.lock);
+ GEM_BUG_ON(!i915_sw_fence_done(&ce->guc_blocked));
+
+ /*
+ * This fence is always complete unless a pending schedule disable is
+ * outstanding. We arm the fence here and complete it when we receive
+ * the pending schedule disable complete message.
+ */
+ i915_sw_fence_fini(&ce->guc_blocked);
+ i915_sw_fence_reinit(&ce->guc_blocked);
+ i915_sw_fence_await(&ce->guc_blocked);
+ i915_sw_fence_commit(&ce->guc_blocked);
+}
+
+static u16 prep_context_pending_disable(struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_state.lock);
+
+ set_context_pending_disable(ce);
+ clr_context_enabled(ce);
+ guc_blocked_fence_reinit(ce);
+ intel_context_get(ce);
+
+ return ce->guc_id;
+}
+
+static struct i915_sw_fence *guc_context_block(struct intel_context *ce)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+ struct i915_sched_engine *sched_engine = ce->engine->sched_engine;
+ unsigned long flags;
+ struct intel_runtime_pm *runtime_pm = ce->engine->uncore->rpm;
+ intel_wakeref_t wakeref;
+ u16 guc_id;
+ bool enabled;
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+
+ /*
+ * Sync with the submission path; increment before the below changes to
+ * the context state.
+ */
+ spin_lock(&sched_engine->lock);
+ incr_context_blocked(ce);
+ spin_unlock(&sched_engine->lock);
+
+ enabled = context_enabled(ce);
+ if (unlikely(!enabled || submission_disabled(guc))) {
+ if (enabled)
+ clr_context_enabled(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+ return &ce->guc_blocked;
+ }
+
+ /*
+ * We add +2 here as the schedule disable complete CTB handler calls
+ * intel_context_sched_disable_unpin (-2 to pin_count).
+ */
+ atomic_add(2, &ce->pin_count);
+
+ guc_id = prep_context_pending_disable(ce);
+
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ __guc_context_sched_disable(guc, ce, guc_id);
+
+ return &ce->guc_blocked;
+}
+
+static void guc_context_unblock(struct intel_context *ce)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+ struct i915_sched_engine *sched_engine = ce->engine->sched_engine;
+ unsigned long flags;
+ struct intel_runtime_pm *runtime_pm = ce->engine->uncore->rpm;
+ intel_wakeref_t wakeref;
+ bool enable;
+
+ GEM_BUG_ON(context_enabled(ce));
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+
+ if (unlikely(submission_disabled(guc) ||
+ !intel_context_is_pinned(ce) ||
+ context_pending_disable(ce) ||
+ context_blocked(ce) > 1)) {
+ enable = false;
+ } else {
+ enable = true;
+ set_context_pending_enable(ce);
+ set_context_enabled(ce);
+ intel_context_get(ce);
+ }
+
+ /*
+ * Sync with the submission path; decrement after the above changes to
+ * the context state.
+ */
+ spin_lock(&sched_engine->lock);
+ decr_context_blocked(ce);
+ spin_unlock(&sched_engine->lock);
+
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ if (enable) {
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ __guc_context_sched_enable(guc, ce);
+ }
+}
+
+static void guc_context_cancel_request(struct intel_context *ce,
+ struct i915_request *rq)
+{
+ if (i915_sw_fence_signaled(&rq->submit)) {
+ struct i915_sw_fence *fence = guc_context_block(ce);
+
+ i915_sw_fence_wait(fence);
+ if (!i915_request_completed(rq)) {
+ __i915_request_skip(rq);
+ guc_reset_state(ce, intel_ring_wrap(ce->ring, rq->head),
+ true);
+ }
+ guc_context_unblock(ce);
+ }
+}
+
+static void __guc_context_set_preemption_timeout(struct intel_guc *guc,
+ u16 guc_id,
+ u32 preemption_timeout)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_SET_CONTEXT_PREEMPTION_TIMEOUT,
+ guc_id,
+ preemption_timeout
+ };
+
+ intel_guc_send_busy_loop(guc, action, ARRAY_SIZE(action), 0, true);
+}
+
+static void guc_context_ban(struct intel_context *ce, struct i915_request *rq)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+ struct intel_runtime_pm *runtime_pm =
+ &ce->engine->gt->i915->runtime_pm;
+ intel_wakeref_t wakeref;
+ unsigned long flags;
+
+ guc_flush_submissions(guc);
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ set_context_banned(ce);
+
+ if (submission_disabled(guc) ||
+ (!context_enabled(ce) && !context_pending_disable(ce))) {
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ guc_cancel_context_requests(ce);
+ intel_engine_signal_breadcrumbs(ce->engine);
+ } else if (!context_pending_disable(ce)) {
+ u16 guc_id;
+
+ /*
+ * We add +2 here as the schedule disable complete CTB handler
+ * calls intel_context_sched_disable_unpin (-2 to pin_count).
+ */
+ atomic_add(2, &ce->pin_count);
+
+ guc_id = prep_context_pending_disable(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ /*
+ * In addition to disabling scheduling, set the preemption
+ * timeout to the minimum value (1 us) so the banned context
+ * gets kicked off the HW ASAP.
+ */
+ with_intel_runtime_pm(runtime_pm, wakeref) {
+ __guc_context_set_preemption_timeout(guc, guc_id, 1);
+ __guc_context_sched_disable(guc, ce, guc_id);
+ }
+ } else {
+ if (!context_guc_id_invalid(ce))
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ __guc_context_set_preemption_timeout(guc,
+ ce->guc_id,
+ 1);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+ }
+}
+
+static void guc_context_sched_disable(struct intel_context *ce)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+ unsigned long flags;
+ struct intel_runtime_pm *runtime_pm = &ce->engine->gt->i915->runtime_pm;
+ intel_wakeref_t wakeref;
+ u16 guc_id;
+ bool enabled;
+
+ if (submission_disabled(guc) || context_guc_id_invalid(ce) ||
+ !lrc_desc_registered(guc, ce->guc_id)) {
+ clr_context_enabled(ce);
+ goto unpin;
+ }
+
+ if (!context_enabled(ce))
+ goto unpin;
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+
+ /*
+ * We have to check if the context has been disabled by another thread.
+ * We also have to check if the context has been pinned again as another
+ * pin operation is allowed to pass this function. Checking the pin
+ * count, within ce->guc_state.lock, synchronizes this function with
+ * guc_request_alloc ensuring a request doesn't slip through the
+ * 'context_pending_disable' fence. Checking within the spin lock (can't
+ * sleep) ensures another process doesn't pin this context and generate
+ * a request before we set the 'context_pending_disable' flag here.
+ */
+ enabled = context_enabled(ce);
+ if (unlikely(!enabled || submission_disabled(guc))) {
+ if (enabled)
+ clr_context_enabled(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+ goto unpin;
+ }
+ if (unlikely(atomic_add_unless(&ce->pin_count, -2, 2))) {
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+ return;
+ }
+ guc_id = prep_context_pending_disable(ce);
+
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ __guc_context_sched_disable(guc, ce, guc_id);
+
+ return;
+unpin:
+ intel_context_sched_disable_unpin(ce);
+}
+
+static inline void guc_lrc_desc_unpin(struct intel_context *ce)
+{
+ struct intel_guc *guc = ce_to_guc(ce);
+
+ GEM_BUG_ON(!lrc_desc_registered(guc, ce->guc_id));
+ GEM_BUG_ON(ce != __get_context(guc, ce->guc_id));
+ GEM_BUG_ON(context_enabled(ce));
+
+ clr_context_registered(ce);
+ deregister_context(ce, ce->guc_id, true);
+}
+
+static void __guc_context_destroy(struct intel_context *ce)
+{
+ GEM_BUG_ON(ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_HIGH] ||
+ ce->guc_prio_count[GUC_CLIENT_PRIORITY_HIGH] ||
+ ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_NORMAL] ||
+ ce->guc_prio_count[GUC_CLIENT_PRIORITY_NORMAL]);
+
+ lrc_fini(ce);
+ intel_context_fini(ce);
+
+ if (intel_engine_is_virtual(ce->engine)) {
+ struct guc_virtual_engine *ve =
+ container_of(ce, typeof(*ve), context);
+
+ if (ve->base.breadcrumbs)
+ intel_breadcrumbs_put(ve->base.breadcrumbs);
+
+ kfree(ve);
+ } else {
+ intel_context_free(ce);
+ }
+}
+
+static void guc_context_destroy(struct kref *kref)
+{
+ struct intel_context *ce = container_of(kref, typeof(*ce), ref);
+ struct intel_runtime_pm *runtime_pm = ce->engine->uncore->rpm;
+ struct intel_guc *guc = ce_to_guc(ce);
+ intel_wakeref_t wakeref;
+ unsigned long flags;
+ bool disabled;
+
+ /*
+ * If the guc_id is invalid, this context's guc_id has been stolen and we
+ * can free the context immediately. It can also be freed immediately if
+ * the context is not registered with the GuC or the GuC is in the middle
+ * of a reset.
+ */
+ if (context_guc_id_invalid(ce)) {
+ __guc_context_destroy(ce);
+ return;
+ } else if (submission_disabled(guc) ||
+ !lrc_desc_registered(guc, ce->guc_id)) {
+ release_guc_id(guc, ce);
+ __guc_context_destroy(ce);
+ return;
+ }
+
+ /*
+ * We have to acquire the context spinlock and check the guc_id again; if
+ * it is valid, it hasn't been stolen and needs to be deregistered. We
+ * delete this context from the list of unpinned guc_ids available to
+ * steal to seal a race with guc_lrc_desc_pin(). When the G2H CTB
+ * returns indicating this context has been deregistered the guc_id is
+ * returned to the pool of available guc_ids.
+ */
+ spin_lock_irqsave(&guc->contexts_lock, flags);
+ if (context_guc_id_invalid(ce)) {
+ spin_unlock_irqrestore(&guc->contexts_lock, flags);
+ __guc_context_destroy(ce);
+ return;
+ }
+
+ if (!list_empty(&ce->guc_id_link))
+ list_del_init(&ce->guc_id_link);
+ spin_unlock_irqrestore(&guc->contexts_lock, flags);
+
+ /* Seal race with Reset */
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ disabled = submission_disabled(guc);
+ if (likely(!disabled))
+ set_context_destroyed(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+ if (unlikely(disabled)) {
+ release_guc_id(guc, ce);
+ __guc_context_destroy(ce);
+ return;
+ }
+
+ /*
+ * We defer GuC context deregistration until the context is destroyed
+ * in order to save on CTBs. With this optimization ideally we only need
+ * 1 CTB to register the context during the first pin and 1 CTB to
+ * deregister the context when the context is destroyed. Without this
+ * optimization, a CTB would be needed every pin & unpin.
+ *
+ * XXX: Need to acquire the runtime wakeref as this can be triggered
+ * from context_free_worker when the runtime wakeref is not held.
+ * guc_lrc_desc_unpin requires the wakeref as a GuC register is written
+ * in H2G CTB to deregister the context. A future patch may defer this
+ * H2G CTB if the runtime wakeref is zero.
+ */
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ guc_lrc_desc_unpin(ce);
+}
+
+static int guc_context_alloc(struct intel_context *ce)
+{
+ return lrc_alloc(ce, ce->engine);
+}
+
+static void guc_context_set_prio(struct intel_guc *guc,
+ struct intel_context *ce,
+ u8 prio)
+{
+ u32 action[] = {
+ INTEL_GUC_ACTION_SET_CONTEXT_PRIORITY,
+ ce->guc_id,
+ prio,
+ };
+
+ GEM_BUG_ON(prio < GUC_CLIENT_PRIORITY_KMD_HIGH ||
+ prio > GUC_CLIENT_PRIORITY_NORMAL);
+
+ if (ce->guc_prio == prio || submission_disabled(guc) ||
+ !context_registered(ce))
+ return;
+
+ guc_submission_send_busy_loop(guc, action, ARRAY_SIZE(action), 0, true);
+
+ ce->guc_prio = prio;
+ trace_intel_context_set_prio(ce);
+}
+
+static inline u8 map_i915_prio_to_guc_prio(int prio)
+{
+ if (prio == I915_PRIORITY_NORMAL)
+ return GUC_CLIENT_PRIORITY_KMD_NORMAL;
+ else if (prio < I915_PRIORITY_NORMAL)
+ return GUC_CLIENT_PRIORITY_NORMAL;
+ else if (prio < I915_PRIORITY_DISPLAY)
+ return GUC_CLIENT_PRIORITY_HIGH;
+ else
+ return GUC_CLIENT_PRIORITY_KMD_HIGH;
+}
+
+static inline void add_context_inflight_prio(struct intel_context *ce,
+ u8 guc_prio)
+{
+ lockdep_assert_held(&ce->guc_active.lock);
+ GEM_BUG_ON(guc_prio >= ARRAY_SIZE(ce->guc_prio_count));
+
+ ++ce->guc_prio_count[guc_prio];
+
+ /* Overflow protection */
+ GEM_WARN_ON(!ce->guc_prio_count[guc_prio]);
+}
+
+static inline void sub_context_inflight_prio(struct intel_context *ce,
+ u8 guc_prio)
+{
+ lockdep_assert_held(&ce->guc_active.lock);
+ GEM_BUG_ON(guc_prio >= ARRAY_SIZE(ce->guc_prio_count));
+
+ /* Underflow protection */
+ GEM_WARN_ON(!ce->guc_prio_count[guc_prio]);
+
+ --ce->guc_prio_count[guc_prio];
+}
+
+static inline void update_context_prio(struct intel_context *ce)
+{
+ struct intel_guc *guc = &ce->engine->gt->uc.guc;
+ int i;
+
+ BUILD_BUG_ON(GUC_CLIENT_PRIORITY_KMD_HIGH != 0);
+ BUILD_BUG_ON(GUC_CLIENT_PRIORITY_KMD_HIGH > GUC_CLIENT_PRIORITY_NORMAL);
+
+ lockdep_assert_held(&ce->guc_active.lock);
+
+ for (i = 0; i < ARRAY_SIZE(ce->guc_prio_count); ++i) {
+ if (ce->guc_prio_count[i]) {
+ guc_context_set_prio(guc, ce, i);
+ break;
+ }
+ }
+}
+
+static inline bool new_guc_prio_higher(u8 old_guc_prio, u8 new_guc_prio)
+{
+ /* Lower value is higher priority */
+ return new_guc_prio < old_guc_prio;
+}
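
The per-context bookkeeping above means the context's GuC priority always tracks the highest-priority (numerically lowest) request still in flight. A standalone model of that behaviour (the idle fallback is a sketch-only simplification; the real update_context_prio() leaves the priority unchanged when no counts are set):

	#include <assert.h>

	#define NPRIO 4		/* KMD_HIGH = 0 ... NORMAL = 3, lower is higher prio */

	static int effective_prio(const unsigned int count[NPRIO])
	{
		int i;

		for (i = 0; i < NPRIO; ++i)	/* mirrors update_context_prio() */
			if (count[i])
				return i;
		return NPRIO - 1;		/* sketch-only idle fallback */
	}

	int main(void)
	{
		unsigned int count[NPRIO] = { 0 };

		count[3]++;				/* NORMAL request in flight */
		assert(effective_prio(count) == 3);
		count[1]++;				/* HIGH request arrives */
		assert(effective_prio(count) == 1);
		count[1]--;				/* ...and retires */
		assert(effective_prio(count) == 3);
		return 0;
	}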
+
+static void add_to_context(struct i915_request *rq)
+{
+ struct intel_context *ce = rq->context;
+ u8 new_guc_prio = map_i915_prio_to_guc_prio(rq_prio(rq));
+
+ GEM_BUG_ON(rq->guc_prio == GUC_PRIO_FINI);
+
+ spin_lock(&ce->guc_active.lock);
+ list_move_tail(&rq->sched.link, &ce->guc_active.requests);
+
+ if (rq->guc_prio == GUC_PRIO_INIT) {
+ rq->guc_prio = new_guc_prio;
+ add_context_inflight_prio(ce, rq->guc_prio);
+ } else if (new_guc_prio_higher(rq->guc_prio, new_guc_prio)) {
+ sub_context_inflight_prio(ce, rq->guc_prio);
+ rq->guc_prio = new_guc_prio;
+ add_context_inflight_prio(ce, rq->guc_prio);
+ }
+ update_context_prio(ce);
+
+ spin_unlock(&ce->guc_active.lock);
+}
+
+static void guc_prio_fini(struct i915_request *rq, struct intel_context *ce)
+{
+ lockdep_assert_held(&ce->guc_active.lock);
+
+ if (rq->guc_prio != GUC_PRIO_INIT &&
+ rq->guc_prio != GUC_PRIO_FINI) {
+ sub_context_inflight_prio(ce, rq->guc_prio);
+ update_context_prio(ce);
+ }
+ rq->guc_prio = GUC_PRIO_FINI;
+}
+
+static void remove_from_context(struct i915_request *rq)
+{
+ struct intel_context *ce = rq->context;
+
+ spin_lock_irq(&ce->guc_active.lock);
+
+ list_del_init(&rq->sched.link);
+ clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+
+ /* Prevent further __await_execution() registering a cb, then flush */
+ set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
+
+ guc_prio_fini(rq, ce);
+
+ spin_unlock_irq(&ce->guc_active.lock);
+
+ atomic_dec(&ce->guc_id_ref);
+ i915_request_notify_execute_cb_imm(rq);
}
static const struct intel_context_ops guc_context_ops = {
@@ -454,28 +1976,71 @@ static const struct intel_context_ops guc_context_ops = {
.pre_pin = guc_context_pre_pin,
.pin = guc_context_pin,
- .unpin = lrc_unpin,
- .post_unpin = lrc_post_unpin,
+ .unpin = guc_context_unpin,
+ .post_unpin = guc_context_post_unpin,
+
+ .ban = guc_context_ban,
+
+ .cancel_request = guc_context_cancel_request,
.enter = intel_context_enter_engine,
.exit = intel_context_exit_engine,
+ .sched_disable = guc_context_sched_disable,
+
.reset = lrc_reset,
- .destroy = lrc_destroy,
+ .destroy = guc_context_destroy,
+
+ .create_virtual = guc_create_virtual,
};
-static int guc_request_alloc(struct i915_request *request)
+static void __guc_signal_context_fence(struct intel_context *ce)
{
+ struct i915_request *rq;
+
+ lockdep_assert_held(&ce->guc_state.lock);
+
+ if (!list_empty(&ce->guc_state.fences))
+ trace_intel_context_fence_release(ce);
+
+ list_for_each_entry(rq, &ce->guc_state.fences, guc_fence_link)
+ i915_sw_fence_complete(&rq->submit);
+
+ INIT_LIST_HEAD(&ce->guc_state.fences);
+}
+
+static void guc_signal_context_fence(struct intel_context *ce)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ clr_context_wait_for_deregister_to_register(ce);
+ __guc_signal_context_fence(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+}
+
+static bool context_needs_register(struct intel_context *ce, bool new_guc_id)
+{
+ return (new_guc_id || test_bit(CONTEXT_LRCA_DIRTY, &ce->flags) ||
+ !lrc_desc_registered(ce_to_guc(ce), ce->guc_id)) &&
+ !submission_disabled(ce_to_guc(ce));
+}
+
+static int guc_request_alloc(struct i915_request *rq)
+{
+ struct intel_context *ce = rq->context;
+ struct intel_guc *guc = ce_to_guc(ce);
+ unsigned long flags;
int ret;
- GEM_BUG_ON(!intel_context_is_pinned(request->context));
+ GEM_BUG_ON(!intel_context_is_pinned(rq->context));
/*
* Flush enough space to reduce the likelihood of waiting after
* we start building the request - in which case we will just
* have to repeat work.
*/
- request->reserved_space += GUC_REQUEST_SIZE;
+ rq->reserved_space += GUC_REQUEST_SIZE;
/*
* Note that after this point, we have committed to using
@@ -486,40 +2051,232 @@ static int guc_request_alloc(struct i915_request *request)
*/
/* Unconditionally invalidate GPU caches and TLBs. */
- ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+ ret = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
if (ret)
return ret;
- request->reserved_space -= GUC_REQUEST_SIZE;
+ rq->reserved_space -= GUC_REQUEST_SIZE;
+
+ /*
+ * Call pin_guc_id here rather than in the pinning step as with
+ * dma_resv, contexts can be repeatedly pinned / unpinned, thrashing the
+ * guc_ids and creating horrible race conditions. This is especially bad
+ * when guc_ids are being stolen due to oversubscription. By the time
+ * this function is reached, it is guaranteed that the guc_id will be
+ * persistent until the generated request is retired, thus sealing these
+ * race conditions. It is still safe to fail here if guc_ids are
+ * exhausted and return -EAGAIN to the user indicating that they can try
+ * again in the future.
+ *
+ * There is no need for a lock here as the timeline mutex ensures at
+ * most one context can be executing this code path at once. The
+ * guc_id_ref is incremented once for every request in flight and
+ * decremented on each retire. When it is zero, a lock around the
+ * increment (in pin_guc_id) is needed to seal a race with unpin_guc_id.
+ */
+ if (atomic_add_unless(&ce->guc_id_ref, 1, 0))
+ goto out;
+
+ ret = pin_guc_id(guc, ce); /* returns 1 if new guc_id assigned */
+ if (unlikely(ret < 0))
+ return ret;
+ if (context_needs_register(ce, !!ret)) {
+ ret = guc_lrc_desc_pin(ce, true);
+ if (unlikely(ret)) { /* unwind */
+ if (ret == -EPIPE) {
+ disable_submission(guc);
+ goto out; /* GPU will be reset */
+ }
+ atomic_dec(&ce->guc_id_ref);
+ unpin_guc_id(guc, ce);
+ return ret;
+ }
+ }
+
+ clear_bit(CONTEXT_LRCA_DIRTY, &ce->flags);
+
+out:
+ /*
+ * We block all requests on this context if a G2H is pending for a
+ * schedule disable or context deregistration, as the GuC will fail a
+ * schedule enable or context registration, respectively, if either G2H
+ * is pending. Once a G2H returns, the fence blocking these requests is
+ * released (see guc_signal_context_fence).
+ *
+ * We can safely check the below fields outside of the lock as it isn't
+ * possible for these fields to transition from being clear to set, but
+ * the converse is possible, hence the need for the check within the lock.
+ */
+ if (likely(!context_wait_for_deregister_to_register(ce) &&
+ !context_pending_disable(ce)))
+ return 0;
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ if (context_wait_for_deregister_to_register(ce) ||
+ context_pending_disable(ce)) {
+ i915_sw_fence_await(&rq->submit);
+
+ list_add_tail(&rq->guc_fence_link, &ce->guc_state.fences);
+ }
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
return 0;
}
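The guc_id handling above is the subtle part: a lock-free fast path takes an extra reference only when one is already held, while the 0 -> 1 transition goes through pin_guc_id() under its lock. A minimal sketch of that pattern, with a hypothetical wrapper name (sketch only, not the driver's code):

static int sketch_get_guc_id(struct intel_guc *guc, struct intel_context *ce)
{
        /* Fast path: succeeds only if at least one reference exists. */
        if (atomic_add_unless(&ce->guc_id_ref, 1, 0))
                return 0;

        /*
         * Slow path: the 0 -> 1 transition must be serialized against
         * unpin_guc_id() and guc_id stealing; pin_guc_id() takes the
         * required lock internally.
         */
        return pin_guc_id(guc, ce);
}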
-static inline void queue_request(struct intel_engine_cs *engine,
- struct i915_request *rq,
- int prio)
+static int guc_virtual_context_pre_pin(struct intel_context *ce,
+ struct i915_gem_ww_ctx *ww,
+ void **vaddr)
{
- GEM_BUG_ON(!list_empty(&rq->sched.link));
- list_add_tail(&rq->sched.link,
- i915_sched_lookup_priolist(engine, prio));
- set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+ struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
+
+ return __guc_context_pre_pin(ce, engine, ww, vaddr);
}
-static void guc_submit_request(struct i915_request *rq)
+static int guc_virtual_context_pin(struct intel_context *ce, void *vaddr)
{
- struct intel_engine_cs *engine = rq->engine;
- unsigned long flags;
+ struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
- /* Will be called from irq-context when using foreign fences. */
- spin_lock_irqsave(&engine->active.lock, flags);
+ return __guc_context_pin(ce, engine, vaddr);
+}
- queue_request(engine, rq, rq_prio(rq));
+static void guc_virtual_context_enter(struct intel_context *ce)
+{
+ intel_engine_mask_t tmp, mask = ce->engine->mask;
+ struct intel_engine_cs *engine;
- GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
- GEM_BUG_ON(list_empty(&rq->sched.link));
+ for_each_engine_masked(engine, ce->engine->gt, mask, tmp)
+ intel_engine_pm_get(engine);
- tasklet_hi_schedule(&engine->execlists.tasklet);
+ intel_timeline_enter(ce->timeline);
+}
+
+static void guc_virtual_context_exit(struct intel_context *ce)
+{
+ intel_engine_mask_t tmp, mask = ce->engine->mask;
+ struct intel_engine_cs *engine;
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ for_each_engine_masked(engine, ce->engine->gt, mask, tmp)
+ intel_engine_pm_put(engine);
+
+ intel_timeline_exit(ce->timeline);
+}
+
+static int guc_virtual_context_alloc(struct intel_context *ce)
+{
+ struct intel_engine_cs *engine = guc_virtual_get_sibling(ce->engine, 0);
+
+ return lrc_alloc(ce, engine);
+}
+
+static const struct intel_context_ops virtual_guc_context_ops = {
+ .alloc = guc_virtual_context_alloc,
+
+ .pre_pin = guc_virtual_context_pre_pin,
+ .pin = guc_virtual_context_pin,
+ .unpin = guc_context_unpin,
+ .post_unpin = guc_context_post_unpin,
+
+ .ban = guc_context_ban,
+
+ .cancel_request = guc_context_cancel_request,
+
+ .enter = guc_virtual_context_enter,
+ .exit = guc_virtual_context_exit,
+
+ .sched_disable = guc_context_sched_disable,
+
+ .destroy = guc_context_destroy,
+
+ .get_sibling = guc_virtual_get_sibling,
+};
+
+static bool
+guc_irq_enable_breadcrumbs(struct intel_breadcrumbs *b)
+{
+ struct intel_engine_cs *sibling;
+ intel_engine_mask_t tmp, mask = b->engine_mask;
+ bool result = false;
+
+ for_each_engine_masked(sibling, b->irq_engine->gt, mask, tmp)
+ result |= intel_engine_irq_enable(sibling);
+
+ return result;
+}
+
+static void
+guc_irq_disable_breadcrumbs(struct intel_breadcrumbs *b)
+{
+ struct intel_engine_cs *sibling;
+ intel_engine_mask_t tmp, mask = b->engine_mask;
+
+ for_each_engine_masked(sibling, b->irq_engine->gt, mask, tmp)
+ intel_engine_irq_disable(sibling);
+}
+
+static void guc_init_breadcrumbs(struct intel_engine_cs *engine)
+{
+ int i;
+
+ /*
+ * In GuC submission mode we do not know which physical engine a request
+ * will be scheduled on; this creates a problem because the breadcrumb
+ * interrupt is per physical engine. To work around this we attach
+ * requests to, and direct all breadcrumb interrupts to, the first
+ * instance of an engine per class. In addition, all breadcrumb
+ * interrupts are enabled / disabled across an engine class in unison.
+ */
+ for (i = 0; i < MAX_ENGINE_INSTANCE; ++i) {
+ struct intel_engine_cs *sibling =
+ engine->gt->engine_class[engine->class][i];
+
+ if (sibling) {
+ if (engine->breadcrumbs != sibling->breadcrumbs) {
+ intel_breadcrumbs_put(engine->breadcrumbs);
+ engine->breadcrumbs =
+ intel_breadcrumbs_get(sibling->breadcrumbs);
+ }
+ break;
+ }
+ }
+
+ if (engine->breadcrumbs) {
+ engine->breadcrumbs->engine_mask |= engine->mask;
+ engine->breadcrumbs->irq_enable = guc_irq_enable_breadcrumbs;
+ engine->breadcrumbs->irq_disable = guc_irq_disable_breadcrumbs;
+ }
+}
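After guc_init_breadcrumbs() has run on every engine, all engines of a class should share the breadcrumbs object of the first instance, with engine_mask accumulating the class members. A hypothetical debug helper (not part of the patch) states that invariant:

static void sketch_check_class_breadcrumbs(struct intel_gt *gt, u8 class)
{
        struct intel_breadcrumbs *b = NULL;
        int i;

        for (i = 0; i < MAX_ENGINE_INSTANCE; ++i) {
                struct intel_engine_cs *e = gt->engine_class[class][i];

                if (!e)
                        continue;
                if (!b)
                        b = e->breadcrumbs;     /* sibling 0's breadcrumbs */
                GEM_BUG_ON(e->breadcrumbs != b);
                GEM_BUG_ON(!(b->engine_mask & e->mask));
        }
}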
+
+static void guc_bump_inflight_request_prio(struct i915_request *rq,
+ int prio)
+{
+ struct intel_context *ce = rq->context;
+ u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
+
+ /* Short-circuit if no priority bump is needed */
+ if (prio < I915_PRIORITY_NORMAL ||
+ rq->guc_prio == GUC_PRIO_FINI ||
+ (rq->guc_prio != GUC_PRIO_INIT &&
+ !new_guc_prio_higher(rq->guc_prio, new_guc_prio)))
+ return;
+
+ spin_lock(&ce->guc_active.lock);
+ if (rq->guc_prio != GUC_PRIO_FINI) {
+ if (rq->guc_prio != GUC_PRIO_INIT)
+ sub_context_inflight_prio(ce, rq->guc_prio);
+ rq->guc_prio = new_guc_prio;
+ add_context_inflight_prio(ce, rq->guc_prio);
+ update_context_prio(ce);
+ }
+ spin_unlock(&ce->guc_active.lock);
+}
+
+static void guc_retire_inflight_request_prio(struct i915_request *rq)
+{
+ struct intel_context *ce = rq->context;
+
+ spin_lock(&ce->guc_active.lock);
+ guc_prio_fini(rq, ce);
+ spin_unlock(&ce->guc_active.lock);
}
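These two hooks keep a per-context count of in-flight requests per GuC priority band; the context's effective priority is then the highest occupied band (lower index == higher priority, as guc_log_context_priority() below prints). A sketch of what update_context_prio() is implied to compute, assuming the guc_prio_count[] array seen later in this file and GUC_CLIENT_PRIORITY_KMD_NORMAL as the assumed default:

static u8 sketch_effective_prio(struct intel_context *ce)
{
        int i;

        for (i = GUC_CLIENT_PRIORITY_KMD_HIGH; i < GUC_CLIENT_PRIORITY_NUM; ++i)
                if (ce->guc_prio_count[i])
                        return i;       /* highest occupied band wins */

        return GUC_CLIENT_PRIORITY_KMD_NORMAL;  /* assumed default */
}

The real update_context_prio() additionally sends the H2G message that tells the GuC about the new priority.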
static void sanitize_hwsp(struct intel_engine_cs *engine)
@@ -588,21 +2345,68 @@ static int guc_resume(struct intel_engine_cs *engine)
return 0;
}
+static bool guc_sched_engine_disabled(struct i915_sched_engine *sched_engine)
+{
+ return !sched_engine->tasklet.callback;
+}
+
static void guc_set_default_submission(struct intel_engine_cs *engine)
{
engine->submit_request = guc_submit_request;
}
+static inline void guc_kernel_context_pin(struct intel_guc *guc,
+ struct intel_context *ce)
+{
+ if (context_guc_id_invalid(ce))
+ pin_guc_id(guc, ce);
+ guc_lrc_desc_pin(ce, true);
+}
+
+static inline void guc_init_lrc_mapping(struct intel_guc *guc)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ struct intel_engine_cs *engine;
+ enum intel_engine_id id;
+
+ /* make sure all descriptors are clean... */
+ xa_destroy(&guc->context_lookup);
+
+ /*
+ * Some contexts might have been pinned before we enabled GuC
+ * submission, so we need to add them to the GuC bookkeeping.
+ * Also, after a reset of the GuC we want to make sure that the
+ * information shared with the GuC is properly reset. The kernel LRCs are
+ * not attached to the gem_context, so they need to be added separately.
+ *
+ * Note: we purposefully do not check the return of guc_lrc_desc_pin,
+ * because that function can only fail if a reset is just starting. This
+ * is at the end of reset, so presumably another reset isn't happening
+ * and even if it did, this code would simply be run again.
+ */
+
+ for_each_engine(engine, gt, id)
+ if (engine->kernel_context)
+ guc_kernel_context_pin(guc, engine->kernel_context);
+}
+
static void guc_release(struct intel_engine_cs *engine)
{
engine->sanitize = NULL; /* no longer in control, nothing to sanitize */
- tasklet_kill(&engine->execlists.tasklet);
-
intel_engine_cleanup_common(engine);
lrc_fini_wa_ctx(engine);
}
+static void virtual_guc_bump_serial(struct intel_engine_cs *engine)
+{
+ struct intel_engine_cs *e;
+ intel_engine_mask_t tmp, mask = engine->mask;
+
+ for_each_engine_masked(e, engine->gt, mask, tmp)
+ e->serial++;
+}
+
static void guc_default_vfuncs(struct intel_engine_cs *engine)
{
/* Default vfuncs which can be overridden by each engine. */
@@ -611,13 +2415,15 @@ static void guc_default_vfuncs(struct intel_engine_cs *engine)
engine->cops = &guc_context_ops;
engine->request_alloc = guc_request_alloc;
+ engine->add_active_request = add_to_context;
+ engine->remove_active_request = remove_from_context;
- engine->schedule = i915_schedule;
+ engine->sched_engine->schedule = i915_schedule;
- engine->reset.prepare = guc_reset_prepare;
- engine->reset.rewind = guc_reset_rewind;
- engine->reset.cancel = guc_reset_cancel;
- engine->reset.finish = guc_reset_finish;
+ engine->reset.prepare = guc_reset_nop;
+ engine->reset.rewind = guc_rewind_nop;
+ engine->reset.cancel = guc_reset_nop;
+ engine->reset.finish = guc_reset_nop;
engine->emit_flush = gen8_emit_flush_xcs;
engine->emit_init_breadcrumb = gen8_emit_init_breadcrumb;
@@ -629,13 +2435,13 @@ static void guc_default_vfuncs(struct intel_engine_cs *engine)
engine->set_default_submission = guc_set_default_submission;
engine->flags |= I915_ENGINE_HAS_PREEMPTION;
+ engine->flags |= I915_ENGINE_HAS_TIMESLICES;
/*
* TODO: GuC supports timeslicing and semaphores as well, but they're
* handled by the firmware so some minor tweaks are required before
* enabling.
*
- * engine->flags |= I915_ENGINE_HAS_TIMESLICES;
* engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
*/
@@ -666,9 +2472,21 @@ static inline void guc_default_irqs(struct intel_engine_cs *engine)
intel_engine_set_irq_handler(engine, cs_irq_handler);
}
+static void guc_sched_engine_destroy(struct kref *kref)
+{
+ struct i915_sched_engine *sched_engine =
+ container_of(kref, typeof(*sched_engine), ref);
+ struct intel_guc *guc = sched_engine->private_data;
+
+ guc->sched_engine = NULL;
+ tasklet_kill(&sched_engine->tasklet); /* flush the callback */
+ kfree(sched_engine);
+}
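The destroy override installed above is reached through the scheduler's refcount. A sketch of the put side, assuming i915_sched_engine_put() is a thin kref_put() wrapper over the ref and destroy members this patch uses:

static inline void sketch_sched_engine_put(struct i915_sched_engine *se)
{
        /* On the final reference drop, guc_sched_engine_destroy() runs. */
        kref_put(&se->ref, se->destroy);
}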
+
int intel_guc_submission_setup(struct intel_engine_cs *engine)
{
struct drm_i915_private *i915 = engine->i915;
+ struct intel_guc *guc = &engine->gt->uc.guc;
/*
* The setup relies on several assumptions (e.g. irqs always enabled)
@@ -676,10 +2494,28 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine)
*/
GEM_BUG_ON(GRAPHICS_VER(i915) < 11);
- tasklet_setup(&engine->execlists.tasklet, guc_submission_tasklet);
+ if (!guc->sched_engine) {
+ guc->sched_engine = i915_sched_engine_create(ENGINE_VIRTUAL);
+ if (!guc->sched_engine)
+ return -ENOMEM;
+
+ guc->sched_engine->schedule = i915_schedule;
+ guc->sched_engine->disabled = guc_sched_engine_disabled;
+ guc->sched_engine->private_data = guc;
+ guc->sched_engine->destroy = guc_sched_engine_destroy;
+ guc->sched_engine->bump_inflight_request_prio =
+ guc_bump_inflight_request_prio;
+ guc->sched_engine->retire_inflight_request_prio =
+ guc_retire_inflight_request_prio;
+ tasklet_setup(&guc->sched_engine->tasklet,
+ guc_submission_tasklet);
+ }
+ i915_sched_engine_put(engine->sched_engine);
+ engine->sched_engine = i915_sched_engine_get(guc->sched_engine);
guc_default_vfuncs(engine);
guc_default_irqs(engine);
+ guc_init_breadcrumbs(engine);
if (engine->class == RENDER_CLASS)
rcs_submission_override(engine);
@@ -695,18 +2531,19 @@ int intel_guc_submission_setup(struct intel_engine_cs *engine)
void intel_guc_submission_enable(struct intel_guc *guc)
{
- guc_stage_desc_init(guc);
+ guc_init_lrc_mapping(guc);
}
void intel_guc_submission_disable(struct intel_guc *guc)
{
- struct intel_gt *gt = guc_to_gt(guc);
-
- GEM_BUG_ON(gt->awake); /* GT should be parked first */
-
/* Note: By the time we're here, GuC may have already been reset */
+}
- guc_stage_desc_fini(guc);
+static bool __guc_submission_supported(struct intel_guc *guc)
+{
+ /* GuC submission is unavailable for pre-Gen11 */
+ return intel_guc_is_supported(guc) &&
+ GRAPHICS_VER(guc_to_gt(guc)->i915) >= 11;
}
static bool __guc_submission_selected(struct intel_guc *guc)
@@ -721,5 +2558,481 @@ static bool __guc_submission_selected(struct intel_guc *guc)
void intel_guc_submission_init_early(struct intel_guc *guc)
{
+ guc->submission_supported = __guc_submission_supported(guc);
guc->submission_selected = __guc_submission_selected(guc);
}
+
+static inline struct intel_context *
+g2h_context_lookup(struct intel_guc *guc, u32 desc_idx)
+{
+ struct intel_context *ce;
+
+ if (unlikely(desc_idx >= GUC_MAX_LRC_DESCRIPTORS)) {
+ drm_err(&guc_to_gt(guc)->i915->drm,
+ "Invalid desc_idx %u", desc_idx);
+ return NULL;
+ }
+
+ ce = __get_context(guc, desc_idx);
+ if (unlikely(!ce)) {
+ drm_err(&guc_to_gt(guc)->i915->drm,
+ "Context is NULL, desc_idx %u", desc_idx);
+ return NULL;
+ }
+
+ return ce;
+}
+
+static void decr_outstanding_submission_g2h(struct intel_guc *guc)
+{
+ if (atomic_dec_and_test(&guc->outstanding_submission_g2h))
+ wake_up_all(&guc->ct.wq);
+}
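This wake-up pairs with waiters sleeping on guc->ct.wq for the outstanding-G2H count to drain, such as the runtime-suspend path further down. A sketch of the waiting side using the plain wait_event API (the actual intel_guc_wait_for_pending_msg() may differ internally):

static long sketch_wait_for_g2h(struct intel_guc *guc, long timeout)
{
        /* Returns remaining jiffies, or 0 on timeout. */
        return wait_event_timeout(guc->ct.wq,
                                  !atomic_read(&guc->outstanding_submission_g2h),
                                  timeout);
}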
+
+int intel_guc_deregister_done_process_msg(struct intel_guc *guc,
+ const u32 *msg,
+ u32 len)
+{
+ struct intel_context *ce;
+ u32 desc_idx = msg[0];
+
+ if (unlikely(len < 1)) {
+ drm_err(&guc_to_gt(guc)->i915->drm, "Invalid length %u", len);
+ return -EPROTO;
+ }
+
+ ce = g2h_context_lookup(guc, desc_idx);
+ if (unlikely(!ce))
+ return -EPROTO;
+
+ trace_intel_context_deregister_done(ce);
+
+ if (context_wait_for_deregister_to_register(ce)) {
+ struct intel_runtime_pm *runtime_pm =
+ &ce->engine->gt->i915->runtime_pm;
+ intel_wakeref_t wakeref;
+
+ /*
+ * The previous owner of this guc_id has been deregistered; it is
+ * now safe to register this context.
+ */
+ with_intel_runtime_pm(runtime_pm, wakeref)
+ register_context(ce, true);
+ guc_signal_context_fence(ce);
+ intel_context_put(ce);
+ } else if (context_destroyed(ce)) {
+ /* Context has been destroyed */
+ release_guc_id(guc, ce);
+ __guc_context_destroy(ce);
+ }
+
+ decr_outstanding_submission_g2h(guc);
+
+ return 0;
+}
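This handler completes a handshake that starts when a stolen guc_id is deregistered from its previous owner: the H2G side holds an extra context reference (dropped above) and bumps the outstanding-G2H count (decremented above). A sketch of that sending side, with a hypothetical helper name:

static void sketch_deregister_context(struct intel_context *ce)
{
        struct intel_guc *guc = ce_to_guc(ce);

        intel_context_get(ce);          /* dropped by the G2H handler */
        atomic_inc(&guc->outstanding_submission_g2h);
        /* ...then send the GUC_ACTION deregister H2G for ce->guc_id... */
}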
+
+int intel_guc_sched_done_process_msg(struct intel_guc *guc,
+ const u32 *msg,
+ u32 len)
+{
+ struct intel_context *ce;
+ unsigned long flags;
+ u32 desc_idx = msg[0];
+
+ if (unlikely(len < 2)) {
+ drm_err(&guc_to_gt(guc)->i915->drm, "Invalid length %u", len);
+ return -EPROTO;
+ }
+
+ ce = g2h_context_lookup(guc, desc_idx);
+ if (unlikely(!ce))
+ return -EPROTO;
+
+ if (unlikely(context_destroyed(ce) ||
+ (!context_pending_enable(ce) &&
+ !context_pending_disable(ce)))) {
+ drm_err(&guc_to_gt(guc)->i915->drm,
+ "Bad context sched_state 0x%x, 0x%x, desc_idx %u",
+ atomic_read(&ce->guc_sched_state_no_lock),
+ ce->guc_state.sched_state, desc_idx);
+ return -EPROTO;
+ }
+
+ trace_intel_context_sched_done(ce);
+
+ if (context_pending_enable(ce)) {
+ clr_context_pending_enable(ce);
+ } else if (context_pending_disable(ce)) {
+ bool banned;
+
+ /*
+ * Unpin must be done before __guc_signal_context_fence,
+ * otherwise there is a race in which requests are submitted and
+ * retired before this unpin completes, resulting in the pin_count
+ * going to zero while the context is still enabled.
+ */
+ intel_context_sched_disable_unpin(ce);
+
+ spin_lock_irqsave(&ce->guc_state.lock, flags);
+ banned = context_banned(ce);
+ clr_context_banned(ce);
+ clr_context_pending_disable(ce);
+ __guc_signal_context_fence(ce);
+ guc_blocked_fence_complete(ce);
+ spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+
+ if (banned) {
+ guc_cancel_context_requests(ce);
+ intel_engine_signal_breadcrumbs(ce->engine);
+ }
+ }
+
+ decr_outstanding_submission_g2h(guc);
+ intel_context_put(ce);
+
+ return 0;
+}
+
+static void capture_error_state(struct intel_guc *guc,
+ struct intel_context *ce)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ struct drm_i915_private *i915 = gt->i915;
+ struct intel_engine_cs *engine = __context_to_physical_engine(ce);
+ intel_wakeref_t wakeref;
+
+ intel_engine_set_hung_context(engine, ce);
+ with_intel_runtime_pm(&i915->runtime_pm, wakeref)
+ i915_capture_error_state(gt, engine->mask);
+ atomic_inc(&i915->gpu_error.reset_engine_count[engine->uabi_class]);
+}
+
+static void guc_context_replay(struct intel_context *ce)
+{
+ struct i915_sched_engine *sched_engine = ce->engine->sched_engine;
+
+ __guc_reset_context(ce, true);
+ tasklet_hi_schedule(&sched_engine->tasklet);
+}
+
+static void guc_handle_context_reset(struct intel_guc *guc,
+ struct intel_context *ce)
+{
+ trace_intel_context_reset(ce);
+
+ if (likely(!intel_context_is_banned(ce))) {
+ capture_error_state(guc, ce);
+ guc_context_replay(ce);
+ }
+}
+
+int intel_guc_context_reset_process_msg(struct intel_guc *guc,
+ const u32 *msg, u32 len)
+{
+ struct intel_context *ce;
+ int desc_idx;
+
+ if (unlikely(len != 1)) {
+ drm_err(&guc_to_gt(guc)->i915->drm, "Invalid length %u", len);
+ return -EPROTO;
+ }
+
+ desc_idx = msg[0];
+ ce = g2h_context_lookup(guc, desc_idx);
+ if (unlikely(!ce))
+ return -EPROTO;
+
+ guc_handle_context_reset(guc, ce);
+
+ return 0;
+}
+
+static struct intel_engine_cs *
+guc_lookup_engine(struct intel_guc *guc, u8 guc_class, u8 instance)
+{
+ struct intel_gt *gt = guc_to_gt(guc);
+ u8 engine_class = guc_class_to_engine_class(guc_class);
+
+ /* Class index is checked in class converter */
+ GEM_BUG_ON(instance > MAX_ENGINE_INSTANCE);
+
+ return gt->engine_class[engine_class][instance];
+}
+
+int intel_guc_engine_failure_process_msg(struct intel_guc *guc,
+ const u32 *msg, u32 len)
+{
+ struct intel_engine_cs *engine;
+ u8 guc_class, instance;
+ u32 reason;
+
+ if (unlikely(len != 3)) {
+ drm_err(&guc_to_gt(guc)->i915->drm, "Invalid length %u", len);
+ return -EPROTO;
+ }
+
+ guc_class = msg[0];
+ instance = msg[1];
+ reason = msg[2];
+
+ engine = guc_lookup_engine(guc, guc_class, instance);
+ if (unlikely(!engine)) {
+ drm_err(&guc_to_gt(guc)->i915->drm,
+ "Invalid engine %d:%d", guc_class, instance);
+ return -EPROTO;
+ }
+
+ intel_gt_handle_error(guc_to_gt(guc), engine->mask,
+ I915_ERROR_CAPTURE,
+ "GuC failed to reset %s (reason=0x%08x)\n",
+ engine->name, reason);
+
+ return 0;
+}
+
+void intel_guc_find_hung_context(struct intel_engine_cs *engine)
+{
+ struct intel_guc *guc = &engine->gt->uc.guc;
+ struct intel_context *ce;
+ struct i915_request *rq;
+ unsigned long index;
+
+ /* Reset called during driver load? GuC not yet initialised! */
+ if (unlikely(!guc_submission_initialized(guc)))
+ return;
+
+ xa_for_each(&guc->context_lookup, index, ce) {
+ if (!intel_context_is_pinned(ce))
+ continue;
+
+ if (intel_engine_is_virtual(ce->engine)) {
+ if (!(ce->engine->mask & engine->mask))
+ continue;
+ } else {
+ if (ce->engine != engine)
+ continue;
+ }
+
+ list_for_each_entry(rq, &ce->guc_active.requests, sched.link) {
+ if (i915_test_request_state(rq) != I915_REQUEST_ACTIVE)
+ continue;
+
+ intel_engine_set_hung_context(engine, ce);
+
+ /* Can only cope with one hang at a time... */
+ return;
+ }
+ }
+}
+
+void intel_guc_dump_active_requests(struct intel_engine_cs *engine,
+ struct i915_request *hung_rq,
+ struct drm_printer *m)
+{
+ struct intel_guc *guc = &engine->gt->uc.guc;
+ struct intel_context *ce;
+ unsigned long index;
+ unsigned long flags;
+
+ /* Reset called during driver load? GuC not yet initialised! */
+ if (unlikely(!guc_submission_initialized(guc)))
+ return;
+
+ xa_for_each(&guc->context_lookup, index, ce) {
+ if (!intel_context_is_pinned(ce))
+ continue;
+
+ if (intel_engine_is_virtual(ce->engine)) {
+ if (!(ce->engine->mask & engine->mask))
+ continue;
+ } else {
+ if (ce->engine != engine)
+ continue;
+ }
+
+ spin_lock_irqsave(&ce->guc_active.lock, flags);
+ intel_engine_dump_active_requests(&ce->guc_active.requests,
+ hung_rq, m);
+ spin_unlock_irqrestore(&ce->guc_active.lock, flags);
+ }
+}
+
+void intel_guc_submission_print_info(struct intel_guc *guc,
+ struct drm_printer *p)
+{
+ struct i915_sched_engine *sched_engine = guc->sched_engine;
+ struct rb_node *rb;
+ unsigned long flags;
+
+ if (!sched_engine)
+ return;
+
+ drm_printf(p, "GuC Number Outstanding Submission G2H: %u\n",
+ atomic_read(&guc->outstanding_submission_g2h));
+ drm_printf(p, "GuC tasklet count: %u\n\n",
+ atomic_read(&sched_engine->tasklet.count));
+
+ spin_lock_irqsave(&sched_engine->lock, flags);
+ drm_printf(p, "Requests in GuC submit tasklet:\n");
+ for (rb = rb_first_cached(&sched_engine->queue); rb; rb = rb_next(rb)) {
+ struct i915_priolist *pl = to_priolist(rb);
+ struct i915_request *rq;
+
+ priolist_for_each_request(rq, pl)
+ drm_printf(p, "guc_id=%u, seqno=%llu\n",
+ rq->context->guc_id,
+ rq->fence.seqno);
+ }
+ spin_unlock_irqrestore(&sched_engine->lock, flags);
+ drm_printf(p, "\n");
+}
+
+static inline void guc_log_context_priority(struct drm_printer *p,
+ struct intel_context *ce)
+{
+ int i;
+
+ drm_printf(p, "\t\tPriority: %d\n",
+ ce->guc_prio);
+ drm_printf(p, "\t\tNumber Requests (lower index == higher priority)\n");
+ for (i = GUC_CLIENT_PRIORITY_KMD_HIGH;
+ i < GUC_CLIENT_PRIORITY_NUM; ++i) {
+ drm_printf(p, "\t\tNumber requests in priority band[%d]: %d\n",
+ i, ce->guc_prio_count[i]);
+ }
+ drm_printf(p, "\n");
+}
+
+void intel_guc_submission_print_context_info(struct intel_guc *guc,
+ struct drm_printer *p)
+{
+ struct intel_context *ce;
+ unsigned long index;
+
+ xa_for_each(&guc->context_lookup, index, ce) {
+ drm_printf(p, "GuC lrc descriptor %u:\n", ce->guc_id);
+ drm_printf(p, "\tHW Context Desc: 0x%08x\n", ce->lrc.lrca);
+ drm_printf(p, "\t\tLRC Head: Internal %u, Memory %u\n",
+ ce->ring->head,
+ ce->lrc_reg_state[CTX_RING_HEAD]);
+ drm_printf(p, "\t\tLRC Tail: Internal %u, Memory %u\n",
+ ce->ring->tail,
+ ce->lrc_reg_state[CTX_RING_TAIL]);
+ drm_printf(p, "\t\tContext Pin Count: %u\n",
+ atomic_read(&ce->pin_count));
+ drm_printf(p, "\t\tGuC ID Ref Count: %u\n",
+ atomic_read(&ce->guc_id_ref));
+ drm_printf(p, "\t\tSchedule State: 0x%x, 0x%x\n\n",
+ ce->guc_state.sched_state,
+ atomic_read(&ce->guc_sched_state_no_lock));
+
+ guc_log_context_priority(p, ce);
+ }
+}
+
+static struct intel_context *
+guc_create_virtual(struct intel_engine_cs **siblings, unsigned int count)
+{
+ struct guc_virtual_engine *ve;
+ struct intel_guc *guc;
+ unsigned int n;
+ int err;
+
+ ve = kzalloc(sizeof(*ve), GFP_KERNEL);
+ if (!ve)
+ return ERR_PTR(-ENOMEM);
+
+ guc = &siblings[0]->gt->uc.guc;
+
+ ve->base.i915 = siblings[0]->i915;
+ ve->base.gt = siblings[0]->gt;
+ ve->base.uncore = siblings[0]->uncore;
+ ve->base.id = -1;
+
+ ve->base.uabi_class = I915_ENGINE_CLASS_INVALID;
+ ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
+ ve->base.uabi_instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
+ ve->base.saturated = ALL_ENGINES;
+
+ snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
+
+ ve->base.sched_engine = i915_sched_engine_get(guc->sched_engine);
+
+ ve->base.cops = &virtual_guc_context_ops;
+ ve->base.request_alloc = guc_request_alloc;
+ ve->base.bump_serial = virtual_guc_bump_serial;
+
+ ve->base.submit_request = guc_submit_request;
+
+ ve->base.flags = I915_ENGINE_IS_VIRTUAL;
+
+ intel_context_init(&ve->context, &ve->base);
+
+ for (n = 0; n < count; n++) {
+ struct intel_engine_cs *sibling = siblings[n];
+
+ GEM_BUG_ON(!is_power_of_2(sibling->mask));
+ if (sibling->mask & ve->base.mask) {
+ DRM_DEBUG("duplicate %s entry in load balancer\n",
+ sibling->name);
+ err = -EINVAL;
+ goto err_put;
+ }
+
+ ve->base.mask |= sibling->mask;
+
+ if (n != 0 && ve->base.class != sibling->class) {
+ DRM_DEBUG("invalid mixing of engine class, sibling %d, already %d\n",
+ sibling->class, ve->base.class);
+ err = -EINVAL;
+ goto err_put;
+ } else if (n == 0) {
+ ve->base.class = sibling->class;
+ ve->base.uabi_class = sibling->uabi_class;
+ snprintf(ve->base.name, sizeof(ve->base.name),
+ "v%dx%d", ve->base.class, count);
+ ve->base.context_size = sibling->context_size;
+
+ ve->base.add_active_request =
+ sibling->add_active_request;
+ ve->base.remove_active_request =
+ sibling->remove_active_request;
+ ve->base.emit_bb_start = sibling->emit_bb_start;
+ ve->base.emit_flush = sibling->emit_flush;
+ ve->base.emit_init_breadcrumb =
+ sibling->emit_init_breadcrumb;
+ ve->base.emit_fini_breadcrumb =
+ sibling->emit_fini_breadcrumb;
+ ve->base.emit_fini_breadcrumb_dw =
+ sibling->emit_fini_breadcrumb_dw;
+ ve->base.breadcrumbs =
+ intel_breadcrumbs_get(sibling->breadcrumbs);
+
+ ve->base.flags |= sibling->flags;
+
+ ve->base.props.timeslice_duration_ms =
+ sibling->props.timeslice_duration_ms;
+ ve->base.props.preempt_timeout_ms =
+ sibling->props.preempt_timeout_ms;
+ }
+ }
+
+ return &ve->context;
+
+err_put:
+ intel_context_put(&ve->context);
+ return ERR_PTR(err);
+}
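guc_create_virtual() is the GuC counterpart of the execlists virtual-engine constructor and is normally reached via the load-balancing uAPI. A hypothetical direct usage, building a two-wide video-decode balancer (error handling elided; assumes both instances exist):

struct intel_engine_cs *siblings[] = {
        gt->engine_class[VIDEO_DECODE_CLASS][0],
        gt->engine_class[VIDEO_DECODE_CLASS][1],
};
struct intel_context *ce;

ce = guc_create_virtual(siblings, ARRAY_SIZE(siblings));
if (IS_ERR(ce))
        return PTR_ERR(ce);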
+
+bool intel_guc_virtual_engine_has_heartbeat(const struct intel_engine_cs *ve)
+{
+ struct intel_engine_cs *engine;
+ intel_engine_mask_t tmp, mask = ve->mask;
+
+ for_each_engine_masked(engine, ve->gt, mask, tmp)
+ if (READ_ONCE(engine->props.heartbeat_interval_ms))
+ return true;
+
+ return false;
+}
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
index 3f7005018939..c7ef44fa0c36 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
@@ -10,6 +10,7 @@
#include "intel_guc.h"
+struct drm_printer;
struct intel_engine_cs;
void intel_guc_submission_init_early(struct intel_guc *guc);
@@ -20,11 +21,24 @@ void intel_guc_submission_fini(struct intel_guc *guc);
int intel_guc_preempt_work_create(struct intel_guc *guc);
void intel_guc_preempt_work_destroy(struct intel_guc *guc);
int intel_guc_submission_setup(struct intel_engine_cs *engine);
+void intel_guc_submission_print_info(struct intel_guc *guc,
+ struct drm_printer *p);
+void intel_guc_submission_print_context_info(struct intel_guc *guc,
+ struct drm_printer *p);
+void intel_guc_dump_active_requests(struct intel_engine_cs *engine,
+ struct i915_request *hung_rq,
+ struct drm_printer *m);
+
+bool intel_guc_virtual_engine_has_heartbeat(const struct intel_engine_cs *ve);
+
+int intel_guc_wait_for_pending_msg(struct intel_guc *guc,
+ atomic_t *wait_var,
+ bool interruptible,
+ long timeout);
static inline bool intel_guc_submission_is_supported(struct intel_guc *guc)
{
- /* XXX: GuC submission is unavailable for now */
- return false;
+ return guc->submission_supported;
}
static inline bool intel_guc_submission_is_wanted(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 6d8b9233214e..b104fb7607eb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -34,8 +34,14 @@ static void uc_expand_default_options(struct intel_uc *uc)
return;
}
- /* Default: enable HuC authentication only */
- i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
+ /* Intermediate platforms are HuC authentication only */
+ if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
+ i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
+ return;
+ }
+
+ /* Default: enable HuC authentication and GuC submission */
+ i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | ENABLE_GUC_SUBMISSION;
}
/* Reset GuC providing us with fresh state for both GuC and HuC.
@@ -69,16 +75,18 @@ static void __confirm_options(struct intel_uc *uc)
struct drm_i915_private *i915 = uc_to_gt(uc)->i915;
drm_dbg(&i915->drm,
- "enable_guc=%d (guc:%s submission:%s huc:%s)\n",
+ "enable_guc=%d (guc:%s submission:%s huc:%s slpc:%s)\n",
i915->params.enable_guc,
yesno(intel_uc_wants_guc(uc)),
yesno(intel_uc_wants_guc_submission(uc)),
- yesno(intel_uc_wants_huc(uc)));
+ yesno(intel_uc_wants_huc(uc)),
+ yesno(intel_uc_wants_guc_slpc(uc)));
if (i915->params.enable_guc == 0) {
GEM_BUG_ON(intel_uc_wants_guc(uc));
GEM_BUG_ON(intel_uc_wants_guc_submission(uc));
GEM_BUG_ON(intel_uc_wants_huc(uc));
+ GEM_BUG_ON(intel_uc_wants_guc_slpc(uc));
return;
}
@@ -120,6 +128,11 @@ void intel_uc_init_early(struct intel_uc *uc)
uc->ops = &uc_ops_off;
}
+void intel_uc_init_late(struct intel_uc *uc)
+{
+ intel_guc_init_late(&uc->guc);
+}
+
void intel_uc_driver_late_release(struct intel_uc *uc)
{
}
@@ -207,21 +220,6 @@ static void guc_handle_mmio_msg(struct intel_guc *guc)
spin_unlock_irq(&guc->irq_lock);
}
-static void guc_reset_interrupts(struct intel_guc *guc)
-{
- guc->interrupts.reset(guc);
-}
-
-static void guc_enable_interrupts(struct intel_guc *guc)
-{
- guc->interrupts.enable(guc);
-}
-
-static void guc_disable_interrupts(struct intel_guc *guc)
-{
- guc->interrupts.disable(guc);
-}
-
static int guc_enable_communication(struct intel_guc *guc)
{
struct intel_gt *gt = guc_to_gt(guc);
@@ -242,7 +240,7 @@ static int guc_enable_communication(struct intel_guc *guc)
guc_get_mmio_msg(guc);
guc_handle_mmio_msg(guc);
- guc_enable_interrupts(guc);
+ intel_guc_enable_interrupts(guc);
/* check for CT messages received before we enabled interrupts */
spin_lock_irq(&gt->irq_lock);
@@ -265,7 +263,7 @@ static void guc_disable_communication(struct intel_guc *guc)
*/
guc_clear_mmio_msg(guc);
- guc_disable_interrupts(guc);
+ intel_guc_disable_interrupts(guc);
intel_guc_ct_disable(&guc->ct);
@@ -323,9 +321,6 @@ static int __uc_init(struct intel_uc *uc)
if (i915_inject_probe_failure(uc_to_gt(uc)->i915))
return -ENOMEM;
- /* XXX: GuC submission is unavailable for now */
- GEM_BUG_ON(intel_uc_uses_guc_submission(uc));
-
ret = intel_guc_init(guc);
if (ret)
return ret;
@@ -463,7 +458,7 @@ static int __uc_init_hw(struct intel_uc *uc)
if (ret)
goto err_out;
- guc_reset_interrupts(guc);
+ intel_guc_reset_interrupts(guc);
/* WaEnableuKernelHeaderValidFix:skl */
/* WaEnableGuCBootHashCheckNotSet:skl,bxt,kbl */
@@ -505,12 +500,21 @@ static int __uc_init_hw(struct intel_uc *uc)
if (intel_uc_uses_guc_submission(uc))
intel_guc_submission_enable(guc);
+ if (intel_uc_uses_guc_slpc(uc)) {
+ ret = intel_guc_slpc_enable(&guc->slpc);
+ if (ret)
+ goto err_submission;
+ }
+
drm_info(&i915->drm, "%s firmware %s version %u.%u %s:%s\n",
intel_uc_fw_type_repr(INTEL_UC_FW_TYPE_GUC), guc->fw.path,
guc->fw.major_ver_found, guc->fw.minor_ver_found,
"submission",
enableddisabled(intel_uc_uses_guc_submission(uc)));
+ drm_info(&i915->drm, "GuC SLPC: %s\n",
+ enableddisabled(intel_uc_uses_guc_slpc(uc)));
+
if (intel_uc_uses_huc(uc)) {
drm_info(&i915->drm, "%s firmware %s version %u.%u %s:%s\n",
intel_uc_fw_type_repr(INTEL_UC_FW_TYPE_HUC),
@@ -525,6 +529,8 @@ static int __uc_init_hw(struct intel_uc *uc)
/*
* We've failed to load the firmware :(
*/
+err_submission:
+ intel_guc_submission_disable(guc);
err_log_capture:
__uc_capture_load_err_log(uc);
err_out:
@@ -565,23 +571,67 @@ void intel_uc_reset_prepare(struct intel_uc *uc)
{
struct intel_guc *guc = &uc->guc;
- if (!intel_guc_is_ready(guc))
+ uc->reset_in_progress = true;
+
+ /* Nothing to do if GuC isn't supported */
+ if (!intel_uc_supports_guc(uc))
return;
+ /* Firmware expected to be running when this function is called */
+ if (!intel_guc_is_ready(guc))
+ goto sanitize;
+
+ if (intel_uc_uses_guc_submission(uc))
+ intel_guc_submission_reset_prepare(guc);
+
+sanitize:
__uc_sanitize(uc);
}
+void intel_uc_reset(struct intel_uc *uc, bool stalled)
+{
+ struct intel_guc *guc = &uc->guc;
+
+ /* Firmware cannot be running when this function is called */
+ if (intel_uc_uses_guc_submission(uc))
+ intel_guc_submission_reset(guc, stalled);
+}
+
+void intel_uc_reset_finish(struct intel_uc *uc)
+{
+ struct intel_guc *guc = &uc->guc;
+
+ uc->reset_in_progress = false;
+
+ /* Firmware expected to be running when this function is called */
+ if (intel_guc_is_fw_running(guc) && intel_uc_uses_guc_submission(uc))
+ intel_guc_submission_reset_finish(guc);
+}
+
+void intel_uc_cancel_requests(struct intel_uc *uc)
+{
+ struct intel_guc *guc = &uc->guc;
+
+ /* Firmware cannot be running when this function is called */
+ if (intel_uc_uses_guc_submission(uc))
+ intel_guc_submission_cancel_requests(guc);
+}
+
void intel_uc_runtime_suspend(struct intel_uc *uc)
{
struct intel_guc *guc = &uc->guc;
- int err;
if (!intel_guc_is_ready(guc))
return;
- err = intel_guc_suspend(guc);
- if (err)
- DRM_DEBUG_DRIVER("Failed to suspend GuC, err=%d", err);
+ /*
+ * Wait for any outstanding CTB before tearing down communication /w the
+ * GuC.
+ */
+#define OUTSTANDING_CTB_TIMEOUT_PERIOD (HZ / 5)
+ intel_guc_wait_for_pending_msg(guc, &guc->outstanding_submission_g2h,
+ false, OUTSTANDING_CTB_TIMEOUT_PERIOD);
+ GEM_WARN_ON(atomic_read(&guc->outstanding_submission_g2h));
guc_disable_communication(guc);
}
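OUTSTANDING_CTB_TIMEOUT_PERIOD is in jiffies, so HZ / 5 is roughly 200 ms regardless of CONFIG_HZ; a quick check:

unsigned int ms = jiffies_to_msecs(HZ / 5);     /* == 200 on any HZ */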
@@ -590,17 +640,22 @@ void intel_uc_suspend(struct intel_uc *uc)
{
struct intel_guc *guc = &uc->guc;
intel_wakeref_t wakeref;
+ int err;
if (!intel_guc_is_ready(guc))
return;
- with_intel_runtime_pm(uc_to_gt(uc)->uncore->rpm, wakeref)
- intel_uc_runtime_suspend(uc);
+ with_intel_runtime_pm(&uc_to_gt(uc)->i915->runtime_pm, wakeref) {
+ err = intel_guc_suspend(guc);
+ if (err)
+ DRM_DEBUG_DRIVER("Failed to suspend GuC, err=%d", err);
+ }
}
static int __uc_resume(struct intel_uc *uc, bool enable_communication)
{
struct intel_guc *guc = &uc->guc;
+ struct intel_gt *gt = guc_to_gt(guc);
int err;
if (!intel_guc_is_fw_running(guc))
@@ -612,6 +667,13 @@ static int __uc_resume(struct intel_uc *uc, bool enable_communication)
if (enable_communication)
guc_enable_communication(guc);
+ /*
+ * If we are only resuming GuC communication but not reloading the
+ * GuC, we need to ensure the ARAT timer interrupt is enabled again.
+ * In the case of a GuC reload, it is enabled during SLPC enable.
+ */
+ if (enable_communication && intel_uc_uses_guc_slpc(uc))
+ intel_guc_pm_intrmsk_enable(gt);
+
err = intel_guc_resume(guc);
if (err) {
DRM_DEBUG_DRIVER("Failed to resume GuC, err=%d", err);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.h b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
index 9c954c589edf..866b462821c0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.h
@@ -7,7 +7,9 @@
#define _INTEL_UC_H_
#include "intel_guc.h"
+#include "intel_guc_rc.h"
#include "intel_guc_submission.h"
+#include "intel_guc_slpc.h"
#include "intel_huc.h"
#include "i915_params.h"
@@ -30,13 +32,19 @@ struct intel_uc {
/* Snapshot of GuC log from last failed load */
struct drm_i915_gem_object *load_err_log;
+
+ bool reset_in_progress;
};
void intel_uc_init_early(struct intel_uc *uc);
+void intel_uc_init_late(struct intel_uc *uc);
void intel_uc_driver_late_release(struct intel_uc *uc);
void intel_uc_driver_remove(struct intel_uc *uc);
void intel_uc_init_mmio(struct intel_uc *uc);
void intel_uc_reset_prepare(struct intel_uc *uc);
+void intel_uc_reset(struct intel_uc *uc, bool stalled);
+void intel_uc_reset_finish(struct intel_uc *uc);
+void intel_uc_cancel_requests(struct intel_uc *uc);
void intel_uc_suspend(struct intel_uc *uc);
void intel_uc_runtime_suspend(struct intel_uc *uc);
int intel_uc_resume(struct intel_uc *uc);
@@ -77,10 +85,17 @@ __uc_state_checker(x, func, uses, used)
uc_state_checkers(guc, guc);
uc_state_checkers(huc, huc);
uc_state_checkers(guc, guc_submission);
+uc_state_checkers(guc, guc_slpc);
+uc_state_checkers(guc, guc_rc);
#undef uc_state_checkers
#undef __uc_state_checker
+static inline int intel_uc_wait_for_idle(struct intel_uc *uc, long timeout)
+{
+ return intel_guc_wait_for_idle(&uc->guc, timeout);
+}
+
#define intel_uc_ops_function(_NAME, _OPS, _TYPE, _RET) \
static inline _TYPE intel_uc_##_NAME(struct intel_uc *uc) \
{ \
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index df647c9a8d56..3a16d08608a5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -48,19 +48,20 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
* firmware as TGL.
*/
#define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
- fw_def(ALDERLAKE_S, 0, guc_def(tgl, 49, 0, 1), huc_def(tgl, 7, 5, 0)) \
- fw_def(ROCKETLAKE, 0, guc_def(tgl, 49, 0, 1), huc_def(tgl, 7, 5, 0)) \
- fw_def(TIGERLAKE, 0, guc_def(tgl, 49, 0, 1), huc_def(tgl, 7, 5, 0)) \
- fw_def(JASPERLAKE, 0, guc_def(ehl, 49, 0, 1), huc_def(ehl, 9, 0, 0)) \
- fw_def(ELKHARTLAKE, 0, guc_def(ehl, 49, 0, 1), huc_def(ehl, 9, 0, 0)) \
- fw_def(ICELAKE, 0, guc_def(icl, 49, 0, 1), huc_def(icl, 9, 0, 0)) \
- fw_def(COMETLAKE, 5, guc_def(cml, 49, 0, 1), huc_def(cml, 4, 0, 0)) \
- fw_def(COMETLAKE, 0, guc_def(kbl, 49, 0, 1), huc_def(kbl, 4, 0, 0)) \
- fw_def(COFFEELAKE, 0, guc_def(kbl, 49, 0, 1), huc_def(kbl, 4, 0, 0)) \
- fw_def(GEMINILAKE, 0, guc_def(glk, 49, 0, 1), huc_def(glk, 4, 0, 0)) \
- fw_def(KABYLAKE, 0, guc_def(kbl, 49, 0, 1), huc_def(kbl, 4, 0, 0)) \
- fw_def(BROXTON, 0, guc_def(bxt, 49, 0, 1), huc_def(bxt, 2, 0, 0)) \
- fw_def(SKYLAKE, 0, guc_def(skl, 49, 0, 1), huc_def(skl, 2, 0, 0))
+ fw_def(ALDERLAKE_P, 0, guc_def(adlp, 62, 0, 3), huc_def(tgl, 7, 9, 3)) \
+ fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl, 7, 9, 3)) \
+ fw_def(ROCKETLAKE, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl, 7, 9, 3)) \
+ fw_def(TIGERLAKE, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl, 7, 9, 3)) \
+ fw_def(JASPERLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl, 9, 0, 0)) \
+ fw_def(ELKHARTLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl, 9, 0, 0)) \
+ fw_def(ICELAKE, 0, guc_def(icl, 62, 0, 0), huc_def(icl, 9, 0, 0)) \
+ fw_def(COMETLAKE, 5, guc_def(cml, 62, 0, 0), huc_def(cml, 4, 0, 0)) \
+ fw_def(COMETLAKE, 0, guc_def(kbl, 62, 0, 0), huc_def(kbl, 4, 0, 0)) \
+ fw_def(COFFEELAKE, 0, guc_def(kbl, 62, 0, 0), huc_def(kbl, 4, 0, 0)) \
+ fw_def(GEMINILAKE, 0, guc_def(glk, 62, 0, 0), huc_def(glk, 4, 0, 0)) \
+ fw_def(KABYLAKE, 0, guc_def(kbl, 62, 0, 0), huc_def(kbl, 4, 0, 0)) \
+ fw_def(BROXTON, 0, guc_def(bxt, 62, 0, 0), huc_def(bxt, 2, 0, 0)) \
+ fw_def(SKYLAKE, 0, guc_def(skl, 62, 0, 0), huc_def(skl, 2, 0, 0))
#define __MAKE_UC_FW_PATH(prefix_, name_, major_, minor_, patch_) \
"i915/" \