aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/i915_cmd_parser.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-08-30drm/i915: Recreate vmapping even when the object is pinnedChris Wilson1-1/+1
Sometimes we know we are the only user of the bo, but since we take a protective pin_pages early on, an attempt to change the vmap on the object is denied because it is busy. i915_gem_object_pin_map() cannot tell from our single pin_count if the operation is safe. Instead we must pass that information down from the caller in the manner of I915_MAP_OVERRIDE. This issue has existed from the introduction of the mapping, but was never noticed as the only place where this conflict might happen is for cached kernel buffers (such as allocated by i915_gem_batch_pool_get()). Until recently there was only a single user (the cmdparser) so no conflicts ever occurred. However, we now use it to allocate batches for different operations (using MAP_WC on !llc for writes) in addition to the existing shadow batch (using MAP_WB for reads). We could either keep both mappings cached, or use a different write mechanism if we detect a MAP_WB already exists (i.e. clflush afterwards), but as we haven't seen this issue in the wild (it requires hitting the GPU reloc path in addition to the cmdparser) for simplicity just allow the mappings to be recreated. v2: Include the i915_MAP_OVERRIDE bit in the enum so the compiler knows about all the valid values. Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing") Testcase: igt/gem_lut_handle # byt, completely by accident Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170828104631.8606-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit a575c6761757232ea2c7dc9f370640754b90cc69) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-05-17drm/i915: Redefine ptr_pack_bits() and friendsChris Wilson1-1/+1
Rebrand the current (pointer | bits) pack/unpack utility macros as explicit bit twiddling for PAGE_SIZE so that we can use the more flexible underlying macros for different bits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-4-chris@chris-wilson.co.uk
2017-04-11drm/i915: Rename intel_engine_cs.exec_id to uabi_idChris Wilson1-4/+4
We want to refer to the index of the engine consistently throughout the userspace ABI. We already have such an index through the execbuffer engine specifier, that needs to be able to refer to each engine specifically, so rename it the index to uabi_id to reflect its generality beyond execbuf. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk
2017-03-10drm/i915/cmdparser: Limit clflush to active cachelinesChris Wilson1-11/+14
We only need to clflush those cachelines that we have validated to be read by the GPU. Userspace typically fills the batch length in correctly, the exceptions tend to be explicit tests within igt. v2: Use ptr_mask_bits() to make Mika happy v3: cmd is not advanced on MI_BBE, so make sure to include an extra dword in the clflush. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310115518.13832-1-chris@chris-wilson.co.uk
2017-01-06drm/i915: Consolidate checks for memcpy-from-wc supportChris Wilson1-1/+1
In order to silence sparse: ../drivers/gpu/drm/i915/i915_gpu_error.c:200:39: warning: Using plain integer as NULL pointer add a helper to check whether we have sse4.1 and that the desired alignment is valid for acceleration. v2: Explain the macros and split the two use cases between i915_has_memcpy_from_wc() and i915_can_memcpy_from_wc(). Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-1-chris@chris-wilson.co.uk
2016-11-24drm/i915: Use the precomputed value for whether to enable command parsingChris Wilson1-22/+1
As i915.enable_cmd_parser is an unsafe option, make it read-only at runtime. Now that it is constant, we can use the value determined during initialisation as to whether we need the cmdparser at execbuffer time. v2: Remove the inline for its single user, it is clear enough (and shorter) without! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161124125851.6615-1-chris@chris-wilson.co.uk
2016-11-24drm/i915: kick out cmd_parser specific structs from i915_drv.hMatthew Auld1-0/+96
No sense in keeping the cmd_descriptor and cmd_table structs in i915_drv.h, now that they are no longer referenced externally. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479942147-9837-1-git-send-email-matthew.auld@intel.com
2016-11-24drm/i915: cleanup use of INSTR_CLIENT_MASKMatthew Auld1-3/+3
Doing cmd_header >> 29 to extract our 3-bit client value where we know cmd_header is a u32 shouldn't then also require the use of a mask. So remove the redundant operation and get rid of INSTR_CLIENT_MASK now that there are no longer any users. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479163174-29686-1-git-send-email-matthew.auld@intel.com
2016-11-22drm/i915: don't whitelist oacontrol in cmd parserRobert Bragg1-37/+5
Being able to program OACONTROL from a non-privileged batch buffer is not sufficient to be able to configure the OA unit. This was originally allowed to help enable Mesa to expose OA counters via the INTEL_performance_query extension, but the current implementation based on programming OACONTROL via a batch buffer isn't able to report useable data without a more complete OA unit configuration. Mesa handles the possibility that writes to OACONTROL may not be allowed and so only advertises the extension after explicitly testing that a write to OACONTROL succeeds. Based on this; removing OACONTROL from the whitelist should be ok for userspace. Removing this simplifies adding a new kernel api for configuring the OA unit without needing to consider the possibility that userspace might trample on OACONTROL state which we'd like to start managing within the kernel instead. In particular running any Mesa based GL application currently results in clearing OACONTROL when initializing which would disable the capturing of metrics. v2: This bumps the command parser version from 8 to 9, as the change is visible to userspace. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sourab Gupta <sourab.gupta@intel.com>Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161108125148.25007-1-robert@sixbynine.org
2016-11-22drm/i915: return EACCES for check_cmd() failuresRobert Bragg1-2/+5
check_cmd() is checking whether a command adheres to certain restrictions that ensure it's safe to execute within a privileged batch buffer. Returning false implies a privilege problem, not that the command is invalid. The distinction makes the difference between allowing the buffer to be executed as an unprivileged batch buffer or returning an EINVAL error to userspace without executing anything. In a case where userspace may want to test whether it can successfully write to a register that needs privileges the distinction may be important and an EINVAL error may be considered fatal. In particular this is currently true for Mesa, which includes a test for whether OACONTROL can be written too, but Mesa treats any error when flushing a batch buffer as fatal, calling exit(1). As it is currently Mesa can gracefully handle a failure to write to OACONTROL if the command parser is disabled, but if we were to remove OACONTROL from the parser's whitelist then the returned EINVAL would break Mesa applications as they attempt an OACONTROL write. This bumps the command parser version from 7 to 8, as the change is visible to userspace. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sourab Gupta <sourab.gupta@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-4-robert@sixbynine.org
2016-11-22drm/i915: rename OACONTROL GEN7_OACONTROLRobert Bragg1-2/+2
OACONTROL changes quite a bit for gen8, with some bits split out into a per-context OACTXCONTROL register. Rename now before adding more gen7 OA registers Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Sourab Gupta <sourab.gupta@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-3-robert@sixbynine.org
2016-10-28drm/i915: Refactor object page APIChris Wilson1-1/+1
The plan is to make obtaining the backing storage for the object avoid struct_mutex (i.e. use its own locking). The first step is to update the API so that normal users only call pin/unpin whilst working on the backing storage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
2016-10-14drm/i915: Allocate intel_engine_cs structure only for the enabled enginesAkash Goel1-1/+2
With the possibility of addition of many more number of rings in future, the drm_i915_private structure could bloat as an array, of type intel_engine_cs, is embedded inside it. struct intel_engine_cs engine[I915_NUM_ENGINES]; Though this is still fine as generally there is only a single instance of drm_i915_private structure used, but not all of the possible rings would be enabled or active on most of the platforms. Some memory can be saved by allocating intel_engine_cs structure only for the enabled/active engines. Currently the engine/ring ID is kept static and dev_priv->engine[] is simply indexed using the enums defined in intel_engine_id. To save memory and continue using the static engine/ring IDs, 'engine' is defined as an array of pointers. struct intel_engine_cs *engine[I915_NUM_ENGINES]; dev_priv->engine[engine_ID] will be NULL for disabled engine instances. There is a text size reduction of 928 bytes, from 1028200 to 1027272, for i915.o file (but for i915.ko file text size remain same as 1193131 bytes). v2: - Remove the engine iterator field added in drm_i915_private structure, instead pass a local iterator variable to the for_each_engine** macros. (Chris) - Do away with intel_engine_initialized() and instead directly use the NULL pointer check on engine pointer. (Chris) v3: - Remove for_each_engine_id() macro, as the updated macro for_each_engine() can be used in place of it. (Chris) - Protect the access to Render engine Fault register with a NULL check, as engine specific init is done later in Driver load sequence. v4: - Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris) - Kill the superfluous init_engine_lists(). v5: - Cleanup the intel_engines_init() & intel_engines_setup(), with respect to allocation of intel_engine_cs structure. (Chris) v6: - Rebase. v7: - Optimize the for_each_engine_masked() macro. (Chris) - Change the type of 'iter' local variable to enum intel_engine_id. (Chris) - Rebase. v8: Rebase. v9: Rebase. v10: - For index calculation use engine ID instead of pointer based arithmetic in intel_engine_sync_index() as engine pointers are not contiguous now (Chris) - For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas) - Use for_each_engine macro for cleanup in intel_engines_init() and remove check for NULL engine pointer in cleanup() routines. (Joonas) v11: Rebase. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Akash Goel <akash.goel@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
2016-09-16drm/i915: use NULL for NULL pointersJani Nikula1-1/+1
Fix sparse warning: drivers/gpu/drm/i915/i915_cmd_parser.c:987:72: warning: Using plain integer as NULL pointer Fixes: 52a42cec4b70 ("drm/i915/cmdparser: Accelerate copies from WC memory") Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1473946137-1931-4-git-send-email-jani.nikula@intel.com
2016-08-18drm/i915/cmdparser: Accelerate copies from WC memoryChris Wilson1-27/+43
If we need to use clflush to prepare our batch for reads from memory, we can bypass the cache instead by using non-temporal copies. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-39-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Use binary search for faster register lookupChris Wilson1-22/+20
A significant proportion of the cmdparsing time for some batches is the cost to find the register in the mmiotable. We ensure that those tables are in ascending order such that we could do a binary search if it was ever merited. It is. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-38-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Check for SKIP descriptors firstChris Wilson1-0/+3
If the command descriptor says to skip it, ignore checking for anyother other conflict. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-37-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Compare against the previous command descriptorChris Wilson1-7/+14
On the blitter (and in test code), we see long sequences of repeated commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these, we can skip the hashtable lookup by remembering the previous command descriptor and doing a straightforward compare of the command header. The corollary is that we need to do one extra comparison before lookup up new commands. v2: Less magic mask (ok, it is still magic, but now you cannot see!) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-36-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Improve hash functionChris Wilson1-20/+31
The existing code's hashfunction is very suboptimal (most 3D commands use the same bucket degrading the hash to a long list). The code even acknowledge that the issue was known and the fix simple: /* * If we attempt to generate a perfect hash, we should be able to look at bits * 31:29 of a command from a batch buffer and use the full mask for that * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this. */ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-35-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Only cache the dst vmapChris Wilson1-14/+19
For simplicity, we want to continue using a contiguous mapping of the command buffer, but we can reduce the number of vmappings we hold by switching over to a page-by-page copy from the user batch buffer to the shadow. The cost for saving one linear mapping is about 5% in trivial workloads - which is more or less the overhead in calling kmap_atomic(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-34-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Use cached vmappingsChris Wilson1-80/+47
The single largest factor in the overhead of parsing the commands is the setup of the virtual mapping to provide a continuous block for the batch buffer. If we keep those vmappings around (against the better judgement of mm/vmalloc.c, which we offset by handwaving and looking suggestively at the shrinker) we can dramatically improve the performance of the parser for small batches (such as media workloads). Furthermore, we can use the prepare shmem read/write functions to determine how best we need to clflush the range (rather than every page of the object). The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt for the throughput on small batches. (Caching just the dst vmap and iterating over the src, doing a page by page copy is roughly 5% slower on both platforms. That may be an acceptable trade-off to eliminate one cached vmapping, and we may be able to reduce the per-page copying overhead further.) For *this* simple test case, the cmdparser is now within a factor of 2 of ideal performance. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-33-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Add the TIMESTAMP register for the other enginesChris Wilson1-0/+5
Since I have been using the BCS_TIMESTAMP to measure latency of execution upon the blitter ring, allow regular userspace to also read from that register. They are already allowed RCS_TIMESTAMP! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-32-chris@chris-wilson.co.uk
2016-08-18drm/i915/cmdparser: Make initialisation failure non-fatalChris Wilson1-13/+14
If the developer adds a register in the wrong order, we BUG during boot. That makes development and testing very difficult. Let's be a bit more friendly and disable the command parser with a big warning if the tables are invalid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk
2016-08-18drm/i915: Extract i915_gem_obj_prepare_shmem_write()Chris Wilson1-2/+2
This is a companion to i915_gem_obj_prepare_shmem_read() that prepares the backing storage for direct writes. It first serialises with the GPU, pins the backing storage and then indicates what clfushes are required in order for the writes to be coherent. Whilst here, fix support for ancient CPUs without clflush for which we cannot do the GTT+clflush tricks. v2: Add i915_gem_obj_finish_shmem_access() for symmetry Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-8-chris@chris-wilson.co.uk
2016-07-27drm/i915/cmdparser: Remove stray intel_engine_cs *ringChris Wilson1-35/+39
When we refer to intel_engine_cs, we want to use engine so as not to confuse ourselves about ringbuffers. v2: Rename all the functions as well, as well as a few more stray comments. v3: Split the really long error message strings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-6-git-send-email-chris@chris-wilson.co.uk Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-1-git-send-email-chris@chris-wilson.co.uk
2016-06-06drm/i915: Fix a buch of kerneldoc warningsTvrtko Ursulin1-4/+5
Just a bunch of stale kerneldocs generating warnings when building the docs. Mostly function parameters so not very useful but still. v2: Tidy. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1464958937-23344-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-05-09drm/i915: Store a i915 backpointer from engine, and use itChris Wilson1-6/+6
text data bss dec hex filename 6309351 3578714 696320 10584385 a18141 vmlinux 6308391 3578714 696320 10583425 a17d81 vmlinux Almost 1KiB of code reduction. v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions text data bss dec hex filename 6304579 3578778 696320 10579677 a16edd vmlinux 6303427 3578778 696320 10578525 a16a5d vmlinux Now over 1KiB! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
2016-05-09drm/i915: Allow MI_LOAD_REGISTER_REG between whitelisted registers.Kenneth Graunke1-2/+15
Allowing register copies where the source and destination are both whitelisted should be safe, and is useful. For example, Mesa uses this to load the command streamer math registers with data from the pipeline statistics counters. v2: Reject writes to OACONTROL (and reads as well :( Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1462521014-13595-1-git-send-email-chris@chris-wilson.co.uk
2016-05-05drm/i915: Report command parser version 0 if disabledChris Wilson1-1/+14
If the command parser is not active, then it is appropriate to report it as operating at version 0 as no higher mode is supported. This greatly simplifies userspace querying for the command parser as we then do not need to second guess when it will be active (a mixture of module parameters and generational support, which may change over time). v2: s/comand/command/ misspelling in comment Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1462368336-21230-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-03-21drm/i915: Bump command parser version for new whitelisted registersJordan Justen1-1/+2
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-6-git-send-email-jordan.l.justen@intel.com
2016-03-21drm/i915: Add Haswell CS GPR registers to whitelistJordan Justen1-0/+16
This is needed for the Mesa Vulkan driver on Haswell. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-5-git-send-email-jordan.l.justen@intel.com
2016-03-21drm/i915: Move Haswell registers to separate whitelist tableJordan Justen1-0/+4
Now that we can whitelist registers only on Haswell, move HSW_SCRATCH1 and HSW_ROW_CHICKEN3 into a separate Haswell only table. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-4-git-send-email-jordan.l.justen@intel.com
2016-03-21drm/i915: Use an array of register tables in command parserJordan Justen1-29/+72
For Haswell, we will want another table of registers while retaining the large common table of whitelisted registers shared by all gen7 devices. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> [danvet: Pipe patch through sed -e 's/\<ring\>/engine/g' to make it apply.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-03-21drm/i915: Add TIMESTAMP to register whitelistJordan Justen1-0/+1
This is needed for the Mesa Vulkan driver on Haswell. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1457335830-30923-2-git-send-email-jordan.l.justen@intel.com
2016-03-16drm/i915: Rename intel_engine_cs function parametersTvrtko Ursulin1-61/+61
@@ identifier func; @@ func(..., struct intel_engine_cs * - ring + engine , ...) { <... - ring + engine ...> } @@ identifier func; type T; @@ T func(..., struct intel_engine_cs * - ring + engine , ...); Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-18drm/i915: Type safe register read/writeVille Syrjälä1-4/+4
Make I915_READ and I915_WRITE more type safe by wrapping the register offset in a struct. This should eliminate most of the fumbles we've had with misplaced parens. This only takes care of normal mmio registers. We could extend the idea to other register types and define each with its own struct. That way you wouldn't be able to accidentally pass the wrong thing to a specific register access function. The gpio_reg setup is probably the ugliest thing left. But I figure I'd just leave it for now, and wait for some divine inspiration to strike before making it nice. As for the generated code, it's actually a bit better sometimes. Eg. looking at i915_irq_handler(), we can see the following change: lea 0x70024(%rdx,%rax,1),%r9d mov $0x1,%edx - movslq %r9d,%r9 - mov %r9,%rsi - mov %r9,-0x58(%rbp) - callq *0xd8(%rbx) + mov %r9d,%esi + mov %r9d,-0x48(%rbp) callq *0xd8(%rbx) So previously gcc thought the register offset might be signed and decided to sign extend it, just in case. The rest appears to be mostly just minor shuffling of instructions. v2: i915_mmio_reg_{offset,equal,valid}() helpers added s/_REG/_MMIO/ in the register defines mo more switch statements left to worry about ring_emit stuff got sorted in a prep patch cmd parser, lrc context and w/a batch buildup also in prep patch vgpu stuff cleaned up and moved to a prep patch all other unrelated changes split out v3: Rebased due to BXT DSI/BLC, MOCS, etc. v4: Rebased due to churn, s/i915_mmio_reg_t/i915_reg_t/ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1447853606-2751-1-git-send-email-ville.syrjala@linux.intel.com
2015-11-18drm/i915: Make the cmd parser 64bit regs explicitVille Syrjälä1-12/+17
Add defines for the upper halves of the registers used by the cmd parser. Getting rid of the arithmetic with the register offset will help in making registers type safe. v2: s/_HI/_UDW/ (Chris) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1446839080-18732-1-git-send-email-ville.syrjala@linux.intel.com
2015-10-06drm/i915: Add GEN7_GPGPU_DISPATCHDIMX/Y/Z to the register whitelistJordan Justen1-1/+5
This is required to support glDispatchComputeIndirect for gen7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-09-04drm/i915: Fix cmdparser STORE/LOAD command descriptorsChris Wilson1-2/+2
Fixes regression from commit f1afe24f0e736b9d7f2275e2b1504af3fe612f2a Author: Arun Siluvery <arun.siluvery@linux.intel.com> Date: Tue Aug 4 16:22:20 2015 +0100 drm/i915: Change SRM, LRM instructions to use correct length which forgot to account for the length bias when declaring the fixed length. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91844 Reported-by: Andreas Reis <andreas.reis@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-09-01drm/i915: Bump command parser version number.Francisco Jerez1-1/+2
This was forgotten in commit d351f6d94893f3ba98b1b20c5ef44c35fc1da124 Author: Francisco Jerez <currojerez@riseup.net> Date: Fri May 29 16:44:15 2015 +0300 drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-08-26drm/i915: Change SRM, LRM instructions to use correct lengthArun Siluvery1-5/+5
MI_STORE_REGISTER_MEM, MI_LOAD_REGISTER_MEM instructions are not really variable length instructions unlike MI_LOAD_REGISTER_IMM where it expects (reg, addr) pairs so use fixed length for these instructions. v2: rebase Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Appease checkpatch as Mika spotted in i915_reg.h - it seems terminally unhappy about i915_cmd_parser.c so that would be a separate patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-29drm/i915: Fix command parser table validatorHanno Böck1-1/+1
As we may like to use a bisection search on the tables in future, we need them to be ordered. For convenience we expect the compiled tables to be order and check on initialisation. However, the validator used the wrong iterators failed to spot the misordered MI tables and instead walked off into the unknown (as spotted by kasan). Signed-off-by: Hanno Boeck <hanno@hboeck.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Again hand-assemble patch ...] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-07-29drm/i915: Properly sort MI coomand tableHanno Böck1-1/+1
In the future, we may want to speed up command/register searching using a bisection and so we require them to be in ascending order respectively by command value or register address. However, this was not true for one pair in the MI table; make it so. Signed-off-by: Hanno Boeck <hanno@hboeck.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Hand-assemble patch from raw patch from Hanno and commit message from Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-07-06drm/i915: Update WaFlushCoherentL3CacheLinesAtContextSwitchArun Siluvery1-3/+3
In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after PIPE_CONTROL instruction but there is a slight complication as this is applied in WA batch where the values are only initialized once. Dave identified an issue with the current implementation where the register value is read once at the beginning and it is reused; this patch corrects this by saving the register value to memory, update register with the bit of our interest and restore it back with original value. This implementation uses MI_LOAD_REGISTER_MEM which is currently only used by command parser and was using a default length of 0. This is now updated with correct length and moved to appropriate place. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-15drm/i915: Add SCRATCH1 and ROW_CHICKEN3 to the register whitelist.Francisco Jerez1-0/+7
Only bit 27 of SCRATCH1 and bit 6 of ROW_CHICKEN3 are allowed to be set because of security-sensitive bits we don't want userspace to mess with. On HSW hardware the whitelisted bits control whether atomic read-modify-write operations are performed on L3 or on GTI, and when set to L3 (which can be 10x-30x better performing than on GTI, depending on the application) require great care to avoid a system hang, so we currently program them to be handled on GTI by default. Beignet can immediately start taking advantage of this change to enable L3 atomics. Mesa should eventually switch to L3 atomics too, but a number of non-trivial changes are still required so it will continue using GTI atomics for now. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-15drm/i915: Extend the parser to check register writes against a mask/value pair.Francisco Jerez1-45/+93
In some cases it might be unnecessary or dangerous to give userspace the right to write arbitrary values to some register, even though it might be desirable to give it control of some of its bits. This patch extends the register whitelist entries to contain a mask/value pair in addition to the register offset. For registers with non-zero mask, any LRM writes and LRI writes where the bits of the immediate given by the mask don't match the specified value will be rejected. This will be used in my next patch to grant userspace partial write access to some sensitive registers. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-06-15drm/i915: Fix command parser to validate multiple register access with the same command.Francisco Jerez1-31/+43
Until now the software command checker assumed that commands could read or write at most a single register per packet. This is not necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length list of offset/value pairs and writes them in sequence. The previous code would only check whether the first entry was valid, effectively allowing userspace to write unrestricted registers of the MMIO space by sending a multi-register write with a legal first register, with potential security implications on Gen6 and 7 hardware. Fix it by extending the drm_i915_cmd_descriptor table to represent multi-register access and making validate_cmd() iterate for all register offsets present in the command packet. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-04-10drm/i915: Tidy batch pool logicChris Wilson1-9/+3
Move the madvise logic out of the execbuffer main path into the relatively rare allocation path, making the execbuffer manipulation less fragile. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-03-17drm/i915: Fix vmap_batch page iterator overrunMika Kuoppala1-1/+4
vmap_batch() calculates amount of needed pages for the mapping we are going to create. And it uses this page count as an argument for the for_each_sg_pages() macro. The macro takes the number of sg list entities as an argument, not the page count. So we ended up iterating through all the pages on the mapped object, corrupting memory past the smaller pages[] array. Fix this by bailing out when we have enough pages. This regression has been introduced in commit 17cabf571e50677d980e9ab2a43c5f11213003ae Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jan 14 11:20:57 2015 +0000 drm/i915: Trim the command parser allocations Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-23drm/i915: Trim the command parser allocationsChris Wilson1-40/+34
Currently, the command parser tries to create a secondary batch exactly as large as the original, and vmap both. This is open to abuse by userspace using extremely large batch objects, but only executing very short batches. For example, this would be if userspace were to implement a command submission ringbuffer. However, we only need to allocate pages for just the contents of the command sequence in the batch - all relocations copied to the secondary batch will reference the original batch and so there can be no access to the secondary batch outside of the explicit execution region. Testcase: igt/gem_exec_big #ivb,byt,hsw Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>