aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/display/intel_dsb.c
diff options
context:
space:
mode:
authorVille Syrjälä <ville.syrjala@linux.intel.com>2021-10-21 01:33:39 +0300
committerVille Syrjälä <ville.syrjala@linux.intel.com>2021-11-10 00:38:06 +0200
commit115e0f687d29649b8805e3417e089e785b0ea61d (patch)
treeb650b254fd0e2875eb4c30523d67a79037f7e046 /drivers/gpu/drm/i915/display/intel_dsb.c
parentdrm/i915: Use vblank workers for gamma updates (diff)
downloadlinux-dev-115e0f687d29649b8805e3417e089e785b0ea61d.tar.xz
linux-dev-115e0f687d29649b8805e3417e089e785b0ea61d.zip
drm/i915: Use unlocked register accesses for LUT loads
We have to bash in a lot of registers to load the higher precision LUT modes. The locking overhead is significant, especially as we have to get this done as quickly as possible during vblank. So let's switch to unlocked accesses for these. Fortunately the LUT registers are mostly spread around such that two pipes do not have any registers on the same cacheline. So as long as commits on the same pipe are serialized (which they are) we should get away with this without angering the hardware. The only exceptions are the PREC_PIPEGCMAX registers on ilk/snb which we don't use atm as they are only used in the 12bit gamma mode. If/when we add support for that we may need to remember to still serialize those registers, though I'm not sure ilk/snb are actually affected by the same cacheline issue. I think ivb/hsw at least were, but they use a different set of registers for the precision LUT. I have a test case which is updating the LUTs on two pipes from a single atomic commit. Running that in a loop for a minute I get the following worst case with the locks in place: intel_crtc_vblank_work_start: pipe B, frame=10037, scanline=1081 intel_crtc_vblank_work_start: pipe A, frame=12274, scanline=769 intel_crtc_vblank_work_end: pipe A, frame=12274, scanline=58 intel_crtc_vblank_work_end: pipe B, frame=10037, scanline=74 And here's the worst case with the locks removed: intel_crtc_vblank_work_start: pipe B, frame=5869, scanline=1081 intel_crtc_vblank_work_start: pipe A, frame=7616, scanline=769 intel_crtc_vblank_work_end: pipe B, frame=5869, scanline=1096 intel_crtc_vblank_work_end: pipe A, frame=7616, scanline=777 The test was done on a snb using the 10bit 1024 entry LUT mode. The vtotals for the two displays are 793 and 1125. So we can see that with the locks ripped out the LUT updates are pretty nicely confined within the vblank, whereas with the locks in place we're routinely blasting past the vblank end which causes visual artifacts near the top of the screen. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211020223339.669-5-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Diffstat (limited to 'drivers/gpu/drm/i915/display/intel_dsb.c')
-rw-r--r--drivers/gpu/drm/i915/display/intel_dsb.c4
1 files changed, 2 insertions, 2 deletions
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
index 62a8a69f9f5d..83a69a4a4fea 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -100,7 +100,7 @@ void intel_dsb_indexed_reg_write(const struct intel_crtc_state *crtc_state,
u32 reg_val;
if (!dsb) {
- intel_de_write(dev_priv, reg, val);
+ intel_de_write_fw(dev_priv, reg, val);
return;
}
buf = dsb->cmd_buf;
@@ -177,7 +177,7 @@ void intel_dsb_reg_write(const struct intel_crtc_state *crtc_state,
dsb = crtc_state->dsb;
if (!dsb) {
- intel_de_write(dev_priv, reg, val);
+ intel_de_write_fw(dev_priv, reg, val);
return;
}