[Mesa-dev,69/95] i965/vec4: don't propagate single-precision uniforms into 4-wide instructions

Submitted by Iago Toral Quiroga on July 19, 2016, 10:41 a.m.

Details

Message ID 1468924892-6910-70-git-send-email-itoral@igalia.com
State New
Series "i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0" ( rev: 2 1 ) in Mesa


Commit Message

Iago Toral Quiroga July 19, 2016, 10:41 a.m.
Otherwise we end up producing code that violates the register region
restriction that says that when execsize == width and hstride != 0
the vstride can't be 0.
---
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 11 +++++++++++
 1 file changed, 11 insertions(+)
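
The restriction in the commit message can be sketched as a small predicate. This is only an illustration, not code from the driver: the `Region` struct and both function names are hypothetical, and the `<0;4,1>` region used for a propagated 32-bit uniform is an assumption about how the vec4 backend addresses uniforms.

```cpp
#include <cassert>

/* Hypothetical, simplified model of a source register region
 * <vstride;width,hstride>. Not a type from the i965 driver.
 */
struct Region {
   unsigned vstride;
   unsigned width;
   unsigned hstride;
};

/* True when the region breaks the rule quoted in the commit message:
 * if execsize == width and hstride != 0, then vstride must not be 0.
 */
constexpr bool
violates_region_restriction(unsigned exec_size, Region r)
{
   return exec_size == r.width && r.hstride != 0 && r.vstride == 0;
}

/* Simplified stand-in for the check this patch adds to
 * try_copy_propagate(): refuse to propagate a 4-byte uniform into an
 * instruction with an execution size of 4. Assumes the value is
 * already known to live in the UNIFORM file.
 */
constexpr bool
can_propagate_uniform(unsigned exec_size, unsigned type_size_bytes)
{
   return !(exec_size == 4 && type_size_bytes == 4);
}

/* Small self-check exercising the cases discussed in the patch. */
inline void
check_examples()
{
   /* Assumed uniform region <0;4,1> on a split, 4-wide instruction. */
   assert(violates_region_restriction(4, Region{0, 4, 1}));
   /* The same region on an 8-wide instruction does not trip the rule. */
   assert(!violates_region_restriction(8, Region{0, 4, 1}));
   /* The added check rejects exactly the problematic combination. */
   assert(!can_propagate_uniform(4, 4));
}
```

Under these assumptions, the combination the patch rejects (execution size 4, 4-byte uniform source) is the one where the predicate reports a violation, while 8-wide instructions and 8-byte types are not affected by this particular check.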


diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index d284528..93c3b0e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -329,6 +329,17 @@  try_copy_propagate(const struct brw_device_info *devinfo,
    if (devinfo->gen < 8 && inst->regs_written > 1 && is_uniform(value))
       return false;
 
+   /* There is a regioning restriction such that if execsize == width
+    * and hstride != 0 then the vstride can't be 0. When we split instructions
+    * that take a single-precision source (like F->DF conversions) we end up
+    * with a 4-wide source on an instruction with an execution size of 4.
+    * If we then copy-propagate the source from a uniform we also end up with a
+    * vstride of 0 and we violate the restriction.
+    */
+   if (inst->exec_size == 4 && value.file == UNIFORM &&
+       type_sz(value.type) == 4)
+      return false;
+
    /* If the type of the copy value is different from the type of the
     * instruction then the swizzles and writemasks involved don't have the same
     * meaning and simply replacing the source would produce different semantics.