[Mesa-dev,45/95] i965/vec4: split double-precision bcsel

Submitted by Iago Toral Quiroga on July 19, 2016, 10:40 a.m.

Details

Message ID 1468924892-6910-46-git-send-email-itoral@igalia.com
State New
Series "i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0" ( rev: 2 1 ) in Mesa


Commit Message

Iago Toral Quiroga July 19, 2016, 10:40 a.m.
A hardware bug affects compressed double-precision bcsel instructions in
Align16 mode: they do not read the predication mask properly, which leads
to incorrect behavior, at least in non-uniform control flow. The bug does
not affect other predicated instructions, nor does it affect bcsel in
Align1 mode. This was found empirically and verified by Curro in the
simulator.

Fix this by splitting double-precision bcsel in Align16 mode to use an
execution size of 4.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 6 ++++++
 1 file changed, 6 insertions(+)
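
For readers without the driver source at hand, the width clamp described in
the commit message can be sketched in isolation. This is only an illustration
under stated assumptions: `Opcode`, `Inst`, `dst_type_size`,
`lowered_simd_width` and `split_count` are hypothetical stand-ins, not the
i965 types or functions, and the restriction to 8-byte destination types
follows the wording of the commit message rather than the exact condition in
the diff.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for the driver's types; not the real i965 definitions.
enum class Opcode { SEL, MOV };

struct Inst {
   Opcode opcode;
   unsigned exec_size;     // number of channels the instruction executes
   unsigned dst_type_size; // bytes per destination component (8 = double)
};

// Returns the widest execution size the instruction may be emitted with.
static unsigned
lowered_simd_width(const Inst &inst)
{
   unsigned width = std::min(8u, inst.exec_size);

   // Clamp only double-precision SEL to 4 channels, as the commit
   // message describes; everything else keeps its width.
   if (inst.opcode == Opcode::SEL && inst.dst_type_size == 8)
      width = std::min(width, 4u);

   return width;
}

// Number of instructions produced when splitting to the lowered width.
static unsigned
split_count(const Inst &inst)
{
   return inst.exec_size / lowered_simd_width(inst);
}
```

In this sketch an 8-wide double-precision SEL is split into two 4-wide
instructions, while a single-precision SEL or any other opcode is left at
its original width.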


diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 88bf895..610c45d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1969,6 +1969,12 @@  get_lowered_simd_width(const struct brw_device_info *devinfo,
 
    unsigned lowered_width = MIN2(8, inst->exec_size);
 
+   /* Align16 8-wide double-precision SEL does not read the predication
+    * mask properly (hardware bug). Verified empirically.
+    */
+   if (inst->opcode == BRW_OPCODE_SEL && type_sz(inst->dst.type) == 8)
+      lowered_width = MIN2(lowered_width, 4);
+
    /* HSW PRM, 3D Media GPGPU Engine, Region Alignment Rules for Direct
     * Register Addressing:
     *