[Mesa-dev,47/95] i965/vec4: translate 64-bit swizzles to 32-bit

Submitted by Iago Toral Quiroga on July 19, 2016, 10:40 a.m.

Details

Message ID 1468924892-6910-48-git-send-email-itoral@igalia.com
State New
Series "i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0" ( rev: 2 1 ) in Mesa

Commit Message

The hardware can only operate with 32-bit swizzles, which is rather
limiting. However, we do not want to expose this to the optimization
passes, where it would be a mess to deal with. Instead, we let the
bulk of the vec4 backend ignore this fact and fix up the swizzles
right before codegen.

At the moment the pass only needs to handle single-value swizzles,
thanks to the scalarization pass that runs before it.

Note that this only works for X/Y swizzles. We will add support
for Z/W swizzles in the next patch, since they need a bit more
work.
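
To illustrate the translation, here is a minimal standalone sketch, not
the driver code itself. The helpers swizzle4(), get_swz(),
is_single_value_swizzle(), fits_in_32bit_swizzle() and expand_swizzle()
are hypothetical stand-ins written for this example; they assume the
usual packing of four 2-bit channel selectors (X=0, Y=1, Z=2, W=3) into
one swizzle value. The range assert is an addition of the sketch, not
something the pass in this patch does.

```cpp
#include <cassert>

// Hypothetical stand-in: pack four 2-bit channel selectors into one value.
constexpr unsigned swizzle4(unsigned a, unsigned b, unsigned c, unsigned d)
{
   return (a << 0) | (b << 2) | (c << 4) | (d << 6);
}

// Hypothetical stand-in: extract the selector for channel idx (0..3).
constexpr unsigned get_swz(unsigned swz, unsigned idx)
{
   return (swz >> (idx * 2)) & 0x3;
}

// True when all four channels select the same component (XXXX, YYYY, ...).
constexpr bool is_single_value_swizzle(unsigned swz)
{
   return get_swz(swz, 0) == get_swz(swz, 1) &&
          get_swz(swz, 0) == get_swz(swz, 2) &&
          get_swz(swz, 0) == get_swz(swz, 3);
}

// A 64-bit component c covers 32-bit channels 2c and 2c+1. Only X (0) and
// Y (1) keep both channels within the 2-bit selector range 0..3.
constexpr bool fits_in_32bit_swizzle(unsigned component)
{
   return component * 2 + 1 <= 3;
}

// Translate a scalarized 64-bit swizzle into the equivalent 32-bit one:
// XXXX (64-bit) -> XYXY (32-bit), YYYY (64-bit) -> ZWZW (32-bit).
inline unsigned expand_swizzle(unsigned swz64)
{
   assert(is_single_value_swizzle(swz64));

   const unsigned c = get_swz(swz64, 0);
   assert(fits_in_32bit_swizzle(c)); // Z/W need extra handling

   return swizzle4(c * 2, c * 2 + 1, c * 2, c * 2 + 1);
}
```

For Z and W the doubled selectors would be 4..7, which do not fit in a
2-bit field; that is why those components cannot be handled by this
simple doubling alone.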
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 31 +++++++++++++++++++++++++++++++
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 2 files changed, 32 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 6bbe5da..a20b2fd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1847,6 +1847,8 @@  vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value)
 void
 vec4_visitor::convert_to_hw_regs()
 {
+   expand_64bit_swizzle_to_32bit();
+
    foreach_block_and_inst(block, vec4_instruction, inst, cfg) {
       for (int i = 0; i < 3; i++) {
          struct src_reg &src = inst->src[i];
@@ -2149,6 +2151,35 @@  vec4_visitor::scalarize_df()
 }
 
 bool
+vec4_visitor::expand_64bit_swizzle_to_32bit()
+{
+   bool progress = false;
+
+   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
+      if (is_align1_df(inst))
+         continue;
+
+      for (int arg = 0; arg < 3; arg++) {
+         if (inst->src[arg].file == BAD_FILE)
+            continue;
+
+         if (type_sz(inst->src[arg].type) < 8)
+            continue;
+
+         /* This pass assumes that we have scalarized all DF instructions */
+         assert(brw_is_single_value_swizzle(inst->src[arg].swizzle));
+
+         unsigned swizzle = BRW_GET_SWZ(inst->src[arg].swizzle, 0);
+         inst->src[arg].swizzle = BRW_SWIZZLE4(swizzle * 2, swizzle * 2 + 1,
+                                               swizzle * 2, swizzle * 2 + 1);
+         progress = true;
+      }
+   }
+
+   return progress;
+}
+
+bool
 vec4_visitor::run()
 {
    if (shader_time_index >= 0)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 7abcc33..6504939 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -162,6 +162,7 @@  public:
 
    bool lower_simd_width();
    bool scalarize_df();
+   bool expand_64bit_swizzle_to_32bit();
 
    vec4_instruction *emit(vec4_instruction *inst);