[Mesa-dev,v2] i965/fs: Don't use the pixel interpolater for centroid interpolation

Submitted by Neil Roberts on July 9, 2015, 1:31 p.m.

Details

Message ID 1436448719-3392-1-git-send-email-neil@linux.intel.com
State New
Headers show

Not browsing as part of any series.

Commit Message

Neil Roberts July 9, 2015, 1:31 p.m.
For centroid interpolation we can just directly use the values set up
in the shader payload instead of querying the pixel interpolator. To
do this we need to modify brw_compute_barycentric_interp_modes to
detect when interpolateAtCentroid is called.

v2: Rebase on top of changes to set the pulls bary bit on SKL
---

As an aside, I was deliberating over whether to call the function
set_up_blah instead of setup_blah because I think the former is more
correct. The rest of Mesa seems to use setup so maybe it's more
important to be consistent than correct.

 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 +++++++++++++++++++-----------
 src/mesa/drivers/dri/i965/brw_wm.c       | 55 ++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+), 19 deletions(-)

Patch hide | download patch | download mbox

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 5d1ea21..fd7f1b8 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1238,6 +1238,25 @@  fs_visitor::emit_percomp(const fs_builder &bld, const fs_inst &inst,
    }
 }
 
+/* For most messages, we need one reg of ignored data; the hardware requires
+ * mlen==1 even when there is no payload. in the per-slot offset case, we'll
+ * replace this with the proper source data.
+ */
+static void
+setup_pixel_interpolater_instruction(fs_visitor *v,
+                                     nir_intrinsic_instr *instr,
+                                     fs_inst *inst,
+                                     int mlen = 1)
+{
+      inst->mlen = mlen;
+      inst->regs_written = 2 * v->dispatch_width / 8;
+      inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
+                               INTERP_QUALIFIER_NOPERSPECTIVE;
+
+      assert(v->stage == MESA_SHADER_FRAGMENT);
+      ((struct brw_wm_prog_data *) v->prog_data)->pulls_bary = true;
+}
+
 void
 fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr)
 {
@@ -1482,25 +1501,23 @@  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
    case nir_intrinsic_interp_var_at_centroid:
    case nir_intrinsic_interp_var_at_sample:
    case nir_intrinsic_interp_var_at_offset: {
-      assert(stage == MESA_SHADER_FRAGMENT);
-
-      ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true;
-
       fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
 
-      /* For most messages, we need one reg of ignored data; the hardware
-       * requires mlen==1 even when there is no payload. in the per-slot
-       * offset case, we'll replace this with the proper source data.
-       */
       fs_reg src = vgrf(glsl_type::float_type);
-      int mlen = 1;     /* one reg unless overriden */
       fs_inst *inst;
 
       switch (instr->intrinsic) {
-      case nir_intrinsic_interp_var_at_centroid:
-         inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_CENTROID,
-                         dst_xy, src, fs_reg(0u));
+      case nir_intrinsic_interp_var_at_centroid: {
+         enum brw_wm_barycentric_interp_mode interp_mode;
+         if (instr->variables[0]->var->data.interpolation ==
+             INTERP_QUALIFIER_NOPERSPECTIVE)
+            interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
+         else
+            interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
+         uint8_t reg = payload.barycentric_coord_reg[interp_mode];
+         dst_xy = fs_reg(brw_vec16_grf(reg, 0));
          break;
+      }
 
       case nir_intrinsic_interp_var_at_sample: {
          /* XXX: We should probably handle non-constant sample id's */
@@ -1509,6 +1526,7 @@  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
          unsigned msg_data = const_sample ? const_sample->i[0] << 4 : 0;
          inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SAMPLE, dst_xy, src,
                          fs_reg(msg_data));
+         setup_pixel_interpolater_instruction(this, instr, inst);
          break;
       }
 
@@ -1521,6 +1539,7 @@  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 
             inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET, dst_xy, src,
                             fs_reg(off_x | (off_y << 4)));
+            setup_pixel_interpolater_instruction(this, instr, inst);
          } else {
             src = vgrf(glsl_type::ivec2_type);
             fs_reg offset_src = retype(get_nir_src(instr->src[0]),
@@ -1550,9 +1569,10 @@  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
                            bld.SEL(offset(src, bld, i), itemp, fs_reg(7)));
             }
 
-            mlen = 2 * dispatch_width / 8;
             inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, dst_xy, src,
                             fs_reg(0u));
+            setup_pixel_interpolater_instruction(this, instr, inst,
+                                                 2 * dispatch_width / 8);
          }
          break;
       }
@@ -1561,12 +1581,6 @@  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
          unreachable("Invalid intrinsic");
       }
 
-      inst->mlen = mlen;
-      /* 2 floats per slot returned */
-      inst->regs_written = 2 * dispatch_width / 8;
-      inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
-                               INTERP_QUALIFIER_NOPERSPECTIVE;
-
       for (unsigned j = 0; j < instr->num_components; j++) {
          fs_reg src = interp_reg(instr->variables[0]->var->data.location, j);
          src.type = dest.type;
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index 592a729..f7fe1e0 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -40,9 +40,62 @@ 
 #include "program/prog_parameter.h"
 #include "program/program.h"
 #include "intel_mipmap_tree.h"
+#include "brw_nir.h"
 
 #include "util/ralloc.h"
 
+static bool
+compute_modes_in_block(nir_block *block,
+                       void *state)
+{
+   unsigned *interp_modes = state;
+   nir_intrinsic_instr *intrin;
+   enum brw_wm_barycentric_interp_mode interp_mode;
+
+   nir_foreach_instr(block, instr) {
+      if (instr->type != nir_instr_type_intrinsic)
+         continue;
+
+      intrin = nir_instr_as_intrinsic(instr);
+
+      if (intrin->intrinsic != nir_intrinsic_interp_var_at_centroid)
+         continue;
+
+      if (intrin->variables[0]->var->data.interpolation ==
+          INTERP_QUALIFIER_NOPERSPECTIVE)
+         interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
+      else
+         interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
+
+      *interp_modes |= 1 << interp_mode;
+   }
+
+   return true;
+}
+
+/**
+ * Looks for calls to interpolateAtCentroid within the program and returns a
+ * mask of the additional interpolation modes that they require.
+ */
+static unsigned
+compute_interpolate_at_centroid_modes(const struct gl_fragment_program *fprog)
+{
+   unsigned interp_modes = 0;
+   struct nir_shader *shader = fprog->Base.nir;
+
+   if (shader == NULL)
+      return 0;
+
+   nir_foreach_overload(shader, overload) {
+      if (overload->impl == NULL)
+         continue;
+
+      nir_foreach_block(overload->impl, compute_modes_in_block, &interp_modes);
+   }
+
+   return interp_modes;
+}
+
 /**
  * Return a bitfield where bit n is set if barycentric interpolation mode n
  * (see enum brw_wm_barycentric_interp_mode) is needed by the fragment shader.
@@ -114,6 +167,8 @@  brw_compute_barycentric_interp_modes(struct brw_context *brw,
       }
    }
 
+   barycentric_interp_modes |= compute_interpolate_at_centroid_modes(fprog);
+
    return barycentric_interp_modes;
 }
 

Comments

s/interpolater/interpolator/g

On Fri, Jul 10, 2015 at 1:31 AM, Neil Roberts <neil@linux.intel.com> wrote:
> For centroid interpolation we can just directly use the values set up
> in the shader payload instead of querying the pixel interpolator. To
> do this we need to modify brw_compute_barycentric_interp_modes to
> detect when interpolateAtCentroid is called.
>
> v2: Rebase on top of changes to set the pulls bary bit on SKL
> ---
>
> As an aside, I was deliberating over whether to call the function
> set_up_blah instead of setup_blah because I think the former is more
> correct. The rest of Mesa seems to use setup so maybe it's more
> important to be consistent than correct.
>
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 +++++++++++++++++++-----------
>  src/mesa/drivers/dri/i965/brw_wm.c       | 55 ++++++++++++++++++++++++++++++++
>  2 files changed, 88 insertions(+), 19 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 5d1ea21..fd7f1b8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1238,6 +1238,25 @@ fs_visitor::emit_percomp(const fs_builder &bld, const fs_inst &inst,
>     }
>  }
>
> +/* For most messages, we need one reg of ignored data; the hardware requires
> + * mlen==1 even when there is no payload. in the per-slot offset case, we'll
> + * replace this with the proper source data.
> + */
> +static void
> +setup_pixel_interpolater_instruction(fs_visitor *v,
> +                                     nir_intrinsic_instr *instr,
> +                                     fs_inst *inst,
> +                                     int mlen = 1)
> +{
> +      inst->mlen = mlen;
> +      inst->regs_written = 2 * v->dispatch_width / 8;
> +      inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
> +                               INTERP_QUALIFIER_NOPERSPECTIVE;
> +
> +      assert(v->stage == MESA_SHADER_FRAGMENT);
> +      ((struct brw_wm_prog_data *) v->prog_data)->pulls_bary = true;
> +}
> +
>  void
>  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr)
>  {
> @@ -1482,25 +1501,23 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>     case nir_intrinsic_interp_var_at_centroid:
>     case nir_intrinsic_interp_var_at_sample:
>     case nir_intrinsic_interp_var_at_offset: {
> -      assert(stage == MESA_SHADER_FRAGMENT);
> -
> -      ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true;
> -
>        fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
>
> -      /* For most messages, we need one reg of ignored data; the hardware
> -       * requires mlen==1 even when there is no payload. in the per-slot
> -       * offset case, we'll replace this with the proper source data.
> -       */
>        fs_reg src = vgrf(glsl_type::float_type);
> -      int mlen = 1;     /* one reg unless overriden */
>        fs_inst *inst;
>
>        switch (instr->intrinsic) {
> -      case nir_intrinsic_interp_var_at_centroid:
> -         inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_CENTROID,
> -                         dst_xy, src, fs_reg(0u));
> +      case nir_intrinsic_interp_var_at_centroid: {
> +         enum brw_wm_barycentric_interp_mode interp_mode;
> +         if (instr->variables[0]->var->data.interpolation ==
> +             INTERP_QUALIFIER_NOPERSPECTIVE)
> +            interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
> +         else
> +            interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
> +         uint8_t reg = payload.barycentric_coord_reg[interp_mode];
> +         dst_xy = fs_reg(brw_vec16_grf(reg, 0));
>           break;
> +      }
>
>        case nir_intrinsic_interp_var_at_sample: {
>           /* XXX: We should probably handle non-constant sample id's */
> @@ -1509,6 +1526,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>           unsigned msg_data = const_sample ? const_sample->i[0] << 4 : 0;
>           inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SAMPLE, dst_xy, src,
>                           fs_reg(msg_data));
> +         setup_pixel_interpolater_instruction(this, instr, inst);
>           break;
>        }
>
> @@ -1521,6 +1539,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>
>              inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET, dst_xy, src,
>                              fs_reg(off_x | (off_y << 4)));
> +            setup_pixel_interpolater_instruction(this, instr, inst);
>           } else {
>              src = vgrf(glsl_type::ivec2_type);
>              fs_reg offset_src = retype(get_nir_src(instr->src[0]),
> @@ -1550,9 +1569,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>                             bld.SEL(offset(src, bld, i), itemp, fs_reg(7)));
>              }
>
> -            mlen = 2 * dispatch_width / 8;
>              inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, dst_xy, src,
>                              fs_reg(0u));
> +            setup_pixel_interpolater_instruction(this, instr, inst,
> +                                                 2 * dispatch_width / 8);
>           }
>           break;
>        }
> @@ -1561,12 +1581,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>           unreachable("Invalid intrinsic");
>        }
>
> -      inst->mlen = mlen;
> -      /* 2 floats per slot returned */
> -      inst->regs_written = 2 * dispatch_width / 8;
> -      inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
> -                               INTERP_QUALIFIER_NOPERSPECTIVE;
> -
>        for (unsigned j = 0; j < instr->num_components; j++) {
>           fs_reg src = interp_reg(instr->variables[0]->var->data.location, j);
>           src.type = dest.type;
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
> index 592a729..f7fe1e0 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -40,9 +40,62 @@
>  #include "program/prog_parameter.h"
>  #include "program/program.h"
>  #include "intel_mipmap_tree.h"
> +#include "brw_nir.h"
>
>  #include "util/ralloc.h"
>
> +static bool
> +compute_modes_in_block(nir_block *block,
> +                       void *state)
> +{
> +   unsigned *interp_modes = state;
> +   nir_intrinsic_instr *intrin;
> +   enum brw_wm_barycentric_interp_mode interp_mode;
> +
> +   nir_foreach_instr(block, instr) {
> +      if (instr->type != nir_instr_type_intrinsic)
> +         continue;
> +
> +      intrin = nir_instr_as_intrinsic(instr);
> +
> +      if (intrin->intrinsic != nir_intrinsic_interp_var_at_centroid)
> +         continue;
> +
> +      if (intrin->variables[0]->var->data.interpolation ==
> +          INTERP_QUALIFIER_NOPERSPECTIVE)
> +         interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
> +      else
> +         interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
> +
> +      *interp_modes |= 1 << interp_mode;
> +   }
> +
> +   return true;
> +}
> +
> +/**
> + * Looks for calls to interpolateAtCentroid within the program and returns a
> + * mask of the additional interpolation modes that they require.
> + */
> +static unsigned
> +compute_interpolate_at_centroid_modes(const struct gl_fragment_program *fprog)
> +{
> +   unsigned interp_modes = 0;
> +   struct nir_shader *shader = fprog->Base.nir;
> +
> +   if (shader == NULL)
> +      return 0;
> +
> +   nir_foreach_overload(shader, overload) {
> +      if (overload->impl == NULL)
> +         continue;
> +
> +      nir_foreach_block(overload->impl, compute_modes_in_block, &interp_modes);
> +   }
> +
> +   return interp_modes;
> +}
> +
>  /**
>   * Return a bitfield where bit n is set if barycentric interpolation mode n
>   * (see enum brw_wm_barycentric_interp_mode) is needed by the fragment shader.
> @@ -114,6 +167,8 @@ brw_compute_barycentric_interp_modes(struct brw_context *brw,
>        }
>     }
>
> +   barycentric_interp_modes |= compute_interpolate_at_centroid_modes(fprog);
> +
>     return barycentric_interp_modes;
>  }
>
> --
> 1.9.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Nitpicks aside, I don't think this is a great idea now that you've got
the SKL PI working.

I also think it's broken -- you need to arrange to have the centroid
barycentric coords delivered to the FS thread, which won't be
happening if this is the *only* use of them. Masked in the tests,
because they compare with a centroid-qualified input. [I'm assuming
you don't always get these delivered to the FS in SKL, but no docs
access...]

- Chris

On Sat, Jul 11, 2015 at 11:18 AM, Chris Forbes <chrisf@ijw.co.nz> wrote:
> s/interpolater/interpolator/g
>
> On Fri, Jul 10, 2015 at 1:31 AM, Neil Roberts <neil@linux.intel.com> wrote:
>> For centroid interpolation we can just directly use the values set up
>> in the shader payload instead of querying the pixel interpolator. To
>> do this we need to modify brw_compute_barycentric_interp_modes to
>> detect when interpolateAtCentroid is called.
>>
>> v2: Rebase on top of changes to set the pulls bary bit on SKL
>> ---
>>
>> As an aside, I was deliberating over whether to call the function
>> set_up_blah instead of setup_blah because I think the former is more
>> correct. The rest of Mesa seems to use setup so maybe it's more
>> important to be consistent than correct.
>>
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 52 +++++++++++++++++++-----------
>>  src/mesa/drivers/dri/i965/brw_wm.c       | 55 ++++++++++++++++++++++++++++++++
>>  2 files changed, 88 insertions(+), 19 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 5d1ea21..fd7f1b8 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -1238,6 +1238,25 @@ fs_visitor::emit_percomp(const fs_builder &bld, const fs_inst &inst,
>>     }
>>  }
>>
>> +/* For most messages, we need one reg of ignored data; the hardware requires
>> + * mlen==1 even when there is no payload. in the per-slot offset case, we'll
>> + * replace this with the proper source data.
>> + */
>> +static void
>> +setup_pixel_interpolater_instruction(fs_visitor *v,
>> +                                     nir_intrinsic_instr *instr,
>> +                                     fs_inst *inst,
>> +                                     int mlen = 1)
>> +{
>> +      inst->mlen = mlen;
>> +      inst->regs_written = 2 * v->dispatch_width / 8;
>> +      inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
>> +                               INTERP_QUALIFIER_NOPERSPECTIVE;
>> +
>> +      assert(v->stage == MESA_SHADER_FRAGMENT);
>> +      ((struct brw_wm_prog_data *) v->prog_data)->pulls_bary = true;
>> +}
>> +
>>  void
>>  fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr)
>>  {
>> @@ -1482,25 +1501,23 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>>     case nir_intrinsic_interp_var_at_centroid:
>>     case nir_intrinsic_interp_var_at_sample:
>>     case nir_intrinsic_interp_var_at_offset: {
>> -      assert(stage == MESA_SHADER_FRAGMENT);
>> -
>> -      ((struct brw_wm_prog_data *) prog_data)->pulls_bary = true;
>> -
>>        fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
>>
>> -      /* For most messages, we need one reg of ignored data; the hardware
>> -       * requires mlen==1 even when there is no payload. in the per-slot
>> -       * offset case, we'll replace this with the proper source data.
>> -       */
>>        fs_reg src = vgrf(glsl_type::float_type);
>> -      int mlen = 1;     /* one reg unless overriden */
>>        fs_inst *inst;
>>
>>        switch (instr->intrinsic) {
>> -      case nir_intrinsic_interp_var_at_centroid:
>> -         inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_CENTROID,
>> -                         dst_xy, src, fs_reg(0u));
>> +      case nir_intrinsic_interp_var_at_centroid: {
>> +         enum brw_wm_barycentric_interp_mode interp_mode;
>> +         if (instr->variables[0]->var->data.interpolation ==
>> +             INTERP_QUALIFIER_NOPERSPECTIVE)
>> +            interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
>> +         else
>> +            interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
>> +         uint8_t reg = payload.barycentric_coord_reg[interp_mode];
>> +         dst_xy = fs_reg(brw_vec16_grf(reg, 0));
>>           break;
>> +      }
>>
>>        case nir_intrinsic_interp_var_at_sample: {
>>           /* XXX: We should probably handle non-constant sample id's */
>> @@ -1509,6 +1526,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>>           unsigned msg_data = const_sample ? const_sample->i[0] << 4 : 0;
>>           inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SAMPLE, dst_xy, src,
>>                           fs_reg(msg_data));
>> +         setup_pixel_interpolater_instruction(this, instr, inst);
>>           break;
>>        }
>>
>> @@ -1521,6 +1539,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>>
>>              inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET, dst_xy, src,
>>                              fs_reg(off_x | (off_y << 4)));
>> +            setup_pixel_interpolater_instruction(this, instr, inst);
>>           } else {
>>              src = vgrf(glsl_type::ivec2_type);
>>              fs_reg offset_src = retype(get_nir_src(instr->src[0]),
>> @@ -1550,9 +1569,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>>                             bld.SEL(offset(src, bld, i), itemp, fs_reg(7)));
>>              }
>>
>> -            mlen = 2 * dispatch_width / 8;
>>              inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, dst_xy, src,
>>                              fs_reg(0u));
>> +            setup_pixel_interpolater_instruction(this, instr, inst,
>> +                                                 2 * dispatch_width / 8);
>>           }
>>           break;
>>        }
>> @@ -1561,12 +1581,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>>           unreachable("Invalid intrinsic");
>>        }
>>
>> -      inst->mlen = mlen;
>> -      /* 2 floats per slot returned */
>> -      inst->regs_written = 2 * dispatch_width / 8;
>> -      inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
>> -                               INTERP_QUALIFIER_NOPERSPECTIVE;
>> -
>>        for (unsigned j = 0; j < instr->num_components; j++) {
>>           fs_reg src = interp_reg(instr->variables[0]->var->data.location, j);
>>           src.type = dest.type;
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
>> index 592a729..f7fe1e0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_wm.c
>> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
>> @@ -40,9 +40,62 @@
>>  #include "program/prog_parameter.h"
>>  #include "program/program.h"
>>  #include "intel_mipmap_tree.h"
>> +#include "brw_nir.h"
>>
>>  #include "util/ralloc.h"
>>
>> +static bool
>> +compute_modes_in_block(nir_block *block,
>> +                       void *state)
>> +{
>> +   unsigned *interp_modes = state;
>> +   nir_intrinsic_instr *intrin;
>> +   enum brw_wm_barycentric_interp_mode interp_mode;
>> +
>> +   nir_foreach_instr(block, instr) {
>> +      if (instr->type != nir_instr_type_intrinsic)
>> +         continue;
>> +
>> +      intrin = nir_instr_as_intrinsic(instr);
>> +
>> +      if (intrin->intrinsic != nir_intrinsic_interp_var_at_centroid)
>> +         continue;
>> +
>> +      if (intrin->variables[0]->var->data.interpolation ==
>> +          INTERP_QUALIFIER_NOPERSPECTIVE)
>> +         interp_mode = BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC;
>> +      else
>> +         interp_mode = BRW_WM_PERSPECTIVE_CENTROID_BARYCENTRIC;
>> +
>> +      *interp_modes |= 1 << interp_mode;
>> +   }
>> +
>> +   return true;
>> +}
>> +
>> +/**
>> + * Looks for calls to interpolateAtCentroid within the program and returns a
>> + * mask of the additional interpolation modes that they require.
>> + */
>> +static unsigned
>> +compute_interpolate_at_centroid_modes(const struct gl_fragment_program *fprog)
>> +{
>> +   unsigned interp_modes = 0;
>> +   struct nir_shader *shader = fprog->Base.nir;
>> +
>> +   if (shader == NULL)
>> +      return 0;
>> +
>> +   nir_foreach_overload(shader, overload) {
>> +      if (overload->impl == NULL)
>> +         continue;
>> +
>> +      nir_foreach_block(overload->impl, compute_modes_in_block, &interp_modes);
>> +   }
>> +
>> +   return interp_modes;
>> +}
>> +
>>  /**
>>   * Return a bitfield where bit n is set if barycentric interpolation mode n
>>   * (see enum brw_wm_barycentric_interp_mode) is needed by the fragment shader.
>> @@ -114,6 +167,8 @@ brw_compute_barycentric_interp_modes(struct brw_context *brw,
>>        }
>>     }
>>
>> +   barycentric_interp_modes |= compute_interpolate_at_centroid_modes(fprog);
>> +
>>     return barycentric_interp_modes;
>>  }
>>
>> --
>> 1.9.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Chris Forbes <chrisf@ijw.co.nz> writes:

> Nitpicks aside, I don't think this is a great idea now that you've got
> the SKL PI working.

Can you explain why you don't think this is a good idea? Is it because
it is an optimisation for something that is not known to be a big use
case so carrying around the extra code just adds unnecessary maintenance
burden? I could agree with that so I'm happy to abandon the patch for
now if that's the general consensus.

> I also think it's broken -- you need to arrange to have the centroid
> barycentric coords delivered to the FS thread, which won't be
> happening if this is the *only* use of them. Masked in the tests,
> because they compare with a centroid-qualified input. [I'm assuming
> you don't always get these delivered to the FS in SKL, but no docs
> access...]

The changes to brw_compute_barycentric_interp_modes in the patch ensure
that the centroid barycentric coords are delivered whenever
interpolateAtCentroid is used in a shader. I don't think this is a
problem. At least it seems to work in a simple test without using a
separate varying with the centroid qualifier.

Regards,
- Neil
Oh, never mind - I see there's another hunk that my mailer had folded away
for some reason. I'm happy that it's correct now :)
On Jul 13, 2015 23:33, "Neil Roberts" <neil@linux.intel.com> wrote:

> Chris Forbes <chrisf@ijw.co.nz> writes:
>
> > Nitpicks aside, I don't think this is a great idea now that you've got
> > the SKL PI working.
>
> Can you explain why you don't think this is a good idea? Is it because
> it is an optimisation for something that is not known to be a big use
> case so carrying around the extra code just adds unnecessary maintenance
> burden? I could agree with that so I'm happy to abandon the patch for
> now if that's the general consensus.
>
> > I also think it's broken -- you need to arrange to have the centroid
> > barycentric coords delivered to the FS thread, which won't be
> > happening if this is the *only* use of them. Masked in the tests,
> > because they compare with a centroid-qualified input. [I'm assuming
> > you don't always get these delivered to the FS in SKL, but no docs
> > access...]
>
> The changes to brw_compute_barycentric_interp_modes in the patch ensure
> that the centroid barycentric coords are delivered whenever
> interpolateAtCentroid is used in a shader. I don't think this is a
> problem. At least it seems to work in a simple test without using a
> separate varying with the centroid qualifier.
>
> Regards,
> - Neil
>