[v3,39/42] intel/compiler: remove MAD/LRP algebraic optimizations from the backend

Submitted by Iago Toral Quiroga on Jan. 15, 2019, 1:54 p.m.

Details

Message ID 20190115135414.2313-40-itoral@igalia.com
State New
Headers show
Series "intel: VK_KHR_shader_float16_int8 implementation" ( rev: 5 4 ) in Mesa

Not browsing as part of any series.

Commit Message

Iago Toral Quiroga Jan. 15, 2019, 1:54 p.m.
NIR already has these so they are redundant. A run of shader-db confirms
that the only cases where these backend optimizations are activated
are some Tomb Raider shaders where the affected variables are qualified
as "precise", which is why NIR won't apply them and why the backend
shouldn't either (so it is actually a bug).

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
---
 src/intel/compiler/brw_fs.cpp | 37 -----------------------------------
 1 file changed, 37 deletions(-)

Patch hide | download patch | download mbox

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 77c955ac435..e7f5a8822a3 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -2568,16 +2568,6 @@  fs_visitor::opt_algebraic()
             break;
          }
          break;
-      case BRW_OPCODE_LRP:
-         if (inst->src[1].equals(inst->src[2])) {
-            inst->opcode = BRW_OPCODE_MOV;
-            inst->src[0] = inst->src[1];
-            inst->src[1] = reg_undef;
-            inst->src[2] = reg_undef;
-            progress = true;
-            break;
-         }
-         break;
       case BRW_OPCODE_CMP:
          if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
               inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
@@ -2654,33 +2644,6 @@  fs_visitor::opt_algebraic()
             }
          }
          break;
-      case BRW_OPCODE_MAD:
-         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
-            inst->opcode = BRW_OPCODE_MOV;
-            inst->src[1] = reg_undef;
-            inst->src[2] = reg_undef;
-            progress = true;
-         } else if (inst->src[0].is_zero()) {
-            inst->opcode = BRW_OPCODE_MUL;
-            inst->src[0] = inst->src[2];
-            inst->src[2] = reg_undef;
-            progress = true;
-         } else if (inst->src[1].is_one()) {
-            inst->opcode = BRW_OPCODE_ADD;
-            inst->src[1] = inst->src[2];
-            inst->src[2] = reg_undef;
-            progress = true;
-         } else if (inst->src[2].is_one()) {
-            inst->opcode = BRW_OPCODE_ADD;
-            inst->src[2] = reg_undef;
-            progress = true;
-         } else if (inst->src[1].file == IMM && inst->src[2].file == IMM) {
-            inst->opcode = BRW_OPCODE_ADD;
-            inst->src[1].f *= inst->src[2].f;
-            inst->src[2] = reg_undef;
-            progress = true;
-         }
-         break;
       case SHADER_OPCODE_BROADCAST:
          if (is_uniform(inst->src[0])) {
             inst->opcode = BRW_OPCODE_MOV;

Comments

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

On Tue, Jan 15, 2019 at 7:54 AM Iago Toral Quiroga <itoral@igalia.com>
wrote:

> NIR already has these so they are redundant. A run of shader-db confirms
> that the only cases where these backend optimizations are activated
> are some Tomb Raider shaders where the affected variables are qualified
> as "precise", which is why NIR won't apply them and why the backend
> shouldn't either (so it is actually a bug).
>
> Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  src/intel/compiler/brw_fs.cpp | 37 -----------------------------------
>  1 file changed, 37 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 77c955ac435..e7f5a8822a3 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
>              break;
>           }
>           break;
> -      case BRW_OPCODE_LRP:
> -         if (inst->src[1].equals(inst->src[2])) {
> -            inst->opcode = BRW_OPCODE_MOV;
> -            inst->src[0] = inst->src[1];
> -            inst->src[1] = reg_undef;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -            break;
> -         }
> -         break;
>        case BRW_OPCODE_CMP:
>           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
>                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
>              }
>           }
>           break;
> -      case BRW_OPCODE_MAD:
> -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> -            inst->opcode = BRW_OPCODE_MOV;
> -            inst->src[1] = reg_undef;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[0].is_zero()) {
> -            inst->opcode = BRW_OPCODE_MUL;
> -            inst->src[0] = inst->src[2];
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[1].is_one()) {
> -            inst->opcode = BRW_OPCODE_ADD;
> -            inst->src[1] = inst->src[2];
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[2].is_one()) {
> -            inst->opcode = BRW_OPCODE_ADD;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[1].file == IMM && inst->src[2].file == IMM)
> {
> -            inst->opcode = BRW_OPCODE_ADD;
> -            inst->src[1].f *= inst->src[2].f;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         }
> -         break;
>        case SHADER_OPCODE_BROADCAST:
>           if (is_uniform(inst->src[0])) {
>              inst->opcode = BRW_OPCODE_MOV;
> --
> 2.17.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <itoral@igalia.com> wrote:
>
> NIR already has these so they are redundant. A run of shader-db confirms
> that the only cases where these backend optimizations are activated
> are some Tomb Raider shaders where the affected variables are qualified
> as "precise", which is why NIR won't apply them and why the backend
> shouldn't either (so it is actually a bug).

Which of the six optimizations that you're removing were responsible
for the change? I ask because...

>
> Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
> ---
>  src/intel/compiler/brw_fs.cpp | 37 -----------------------------------
>  1 file changed, 37 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index 77c955ac435..e7f5a8822a3 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
>              break;
>           }
>           break;
> -      case BRW_OPCODE_LRP:
> -         if (inst->src[1].equals(inst->src[2])) {
> -            inst->opcode = BRW_OPCODE_MOV;
> -            inst->src[0] = inst->src[1];
> -            inst->src[1] = reg_undef;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -            break;

I'm not sure whether this is imprecise, and...

> -         }
> -         break;
>        case BRW_OPCODE_CMP:
>           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
>                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
>              }
>           }
>           break;
> -      case BRW_OPCODE_MAD:
> -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> -            inst->opcode = BRW_OPCODE_MOV;
> -            inst->src[1] = reg_undef;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[0].is_zero()) {
> -            inst->opcode = BRW_OPCODE_MUL;
> -            inst->src[0] = inst->src[2];
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[1].is_one()) {
> -            inst->opcode = BRW_OPCODE_ADD;
> -            inst->src[1] = inst->src[2];
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[2].is_one()) {
> -            inst->opcode = BRW_OPCODE_ADD;
> -            inst->src[2] = reg_undef;
> -            progress = true;
> -         } else if (inst->src[1].file == IMM && inst->src[2].file == IMM) {
> -            inst->opcode = BRW_OPCODE_ADD;
> -            inst->src[1].f *= inst->src[2].f;
> -            inst->src[2] = reg_undef;
> -            progress = true;

or this one.
On Thu, Jan 17, 2019 at 6:42 PM Matt Turner <mattst88@gmail.com> wrote:

> On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <itoral@igalia.com>
> wrote:
> >
> > NIR already has these so they are redundant. A run of shader-db confirms
> > that the only cases where these backend optimizations are activated
> > are some Tomb Raider shaders where the affected variables are qualified
> > as "precise", which is why NIR won't apply them and why the backend
> > shouldn't either (so it is actually a bug).
>
> Which of the six optimizations that you're removing were responsible
> for the change? I ask because...
>

If it's one of the precise ones, we should port it to NIR...


> >
> > Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
> > ---
> >  src/intel/compiler/brw_fs.cpp | 37 -----------------------------------
> >  1 file changed, 37 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs.cpp
> b/src/intel/compiler/brw_fs.cpp
> > index 77c955ac435..e7f5a8822a3 100644
> > --- a/src/intel/compiler/brw_fs.cpp
> > +++ b/src/intel/compiler/brw_fs.cpp
> > @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
> >              break;
> >           }
> >           break;
> > -      case BRW_OPCODE_LRP:
> > -         if (inst->src[1].equals(inst->src[2])) {
> > -            inst->opcode = BRW_OPCODE_MOV;
> > -            inst->src[0] = inst->src[1];
> > -            inst->src[1] = reg_undef;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -            break;
>
> I'm not sure whether this is imprecise, and...
>

Doesn't work for NaN or either inf, at least not unles inf - inf == 0 which
I don't think it is.


> > -         }
> > -         break;
> >        case BRW_OPCODE_CMP:
> >           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
> >                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> > @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
> >              }
> >           }
> >           break;
> > -      case BRW_OPCODE_MAD:
> > -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> > -            inst->opcode = BRW_OPCODE_MOV;
> > -            inst->src[1] = reg_undef;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[0].is_zero()) {
> > -            inst->opcode = BRW_OPCODE_MUL;
> > -            inst->src[0] = inst->src[2];
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[1].is_one()) {
> > -            inst->opcode = BRW_OPCODE_ADD;
> > -            inst->src[1] = inst->src[2];
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[2].is_one()) {
> > -            inst->opcode = BRW_OPCODE_ADD;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
> > -         } else if (inst->src[1].file == IMM && inst->src[2].file ==
> IMM) {
> > -            inst->opcode = BRW_OPCODE_ADD;
> > -            inst->src[1].f *= inst->src[2].f;
> > -            inst->src[2] = reg_undef;
> > -            progress = true;
>
> or this one.
>

Yes, it is.  Part of the point of FMA is that it's more precise than
mul+add because the mul is done with extra precision and added to src[0] in
high-precision before the final rounding.  This optimization explicitly
breaks the MAD into mul+add.

--Jason
On Thu, 2019-01-17 at 22:31 -0600, Jason Ekstrand wrote:
> On Thu, Jan 17, 2019 at 6:42 PM Matt Turner <mattst88@gmail.com>
> wrote:
> > On Tue, Jan 15, 2019 at 5:54 AM Iago Toral Quiroga <
> > itoral@igalia.com> wrote:
> > 
> > >
> > 
> > > NIR already has these so they are redundant. A run of shader-db
> > confirms
> > 
> > > that the only cases where these backend optimizations are
> > activated
> > 
> > > are some Tomb Raider shaders where the affected variables are
> > qualified
> > 
> > > as "precise", which is why NIR won't apply them and why the
> > backend
> > 
> > > shouldn't either (so it is actually a bug).
> > 
> > 
> > 
> > Which of the six optimizations that you're removing were
> > responsible
> > 
> > for the change? I ask because...
> 
> If it's one of the precise ones, we should port it to NIR...
> 
> > >
> > 
> > > Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
> > 
> > > ---
> > 
> > >  src/intel/compiler/brw_fs.cpp | 37 ---------------------------
> > --------
> > 
> > >  1 file changed, 37 deletions(-)
> > 
> > >
> > 
> > > diff --git a/src/intel/compiler/brw_fs.cpp
> > b/src/intel/compiler/brw_fs.cpp
> > 
> > > index 77c955ac435..e7f5a8822a3 100644
> > 
> > > --- a/src/intel/compiler/brw_fs.cpp
> > 
> > > +++ b/src/intel/compiler/brw_fs.cpp
> > 
> > > @@ -2568,16 +2568,6 @@ fs_visitor::opt_algebraic()
> > 
> > >              break;
> > 
> > >           }
> > 
> > >           break;
> > 
> > > -      case BRW_OPCODE_LRP:
> > 
> > > -         if (inst->src[1].equals(inst->src[2])) {
> > 
> > > -            inst->opcode = BRW_OPCODE_MOV;
> > 
> > > -            inst->src[0] = inst->src[1];
> > 
> > > -            inst->src[1] = reg_undef;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > > -            break;
> > 
> > 
> > 
> > I'm not sure whether this is imprecise, and...
> 
> Doesn't work for NaN or either inf, at least not unles inf - inf == 0
> which I don't think it is.

This exists in NIR algebraic and is marked as inexact there (plus it is
not triggered by anything in shader-db)
> > > -         }
> > 
> > > -         break;
> > 
> > >        case BRW_OPCODE_CMP:
> > 
> > >           if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
> > 
> > >                inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
> > 
> > > @@ -2654,33 +2644,6 @@ fs_visitor::opt_algebraic()
> > 
> > >              }
> > 
> > >           }
> > 
> > >           break;
> > 
> > > -      case BRW_OPCODE_MAD:
> > 
> > > -         if (inst->src[1].is_zero() || inst->src[2].is_zero()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_MOV;
> > 
> > > -            inst->src[1] = reg_undef;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > > -         } else if (inst->src[0].is_zero()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_MUL;
> > 
> > > -            inst->src[0] = inst->src[2];
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;

I believe these were the only cases triggered by the Tomb Raider
shaders. These optimizations exist in NIR and  are marked as imprecise
there too.
> > > -         } else if (inst->src[1].is_one()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_ADD;
> > 
> > > -            inst->src[1] = inst->src[2];
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > > -         } else if (inst->src[2].is_one()) {
> > 
> > > -            inst->opcode = BRW_OPCODE_ADD;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;

These also exist in NIR, but I see they are not marked as imprecise
there, not sure why, looks like a bug to me right?
> > > -         } else if (inst->src[1].file == IMM && inst-
> > >src[2].file == IMM) {
> > 
> > > -            inst->opcode = BRW_OPCODE_ADD;
> > 
> > > -            inst->src[1].f *= inst->src[2].f;
> > 
> > > -            inst->src[2] = reg_undef;
> > 
> > > -            progress = true;
> > 
> > 

This one doesn't exist in NIR but it clearly breaks precise
requirements since it its manually breaking  MAD into MUL + ADD, which
has different precision.
> or this one.
> 
> Yes, it is.  Part of the point of FMA is that it's more precise than
> mul+add because the mul is done with extra precision and added to
> src[0] in high-precision before the final rounding.  This
> optimization explicitly breaks the MAD into mul+add.--Jason
>