[Mesa-dev,RFC,5/5] i965: Skip execution size adjustment for instructions of width 4

Submitted by Iago Toral Quiroga on Dec. 9, 2015, 12:15 p.m.

Details

Message ID 1449663349-17772-6-git-send-email-itoral@igalia.com
State New
Headers show
Series "Skip automatic execsize for instructions with a width of 4" ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Iago Toral Quiroga Dec. 9, 2015, 12:15 p.m.
This code in brw_set_dest adjusts the execution size of any instruction
with a dst.width < 8. However, we don't want to do this with instructions
operating on doubles, since these will have a width of 4, but still
need an execution size of 8 (for SIMD8). Unfortunately, we can't just check
the size of the operands involved to detect if we are doing an operation on
doubles, because we can have instructions that do operations on double
operands interpreted as UD, operating on any of its 2 32-bit components.

Previous commits have made it so we never emit instructions with a horizontal
width of 4 that don't have the correct execution size set for gen7/gen8, so
we can skip it in this case, avoiding the conflicts with fp64 requirements.

Expanding the same fix to other hardware generations requires many more
changes but since we are not targetting fp64 support on them
wer don't really care for now.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 78f2c8c..50a8771 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -202,8 +202,20 @@  brw_set_dest(struct brw_codegen *p, brw_inst *inst, struct brw_reg dest)
    /* Generators should set a default exec_size of either 8 (SIMD4x2 or SIMD8)
     * or 16 (SIMD16), as that's normally correct.  However, when dealing with
     * small registers, we automatically reduce it to match the register size.
+    *
+    * In platforms that support fp64 we can emit instructions with a width of
+    * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In these
+    * cases we need to make sure that these instructions have their exec sizes
+    * set properly when they are emitted and we can't rely on this code to fix
+    * it.
     */
-   if (dest.width < BRW_EXECUTE_8)
+   bool fix_exec_size;
+   if (devinfo->gen == 7 || devinfo->gen == 8)
+      fix_exec_size = dest.width < BRW_EXECUTE_4;
+   else
+      fix_exec_size = dest.width < BRW_EXECUTE_8;
+
+   if (fix_exec_size)
       brw_inst_set_exec_size(devinfo, inst, dest.width);
 }
 

Comments

On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" <itoral@igalia.com> wrote:
>
> This code in brw_set_dest adjusts the execution size of any instruction
> with a dst.width < 8. However, we don't want to do this with instructions
> operating on doubles, since these will have a width of 4, but still
> need an execution size of 8 (for SIMD8). Unfortunately, we can't just
check
> the size of the operands involved to detect if we are doing an operation
on
> doubles, because we can have instructions that do operations on double
> operands interpreted as UD, operating on any of its 2 32-bit components.
>
> Previous commits have made it so we never emit instructions with a
horizontal
> width of 4 that don't have the correct execution size set for gen7/gen8,
so
> we can skip it in this case, avoiding the conflicts with fp64
requirements.
>
> Expanding the same fix to other hardware generations requires many more
> changes but since we are not targetting fp64 support on them
> wer don't really care for now.
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 78f2c8c..50a8771 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst,
struct brw_reg dest)
>     /* Generators should set a default exec_size of either 8 (SIMD4x2 or
SIMD8)
>      * or 16 (SIMD16), as that's normally correct.  However, when dealing
with
>      * small registers, we automatically reduce it to match the register
size.
> +    *
> +    * In platforms that support fp64 we can emit instructions with a
width of
> +    * 4 that need two SIMD8 registers and an exec_size of 8 or 16. In
these
> +    * cases we need to make sure that these instructions have their exec
sizes
> +    * set properly when they are emitted and we can't rely on this code
to fix
> +    * it.
>      */
> -   if (dest.width < BRW_EXECUTE_8)
> +   bool fix_exec_size;
> +   if (devinfo->gen == 7 || devinfo->gen == 8)

If we're doing to take this approach, we definitely want to make it gen > 6
or something so we include future gens.  Really gen > 4 is probably doable
since the only real problem is the legacy clipping code.

> +      fix_exec_size = dest.width < BRW_EXECUTE_4;
> +   else
> +      fix_exec_size = dest.width < BRW_EXECUTE_8;
> +
> +   if (fix_exec_size)
>        brw_inst_set_exec_size(devinfo, inst, dest.width);
>  }
>
> --
> 2.1.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote:
> 
> On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" <itoral@igalia.com>
> wrote:
> >
> > This code in brw_set_dest adjusts the execution size of any
> instruction
> > with a dst.width < 8. However, we don't want to do this with
> instructions
> > operating on doubles, since these will have a width of 4, but still
> > need an execution size of 8 (for SIMD8). Unfortunately, we can't
> just check
> > the size of the operands involved to detect if we are doing an
> operation on
> > doubles, because we can have instructions that do operations on
> double
> > operands interpreted as UD, operating on any of its 2 32-bit
> components.
> >
> > Previous commits have made it so we never emit instructions with a
> horizontal
> > width of 4 that don't have the correct execution size set for
> gen7/gen8, so
> > we can skip it in this case, avoiding the conflicts with fp64
> requirements.
> >
> > Expanding the same fix to other hardware generations requires many
> more
> > changes but since we are not targetting fp64 support on them
> > wer don't really care for now.
> > ---
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 78f2c8c..50a8771 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst
> *inst, struct brw_reg dest)
> >     /* Generators should set a default exec_size of either 8
> (SIMD4x2 or SIMD8)
> >      * or 16 (SIMD16), as that's normally correct.  However, when
> dealing with
> >      * small registers, we automatically reduce it to match the
> register size.
> > +    *
> > +    * In platforms that support fp64 we can emit instructions with
> a width of
> > +    * 4 that need two SIMD8 registers and an exec_size of 8 or 16.
> In these
> > +    * cases we need to make sure that these instructions have their
> exec sizes
> > +    * set properly when they are emitted and we can't rely on this
> code to fix
> > +    * it.
> >      */
> > -   if (dest.width < BRW_EXECUTE_8)
> > +   bool fix_exec_size;
> > +   if (devinfo->gen == 7 || devinfo->gen == 8)
> 
> If we're doing to take this approach, we definitely want to make it
> gen > 6 or something so we include future gens.  Really gen > 4 is
> probably doable since the only real problem is the legacy clipping
> code.

Strips and fans is also a problem, but it is certainly doable if we want
to do it.

Iago


> > +      fix_exec_size = dest.width < BRW_EXECUTE_4;
> > +   else
> > +      fix_exec_size = dest.width < BRW_EXECUTE_8;
> > +
> > +   if (fix_exec_size)
> >        brw_inst_set_exec_size(devinfo, inst, dest.width);
> >  }
> >
> > --
> > 2.1.4
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
>
On Dec 9, 2015 11:47 PM, "Iago Toral" <itoral@igalia.com> wrote:
>
> On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote:
> >
> > On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" <itoral@igalia.com>
> > wrote:
> > >
> > > This code in brw_set_dest adjusts the execution size of any
> > instruction
> > > with a dst.width < 8. However, we don't want to do this with
> > instructions
> > > operating on doubles, since these will have a width of 4, but still
> > > need an execution size of 8 (for SIMD8). Unfortunately, we can't
> > just check
> > > the size of the operands involved to detect if we are doing an
> > operation on
> > > doubles, because we can have instructions that do operations on
> > double
> > > operands interpreted as UD, operating on any of its 2 32-bit
> > components.
> > >
> > > Previous commits have made it so we never emit instructions with a
> > horizontal
> > > width of 4 that don't have the correct execution size set for
> > gen7/gen8, so
> > > we can skip it in this case, avoiding the conflicts with fp64
> > requirements.
> > >
> > > Expanding the same fix to other hardware generations requires many
> > more
> > > changes but since we are not targetting fp64 support on them
> > > wer don't really care for now.
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > index 78f2c8c..50a8771 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg dest)
> > >     /* Generators should set a default exec_size of either 8
> > (SIMD4x2 or SIMD8)
> > >      * or 16 (SIMD16), as that's normally correct.  However, when
> > dealing with
> > >      * small registers, we automatically reduce it to match the
> > register size.
> > > +    *
> > > +    * In platforms that support fp64 we can emit instructions with
> > a width of
> > > +    * 4 that need two SIMD8 registers and an exec_size of 8 or 16.
> > In these
> > > +    * cases we need to make sure that these instructions have their
> > exec sizes
> > > +    * set properly when they are emitted and we can't rely on this
> > code to fix
> > > +    * it.
> > >      */
> > > -   if (dest.width < BRW_EXECUTE_8)
> > > +   bool fix_exec_size;
> > > +   if (devinfo->gen == 7 || devinfo->gen == 8)
> >
> > If we're doing to take this approach, we definitely want to make it
> > gen > 6 or something so we include future gens.  Really gen > 4 is
> > probably doable since the only real problem is the legacy clipping
> > code.
>
> Strips and fans is also a problem, but it is certainly doable if we want
> to do it.

Yeah, my primary point is that we should make it as little of an edge-case
as possible.  We could go back to at least gen6 and we should go forward.
That said, it'll take a little testing from the Intel side.

> Iago
>
>
> > > +      fix_exec_size = dest.width < BRW_EXECUTE_4;
> > > +   else
> > > +      fix_exec_size = dest.width < BRW_EXECUTE_8;
> > > +
> > > +   if (fix_exec_size)
> > >        brw_inst_set_exec_size(devinfo, inst, dest.width);
> > >  }
> > >
> > > --
> > > 2.1.4
> > >
> > > _______________________________________________
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
>
On Wed, 2015-12-09 at 23:51 -0800, Jason Ekstrand wrote:
> 
> On Dec 9, 2015 11:47 PM, "Iago Toral" <itoral@igalia.com> wrote:
> >
> > On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote:
> > >
> > > On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" <itoral@igalia.com>
> > > wrote:
> > > >
> > > > This code in brw_set_dest adjusts the execution size of any
> > > instruction
> > > > with a dst.width < 8. However, we don't want to do this with
> > > instructions
> > > > operating on doubles, since these will have a width of 4, but
> still
> > > > need an execution size of 8 (for SIMD8). Unfortunately, we can't
> > > just check
> > > > the size of the operands involved to detect if we are doing an
> > > operation on
> > > > doubles, because we can have instructions that do operations on
> > > double
> > > > operands interpreted as UD, operating on any of its 2 32-bit
> > > components.
> > > >
> > > > Previous commits have made it so we never emit instructions with
> a
> > > horizontal
> > > > width of 4 that don't have the correct execution size set for
> > > gen7/gen8, so
> > > > we can skip it in this case, avoiding the conflicts with fp64
> > > requirements.
> > > >
> > > > Expanding the same fix to other hardware generations requires
> many
> > > more
> > > > changes but since we are not targetting fp64 support on them
> > > > wer don't really care for now.
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +++++++++++++-
> > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > index 78f2c8c..50a8771 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p,
> brw_inst
> > > *inst, struct brw_reg dest)
> > > >     /* Generators should set a default exec_size of either 8
> > > (SIMD4x2 or SIMD8)
> > > >      * or 16 (SIMD16), as that's normally correct.  However,
> when
> > > dealing with
> > > >      * small registers, we automatically reduce it to match the
> > > register size.
> > > > +    *
> > > > +    * In platforms that support fp64 we can emit instructions
> with
> > > a width of
> > > > +    * 4 that need two SIMD8 registers and an exec_size of 8 or
> 16.
> > > In these
> > > > +    * cases we need to make sure that these instructions have
> their
> > > exec sizes
> > > > +    * set properly when they are emitted and we can't rely on
> this
> > > code to fix
> > > > +    * it.
> > > >      */
> > > > -   if (dest.width < BRW_EXECUTE_8)
> > > > +   bool fix_exec_size;
> > > > +   if (devinfo->gen == 7 || devinfo->gen == 8)
> > >
> > > If we're doing to take this approach, we definitely want to make
> it
> > > gen > 6 or something so we include future gens.  Really gen > 4 is
> > > probably doable since the only real problem is the legacy clipping
> > > code.
> >
> > Strips and fans is also a problem, but it is certainly doable if we
> want
> > to do it.
> 
> Yeah, my primary point is that we should make it as little of an
> edge-case as possible.  We could go back to at least gen6 and we
> should go forward.  That said, it'll take a little testing from the
> Intel side.

Sure, this sounds sensible to me. I'll scan the code for gen9 paths that
work with a horizontal width of 4 and include patches to cover gen6 as
well. When we have that I'll send the series for testing.

Thanks Jason!

Iago
> > Iago
> >
> >
> > > > +      fix_exec_size = dest.width < BRW_EXECUTE_4;
> > > > +   else
> > > > +      fix_exec_size = dest.width < BRW_EXECUTE_8;
> > > > +
> > > > +   if (fix_exec_size)
> > > >        brw_inst_set_exec_size(devinfo, inst, dest.width);
> > > >  }
> > > >
> > > > --
> > > > 2.1.4
> > > >
> > > > _______________________________________________
> > > > mesa-dev mailing list
> > > > mesa-dev@lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > >
> > >
> >
> >
> 
>