[Mesa-dev,25/95] i965/vec4: fix base offset for nir_registers with doubles

Submitted by Iago Toral Quiroga on July 19, 2016, 10:40 a.m.

Details

Message ID 1468924892-6910-26-git-send-email-itoral@igalia.com
State New
Headers show
Series "i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0" ( rev: 2 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Iago Toral Quiroga July 19, 2016, 10:40 a.m.
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Patch hide | download patch | download mbox

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index cf35f2e..fde7b60 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -280,7 +280,8 @@  vec4_visitor::get_nir_dest(const nir_dest &dest)
       nir_ssa_values[dest.ssa.index] = dst;
       return dst;
    } else {
-      return dst_reg_for_nir_reg(this, dest.reg.reg, dest.reg.base_offset,
+      unsigned base_offset = dest.reg.base_offset * dest.reg.reg->bit_size / 32;
+      return dst_reg_for_nir_reg(this, dest.reg.reg, base_offset,
                                  dest.reg.indirect);
    }
 }
@@ -308,8 +309,9 @@  vec4_visitor::get_nir_src(const nir_src &src, enum brw_reg_type type,
       reg = nir_ssa_values[src.ssa->index];
    }
    else {
-     reg = dst_reg_for_nir_reg(this, src.reg.reg, src.reg.base_offset,
-                               src.reg.indirect);
+      unsigned base_offset = src.reg.base_offset * src.reg.reg->bit_size / 32;
+      reg = dst_reg_for_nir_reg(this, src.reg.reg, base_offset,
+                                src.reg.indirect);
    }
 
    reg = retype(reg, type);

Comments

Iago Toral Quiroga <itoral@igalia.com> writes:

> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index cf35f2e..fde7b60 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -280,7 +280,8 @@ vec4_visitor::get_nir_dest(const nir_dest &dest)
>        nir_ssa_values[dest.ssa.index] = dst;
>        return dst;
>     } else {
> -      return dst_reg_for_nir_reg(this, dest.reg.reg, dest.reg.base_offset,
> +      unsigned base_offset = dest.reg.base_offset * dest.reg.reg->bit_size / 32;
> +      return dst_reg_for_nir_reg(this, dest.reg.reg, base_offset,
>                                   dest.reg.indirect);
>     }
>  }
> @@ -308,8 +309,9 @@ vec4_visitor::get_nir_src(const nir_src &src, enum brw_reg_type type,
>        reg = nir_ssa_values[src.ssa->index];
>     }
>     else {
> -     reg = dst_reg_for_nir_reg(this, src.reg.reg, src.reg.base_offset,
> -                               src.reg.indirect);
> +      unsigned base_offset = src.reg.base_offset * src.reg.reg->bit_size / 32;
> +      reg = dst_reg_for_nir_reg(this, src.reg.reg, base_offset,
> +                                src.reg.indirect);

I think this wouldn't have been necessary if you had fixed the offset()
helper to take into account the register type (as it does in the FS
back-end)?

>     }
>  
>     reg = retype(reg, type);
> -- 
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
On Tue, 2016-08-02 at 18:40 -0700, Francisco Jerez wrote:
> Iago Toral Quiroga <itoral@igalia.com> writes:
> 
> > 
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > index cf35f2e..fde7b60 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > @@ -280,7 +280,8 @@ vec4_visitor::get_nir_dest(const nir_dest
> > &dest)
> >        nir_ssa_values[dest.ssa.index] = dst;
> >        return dst;
> >     } else {
> > -      return dst_reg_for_nir_reg(this, dest.reg.reg,
> > dest.reg.base_offset,
> > +      unsigned base_offset = dest.reg.base_offset * dest.reg.reg-
> > >bit_size / 32;
> > +      return dst_reg_for_nir_reg(this, dest.reg.reg, base_offset,
> >                                   dest.reg.indirect);
> >     }
> >  }
> > @@ -308,8 +309,9 @@ vec4_visitor::get_nir_src(const nir_src &src,
> > enum brw_reg_type type,
> >        reg = nir_ssa_values[src.ssa->index];
> >     }
> >     else {
> > -     reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > src.reg.base_offset,
> > -                               src.reg.indirect);
> > +      unsigned base_offset = src.reg.base_offset * src.reg.reg-
> > >bit_size / 32;
> > +      reg = dst_reg_for_nir_reg(this, src.reg.reg, base_offset,
> > +                                src.reg.indirect);
> I think this wouldn't have been necessary if you had fixed the
> offset()
> helper to take into account the register type (as it does in the FS
> back-end)?

Yes, you are right.

I found it convenient that offset() didn't consider the type because
that gave us a bit more flexibility to offset into the second half of a
SIMD4x2 dvecN simply by doing offset(reg, 1). If offset takes the type
into account we lose that ability because for a DF register the minimum
we could offset with a delta=1 would be 2 SIMD8 registers (a full
SIMD4x2 dvec4). We can still avoid calling offset() in these cases and
just increase reg_offset manually by 1 instead, I understand that you
prefer this?

Alternatively, we could also have typed and untyped versions of the
offset function so we can choose the one we need depending on the case,
but maybe that would be more confusing?

Iago

> > 
> >     }
> >  
> >     reg = retype(reg, type);
Iago Toral <itoral@igalia.com> writes:

> On Tue, 2016-08-02 at 18:40 -0700, Francisco Jerez wrote:

>> Iago Toral Quiroga <itoral@igalia.com> writes:

>> 

>> > 

>> > ---

>> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +++++---

>> >  1 file changed, 5 insertions(+), 3 deletions(-)

>> > 

>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

>> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

>> > index cf35f2e..fde7b60 100644

>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

>> > @@ -280,7 +280,8 @@ vec4_visitor::get_nir_dest(const nir_dest

>> > &dest)

>> >        nir_ssa_values[dest.ssa.index] = dst;

>> >        return dst;

>> >     } else {

>> > -      return dst_reg_for_nir_reg(this, dest.reg.reg,

>> > dest.reg.base_offset,

>> > +      unsigned base_offset = dest.reg.base_offset * dest.reg.reg-

>> > >bit_size / 32;

>> > +      return dst_reg_for_nir_reg(this, dest.reg.reg, base_offset,

>> >                                   dest.reg.indirect);

>> >     }

>> >  }

>> > @@ -308,8 +309,9 @@ vec4_visitor::get_nir_src(const nir_src &src,

>> > enum brw_reg_type type,

>> >        reg = nir_ssa_values[src.ssa->index];

>> >     }

>> >     else {

>> > -     reg = dst_reg_for_nir_reg(this, src.reg.reg,

>> > src.reg.base_offset,

>> > -                               src.reg.indirect);

>> > +      unsigned base_offset = src.reg.base_offset * src.reg.reg-

>> > >bit_size / 32;

>> > +      reg = dst_reg_for_nir_reg(this, src.reg.reg, base_offset,

>> > +                                src.reg.indirect);

>> I think this wouldn't have been necessary if you had fixed the

>> offset()

>> helper to take into account the register type (as it does in the FS

>> back-end)?

>

> Yes, you are right.

>

> I found it convenient that offset() didn't consider the type because

> that gave us a bit more flexibility to offset into the second half of a

> SIMD4x2 dvecN simply by doing offset(reg, 1). If offset takes the type

> into account we lose that ability because for a DF register the minimum

> we could offset with a delta=1 would be 2 SIMD8 registers (a full

> SIMD4x2 dvec4). We can still avoid calling offset() in these cases and

> just increase reg_offset manually by 1 instead, I understand that you

> prefer this?

>


Right, I think I'd rather have offset() mean the same thing on both the
VEC4 and FS back-ends to avoid confusion (i.e. step by as many scalar
components as the specified SIMD width).

> Alternatively, we could also have typed and untyped versions of the

> offset function so we can choose the one we need depending on the case,

> but maybe that would be more confusing?

>

I don't think an untyped (i.e. based on an offset argument in byte or
32-byte units) variant of offset() would be a particularly compelling
way to achieve the above either, in fact this seems to be the cause of
the rather artificial limitation of the SIMD lowering pass that prevents
it from handling cases where each half doesn't write or read exactly one
register of the instruction.  If you had something like the FS backend's
horiz_offset() you would just get the 4-wide vector of the register for
the group index specified as argument, regardless of the type of the
register.  The current representation of register offsets (as a 32B
multiple) may still get in the way for non-64 bit register types, but
that's not really the SIMD lowering pass' business and can probably be
addressed separately.

> Iago

>

>> > 

>> >     }

>> >  

>> >     reg = retype(reg, type);
On Wed, 2016-08-17 at 14:16 -0700, Francisco Jerez wrote:
> Iago Toral <itoral@igalia.com> writes:
> 
> > 
> > On Tue, 2016-08-02 at 18:40 -0700, Francisco Jerez wrote:
> > > 
> > > Iago Toral Quiroga <itoral@igalia.com> writes:
> > > 
> > > > 
> > > > 
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > index cf35f2e..fde7b60 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > > > @@ -280,7 +280,8 @@ vec4_visitor::get_nir_dest(const nir_dest
> > > > &dest)
> > > >        nir_ssa_values[dest.ssa.index] = dst;
> > > >        return dst;
> > > >     } else {
> > > > -      return dst_reg_for_nir_reg(this, dest.reg.reg,
> > > > dest.reg.base_offset,
> > > > +      unsigned base_offset = dest.reg.base_offset *
> > > > dest.reg.reg-
> > > > > 
> > > > > bit_size / 32;
> > > > +      return dst_reg_for_nir_reg(this, dest.reg.reg,
> > > > base_offset,
> > > >                                   dest.reg.indirect);
> > > >     }
> > > >  }
> > > > @@ -308,8 +309,9 @@ vec4_visitor::get_nir_src(const nir_src
> > > > &src,
> > > > enum brw_reg_type type,
> > > >        reg = nir_ssa_values[src.ssa->index];
> > > >     }
> > > >     else {
> > > > -     reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > > > src.reg.base_offset,
> > > > -                               src.reg.indirect);
> > > > +      unsigned base_offset = src.reg.base_offset *
> > > > src.reg.reg-
> > > > > 
> > > > > bit_size / 32;
> > > > +      reg = dst_reg_for_nir_reg(this, src.reg.reg,
> > > > base_offset,
> > > > +                                src.reg.indirect);
> > > I think this wouldn't have been necessary if you had fixed the
> > > offset()
> > > helper to take into account the register type (as it does in the
> > > FS
> > > back-end)?
> > Yes, you are right.
> > 
> > I found it convenient that offset() didn't consider the type
> > because
> > that gave us a bit more flexibility to offset into the second half
> > of a
> > SIMD4x2 dvecN simply by doing offset(reg, 1). If offset takes the
> > type
> > into account we lose that ability because for a DF register the
> > minimum
> > we could offset with a delta=1 would be 2 SIMD8 registers (a full
> > SIMD4x2 dvec4). We can still avoid calling offset() in these cases
> > and
> > just increase reg_offset manually by 1 instead, I understand that
> > you
> > prefer this?
> > 
> Right, I think I'd rather have offset() mean the same thing on both
> the
> VEC4 and FS back-ends to avoid confusion (i.e. step by as many scalar
> components as the specified SIMD width).
> > 
> > Alternatively, we could also have typed and untyped versions of the
> > offset function so we can choose the one we need depending on the
> > case,
> > but maybe that would be more confusing?
> > 
> I don't think an untyped (i.e. based on an offset argument in byte or
> 32-byte units) variant of offset() would be a particularly compelling
> way to achieve the above either, in fact this seems to be the cause
> of
> the rather artificial limitation of the SIMD lowering pass that
> prevents
> it from handling cases where each half doesn't write or read exactly
> one
> register of the instruction.  If you had something like the FS
> backend's
> horiz_offset() you would just get the 4-wide vector of the register
> for
> the group index specified as argument, regardless of the type of the
> register.  The current representation of register offsets (as a 32B
> multiple) may still get in the way for non-64 bit register types, but
> that's not really the SIMD lowering pass' business and can probably
> be
> addressed separately.

I have been looking into this. It seems like a lot if use cases for the
offset() helper in the vec4 backend look like the following (this is
from the cse pass, but there are plenty more like this):

for (unsigned i = 0; i < regs_written(entry->generator); ++i) {
   vec4_instruction *copy = MOV(offset(entry->generator->dst, i),
                                offset(entry->tmp, i));
   ...
}

That is, code that is setup on a register by register basis rather than
a channel basis. For these cases I think the current offset() helper is
more useful. So how about we add a byte_offset() helper to the Vec4 IR
and we use that in cases like these and then we change the offset()
helper to operate with the channel semantics that we have in the FS IR
as you suggest and use that everywhere else?

> > 
> > Iago
> > 
> > > 
> > > > 
> > > > 
> > > >     }
> > > >  
> > > >     reg = retype(reg, type);