nir: Implement optional b2f->iand lowering

Submitted by Alyssa Rosenzweig on April 29, 2018, 6:19 p.m.

Details

Message ID 20180429181934.GA1707@devsus
State New
Headers show
Series "nir: Implement optional b2f->iand lowering" ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Alyssa Rosenzweig April 29, 2018, 6:19 p.m.
This pass is required by the Midgard compiler; our instruction set uses
NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
Normally, this lowering pass would be implemented in a backend-specific
algebraic pass, but this conflicts with the existing iand->b2f pass in
nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
existing pass optional (default on -- all other backends should remain
unaffected), adding an optional pass for lowering the opposite
direction.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
---
 src/compiler/nir/nir.h                | 3 +++
 src/compiler/nir/nir_opt_algebraic.py | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index de935392a2..135018004f 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1862,6 +1862,9 @@  typedef struct nir_shader_compiler_options {
    /** enables rules to lower idiv by power-of-two: */
    bool lower_idiv;
 
+   /* lower b2f to iand */
+   bool lower_b2f;
+
    /* Does the native fdot instruction replicate its result for four
     * components?  If so, then opt_algebraic_late will turn all fdotN
     * instructions into fdot_replicatedN instructions.
diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index 761c61ac0b..7da09ee918 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -318,7 +318,8 @@  optimizations = [
    (('imul', ('b2i', a), ('b2i', b)), ('b2i', ('iand', a, b))),
    (('fmul', ('b2f', a), ('b2f', b)), ('b2f', ('iand', a, b))),
    (('fsat', ('fadd', ('b2f', a), ('b2f', b))), ('b2f', ('ior', a, b))),
-   (('iand', 'a@bool', 1.0), ('b2f', a)),
+   (('iand', 'a@bool', 1.0), ('b2f', a), '!options->lower_b2f'),
+   (('b2f@32', a), ('iand', a, 1.0), 'options->lower_b2f'),
    # True/False are ~0 and 0 in NIR.  b2i of True is 1, and -1 is ~0 (True).
    (('ineg', ('b2i@32', a)), a),
    (('flt', ('fneg', ('b2f', a)), 0), a), # Generated by TGSI KILL_IF.

Comments

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

On Sun, Apr 29, 2018 at 8:19 PM, Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:
> This pass is required by the Midgard compiler; our instruction set uses
> NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
> Normally, this lowering pass would be implemented in a backend-specific
> algebraic pass, but this conflicts with the existing iand->b2f pass in
> nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
> existing pass optional (default on -- all other backends should remain
> unaffected), adding an optional pass for lowering the opposite
> direction.
>
> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
> ---
>  src/compiler/nir/nir.h                | 3 +++
>  src/compiler/nir/nir_opt_algebraic.py | 3 ++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index de935392a2..135018004f 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -1862,6 +1862,9 @@ typedef struct nir_shader_compiler_options {
>     /** enables rules to lower idiv by power-of-two: */
>     bool lower_idiv;
>
> +   /* lower b2f to iand */
> +   bool lower_b2f;
> +
>     /* Does the native fdot instruction replicate its result for four
>      * components?  If so, then opt_algebraic_late will turn all fdotN
>      * instructions into fdot_replicatedN instructions.
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index 761c61ac0b..7da09ee918 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -318,7 +318,8 @@ optimizations = [
>     (('imul', ('b2i', a), ('b2i', b)), ('b2i', ('iand', a, b))),
>     (('fmul', ('b2f', a), ('b2f', b)), ('b2f', ('iand', a, b))),
>     (('fsat', ('fadd', ('b2f', a), ('b2f', b))), ('b2f', ('ior', a, b))),
> -   (('iand', 'a@bool', 1.0), ('b2f', a)),
> +   (('iand', 'a@bool', 1.0), ('b2f', a), '!options->lower_b2f'),
> +   (('b2f@32', a), ('iand', a, 1.0), 'options->lower_b2f'),
>     # True/False are ~0 and 0 in NIR.  b2i of True is 1, and -1 is ~0 (True).
>     (('ineg', ('b2i@32', a)), a),
>     (('flt', ('fneg', ('b2f', a)), 0), a), # Generated by TGSI KILL_IF.
> --
> 2.16.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
On Sun, Apr 29, 2018 at 11:19 AM, Alyssa Rosenzweig
<alyssa@rosenzweig.io> wrote:
> This pass is required by the Midgard compiler; our instruction set uses
> NIR-style booleans (~0 for true) but lacks a dedicated b2f instruction.
> Normally, this lowering pass would be implemented in a backend-specific
> algebraic pass, but this conflicts with the existing iand->b2f pass in
> nir_opt_algebraic.py, hanging the compiler. This patch thus makes the
> existing pass optional (default on -- all other backends should remain
> unaffected), adding an optional pass for lowering the opposite
> direction.

Tell me a little more about what your hardware supports. Does it have
real integers?

In the i965 driver, we used to do what you suggest (AND with
0x3f800000), but changed to a type converting MOV with a source
modifier: MOV dest:F, -src0:D since we could occasionally optimize
that instruction away. That is, interpret the source as an integer (so
false is 0 and true is -1), negate it, and convert to a float.

We did both of these things in our backend when translating NIR's b2f
into our backend IR. Why is it better for you to do that in NIR?
> Tell me a little more about what your hardware supports. 

Both integers and floats are first-class; they are each natively 32-bit,
with 16-bit versions to be supported down the line. There's support for
int8 and float64, but I haven't seen these used with GLSL, so I don't
know how they work.

From what I can tell, booleans are NIR-style: ~0 for true, 0 for false.
There are no opcodes for b2f or f2b; instead, bitwise ops (iand, etc)
are used for the conversion.

> Interpret the source as an integer, negate it, and convert to a float.

I don't believe this can done with a move, although it may be possible
to accomplish this with a source modifier to the dedicated i2f
instruction. I'll need to test this at some point, although I'm not
convinced it's any faster on this architecture.

> We did both of these things in our backend when translating NIR's b2f
> into our backend IR. Why is it better for you to do that in NIR?

Hmm, that's a good point. I'm not too familiar with i965's architecture,
but at the moment, my compiler uses only a thin backend IR. That is, its
"IR" is essentially unpacked hardware instructions, with placeholders
instead of registers, and metadata. Given that the ISA is a good match
for NIR, normally this simplifies the backend tremendously, as the only
optimisation passes I need to implement are highly specialised to the
hardware. By contrast, I would not want to implement constant folding,
for instance, on the backend IR.

I was hoping that NIR, by design, would eliminate this code duplication
between backends. Accordingly, I try to keep code in NIR as late as
possible, reusing standard lowering and optimisation passes.

Theoretically, a patch like this would enable the lowered iand (or type
conversion) to benefit from further NIR optimisations and allow code to
be shared with other backends, but I'm admittedly grasping for straws
here.

The backend design is still early, and I haven't figured out what the
ideal split of responsibilities between common code and the backend
should be. I'm open to further input -- if major changes are needed,
it's better for that to happen now than when I send the patches in for
the compiler itself!
On Mon, Apr 30, 2018 at 7:50 PM, Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:
>> Tell me a little more about what your hardware supports.
>
> Both integers and floats are first-class; they are each natively 32-bit,
> with 16-bit versions to be supported down the line. There's support for
> int8 and float64, but I haven't seen these used with GLSL, so I don't
> know how they work.
>
> From what I can tell, booleans are NIR-style: ~0 for true, 0 for false.
> There are no opcodes for b2f or f2b; instead, bitwise ops (iand, etc)
> are used for the conversion.
>
>> Interpret the source as an integer, negate it, and convert to a float.
>
> I don't believe this can done with a move, although it may be possible
> to accomplish this with a source modifier to the dedicated i2f
> instruction. I'll need to test this at some point, although I'm not
> convinced it's any faster on this architecture.
>
>> We did both of these things in our backend when translating NIR's b2f
>> into our backend IR. Why is it better for you to do that in NIR?
>
> Hmm, that's a good point. I'm not too familiar with i965's architecture,
> but at the moment, my compiler uses only a thin backend IR. That is, its
> "IR" is essentially unpacked hardware instructions, with placeholders
> instead of registers, and metadata. Given that the ISA is a good match
> for NIR, normally this simplifies the backend tremendously, as the only
> optimisation passes I need to implement are highly specialised to the
> hardware. By contrast, I would not want to implement constant folding,
> for instance, on the backend IR.
>
> I was hoping that NIR, by design, would eliminate this code duplication
> between backends. Accordingly, I try to keep code in NIR as late as
> possible, reusing standard lowering and optimisation passes.
>
> Theoretically, a patch like this would enable the lowered iand (or type
> conversion) to benefit from further NIR optimisations and allow code to
> be shared with other backends, but I'm admittedly grasping for straws
> here.
>
> The backend design is still early, and I haven't figured out what the
> ideal split of responsibilities between common code and the backend
> should be. I'm open to further input -- if major changes are needed,
> it's better for that to happen now than when I send the patches in for
> the compiler itself!

That all sounds good. Those are exactly the things we wanted when
creating NIR, so it's nice to hear :)

The only concern I have is that you'll be losing out on a number of
optimizations on b2f if you lower it to iand in the normal algebraic
optimization pass. We have a section of nir_opt_algebraic.py that runs
optimizations "late" (function is called nir_opt_algebraic_late() in
C):

# This section contains "late" optimizations that should be run after the
# regular optimizations have finished.  Optimizations should go here if
# they help code generation but do not necessarily produce code that is
# more easily optimizable.
late_optimizations = [
...
]

It might be beneficial for you to lower b2f there instead.
> That all sounds good. Those are exactly the things we wanted when
> creating NIR, so it's nice to hear :)

Hooray!

> It might be beneficial for you to lower b2f there

Ah, I had not noticed that section was seperate. Yes, I think you're
right about that.

Updated patch sent. (I hope I did this right; I'm still new to
development over mailing lists.)