[1/2] nir: Add inverted bitwise ops

Submitted by Alyssa Rosenzweig on April 25, 2019, 10:37 p.m.

Details

Message ID 20190425223744.4324-2-alyssa@rosenzweig.io
State New
Headers show
Series "Add inverted bitwise forms to NIR" ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Alyssa Rosenzweig April 25, 2019, 10:37 p.m.
In addition to the familiar iand/ior/ixor, some architectures feature
destination-inverted versions inand/inor/inxor. Certain
architectures also have source-inverted forms, dubbed iandnot/iornot
here. Midgard has the all of these opcodes natively. Many arches have
comparible features to implement some/all of the above. Paired with De
Morgan's Laws, these opcodes allow anything of the form
"~? (~?a [&|] ~?b)" to complete in one instruction.

This can be used to simplify some backend-specific code on affected
architectures, e.f. 8eb36c91 ("intel/fs: Emit logical-not of operands on
Gen8+").

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
---
 src/compiler/nir/nir.h                |  4 ++++
 src/compiler/nir/nir_opcodes.py       | 18 ++++++++++++++++++
 src/compiler/nir/nir_opt_algebraic.py | 12 ++++++++++++
 3 files changed, 34 insertions(+)

Patch hide | download patch | download mbox

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index e878a63409d..3e01ec2cc06 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2318,6 +2318,10 @@  typedef struct nir_shader_compiler_options {
    bool lower_hadd;
    bool lower_add_sat;
 
+   /* Set if inand/inor/inxor and iandnot/iornot supported respectively */
+   bool bitwise_dest_invertable;
+   bool bitwise_src_invertable;
+
    /**
     * Should nir_lower_io() create load_interpolated_input intrinsics?
     *
diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
index d35d820aa5b..f9d92afb53e 100644
--- a/src/compiler/nir/nir_opcodes.py
+++ b/src/compiler/nir/nir_opcodes.py
@@ -690,6 +690,24 @@  binop("iand", tuint, commutative + associative, "src0 & src1")
 binop("ior", tuint, commutative + associative, "src0 | src1")
 binop("ixor", tuint, commutative + associative, "src0 ^ src1")
 
+# inverted bitwise logic operators
+#
+# These variants of the above include bitwise NOTs either on the result of the
+# whole expression or on the latter operand. On some hardware (e.g. Midgard),
+# these are native ops. On other hardware (e.g. Intel Gen8+), these can be
+# implemented as modifiers of the standard three. Along with appropriate
+# algebraic passes, these should permit any permutation of inverses on AND/OR
+# to execute in a single cycle. For example, ~(a & ~b) = ~(~(~a | ~(~b))) = ~a
+# | b = b | ~a = iornot(b, a).
+
+binop("inand", tuint, commutative, "~(src0 & src1)")
+binop("inor", tuint, commutative, "~(src0 | src1)")
+binop("inxor", tuint, commutative, "~(src0 ^ src1)")
+binop("iandnot", tuint, "", "src0 & (~src1)")
+binop("iornot", tuint, "", "src0 & (~src1)")
+
+
+
 
 # floating point logic operators
 #
diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index dad0545594f..6cb3e8cb950 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -1052,6 +1052,18 @@  late_optimizations = [
    (('fmax', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', b)), ('fadd', c, ('fmax', a, b))),
 
    (('bcsel', a, 0, ('b2f32', ('inot', 'b@bool'))), ('b2f32', ('inot', ('ior', a, b)))),
+
+   # We don't want to deal with inverted forms, so run this late. Any
+   # combination of inverts on flags or output should result in a single
+   # instruction if these are supported; cases not explicitly handled would
+   # have been simplified via De Morgan's Law
+   (('inot', ('iand', a, b)), ('inand', a, b), 'options->bitwise_dest_invertable'),
+   (('inot', ('ior', a, b)), ('inor', a, b), 'options->bitwise_dest_invertable'),
+   (('inot', ('ixor', a, b)), ('inxor', a, b), 'options->bitwise_dest_invertable'),
+   (('iand', ('inot', a), b), ('iandnot', b, a), 'options->bitwise_src_invertable'),
+   (('iand', a, ('inot', b)), ('iandnot', a, b), 'options->bitwise_src_invertable'),
+   (('ior', a, ('inot', b)), ('iornot', a, b), 'options->bitwise_src_invertable'),
+   (('ior', ('inot', a), b), ('iornot', b, a), 'options->bitwise_src_invertable'),
 ]
 
 print(nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render())

Comments

On Thu, Apr 25, 2019 at 3:37 PM Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:
>
> In addition to the familiar iand/ior/ixor, some architectures feature
> destination-inverted versions inand/inor/inxor. Certain
> architectures also have source-inverted forms, dubbed iandnot/iornot
> here. Midgard has the all of these opcodes natively. Many arches have
> comparible features to implement some/all of the above. Paired with De
> Morgan's Laws, these opcodes allow anything of the form
> "~? (~?a [&|] ~?b)" to complete in one instruction.
>
> This can be used to simplify some backend-specific code on affected
> architectures, e.f. 8eb36c91 ("intel/fs: Emit logical-not of operands on
> Gen8+").
>
> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
> Cc: Ian Romanick <ian.d.romanick@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> ---
>  src/compiler/nir/nir.h                |  4 ++++
>  src/compiler/nir/nir_opcodes.py       | 18 ++++++++++++++++++
>  src/compiler/nir/nir_opt_algebraic.py | 12 ++++++++++++
>  3 files changed, 34 insertions(+)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index e878a63409d..3e01ec2cc06 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2318,6 +2318,10 @@ typedef struct nir_shader_compiler_options {
>     bool lower_hadd;
>     bool lower_add_sat;
>
> +   /* Set if inand/inor/inxor and iandnot/iornot supported respectively */
> +   bool bitwise_dest_invertable;
> +   bool bitwise_src_invertable;


neat, this trick seems to work on a6xx (and I'd assume earlier gens
that use ir3 although I'd have to dust off some older blobs to check..
blob is less kind when it comes to retaining support for older hw)..

very-bikeshed-comment: split into two comments per compiler_options flag?

Reviewed-by: Rob Clark <robdclark@gmail.com>


> +
>     /**
>      * Should nir_lower_io() create load_interpolated_input intrinsics?
>      *
> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
> index d35d820aa5b..f9d92afb53e 100644
> --- a/src/compiler/nir/nir_opcodes.py
> +++ b/src/compiler/nir/nir_opcodes.py
> @@ -690,6 +690,24 @@ binop("iand", tuint, commutative + associative, "src0 & src1")
>  binop("ior", tuint, commutative + associative, "src0 | src1")
>  binop("ixor", tuint, commutative + associative, "src0 ^ src1")
>
> +# inverted bitwise logic operators
> +#
> +# These variants of the above include bitwise NOTs either on the result of the
> +# whole expression or on the latter operand. On some hardware (e.g. Midgard),
> +# these are native ops. On other hardware (e.g. Intel Gen8+), these can be
> +# implemented as modifiers of the standard three. Along with appropriate
> +# algebraic passes, these should permit any permutation of inverses on AND/OR
> +# to execute in a single cycle. For example, ~(a & ~b) = ~(~(~a | ~(~b))) = ~a
> +# | b = b | ~a = iornot(b, a).
> +
> +binop("inand", tuint, commutative, "~(src0 & src1)")
> +binop("inor", tuint, commutative, "~(src0 | src1)")
> +binop("inxor", tuint, commutative, "~(src0 ^ src1)")
> +binop("iandnot", tuint, "", "src0 & (~src1)")
> +binop("iornot", tuint, "", "src0 & (~src1)")
> +
> +
> +
>
>  # floating point logic operators
>  #
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index dad0545594f..6cb3e8cb950 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -1052,6 +1052,18 @@ late_optimizations = [
>     (('fmax', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', b)), ('fadd', c, ('fmax', a, b))),
>
>     (('bcsel', a, 0, ('b2f32', ('inot', 'b@bool'))), ('b2f32', ('inot', ('ior', a, b)))),
> +
> +   # We don't want to deal with inverted forms, so run this late. Any
> +   # combination of inverts on flags or output should result in a single
> +   # instruction if these are supported; cases not explicitly handled would
> +   # have been simplified via De Morgan's Law
> +   (('inot', ('iand', a, b)), ('inand', a, b), 'options->bitwise_dest_invertable'),
> +   (('inot', ('ior', a, b)), ('inor', a, b), 'options->bitwise_dest_invertable'),
> +   (('inot', ('ixor', a, b)), ('inxor', a, b), 'options->bitwise_dest_invertable'),
> +   (('iand', ('inot', a), b), ('iandnot', b, a), 'options->bitwise_src_invertable'),
> +   (('iand', a, ('inot', b)), ('iandnot', a, b), 'options->bitwise_src_invertable'),
> +   (('ior', a, ('inot', b)), ('iornot', a, b), 'options->bitwise_src_invertable'),
> +   (('ior', ('inot', a), b), ('iornot', b, a), 'options->bitwise_src_invertable'),
>  ]
>
>  print(nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render())
> --
> 2.20.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

On 4/25/19 3:37 PM, Alyssa Rosenzweig wrote:
> In addition to the familiar iand/ior/ixor, some architectures feature
> destination-inverted versions inand/inor/inxor. Certain
> architectures also have source-inverted forms, dubbed iandnot/iornot
> here. Midgard has the all of these opcodes natively. Many arches have
> comparible features to implement some/all of the above. Paired with De
> Morgan's Laws, these opcodes allow anything of the form
> "~? (~?a [&|] ~?b)" to complete in one instruction.
> 
> This can be used to simplify some backend-specific code on affected
> architectures, e.f. 8eb36c91 ("intel/fs: Emit logical-not of operands on
> Gen8+").
> 
> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
> Cc: Ian Romanick <ian.d.romanick@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> ---
>  src/compiler/nir/nir.h                |  4 ++++
>  src/compiler/nir/nir_opcodes.py       | 18 ++++++++++++++++++
>  src/compiler/nir/nir_opt_algebraic.py | 12 ++++++++++++
>  3 files changed, 34 insertions(+)
> 
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index e878a63409d..3e01ec2cc06 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2318,6 +2318,10 @@ typedef struct nir_shader_compiler_options {
>     bool lower_hadd;
>     bool lower_add_sat;
>  
> +   /* Set if inand/inor/inxor and iandnot/iornot supported respectively */
> +   bool bitwise_dest_invertable;
> +   bool bitwise_src_invertable;
> +
>     /**
>      * Should nir_lower_io() create load_interpolated_input intrinsics?
>      *
> diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py
> index d35d820aa5b..f9d92afb53e 100644
> --- a/src/compiler/nir/nir_opcodes.py
> +++ b/src/compiler/nir/nir_opcodes.py
> @@ -690,6 +690,24 @@ binop("iand", tuint, commutative + associative, "src0 & src1")
>  binop("ior", tuint, commutative + associative, "src0 | src1")
>  binop("ixor", tuint, commutative + associative, "src0 ^ src1")
>  
> +# inverted bitwise logic operators
> +#
> +# These variants of the above include bitwise NOTs either on the result of the
> +# whole expression or on the latter operand. On some hardware (e.g. Midgard),
> +# these are native ops. On other hardware (e.g. Intel Gen8+), these can be
> +# implemented as modifiers of the standard three. Along with appropriate
> +# algebraic passes, these should permit any permutation of inverses on AND/OR
> +# to execute in a single cycle. For example, ~(a & ~b) = ~(~(~a | ~(~b))) = ~a
> +# | b = b | ~a = iornot(b, a).
> +
> +binop("inand", tuint, commutative, "~(src0 & src1)")
> +binop("inor", tuint, commutative, "~(src0 | src1)")
> +binop("inxor", tuint, commutative, "~(src0 ^ src1)")
> +binop("iandnot", tuint, "", "src0 & (~src1)")
> +binop("iornot", tuint, "", "src0 & (~src1)")
> +
> +
> +
>  
>  # floating point logic operators
>  #
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index dad0545594f..6cb3e8cb950 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -1052,6 +1052,18 @@ late_optimizations = [
>     (('fmax', ('fadd(is_used_once)', '#c', a), ('fadd(is_used_once)', '#c', b)), ('fadd', c, ('fmax', a, b))),
>  
>     (('bcsel', a, 0, ('b2f32', ('inot', 'b@bool'))), ('b2f32', ('inot', ('ior', a, b)))),
> +
> +   # We don't want to deal with inverted forms, so run this late. Any
> +   # combination of inverts on flags or output should result in a single
> +   # instruction if these are supported; cases not explicitly handled would
> +   # have been simplified via De Morgan's Law
> +   (('inot', ('iand', a, b)), ('inand', a, b), 'options->bitwise_dest_invertable'),
> +   (('inot', ('ior', a, b)), ('inor', a, b), 'options->bitwise_dest_invertable'),
> +   (('inot', ('ixor', a, b)), ('inxor', a, b), 'options->bitwise_dest_invertable'),
> +   (('iand', ('inot', a), b), ('iandnot', b, a), 'options->bitwise_src_invertable'),
> +   (('iand', a, ('inot', b)), ('iandnot', a, b), 'options->bitwise_src_invertable'),

iand and ior are commutative, so you don't need both.

> +   (('ior', a, ('inot', b)), ('iornot', a, b), 'options->bitwise_src_invertable'),
> +   (('ior', ('inot', a), b), ('iornot', b, a), 'options->bitwise_src_invertable'),

Especially without instruction count data, I'm not very excited about
this.  When / how to do these translations is more complex than this.
For example, if the only use of inot(...some logic...) is an
if-condition, we don't want to re-write it.  Our backend will just
invert the "polarity" of the conditional branch.  What if the "...some
logic..." is used more than once?  My gut tells me that doing this in
the backend with something like Eric's NOLTIS is the right way to go.

I can try a shader-db run later.

This is also part of the reason that I've never sent out some other
patches that I have that convert certain kinds of logic operations into
other things.  For example,

   # True iff a == -1 and b == 0
   (('iand', 'a@bool32', ('inot', 'b@bool32)), ('ilt', a, b)),

>  ]
>  
>  print(nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render())
>
> We can support all of these with source modifiers because the above three
> aren't really "dest invertable"...  For us, they'd be
> 
> ~src0 | ~src1
> ~src0 & ~src1
> ~src0 ^ ~src1
> 
> Is it really dest_invertable or both_srcs_invertable? :-)

Sure, I wasn't sure how other drivers would want to handle this, if at
all. I'm rather regretting not slapping an "RFC" tag on this, oh well :)

> Also worth noting that I've considered adding a not modifier to NIR (if
> source modifiers are a thing we actually want at that level).  Over-all,
> I'm a little uncertain if these need to be their own ops or not...

It's worth noting that, as far as I know, the instructions added here
are the _only_ ones that Midgard can do -- our source modifiers only
apply to floating-point ops (int ops use the bits to distinguish
sign-ext/zero-ext). The not's in these ops is baked right into the
instruction.

It'd be a simple lowering pass either way, of course.
> iand and ior are commutative, so you don't need both.

--Wait, woaaah, the algebraic generator respects that? Super neat, thank
you!

> Especially without instruction count data

(I'm assuming I won't be able to do shader-db on my hw at this point..)

> For example, if the only use of inot(...some logic...) is an
> if-condition, we don't want to re-write it.  Our backend will just
> invert the "polarity" of the conditional branch. 

Out of curiousity, what's the specific issue? Midgard also specifies
branch polarity, though I currently hardcode to true since I'm not
convinced it makes a difference.

> My gut tells me that doing this in
> the backend with something like Eric's NOLTIS is the right way to go.

I'm not sure what NOLTIS is, sorry.

Would you ack a change adding the ops to nir_opcode.py but not adding an
opt passes? I have a backend algebra pass which I'm happy to move this
into (and I think it would be a win on Midgard regardless); I just
(AFAIK) need the ops in NIR for the algebra pass to work at all. [I
don't want to duplicate this infrastructure to run over the machine IR,
which is quite limited since I trust the input NIR to be good.]

Thank you for the comments.

> This is also part of the reason that I've never sent out some other
> patches that I have that convert certain kinds of logic operations into
> other things.  For example,
> 
>    # True iff a == -1 and b == 0
>    (('iand', 'a@bool32', ('inot', 'b@bool32)), ('ilt', a, b)),

...Cute. Point taken :)