ac: fix exclusive scans on GFX8-GFX9

Submitted by Samuel Pitoiset on Aug. 21, 2019, 1:49 p.m.

Details

Message ID 20190821134914.18839-1-samuel.pitoiset@gmail.com
State New
Headers show
Series "ac: fix exclusive scans on GFX8-GFX9" ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Samuel Pitoiset Aug. 21, 2019, 1:49 p.m.
This fixes a regression introduced with scan&reduce operations
on GFX10. Note that some subgroups CTS still fail on GFX10 but
I assume it's a different issue.

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.

Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
---
 src/amd/common/ac_llvm_build.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

Patch hide | download patch | download mbox

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 05871f5ea98..d72eaa2db46 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -4221,10 +4221,7 @@  ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
 	if (ctx->chip_class >= GFX10) {
 		result = inclusive ? src : identity;
 	} else {
-		if (inclusive)
-			result = src;
-		else
-			result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
+		result = src;
 	}
 	if (maxprefix <= 1)
 		return result;
@@ -4333,6 +4330,8 @@  ac_build_exclusive_scan(struct ac_llvm_context *ctx, LLVMValueRef src, nir_op op
 		get_reduction_identity(ctx, op, ac_get_type_size(LLVMTypeOf(src)));
 	result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, src, identity),
 				  LLVMTypeOf(identity), "");
+	if (ctx->chip_class <= GFX9)
+		result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 0xf, false);
 	result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, false);
 
 	return ac_build_wwm(ctx, result);

Comments

On Wed, Aug 21, 2019 at 3:45 PM Samuel Pitoiset
<samuel.pitoiset@gmail.com> wrote:
>
> This fixes a regression introduced with scan&reduce operations
> on GFX10. Note that some subgroups CTS still fail on GFX10 but
> I assume it's a different issue.
>
> This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
>
> Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
> ---
>  src/amd/common/ac_llvm_build.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
> index 05871f5ea98..d72eaa2db46 100644
> --- a/src/amd/common/ac_llvm_build.c
> +++ b/src/amd/common/ac_llvm_build.c
> @@ -4221,10 +4221,7 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
>         if (ctx->chip_class >= GFX10) {
>                 result = inclusive ? src : identity;
>         } else {
> -               if (inclusive)
> -                       result = src;
> -               else
> -                       result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
> +               result = src;
>         }
>         if (maxprefix <= 1)
>                 return result;
> @@ -4333,6 +4330,8 @@ ac_build_exclusive_scan(struct ac_llvm_context *ctx, LLVMValueRef src, nir_op op
>                 get_reduction_identity(ctx, op, ac_get_type_size(LLVMTypeOf(src)));
>         result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, src, identity),
>                                   LLVMTypeOf(identity), "");
> +       if (ctx->chip_class <= GFX9)
> +               result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 0xf, false);

Kinda annoying that we still do the inclusive/exclusive logic for
gfx10 inside ac_build_scan. Can we keep this inside the function by
using a intermediate src?

>         result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, false);
>
>         return ac_build_wwm(ctx, result);
> --
> 2.22.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
On 8/21/19 3:59 PM, Bas Nieuwenhuizen wrote:
> On Wed, Aug 21, 2019 at 3:45 PM Samuel Pitoiset
> <samuel.pitoiset@gmail.com> wrote:
>> This fixes a regression introduced with scan&reduce operations
>> on GFX10. Note that some subgroups CTS still fail on GFX10 but
>> I assume it's a different issue.
>>
>> This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
>>
>> Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
>> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
>> ---
>>   src/amd/common/ac_llvm_build.c | 7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
>> index 05871f5ea98..d72eaa2db46 100644
>> --- a/src/amd/common/ac_llvm_build.c
>> +++ b/src/amd/common/ac_llvm_build.c
>> @@ -4221,10 +4221,7 @@ ac_build_scan(struct ac_llvm_context *ctx, nir_op op, LLVMValueRef src, LLVMValu
>>          if (ctx->chip_class >= GFX10) {
>>                  result = inclusive ? src : identity;
>>          } else {
>> -               if (inclusive)
>> -                       result = src;
>> -               else
>> -                       result = ac_build_dpp(ctx, identity, src, dpp_wf_sr1, 0xf, 0xf, false);
>> +               result = src;
>>          }
>>          if (maxprefix <= 1)
>>                  return result;
>> @@ -4333,6 +4330,8 @@ ac_build_exclusive_scan(struct ac_llvm_context *ctx, LLVMValueRef src, nir_op op
>>                  get_reduction_identity(ctx, op, ac_get_type_size(LLVMTypeOf(src)));
>>          result = LLVMBuildBitCast(ctx->builder, ac_build_set_inactive(ctx, src, identity),
>>                                    LLVMTypeOf(identity), "");
>> +       if (ctx->chip_class <= GFX9)
>> +               result = ac_build_dpp(ctx, identity, result, dpp_wf_sr1, 0xf, 0xf, false);
> Kinda annoying that we still do the inclusive/exclusive logic for
> gfx10 inside ac_build_scan. Can we keep this inside the function by
> using a intermediate src?
Looks better indeed, I will make that change.
>
>>          result = ac_build_scan(ctx, op, result, identity, ctx->wave_size, false);
>>
>>          return ac_build_wwm(ctx, result);
>> --
>> 2.22.1
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev