[Mesa-dev,08/10] radeonsi: re-enable unsafe-fp-math for LLVM 3.8

Submitted by Marek Olšák on Oct. 11, 2015, 1:29 a.m.

Details

Message ID 1444526990-14029-9-git-send-email-maraeo@gmail.com
State New
Headers show

Not browsing as part of any series.

Commit Message

Marek Olšák Oct. 11, 2015, 1:29 a.m.
From: Marek Olšák <marek.olsak@amd.com>

Required for 1/sqrt ==> rsq.

We should finally fix the hang instead of running away from the issue. This
assumes the bug is in LLVM and we have time to fix it before the release.
Include compute shaders as well, which only affects TGSI and thus OpenGL.

Totals:
SGPRS: 344368 -> 345104 (0.21 %)
VGPRS: 197552 -> 197420 (-0.07 %)
Code Size: 7366304 -> 7324692 (-0.56 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1615872 -> 1524736 (-5.64 %) bytes per wave

Totals from affected shaders:
SGPRS: 146696 -> 147432 (0.50 %)
VGPRS: 87212 -> 87080 (-0.15 %)
Code Size: 3852664 -> 3811052 (-1.08 %) bytes
LDS: 48 -> 48 (0.00 %) blocks
Scratch: 1179648 -> 1088512 (-7.73 %) bytes per wave
---
 src/gallium/drivers/radeon/radeon_llvm_emit.c | 7 +++++++
 1 file changed, 7 insertions(+)

Patch hide | download patch | download mbox

diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 6b2ebde..4bda4a4 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -84,6 +84,13 @@  void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
 	sprintf(Str, "%1d", llvm_type);
 
 	LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
+
+#if HAVE_LLVM >= 0x0308
+	/* This only affects TGSI (OpenGL), so it's okay to set it for
+	 * compute shaders too.
+	 */
+	LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
+#endif
 }
 
 static void init_r600_target()

Comments

FWIW, this isn't quite correct with ARB_shader_precision or GL4.1 --
it specifies that infinities should be correctly generated through
division by 0, which unsafe-fp-math doesn't guarantee. At least,
that's assuming this is similar to the "fast" per-instruction flag
(http://llvm.org/docs/LangRef.html#fast-math-flags) which says "This
flag implies all the others."

On Sat, Oct 10, 2015 at 9:29 PM, Marek Olšák <maraeo@gmail.com> wrote:
> From: Marek Olšák <marek.olsak@amd.com>
>
> Required for 1/sqrt ==> rsq.
>
> We should finally fix the hang instead of running away from the issue. This
> assumes the bug is in LLVM and we have time to fix it before the release.
> Include compute shaders as well, which only affects TGSI and thus OpenGL.
>
> Totals:
> SGPRS: 344368 -> 345104 (0.21 %)
> VGPRS: 197552 -> 197420 (-0.07 %)
> Code Size: 7366304 -> 7324692 (-0.56 %) bytes
> LDS: 91 -> 91 (0.00 %) blocks
> Scratch: 1615872 -> 1524736 (-5.64 %) bytes per wave
>
> Totals from affected shaders:
> SGPRS: 146696 -> 147432 (0.50 %)
> VGPRS: 87212 -> 87080 (-0.15 %)
> Code Size: 3852664 -> 3811052 (-1.08 %) bytes
> LDS: 48 -> 48 (0.00 %) blocks
> Scratch: 1179648 -> 1088512 (-7.73 %) bytes per wave
> ---
>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index 6b2ebde..4bda4a4 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -84,6 +84,13 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
>         sprintf(Str, "%1d", llvm_type);
>
>         LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
> +
> +#if HAVE_LLVM >= 0x0308
> +       /* This only affects TGSI (OpenGL), so it's okay to set it for
> +        * compute shaders too.
> +        */
> +       LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
> +#endif
>  }
>
>  static void init_r600_target()
> --
> 2.1.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Hi Marek,

I don't get the hang on Dota 2 Reborn with this patch and LLVM/Mesa git.

Tested-by: Nick Sarnie <commendsarnex@gmail.com>

Thanks!

On Sat, Oct 10, 2015 at 10:12 PM, Connor Abbott <cwabbott0@gmail.com> wrote:

> FWIW, this isn't quite correct with ARB_shader_precision or GL4.1 --
> it specifies that infinities should be correctly generated through
> division by 0, which unsafe-fp-math doesn't guarantee. At least,
> that's assuming this is similar to the "fast" per-instruction flag
> (http://llvm.org/docs/LangRef.html#fast-math-flags) which says "This
> flag implies all the others."
>
> On Sat, Oct 10, 2015 at 9:29 PM, Marek Olšák <maraeo@gmail.com> wrote:
> > From: Marek Olšák <marek.olsak@amd.com>
> >
> > Required for 1/sqrt ==> rsq.
> >
> > We should finally fix the hang instead of running away from the issue.
> This
> > assumes the bug is in LLVM and we have time to fix it before the release.
> > Include compute shaders as well, which only affects TGSI and thus OpenGL.
> >
> > Totals:
> > SGPRS: 344368 -> 345104 (0.21 %)
> > VGPRS: 197552 -> 197420 (-0.07 %)
> > Code Size: 7366304 -> 7324692 (-0.56 %) bytes
> > LDS: 91 -> 91 (0.00 %) blocks
> > Scratch: 1615872 -> 1524736 (-5.64 %) bytes per wave
> >
> > Totals from affected shaders:
> > SGPRS: 146696 -> 147432 (0.50 %)
> > VGPRS: 87212 -> 87080 (-0.15 %)
> > Code Size: 3852664 -> 3811052 (-1.08 %) bytes
> > LDS: 48 -> 48 (0.00 %) blocks
> > Scratch: 1179648 -> 1088512 (-7.73 %) bytes per wave
> > ---
> >  src/gallium/drivers/radeon/radeon_llvm_emit.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> > index 6b2ebde..4bda4a4 100644
> > --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> > +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> > @@ -84,6 +84,13 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned
> type)
> >         sprintf(Str, "%1d", llvm_type);
> >
> >         LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
> > +
> > +#if HAVE_LLVM >= 0x0308
> > +       /* This only affects TGSI (OpenGL), so it's okay to set it for
> > +        * compute shaders too.
> > +        */
> > +       LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
> > +#endif
> >  }
> >
> >  static void init_r600_target()
> > --
> > 2.1.4
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
On Sun, Oct 11, 2015 at 4:12 AM, Connor Abbott <cwabbott0@gmail.com> wrote:
> FWIW, this isn't quite correct with ARB_shader_precision or GL4.1 --
> it specifies that infinities should be correctly generated through
> division by 0, which unsafe-fp-math doesn't guarantee. At least,
> that's assuming this is similar to the "fast" per-instruction flag
> (http://llvm.org/docs/LangRef.html#fast-math-flags) which says "This
> flag implies all the others."

We don't use the per-instruction flags yet.

Sadly, we need this flag to be able to get (1/sqrt) -> RSQ and (1/x)
-> RCP. LLVM doesn't have standard intrinsics for those instructions,
so we have to unwind them and rely on LLVM to combine them. Without
it, the TGSI->LLVM conversion would produce worse code.

Marek
On Sun, Oct 11, 2015 at 7:08 AM, Marek Olšák <maraeo@gmail.com> wrote:
> On Sun, Oct 11, 2015 at 4:12 AM, Connor Abbott <cwabbott0@gmail.com> wrote:
>> FWIW, this isn't quite correct with ARB_shader_precision or GL4.1 --
>> it specifies that infinities should be correctly generated through
>> division by 0, which unsafe-fp-math doesn't guarantee. At least,
>> that's assuming this is similar to the "fast" per-instruction flag
>> (http://llvm.org/docs/LangRef.html#fast-math-flags) which says "This
>> flag implies all the others."
>
> We don't use the per-instruction flags yet.
>
> Sadly, we need this flag to be able to get (1/sqrt) -> RSQ and (1/x)
> -> RCP. LLVM doesn't have standard intrinsics for those instructions,
> so we have to unwind them and rely on LLVM to combine them. Without
> it, the TGSI->LLVM conversion would produce worse code.
>
> Marek

Right, but you're also allowing transforms which GLSL explicitly
disallows. It sounds like you need more precise control than what LLVM
currently offers.

Connor
On Sun, Oct 11, 2015 at 03:29:48AM +0200, Marek Olšák wrote:
> From: Marek Olšák <marek.olsak@amd.com>
> 

I don't think we should globally enable this until we are sure it does not
introduce any illegal transforms.

> Required for 1/sqrt ==> rsq.

I think the arcp fast-math flag for instruction is supposed to allow this.
Let me check with some LLVM people.

-Tom
> 
> We should finally fix the hang instead of running away from the issue. This
> assumes the bug is in LLVM and we have time to fix it before the release.
> Include compute shaders as well, which only affects TGSI and thus OpenGL.
> 
> Totals:
> SGPRS: 344368 -> 345104 (0.21 %)
> VGPRS: 197552 -> 197420 (-0.07 %)
> Code Size: 7366304 -> 7324692 (-0.56 %) bytes
> LDS: 91 -> 91 (0.00 %) blocks
> Scratch: 1615872 -> 1524736 (-5.64 %) bytes per wave
> 
> Totals from affected shaders:
> SGPRS: 146696 -> 147432 (0.50 %)
> VGPRS: 87212 -> 87080 (-0.15 %)
> Code Size: 3852664 -> 3811052 (-1.08 %) bytes
> LDS: 48 -> 48 (0.00 %) blocks
> Scratch: 1179648 -> 1088512 (-7.73 %) bytes per wave
> ---
>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index 6b2ebde..4bda4a4 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -84,6 +84,13 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
>  	sprintf(Str, "%1d", llvm_type);
>  
>  	LLVMAddTargetDependentFunctionAttr(F, "ShaderType", Str);
> +
> +#if HAVE_LLVM >= 0x0308
> +	/* This only affects TGSI (OpenGL), so it's okay to set it for
> +	 * compute shaders too.
> +	 */
> +	LLVMAddTargetDependentFunctionAttr(F, "unsafe-fp-math", "true");
> +#endif
>  }
>  
>  static void init_r600_target()
> -- 
> 2.1.4
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev