[2/2] arb_shader_image_load_store/invalid: Skip the index bounds test on Intel Ivybridge hardware.

Submitted by Francisco Jerez on July 24, 2015, 6:01 p.m.

Details

Message ID 1437760907-23174-2-git-send-email-currojerez@riseup.net
State New
Headers show

Not browsing as part of any series.

Commit Message

Francisco Jerez July 24, 2015, 6:01 p.m.
This test is known to cause a GPU hang on IVB due to a hardware bug.
The reason is that the hardware seems to be unable to cope with
binding table indices mapping to an invalid surface, which is
sometimes the case when a shader accesses an array of images out of
bounds.

According to the ARB_shader_image_load_store spec "if the index used
to select an individual element [of an array of images] is negative or
greater than or equal to the size of the array, the results of the
operation are undefined but may not lead to termination." -- Which is
precisely one of the typical outcomes of the GPU hang caused by this
test.

In theory this could be fixed by implementing array bounds checking in
software, but it doesn't seem worth doing because on the one hand it
would punish well-behaved applications unnecessarily, and on the other
hand there doesn't seem to be any reasonable use-case for the
behaviour required by the spec [the implementation may set the
computer on fire as long as it somehow manages not to terminate the
program ;)].

If the breakage caused by this test case was confined to a single
process I would probably consider to leave it alone, however I've
noticed that in some cases the hang will cause other unrelated piglit
tests that happen to be run in parallel to fail, causing some random
noise in the results we get from our piglit system, what makes them
rather difficult to interpret.
---
 tests/spec/arb_shader_image_load_store/invalid.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/tests/spec/arb_shader_image_load_store/invalid.c b/tests/spec/arb_shader_image_load_store/invalid.c
index a0981e4..52b3275 100644
--- a/tests/spec/arb_shader_image_load_store/invalid.c
+++ b/tests/spec/arb_shader_image_load_store/invalid.c
@@ -292,6 +292,17 @@  invalidate_address_bounds(const struct image_info img, GLuint prog)
 }
 
 static bool
+can_test_index_bounds(void)
+{
+   /* This test is known to cause a GPU hang on Intel Ivybridge
+    * hardware due to a hardware bug.  Don't annoy users
+    * unnecessarily.
+    */
+   return !strstr((const char *)glGetString(GL_RENDERER),
+                  "Intel(R) Ivybridge");
+}
+
+static bool
 invalidate_index_bounds(const struct image_info img, GLuint prog)
 {
         return set_uniform_int(prog, "u", 0xdeadcafe);
@@ -444,7 +455,7 @@  piglit_init(int argc, char **argv)
                  *  element is negative or greater than or equal to
                  *  the size of the array [...]"
                  */
-                subtest(&status, true,
+                subtest(&status, can_test_index_bounds(),
                         run_test(op, def_img, def_img,
                                  invalidate_index_bounds, true),
                         "%s/index bounds test", op->name);

Comments

I don't know of any other cases where we skip a test based on the
render name. I don't think this is a good practice to start.

-Jordan

On 2015-07-24 11:01:47, Francisco Jerez wrote:
> This test is known to cause a GPU hang on IVB due to a hardware bug.
> The reason is that the hardware seems to be unable to cope with
> binding table indices mapping to an invalid surface, which is
> sometimes the case when a shader accesses an array of images out of
> bounds.
> 
> According to the ARB_shader_image_load_store spec "if the index used
> to select an individual element [of an array of images] is negative or
> greater than or equal to the size of the array, the results of the
> operation are undefined but may not lead to termination." -- Which is
> precisely one of the typical outcomes of the GPU hang caused by this
> test.
> 
> In theory this could be fixed by implementing array bounds checking in
> software, but it doesn't seem worth doing because on the one hand it
> would punish well-behaved applications unnecessarily, and on the other
> hand there doesn't seem to be any reasonable use-case for the
> behaviour required by the spec [the implementation may set the
> computer on fire as long as it somehow manages not to terminate the
> program ;)].
> 
> If the breakage caused by this test case was confined to a single
> process I would probably consider to leave it alone, however I've
> noticed that in some cases the hang will cause other unrelated piglit
> tests that happen to be run in parallel to fail, causing some random
> noise in the results we get from our piglit system, what makes them
> rather difficult to interpret.
> ---
>  tests/spec/arb_shader_image_load_store/invalid.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/spec/arb_shader_image_load_store/invalid.c b/tests/spec/arb_shader_image_load_store/invalid.c
> index a0981e4..52b3275 100644
> --- a/tests/spec/arb_shader_image_load_store/invalid.c
> +++ b/tests/spec/arb_shader_image_load_store/invalid.c
> @@ -292,6 +292,17 @@ invalidate_address_bounds(const struct image_info img, GLuint prog)
>  }
>  
>  static bool
> +can_test_index_bounds(void)
> +{
> +   /* This test is known to cause a GPU hang on Intel Ivybridge
> +    * hardware due to a hardware bug.  Don't annoy users
> +    * unnecessarily.
> +    */
> +   return !strstr((const char *)glGetString(GL_RENDERER),
> +                  "Intel(R) Ivybridge");
> +}
> +
> +static bool
>  invalidate_index_bounds(const struct image_info img, GLuint prog)
>  {
>          return set_uniform_int(prog, "u", 0xdeadcafe);
> @@ -444,7 +455,7 @@ piglit_init(int argc, char **argv)
>                   *  element is negative or greater than or equal to
>                   *  the size of the array [...]"
>                   */
> -                subtest(&status, true,
> +                subtest(&status, can_test_index_bounds(),
>                          run_test(op, def_img, def_img,
>                                   invalidate_index_bounds, true),
>                          "%s/index bounds test", op->name);
> -- 
> 2.4.3
> 
> _______________________________________________
> Piglit mailing list
> Piglit@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/piglit
Jordan Justen <jordan.l.justen@intel.com> writes:

> I don't know of any other cases where we skip a test based on the
> render name. I don't think this is a good practice to start.
>

What do you think we should do in that case?

> -Jordan
>
> On 2015-07-24 11:01:47, Francisco Jerez wrote:
>> This test is known to cause a GPU hang on IVB due to a hardware bug.
>> The reason is that the hardware seems to be unable to cope with
>> binding table indices mapping to an invalid surface, which is
>> sometimes the case when a shader accesses an array of images out of
>> bounds.
>> 
>> According to the ARB_shader_image_load_store spec "if the index used
>> to select an individual element [of an array of images] is negative or
>> greater than or equal to the size of the array, the results of the
>> operation are undefined but may not lead to termination." -- Which is
>> precisely one of the typical outcomes of the GPU hang caused by this
>> test.
>> 
>> In theory this could be fixed by implementing array bounds checking in
>> software, but it doesn't seem worth doing because on the one hand it
>> would punish well-behaved applications unnecessarily, and on the other
>> hand there doesn't seem to be any reasonable use-case for the
>> behaviour required by the spec [the implementation may set the
>> computer on fire as long as it somehow manages not to terminate the
>> program ;)].
>> 
>> If the breakage caused by this test case was confined to a single
>> process I would probably consider to leave it alone, however I've
>> noticed that in some cases the hang will cause other unrelated piglit
>> tests that happen to be run in parallel to fail, causing some random
>> noise in the results we get from our piglit system, what makes them
>> rather difficult to interpret.
>> ---
>>  tests/spec/arb_shader_image_load_store/invalid.c | 13 ++++++++++++-
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tests/spec/arb_shader_image_load_store/invalid.c b/tests/spec/arb_shader_image_load_store/invalid.c
>> index a0981e4..52b3275 100644
>> --- a/tests/spec/arb_shader_image_load_store/invalid.c
>> +++ b/tests/spec/arb_shader_image_load_store/invalid.c
>> @@ -292,6 +292,17 @@ invalidate_address_bounds(const struct image_info img, GLuint prog)
>>  }
>>  
>>  static bool
>> +can_test_index_bounds(void)
>> +{
>> +   /* This test is known to cause a GPU hang on Intel Ivybridge
>> +    * hardware due to a hardware bug.  Don't annoy users
>> +    * unnecessarily.
>> +    */
>> +   return !strstr((const char *)glGetString(GL_RENDERER),
>> +                  "Intel(R) Ivybridge");
>> +}
>> +
>> +static bool
>>  invalidate_index_bounds(const struct image_info img, GLuint prog)
>>  {
>>          return set_uniform_int(prog, "u", 0xdeadcafe);
>> @@ -444,7 +455,7 @@ piglit_init(int argc, char **argv)
>>                   *  element is negative or greater than or equal to
>>                   *  the size of the array [...]"
>>                   */
>> -                subtest(&status, true,
>> +                subtest(&status, can_test_index_bounds(),
>>                          run_test(op, def_img, def_img,
>>                                   invalidate_index_bounds, true),
>>                          "%s/index bounds test", op->name);
>> -- 
>> 2.4.3
>> 
>> _______________________________________________
>> Piglit mailing list
>> Piglit@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/piglit
On 2015-07-24 12:49:05, Francisco Jerez wrote:
> Jordan Justen <jordan.l.justen@intel.com> writes:
> > I don't know of any other cases where we skip a test based on the
> > render name. I don't think this is a good practice to start.
> 
> What do you think we should do in that case?

* Let the driver hang when the test is run

* See if we can make the driver crash rather than hang the GPU

* See if we can make the driver neither crash nor hang the GPU

* Let the piglit community reach consensus that a driver quirks system
  is reasonable (and thus, go ahead and skip the test on ivb...)

> > On 2015-07-24 11:01:47, Francisco Jerez wrote:
> >> This test is known to cause a GPU hang on IVB due to a hardware bug.
> >> The reason is that the hardware seems to be unable to cope with
> >> binding table indices mapping to an invalid surface, which is
> >> sometimes the case when a shader accesses an array of images out of
> >> bounds.
> >> 
> >> According to the ARB_shader_image_load_store spec "if the index used
> >> to select an individual element [of an array of images] is negative or
> >> greater than or equal to the size of the array, the results of the
> >> operation are undefined but may not lead to termination." -- Which is

Hmm. Maybe we shouldn't test the scenario where the results (according
to the spec) are undefined?

-Jordan

> >> precisely one of the typical outcomes of the GPU hang caused by this
> >> test.
> >> 
> >> In theory this could be fixed by implementing array bounds checking in
> >> software, but it doesn't seem worth doing because on the one hand it
> >> would punish well-behaved applications unnecessarily, and on the other
> >> hand there doesn't seem to be any reasonable use-case for the
> >> behaviour required by the spec [the implementation may set the
> >> computer on fire as long as it somehow manages not to terminate the
> >> program ;)].
> >> 
> >> If the breakage caused by this test case was confined to a single
> >> process I would probably consider to leave it alone, however I've
> >> noticed that in some cases the hang will cause other unrelated piglit
> >> tests that happen to be run in parallel to fail, causing some random
> >> noise in the results we get from our piglit system, what makes them
> >> rather difficult to interpret.
> >> ---
> >>  tests/spec/arb_shader_image_load_store/invalid.c | 13 ++++++++++++-
> >>  1 file changed, 12 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/tests/spec/arb_shader_image_load_store/invalid.c b/tests/spec/arb_shader_image_load_store/invalid.c
> >> index a0981e4..52b3275 100644
> >> --- a/tests/spec/arb_shader_image_load_store/invalid.c
> >> +++ b/tests/spec/arb_shader_image_load_store/invalid.c
> >> @@ -292,6 +292,17 @@ invalidate_address_bounds(const struct image_info img, GLuint prog)
> >>  }
> >>  
> >>  static bool
> >> +can_test_index_bounds(void)
> >> +{
> >> +   /* This test is known to cause a GPU hang on Intel Ivybridge
> >> +    * hardware due to a hardware bug.  Don't annoy users
> >> +    * unnecessarily.
> >> +    */
> >> +   return !strstr((const char *)glGetString(GL_RENDERER),
> >> +                  "Intel(R) Ivybridge");
> >> +}
> >> +
> >> +static bool
> >>  invalidate_index_bounds(const struct image_info img, GLuint prog)
> >>  {
> >>          return set_uniform_int(prog, "u", 0xdeadcafe);
> >> @@ -444,7 +455,7 @@ piglit_init(int argc, char **argv)
> >>                   *  element is negative or greater than or equal to
> >>                   *  the size of the array [...]"
> >>                   */
> >> -                subtest(&status, true,
> >> +                subtest(&status, can_test_index_bounds(),
> >>                          run_test(op, def_img, def_img,
> >>                                   invalidate_index_bounds, true),
> >>                          "%s/index bounds test", op->name);
> >> -- 
> >> 2.4.3
> >> 
> >> _______________________________________________
> >> Piglit mailing list
> >> Piglit@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/piglit
Jordan Justen <jordan.l.justen@intel.com> writes:

> On 2015-07-24 12:49:05, Francisco Jerez wrote:
>> Jordan Justen <jordan.l.justen@intel.com> writes:
>> > I don't know of any other cases where we skip a test based on the
>> > render name. I don't think this is a good practice to start.
>> 
>> What do you think we should do in that case?
>
> * Let the driver hang when the test is run
>
> * See if we can make the driver crash rather than hang the GPU
>
> * See if we can make the driver neither crash nor hang the GPU
>
I doubt any of the three possibilities above is satisfactory for the
reasons I explained in the commit messages -- The bug can surely be
worked around in the compiler but with a performance penalty which IMO
outweights the slight benefit from avoiding a hang on already
ill-behaved applications.

> * Let the piglit community reach consensus that a driver quirks system
>   is reasonable (and thus, go ahead and skip the test on ivb...)
>
>> > On 2015-07-24 11:01:47, Francisco Jerez wrote:
>> >> This test is known to cause a GPU hang on IVB due to a hardware bug.
>> >> The reason is that the hardware seems to be unable to cope with
>> >> binding table indices mapping to an invalid surface, which is
>> >> sometimes the case when a shader accesses an array of images out of
>> >> bounds.
>> >> 
>> >> According to the ARB_shader_image_load_store spec "if the index used
>> >> to select an individual element [of an array of images] is negative or
>> >> greater than or equal to the size of the array, the results of the
>> >> operation are undefined but may not lead to termination." -- Which is
>
> Hmm. Maybe we shouldn't test the scenario where the results (according
> to the spec) are undefined?
>
The one thing the spec doesn't leave undefined is that it may not lead
to program termination (precisely what happens so strictly speaking IVB
is not conformant because of this), which is all the test is checking --
i.e. it always passes as long as it's able to run to completion.

> -Jordan
>
>> >> precisely one of the typical outcomes of the GPU hang caused by this
>> >> test.
>> >> 
>> >> In theory this could be fixed by implementing array bounds checking in
>> >> software, but it doesn't seem worth doing because on the one hand it
>> >> would punish well-behaved applications unnecessarily, and on the other
>> >> hand there doesn't seem to be any reasonable use-case for the
>> >> behaviour required by the spec [the implementation may set the
>> >> computer on fire as long as it somehow manages not to terminate the
>> >> program ;)].
>> >> 
>> >> If the breakage caused by this test case was confined to a single
>> >> process I would probably consider to leave it alone, however I've
>> >> noticed that in some cases the hang will cause other unrelated piglit
>> >> tests that happen to be run in parallel to fail, causing some random
>> >> noise in the results we get from our piglit system, what makes them
>> >> rather difficult to interpret.
>> >> ---
>> >>  tests/spec/arb_shader_image_load_store/invalid.c | 13 ++++++++++++-
>> >>  1 file changed, 12 insertions(+), 1 deletion(-)
>> >> 
>> >> diff --git a/tests/spec/arb_shader_image_load_store/invalid.c b/tests/spec/arb_shader_image_load_store/invalid.c
>> >> index a0981e4..52b3275 100644
>> >> --- a/tests/spec/arb_shader_image_load_store/invalid.c
>> >> +++ b/tests/spec/arb_shader_image_load_store/invalid.c
>> >> @@ -292,6 +292,17 @@ invalidate_address_bounds(const struct image_info img, GLuint prog)
>> >>  }
>> >>  
>> >>  static bool
>> >> +can_test_index_bounds(void)
>> >> +{
>> >> +   /* This test is known to cause a GPU hang on Intel Ivybridge
>> >> +    * hardware due to a hardware bug.  Don't annoy users
>> >> +    * unnecessarily.
>> >> +    */
>> >> +   return !strstr((const char *)glGetString(GL_RENDERER),
>> >> +                  "Intel(R) Ivybridge");
>> >> +}
>> >> +
>> >> +static bool
>> >>  invalidate_index_bounds(const struct image_info img, GLuint prog)
>> >>  {
>> >>          return set_uniform_int(prog, "u", 0xdeadcafe);
>> >> @@ -444,7 +455,7 @@ piglit_init(int argc, char **argv)
>> >>                   *  element is negative or greater than or equal to
>> >>                   *  the size of the array [...]"
>> >>                   */
>> >> -                subtest(&status, true,
>> >> +                subtest(&status, can_test_index_bounds(),
>> >>                          run_test(op, def_img, def_img,
>> >>                                   invalidate_index_bounds, true),
>> >>                          "%s/index bounds test", op->name);
>> >> -- 
>> >> 2.4.3
>> >> 
>> >> _______________________________________________
>> >> Piglit mailing list
>> >> Piglit@lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/piglit
Francisco Jerez <currojerez@riseup.net> writes:

> Jordan Justen <jordan.l.justen@intel.com> writes:
>
>> On 2015-07-24 12:49:05, Francisco Jerez wrote:
>>> Jordan Justen <jordan.l.justen@intel.com> writes:
>>> > I don't know of any other cases where we skip a test based on the
>>> > render name. I don't think this is a good practice to start.
>>> 
>>> What do you think we should do in that case?
>>
>> * Let the driver hang when the test is run
>>
>> * See if we can make the driver crash rather than hang the GPU
>>
>> * See if we can make the driver neither crash nor hang the GPU
>>
> I doubt any of the three possibilities above is satisfactory for the
> reasons I explained in the commit messages -- The bug can surely be
> worked around in the compiler but with a performance penalty which IMO
> outweights the slight benefit from avoiding a hang on already
> ill-behaved applications.
>

Meh...  I've implemented the darn work-around :P, it doesn't seem as bad
as I expected.  Feel free to put your R-b on it if you think that's the
way to go, I'll add you to the CC list.

>> * Let the piglit community reach consensus that a driver quirks system
>>   is reasonable (and thus, go ahead and skip the test on ivb...)
>>
>>> > On 2015-07-24 11:01:47, Francisco Jerez wrote:
>>> >> This test is known to cause a GPU hang on IVB due to a hardware bug.
>>> >> The reason is that the hardware seems to be unable to cope with
>>> >> binding table indices mapping to an invalid surface, which is
>>> >> sometimes the case when a shader accesses an array of images out of
>>> >> bounds.
>>> >> 
>>> >> According to the ARB_shader_image_load_store spec "if the index used
>>> >> to select an individual element [of an array of images] is negative or
>>> >> greater than or equal to the size of the array, the results of the
>>> >> operation are undefined but may not lead to termination." -- Which is
>>
>> Hmm. Maybe we shouldn't test the scenario where the results (according
>> to the spec) are undefined?
>>
> The one thing the spec doesn't leave undefined is that it may not lead
> to program termination (precisely what happens so strictly speaking IVB
> is not conformant because of this), which is all the test is checking --
> i.e. it always passes as long as it's able to run to completion.
>
>> -Jordan
>>
>>> >> precisely one of the typical outcomes of the GPU hang caused by this
>>> >> test.
>>> >> 
>>> >> In theory this could be fixed by implementing array bounds checking in
>>> >> software, but it doesn't seem worth doing because on the one hand it
>>> >> would punish well-behaved applications unnecessarily, and on the other
>>> >> hand there doesn't seem to be any reasonable use-case for the
>>> >> behaviour required by the spec [the implementation may set the
>>> >> computer on fire as long as it somehow manages not to terminate the
>>> >> program ;)].
>>> >> 
>>> >> If the breakage caused by this test case was confined to a single
>>> >> process I would probably consider to leave it alone, however I've
>>> >> noticed that in some cases the hang will cause other unrelated piglit
>>> >> tests that happen to be run in parallel to fail, causing some random
>>> >> noise in the results we get from our piglit system, what makes them
>>> >> rather difficult to interpret.
>>> >> ---
>>> >>  tests/spec/arb_shader_image_load_store/invalid.c | 13 ++++++++++++-
>>> >>  1 file changed, 12 insertions(+), 1 deletion(-)
>>> >> 
>>> >> diff --git a/tests/spec/arb_shader_image_load_store/invalid.c b/tests/spec/arb_shader_image_load_store/invalid.c
>>> >> index a0981e4..52b3275 100644
>>> >> --- a/tests/spec/arb_shader_image_load_store/invalid.c
>>> >> +++ b/tests/spec/arb_shader_image_load_store/invalid.c
>>> >> @@ -292,6 +292,17 @@ invalidate_address_bounds(const struct image_info img, GLuint prog)
>>> >>  }
>>> >>  
>>> >>  static bool
>>> >> +can_test_index_bounds(void)
>>> >> +{
>>> >> +   /* This test is known to cause a GPU hang on Intel Ivybridge
>>> >> +    * hardware due to a hardware bug.  Don't annoy users
>>> >> +    * unnecessarily.
>>> >> +    */
>>> >> +   return !strstr((const char *)glGetString(GL_RENDERER),
>>> >> +                  "Intel(R) Ivybridge");
>>> >> +}
>>> >> +
>>> >> +static bool
>>> >>  invalidate_index_bounds(const struct image_info img, GLuint prog)
>>> >>  {
>>> >>          return set_uniform_int(prog, "u", 0xdeadcafe);
>>> >> @@ -444,7 +455,7 @@ piglit_init(int argc, char **argv)
>>> >>                   *  element is negative or greater than or equal to
>>> >>                   *  the size of the array [...]"
>>> >>                   */
>>> >> -                subtest(&status, true,
>>> >> +                subtest(&status, can_test_index_bounds(),
>>> >>                          run_test(op, def_img, def_img,
>>> >>                                   invalidate_index_bounds, true),
>>> >>                          "%s/index bounds test", op->name);
>>> >> -- 
>>> >> 2.4.3
>>> >> 
>>> >> _______________________________________________
>>> >> Piglit mailing list
>>> >> Piglit@lists.freedesktop.org
>>> >> http://lists.freedesktop.org/mailman/listinfo/piglit
> _______________________________________________
> Piglit mailing list
> Piglit@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/piglit