framework: Handle tests with subtests crashing in totals

Submitted by Dylan Baker on Jan. 19, 2018, 10:04 p.m.

Details

Message ID 20180119220413.9347-1-dylan@pnwbakers.com
State New
Series "framework: Handle tests with subtests crashing in totals" ( rev: 1 ) in Piglit


Commit Message

Dylan Baker Jan. 19, 2018, 10:04 p.m.
Currently piglit doesn't account for a test with subtests crashing when
it calculates the total number of tests of each status. If a test with
subtests crashes before running any of them, it is handled correctly
(since it goes down the non-subtest path); but if one or more subtests
are run, and those subtests return a better result than crash, then the
test will be counted with that better status instead.

The real problem is that the python framework has no idea how many
subtests a test binary is going to run, so if the test crashes it
cannot tell whether some subtests were never run. To paper over that,
if the result of a test is not the same as the worst result of its
subtests, we treat the test as a single test rather than a group; this
makes the summaries generate the expected results.
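The heuristic can be sketched in isolation. This is a simplified model, not piglit's actual code: real piglit statuses are rich objects defined in framework/status.py, and the integer ranks and status names used here are illustrative assumptions.

```python
# Simplified model of piglit-style statuses, ordered from best to worst.
# The integer ranks are an illustrative assumption; real piglit statuses
# are ordered objects from framework/status.py.
STATUS_RANK = {"pass": 0, "warn": 1, "fail": 2, "crash": 3}


def worst(statuses):
    """Return the worst status in an iterable, like max() over piglit statuses."""
    return max(statuses, key=STATUS_RANK.__getitem__)


def count_as_group(result_status, subtests):
    """Count the test as a group of subtests only when the overall result
    matches the worst subtest result; otherwise count it as one test."""
    return bool(subtests) and result_status == worst(subtests.values())


# A binary that crashed after two passing subtests is counted once, as crash:
print(count_as_group("crash", {"a": "pass", "b": "pass"}))  # False
# A binary whose worst subtest matches its overall result is counted per subtest:
print(count_as_group("fail", {"a": "pass", "b": "fail"}))   # True
```

With this check, a crash that outranks every recorded subtest result sends the test down the single-test path, so the crash shows up once in the totals instead of being masked by the better subtest statuses.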

A better fix would be for tests with subtests to inform the framework
(preferably via JSON) of all the subtests they will run before they
start running, so that the python framework can pre-populate the
subtests and generate the right result.
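The idea could work along these lines. Both the wire format and the field names here are hypothetical; this is only a sketch of the pre-population step, not the design of the series that was later posted.

```python
# Hypothetical sketch: if the binary announced its subtests up front
# (e.g. as a JSON line on stdout before running), the framework could
# pre-populate every subtest as "notrun" and let real results overwrite
# them as they arrive. "subtests" as a field name is an assumption.
import json

announcement = json.loads('{"subtests": ["a", "b", "c"]}')

# Every announced subtest starts out as notrun.
subtest_results = {name: "notrun" for name in announcement["subtests"]}

# The binary runs and reports a result for "a"...
subtest_results["a"] = "pass"

# ...then crashes; "b" and "c" visibly remain unrun instead of vanishing.
print(subtest_results)  # {'a': 'pass', 'b': 'notrun', 'c': 'notrun'}
```

The benefit is that a crash can no longer hide subtests: anything still marked notrun after the process exits abnormally is known to have been skipped.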

This solution is better in the short term because it makes the results
consistent: whether or not a test crashes, it will produce the same results.

Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
---
 framework/results.py | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)


diff --git a/framework/results.py b/framework/results.py
index 99dd3735b..4c7266208 100644
--- a/framework/results.py
+++ b/framework/results.py
@@ -329,7 +329,12 @@  class TestrunResult(object):
         for name, result in six.iteritems(self.tests):
             # If there are subtests treat the test as if it is a group instead
             # of a test.
-            if result.subtests:
+            # FIXME: If the overall test crashed then don't treat it as a
+            # group; ignore all of the subtests and report the test as a
+            # crash. This is just papering over the fact that the binaries
+            # don't inform the python layer how many subtests (and which
+            # subtests) they intend to run.
+            if result.subtests and result.result == max(six.itervalues(result.subtests)):
                 for res in six.itervalues(result.subtests):
                     res = str(res)
                     temp = name

Comments

I meant to CC Fabien as well...

On Fri, 2018-01-19 at 15:26 -0800, Dylan Baker wrote:
> I mean to CC Fabien as well...

I think you did, at least the email looks correct.

I'm not sure how to test this change. After reverting Fabien's patch
things went back to normal. The tests on Jan 17 and 18 ran with Fabien's
change; the rest ran without it [0].

thanks,
Jan


[0] http://paul.rutgers.edu/~jv356/piglit/radeon-latest-5/problems.html

On 2018-01-19 23:04, Dylan Baker wrote:

With this patch, tests that crash after a call to
piglit_report_subtest_result aren't included in the regressions and
fixes totals when compared to a passing test run.
Also, the crash still doesn't show up in the result summary.

However, since you already posted the 8-part series with a proper fix,
I'd suggest using that instead.