[i-g-t,3/4] lib: Don't leak children in igt_waitchildren_timeout

Submitted by Daniel Vetter on Feb. 7, 2019, 2:57 p.m.

Details

Message ID 20190207145707.27623-3-daniel.vetter@ffwll.ch
State New
Headers show
Series "Series without cover letter" ( rev: 2 1 ) in IGT

Not browsing as part of any series.

Commit Message

Daniel Vetter Feb. 7, 2019, 2:57 p.m.
Instead of cleaning up the mess in igt_exit make sure we don't even
let it out of the container. See also

commit 754876378d6c9b2775e8c07b4d16f9878c55949f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Feb 26 22:11:10 2016 +0000

    igt/gem_sync: Enforce a timeout of 20s

which added this helper.

To make sure that everyone follows the rules, add an assert.

We're keeping the cleanup code as a failsafe, and because it speeds
up the testcase I'm following up with.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 lib/igt_core.c | 6 ++++++
 1 file changed, 6 insertions(+)

Patch hide | download patch | download mbox

diff --git a/lib/igt_core.c b/lib/igt_core.c
index 49fbf70deb06..a7105f0591fc 100644
--- a/lib/igt_core.c
+++ b/lib/igt_core.c
@@ -1525,6 +1525,7 @@  void igt_exit(void)
 
 	for (int c = 0; c < num_test_children; c++)
 		kill(test_children[c], SIGKILL);
+	assert(!num_test_children);
 
 	if (!test_with_subtests) {
 		struct timespec now;
@@ -1842,6 +1843,11 @@  void igt_waitchildren_timeout(int seconds, const char *reason)
 	igt_set_timeout(seconds, reason);
 	igt_waitchildren();
 	igt_reset_timeout();
+
+	for (int c = 0; c < num_test_children; c++)
+		kill(test_children[c], SIGKILL);
+
+	__igt_waitchildren();
 }
 
 /* exit handler code */

Comments

Quoting Daniel Vetter (2019-02-07 14:57:06)
> Instead of cleaning up the mess in igt_exit make sure we don't even
> let it out of the container. See also
> 
> commit 754876378d6c9b2775e8c07b4d16f9878c55949f
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Fri Feb 26 22:11:10 2016 +0000
> 
>     igt/gem_sync: Enforce a timeout of 20s
> 
> which added this helper.
> 
> To make sure that everyone follows the rules, add an assert.
> 
> We're keeping the cleanup code as a failsafe, and because it speeds
> up the testcase I'm following up with.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  lib/igt_core.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/igt_core.c b/lib/igt_core.c
> index 49fbf70deb06..a7105f0591fc 100644
> --- a/lib/igt_core.c
> +++ b/lib/igt_core.c
> @@ -1525,6 +1525,7 @@ void igt_exit(void)
>  
>         for (int c = 0; c < num_test_children; c++)
>                 kill(test_children[c], SIGKILL);
> +       assert(!num_test_children);
>  
>         if (!test_with_subtests) {
>                 struct timespec now;
> @@ -1842,6 +1843,11 @@ void igt_waitchildren_timeout(int seconds, const char *reason)
>         igt_set_timeout(seconds, reason);
>         igt_waitchildren();
>         igt_reset_timeout();
> +
> +       for (int c = 0; c < num_test_children; c++)
> +               kill(test_children[c], SIGKILL);
> +
> +       __igt_waitchildren();

How did we escape?

If the SIGALRM fired, we hit igt_fail() in the parent with children in
tow, otherwise we completed igt_waitchildren() successfully and cleaned
everything up.
-Chris
On Thu, Feb 07, 2019 at 03:20:37PM +0000, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-02-07 14:57:06)
> > Instead of cleaning up the mess in igt_exit make sure we don't even
> > let it out of the container. See also
> > 
> > commit 754876378d6c9b2775e8c07b4d16f9878c55949f
> > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > Date:   Fri Feb 26 22:11:10 2016 +0000
> > 
> >     igt/gem_sync: Enforce a timeout of 20s
> > 
> > which added this helper.
> > 
> > To make sure that everyone follows the rules, add an assert.
> > 
> > We're keeping the cleanup code as a failsafe, and because it speeds
> > up the testcase I'm following up with.
> > 
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> >  lib/igt_core.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > index 49fbf70deb06..a7105f0591fc 100644
> > --- a/lib/igt_core.c
> > +++ b/lib/igt_core.c
> > @@ -1525,6 +1525,7 @@ void igt_exit(void)
> >  
> >         for (int c = 0; c < num_test_children; c++)
> >                 kill(test_children[c], SIGKILL);
> > +       assert(!num_test_children);
> >  
> >         if (!test_with_subtests) {
> >                 struct timespec now;
> > @@ -1842,6 +1843,11 @@ void igt_waitchildren_timeout(int seconds, const char *reason)
> >         igt_set_timeout(seconds, reason);
> >         igt_waitchildren();
> >         igt_reset_timeout();
> > +
> > +       for (int c = 0; c < num_test_children; c++)
> > +               kill(test_children[c], SIGKILL);
> > +
> > +       __igt_waitchildren();
> 
> How did we escape?
> 
> If the SIGALRM fired, we hit igt_fail() in the parent with children in
> tow, otherwise we completed igt_waitchildren() successfully and cleaned
> everything up.

Hm right, this does nothing.

The motivation for this is that I noticed that without an explicit
igt_waitchildren() the igt_assert forwarding doesn't work. So I've tried
to catch that kind of leaking.

The trouble that does happen is if you run multiple tests, then the
children do leak into the next subtest (since igt_exit is only called at
the very end). That's the usual trouble with our exit handlers not
properly stacking. In the end I wanted to put a wait() into igt_exit and
check that it gives us ECHILD, to make sure everything is cleaned up.

So not sure what to do ... any ideas?
-Daniel
On Thu, Feb 07, 2019 at 05:06:55PM +0100, Daniel Vetter wrote:
> On Thu, Feb 07, 2019 at 03:20:37PM +0000, Chris Wilson wrote:
> > Quoting Daniel Vetter (2019-02-07 14:57:06)
> > > Instead of cleaning up the mess in igt_exit make sure we don't even
> > > let it out of the container. See also
> > > 
> > > commit 754876378d6c9b2775e8c07b4d16f9878c55949f
> > > Author: Chris Wilson <chris@chris-wilson.co.uk>
> > > Date:   Fri Feb 26 22:11:10 2016 +0000
> > > 
> > >     igt/gem_sync: Enforce a timeout of 20s
> > > 
> > > which added this helper.
> > > 
> > > To make sure that everyone follows the rules, add an assert.
> > > 
> > > We're keeping the cleanup code as a failsafe, and because it speeds
> > > up the testcase I'm following up with.
> > > 
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > ---
> > >  lib/igt_core.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > > index 49fbf70deb06..a7105f0591fc 100644
> > > --- a/lib/igt_core.c
> > > +++ b/lib/igt_core.c
> > > @@ -1525,6 +1525,7 @@ void igt_exit(void)
> > >  
> > >         for (int c = 0; c < num_test_children; c++)
> > >                 kill(test_children[c], SIGKILL);
> > > +       assert(!num_test_children);
> > >  
> > >         if (!test_with_subtests) {
> > >                 struct timespec now;
> > > @@ -1842,6 +1843,11 @@ void igt_waitchildren_timeout(int seconds, const char *reason)
> > >         igt_set_timeout(seconds, reason);
> > >         igt_waitchildren();
> > >         igt_reset_timeout();
> > > +
> > > +       for (int c = 0; c < num_test_children; c++)
> > > +               kill(test_children[c], SIGKILL);
> > > +
> > > +       __igt_waitchildren();
> > 
> > How did we escape?
> > 
> > If the SIGALRM fired, we hit igt_fail() in the parent with children in
> > tow, otherwise we completed igt_waitchildren() successfully and cleaned
> > everything up.
> 
> Hm right, this does nothing.
> 
> The motivation for this is that I noticed that without an explicit
> igt_waitchildren() the igt_assert forwarding doesn't work. So I've tried
> to catch that kind of leaking.
> 
> The trouble that does happen is if you run multiple tests, then the
> children do leak into the next subtest (since igt_exit is only called at
> the very end). That's the usual trouble with our exit handlers not
> properly stacking. In the end I wanted to put a wait() into igt_exit and
> check that it gives us ECHILD, to make sure everything is cleaned up.
> 
> So not sure what to do ... any ideas?

I realized my testcase was also testing for the wrong thing, so resending
both.
-Daniel