tests: fix memory leak

Submitted by Marek Chalupa on Nov. 21, 2014, 10:18 a.m.

Details

Message ID 1416565113-14292-1-git-send-email-mchqwerty@gmail.com
State Accepted
Commit e118c111783f89635986fecdd7990ce7dcb1363b
Headers show

Not browsing as part of any series.

Commit Message

Marek Chalupa Nov. 21, 2014, 10:18 a.m.
We didn't free the struct client that we got from client_connect()

Signed-off-by: Marek Chalupa <mchqwerty@gmail.com>
---
 tests/display-test.c    | 23 +++++++++++++----------
 tests/test-compositor.c |  1 +
 2 files changed, 14 insertions(+), 10 deletions(-)

Patch hide | download patch | download mbox

diff --git a/tests/display-test.c b/tests/display-test.c
index aecf341..776a9c9 100644
--- a/tests/display-test.c
+++ b/tests/display-test.c
@@ -155,6 +155,14 @@  bind_seat(struct wl_client *client, void *data,
 }
 
 static void
+client_disconnect_nocheck(struct client *c)
+{
+	wl_proxy_destroy((struct wl_proxy *) c->tc);
+	wl_display_disconnect(c->wl_display);
+	free(c);
+}
+
+static void
 post_error_main(void)
 {
 	struct client *c = client_connect();
@@ -170,8 +178,7 @@  post_error_main(void)
 	/* don't call client_disconnect(c), because then the test would be
 	 * aborted due to checks for error in this function */
 	wl_proxy_destroy((struct wl_proxy *) seat);
-	wl_proxy_destroy((struct wl_proxy *) c->tc);
-	wl_display_disconnect(c->wl_display);
+	client_disconnect_nocheck(c);
 }
 
 TEST(post_error_to_one_client)
@@ -224,8 +231,7 @@  post_error_main3(void)
 	/* don't call client_disconnect(c), because then the test would be
 	 * aborted due to checks for error in this function */
 	wl_proxy_destroy((struct wl_proxy *) seat);
-	wl_proxy_destroy((struct wl_proxy *) c->tc);
-	wl_display_disconnect(c->wl_display);
+	client_disconnect_nocheck(c);
 }
 
 /* all the testcases could be in one TEST, but splitting it
@@ -294,8 +300,7 @@  post_nomem_main(void)
 	assert(wl_display_get_error(c->wl_display) == ENOMEM);
 
 	wl_proxy_destroy((struct wl_proxy *) seat);
-	wl_proxy_destroy((struct wl_proxy *) c->tc);
-	wl_display_disconnect(c->wl_display);
+	client_disconnect_nocheck(c);
 }
 
 TEST(post_nomem_tst)
@@ -413,8 +418,7 @@  threading_post_err(void)
 	test_set_timeout(3);
 	pthread_join(thread, NULL);
 
-	wl_proxy_destroy((struct wl_proxy *) c->tc);
-	wl_display_disconnect(c->wl_display);
+	client_disconnect_nocheck(c);
 }
 
 TEST(threading_errors_tst)
@@ -565,8 +569,7 @@  threading_read_after_error(void)
 	test_set_timeout(3);
 	pthread_join(thread, NULL);
 
-	wl_proxy_destroy((struct wl_proxy *) c->tc);
-	wl_display_disconnect(c->wl_display);
+	client_disconnect_nocheck(c);
 }
 
 TEST(threading_read_after_error_tst)
diff --git a/tests/test-compositor.c b/tests/test-compositor.c
index 3248e2d..6f86a85 100644
--- a/tests/test-compositor.c
+++ b/tests/test-compositor.c
@@ -452,6 +452,7 @@  client_disconnect(struct client *c)
 
 	wl_proxy_destroy((struct wl_proxy *) c->tc);
 	wl_display_disconnect(c->wl_display);
+	free(c);
 }
 
 /* num is number of clients that requests to stop display.

Comments

On Fri, 21 Nov 2014 11:18:33 +0100
Marek Chalupa <mchqwerty@gmail.com> wrote:

> We didn't free the struct client that we got from client_connect()
> 
> Signed-off-by: Marek Chalupa <mchqwerty@gmail.com>
> ---
>  tests/display-test.c    | 23 +++++++++++++----------
>  tests/test-compositor.c |  1 +
>  2 files changed, 14 insertions(+), 10 deletions(-)

Hi,

how did you find these leaks?

More importantly, why doesn't our leak checker find these?
That is, the code in test-runner.c, run_test().

For one, I think those are not thread-safe. Calling exit() in a test
would bypass it too, but I'm not sure that happens.

The Valgrind output nowadays is quite messy. Even when I run a single
test from one test binary which should not fork at all, I think I see
two forks in Valgrind output: one is likely the debugger check, maybe
the other is actually a thread for the test compositor or something?

I think Valgrind is saying that there are many more leaks to plug.


In any case, this patch pushed.

Thanks,
pq

> 
> diff --git a/tests/display-test.c b/tests/display-test.c
> index aecf341..776a9c9 100644
> --- a/tests/display-test.c
> +++ b/tests/display-test.c
> @@ -155,6 +155,14 @@ bind_seat(struct wl_client *client, void *data,
>  }
>  
>  static void
> +client_disconnect_nocheck(struct client *c)
> +{
> +	wl_proxy_destroy((struct wl_proxy *) c->tc);
> +	wl_display_disconnect(c->wl_display);
> +	free(c);
> +}
> +
> +static void
>  post_error_main(void)
>  {
>  	struct client *c = client_connect();
> @@ -170,8 +178,7 @@ post_error_main(void)
>  	/* don't call client_disconnect(c), because then the test would be
>  	 * aborted due to checks for error in this function */
>  	wl_proxy_destroy((struct wl_proxy *) seat);
> -	wl_proxy_destroy((struct wl_proxy *) c->tc);
> -	wl_display_disconnect(c->wl_display);
> +	client_disconnect_nocheck(c);
>  }
>  
>  TEST(post_error_to_one_client)
> @@ -224,8 +231,7 @@ post_error_main3(void)
>  	/* don't call client_disconnect(c), because then the test would be
>  	 * aborted due to checks for error in this function */
>  	wl_proxy_destroy((struct wl_proxy *) seat);
> -	wl_proxy_destroy((struct wl_proxy *) c->tc);
> -	wl_display_disconnect(c->wl_display);
> +	client_disconnect_nocheck(c);
>  }
>  
>  /* all the testcases could be in one TEST, but splitting it
> @@ -294,8 +300,7 @@ post_nomem_main(void)
>  	assert(wl_display_get_error(c->wl_display) == ENOMEM);
>  
>  	wl_proxy_destroy((struct wl_proxy *) seat);
> -	wl_proxy_destroy((struct wl_proxy *) c->tc);
> -	wl_display_disconnect(c->wl_display);
> +	client_disconnect_nocheck(c);
>  }
>  
>  TEST(post_nomem_tst)
> @@ -413,8 +418,7 @@ threading_post_err(void)
>  	test_set_timeout(3);
>  	pthread_join(thread, NULL);
>  
> -	wl_proxy_destroy((struct wl_proxy *) c->tc);
> -	wl_display_disconnect(c->wl_display);
> +	client_disconnect_nocheck(c);
>  }
>  
>  TEST(threading_errors_tst)
> @@ -565,8 +569,7 @@ threading_read_after_error(void)
>  	test_set_timeout(3);
>  	pthread_join(thread, NULL);
>  
> -	wl_proxy_destroy((struct wl_proxy *) c->tc);
> -	wl_display_disconnect(c->wl_display);
> +	client_disconnect_nocheck(c);
>  }
>  
>  TEST(threading_read_after_error_tst)
> diff --git a/tests/test-compositor.c b/tests/test-compositor.c
> index 3248e2d..6f86a85 100644
> --- a/tests/test-compositor.c
> +++ b/tests/test-compositor.c
> @@ -452,6 +452,7 @@ client_disconnect(struct client *c)
>  
>  	wl_proxy_destroy((struct wl_proxy *) c->tc);
>  	wl_display_disconnect(c->wl_display);
> +	free(c);
>  }
>  
>  /* num is number of clients that requests to stop display.
On 1 December 2014 at 11:42, Pekka Paalanen <ppaalanen@gmail.com> wrote:

> On Fri, 21 Nov 2014 11:18:33 +0100
> Marek Chalupa <mchqwerty@gmail.com> wrote:
>
> > We didn't free the struct client that we got from client_connect()
> >
> > Signed-off-by: Marek Chalupa <mchqwerty@gmail.com>
> > ---
> >  tests/display-test.c    | 23 +++++++++++++----------
> >  tests/test-compositor.c |  1 +
> >  2 files changed, 14 insertions(+), 10 deletions(-)
>
> Hi,
>
> how did you find these leaks?
>

By valgrind --trace-children=yes


>
> More importantly, why doesn't our leak checker find these?
> That is, the code in test-runner.c, run_test().
>

Because it is in another fork. It looks like this:

run_test()
     int the alloc_count  and fds_count
     run the TEST function
          create_client --> fork
                [in fork]   alloc, dealloc, whatever
          wait for client
     check alloc_count and fds_count

The problem is that the forked client has the alloc_count and fds_count
in its own virtual memory, so the leak checker finds only leaks in
TEST function.

I have some patches that extend these leak checks
into the client too, but I ran into problem with filedescriptors there.
The problem is that when forking the client, we pass WAYLAND_SOCKET fd
to wl_display_connect(). wl_display_connect() just uses this fd (no dup),
so wl_display_disconnect() closes this fd.
So the result of leak checker is that the client closed
one more fd that it opened. Hardcoding -1 to client's fds number
is not a solution too, because client does not have to call
wl_display_connect()

So summing it up: using leak checker in clients is possible, but only for
memory
leaks. For fd leaks we'd have to use some more sophisticated solution.


> For one, I think those are not thread-safe. Calling exit() in a test
> would bypass it too, but I'm not sure that happens.
>

Huh? What is not thread-safe? you mean leak checks when running only one
test (without fork?)
Yes, it looks like when calling exit in the test, the leak check is
bypassed.
Quick grep though the tests shows that there are no calls to exit in TEST,
just in
the forked clients (which is not a problem _now_ with no leak checks in
clients)

>
> The Valgrind output nowadays is quite messy. Even when I run a single
> test from one test binary which should not fork at all, I think I see
> two forks in Valgrind output: one is likely the debugger check, maybe
> the other is actually a thread for the test compositor or something?
>

Yes, the first fork is the debugger check and the other is the client
for test compositor. In the tests without the test compositor you should
see only
one fork (the debugger check)


> I think Valgrind is saying that there are many more leaks to plug.
>
>
Yes


>
> In any case, this patch pushed.
>
> Thanks,
> pq
>
> >
> > diff --git a/tests/display-test.c b/tests/display-test.c
> > index aecf341..776a9c9 100644
> > --- a/tests/display-test.c
> > +++ b/tests/display-test.c
> > @@ -155,6 +155,14 @@ bind_seat(struct wl_client *client, void *data,
> >  }
> >
> >  static void
> > +client_disconnect_nocheck(struct client *c)
> > +{
> > +     wl_proxy_destroy((struct wl_proxy *) c->tc);
> > +     wl_display_disconnect(c->wl_display);
> > +     free(c);
> > +}
> > +
> > +static void
> >  post_error_main(void)
> >  {
> >       struct client *c = client_connect();
> > @@ -170,8 +178,7 @@ post_error_main(void)
> >       /* don't call client_disconnect(c), because then the test would be
> >        * aborted due to checks for error in this function */
> >       wl_proxy_destroy((struct wl_proxy *) seat);
> > -     wl_proxy_destroy((struct wl_proxy *) c->tc);
> > -     wl_display_disconnect(c->wl_display);
> > +     client_disconnect_nocheck(c);
> >  }
> >
> >  TEST(post_error_to_one_client)
> > @@ -224,8 +231,7 @@ post_error_main3(void)
> >       /* don't call client_disconnect(c), because then the test would be
> >        * aborted due to checks for error in this function */
> >       wl_proxy_destroy((struct wl_proxy *) seat);
> > -     wl_proxy_destroy((struct wl_proxy *) c->tc);
> > -     wl_display_disconnect(c->wl_display);
> > +     client_disconnect_nocheck(c);
> >  }
> >
> >  /* all the testcases could be in one TEST, but splitting it
> > @@ -294,8 +300,7 @@ post_nomem_main(void)
> >       assert(wl_display_get_error(c->wl_display) == ENOMEM);
> >
> >       wl_proxy_destroy((struct wl_proxy *) seat);
> > -     wl_proxy_destroy((struct wl_proxy *) c->tc);
> > -     wl_display_disconnect(c->wl_display);
> > +     client_disconnect_nocheck(c);
> >  }
> >
> >  TEST(post_nomem_tst)
> > @@ -413,8 +418,7 @@ threading_post_err(void)
> >       test_set_timeout(3);
> >       pthread_join(thread, NULL);
> >
> > -     wl_proxy_destroy((struct wl_proxy *) c->tc);
> > -     wl_display_disconnect(c->wl_display);
> > +     client_disconnect_nocheck(c);
> >  }
> >
> >  TEST(threading_errors_tst)
> > @@ -565,8 +569,7 @@ threading_read_after_error(void)
> >       test_set_timeout(3);
> >       pthread_join(thread, NULL);
> >
> > -     wl_proxy_destroy((struct wl_proxy *) c->tc);
> > -     wl_display_disconnect(c->wl_display);
> > +     client_disconnect_nocheck(c);
> >  }
> >
> >  TEST(threading_read_after_error_tst)
> > diff --git a/tests/test-compositor.c b/tests/test-compositor.c
> > index 3248e2d..6f86a85 100644
> > --- a/tests/test-compositor.c
> > +++ b/tests/test-compositor.c
> > @@ -452,6 +452,7 @@ client_disconnect(struct client *c)
> >
> >       wl_proxy_destroy((struct wl_proxy *) c->tc);
> >       wl_display_disconnect(c->wl_display);
> > +     free(c);
> >  }
> >
> >  /* num is number of clients that requests to stop display.
>
>
Thanks,
Marek
On Wed, 3 Dec 2014 11:44:47 +0100
Marek Chalupa <mchqwerty@gmail.com> wrote:

> On 1 December 2014 at 11:42, Pekka Paalanen <ppaalanen@gmail.com> wrote:
> 
> > On Fri, 21 Nov 2014 11:18:33 +0100
> > Marek Chalupa <mchqwerty@gmail.com> wrote:
> >
> > > We didn't free the struct client that we got from client_connect()
> > >
> > > Signed-off-by: Marek Chalupa <mchqwerty@gmail.com>
> > > ---
> > >  tests/display-test.c    | 23 +++++++++++++----------
> > >  tests/test-compositor.c |  1 +
> > >  2 files changed, 14 insertions(+), 10 deletions(-)
> >
> > Hi,
> >
> > how did you find these leaks?
> >
> 
> By valgrind --trace-children=yes
> 
> 
> >
> > More importantly, why doesn't our leak checker find these?
> > That is, the code in test-runner.c, run_test().
> >
> 
> Because it is in another fork. It looks like this:
> 
> run_test()
>      int the alloc_count  and fds_count
>      run the TEST function
>           create_client --> fork
>                 [in fork]   alloc, dealloc, whatever
>           wait for client
>      check alloc_count and fds_count
> 
> The problem is that the forked client has the alloc_count and fds_count
> in its own virtual memory, so the leak checker finds only leaks in
> TEST function.
> 
> I have some patches that extend these leak checks
> into the client too, but I ran into problem with filedescriptors there.
> The problem is that when forking the client, we pass WAYLAND_SOCKET fd
> to wl_display_connect(). wl_display_connect() just uses this fd (no dup),
> so wl_display_disconnect() closes this fd.
> So the result of leak checker is that the client closed
> one more fd that it opened. Hardcoding -1 to client's fds number
> is not a solution too, because client does not have to call
> wl_display_connect()
> 
> So summing it up: using leak checker in clients is possible, but only for
> memory
> leaks. For fd leaks we'd have to use some more sophisticated solution.

Erf, more complicated than I imagined.

I think the fd leak thing should be easy to fix though: I'm not sure
the strict equality check for open fds makes too much sense, it would
be also good if there are less fds open in the end than in the
beginning. Right? So fail only if in the end there are more fds open.

That seems more useful to me than not doing fd leak checks at all.
Maybe we can even do it differently in TEST vs. clients?

> > For one, I think those are not thread-safe. Calling exit() in a test
> > would bypass it too, but I'm not sure that happens.
> >
> 
> Huh? What is not thread-safe? you mean leak checks when running only one
> test (without fork?)

I mean e.g. this function

__attribute__ ((visibility("default"))) void *
malloc(size_t size)
{
        num_alloc++;
        return sys_malloc(size);
}

is not thread-safe, because it read-modify-writes a global variable
without any explicit synchronization guarantees. It's not using a mutex
nor a guaranteed-atomic variable or operation. We just get lucky on
most arches, I assume. Or at least on x86?

> Yes, it looks like when calling exit in the test, the leak check is
> bypassed.
> Quick grep though the tests shows that there are no calls to exit in TEST,
> just in
> the forked clients (which is not a problem _now_ with no leak checks in
> clients)

Yeah.

> > The Valgrind output nowadays is quite messy. Even when I run a single
> > test from one test binary which should not fork at all, I think I see
> > two forks in Valgrind output: one is likely the debugger check, maybe
> > the other is actually a thread for the test compositor or something?
> >
> 
> Yes, the first fork is the debugger check and the other is the client
> for test compositor. In the tests without the test compositor you should
> see only
> one fork (the debugger check)

I suppose we need to live with that.

> > I think Valgrind is saying that there are many more leaks to plug.
> >
> >
> Yes


Thanks,
pq
On 4 December 2014 at 12:12, Pekka Paalanen <ppaalanen@gmail.com> wrote:

> On Wed, 3 Dec 2014 11:44:47 +0100
> Marek Chalupa <mchqwerty@gmail.com> wrote:
>
> > On 1 December 2014 at 11:42, Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > > On Fri, 21 Nov 2014 11:18:33 +0100
> > > Marek Chalupa <mchqwerty@gmail.com> wrote:
> > >
> > > > We didn't free the struct client that we got from client_connect()
> > > >
> > > > Signed-off-by: Marek Chalupa <mchqwerty@gmail.com>
> > > > ---
> > > >  tests/display-test.c    | 23 +++++++++++++----------
> > > >  tests/test-compositor.c |  1 +
> > > >  2 files changed, 14 insertions(+), 10 deletions(-)
> > >
> > > Hi,
> > >
> > > how did you find these leaks?
> > >
> >
> > By valgrind --trace-children=yes
> >
> >
> > >
> > > More importantly, why doesn't our leak checker find these?
> > > That is, the code in test-runner.c, run_test().
> > >
> >
> > Because it is in another fork. It looks like this:
> >
> > run_test()
> >      int the alloc_count  and fds_count
> >      run the TEST function
> >           create_client --> fork
> >                 [in fork]   alloc, dealloc, whatever
> >           wait for client
> >      check alloc_count and fds_count
> >
> > The problem is that the forked client has the alloc_count and fds_count
> > in its own virtual memory, so the leak checker finds only leaks in
> > TEST function.
> >
> > I have some patches that extend these leak checks
> > into the client too, but I ran into problem with filedescriptors there.
> > The problem is that when forking the client, we pass WAYLAND_SOCKET fd
> > to wl_display_connect(). wl_display_connect() just uses this fd (no dup),
> > so wl_display_disconnect() closes this fd.
> > So the result of leak checker is that the client closed
> > one more fd that it opened. Hardcoding -1 to client's fds number
> > is not a solution too, because client does not have to call
> > wl_display_connect()
> >
> > So summing it up: using leak checker in clients is possible, but only for
> > memory
> > leaks. For fd leaks we'd have to use some more sophisticated solution.
>
> Erf, more complicated than I imagined.
>
> I think the fd leak thing should be easy to fix though: I'm not sure
> the strict equality check for open fds makes too much sense, it would
> be also good if there are less fds open in the end than in the
> beginning. Right? So fail only if in the end there are more fds open.
>
> That seems more useful to me than not doing fd leak checks at all.
> Maybe we can even do it differently in TEST vs. clients?
>

In my branch I chose even different solution. Instead of doing it
differently in TEST and clients,
I did it differently in normal TESTs and sanity TESTs. Sanity tests are
currently the
only one that are not using wl_display_connect, so I added a macro
TEST_CLIENT_NOT_USING_WL_DISPLAY_CONNECT which just fixed the leak checks
for the sanity tests (or any other test that do not use
wl_display_connect() -- but there is none atm).


> > > For one, I think those are not thread-safe. Calling exit() in a test
> > > would bypass it too, but I'm not sure that happens.
> > >
> >
> > Huh? What is not thread-safe? you mean leak checks when running only one
> > test (without fork?)
>
> I mean e.g. this function
>
> __attribute__ ((visibility("default"))) void *
> malloc(size_t size)
> {
>         num_alloc++;
>         return sys_malloc(size);
> }
>
> is not thread-safe, because it read-modify-writes a global variable
> without any explicit synchronization guarantees. It's not using a mutex
> nor a guaranteed-atomic variable or operation. We just get lucky on
> most arches, I assume. Or at least on x86?
>

Dunno, haven't tried anywhere else that on x86_64...
Anyway, in my development branch I added support for atomic
increasing/decreasing
of the counter.


>
> > Yes, it looks like when calling exit in the test, the leak check is
> > bypassed.
> > Quick grep though the tests shows that there are no calls to exit in
> TEST,
> > just in
> > the forked clients (which is not a problem _now_ with no leak checks in
> > clients)
>
> Yeah.
>
> > > The Valgrind output nowadays is quite messy. Even when I run a single
> > > test from one test binary which should not fork at all, I think I see
> > > two forks in Valgrind output: one is likely the debugger check, maybe
> > > the other is actually a thread for the test compositor or something?
> > >
> >
> > Yes, the first fork is the debugger check and the other is the client
> > for test compositor. In the tests without the test compositor you should
> > see only
> > one fork (the debugger check)
>
> I suppose we need to live with that.
>
> > > I think Valgrind is saying that there are many more leaks to plug.
> > >
> > >
> > Yes
>

Still keep the patches on github, because the leak checker made the test to
fail
and I'd like to check if it is correct or I have a bug in the code
somewhere.
If you're (or anybody is) interested, it is here:

https://github.com/mchalupa/wayland/tree/tc-client-leaks


>
>
> Thanks,
> pq
>

Regards,
Marek