[spice-server,5/5] red-client: Prevent some nasty threading problems disconnecting channels

Submitted by Frediano Ziglio on Aug. 24, 2017, 2:37 p.m.

Details

Message ID: 20170824143720.7805-5-fziglio@redhat.com
State: New
Series: "Series without cover letter" (rev 1) in Spice


Commit Message

Frediano Ziglio Aug. 24, 2017, 2:37 p.m.
The channels list was not protected by a lock in red_client_destroy.
A RedChannelClient object could, for instance, be created and added to
the list while the list was being scanned, so the list could be
modified during iteration, with all the usual race issues.
Another issue was that it was not clear how ownership was managed.
When red_client_destroy was called it called red_channel_client_destroy,
which freed the RedChannelClient object, implying that the client list
owned a reference; however, the same red_channel_client_destroy call
was also attempted by RedChannel through its own list. Basically the
two pointers (the one from the channel and the one from the client)
were treated as a single ownership. To avoid the confusion, the client
list now always drops its reference when a channel client is removed.
Another issue was that disconnecting a single channel from the client
left a stale channel client in the server until the client disconnected
entirely. Calling red_client_remove_channel from
red_channel_client_disconnect fixes this last issue.

Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
---
 server/red-channel-client.c | 11 ++++++++++-
 server/red-client.c         | 48 ++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 53 insertions(+), 6 deletions(-)
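For illustration only (this is not part of the patch), a minimal sketch of
the race described in the commit message and of the pop-under-lock loop the
patch switches to. The types and function names here (Client,
client_add_channel, client_destroy_*) are simplified stand-ins, not the real
spice-server ones:

    /* Sketch only, simplified from the real RedClient/RedChannelClient. */
    #include <glib.h>
    #include <pthread.h>

    typedef struct {
        pthread_mutex_t lock;
        GList *channels;          /* RedChannelClient pointers */
    } Client;

    /* channel thread: adds a channel client under the lock */
    static void client_add_channel(Client *client, gpointer rcc)
    {
        pthread_mutex_lock(&client->lock);
        client->channels = g_list_prepend(client->channels, rcc);
        pthread_mutex_unlock(&client->lock);
    }

    /* main thread, before the patch: walks the list WITHOUT the lock, so a
     * concurrent add (or remove) can change the list during the iteration */
    static void client_destroy_unsafe(Client *client)
    {
        for (GList *l = client->channels; l != NULL; l = l->next) {
            /* disconnect/destroy l->data ... racy */
        }
    }

    /* main thread, after the patch: pop the head under the lock, release the
     * lock while disconnecting (to avoid deadlocks), then take it again */
    static void client_destroy_safe(Client *client)
    {
        pthread_mutex_lock(&client->lock);
        while (client->channels) {
            gpointer rcc = client->channels->data;
            client->channels = g_list_delete_link(client->channels,
                                                  client->channels);
            pthread_mutex_unlock(&client->lock);
            (void)rcc;            /* disconnect and unref rcc here, unlocked */
            pthread_mutex_lock(&client->lock);
        }
        pthread_mutex_unlock(&client->lock);
    }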

diff --git a/server/red-channel-client.c b/server/red-channel-client.c
index 1162ddca..b236e3bd 100644
--- a/server/red-channel-client.c
+++ b/server/red-channel-client.c
@@ -696,6 +696,8 @@  static void red_channel_client_ping_timer(void *opaque)
 {
     RedChannelClient *rcc = opaque;
 
+    g_object_ref(rcc);
+
     spice_assert(rcc->priv->latency_monitor.state == PING_STATE_TIMER);
     red_channel_client_cancel_ping_timer(rcc);
 
@@ -718,6 +720,7 @@  static void red_channel_client_ping_timer(void *opaque)
     /* More portable alternative code path (less accurate but avoids bogus ioctls)*/
     red_channel_client_push_ping(rcc);
 #endif /* ifdef HAVE_LINUX_SOCKIOS_H */
+    g_object_unref(rcc);
 }
 
 static inline int red_channel_client_waiting_for_ack(RedChannelClient *rcc)
@@ -749,6 +752,8 @@  static void red_channel_client_connectivity_timer(void *opaque)
     RedChannelClientConnectivityMonitor *monitor = &rcc->priv->connectivity_monitor;
     int is_alive = TRUE;
 
+    g_object_ref(rcc);
+
     if (monitor->state == CONNECTIVITY_STATE_BLOCKED) {
         if (!monitor->received_bytes && !monitor->sent_bytes) {
             if (!red_channel_client_is_blocked(rcc) && !red_channel_client_waiting_for_ack(rcc)) {
@@ -793,6 +798,7 @@  static void red_channel_client_connectivity_timer(void *opaque)
                       rcc, type, id, monitor->timeout);
         red_channel_client_disconnect(rcc);
     }
+    g_object_unref(rcc);
 }
 
 void red_channel_client_start_connectivity_monitoring(RedChannelClient *rcc, uint32_t timeout_ms)
@@ -1012,7 +1018,6 @@  void red_channel_client_destroy(RedChannelClient *rcc)
     rcc->priv->destroying = TRUE;
     red_channel_client_disconnect(rcc);
     red_client_remove_channel(rcc);
-    g_object_unref(rcc);
 }
 
 void red_channel_client_shutdown(RedChannelClient *rcc)
@@ -1709,6 +1714,10 @@  void red_channel_client_disconnect(RedChannelClient *rcc)
     }
     red_channel_remove_client(channel, rcc);
     red_channel_on_disconnect(channel, rcc);
+    // remove client from RedClient
+    // NOTE this may trigger the free of the object, if we are in a watch/timer
+    // we should make sure we keep a reference
+    red_client_remove_channel(rcc);
 }
 
 gboolean red_channel_client_is_blocked(RedChannelClient *rcc)
diff --git a/server/red-client.c b/server/red-client.c
index 3ce09e33..7394bd1f 100644
--- a/server/red-client.c
+++ b/server/red-client.c
@@ -192,9 +192,6 @@  void red_client_migrate(RedClient *client)
 
 void red_client_destroy(RedClient *client)
 {
-    RedChannelClient *rcc;
-
-    spice_printerr("destroy client %p with #channels=%d", client, g_list_length(client->channels));
     if (!pthread_equal(pthread_self(), client->thread_id)) {
         spice_warning("client->thread_id (0x%lx) != pthread_self (0x%lx)."
                       "If one of the threads is != io-thread && != vcpu-thread,"
@@ -202,22 +199,46 @@  void red_client_destroy(RedClient *client)
                       client->thread_id,
                       pthread_self());
     }
-    FOREACH_CHANNEL_CLIENT(client, rcc) {
+
+    pthread_mutex_lock(&client->lock);
+    spice_printerr("destroy client %p with #channels=%d", client, g_list_length(client->channels));
+    // this make sure that possible RedChannelClient setting up
+    // to be added won't be added to the list
+    client->disconnecting = TRUE;
+    while (client->channels) {
         RedChannel *channel;
+        RedChannelClient *rcc = client->channels->data;
+
+        // Remove the list to the RedChannelClient we are processing
+        // note that we own the object so is safe to do some operations on it.
+        // This manual scan of the list is done to have a thread safe
+        // iteration of the list
+        client->channels = g_list_delete_link(client->channels, client->channels);
+
         // some channels may be in other threads, so disconnection
         // is not synchronous.
         channel = red_channel_client_get_channel(rcc);
         red_channel_client_set_destroying(rcc);
+
+        // prevent dead lock disconnecting rcc (which can happen
+        // in the same thread or synchronously on another one)
+        pthread_mutex_unlock(&client->lock);
+
         // some channels may be in other threads. However we currently
         // assume disconnect is synchronous (we changed the dispatcher
         // to wait for disconnection)
         // TODO: should we go back to async. For this we need to use
         // ref count for channel clients.
         red_channel_disconnect_client(channel, rcc);
+
         spice_assert(red_channel_client_pipe_is_empty(rcc));
         spice_assert(red_channel_client_no_item_being_sent(rcc));
+
         red_channel_client_destroy(rcc);
+        g_object_unref(rcc);
+        pthread_mutex_lock(&client->lock);
     }
+    pthread_mutex_unlock(&client->lock);
     g_object_unref(client);
 }
 
@@ -254,6 +275,16 @@  gboolean red_client_add_channel(RedClient *client, RedChannelClient *rcc, GError
     pthread_mutex_lock(&client->lock);
 
     g_object_get(channel, "channel-type", &type, "id", &id, NULL);
+    if (client->disconnecting) {
+        g_set_error(error,
+                    SPICE_SERVER_ERROR,
+                    SPICE_SERVER_ERROR_FAILED,
+                    "Client %p got disconnected while connecting channel type %d id %d",
+                    client, type, id);
+        result = FALSE;
+        goto cleanup;
+    }
+
     if (red_client_get_channel(client, type, id)) {
         g_set_error(error,
                     SPICE_SERVER_ERROR,
@@ -316,10 +347,17 @@  int red_client_during_migrate_at_target(RedClient *client)
 
 void red_client_remove_channel(RedChannelClient *rcc)
 {
+    GList *link;
     RedClient *client = red_channel_client_get_client(rcc);
     pthread_mutex_lock(&client->lock);
-    client->channels = g_list_remove(client->channels, rcc);
+    link = g_list_find(client->channels, rcc);
+    if (link) {
+        client->channels = g_list_remove_link(client->channels, link);
+    }
     pthread_mutex_unlock(&client->lock);
+    if (link) {
+        g_object_unref(rcc);
+    }
 }
 
 /* returns TRUE If all channels are finished migrating, FALSE otherwise */

Comments

On Thu, Aug 24, 2017 at 03:37:20PM +0100, Frediano Ziglio wrote:
> The channels list was not protected by a lock in red_client_destroy.
> A RedChannelClient object could, for instance, be created and added to
> the list while the list was being scanned, so the list could be
> modified during iteration, with all the usual race issues.
> Another issue was that it was not clear how ownership was managed.
> When red_client_destroy was called it called red_channel_client_destroy,
> which freed the RedChannelClient object, implying that the client list
> owned a reference; however, the same red_channel_client_destroy call
> was also attempted by RedChannel through its own list. Basically the
> two pointers (the one from the channel and the one from the client)
> were treated as a single ownership. To avoid the confusion, the client
> list now always drops its reference when a channel client is removed.
> Another issue was that disconnecting a single channel from the client
> left a stale channel client in the server until the client disconnected
> entirely. Calling red_client_remove_channel from
> red_channel_client_disconnect fixes this last issue.

Just reading the log, I get the impression there are 3 fixes bundled
together in this commit. Given its title ("nasty threading problems"),
this is really not a nice trick to play on the person reviewing the patch :(

Christophe

> 
> Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
> ---
>  server/red-channel-client.c | 11 ++++++++++-
>  server/red-client.c         | 48 ++++++++++++++++++++++++++++++++++++++++-----
>  2 files changed, 53 insertions(+), 6 deletions(-)
> 
> diff --git a/server/red-channel-client.c b/server/red-channel-client.c
> index 1162ddca..b236e3bd 100644
> --- a/server/red-channel-client.c
> +++ b/server/red-channel-client.c
> @@ -696,6 +696,8 @@ static void red_channel_client_ping_timer(void *opaque)
>  {
>      RedChannelClient *rcc = opaque;
>  
> +    g_object_ref(rcc);
> +
>      spice_assert(rcc->priv->latency_monitor.state == PING_STATE_TIMER);
>      red_channel_client_cancel_ping_timer(rcc);
>  
> @@ -718,6 +720,7 @@ static void red_channel_client_ping_timer(void *opaque)
>      /* More portable alternative code path (less accurate but avoids bogus ioctls)*/
>      red_channel_client_push_ping(rcc);
>  #endif /* ifdef HAVE_LINUX_SOCKIOS_H */
> +    g_object_unref(rcc);
>  }
>  
>  static inline int red_channel_client_waiting_for_ack(RedChannelClient *rcc)
> @@ -749,6 +752,8 @@ static void red_channel_client_connectivity_timer(void *opaque)
>      RedChannelClientConnectivityMonitor *monitor = &rcc->priv->connectivity_monitor;
>      int is_alive = TRUE;
>  
> +    g_object_ref(rcc);
> +
>      if (monitor->state == CONNECTIVITY_STATE_BLOCKED) {
>          if (!monitor->received_bytes && !monitor->sent_bytes) {
>              if (!red_channel_client_is_blocked(rcc) && !red_channel_client_waiting_for_ack(rcc)) {
> @@ -793,6 +798,7 @@ static void red_channel_client_connectivity_timer(void *opaque)
>                        rcc, type, id, monitor->timeout);
>          red_channel_client_disconnect(rcc);
>      }
> +    g_object_unref(rcc);
>  }
>  
>  void red_channel_client_start_connectivity_monitoring(RedChannelClient *rcc, uint32_t timeout_ms)
> @@ -1012,7 +1018,6 @@ void red_channel_client_destroy(RedChannelClient *rcc)
>      rcc->priv->destroying = TRUE;
>      red_channel_client_disconnect(rcc);
>      red_client_remove_channel(rcc);
> -    g_object_unref(rcc);
>  }
>  
>  void red_channel_client_shutdown(RedChannelClient *rcc)
> @@ -1709,6 +1714,10 @@ void red_channel_client_disconnect(RedChannelClient *rcc)
>      }
>      red_channel_remove_client(channel, rcc);
>      red_channel_on_disconnect(channel, rcc);
> +    // remove client from RedClient
> +    // NOTE this may trigger the free of the object, if we are in a watch/timer
> +    // we should make sure we keep a reference
> +    red_client_remove_channel(rcc);
>  }
>  
>  gboolean red_channel_client_is_blocked(RedChannelClient *rcc)
> diff --git a/server/red-client.c b/server/red-client.c
> index 3ce09e33..7394bd1f 100644
> --- a/server/red-client.c
> +++ b/server/red-client.c
> @@ -192,9 +192,6 @@ void red_client_migrate(RedClient *client)
>  
>  void red_client_destroy(RedClient *client)
>  {
> -    RedChannelClient *rcc;
> -
> -    spice_printerr("destroy client %p with #channels=%d", client, g_list_length(client->channels));
>      if (!pthread_equal(pthread_self(), client->thread_id)) {
>          spice_warning("client->thread_id (0x%lx) != pthread_self (0x%lx)."
>                        "If one of the threads is != io-thread && != vcpu-thread,"
> @@ -202,22 +199,46 @@ void red_client_destroy(RedClient *client)
>                        client->thread_id,
>                        pthread_self());
>      }
> -    FOREACH_CHANNEL_CLIENT(client, rcc) {
> +
> +    pthread_mutex_lock(&client->lock);
> +    spice_printerr("destroy client %p with #channels=%d", client, g_list_length(client->channels));
> +    // this make sure that possible RedChannelClient setting up
> +    // to be added won't be added to the list
> +    client->disconnecting = TRUE;
> +    while (client->channels) {
>          RedChannel *channel;
> +        RedChannelClient *rcc = client->channels->data;
> +
> +        // Remove the list to the RedChannelClient we are processing
> +        // note that we own the object so is safe to do some operations on it.
> +        // This manual scan of the list is done to have a thread safe
> +        // iteration of the list
> +        client->channels = g_list_delete_link(client->channels, client->channels);
> +
>          // some channels may be in other threads, so disconnection
>          // is not synchronous.
>          channel = red_channel_client_get_channel(rcc);
>          red_channel_client_set_destroying(rcc);
> +
> +        // prevent dead lock disconnecting rcc (which can happen
> +        // in the same thread or synchronously on another one)
> +        pthread_mutex_unlock(&client->lock);
> +
>          // some channels may be in other threads. However we currently
>          // assume disconnect is synchronous (we changed the dispatcher
>          // to wait for disconnection)
>          // TODO: should we go back to async. For this we need to use
>          // ref count for channel clients.
>          red_channel_disconnect_client(channel, rcc);
> +
>          spice_assert(red_channel_client_pipe_is_empty(rcc));
>          spice_assert(red_channel_client_no_item_being_sent(rcc));
> +
>          red_channel_client_destroy(rcc);
> +        g_object_unref(rcc);
> +        pthread_mutex_lock(&client->lock);
>      }
> +    pthread_mutex_unlock(&client->lock);
>      g_object_unref(client);
>  }
>  
> @@ -254,6 +275,16 @@ gboolean red_client_add_channel(RedClient *client, RedChannelClient *rcc, GError
>      pthread_mutex_lock(&client->lock);
>  
>      g_object_get(channel, "channel-type", &type, "id", &id, NULL);
> +    if (client->disconnecting) {
> +        g_set_error(error,
> +                    SPICE_SERVER_ERROR,
> +                    SPICE_SERVER_ERROR_FAILED,
> +                    "Client %p got disconnected while connecting channel type %d id %d",
> +                    client, type, id);
> +        result = FALSE;
> +        goto cleanup;
> +    }
> +
>      if (red_client_get_channel(client, type, id)) {
>          g_set_error(error,
>                      SPICE_SERVER_ERROR,
> @@ -316,10 +347,17 @@ int red_client_during_migrate_at_target(RedClient *client)
>  
>  void red_client_remove_channel(RedChannelClient *rcc)
>  {
> +    GList *link;
>      RedClient *client = red_channel_client_get_client(rcc);
>      pthread_mutex_lock(&client->lock);
> -    client->channels = g_list_remove(client->channels, rcc);
> +    link = g_list_find(client->channels, rcc);
> +    if (link) {
> +        client->channels = g_list_remove_link(client->channels, link);
> +    }
>      pthread_mutex_unlock(&client->lock);
> +    if (link) {
> +        g_object_unref(rcc);
> +    }
>  }
>  
>  /* returns TRUE If all channels are finished migrating, FALSE otherwise */
> -- 
> 2.13.5
> 
> _______________________________________________
> Spice-devel mailing list
> Spice-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/spice-devel
> 
> [spice-devel was dropped when you replied, I don't know if this was
> intentional]
> 

No, added it again.

> Hey,
> 
> On Fri, Aug 25, 2017 at 12:48:51PM -0400, Frediano Ziglio wrote:
> > I don't want to fool anybody, these issues really are bound together,
> > and starting from the small lock issue I ended up spending a lot of
> > time trying to understand it
> 
> The issues are indeed related, but even after taking a closer look, I
> think at least the refcounting and the list iteration code could be
> separated...
> 

I'll try to separate them.

> 
> > (OT: I started writing a document, where should
> > I put it? A markdown in docs??).
> 
> Yes, we can have new files in docs/
> One thing which could be useful is some annotations in
> RedChannelClient/RedChannel/RedClient regarding which functions are safe
> to call from any thread, and which functions should be called from the
> thread the GObject was created in. I would not be surprised if we had
> more issues like the ones you are fixing here :( Dunno if qemu is
> runnable in helgrind.
> 
> > 
> > So, let me try to recap. There was no lock on the list, which is
> > accessed by multiple threads. I added the lock... and got a deadlock.
> > So I decided to scan the list in a safer way, holding the lock less.
> 
> Hmm, my preference would be to require that RedClient::channels is
> always manipulated from the main thread, have you considered that
> option? (either asynchronously calling
> red_client_add_channel/red_client_remove_channel from RedChannelClient,
> or a blocking call which calls into the main thread).
> 

I have considered it, but it creates different issues too.
Adding, for instance, would be more of a problem: currently a
RedChannelClient is created in the channel thread and then added to the
RedClient. If you want to split that, you would have to create the
RedChannelClient without "activating" it (registering watches and
timers), then call into the main thread to add it to the RedClient, and
then the RedClient, with another call into the channel thread, would
activate the RedChannelClient.
During removal you would have to call into the main thread to remove the
RedChannelClient from the RedClient.
But this also has some potential problems. For instance, during creation
you would have a "connected" RCC (by definition RCCs are connected if
they are in the RedChannel lists) not yet bound to a RedClient, while
currently this condition is impossible.

> 
> > Then I realized some function calls were referring to objects
> > (like RedChannelClient but also RedChannel) potentially running
> > in a different thread, so I had to check whether the data they
> > access is safe to be accessed that way.
> > Then, as the RedClient function is supposed to destroy objects, I
> > started printing when things got released and looking at the paths.
> > Then some objects were not released (the RedChannelClient), so I
> > went and checked the various reference counts. Then, after fixing
> > some of them, some use-after-free issues started to happen.
> > 
> > If you look at red-channel-client.c and search for g_object_ref
> > or unref you will notice a lot of cases where the object you are
> > working on is referenced at some point and unreferenced at the end.
> > So does the object pointer we receive carry no ownership? The
> > problem is that when you receive it it has ownership, but during
> > execution it can lose it. For instance, you scan the list of
> > ChannelClients from the Channel and call a function, and in the
> > meantime the object gets removed from the initial list. Note: the
> > lists are all scanned in a "safe" way; we store the next element
> > before handling the current one, so if the current one is removed
> > the next SHOULD be safe (I think this is one reason why
> > MainChannelClient disconnection is done asynchronously in the main
> > thread even though MainChannelClient runs in the main thread).
> 
> After looking at that code, ownership of RedChannelClient is indeed
> odd... After your patch, it looks like RedClient::channels is
> considered to own the newly created RedChannelClient?
> 

It did even before, actually, but it was less clear, since in
red_client_destroy the RCCs were unreferenced.
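Roughly, the ownership convention being discussed looks like the sketch
below (simplified, with hypothetical names; not the actual spice-server
code): the client's channel list holds the reference created with the
RedChannelClient, removal from that list drops it, and code that may
trigger a removal from inside a callback keeps its own temporary reference:

    /* Sketch only; 'GObject *rcc' stands in for a RedChannelClient. */
    #include <glib-object.h>
    #include <pthread.h>

    typedef struct {
        pthread_mutex_t lock;
        GList *channels;          /* owns one reference per entry */
    } Client;

    /* adding transfers the initial reference to the list */
    static void client_add_channel(Client *client, GObject *rcc)
    {
        pthread_mutex_lock(&client->lock);
        client->channels = g_list_prepend(client->channels, rcc);
        pthread_mutex_unlock(&client->lock);
    }

    /* removing drops the list's reference exactly once */
    static void client_remove_channel(Client *client, GObject *rcc)
    {
        gboolean found = FALSE;

        pthread_mutex_lock(&client->lock);
        GList *link = g_list_find(client->channels, rcc);
        if (link) {
            client->channels = g_list_delete_link(client->channels, link);
            found = TRUE;
        }
        pthread_mutex_unlock(&client->lock);
        if (found) {
            g_object_unref(rcc);  /* may be the last reference */
        }
    }

    /* a timer/watch callback takes its own reference, because the work it
     * does may end up removing the object from the owning list */
    static void some_timer_callback(GObject *rcc)
    {
        g_object_ref(rcc);
        /* ... work that may call client_remove_channel() ... */
        g_object_unref(rcc);
    }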

> If that's your goal, I would make it very explicit, both in the commit
> log and in the code. red_channel_client_initable_init() adds the
> RedChannelClient to both RedChannel and RedClient, but when the object
> has been created, its refcount is only 1, so I'd also make it very
> explicit that the 2nd list does not own any reference to the
> RedChannelClient.
> Once this is cleared up, hopefully reviewing your
> g_object_ref/g_object_unref changes becomes very easy.
> 
> > > > --- a/server/red-client.c
> > > > +++ b/server/red-client.c
> > > > @@ -192,9 +192,6 @@ void red_client_migrate(RedClient *client)
> > > >  
> > > >  void red_client_destroy(RedClient *client)
> > > >  {
> > > > -    RedChannelClient *rcc;
> > > > -
> > > > -    spice_printerr("destroy client %p with #channels=%d", client,
> > > > g_list_length(client->channels));
> > > >      if (!pthread_equal(pthread_self(), client->thread_id)) {
> > > >          spice_warning("client->thread_id (0x%lx) != pthread_self
> > > >          (0x%lx)."
> > > >                        "If one of the threads is != io-thread && !=
> > > >                        vcpu-thread,"
> > > > @@ -202,22 +199,46 @@ void red_client_destroy(RedClient *client)
> > > >                        client->thread_id,
> > > >                        pthread_self());
> > > >      }
> > > > -    FOREACH_CHANNEL_CLIENT(client, rcc) {
> > > > +
> > > > +    pthread_mutex_lock(&client->lock);
> > > > +    spice_printerr("destroy client %p with #channels=%d", client,
> > > > g_list_length(client->channels));
> > > > +    // this make sure that possible RedChannelClient setting up
> > > > +    // to be added won't be added to the list
> > > > +    client->disconnecting = TRUE;
> > > > +    while (client->channels) {
> > > >          RedChannel *channel;
> > > > +        RedChannelClient *rcc = client->channels->data;
> > > > +
> > > > +        // Remove the list to the RedChannelClient we are processing
> > > > +        // note that we own the object so is safe to do some
> > > > operations on
> > > > it.
> > > > +        // This manual scan of the list is done to have a thread safe
> > > > +        // iteration of the list
> > > > +        client->channels = g_list_delete_link(client->channels,
> > > > client->channels);
> > > > +
> > > >          // some channels may be in other threads, so disconnection
> > > >          // is not synchronous.
> > > >          channel = red_channel_client_get_channel(rcc);
> > > >          red_channel_client_set_destroying(rcc);
> > > > +
> > > > +        // prevent dead lock disconnecting rcc (which can happen
> > > > +        // in the same thread or synchronously on another one)
> > > > +        pthread_mutex_unlock(&client->lock);
> > > > +
> > 
> > One of the reasons I had to change the reference counting is to
> > avoid a double free when it is unclear whether we have ownership.
> > 
> > > >          // some channels may be in other threads. However we currently
> > > >          // assume disconnect is synchronous (we changed the dispatcher
> > > >          // to wait for disconnection)
> > > >          // TODO: should we go back to async. For this we need to use
> > > >          // ref count for channel clients.
> > > >          red_channel_disconnect_client(channel, rcc);
> > > > +
> > > >          spice_assert(red_channel_client_pipe_is_empty(rcc));
> > > >          spice_assert(red_channel_client_no_item_being_sent(rcc));
> > > > +
> > > >          red_channel_client_destroy(rcc);
> > 
> > Actually I think that now maybe this call is not necessary...
> > but better safe than sorry.
> 
> Yes, it seems very redundant as this call is doing disconnect +
> set_destroying. This could be a nice preparatory commit, makes things
> simpler :)
> 

I don't understand the suggestion here...

> > 
> > Basically the real disconnection can happen either in
> > red_channel_disconnect_client (which potentially runs in a
> > separate thread) or in red_channel_client_destroy; they
> > both call red_channel_client_disconnect.
> > 
> > > > +        g_object_unref(rcc);
> > > > +        pthread_mutex_lock(&client->lock);
> > > >      }
> > > > +    pthread_mutex_unlock(&client->lock);
> > > >      g_object_unref(client);
> > > >  }
> > > >  
> > > > @@ -254,6 +275,16 @@ gboolean red_client_add_channel(RedClient *client,
> > > > RedChannelClient *rcc, GError
> > > >      pthread_mutex_lock(&client->lock);
> > > >  
> > > >      g_object_get(channel, "channel-type", &type, "id", &id, NULL);
> > > > +    if (client->disconnecting) {
> > > > +        g_set_error(error,
> > > > +                    SPICE_SERVER_ERROR,
> > > > +                    SPICE_SERVER_ERROR_FAILED,
> > > > +                    "Client %p got disconnected while connecting
> > > > channel
> > > > type %d id %d",
> > > > +                    client, type, id);
> > > > +        result = FALSE;
> > > > +        goto cleanup;
> > > > +    }
> 
> The client->disconnecting stuff in this commit could also go in a
> separate commit. It seems like the goal is to prevent new
> RedChannelClient creation when the RedClient is being destroyed?
> 

Yes. The question suggests that the message
"Client %p got disconnected while connecting channel type %d id %d"
is not clear. Any suggestions on how to rewrite it?

This happens because the connection setup is asynchronous, so
(MT = main thread, CT = channel thread):
- MT: a new connection arrives;
- MT: the connection is dispatched to CT;
- MT: the main channel gets disconnected;
- MT: red_client_destroy is called;
- CT: an attempt is made to add the RCC to the RedClient.
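In rough terms (sketch only, simplified hypothetical names, not the actual
code), the new disconnecting flag closes this window because both sides take
client->lock, and teardown raises the flag before the list is drained:

    #include <glib.h>
    #include <pthread.h>

    typedef struct {
        pthread_mutex_t lock;
        GList *channels;
        gboolean disconnecting;   /* set once teardown has started */
    } Client;

    /* CT: a late add now fails instead of touching a dying list */
    static gboolean client_add_channel(Client *client, gpointer rcc)
    {
        gboolean ok;

        pthread_mutex_lock(&client->lock);
        ok = !client->disconnecting;
        if (ok) {
            client->channels = g_list_prepend(client->channels, rcc);
        }
        pthread_mutex_unlock(&client->lock);
        return ok;
    }

    /* MT: teardown sets the flag under the same lock before draining */
    static void client_destroy(Client *client)
    {
        pthread_mutex_lock(&client->lock);
        client->disconnecting = TRUE;
        /* ... pop and disconnect each entry as in the patch ... */
        pthread_mutex_unlock(&client->lock);
    }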

> Christophe
> 

Frediano

On Wed, Aug 30, 2017 at 04:19:47AM -0400, Frediano Ziglio wrote:
> > > > >          // some channels may be in other threads. However we currently
> > > > >          // assume disconnect is synchronous (we changed the dispatcher
> > > > >          // to wait for disconnection)
> > > > >          // TODO: should we go back to async. For this we need to use
> > > > >          // ref count for channel clients.
> > > > >          red_channel_disconnect_client(channel, rcc);
> > > > > +
> > > > >          spice_assert(red_channel_client_pipe_is_empty(rcc));
> > > > >          spice_assert(red_channel_client_no_item_being_sent(rcc));
> > > > > +
> > > > >          red_channel_client_destroy(rcc);
> > > 
> > > Actually I think that now maybe this call is not necessary...
> > > but better safe than sorry.
> > 
> > Yes, it seems very redundant as this call is doing disconnect +
> > set_destroying. This could be a nice preparatory commit, makes things
> > simpler :)
> > 
> 
> I don't understand the suggestion here...

Ah, I was agreeing with you, red_channel_client_destroy() is not
necessary here, so I would drop it in a commit before or after this
patch we are discussing.

> 
> > > 
> > > Basically the real disconnection can happen either in
> > > red_channel_disconnect_client (which potentially runs in a
> > > separate thread) or in red_channel_client_destroy; they
> > > both call red_channel_client_disconnect.
> > > 
> > > > > +        g_object_unref(rcc);
> > > > > +        pthread_mutex_lock(&client->lock);
> > > > >      }
> > > > > +    pthread_mutex_unlock(&client->lock);
> > > > >      g_object_unref(client);
> > > > >  }
> > > > >  
> > > > > @@ -254,6 +275,16 @@ gboolean red_client_add_channel(RedClient *client,
> > > > > RedChannelClient *rcc, GError
> > > > >      pthread_mutex_lock(&client->lock);
> > > > >  
> > > > >      g_object_get(channel, "channel-type", &type, "id", &id, NULL);
> > > > > +    if (client->disconnecting) {
> > > > > +        g_set_error(error,
> > > > > +                    SPICE_SERVER_ERROR,
> > > > > +                    SPICE_SERVER_ERROR_FAILED,
> > > > > +                    "Client %p got disconnected while connecting
> > > > > channel
> > > > > type %d id %d",
> > > > > +                    client, type, id);
> > > > > +        result = FALSE;
> > > > > +        goto cleanup;
> > > > > +    }
> > 
> > The client->disconnecting stuff in this commit could also go in a
> > separate commit. It seems like the goal is to prevent new
> > RedChannelClient creation when the RedClient is being destroyed?
> > 
> 
> Yes. The question suggests that the message
> "Client %p got disconnected while connecting channel type %d id %d"
> is not clear. Any suggestions on how to rewrite it?

Hmm I'd say the message is fine, I did not re-read through the commit
while writing that answer, so I just forgot about the message, and just
wanted to make sure I understood your goal :)

> 
> This happens because the connection setup is asynchronous, so
> (MT = main thread, CT = channel thread):
> - MT: a new connection arrives;
> - MT: the connection is dispatched to CT;
> - MT: the main channel gets disconnected;
> - MT: red_client_destroy is called;
> - CT: an attempt is made to add the RCC to the RedClient.

Yes, I got this one, and while reviewing the changes related to
accessing the 'channels' list I was wondering what would happen if we
kept trying to create new RCCs while iterating over the list, so it was
good to see this 'client->disconnecting' change ;)

This short description you just gave belongs in one of the commit logs
in my opinion.

Christophe