[xserver] Pass ClientPtr to FlushCallback

Submitted by Michel Dänzer on Aug. 2, 2016, 8:53 a.m.

Details

Message ID 1470127981-1491-1-git-send-email-michel@daenzer.net
State Accepted
Commit b380f3ac51f40ffefcde7d3db5c4c149f274246d
Headers show
Series "Pass ClientPtr to FlushCallback" ( rev: 1 ) in X.org (DEPRECATED - USE GITLAB)

Not browsing as part of any series.

Commit Message

Michel Dänzer Aug. 2, 2016, 8:53 a.m.
From: Michel Dänzer <michel.daenzer@amd.com>

This change has two effects:

1. Only calls FlushCallbacks when we're actually flushing data to a
   client. The unnecessary FlushCallback calls could cause significant
   performance degradation with compositing, which is significantly
   reduced even without any driver changes.

2. By passing the ClientPtr to FlushCallbacks, drivers can completely
   eliminate unnecessary flushing of GPU commands by keeping track of
   whether we're flushing any XDamageNotify events to the client for
   which the corresponding rendering commands haven't been flushed to
   the GPU yet.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
---

See https://lists.freedesktop.org/archives/amd-gfx/2016-August/000977.html
for an example of how to take advantage of this change to eliminate
unnecessary GPU flushes.

 os/connection.c | 2 +-
 os/io.c         | 9 +++------
 2 files changed, 4 insertions(+), 7 deletions(-)

Patch hide | download patch | download mbox

diff --git a/os/connection.c b/os/connection.c
index 4294ee7..a901ebf 100644
--- a/os/connection.c
+++ b/os/connection.c
@@ -908,7 +908,7 @@  CloseDownConnection(ClientPtr client)
     OsCommPtr oc = (OsCommPtr) client->osPrivate;
 
     if (FlushCallback)
-        CallCallbacks(&FlushCallback, NULL);
+        CallCallbacks(&FlushCallback, client);
 
     if (oc->output)
 	FlushClient(client, oc, (char *) NULL, 0);
diff --git a/os/io.c b/os/io.c
index 88edf12..ff0ec3d 100644
--- a/os/io.c
+++ b/os/io.c
@@ -598,9 +598,6 @@  FlushAllOutput(void)
     register ClientPtr client, tmp;
     Bool newoutput = NewOutputPending;
 
-    if (FlushCallback)
-        CallCallbacks(&FlushCallback, NULL);
-
     if (!newoutput)
         return;
 
@@ -767,9 +764,6 @@  WriteToClient(ClientPtr who, int count, const void *__buf)
             NewOutputPending = FALSE;
         }
 
-        if (FlushCallback)
-            CallCallbacks(&FlushCallback, NULL);
-
         return FlushClient(who, oc, buf, count);
     }
 
@@ -815,6 +809,9 @@  FlushClient(ClientPtr who, OsCommPtr oc, const void *__extraBuf, int extraCount)
     if (!notWritten)
         return 0;
 
+    if (FlushCallback)
+        CallCallbacks(&FlushCallback, who);
+
     todo = notWritten;
     while (notWritten) {
         long before = written;  /* amount of whole thing written */

Comments

Michel Dänzer <michel@daenzer.net> writes:

> From: Michel Dänzer <michel.daenzer@amd.com>
>
> This change has two effects:
>
> 1. Only calls FlushCallbacks when we're actually flushing data to a
>    client. The unnecessary FlushCallback calls could cause significant
>    performance degradation with compositing, which is significantly
>    reduced even without any driver changes.

Seems like a completely reasonable plan. As you can see, the original
goal was to have the callback called only once when flushing output to
multiple clients though. That appears to only be used by the record
extension, so perhaps we just don't care.

> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>    eliminate unnecessary flushing of GPU commands by keeping track of
>    whether we're flushing any XDamageNotify events to the client for
>    which the corresponding rendering commands haven't been flushed to
>    the GPU yet.

Is this something we should be doing in either glamor or DIX itself? It
looks like the ATI driver has a number that is incremented every time
commands are sent to the GPU and that clients need to be flushed
whenever they haven't been flushed since the last time that number was
changed?

I don't even know how this works in the generic modesetting driver at
this point; what makes sure that glFlush is called before data are sent
to the client?
Michel Dänzer <michel@daenzer.net> writes:

> From: Michel Dänzer <michel.daenzer@amd.com>
>
> This change has two effects:
>
> 1. Only calls FlushCallbacks when we're actually flushing data to a
>    client. The unnecessary FlushCallback calls could cause significant
>    performance degradation with compositing, which is significantly
>    reduced even without any driver changes.
>
> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>    eliminate unnecessary flushing of GPU commands by keeping track of
>    whether we're flushing any XDamageNotify events to the client for
>    which the corresponding rendering commands haven't been flushed to
>    the GPU yet.
>
> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
> ---
>
> See https://lists.freedesktop.org/archives/amd-gfx/2016-August/000977.html
> for an example of how to take advantage of this change to eliminate
> unnecessary GPU flushes.

Note: Mesa's DRI2 is (supposed to be) doing XSync() during glXWaitX() to
ensure that the server has processed the client's X requests and flushed
its batchbuffers, so that the kernel serializes the batchbuffer from X
before the next rendering by Mesa.  I think your xf86-video-ati patches
will break that.
Keith Packard <keithp@keithp.com> writes:

> Michel Dänzer <michel@daenzer.net> writes:
>
>> From: Michel Dänzer <michel.daenzer@amd.com>
>>
>> This change has two effects:
>>
>> 1. Only calls FlushCallbacks when we're actually flushing data to a
>>    client. The unnecessary FlushCallback calls could cause significant
>>    performance degradation with compositing, which is significantly
>>    reduced even without any driver changes.
>
> Seems like a completely reasonable plan. As you can see, the original
> goal was to have the callback called only once when flushing output to
> multiple clients though. That appears to only be used by the record
> extension, so perhaps we just don't care.
>
>> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>>    eliminate unnecessary flushing of GPU commands by keeping track of
>>    whether we're flushing any XDamageNotify events to the client for
>>    which the corresponding rendering commands haven't been flushed to
>>    the GPU yet.
>
> Is this something we should be doing in either glamor or DIX itself? It
> looks like the ATI driver has a number that is incremented every time
> commands are sent to the GPU and that clients need to be flushed
> whenever they haven't been flushed since the last time that number was
> changed?
>
> I don't even know how this works in the generic modesetting driver at
> this point; what makes sure that glFlush is called before data are sent
> to the client?

We glFlush in the BlockHandler in glamor.
Eric Anholt <eric@anholt.net> writes:

> We glFlush in the BlockHandler in glamor.

Sure, what ensures that is called before damage events are delivered?
On 03.08.2016 09:40, Eric Anholt wrote:
> Michel Dänzer <michel@daenzer.net> writes:
> 
>> From: Michel Dänzer <michel.daenzer@amd.com>
>>
>> This change has two effects:
>>
>> 1. Only calls FlushCallbacks when we're actually flushing data to a
>>    client. The unnecessary FlushCallback calls could cause significant
>>    performance degradation with compositing, which is significantly
>>    reduced even without any driver changes.
>>
>> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>>    eliminate unnecessary flushing of GPU commands by keeping track of
>>    whether we're flushing any XDamageNotify events to the client for
>>    which the corresponding rendering commands haven't been flushed to
>>    the GPU yet.
>>
>> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
>> ---
>>
>> See https://lists.freedesktop.org/archives/amd-gfx/2016-August/000977.html
>> for an example of how to take advantage of this change to eliminate
>> unnecessary GPU flushes.
> 
> Note: Mesa's DRI2 is (supposed to be) doing XSync() during glXWaitX() to
> ensure that the server has processed the client's X requests and flushed
> its batchbuffers, so that the kernel serializes the batchbuffer from X
> before the next rendering by Mesa.  I think your xf86-video-ati patches
> will break that.

Can you elaborate how? I honestly can't imagine.
On 03.08.2016 09:40, Eric Anholt wrote:
> Keith Packard <keithp@keithp.com> writes:
> 
>> Michel Dänzer <michel@daenzer.net> writes:
>>
>>> From: Michel Dänzer <michel.daenzer@amd.com>
>>>
>>> This change has two effects:
>>>
>>> 1. Only calls FlushCallbacks when we're actually flushing data to a
>>>    client. The unnecessary FlushCallback calls could cause significant
>>>    performance degradation with compositing, which is significantly
>>>    reduced even without any driver changes.
>>
>> Seems like a completely reasonable plan. As you can see, the original
>> goal was to have the callback called only once when flushing output to
>> multiple clients though. That appears to only be used by the record
>> extension, so perhaps we just don't care.

I suspect that the record extension could also take advantage of this
change, but I'm not sure and can't say that I care too much. :)


>>> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>>>    eliminate unnecessary flushing of GPU commands by keeping track of
>>>    whether we're flushing any XDamageNotify events to the client for
>>>    which the corresponding rendering commands haven't been flushed to
>>>    the GPU yet.
>>
>> Is this something we should be doing in either glamor or DIX itself? It
>> looks like the ATI driver has a number that is incremented every time
>> commands are sent to the GPU and that clients need to be flushed
>> whenever they haven't been flushed since the last time that number was
>> changed?
>>
>> I don't even know how this works in the generic modesetting driver at
>> this point; what makes sure that glFlush is called before data are sent
>> to the client?
> 
> We glFlush in the BlockHandler in glamor.

That's not sufficient for correctness with direct rendering compositors,
because FlushClient can be called before BlockHandler.
On 03.08.2016 09:51, Michel Dänzer wrote:
> On 03.08.2016 09:40, Eric Anholt wrote:
>> Michel Dänzer <michel@daenzer.net> writes:
>>
>>> From: Michel Dänzer <michel.daenzer@amd.com>
>>>
>>> This change has two effects:
>>>
>>> 1. Only calls FlushCallbacks when we're actually flushing data to a
>>>    client. The unnecessary FlushCallback calls could cause significant
>>>    performance degradation with compositing, which is significantly
>>>    reduced even without any driver changes.
>>>
>>> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>>>    eliminate unnecessary flushing of GPU commands by keeping track of
>>>    whether we're flushing any XDamageNotify events to the client for
>>>    which the corresponding rendering commands haven't been flushed to
>>>    the GPU yet.
>>>
>>> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
>>> ---
>>>
>>> See https://lists.freedesktop.org/archives/amd-gfx/2016-August/000977.html
>>> for an example of how to take advantage of this change to eliminate
>>> unnecessary GPU flushes.
>>
>> Note: Mesa's DRI2 is (supposed to be) doing XSync() during glXWaitX() to
>> ensure that the server has processed the client's X requests and flushed
>> its batchbuffers, so that the kernel serializes the batchbuffer from X
>> before the next rendering by Mesa.  I think your xf86-video-ati patches
>> will break that.
> 
> Can you elaborate how? I honestly can't imagine.

I guess you mean because we're no longer flushing when sending out the
replies to whatever requests Mesa uses for glXWaitX? That could indeed
be an issue (at least in theory, see below), but right now, neither
dri2_wait_x (which uses DRI2CopyRegion) nor dri3_wait_x (which uses
CopyArea) actually wait for the request reply AFAICT, so it doesn't make
much difference in practice.

I guess the price question is: Can the replies to these requests be
flushed to the client before the BlockHandler?


Anyway, we still always flush in the BlockHandler, so even with my
patches our drivers are strictly more correct than the modesetting
driver. :)
On 03.08.2016 01:12, Keith Packard wrote:
> Michel Dänzer <michel@daenzer.net> writes:
> 
>> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>>    eliminate unnecessary flushing of GPU commands by keeping track of
>>    whether we're flushing any XDamageNotify events to the client for
>>    which the corresponding rendering commands haven't been flushed to
>>    the GPU yet.
> 
> Is this something we should be doing in either glamor or DIX itself?

I'm afraid that's not feasible: An additional optimization will be to
never flush for clients which aren't sharing any buffers with the server
via DRI. Only the driver can keep track of that.
Keith Packard <keithp@keithp.com> writes:

> Eric Anholt <eric@anholt.net> writes:
>
>> We glFlush in the BlockHandler in glamor.
>
> Sure, what ensures that is called before damage events are delivered?

You had assured me in the past that BlockHandler gets called before
things are written to clients.
Michel Dänzer <michel@daenzer.net> writes:

> I'm afraid that's not feasible: An additional optimization will be to
> never flush for clients which aren't sharing any buffers with the server
> via DRI. Only the driver can keep track of that.

The DRI3 extension and glamor presumably know about those; I'm mostly
interested in getting as much of this into shared code as possible so
that semantics across drivers are 'the same'.
On 04.08.2016 03:18, Keith Packard wrote:
> Michel Dänzer <michel@daenzer.net> writes:
> 
>> I'm afraid that's not feasible: An additional optimization will be to
>> never flush for clients which aren't sharing any buffers with the server
>> via DRI. Only the driver can keep track of that.
> 
> The DRI3 extension and glamor presumably know about those;

There's also DRI2.


> I'm mostly interested in getting as much of this into shared code as
> possible so that semantics across drivers are 'the same'.

I agree that would be nice, but I just can't see how it's possible.
On Tue, 2016-08-02 at 17:40 -0700, Eric Anholt wrote:

> Note: Mesa's DRI2 is (supposed to be) doing XSync() during glXWaitX() to
> ensure that the server has processed the client's X requests and flushed
> its batchbuffers, so that the kernel serializes the batchbuffer from X
> before the next rendering by Mesa.

This is a bit incorrect. For direct contexts, DRI2's WaitX becomes
DRI2CopyRegion from real-front to fake-front. That _does_ generate a
reply, but whether it flushes the X rendering queue appears to be up to
the driver's CopyRegion{,2} hook; at least ms doesn't do anything
special for that case, although presumably glamor's Flush in
BlockHandler will fire. (DRI3 does something similar, although
apparently uses core X CopyArea to do it instead of present? Sigh.)

Both the DRI2 and DRI3 paths are broken, in fact, since they will only
emit the copy if there's a current context (per spec) and drawable
(maybe implied by spec but questionable), and only for that drawable
(definitely wrong). They ought to emit a real GLXWaitX request
unconditionally, which the server then ought to have enough context to
handle correctly since it keeps track of current drawables globally
(and knows about all the different DRIs). Getting this right isn't
hard, but as with pretty much everything else about multiple GL
clients, people have historically decided it was too difficult to want
to care about, thus ensuring it gets even more difficult to fix over
time...

The WaitX man page and GLX spec are wrong to mention XSync at all,
which is just GetInputFocus on the wire and AFAIK no DDX ever has
interpreted that as a hint to flush rendering; a future GLX spec update
will remove this language.

- ajax
On Tue, 2016-08-02 at 17:53 +0900, Michel Dänzer wrote:
> From: Michel Dänzer <michel.daenzer@amd.com>
> 
> This change has two effects:
> 
> 1. Only calls FlushCallbacks when we're actually flushing data to a
>    client. The unnecessary FlushCallback calls could cause significant
>    performance degradation with compositing, which is significantly
>    reduced even without any driver changes.
> 
> 2. By passing the ClientPtr to FlushCallbacks, drivers can completely
>    eliminate unnecessary flushing of GPU commands by keeping track of
>    whether we're flushing any XDamageNotify events to the client for
>    which the corresponding rendering commands haven't been flushed to
>    the GPU yet.
> 
> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>

I can imagine a better approach to flush management, but we're honestly
pretty far from implementing even what little the GLX spec asks for.
This looks like an entirely reasonable change on its own and the
improvements it enables are worthwhile, so that's progress at least.

remote: I: patch #102509 updated using rev b380f3ac51f40ffefcde7d3db5c4c149f274246d.
remote: I: 1 patch(es) updated to state Accepted.
To ssh://git.freedesktop.org/git/xorg/xserver
   92b3cd3..b380f3a  master -> master

- ajax