| Message ID | 1384637846-5343-2-git-send-email-jnsptrsn1@gmail.com |
|---|---|
| State | Superseded |
| Headers | show |
diff --git a/src/xcb_io.c b/src/xcb_io.c index 727c6c7..f2978d0 100644 --- a/src/xcb_io.c +++ b/src/xcb_io.c @@ -455,7 +455,7 @@ void _XSend(Display *dpy, const char *data, long size) static const xReq dummy_request; static char const pad[3]; struct iovec vec[3]; - uint64_t requests; + uint64_t requests, unwrapped_request; _XExtension *ext; xcb_connection_t *c = dpy->xcb->connection; if(dpy->flags & XlibDisplayIOError) @@ -464,6 +464,10 @@ void _XSend(Display *dpy, const char *data, long size) if(dpy->bufptr == dpy->buffer && !size) return; + /* Set bit 8 of 'request' when a 32-bit wrap has just happened + * so the sequence stays correct relatively to 'last_flushed'. */ + unwrapped_request = ((uint64_t)(dpy->request < dpy->xcb->last_flushed) << 32) + dpy->request; + /* iff we asked XCB to set aside errors, we must pick those up * eventually. iff there are async handlers, we may have just * issued requests that will generate replies. in either case, @@ -471,10 +475,10 @@ void _XSend(Display *dpy, const char *data, long size) if(dpy->xcb->event_owner != XlibOwnsEventQueue || dpy->async_handlers) { uint64_t sequence; - for(sequence = dpy->xcb->last_flushed + 1; sequence <= dpy->request; ++sequence) + for(sequence = dpy->xcb->last_flushed + 1; sequence <= unwrapped_request; ++sequence) append_pending_request(dpy, sequence); } - requests = dpy->request - dpy->xcb->last_flushed; + requests = unwrapped_request - dpy->xcb->last_flushed; dpy->xcb->last_flushed = dpy->request; vec[0].iov_base = dpy->buffer;
On 16.11.2013 22:37, Jonas Petersen wrote: > By design the Xlib 32-bit internal request sequence numbers may wrap. There > is two locations within xcb_io.c that are not wrap-safe. The value of > last_flushed relies on the request to be sequential all the time. This is > not given when the sequence has just wrapped. Applications may then crash > with a "Fatal IO error 11 (Resource temporarily unavailable)". > > This patch fixes this by unwrapping the sequence number when needed to retain > the sequence relative to last_flushed. > > Signed-off-by: Jonas Petersen <jnsptrsn1@gmail.com> > --- > src/xcb_io.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/src/xcb_io.c b/src/xcb_io.c > index 727c6c7..f2978d0 100644 > --- a/src/xcb_io.c > +++ b/src/xcb_io.c > @@ -455,7 +455,7 @@ void _XSend(Display *dpy, const char *data, long size) > static const xReq dummy_request; > static char const pad[3]; > struct iovec vec[3]; > - uint64_t requests; > + uint64_t requests, unwrapped_request; > _XExtension *ext; > xcb_connection_t *c = dpy->xcb->connection; > if(dpy->flags & XlibDisplayIOError) > @@ -464,6 +464,10 @@ void _XSend(Display *dpy, const char *data, long size) > if(dpy->bufptr == dpy->buffer && !size) > return; > > + /* Set bit 8 of 'request' when a 32-bit wrap has just happened > + * so the sequence stays correct relatively to 'last_flushed'. */ "1 << 32" does not set bit 8 and this doesn't set anything in request, really. > + unwrapped_request = ((uint64_t)(dpy->request < dpy->xcb->last_flushed) << 32) + dpy->request; Also, I don't think that this code is intuitive this way. I would propose this instead: unwrapped_request = dpy->request; /* If there was a sequence number wrap since our last flush, make sure we * use the correct 64 bit sequence number */ if (sizeof(uint64_t) > sizeof(unsigned long) && dpy->request < dpy->xcb_last_flushed) unwrapped_request += UINT64_C(1) << 32; (I am not sure/convinced if the sizeof comparision is necessary, but I saw something like this in require_socket() and then thought that this might be necessary on systems where unsigned long already is a 64 bit type.) > /* iff we asked XCB to set aside errors, we must pick those up > * eventually. iff there are async handlers, we may have just > * issued requests that will generate replies. in either case, > @@ -471,10 +475,10 @@ void _XSend(Display *dpy, const char *data, long size) > if(dpy->xcb->event_owner != XlibOwnsEventQueue || dpy->async_handlers) > { > uint64_t sequence; > - for(sequence = dpy->xcb->last_flushed + 1; sequence <= dpy->request; ++sequence) > + for(sequence = dpy->xcb->last_flushed + 1; sequence <= unwrapped_request; ++sequence) > append_pending_request(dpy, sequence); > } > - requests = dpy->request - dpy->xcb->last_flushed; > + requests = unwrapped_request - dpy->xcb->last_flushed; > dpy->xcb->last_flushed = dpy->request; > > vec[0].iov_base = dpy->buffer; >
Am 17.11.2013 11:58, schrieb Uli Schlachter: > On 16.11.2013 22:37, Jonas Petersen wrote: >> + /* Set bit 8 of 'request' when a 32-bit wrap has just happened >> + * so the sequence stays correct relatively to 'last_flushed'. */ > "1 << 32" does not set bit 8 and this doesn't set anything in request, really. You're right, sorry, "bit 8" is a typo, it's bit 32 of course. Other than that it will indeed set something. Not exactly "in request", but it will end up in unwrapped_request. > >> + unwrapped_request = ((uint64_t)(dpy->request < dpy->xcb->last_flushed) << 32) + dpy->request; > Also, I don't think that this code is intuitive this way. I agree somewhat on that. I thought efficiency would have precedence here. I was actually inspired by static void widen() in xcb_io.c where it reads: *wide = new + ((unsigned long) (new < *wide) << 16 << 16); The comment says "Treating the comparison as a 1 and shifting it avoids a conditional branch". > I would propose this instead: > > unwrapped_request = dpy->request; > /* If there was a sequence number wrap since our last flush, make sure we > * use the correct 64 bit sequence number */ > if (sizeof(uint64_t) > sizeof(unsigned long) > && dpy->request < dpy->xcb_last_flushed) > unwrapped_request += UINT64_C(1) << 32; > > (I am not sure/convinced if the sizeof comparision is necessary, but I saw > something like this in require_socket() and then thought that this might be > necessary on systems where unsigned long already is a 64 bit type.) I admit I haven't tought it all through on other system configurations. So thanks for giving your input. I guess the sizeof comparison would not be necessary since the condition should never meet with 64-bit longs. And if it does, something is wrong anyway (from my understanding). After all this is the trigger of the bug. Btw. as you might have guessed, on my system unsigned longs are 32-bit. - Jonas
> *wide = new + ((unsigned long) (new < *wide) << 16 << 16); > The comment says "Treating the comparison as a 1 and shifting it > avoids a conditional branch". Only on architectures with conditional moves - and, on those, the version using ? : is likely to compile down to a conditional move anyway. I think that comment should be fixed. > I guess the sizeof comparison would not be necessary since the > condition should never meet with 64-bit longs. Unless it's in a code fragment that's used only on machines with <64-bit longs, it will; X runs on systems with 64-bit longs. > And if it does, something is wrong anyway (from my understanding). > After all this is the trigger of the bug. Does anyone know whether the bug triggers on systems with 64-bit longs? /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Am 17.11.2013 20:20, schrieb Mouse: >> I guess the sizeof comparison would not be necessary since the >> condition should never meet with 64-bit longs. > Unless it's in a code fragment that's used only on machines with > <64-bit longs, it will; X runs on systems with 64-bit longs. I meant the condition "dpy->request < dpy->xcb_last_flushed" should not meet on systems with 64-bit longs (at least not until the 64-bit wrap). So the "sizeof(uint64_t) > sizeof(unsigned long)" would not be necessary. It would just increase the overhead on <64-bit systems. > >> And if it does, something is wrong anyway (from my understanding). >> After all this is the trigger of the bug. > Does anyone know whether the bug triggers on systems with 64-bit longs? I would say no, because there is no 32-bit to wrap. But who knows, I have not tried it. - Jonas
Am 17.11.2013 21:30, schrieb Jonas Petersen: > Am 17.11.2013 20:20, schrieb Mouse: >>> And if it does, something is wrong anyway (from my understanding). >>> After all this is the trigger of the bug. >> Does anyone know whether the bug triggers on systems with 64-bit longs? > > I would say no, because there is no 32-bit to wrap. But who knows, I > have not tried it. I have just run a test on a blank 64-bit Ubuntu 13.10. It shows sizeof(unsigned long) == 8. The bug does *not* trigger. - Jonas
Am 17.11.2013 20:20, schrieb Mouse: >> *wide = new + ((unsigned long) (new < *wide) << 16 << 16); >> The comment says "Treating the comparison as a 1 and shifting it >> avoids a conditional branch". > Only on architectures with conditional moves - and, on those, the > version using ? : is likely to compile down to a conditional move > anyway. I think that comment should be fixed. What do you mean exactly? I thought the comment was saying that using "(x < y) << z" is avoiding a conditinal branch introduced by using an "if" or "? :" in some way to get the calculation done. Isn't the comment true then? - Jonas
>>> *wide = new + ((unsigned long) (new < *wide) << 16 << 16); >>> The comment says "Treating the comparison as a 1 and shifting it >>> avoids a conditional branch". >> Only on architectures with conditional moves - and, on those, the >> version using ? : is likely to compile down to a conditional move >> anyway. I think that comment should be fixed. > What do you mean exactly? Computing the numeric value of (a < b) - as opposed to using it in a control-flow construct - is normally implemented much the same way that ((a < b) ? 1 : 0) is. By definition of the < operator, they are semantically identical anyway. > I thought the comment was saying that using "(x < y) << z" is > avoiding a conditinal branch introduced by using an "if" or "? :" in > some way to get the calculation done. That's what I think the comment is saying, yes. I also think it's false - or, at least, it's false more often than it's true, and in a way that's unpredictable from the point of view of anyone who actually needs the comment. On architectures without conditional moves, it's difficult[%] to compute the numerical value of (x < y) without a conditional branch. And on architectures with conditional moves, ((x < y) ? 1 : 0) will likely use a conditional move rather than a branch anyway. > Isn't the comment true then? It depends on the compiler, the target CPU architecture, and possibly other things (like the optimization tradeoffs the compiler has been told to use). But if the optimizer is even half-decent, (x < y) << z and ((x < y) ? 1 : 0) << z will compile to exactly the same instructions. I think it is likely enough that it's false that leaving the comment there with that wording is a bad idea; I also see it as a instance of pushing a very very machine-dependent tradeoff up to the C level, something that is better left to the optimizer, or at the very least buried in a machine-dependent preprocessor macro. [%] Not impossible; it can be done with arithemtic and logical operations. I'd render the core of the idea in 32-bit C as ((x-y)>>31)&1 - that's got issues with numbers large enough that their high few bits aren't all the same, but they can be fixed with a little more code. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
On Sun, Nov 17, 2013 at 21:30:44 +0100, Jonas Petersen wrote: > Am 17.11.2013 20:20, schrieb Mouse: > >>I guess the sizeof comparison would not be necessary since the > >>condition should never meet with 64-bit longs. > >Unless it's in a code fragment that's used only on machines with > ><64-bit longs, it will; X runs on systems with 64-bit longs. > I meant the condition "dpy->request < dpy->xcb_last_flushed" should > not meet on systems with 64-bit longs (at least not until the 64-bit > wrap). So the "sizeof(uint64_t) > sizeof(unsigned long)" would not > be necessary. It would just increase the overhead on <64-bit > systems. > The sizeof comparison is a compile-time constant so I'd expect it to be optimized out by the compiler anyway, so no overhead. Cheers, Julien
Am 18.11.2013 19:16, schrieb Julien Cristau: > On Sun, Nov 17, 2013 at 21:30:44 +0100, Jonas Petersen wrote: > >> Am 17.11.2013 20:20, schrieb Mouse: >>>> I guess the sizeof comparison would not be necessary since the >>>> condition should never meet with 64-bit longs. >>> Unless it's in a code fragment that's used only on machines with >>> <64-bit longs, it will; X runs on systems with 64-bit longs. >> I meant the condition "dpy->request < dpy->xcb_last_flushed" should >> not meet on systems with 64-bit longs (at least not until the 64-bit >> wrap). So the "sizeof(uint64_t) > sizeof(unsigned long)" would not >> be necessary. It would just increase the overhead on <64-bit >> systems. >> > The sizeof comparison is a compile-time constant so I'd expect it to be > optimized out by the compiler anyway, so no overhead. This makes sense. Being curious, I did some tests (with gcc 4.7.2 on i686). They showed that this code: int i = 0; for (;;) { ++i; if (sizeof(unsigned long) > sizeof(unsigned int) && i < 100) { printf("%d\n", i); } } on a 32-bit system compiles into the same binary as this code: for (;;) { } and on a 64-bit system it compiles into the same binary as this code: int i = 0; for (;;) { ++i; if (i < 100) { printf("%d\n", i); } } All that without any optimization options. So adding the size comparison seems to not only not increasing overhead, but even effectively optimizing for the targets. At least with this compiler on these architectures. - Jonas
Am 18.11.2013 00:33, schrieb Mouse: >>>> *wide = new + ((unsigned long) (new < *wide) << 16 << 16); >>>> The comment says "Treating the comparison as a 1 and shifting it >>>> avoids a conditional branch". >>> Only on architectures with conditional moves - and, on those, the >>> version using ? : is likely to compile down to a conditional move >>> anyway. I think that comment should be fixed. >> What do you mean exactly? > Computing the numeric value of (a < b) - as opposed to using it in a > control-flow construct - is normally implemented much the same way that > ((a < b) ? 1 : 0) is. By definition of the < operator, they are > semantically identical anyway. > >> I thought the comment was saying that using "(x < y) << z" is >> avoiding a conditinal branch introduced by using an "if" or "? :" in >> some way to get the calculation done. > That's what I think the comment is saying, yes. I also think it's > false - or, at least, it's false more often than it's true, and in a > way that's unpredictable from the point of view of anyone who actually > needs the comment. > > On architectures without conditional moves, it's difficult[%] to > compute the numerical value of (x < y) without a conditional branch. > And on architectures with conditional moves, ((x < y) ? 1 : 0) will > likely use a conditional move rather than a branch anyway. > >> Isn't the comment true then? > It depends on the compiler, the target CPU architecture, and possibly > other things (like the optimization tradeoffs the compiler has been > told to use). But if the optimizer is even half-decent, (x < y) << z > and ((x < y) ? 1 : 0) << z will compile to exactly the same > instructions. > > I think it is likely enough that it's false that leaving the comment > there with that wording is a bad idea; I also see it as a instance of > pushing a very very machine-dependent tradeoff up to the C level, > something that is better left to the optimizer, or at the very least > buried in a machine-dependent preprocessor macro. > > [%] Not impossible; it can be done with arithemtic and logical > operations. I'd render the core of the idea in 32-bit C as > ((x-y)>>31)&1 - that's got issues with numbers large enough that > their high few bits aren't all the same, but they can be fixed with > a little more code. Thanks for your explanation. I see what you mean. I guess it was just from a pure (C) code level point of view then, considering "if" or "?:" as "conditional branch" as is, without regarding what's happening at the lower levels when compiling. And possibly not regarding the meaning of "conditional branch" in other contexts. So I guess I will revoke my inspiration then and go for Ulis proposal (using and if including the size comparison). - Jonas
By design the Xlib 32-bit internal request sequence numbers may wrap. There is two locations within xcb_io.c that are not wrap-safe. The value of last_flushed relies on the request to be sequential all the time. This is not given when the sequence has just wrapped. Applications may then crash with a "Fatal IO error 11 (Resource temporarily unavailable)". This patch fixes this by unwrapping the sequence number when needed to retain the sequence relative to last_flushed. Signed-off-by: Jonas Petersen <jnsptrsn1@gmail.com> --- src/xcb_io.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)