qxl and page flip

Submitted by Frediano Ziglio on Oct. 6, 2017, 9:31 a.m.

Details

Message ID 1215613498.13949162.1507282316866.JavaMail.zimbra@redhat.com
State New
Headers show
Series "qxl and page flip" ( rev: 2 ) in Spice

Not browsing as part of any series.

Commit Message

Frediano Ziglio Oct. 6, 2017, 9:31 a.m.
> 
> Hi,
> 
> > The other option is we revert atomic modesetting from upstream qxl
> > until the hw
> > is upgraded to it, and we use the atomic path only on the newer
> > hardware.
> 
> Well, the non-atomic "page-flip support" is blitting one framebuffer
> over the other.  I wouldn't call this a solution either.  It just
> happens to work ok in most cases due to lazy rendering (i.e. spice-
> server not updating local surfaces unless there is a need to do that).
> 
> > I'd really like to see some commitment to fixing qxl if we do decide
> > to do that.
> 
> Ok, so how about the attached virtual hardware update?
> 
> cheers,
>   Gerd
> 
> 
> [Text Documents:0001-qxl-add-primary-surface-update-support.patch]
> 

Would not actually be possible to detect a destroy + create
commands and avoid having to change any version/driver?

On a physical card you are expecting to always have an output the
current destroy+create create a race condition where there's no
output at all.

Frediano

Patch hide | download patch | download mbox

From 47f514ba3c2d261dc3fa76ebab37f0bb5d566252 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Fri, 6 Oct 2017 10:17:10 +0200
Subject: [RfC PATCH] qxl: add primary surface update support

This operation allows to update the primary surface without going
through a destroy + create cycle.  It is limited to backing memory
updates (i.e. pageflip operations).  Only the spice server needs the
memory location, so this doesn't need changes in the spice protocol
and the spice client.

The guest must update QXLRam->create_surface->mem, then invoke the new
QXL_IO_UPDATE_PRIMARY_ASYNC operation via ioport write.

I'd suggest spice-server provides a new spice_qxl_update_surface_async()
function implementing this.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 spice/qxl_dev.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/spice/qxl_dev.h b/spice/qxl_dev.h
index 9e753c4073..b8fba8e233 100644
--- a/spice/qxl_dev.h
+++ b/spice/qxl_dev.h
@@ -50,6 +50,7 @@  enum {
     QXL_REVISION_STABLE_V06=0x02,
     QXL_REVISION_STABLE_V10=0x03,
     QXL_REVISION_STABLE_V12=0x04,
+    QXL_REVISION_STABLE_V14=0x05,
 };
 
 #define QXL_DEVICE_ID_DEVEL 0x01ff
@@ -97,6 +98,8 @@  enum {
     QXL_IO_FLUSH_RELEASE,
     /* appended for qxl-4 */
     QXL_IO_MONITORS_CONFIG_ASYNC,
+    /* appended for qxl-5 */
+    QXL_IO_UPDATE_PRIMARY_ASYNC,
 
     QXL_IO_RANGE_SIZE
 };
-- 
2.9.3


Comments

Hi,

> Would not actually be possible to detect a destroy + create
> commands and avoid having to change any version/driver?

Well, that would be option (3) of the original mail:  Make the spice
client hide this.  Basically not go into "no display" mode instantly
after destroy-surface, but wait a bit to see whenever it is followed by
a create-surface command.

cheers,
  Gerd
On Fri, Oct 6, 2017 at 5:40 AM, Gerd Hoffmann <kraxel@redhat.com> wrote:
>   Hi,
>
>> Would not actually be possible to detect a destroy + create
>> commands and avoid having to change any version/driver?
>
> Well, that would be option (3) of the original mail:  Make the spice
> client hide this.  Basically not go into "no display" mode instantly
> after destroy-surface, but wait a bit to see whenever it is followed by
> a create-surface command.
>

One thing to consider here. Anything that requires a host change is
going to leave some existing users broken. Some of the reports we have
seen on Fedora were quick to point out that virtio-vga wasn't an
option because they were running on CentOS or other older hosts.

Justin
On Fri, 2017-10-06 at 12:40 +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > Would not actually be possible to detect a destroy + create
> > commands and avoid having to change any version/driver?
> 
> Well, that would be option (3) of the original mail:  Make the spice
> client hide this.  Basically not go into "no display" mode instantly
> after destroy-surface, but wait a bit to see whenever it is followed
> by
> a create-surface command.

Ping.  What do you think?

cheers,
  Gerd
On 12 October 2017 at 19:16, Gerd Hoffmann <kraxel@redhat.com> wrote:
> On Fri, 2017-10-06 at 12:40 +0200, Gerd Hoffmann wrote:
>>   Hi,
>>
>> > Would not actually be possible to detect a destroy + create
>> > commands and avoid having to change any version/driver?
>>
>> Well, that would be option (3) of the original mail:  Make the spice
>> client hide this.  Basically not go into "no display" mode instantly
>> after destroy-surface, but wait a bit to see whenever it is followed
>> by
>> a create-surface command.
>
> Ping.  What do you think?

We are fixing it host side, I still think we need hack the guest until it knows
the host has the workaround,

Daniel Vetter suggested we take a different path for atomic page flip, and
put back the copy mechanism until we can support atomic modesetting properly,
which might be simpler than reverting all the qxl atomic work.

Dave.
> 
> On 12 October 2017 at 19:16, Gerd Hoffmann <kraxel@redhat.com> wrote:
> > On Fri, 2017-10-06 at 12:40 +0200, Gerd Hoffmann wrote:
> >>   Hi,
> >>
> >> > Would not actually be possible to detect a destroy + create
> >> > commands and avoid having to change any version/driver?
> >>
> >> Well, that would be option (3) of the original mail:  Make the spice
> >> client hide this.  Basically not go into "no display" mode instantly
> >> after destroy-surface, but wait a bit to see whenever it is followed
> >> by
> >> a create-surface command.
> >
> > Ping.  What do you think?
> 
> We are fixing it host side, I still think we need hack the guest until it
> knows
> the host has the workaround,
> 
> Daniel Vetter suggested we take a different path for atomic page flip, and
> put back the copy mechanism until we can support atomic modesetting properly,
> which might be simpler than reverting all the qxl atomic work.
> 
> Dave.
> 

I read the entire thread on

   https://bugzilla.kernel.org/show_bug.cgi?id=196777

so, somebody changed the driver introducing this bug.
So, by definition the fix should be in kernel (VM).

I honestly don't think that any workaround is a good way to go.
Simply we want to do a new thing that was not planned (atomic
page flipping), the current QXL protocol requires a destroy+create
on surface 0 which is the primary.

Changing host (Qemu or spice-server) is an update and as Justin
pointed out is a bad idea to require an user to update the host
in order to update the VM.

Changing client is a workaround but does not seem so easy to
do, I mean users uses different clients packages in different
way (Windows, RHV, virt-viewer). Accepting the limitation that
this workaround can bring I personally would like to have some
opinion on some spice-gtk guy on what they think. How long this
could require to be implemented? Is an easy workaround to maintain
or requires so much changes that is not worth?

About my suggestion on create+destroy detection I was suggesting
more of a spice-server change. I remember (I'm getting older)
that on the old VGA all you have to do for page flipping was to
change a register stating the start of video memory. X had Xmouse
and panning, you could have a large video and see a small portion
(never saw nobody using this but was cool). As I said I think
requiring destroy before create is not really good. I was
discussing this with Christophe Fergeau when I fixed some bugs
on surface handling. But changing this (allowing create without
destroy first) is changing an improper/undefined behaviour to
a defined one that is a protocol (QXL) change, and as such should
be declared to VM.

About multiple primary (visible on output) surfaces maybe is a
good idea on QXL protocol, no idea how this could fit in all
design (like monitors config). Current monitors config (both
QXL and network) has a surface_id so seems to indicate that
this can be used for output. How this is handled by all
components and if can be easily extended I have no idea.
In the network protocol any surface can be primary, in
server code it must be surface 0 which actually cannot be
secondary.

Frediano
Hi,

> > Daniel Vetter suggested we take a different path for atomic page
> > flip, and
> > put back the copy mechanism until we can support atomic modesetting
> > properly,
> > which might be simpler than reverting all the qxl atomic work.
> > 
> > Dave.

> About my suggestion on create+destroy detection I was suggesting
> more of a spice-server change. I remember (I'm getting older)
> that on the old VGA all you have to do for page flipping was to
> change a register stating the start of video memory.

Correct.  Fundamentally still the same on modern hardware.

Ironically pageflip works better with stdvga and spice.  When running
non-qxl display adapters spice-server doesn't know where the guest
video memory is.  qemu hands a scratchpad buffer to spice-server as
primary surface backing storage, then sends QXL_DRAW_COPY commands for
display updates.  So guests can pageflip and spice-server doesn't even
know this happened.

The problem we have with qxl is that spice-server needs to know where
the video memory is, for lazy (local) rendering.  In most cases the
local rendering never happens because it is not needed.  Thats why the
framebuffer blit hack sort-of works:  In theory spice-server's local
rendering executing the blit could overwrite the guests framebuffer,
but in practice only the spice client actually runs the blit command in
the vast majority of cases.

Switching the qxl kms driver to use a scratchpad model too should be
easy for userspace apps not using qxl ioctls (i.e. basically everything
but xorg-x11-drv-qxl).  Just use the vga memory region (start of pci
bar 0) as scratchpad, then we can safely use framebuffer blits for
display updates.  The tricky part here is to not break xorg-x11-drv-
qxl.

> on surface handling. But changing this (allowing create without
> destroy first) is changing an improper/undefined behaviour to
> a defined one that is a protocol (QXL) change, and as such should
> be declared to VM.

Yes.  When adding page flip support I'd likewise do it rather explicit.
New qxl pci revision indicating support, new qxl command (or flag) for
the guest to request this.

> About multiple primary (visible on output) surfaces maybe is a
> good idea on QXL protocol, no idea how this could fit in all
> design (like monitors config). Current monitors config (both
> QXL and network) has a surface_id so seems to indicate that
> this can be used for output.

Yes, in theory.  Big question is whenever this is implemented correctly
everywhere, given that the code likely has never seen a surface_id !=
0.

> In the network protocol any surface can be primary, in
> server code it must be surface 0 which actually cannot be
> secondary.

The qxl kms driver registers all framebuffers as secondary anyway.  The
currently visible one is configured as primary in addition, so it
actually has two ids, 0 and the real one ;)

cheers,
  Gerd
Hi,

> Switching the qxl kms driver to use a scratchpad model too should be
> easy for userspace apps not using qxl ioctls (i.e. basically
> everything
> but xorg-x11-drv-qxl).  Just use the vga memory region (start of pci
> bar 0) as scratchpad, then we can safely use framebuffer blits for
> display updates.  The tricky part here is to not break xorg-x11-drv-
> qxl.

Looks workable.  On a quick glance it seems to be as simple as handling
dumb buffers in a different way, because xorg-x11-drv-qxl doesn't use
dumb buffers but allocates objects using qxl ioctls instead.

I'll go investigate more next week.

cheers,
  Gerd
> 
> Hi,
> 
> > Switching the qxl kms driver to use a scratchpad model too should be
> > easy for userspace apps not using qxl ioctls (i.e. basically
> > everything
> > but xorg-x11-drv-qxl).  Just use the vga memory region (start of pci
> > bar 0) as scratchpad, then we can safely use framebuffer blits for
> > display updates.  The tricky part here is to not break xorg-x11-drv-
> > qxl.
> 
> Looks workable.  On a quick glance it seems to be as simple as handling
> dumb buffers in a different way, because xorg-x11-drv-qxl doesn't use
> dumb buffers but allocates objects using qxl ioctls instead.
> 
> I'll go investigate more next week.
> 
> cheers,
>   Gerd
> 

Why extending QXL protocol is not worth doing but writing workaround
on workaround is?

By the way, on option is to allocate memory, do any drawing then
build a new offscreen surface (with KEEP_DATA flag) on the allocated
memory and execute a draw from the offscreen surface to the primary one.

Frediano
Hi,

> Why extending QXL protocol is not worth doing but writing workaround
> on workaround is?

Problem is extending the qxl protocol requires changes on the host
side.  A fix which doesn't need that has the advantage that it'll work
on hosts with old spice-server and qemu versions.

We can look into qxl protocol changes too of course, so we'll have
something better long-term.

cheers,
  Gerd