[V2] modesetting: Fix X crash in ms_dirty_update()

Submitted by Qu, Jim on Aug. 6, 2018, 2:40 a.m.

Details

Message ID 1533523214-25980-1-git-send-email-Jim.Qu@amd.com
State New
Headers show
Series "modesetting: Fix X crash in ms_dirty_update()" ( rev: 3 ) in X.org

Not browsing as part of any series.

Commit Message

Qu, Jim Aug. 6, 2018, 2:40 a.m.
On some Intel iGPU + AMD dGPU platform, when connect extern display
from dGPU, X will crash, show the log like:

randr: falling back to unsynchronized pixmap sharing
(EE)
(EE) Backtrace:
(EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
(EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
(EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
(EE)
(EE) Segmentation fault at address 0x0
(EE)

There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
SharedPixmapNotifyDamage.

On the platform, since the dGPU is GPU device, so that the iGPU is
output master device. SharedPixmapNotifyDamage() should be called when
current device is output master.

Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
Signed-off-by: Jim Qu <Jim.Qu@amd.com>
---
 hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

Patch hide | download patch | download mbox

diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
index 9362370..37fafb1 100644
--- a/hw/xfree86/drivers/modesetting/driver.c
+++ b/hw/xfree86/drivers/modesetting/driver.c
@@ -640,19 +640,21 @@  ms_dirty_update(ScreenPtr screen, int *timeout)
     xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
         region = DamageRegion(ent->damage);
         if (RegionNotEmpty(region)) {
-            msPixmapPrivPtr ppriv =
-                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
+            if (!screen->isGPU) {
+                    msPixmapPrivPtr ppriv =
+                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
 
-            if (ppriv->notify_on_damage) {
-                ppriv->notify_on_damage = FALSE;
+                    if (ppriv->notify_on_damage) {
+                        ppriv->notify_on_damage = FALSE;
 
-                ent->slave_dst->drawable.pScreen->
-                    SharedPixmapNotifyDamage(ent->slave_dst);
-            }
+                        ent->slave_dst->drawable.pScreen->
+                            SharedPixmapNotifyDamage(ent->slave_dst);
+                    }
 
-            /* Requested manual updating */
-            if (ppriv->defer_dirty_update)
-                continue;
+                    /* Requested manual updating */
+                    if (ppriv->defer_dirty_update)
+                        continue;
+            }
 
             redisplay_dirty(screen, ent, timeout);
             DamageEmpty(ent->damage);

Comments

ping....


On 2018年08月06日 10:40, Jim Qu wrote:
> On some Intel iGPU + AMD dGPU platform, when connect extern display
> from dGPU, X will crash, show the log like:
>
> randr: falling back to unsynchronized pixmap sharing
> (EE)
> (EE) Backtrace:
> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
> (EE)
> (EE) Segmentation fault at address 0x0
> (EE)
>
> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
> SharedPixmapNotifyDamage.
>
> On the platform, since the dGPU is GPU device, so that the iGPU is
> output master device. SharedPixmapNotifyDamage() should be called when
> current device is output master.
>
> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
> Signed-off-by: Jim Qu <Jim.Qu@amd.com>
> ---
>   hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>   1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
> index 9362370..37fafb1 100644
> --- a/hw/xfree86/drivers/modesetting/driver.c
> +++ b/hw/xfree86/drivers/modesetting/driver.c
> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>       xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>           region = DamageRegion(ent->damage);
>           if (RegionNotEmpty(region)) {
> -            msPixmapPrivPtr ppriv =
> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
> +            if (!screen->isGPU) {
> +                    msPixmapPrivPtr ppriv =
> +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
>   
> -            if (ppriv->notify_on_damage) {
> -                ppriv->notify_on_damage = FALSE;
> +                    if (ppriv->notify_on_damage) {
> +                        ppriv->notify_on_damage = FALSE;
>   
> -                ent->slave_dst->drawable.pScreen->
> -                    SharedPixmapNotifyDamage(ent->slave_dst);
> -            }
> +                        ent->slave_dst->drawable.pScreen->
> +                            SharedPixmapNotifyDamage(ent->slave_dst);
> +                    }
>   
> -            /* Requested manual updating */
> -            if (ppriv->defer_dirty_update)
> -                continue;
> +                    /* Requested manual updating */
> +                    if (ppriv->defer_dirty_update)
> +                        continue;
> +            }
>   
>               redisplay_dirty(screen, ent, timeout);
>               DamageEmpty(ent->damage);
Anybody could give a RB? Michel?

Thanks

JimQu


On 2018年08月07日 12:47, jimqu wrote:
> ping....
>
>
> On 2018年08月06日 10:40, Jim Qu wrote:
>> On some Intel iGPU + AMD dGPU platform, when connect extern display
>> from dGPU, X will crash, show the log like:
>>
>> randr: falling back to unsynchronized pixmap sharing
>> (EE)
>> (EE) Backtrace:
>> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
>> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
>> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
>> (EE)
>> (EE) Segmentation fault at address 0x0
>> (EE)
>>
>> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
>> SharedPixmapNotifyDamage.
>>
>> On the platform, since the dGPU is GPU device, so that the iGPU is
>> output master device. SharedPixmapNotifyDamage() should be called when
>> current device is output master.
>>
>> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
>> Signed-off-by: Jim Qu <Jim.Qu@amd.com>
>> ---
>>   hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/xfree86/drivers/modesetting/driver.c 
>> b/hw/xfree86/drivers/modesetting/driver.c
>> index 9362370..37fafb1 100644
>> --- a/hw/xfree86/drivers/modesetting/driver.c
>> +++ b/hw/xfree86/drivers/modesetting/driver.c
>> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>>       xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>>           region = DamageRegion(ent->damage);
>>           if (RegionNotEmpty(region)) {
>> -            msPixmapPrivPtr ppriv =
>> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
>> +            if (!screen->isGPU) {
>> +                    msPixmapPrivPtr ppriv =
>> +                        msGetPixmapPriv(&ms->drmmode, 
>> ent->slave_dst->master_pixmap);
>>   -            if (ppriv->notify_on_damage) {
>> -                ppriv->notify_on_damage = FALSE;
>> +                    if (ppriv->notify_on_damage) {
>> +                        ppriv->notify_on_damage = FALSE;
>>   -                ent->slave_dst->drawable.pScreen->
>> - SharedPixmapNotifyDamage(ent->slave_dst);
>> -            }
>> + ent->slave_dst->drawable.pScreen->
>> + SharedPixmapNotifyDamage(ent->slave_dst);
>> +                    }
>>   -            /* Requested manual updating */
>> -            if (ppriv->defer_dirty_update)
>> -                continue;
>> +                    /* Requested manual updating */
>> +                    if (ppriv->defer_dirty_update)
>> +                        continue;
>> +            }
>>                 redisplay_dirty(screen, ent, timeout);
>>               DamageEmpty(ent->damage);
>
On Sun, Aug 5, 2018 at 10:40 PM, Jim Qu <Jim.Qu@amd.com> wrote:
> On some Intel iGPU + AMD dGPU platform, when connect extern display
> from dGPU, X will crash, show the log like:
>
> randr: falling back to unsynchronized pixmap sharing
> (EE)
> (EE) Backtrace:
> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
> (EE)
> (EE) Segmentation fault at address 0x0
> (EE)
>
> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
> SharedPixmapNotifyDamage.
>
> On the platform, since the dGPU is GPU device, so that the iGPU is
> output master device. SharedPixmapNotifyDamage() should be called when
> current device is output master.
>
> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
> Signed-off-by: Jim Qu <Jim.Qu@amd.com>

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>


> ---
>  hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
> index 9362370..37fafb1 100644
> --- a/hw/xfree86/drivers/modesetting/driver.c
> +++ b/hw/xfree86/drivers/modesetting/driver.c
> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>      xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>          region = DamageRegion(ent->damage);
>          if (RegionNotEmpty(region)) {
> -            msPixmapPrivPtr ppriv =
> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
> +            if (!screen->isGPU) {
> +                    msPixmapPrivPtr ppriv =
> +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
>
> -            if (ppriv->notify_on_damage) {
> -                ppriv->notify_on_damage = FALSE;
> +                    if (ppriv->notify_on_damage) {
> +                        ppriv->notify_on_damage = FALSE;
>
> -                ent->slave_dst->drawable.pScreen->
> -                    SharedPixmapNotifyDamage(ent->slave_dst);
> -            }
> +                        ent->slave_dst->drawable.pScreen->
> +                            SharedPixmapNotifyDamage(ent->slave_dst);
> +                    }
>
> -            /* Requested manual updating */
> -            if (ppriv->defer_dirty_update)
> -                continue;
> +                    /* Requested manual updating */
> +                    if (ppriv->defer_dirty_update)
> +                        continue;
> +            }
>
>              redisplay_dirty(screen, ent, timeout);
>              DamageEmpty(ent->damage);
> --
> 2.7.4
>
> _______________________________________________
> xorg-devel@lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: https://lists.x.org/mailman/listinfo/xorg-devel
I think this patch is along the right lines, but misses part of the fix.
Although I don't have access to a modesetting<->modesetting PRIME output slaving
setup to test this theory right now, I think this will break it. Have you tested
that this path is exercised?

It is true that SharedPixmapNotifyDamage() is a function implemented by the
slave that should only be called by the master, and that ppriv->notify_on_damage
is supposed to be set on master's ppriv. The current implementation is
incorrect, as you've found, but it still worked just fine for
modesetting<->modesetting PRIME output slaving. Why is that?

The control flow should be as follows (and only applies to PRIME Sync):

--

1) Slave calls master->PresentSharedPixmap(), and it fails (see
drmmode_SharedPixmapPresent).

2) Slave calls master->RequestSharedPixmapNotifyDamage(pSlavePix) to request for
the master to call slave->SharedPixmapNotifyDamage(pSlavePix) when the backing
master pixmap receives damage. If the master is the modesetting driver, the
master sets pMasterPixPriv->notify_on_damage = TRUE to know that it needs to
notify the slave when damage is received.

3) When the master receives damage, if pMasterPixPriv->notify_on_damage == TRUE,
it calls slave->SharedPixmapNotifyDamage(pSlavePix) and then sets
pMasterPixPriv->notify_on_damage = FALSE.

--

There are currently errors in both step 2 and step 3. In step 2, the modesetting
driver as master currently sets pSlavePixPriv->notify_on_damage = TRUE instead
of pMasterPixPriv->notify_on_damage = TRUE.  In step 3, the modesetting driver
as master checks pSlavePixPriv->notify_on_damage == TRUE, and if it passes,
calls pSlave->SharedPixmapNotifyDamage(pSlavePix). This currently works on
modesetting<->modesetting PRIME because while both are wrong, they are
consistently wrong, and on modesetting<->modesetting PRIME setups, pSlavePixPriv
is accessible to both master and slave. All of the attributes are being stored
in and accessed from pSlavePixPriv, whereas the master-related ones should
actually be stored in pMasterPixPriv.

My understanding of your setup is that you have modesetting as the master, and
AMD as the slave. Thus, when the master attempts to access pSlavePixPriv,
problems result due to the fact that it's accessing AMD's 'ppriv' using the
modesetting structure definition.

Your patch fixes step 3 by getting 'ppriv' from the master pixmap
(pMasterPixPriv as above), rather than the slave. The !screen->isGPU check is
probably good to have, but it's probably redundant because pixmap_dirty_list
should be empty for GPU screens. I think that is all fine.

The patch also needs to fix step 2, again by getting 'ppriv' from the master
pixmap rather than the slave, but this time for
msRequestSharedPixmapNotifyDamage(). Without that,
pMasterPixPriv->notify_on_damage will always be FALSE, breaking
modesetting<->modesetting PRIME. Moreover, it could cause more problems like the
original bug if a non-modesetting slave were to call
master->RequestSharedPixmapNotifyDamage(pSlavePix).

I was also confused as to why slave->SharedPixmapNotifyDamage() is being called at
all on your setup, as 'randr: falling back to unsynchronized pixmap sharing'
indicates that PRIME Sync is disabled which should mean that
ppriv->notify_on_damage should always be FALSE. I think that's just because
ppriv is currently that of the AMD slave, so it's reading some random part of
AMD's ppriv using modesetting's structure definition.

I think if you fix msRequestSharedPixmapNotifyDamage() as described above, this
patch should be good to go, but as it stands, it will likely introduce a regression in
modesetting<->modesetting PRIME. Please fix that, then verify that
modesetting<->modesetting PRIME (with PRIME Sync) still works. I can also verify
it myself once I have access to a test setup that will work with it.

Thanks,
Alex

On Wed, 8 Aug 2018, Alex Deucher wrote:

> On Sun, Aug 5, 2018 at 10:40 PM, Jim Qu <Jim.Qu@amd.com> wrote:
> > On some Intel iGPU + AMD dGPU platform, when connect extern display
> > from dGPU, X will crash, show the log like:
> >
> > randr: falling back to unsynchronized pixmap sharing
> > (EE)
> > (EE) Backtrace:
> > (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
> > (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
> > (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
> > (EE)
> > (EE) Segmentation fault at address 0x0
> > (EE)
> >
> > There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
> > SharedPixmapNotifyDamage.
> >
> > On the platform, since the dGPU is GPU device, so that the iGPU is
> > output master device. SharedPixmapNotifyDamage() should be called when
> > current device is output master.
> >
> > Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
> > Signed-off-by: Jim Qu <Jim.Qu@amd.com>
> 
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> 
> 
> > ---
> >  hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
> >  1 file changed, 12 insertions(+), 10 deletions(-)
> >
> > diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
> > index 9362370..37fafb1 100644
> > --- a/hw/xfree86/drivers/modesetting/driver.c
> > +++ b/hw/xfree86/drivers/modesetting/driver.c
> > @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
> >      xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
> >          region = DamageRegion(ent->damage);
> >          if (RegionNotEmpty(region)) {
> > -            msPixmapPrivPtr ppriv =
> > -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
> > +            if (!screen->isGPU) {
> > +                    msPixmapPrivPtr ppriv =
> > +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
> >
> > -            if (ppriv->notify_on_damage) {
> > -                ppriv->notify_on_damage = FALSE;
> > +                    if (ppriv->notify_on_damage) {
> > +                        ppriv->notify_on_damage = FALSE;
> >
> > -                ent->slave_dst->drawable.pScreen->
> > -                    SharedPixmapNotifyDamage(ent->slave_dst);
> > -            }
> > +                        ent->slave_dst->drawable.pScreen->
> > +                            SharedPixmapNotifyDamage(ent->slave_dst);
> > +                    }
> >
> > -            /* Requested manual updating */
> > -            if (ppriv->defer_dirty_update)
> > -                continue;
> > +                    /* Requested manual updating */
> > +                    if (ppriv->defer_dirty_update)
> > +                        continue;
> > +            }
> >
> >              redisplay_dirty(screen, ent, timeout);
> >              DamageEmpty(ent->damage);
> > --
> > 2.7.4
> >
> > _______________________________________________
> > xorg-devel@lists.x.org: X.Org development
> > Archives: http://lists.x.org/archives/xorg-devel
> > Info: https://lists.x.org/mailman/listinfo/xorg-devel
> _______________________________________________
> xorg-devel@lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: https://lists.x.org/mailman/listinfo/xorg-devel
On 2018年08月18日 02:55, Alex Goins wrote:
> I think this patch is along the right lines, but misses part of the fix.
> Although I don't have access to a modesetting<->modesetting PRIME output slaving
> setup to test this theory right now, I think this will break it. Have you tested
> that this path is exercised?
>
> It is true that SharedPixmapNotifyDamage() is a function implemented by the
> slave that should only be called by the master, and that ppriv->notify_on_damage
> is supposed to be set on master's ppriv. The current implementation is
> incorrect, as you've found, but it still worked just fine for
> modesetting<->modesetting PRIME output slaving. Why is that?
>
> The control flow should be as follows (and only applies to PRIME Sync):
>
> --
>
> 1) Slave calls master->PresentSharedPixmap(), and it fails (see
> drmmode_SharedPixmapPresent).
>
> 2) Slave calls master->RequestSharedPixmapNotifyDamage(pSlavePix) to request for
> the master to call slave->SharedPixmapNotifyDamage(pSlavePix) when the backing
> master pixmap receives damage. If the master is the modesetting driver, the
> master sets pMasterPixPriv->notify_on_damage = TRUE to know that it needs to
> notify the slave when damage is received.
>
> 3) When the master receives damage, if pMasterPixPriv->notify_on_damage == TRUE,
> it calls slave->SharedPixmapNotifyDamage(pSlavePix) and then sets
> pMasterPixPriv->notify_on_damage = FALSE.
>
> --
>
> There are currently errors in both step 2 and step 3. In step 2, the modesetting
> driver as master currently sets pSlavePixPriv->notify_on_damage = TRUE instead
> of pMasterPixPriv->notify_on_damage = TRUE.  In step 3, the modesetting driver
> as master checks pSlavePixPriv->notify_on_damage == TRUE, and if it passes,
> calls pSlave->SharedPixmapNotifyDamage(pSlavePix). This currently works on
> modesetting<->modesetting PRIME because while both are wrong, they are
> consistently wrong, and on modesetting<->modesetting PRIME setups, pSlavePixPriv
> is accessible to both master and slave. All of the attributes are being stored
> in and accessed from pSlavePixPriv, whereas the master-related ones should
> actually be stored in pMasterPixPriv.
>
> My understanding of your setup is that you have modesetting as the master, and
> AMD as the slave. Thus, when the master attempts to access pSlavePixPriv,
> problems result due to the fact that it's accessing AMD's 'ppriv' using the
> modesetting structure definition.
>
> Your patch fixes step 3 by getting 'ppriv' from the master pixmap
> (pMasterPixPriv as above), rather than the slave. The !screen->isGPU check is
> probably good to have, but it's probably redundant because pixmap_dirty_list
> should be empty for GPU screens. I think that is all fine.

No, based on the test actually, the pixmap_dirty_list is not empty(not 
sure it is a bug or not), so that driver has the chance to refer AMD 
'ppriv' from modesetting structure.

> The patch also needs to fix step 2, again by getting 'ppriv' from the master
> pixmap rather than the slave, but this time for
> msRequestSharedPixmapNotifyDamage(). Without that,
> pMasterPixPriv->notify_on_damage will always be FALSE, breaking
> modesetting<->modesetting PRIME. Moreover, it could cause more problems like the
> original bug if a non-modesetting slave were to call
> master->RequestSharedPixmapNotifyDamage(pSlavePix).
>
> I was also confused as to why slave->SharedPixmapNotifyDamage() is being called at
> all on your setup, as 'randr: falling back to unsynchronized pixmap sharing'
> indicates that PRIME Sync is disabled which should mean that
> ppriv->notify_on_damage should always be FALSE. I think that's just because
> ppriv is currently that of the AMD slave, so it's reading some random part of
> AMD's ppriv using modesetting's structure definition.
>
> I think if you fix msRequestSharedPixmapNotifyDamage() as described above, this
> patch should be good to go, but as it stands, it will likely introduce a regression in
> modesetting<->modesetting PRIME. Please fix that, then verify that
> modesetting<->modesetting PRIME (with PRIME Sync) still works. I can also verify
> it myself once I have access to a test setup that will work with it.
>
> Thanks,
> Alex

OK, I will work on it today.


Thanks
JimQu
> On Wed, 8 Aug 2018, Alex Deucher wrote:
>
>> On Sun, Aug 5, 2018 at 10:40 PM, Jim Qu <Jim.Qu@amd.com> wrote:
>>> On some Intel iGPU + AMD dGPU platform, when connect extern display
>>> from dGPU, X will crash, show the log like:
>>>
>>> randr: falling back to unsynchronized pixmap sharing
>>> (EE)
>>> (EE) Backtrace:
>>> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
>>> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
>>> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
>>> (EE)
>>> (EE) Segmentation fault at address 0x0
>>> (EE)
>>>
>>> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
>>> SharedPixmapNotifyDamage.
>>>
>>> On the platform, since the dGPU is GPU device, so that the iGPU is
>>> output master device. SharedPixmapNotifyDamage() should be called when
>>> current device is output master.
>>>
>>> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
>>> Signed-off-by: Jim Qu <Jim.Qu@amd.com>
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>>
>>
>>> ---
>>>   hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
>>> index 9362370..37fafb1 100644
>>> --- a/hw/xfree86/drivers/modesetting/driver.c
>>> +++ b/hw/xfree86/drivers/modesetting/driver.c
>>> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>>>       xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>>>           region = DamageRegion(ent->damage);
>>>           if (RegionNotEmpty(region)) {
>>> -            msPixmapPrivPtr ppriv =
>>> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
>>> +            if (!screen->isGPU) {
>>> +                    msPixmapPrivPtr ppriv =
>>> +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
>>>
>>> -            if (ppriv->notify_on_damage) {
>>> -                ppriv->notify_on_damage = FALSE;
>>> +                    if (ppriv->notify_on_damage) {
>>> +                        ppriv->notify_on_damage = FALSE;
>>>
>>> -                ent->slave_dst->drawable.pScreen->
>>> -                    SharedPixmapNotifyDamage(ent->slave_dst);
>>> -            }
>>> +                        ent->slave_dst->drawable.pScreen->
>>> +                            SharedPixmapNotifyDamage(ent->slave_dst);
>>> +                    }
>>>
>>> -            /* Requested manual updating */
>>> -            if (ppriv->defer_dirty_update)
>>> -                continue;
>>> +                    /* Requested manual updating */
>>> +                    if (ppriv->defer_dirty_update)
>>> +                        continue;
>>> +            }
>>>
>>>               redisplay_dirty(screen, ent, timeout);
>>>               DamageEmpty(ent->damage);
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> xorg-devel@lists.x.org: X.Org development
>>> Archives: http://lists.x.org/archives/xorg-devel
>>> Info: https://lists.x.org/mailman/listinfo/xorg-devel
>> _______________________________________________
>> xorg-devel@lists.x.org: X.Org development
>> Archives: http://lists.x.org/archives/xorg-devel
>> Info: https://lists.x.org/mailman/listinfo/xorg-devel
Hi AlexG,

I just has a question.

Did you suppose the output master device(dGPU) is equivalent to master 
screen, output slave(iGPU) device is equivalent to slave screen in 
modesetting driver under PRIME(not reverse) case?

Thanks

JimQu


On 2018年08月18日 02:55, Alex Goins wrote:
> I think this patch is along the right lines, but misses part of the fix.
> Although I don't have access to a modesetting<->modesetting PRIME output slaving
> setup to test this theory right now, I think this will break it. Have you tested
> that this path is exercised?
>
> It is true that SharedPixmapNotifyDamage() is a function implemented by the
> slave that should only be called by the master, and that ppriv->notify_on_damage
> is supposed to be set on master's ppriv. The current implementation is
> incorrect, as you've found, but it still worked just fine for
> modesetting<->modesetting PRIME output slaving. Why is that?
>
> The control flow should be as follows (and only applies to PRIME Sync):
>
> --
>
> 1) Slave calls master->PresentSharedPixmap(), and it fails (see
> drmmode_SharedPixmapPresent).
>
> 2) Slave calls master->RequestSharedPixmapNotifyDamage(pSlavePix) to request for
> the master to call slave->SharedPixmapNotifyDamage(pSlavePix) when the backing
> master pixmap receives damage. If the master is the modesetting driver, the
> master sets pMasterPixPriv->notify_on_damage = TRUE to know that it needs to
> notify the slave when damage is received.
>
> 3) When the master receives damage, if pMasterPixPriv->notify_on_damage == TRUE,
> it calls slave->SharedPixmapNotifyDamage(pSlavePix) and then sets
> pMasterPixPriv->notify_on_damage = FALSE.
>
> --
>
> There are currently errors in both step 2 and step 3. In step 2, the modesetting
> driver as master currently sets pSlavePixPriv->notify_on_damage = TRUE instead
> of pMasterPixPriv->notify_on_damage = TRUE.  In step 3, the modesetting driver
> as master checks pSlavePixPriv->notify_on_damage == TRUE, and if it passes,
> calls pSlave->SharedPixmapNotifyDamage(pSlavePix). This currently works on
> modesetting<->modesetting PRIME because while both are wrong, they are
> consistently wrong, and on modesetting<->modesetting PRIME setups, pSlavePixPriv
> is accessible to both master and slave. All of the attributes are being stored
> in and accessed from pSlavePixPriv, whereas the master-related ones should
> actually be stored in pMasterPixPriv.
>
> My understanding of your setup is that you have modesetting as the master, and
> AMD as the slave. Thus, when the master attempts to access pSlavePixPriv,
> problems result due to the fact that it's accessing AMD's 'ppriv' using the
> modesetting structure definition.
>
> Your patch fixes step 3 by getting 'ppriv' from the master pixmap
> (pMasterPixPriv as above), rather than the slave. The !screen->isGPU check is
> probably good to have, but it's probably redundant because pixmap_dirty_list
> should be empty for GPU screens. I think that is all fine.
>
> The patch also needs to fix step 2, again by getting 'ppriv' from the master
> pixmap rather than the slave, but this time for
> msRequestSharedPixmapNotifyDamage(). Without that,
> pMasterPixPriv->notify_on_damage will always be FALSE, breaking
> modesetting<->modesetting PRIME. Moreover, it could cause more problems like the
> original bug if a non-modesetting slave were to call
> master->RequestSharedPixmapNotifyDamage(pSlavePix).
>
> I was also confused as to why slave->SharedPixmapNotifyDamage() is being called at
> all on your setup, as 'randr: falling back to unsynchronized pixmap sharing'
> indicates that PRIME Sync is disabled which should mean that
> ppriv->notify_on_damage should always be FALSE. I think that's just because
> ppriv is currently that of the AMD slave, so it's reading some random part of
> AMD's ppriv using modesetting's structure definition.
>
> I think if you fix msRequestSharedPixmapNotifyDamage() as described above, this
> patch should be good to go, but as it stands, it will likely introduce a regression in
> modesetting<->modesetting PRIME. Please fix that, then verify that
> modesetting<->modesetting PRIME (with PRIME Sync) still works. I can also verify
> it myself once I have access to a test setup that will work with it.
>
> Thanks,
> Alex
>
> On Wed, 8 Aug 2018, Alex Deucher wrote:
>
>> On Sun, Aug 5, 2018 at 10:40 PM, Jim Qu <Jim.Qu@amd.com> wrote:
>>> On some Intel iGPU + AMD dGPU platform, when connect extern display
>>> from dGPU, X will crash, show the log like:
>>>
>>> randr: falling back to unsynchronized pixmap sharing
>>> (EE)
>>> (EE) Backtrace:
>>> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
>>> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
>>> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
>>> (EE)
>>> (EE) Segmentation fault at address 0x0
>>> (EE)
>>>
>>> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
>>> SharedPixmapNotifyDamage.
>>>
>>> On the platform, since the dGPU is GPU device, so that the iGPU is
>>> output master device. SharedPixmapNotifyDamage() should be called when
>>> current device is output master.
>>>
>>> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
>>> Signed-off-by: Jim Qu <Jim.Qu@amd.com>
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>>
>>
>>> ---
>>>   hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
>>> index 9362370..37fafb1 100644
>>> --- a/hw/xfree86/drivers/modesetting/driver.c
>>> +++ b/hw/xfree86/drivers/modesetting/driver.c
>>> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>>>       xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>>>           region = DamageRegion(ent->damage);
>>>           if (RegionNotEmpty(region)) {
>>> -            msPixmapPrivPtr ppriv =
>>> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
>>> +            if (!screen->isGPU) {
>>> +                    msPixmapPrivPtr ppriv =
>>> +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
>>>
>>> -            if (ppriv->notify_on_damage) {
>>> -                ppriv->notify_on_damage = FALSE;
>>> +                    if (ppriv->notify_on_damage) {
>>> +                        ppriv->notify_on_damage = FALSE;
>>>
>>> -                ent->slave_dst->drawable.pScreen->
>>> -                    SharedPixmapNotifyDamage(ent->slave_dst);
>>> -            }
>>> +                        ent->slave_dst->drawable.pScreen->
>>> +                            SharedPixmapNotifyDamage(ent->slave_dst);
>>> +                    }
>>>
>>> -            /* Requested manual updating */
>>> -            if (ppriv->defer_dirty_update)
>>> -                continue;
>>> +                    /* Requested manual updating */
>>> +                    if (ppriv->defer_dirty_update)
>>> +                        continue;
>>> +            }
>>>
>>>               redisplay_dirty(screen, ent, timeout);
>>>               DamageEmpty(ent->damage);
>>> --
>>> 2.7.4
>>>
>>> _______________________________________________
>>> xorg-devel@lists.x.org: X.Org development
>>> Archives: http://lists.x.org/archives/xorg-devel
>>> Info: https://lists.x.org/mailman/listinfo/xorg-devel
>> _______________________________________________
>> xorg-devel@lists.x.org: X.Org development
>> Archives: http://lists.x.org/archives/xorg-devel
>> Info: https://lists.x.org/mailman/listinfo/xorg-devel
Hi JimQu,

I'm not sure that I fully understand the question, so I'll just try to cover all
the bases, apologies if I miss what you are asking.

The output master device (typically dGPU, also known as the 'source') is that
which is driving the true X screen, whereas the output slave device (typically
iGPU, also known as the 'sink') is associated with the GPU screen, which doesn't
do any rendering, just display. 'Reverse PRIME' refers to when a dGPU is the
slave/sink, although that terminology has been confusing to me in the past.

Your original description seems to claim that for the crashing config, the AMD
dGPU is actually the output slave/sink (GPU screen), and the Intel iGPU is the
output master/source (X screen). I suppose that would be considered reverse
PRIME.

The modesetting driver is meant to store PRIME parameters that are
pixmap-specific in the associated msPixmapPrivRec. The issue here is that there
is a different copy of the pixmap for the output master and the output slave,
respectively. You'll notice in the definition of msPixmapPrivRec that some
fields are designated for the sink, and others are designated for the source.
The sink fields should only be set/get from the slave copy of the pixmap, and
the source fields should only be set/get from the master copy of the pixmap.
Currently, they are all being set/get from the slave copy. This ends up working
out fine if the modesetting driver is being used for both master and slave (such
as Nouveau+Intel), or if the modesetting driver is being used only as the slave
(such as NVIDIA+Intel or AMD+Intel non-reverse PRIME), but not if the
modesetting driver is being used as a master with a non-modesetting slave (such
as with Intel+AMD reverse PRIME, what I assume is your config).

So, the core of the fix should just be making sure that in any case where the
modesetting driver sets/gets pixmap-specific PRIME parameters that are
associated with the master/source (as specified in the comments in the
msPixmapPrivRec definition), it does so via the master copy of the pixmap, as
you've done for ms_dirty_update() already.

Thanks,
Alex

On Mon, 20 Aug 2018, jimqu wrote:

> 
> Hi AlexG,
> 
> I just has a question.
> 
> Did you suppose the output master device(dGPU) is equivalent to master screen, output slave(iGPU) device
> is equivalent to slave screen in modesetting driver under PRIME(not reverse) case?
> 
> Thanks
> 
> JimQu
> 
> 
> On 2018年08月18日 02:55, Alex Goins wrote:
> 
> I think this patch is along the right lines, but misses part of the fix.
> Although I don't have access to a modesetting<->modesetting PRIME output slaving
> setup to test this theory right now, I think this will break it. Have you tested
> that this path is exercised?
> 
> It is true that SharedPixmapNotifyDamage() is a function implemented by the
> slave that should only be called by the master, and that ppriv->notify_on_damage
> is supposed to be set on master's ppriv. The current implementation is
> incorrect, as you've found, but it still worked just fine for
> modesetting<->modesetting PRIME output slaving. Why is that?
> 
> The control flow should be as follows (and only applies to PRIME Sync):
> 
> --
> 
> 1) Slave calls master->PresentSharedPixmap(), and it fails (see
> drmmode_SharedPixmapPresent).
> 
> 2) Slave calls master->RequestSharedPixmapNotifyDamage(pSlavePix) to request for
> the master to call slave->SharedPixmapNotifyDamage(pSlavePix) when the backing
> master pixmap receives damage. If the master is the modesetting driver, the
> master sets pMasterPixPriv->notify_on_damage = TRUE to know that it needs to
> notify the slave when damage is received.
> 
> 3) When the master receives damage, if pMasterPixPriv->notify_on_damage == TRUE,
> it calls slave->SharedPixmapNotifyDamage(pSlavePix) and then sets
> pMasterPixPriv->notify_on_damage = FALSE.
> 
> --
> 
> There are currently errors in both step 2 and step 3. In step 2, the modesetting
> driver as master currently sets pSlavePixPriv->notify_on_damage = TRUE instead
> of pMasterPixPriv->notify_on_damage = TRUE.  In step 3, the modesetting driver
> as master checks pSlavePixPriv->notify_on_damage == TRUE, and if it passes,
> calls pSlave->SharedPixmapNotifyDamage(pSlavePix). This currently works on
> modesetting<->modesetting PRIME because while both are wrong, they are
> consistently wrong, and on modesetting<->modesetting PRIME setups, pSlavePixPriv
> is accessible to both master and slave. All of the attributes are being stored
> in and accessed from pSlavePixPriv, whereas the master-related ones should
> actually be stored in pMasterPixPriv.
> 
> My understanding of your setup is that you have modesetting as the master, and
> AMD as the slave. Thus, when the master attempts to access pSlavePixPriv,
> problems result due to the fact that it's accessing AMD's 'ppriv' using the
> modesetting structure definition.
> 
> Your patch fixes step 3 by getting 'ppriv' from the master pixmap
> (pMasterPixPriv as above), rather than the slave. The !screen->isGPU check is
> probably good to have, but it's probably redundant because pixmap_dirty_list
> should be empty for GPU screens. I think that is all fine.
> 
> The patch also needs to fix step 2, again by getting 'ppriv' from the master
> pixmap rather than the slave, but this time for
> msRequestSharedPixmapNotifyDamage(). Without that,
> pMasterPixPriv->notify_on_damage will always be FALSE, breaking
> modesetting<->modesetting PRIME. Moreover, it could cause more problems like the
> original bug if a non-modesetting slave were to call
> master->RequestSharedPixmapNotifyDamage(pSlavePix).
> 
> I was also confused as to why slave->SharedPixmapNotifyDamage() is being called at
> all on your setup, as 'randr: falling back to unsynchronized pixmap sharing'
> indicates that PRIME Sync is disabled which should mean that
> ppriv->notify_on_damage should always be FALSE. I think that's just because
> ppriv is currently that of the AMD slave, so it's reading some random part of
> AMD's ppriv using modesetting's structure definition.
> 
> I think if you fix msRequestSharedPixmapNotifyDamage() as described above, this
> patch should be good to go, but as it stands, it will likely introduce a regression in
> modesetting<->modesetting PRIME. Please fix that, then verify that
> modesetting<->modesetting PRIME (with PRIME Sync) still works. I can also verify
> it myself once I have access to a test setup that will work with it.
> 
> Thanks,
> Alex
> 
> On Wed, 8 Aug 2018, Alex Deucher wrote:
> 
> On Sun, Aug 5, 2018 at 10:40 PM, Jim Qu <Jim.Qu@amd.com> wrote:
> 
> On some Intel iGPU + AMD dGPU platform, when connect extern display
> from dGPU, X will crash, show the log like:
> 
> randr: falling back to unsynchronized pixmap sharing
> (EE)
> (EE) Backtrace:
> (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e)
> (EE) 1: /usr/lib/xorg/Xorg (0x55cb0151a000+0x1b5ce9)
> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f1587a1d000+0x11390)
> (EE)
> (EE) Segmentation fault at address 0x0
> (EE)
> 
> There is NULL pointer accessing on ent->slave_dst->drawable.pScreen->
> SharedPixmapNotifyDamage.
> 
> On the platform, since the dGPU is GPU device, so that the iGPU is
> output master device. SharedPixmapNotifyDamage() should be called when
> current device is output master.
> 
> Change-Id: I8fa6922a4f75b5e068970fc4d362f778052379f2
> Signed-off-by: Jim Qu <Jim.Qu@amd.com>
> 
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> 
> 
> ---
>  hw/xfree86/drivers/modesetting/driver.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/xfree86/drivers/modesetting/driver.c b/hw/xfree86/drivers/modesetting/driver.c
> index 9362370..37fafb1 100644
> --- a/hw/xfree86/drivers/modesetting/driver.c
> +++ b/hw/xfree86/drivers/modesetting/driver.c
> @@ -640,19 +640,21 @@ ms_dirty_update(ScreenPtr screen, int *timeout)
>      xorg_list_for_each_entry(ent, &screen->pixmap_dirty_list, ent) {
>          region = DamageRegion(ent->damage);
>          if (RegionNotEmpty(region)) {
> -            msPixmapPrivPtr ppriv =
> -                msGetPixmapPriv(&ms->drmmode, ent->slave_dst);
> +            if (!screen->isGPU) {
> +                    msPixmapPrivPtr ppriv =
> +                        msGetPixmapPriv(&ms->drmmode, ent->slave_dst->master_pixmap);
> 
> -            if (ppriv->notify_on_damage) {
> -                ppriv->notify_on_damage = FALSE;
> +                    if (ppriv->notify_on_damage) {
> +                        ppriv->notify_on_damage = FALSE;
> 
> -                ent->slave_dst->drawable.pScreen->
> -                    SharedPixmapNotifyDamage(ent->slave_dst);
> -            }
> +                        ent->slave_dst->drawable.pScreen->
> +                            SharedPixmapNotifyDamage(ent->slave_dst);
> +                    }
> 
> -            /* Requested manual updating */
> -            if (ppriv->defer_dirty_update)
> -                continue;
> +                    /* Requested manual updating */
> +                    if (ppriv->defer_dirty_update)
> +                        continue;
> +            }
> 
>              redisplay_dirty(screen, ent, timeout);
>              DamageEmpty(ent->damage);
> --
> 2.7.4
> 
> _______________________________________________
> xorg-devel@lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: https://lists.x.org/mailman/listinfo/xorg-devel
> 
> _______________________________________________
> xorg-devel@lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: https://lists.x.org/mailman/listinfo/xorg-devel
> 
> 
> 
>