[3/5] drm/nouveau: Don't register Thunderbolt eGPU with vga_switcheroo

Submitted by Lukas Wunner on Feb. 24, 2017, 7:19 p.m.

Details

Message ID 859ea96a4aaddd203a6d7795322cf0e4543257b5.1487938188.git.lukas@wunner.de
State New
Headers show
Series "Thunderbolt GPU fixes" ( rev: 1 ) in Nouveau

Not browsing as part of any series.

Commit Message

Lukas Wunner Feb. 24, 2017, 7:19 p.m.
An external Thunderbolt GPU can neither drive the laptop's panel nor be
powered off by the platform, so there's no point in registering it with
vga_switcheroo.  In fact, when the external GPU is runtime suspended,
vga_switcheroo will cut power to the internal discrete GPU, resulting in
a lockup.

Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/gpu/drm/nouveau/nouveau_vga.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
index eef22c6b9665..c2a7fd606c2e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_vga.c
+++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
@@ -95,6 +95,10 @@  nouveau_vga_init(struct nouveau_drm *drm)
 
 	vga_client_register(dev->pdev, dev, NULL, nouveau_vga_set_decode);
 
+	/* don't register Thunderbolt eGPU with vga_switcheroo */
+	if (pci_is_thunderbolt_attached(dev->pdev))
+		return;
+
 	if (nouveau_runtime_pm == 1)
 		runtime = true;
 	if ((nouveau_runtime_pm == -1) && (nouveau_is_optimus() || nouveau_is_v1_dsm()))
@@ -111,6 +115,11 @@  nouveau_vga_fini(struct nouveau_drm *drm)
 	struct drm_device *dev = drm->dev;
 	bool runtime = false;
 
+	vga_client_register(dev->pdev, NULL, NULL, NULL);
+
+	if (pci_is_thunderbolt_attached(dev->pdev))
+		return;
+
 	if (nouveau_runtime_pm == 1)
 		runtime = true;
 	if ((nouveau_runtime_pm == -1) && (nouveau_is_optimus() || nouveau_is_v1_dsm()))
@@ -119,7 +128,6 @@  nouveau_vga_fini(struct nouveau_drm *drm)
 	vga_switcheroo_unregister_client(dev->pdev);
 	if (runtime && nouveau_is_v1_dsm() && !nouveau_is_optimus())
 		vga_switcheroo_fini_domain_pm_ops(drm->dev->dev);
-	vga_client_register(dev->pdev, NULL, NULL, NULL);
 }
 
 

Comments

On Fri, Feb 24, 2017 at 1:19 PM, Lukas Wunner <lukas@wunner.de> wrote:
> An external Thunderbolt GPU can neither drive the laptop's panel nor be
> powered off by the platform, so there's no point in registering it with
> vga_switcheroo.

> In fact, when the external GPU is runtime suspended,
> vga_switcheroo will cut power to the internal discrete GPU, resulting in
> a lockup.

Why does suspending the external GPU cause vga_switcheroo to cut power
to the internal GPU?  No doubt this would be obvious to a GPU person,
which I definitely am not.

> Cc: Ben Skeggs <bskeggs@redhat.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
>  drivers/gpu/drm/nouveau/nouveau_vga.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c b/drivers/gpu/drm/nouveau/nouveau_vga.c
> index eef22c6b9665..c2a7fd606c2e 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_vga.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
> @@ -95,6 +95,10 @@ nouveau_vga_init(struct nouveau_drm *drm)
>
>         vga_client_register(dev->pdev, dev, NULL, nouveau_vga_set_decode);
>
> +       /* don't register Thunderbolt eGPU with vga_switcheroo */
> +       if (pci_is_thunderbolt_attached(dev->pdev))
> +               return;

I guess there's no way to move this inside
vga_switcheroo_register_client() instead of putting the test in all
the drivers?

> +
>         if (nouveau_runtime_pm == 1)
>                 runtime = true;
>         if ((nouveau_runtime_pm == -1) && (nouveau_is_optimus() || nouveau_is_v1_dsm()))
> @@ -111,6 +115,11 @@ nouveau_vga_fini(struct nouveau_drm *drm)
>         struct drm_device *dev = drm->dev;
>         bool runtime = false;
>
> +       vga_client_register(dev->pdev, NULL, NULL, NULL);
> +
> +       if (pci_is_thunderbolt_attached(dev->pdev))
> +               return;
> +
>         if (nouveau_runtime_pm == 1)
>                 runtime = true;
>         if ((nouveau_runtime_pm == -1) && (nouveau_is_optimus() || nouveau_is_v1_dsm()))
> @@ -119,7 +128,6 @@ nouveau_vga_fini(struct nouveau_drm *drm)
>         vga_switcheroo_unregister_client(dev->pdev);
>         if (runtime && nouveau_is_v1_dsm() && !nouveau_is_optimus())
>                 vga_switcheroo_fini_domain_pm_ops(drm->dev->dev);
> -       vga_client_register(dev->pdev, NULL, NULL, NULL);

The amd & radeon patches look like this:

-       vga_switcheroo_unregister_client(rdev->pdev);
+       if (!pci_is_thunderbolt_attached(adev->pdev))
+               vga_switcheroo_unregister_client(adev->pdev);

This nouveau patch looks suspiciously different.  But again, I'm not a
GPU guy so all I can really say is "hmm, I wonder why it's different?"

>  }
>
>
> --
> 2.11.0
>
On Fri, Feb 24, 2017 at 02:59:44PM -0600, Bjorn Helgaas wrote:
> On Fri, Feb 24, 2017 at 1:19 PM, Lukas Wunner <lukas@wunner.de> wrote:
> > An external Thunderbolt GPU can neither drive the laptop's panel nor be
> > powered off by the platform, so there's no point in registering it with
> > vga_switcheroo.
> 
> > In fact, when the external GPU is runtime suspended,
> > vga_switcheroo will cut power to the internal discrete GPU, resulting in
> > a lockup.
> 
> Why does suspending the external GPU cause vga_switcheroo to cut power
> to the internal GPU?  No doubt this would be obvious to a GPU person,
> which I definitely am not.

vga_switcheroo_init_domain_pm_ops() is currently called both for an
internal discrete GPU as well as an external one attached with Thunderbolt.

That function overrides the ->runtime_suspend hook to cut power to the
internal discrete GPU after runtime suspending whichever GPU the
->runtime_suspend hook was called for.  So the external GPU runtime
suspends, the hook cuts power to the internal GPU, boom.

The function should only be called for the internal GPU, which is the
objective of this patch.


> > --- a/drivers/gpu/drm/nouveau/nouveau_vga.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
> > @@ -95,6 +95,10 @@ nouveau_vga_init(struct nouveau_drm *drm)
> >
> >         vga_client_register(dev->pdev, dev, NULL, nouveau_vga_set_decode);
> >
> > +       /* don't register Thunderbolt eGPU with vga_switcheroo */
> > +       if (pci_is_thunderbolt_attached(dev->pdev))
> > +               return;
> 
> I guess there's no way to move this inside
> vga_switcheroo_register_client() instead of putting the test in all
> the drivers?

It's only needed for a subset of the drivers, in particular not for i915.

The preferred approach in the kernel seems to be to avoid midlayers which
bundle stuff that is not applicable to every driver interfacing with it,
and instead put the functionality in library functions which drivers can
opt in to if necessary.  In this case that library function would be
pci_is_thunderbolt_attached().

A link list on midlayers vs. libraries is here:
http://blog.ffwll.ch/2016/12/midlayers-once-more-with-feeling.html


> > @@ -111,6 +115,11 @@ nouveau_vga_fini(struct nouveau_drm *drm)
> >         struct drm_device *dev = drm->dev;
> >         bool runtime = false;
> >
> > +       vga_client_register(dev->pdev, NULL, NULL, NULL);
> > +
> > +       if (pci_is_thunderbolt_attached(dev->pdev))
> > +               return;
> > +
> >         if (nouveau_runtime_pm == 1)
> >                 runtime = true;
> >         if ((nouveau_runtime_pm == -1) && (nouveau_is_optimus() || nouveau_is_v1_dsm()))
> > @@ -119,7 +128,6 @@ nouveau_vga_fini(struct nouveau_drm *drm)
> >         vga_switcheroo_unregister_client(dev->pdev);
> >         if (runtime && nouveau_is_v1_dsm() && !nouveau_is_optimus())
> >                 vga_switcheroo_fini_domain_pm_ops(drm->dev->dev);
> > -       vga_client_register(dev->pdev, NULL, NULL, NULL);
> 
> The amd & radeon patches look like this:
> 
> -       vga_switcheroo_unregister_client(rdev->pdev);
> +       if (!pci_is_thunderbolt_attached(adev->pdev))
> +               vga_switcheroo_unregister_client(adev->pdev);
> 
> This nouveau patch looks suspiciously different.  But again, I'm not a
> GPU guy so all I can really say is "hmm, I wonder why it's different?"

Functionally it's the same, it only looks differently because the
DRM drivers aren't structured identically.  In particular, radeon
and amdgpu call vga_switcheroo_init_domain_pm_ops() only if the flag
RADEON_IS_PX / AMD_IS_PX (is PowerXpress) has been set, so a call
to pci_is_thunderbolt_attached() is added when setting that flag,
whereas nouveau doesn't have that indirection.

Thanks,

Lukas