Potential fix for runpm issues on various laptops

Submitted by Karol Herbst on May 4, 2019, 4:32 p.m.

Details

Reviewer None
Submitted May 4, 2019, 4:32 p.m.
Last Updated May 7, 2019, 8:13 p.m.
Revision 2

Cover Letter(s)

Revision 1
      While investigating the runpm issues on my GP107 I noticed that something
inside devinit makes runpm break. If Nouveau loads up to the point right
before doing devinit, runpm works without any issues; once devinit is run,
it no longer does.

Out of curiosity I even tried to "bisect" devinit by running it not on the
vbios-provided signed PMU image, but on the devinit parser we have inside
Nouveau.
Although this one isn't as feature complete as the vbios one, I was able
to reproduce the runpm issues as well. From that point I was able to run
only a limited number of commands at a time until I got to some PCIe
initialization code inside devinit which triggers those runpm issues.

Devinit on my GPU was changing the PCIe link speed from 8.0 GT/s to
2.5 GT/s; reversing that on the fini path makes runpm work again.
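
A minimal sketch of the save/restore idea described above, not the actual
patch: every name here (struct fake_gpu, gpu_preinit, gpu_devinit, gpu_fini)
is hypothetical, and the "hardware" is just a struct field. It only shows
the ordering: snapshot the link speed before devinit touches it, and put it
back on the fini path.

```c
#include <assert.h>
#include <stdbool.h>

/* Link speeds quoted in the cover letter, plus the one in between. */
enum link_speed { SPEED_2_5, SPEED_5_0, SPEED_8_0 };

/* Hypothetical stand-in for the device; real state lives in registers. */
struct fake_gpu {
	enum link_speed cur_speed;   /* speed the link currently runs at */
	enum link_speed boot_speed;  /* snapshot taken before devinit */
	bool boot_speed_valid;       /* true once the snapshot exists */
};

/* Record the speed the firmware left us with, before devinit runs. */
void gpu_preinit(struct fake_gpu *gpu)
{
	assert(gpu);
	gpu->boot_speed = gpu->cur_speed;
	gpu->boot_speed_valid = true;
}

/* Models the observed devinit side effect: the link drops to 2.5. */
void gpu_devinit(struct fake_gpu *gpu)
{
	assert(gpu);
	gpu->cur_speed = SPEED_2_5;
}

/* On the fini path, undo the change if we have a snapshot. */
void gpu_fini(struct fake_gpu *gpu)
{
	assert(gpu);
	if (gpu->boot_speed_valid)
		gpu->cur_speed = gpu->boot_speed;
}
```

Calling gpu_preinit(), gpu_devinit() and gpu_fini() in that order on a
device that booted at SPEED_8_0 leaves it at SPEED_8_0 again, which is the
state in which runpm worked in my testing.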

There are a few other things going on, but with my limited knowledge about
PCIe in general, the change in the link speed sounded like it could cause
issues on resume if the controller and the device disagree on the actual
link speed.

Maybe this is just a bug within the PCI subsystem inside Linux instead and
the controller has to be forced to do _something_?

Anyway, with this runpm seems to work nicely on my machine. Secure booting
the gr (even with the workaround applied that I need anyway) might still
fail after the GPU has been runtime resumed, though...

Karol Herbst (5):
  drm: don't set the pci power state if the pci subsystem handles the
    ACPI bits
  pci: enable pcie link changes for pascal
  pci: add nvkm_pcie_get_speed
  pci: save the boot pcie link speed
  pci: restore the boot pcie link speed on fini

 drm/nouveau/include/nvkm/subdev/pci.h |  6 ++++--
 drm/nouveau/nouveau_acpi.c            |  6 +++---
 drm/nouveau/nouveau_acpi.h            |  4 ++--
 drm/nouveau/nouveau_drm.c             | 15 +++++++++----
 drm/nouveau/nouveau_drv.h             |  2 ++
 drm/nouveau/nvkm/subdev/pci/base.c    |  9 ++++++--
 drm/nouveau/nvkm/subdev/pci/gk104.c   |  8 +++----
 drm/nouveau/nvkm/subdev/pci/gp100.c   | 10 +++++++++
 drm/nouveau/nvkm/subdev/pci/pcie.c    | 31 +++++++++++++++++++++++----
 drm/nouveau/nvkm/subdev/pci/priv.h    |  7 ++++++
 10 files changed, 77 insertions(+), 21 deletions(-)
    
Revision 2
      CCing linux-pci and Bjorn Helgaas. Maybe we could get better insights on
what a reasonable fix would look like.

Anyway, to me this entire issue looks like something which has to be fixed
at the PCI level instead of inside a driver, so it makes sense to ask the
PCI folks if they have any better suggestions.

Original cover letter:
While investigating the runpm issues on my GP107 I noticed that something
inside devinit makes runpm break. If Nouveau loads up to the point right
before doing devinit, runpm works without any issues; once devinit is run,
it no longer does.

Out of curiosity I even tried to "bisect" devinit by running it not on the
vbios-provided signed PMU image, but on the devinit parser we have inside
Nouveau.
Although this one isn't as feature complete as the vbios one, I was able
to reproduce the runpm issues as well. From that point I was able to run
only a limited number of commands at a time until I got to some PCIe
initialization code inside devinit which triggers those runpm issues.

Devinit on my GPU was changing the PCIe link speed from 8.0 GT/s to
2.5 GT/s; reversing that on the fini path makes runpm work again.

There are a few other things going on, but with my limited knowledge about
PCIe in general, the change in the link speed sounded like it could cause
issues on resume if the controller and the device disagree on the actual
link speed.
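
To make "disagree on the link" a bit more concrete, here is a rough sketch
of decoding the PCIe Link Status register of both link partners and
comparing the reported speeds. The field layout (Current Link Speed in bits
3:0, Negotiated Link Width in bits 9:4) and the encodings 1/2/3 for
2.5/5.0/8.0 GT/s follow the PCIe capability definition; the helper names
are made up for illustration, and whether such a mismatch is really what
happens on resume is only my guess.

```c
#include <assert.h>
#include <stdint.h>

#define LNKSTA_CLS_MASK  0x000f  /* Current Link Speed, bits 3:0 */
#define LNKSTA_NLW_MASK  0x03f0  /* Negotiated Link Width, bits 9:4 */
#define LNKSTA_NLW_SHIFT 4

/*
 * Hypothetical helper: speed in units of 0.1 GT/s (25, 50, 80),
 * or 0 for an encoding this sketch does not know about.
 */
unsigned int lnksta_speed_x10(uint16_t lnksta)
{
	switch (lnksta & LNKSTA_CLS_MASK) {
	case 1: return 25;  /* 2.5 GT/s */
	case 2: return 50;  /* 5.0 GT/s */
	case 3: return 80;  /* 8.0 GT/s */
	default: return 0;
	}
}

/* Hypothetical helper: number of lanes the link negotiated. */
unsigned int lnksta_width(uint16_t lnksta)
{
	return (lnksta & LNKSTA_NLW_MASK) >> LNKSTA_NLW_SHIFT;
}

/* Returns 1 if the two ends report different speeds, 0 otherwise. */
int link_speed_mismatch(uint16_t port_lnksta, uint16_t dev_lnksta)
{
	return lnksta_speed_x10(port_lnksta) != lnksta_speed_x10(dev_lnksta);
}
```

The register values would come from the Link Status register of the
upstream bridge and of the GPU; this sketch deliberately leaves out how
they are read.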

Maybe this is just a bug within the PCI subsystem inside Linux instead and
the controller has to be forced to do _something_?

Anyway, with this runpm seems to work nicely on my machine. Secure booting
the gr (even with the workaround applied that I need anyway) might still
fail after the GPU has been runtime resumed, though...

Karol Herbst (4):
  drm: don't set the pci power state if the pci subsystem handles the
    ACPI bits
  pci: enable pcie link changes for pascal
  pci: add nvkm_pcie_get_speed
  pci: save the boot pcie link speed and restore it on fini

 drm/nouveau/include/nvkm/subdev/pci.h |  6 +++--
 drm/nouveau/nouveau_acpi.c            |  7 +++++-
 drm/nouveau/nouveau_acpi.h            |  2 ++
 drm/nouveau/nouveau_drm.c             | 14 +++++++++---
 drm/nouveau/nouveau_drv.h             |  2 ++
 drm/nouveau/nvkm/subdev/pci/base.c    |  9 ++++++--
 drm/nouveau/nvkm/subdev/pci/gk104.c   |  8 +++----
 drm/nouveau/nvkm/subdev/pci/gp100.c   | 10 +++++++++
 drm/nouveau/nvkm/subdev/pci/pcie.c    | 32 +++++++++++++++++++++++----
 drm/nouveau/nvkm/subdev/pci/priv.h    |  7 ++++++
 10 files changed, 81 insertions(+), 16 deletions(-)
    
