[1/2] drm/amdgpu: fix power state when port pm is unavailable

Submitted by Alex Deucher on Nov. 28, 2016, 10:28 p.m.

Details

Message ID CADnq5_MfLP+Ym+beD+Oak+CTxvyMVYfb9+DQDZuR9pbBWqwQtA@mail.gmail.com
State New
Headers show
Series "Series without cover letter" ( rev: 5 ) in AMD X.Org drivers

Not browsing as part of any series.

Commit Message

Alex Deucher Nov. 28, 2016, 10:28 p.m.
On Mon, Nov 28, 2016 at 12:48 AM, Mike Lothian <mike@fireburn.co.uk> wrote:
> The attached patch allows my machine to boot also
>
> Thanks
>
> Mike
>
> On Sat, 26 Nov 2016 at 14:14 Peter Wu <peter@lekensteyn.nl> wrote:
>>
>> Found the problem.
>>
>> amdgpu_atpx_pci_probe_handle is used to detect the ATPX handle, but this
>> does not necessarily have to be on the dGPU. In fact, on your machine it
>> is located on the Intel iGPU (00:02.0):
>>
>> [  134.760509] vga_switcheroo: detected switching method
>> \_SB_.PCI0.GFX0.ATPX handle
>>
>> The parent device of the iGPU (00:00.0) is the root port which has no
>> PCIe capability, hence bridge_d3 evaluated to false.
>>
>> Can you try the attached patch?

Looks good to me.  Patch for radeon included as well.

Alex

>>
>> Kind regards,
>> Peter
>>
>> On Sat, Nov 26, 2016 at 01:49:08PM +0000, Mike Lothian wrote:
>> > I can confirm the patch allows my machine to boot again
>> >
>> > On Sat, 26 Nov 2016 at 11:39 Peter Wu <peter@lekensteyn.nl> wrote:
>> >
>> > > Hi Mike,
>> > >
>> > > To grab console output, you could also try to blacklist amdgpu at
>> > > startup and load it later.
>> > >
>> > > Can you provide your acpidump (and lspci -nn, especially if you do not
>> > > have a dmesg).
>> > >
>> > > This is the call chain:
>> > >
>> > > + amdgpu_atpx_detect
>> > >   + amdgpu_atpx_pci_probe_handle
>> > >     + return if ATPX ACPI handle cannot be found
>> > >     + set amdgpu_atpx_priv.bridge_pm_usable
>> > >   + amdgpu_atpx_init
>> > >     + amdgpu_atpx_validate
>> > >       + use amdgpu_atpx_priv.bridge_pm_usable
>> > >
>> > > I wonder whether newer devices remove ATPX completely. If that is the
>> > > case (please provide acpidump!), can you try this hack:
>> > >
>> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > index 951addf..5c23382 100644
>> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > @@ -487,6 +487,7 @@ static bool amdgpu_atpx_pci_probe_handle(struct
>> > > pci_dev *pdev)
>> > >         status = acpi_get_handle(dhandle, "ATPX", &atpx_handle);
>> > >         if (ACPI_FAILURE(status)) {
>> > >                 amdgpu_atpx_priv.other_handle = dhandle;
>> > > +               amdgpu_atpx_priv.bridge_pm_usable = true;
>> > >                 return false;
>> > >         }
>> > >         amdgpu_atpx_priv.dhandle = dhandle;
>> > >
>> > > This is not intended for merging, but just testing a hypothesis.
>> > >
>> > > Kind regards,
>> > > Peter
>> > >
>> > > On Sat, Nov 26, 2016 at 11:20:28AM +0000, Mike Lothian wrote:
>> > > > He's some screen shots of the booting screens, apologies they're not
>> > > > that
>> > > > clear
>> > > >
>> > > > https://imgur.com/a/V7mRA
>> > > >
>> > > > On Sat, 26 Nov 2016 at 11:00 Mike Lothian <mike@fireburn.co.uk>
>> > > > wrote:
>> > > >
>> > > > > Hi
>> > > > >
>> > > > > The previous working kernel, was the patch before this - reverting
>> > > fixes it
>> > > > >
>> > > > > My bios date is quite new 08/05/2016
>> > > > >
>> > > > > Most of the messages fly straight off the screen, the journal
>> > > > > isn't
>> > > kept
>> > > > > and the wifi driver also complains, will see if I can get a slow
>> > > > > motion
>> > > > > capture of the errors and screen shot them
>> > > > >
>> > > > > Regards
>> > > > >
>> > > > > Mike
>> > > > >
>> > > > > On Sat, 26 Nov 2016 at 10:20 Peter Wu <peter@lekensteyn.nl> wrote:
>> > > > >
>> > > > > Hi Mike,
>> > > > >
>> > > > > On Sat, Nov 26, 2016 at 12:36:27AM +0000, Mike Lothian wrote:
>> > > > > > This patch is preventing my laptop from booting, I'm getting D3
>> > > > > > error
>> > > > > > messages and atom bios stuck messages
>> > > > >
>> > > > > Can you post a dmesg and acpidump? What was the previous working
>> > > kernel?
>> > > > > I suppose your BIOS date (/sys/class/dmi/id/bios_date) is
>> > > > > pre-2015?
>> > > > > Does booting with runpm=0 work?
>> > > > >
>> > > > > > I mistakenly saw the v2 patch and didn't realise it was for
>> > > > > > radeon
>> > > not
>> > > > > > amdgpu - this branch and Linus's tree are currently not booting
>> > > > > > for
>> > > me
>> > > > >
>> > > > > What do you mean by this?
>> > > > >
>> > > > > Kind regards,
>> > > > > Peter
>> > > > >
>> > > > > > On Wed, 23 Nov 2016 at 17:16 Deucher, Alexander <
>> > > > > Alexander.Deucher@amd.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > > -----Original Message-----
>> > > > > > > > From: Peter Wu [mailto:peter@lekensteyn.nl]
>> > > > > > > > Sent: Wednesday, November 23, 2016 12:13 PM
>> > > > > > > > To: Alex Deucher
>> > > > > > > > Cc: amd-gfx list; Deucher, Alexander; Nayan Deshmukh
>> > > > > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: fix power state when
>> > > > > > > > port
>> > > pm is
>> > > > > > > > unavailable
>> > > > > > > >
>> > > > > > > > On Wed, Nov 23, 2016 at 12:01:55PM -0500, Alex Deucher
>> > > > > > > > wrote:
>> > > > > > > > >  On Tue, Nov 22, 2016 at 8:22 PM, Peter Wu <
>> > > peter@lekensteyn.nl>
>> > > > > > > wrote:
>> > > > > > > > > > When PCIe port PM is not enabled (system BIOS is
>> > > > > > > > > > pre-2015 or
>> > > the
>> > > > > > > > > > pcie_port_pm=off parameter is set), legacy ATPX PM
>> > > > > > > > > > should
>> > > still
>> > > > > be
>> > > > > > > > > > marked as supported. Otherwise the GPU can fail to power
>> > > > > > > > > > on
>> > > after
>> > > > > > > > > > runtime suspend. This affected a Dell Inspiron 5548.
>> > > > > > > > > >
>> > > > > > > > > > Ideally the BIOS date in the PCI core is lowered to 2013
>> > > > > > > > > > (the
>> > > > > first
>> > > > > > > year
>> > > > > > > > > > where hybrid graphics platforms using power resources
>> > > > > > > > > > was
>> > > > > > > introduced),
>> > > > > > > > > > but that seems more risky at this point and would not
>> > > > > > > > > > solve
>> > > the
>> > > > > > > > > > pcie_port_pm=off issue.
>> > > > > > > > > >
>> > > > > > > > > > Bugzilla:
>> > > > > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=98505
>> > > > > > > > > > Reported-and-tested-by: Nayan Deshmukh
>> > > > > > > > <nayan26deshmukh@gmail.com>
>> > > > > > > > > > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
>> > > > > > > > > > ---
>> > > > > > > > > > Hi,
>> > > > > > > > > >
>> > > > > > > > > > This patch is already three weeks old. One alternative
>> > > > > > > > > > idea
>> > > was
>> > > > > > > lowering
>> > > > > > > > BIOS
>> > > > > > > > > > date in PCI core, but as pcie_port_pm=force did not have
>> > > > > > > > > > the
>> > > > > desired
>> > > > > > > > effect, I
>> > > > > > > > > > do not think that this would help though.
>> > > > > > > > >
>> > > > > > > > > Thanks for doing this.
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I have also not contacted linux-pci or Mika about
>> > > > > > > > > > lowering
>> > > the
>> > > > > year
>> > > > > > > due to
>> > > > > > > > the
>> > > > > > > > > > lack of a good reason.  Might do it later though once
>> > > > > > > > > > ACPICA
>> > > bug
>> > > > > > > 1333 is
>> > > > > > > > sorted
>> > > > > > > > > > out such that lowering the year actually has benefits
>> > > > > > > > > > for a
>> > > > > Nvidia
>> > > > > > > laptop
>> > > > > > > > (or if
>> > > > > > > > > > some amdgpu problem can be solved by this).
>> > > > > > > > > >
>> > > > > > > > > > Both patches should probably be Cc stable (4.8+), fixing
>> > > > > > > 2f5af82eeab2 and
>> > > > > > > > > > b8c9fd5ad4b4 ("track whether if this is a hybrid
>> > > > > > > > > > graphics
>> > > > > > > platform"). There
>> > > > > > > > have
>> > > > > > > > > > been some ifdef 0's and reverts in between, so I was not
>> > > sure if
>> > > > > > > adding
>> > > > > > > > the
>> > > > > > > > > > Fixes tag is appropriate.
>> > > > > > > > >
>> > > > > > > > > I don't think we need to cc stable.  In kernel 4.8 we
>> > > > > > > > > don't
>> > > > > attempt to
>> > > > > > > > > do d3cold at all.  4.8 and older kernels have:
>> > > > > > > > > c63695cc5e5f685e924e25a8f9555f6e846f1fc6 (drm/amdgpu: work
>> > > around
>> > > > > > > > lack
>> > > > > > > > > of upstream ACPI support for D3cold)
>> > > > > > > > > which was reverted for 4.9.
>> > > > > > > >
>> > > > > > > > That patch was already reverted in 4.8 as far as I can see:
>> > > > > > > >
>> > > > > > > > $ git tag --contains
>> > > > > > > > c39b487f195b93235ee76384427467786f7bf29f |
>> > > grep
>> > > > > v4.8
>> > > > > > > > v4.8
>> > > > > > > >
>> > > > > > > > Can you double-check? v4.8 was the first kernel with D3cold
>> > > support
>> > > > > in
>> > > > > > > > PCI core.
>> > > > > > >
>> > > > > > > You are right.  I was thinking d3cold went upstream in 4.9, so
>> > > yes, we
>> > > > > > > should apply this to 4.8.
>> > > > > > >
>> > > > > > > Alex
>> > > > > > >
>> > > > > > > >
>> > > > > > > > > Series is:
>> > > > > > > > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>> > > > > > > >
>> > > > > > > > Thanks!
>> > > > > > > >
>> > > > > > > > Kind regards,
>> > > > > > > > Peter
>> > > > > > > >
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > Kind regards,
>> > > > > > > > > > Peter
>> > > > > > > > > > ---
>> > > > > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 9
>> > > ++++++++-
>> > > > > > > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
>> > > > > > > > > >
>> > > > > > > > > > diff --git
>> > > > > > > > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > > > > > > > > index 10b5ddf..951addf 100644
>> > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
>> > > > > > > > > > @@ -33,6 +33,7 @@ struct amdgpu_atpx {
>> > > > > > > > > >
>> > > > > > > > > >  static struct amdgpu_atpx_priv {
>> > > > > > > > > >         bool atpx_detected;
>> > > > > > > > > > +       bool bridge_pm_usable;
>> > > > > > > > > >         /* handle for device - and atpx */
>> > > > > > > > > >         acpi_handle dhandle;
>> > > > > > > > > >         acpi_handle other_handle;
>> > > > > > > > > > @@ -200,7 +201,11 @@ static int
>> > > > > > > > > > amdgpu_atpx_validate(struct
>> > > > > > > > amdgpu_atpx *atpx)
>> > > > > > > > > >         atpx->is_hybrid = false;
>> > > > > > > > > >         if (valid_bits & ATPX_MS_HYBRID_GFX_SUPPORTED) {
>> > > > > > > > > >                 printk("ATPX Hybrid Graphics\n");
>> > > > > > > > > > -               atpx->functions.power_cntl = false;
>> > > > > > > > > > +               /*
>> > > > > > > > > > +                * Disable legacy PM methods only when
>> > > > > > > > > > pcie
>> > > port
>> > > > > PM
>> > > > > > > is usable,
>> > > > > > > > > > +                * otherwise the device might fail to
>> > > > > > > > > > power
>> > > off
>> > > > > or
>> > > > > > > power on.
>> > > > > > > > > > +                */
>> > > > > > > > > > +               atpx->functions.power_cntl =
>> > > > > > > > !amdgpu_atpx_priv.bridge_pm_usable;
>> > > > > > > > > >                 atpx->is_hybrid = true;
>> > > > > > > > > >         }
>> > > > > > > > > >
>> > > > > > > > > > @@ -471,6 +476,7 @@ static int
>> > > > > > > > > > amdgpu_atpx_power_state(enum
>> > > > > > > > vga_switcheroo_client_id id,
>> > > > > > > > > >   */
>> > > > > > > > > >  static bool amdgpu_atpx_pci_probe_handle(struct pci_dev
>> > > *pdev)
>> > > > > > > > > >  {
>> > > > > > > > > > +       struct pci_dev *parent_pdev =
>> > > pci_upstream_bridge(pdev);
>> > > > > > > > > >         acpi_handle dhandle, atpx_handle;
>> > > > > > > > > >         acpi_status status;
>> > > > > > > > > >
>> > > > > > > > > > @@ -485,6 +491,7 @@ static bool
>> > > > > > > > amdgpu_atpx_pci_probe_handle(struct pci_dev *pdev)
>> > > > > > > > > >         }
>> > > > > > > > > >         amdgpu_atpx_priv.dhandle = dhandle;
>> > > > > > > > > >         amdgpu_atpx_priv.atpx.handle = atpx_handle;
>> > > > > > > > > > +       amdgpu_atpx_priv.bridge_pm_usable = parent_pdev
>> > > > > > > > > > &&
>> > > > > > > > parent_pdev->bridge_d3;
>> > > > > > > > > >         return true;
>> > > > > > > > > >  }
>> > > > > > > > > >
>> > > > > > > > > > --
>> > > > > > > > > > 2.10.2

Patch hide | download patch | download mbox

From cd21b5055cca49b30b0caaf1107a9aaeb60a447f Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Mon, 28 Nov 2016 17:23:40 -0500
Subject: [PATCH] drm/radeon: fix check for port PM availability

The ATPX method does not always exist on the dGPU, it may be located at
the iGPU. The parent device of the iGPU is the root port for which
bridge_d3 is false. This accidentally enables the legacy PM method which
conflicts with port PM and prevented the dGPU from powering on.

Ported from amdgpu commit:
drm/amdgpu: fix check for port PM availability
from Peter Wu.

Fixes: d3ac31f3b4bf9fad (drm/radeon: fix power state when port pm is unavailable (v2))
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Peter Wu <peter@lekensteyn.nl>
Cc: <stable@vger.kernel.org> # 4.8+
---
 drivers/gpu/drm/radeon/radeon_atpx_handler.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_atpx_handler.c b/drivers/gpu/drm/radeon/radeon_atpx_handler.c
index 4129b12..0ae13cd2 100644
--- a/drivers/gpu/drm/radeon/radeon_atpx_handler.c
+++ b/drivers/gpu/drm/radeon/radeon_atpx_handler.c
@@ -479,7 +479,6 @@  static int radeon_atpx_power_state(enum vga_switcheroo_client_id id,
  */
 static bool radeon_atpx_pci_probe_handle(struct pci_dev *pdev)
 {
-	struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
 	acpi_handle dhandle, atpx_handle;
 	acpi_status status;
 
@@ -493,7 +492,6 @@  static bool radeon_atpx_pci_probe_handle(struct pci_dev *pdev)
 
 	radeon_atpx_priv.dhandle = dhandle;
 	radeon_atpx_priv.atpx.handle = atpx_handle;
-	radeon_atpx_priv.bridge_pm_usable = parent_pdev && parent_pdev->bridge_d3;
 	return true;
 }
 
@@ -555,11 +553,16 @@  static bool radeon_atpx_detect(void)
 	struct pci_dev *pdev = NULL;
 	bool has_atpx = false;
 	int vga_count = 0;
+	bool d3_supported = false;
+	struct pci_dev *parent_pdev;
 
 	while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) {
 		vga_count++;
 
 		has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
+
+		parent_pdev = pci_upstream_bridge(pdev);
+		d3_supported |= parent_pdev && parent_pdev->bridge_d3;
 	}
 
 	/* some newer PX laptops mark the dGPU as a non-VGA display device */
@@ -567,6 +570,9 @@  static bool radeon_atpx_detect(void)
 		vga_count++;
 
 		has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
+
+		parent_pdev = pci_upstream_bridge(pdev);
+		d3_supported |= parent_pdev && parent_pdev->bridge_d3;
 	}
 
 	if (has_atpx && vga_count == 2) {
@@ -574,6 +580,7 @@  static bool radeon_atpx_detect(void)
 		printk(KERN_INFO "vga_switcheroo: detected switching method %s handle\n",
 		       acpi_method_name);
 		radeon_atpx_priv.atpx_detected = true;
+		radeon_atpx_priv.bridge_pm_usable = d3_supported;
 		radeon_atpx_init();
 		return true;
 	}
-- 
2.5.5


Comments

On Mon, Nov 28, 2016 at 05:28:00PM -0500, Alex Deucher wrote:
> On Mon, Nov 28, 2016 at 12:48 AM, Mike Lothian <mike@fireburn.co.uk> wrote:
> > The attached patch allows my machine to boot also
> >
> > Thanks
> >
> > Mike
> >
> > On Sat, 26 Nov 2016 at 14:14 Peter Wu <peter@lekensteyn.nl> wrote:
> >>
> >> Found the problem.
> >>
> >> amdgpu_atpx_pci_probe_handle is used to detect the ATPX handle, but this
> >> does not necessarily have to be on the dGPU. In fact, on your machine it
> >> is located on the Intel iGPU (00:02.0):
> >>
> >> [  134.760509] vga_switcheroo: detected switching method
> >> \_SB_.PCI0.GFX0.ATPX handle
> >>
> >> The parent device of the iGPU (00:00.0) is the root port which has no
> >> PCIe capability, hence bridge_d3 evaluated to false.
> >>
> >> Can you try the attached patch?
> 
> Looks good to me.  Patch for radeon included as well.
> 
> Alex

Thanks, the port looks good to me. (If still possible, I think you can
change Reported-by also to Reported-and-tested-by for the amdgpu patch.)

Kind regards,
Peter

> >>
> >> Kind regards,
> >> Peter
> >>
> >> On Sat, Nov 26, 2016 at 01:49:08PM +0000, Mike Lothian wrote:
> >> > I can confirm the patch allows my machine to boot again
> >> >
> >> > On Sat, 26 Nov 2016 at 11:39 Peter Wu <peter@lekensteyn.nl> wrote:
> >> >
> >> > > Hi Mike,
> >> > >
> >> > > To grab console output, you could also try to blacklist amdgpu at
> >> > > startup and load it later.
> >> > >
> >> > > Can you provide your acpidump (and lspci -nn, especially if you do not
> >> > > have a dmesg).
> >> > >
> >> > > This is the call chain:
> >> > >
> >> > > + amdgpu_atpx_detect
> >> > >   + amdgpu_atpx_pci_probe_handle
> >> > >     + return if ATPX ACPI handle cannot be found
> >> > >     + set amdgpu_atpx_priv.bridge_pm_usable
> >> > >   + amdgpu_atpx_init
> >> > >     + amdgpu_atpx_validate
> >> > >       + use amdgpu_atpx_priv.bridge_pm_usable
> >> > >
> >> > > I wonder whether newer devices remove ATPX completely. If that is the
> >> > > case (please provide acpidump!), can you try this hack:
> >> > >
> >> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > index 951addf..5c23382 100644
> >> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > @@ -487,6 +487,7 @@ static bool amdgpu_atpx_pci_probe_handle(struct
> >> > > pci_dev *pdev)
> >> > >         status = acpi_get_handle(dhandle, "ATPX", &atpx_handle);
> >> > >         if (ACPI_FAILURE(status)) {
> >> > >                 amdgpu_atpx_priv.other_handle = dhandle;
> >> > > +               amdgpu_atpx_priv.bridge_pm_usable = true;
> >> > >                 return false;
> >> > >         }
> >> > >         amdgpu_atpx_priv.dhandle = dhandle;
> >> > >
> >> > > This is not intended for merging, but just testing a hypothesis.
> >> > >
> >> > > Kind regards,
> >> > > Peter
> >> > >
> >> > > On Sat, Nov 26, 2016 at 11:20:28AM +0000, Mike Lothian wrote:
> >> > > > He's some screen shots of the booting screens, apologies they're not
> >> > > > that
> >> > > > clear
> >> > > >
> >> > > > https://imgur.com/a/V7mRA
> >> > > >
> >> > > > On Sat, 26 Nov 2016 at 11:00 Mike Lothian <mike@fireburn.co.uk>
> >> > > > wrote:
> >> > > >
> >> > > > > Hi
> >> > > > >
> >> > > > > The previous working kernel, was the patch before this - reverting
> >> > > fixes it
> >> > > > >
> >> > > > > My bios date is quite new 08/05/2016
> >> > > > >
> >> > > > > Most of the messages fly straight off the screen, the journal
> >> > > > > isn't
> >> > > kept
> >> > > > > and the wifi driver also complains, will see if I can get a slow
> >> > > > > motion
> >> > > > > capture of the errors and screen shot them
> >> > > > >
> >> > > > > Regards
> >> > > > >
> >> > > > > Mike
> >> > > > >
> >> > > > > On Sat, 26 Nov 2016 at 10:20 Peter Wu <peter@lekensteyn.nl> wrote:
> >> > > > >
> >> > > > > Hi Mike,
> >> > > > >
> >> > > > > On Sat, Nov 26, 2016 at 12:36:27AM +0000, Mike Lothian wrote:
> >> > > > > > This patch is preventing my laptop from booting, I'm getting D3
> >> > > > > > error
> >> > > > > > messages and atom bios stuck messages
> >> > > > >
> >> > > > > Can you post a dmesg and acpidump? What was the previous working
> >> > > kernel?
> >> > > > > I suppose your BIOS date (/sys/class/dmi/id/bios_date) is
> >> > > > > pre-2015?
> >> > > > > Does booting with runpm=0 work?
> >> > > > >
> >> > > > > > I mistakenly saw the v2 patch and didn't realise it was for
> >> > > > > > radeon
> >> > > not
> >> > > > > > amdgpu - this branch and Linus's tree are currently not booting
> >> > > > > > for
> >> > > me
> >> > > > >
> >> > > > > What do you mean by this?
> >> > > > >
> >> > > > > Kind regards,
> >> > > > > Peter
> >> > > > >
> >> > > > > > On Wed, 23 Nov 2016 at 17:16 Deucher, Alexander <
> >> > > > > Alexander.Deucher@amd.com>
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > > -----Original Message-----
> >> > > > > > > > From: Peter Wu [mailto:peter@lekensteyn.nl]
> >> > > > > > > > Sent: Wednesday, November 23, 2016 12:13 PM
> >> > > > > > > > To: Alex Deucher
> >> > > > > > > > Cc: amd-gfx list; Deucher, Alexander; Nayan Deshmukh
> >> > > > > > > > Subject: Re: [PATCH 1/2] drm/amdgpu: fix power state when
> >> > > > > > > > port
> >> > > pm is
> >> > > > > > > > unavailable
> >> > > > > > > >
> >> > > > > > > > On Wed, Nov 23, 2016 at 12:01:55PM -0500, Alex Deucher
> >> > > > > > > > wrote:
> >> > > > > > > > >  On Tue, Nov 22, 2016 at 8:22 PM, Peter Wu <
> >> > > peter@lekensteyn.nl>
> >> > > > > > > wrote:
> >> > > > > > > > > > When PCIe port PM is not enabled (system BIOS is
> >> > > > > > > > > > pre-2015 or
> >> > > the
> >> > > > > > > > > > pcie_port_pm=off parameter is set), legacy ATPX PM
> >> > > > > > > > > > should
> >> > > still
> >> > > > > be
> >> > > > > > > > > > marked as supported. Otherwise the GPU can fail to power
> >> > > > > > > > > > on
> >> > > after
> >> > > > > > > > > > runtime suspend. This affected a Dell Inspiron 5548.
> >> > > > > > > > > >
> >> > > > > > > > > > Ideally the BIOS date in the PCI core is lowered to 2013
> >> > > > > > > > > > (the
> >> > > > > first
> >> > > > > > > year
> >> > > > > > > > > > where hybrid graphics platforms using power resources
> >> > > > > > > > > > was
> >> > > > > > > introduced),
> >> > > > > > > > > > but that seems more risky at this point and would not
> >> > > > > > > > > > solve
> >> > > the
> >> > > > > > > > > > pcie_port_pm=off issue.
> >> > > > > > > > > >
> >> > > > > > > > > > Bugzilla:
> >> > > > > > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=98505
> >> > > > > > > > > > Reported-and-tested-by: Nayan Deshmukh
> >> > > > > > > > <nayan26deshmukh@gmail.com>
> >> > > > > > > > > > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> >> > > > > > > > > > ---
> >> > > > > > > > > > Hi,
> >> > > > > > > > > >
> >> > > > > > > > > > This patch is already three weeks old. One alternative
> >> > > > > > > > > > idea
> >> > > was
> >> > > > > > > lowering
> >> > > > > > > > BIOS
> >> > > > > > > > > > date in PCI core, but as pcie_port_pm=force did not have
> >> > > > > > > > > > the
> >> > > > > desired
> >> > > > > > > > effect, I
> >> > > > > > > > > > do not think that this would help though.
> >> > > > > > > > >
> >> > > > > > > > > Thanks for doing this.
> >> > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > I have also not contacted linux-pci or Mika about
> >> > > > > > > > > > lowering
> >> > > the
> >> > > > > year
> >> > > > > > > due to
> >> > > > > > > > the
> >> > > > > > > > > > lack of a good reason.  Might do it later though once
> >> > > > > > > > > > ACPICA
> >> > > bug
> >> > > > > > > 1333 is
> >> > > > > > > > sorted
> >> > > > > > > > > > out such that lowering the year actually has benefits
> >> > > > > > > > > > for a
> >> > > > > Nvidia
> >> > > > > > > laptop
> >> > > > > > > > (or if
> >> > > > > > > > > > some amdgpu problem can be solved by this).
> >> > > > > > > > > >
> >> > > > > > > > > > Both patches should probably be Cc stable (4.8+), fixing
> >> > > > > > > 2f5af82eeab2 and
> >> > > > > > > > > > b8c9fd5ad4b4 ("track whether if this is a hybrid
> >> > > > > > > > > > graphics
> >> > > > > > > platform"). There
> >> > > > > > > > have
> >> > > > > > > > > > been some ifdef 0's and reverts in between, so I was not
> >> > > sure if
> >> > > > > > > adding
> >> > > > > > > > the
> >> > > > > > > > > > Fixes tag is appropriate.
> >> > > > > > > > >
> >> > > > > > > > > I don't think we need to cc stable.  In kernel 4.8 we
> >> > > > > > > > > don't
> >> > > > > attempt to
> >> > > > > > > > > do d3cold at all.  4.8 and older kernels have:
> >> > > > > > > > > c63695cc5e5f685e924e25a8f9555f6e846f1fc6 (drm/amdgpu: work
> >> > > around
> >> > > > > > > > lack
> >> > > > > > > > > of upstream ACPI support for D3cold)
> >> > > > > > > > > which was reverted for 4.9.
> >> > > > > > > >
> >> > > > > > > > That patch was already reverted in 4.8 as far as I can see:
> >> > > > > > > >
> >> > > > > > > > $ git tag --contains
> >> > > > > > > > c39b487f195b93235ee76384427467786f7bf29f |
> >> > > grep
> >> > > > > v4.8
> >> > > > > > > > v4.8
> >> > > > > > > >
> >> > > > > > > > Can you double-check? v4.8 was the first kernel with D3cold
> >> > > support
> >> > > > > in
> >> > > > > > > > PCI core.
> >> > > > > > >
> >> > > > > > > You are right.  I was thinking d3cold went upstream in 4.9, so
> >> > > yes, we
> >> > > > > > > should apply this to 4.8.
> >> > > > > > >
> >> > > > > > > Alex
> >> > > > > > >
> >> > > > > > > >
> >> > > > > > > > > Series is:
> >> > > > > > > > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> >> > > > > > > >
> >> > > > > > > > Thanks!
> >> > > > > > > >
> >> > > > > > > > Kind regards,
> >> > > > > > > > Peter
> >> > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > Kind regards,
> >> > > > > > > > > > Peter
> >> > > > > > > > > > ---
> >> > > > > > > > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 9
> >> > > ++++++++-
> >> > > > > > > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> >> > > > > > > > > >
> >> > > > > > > > > > diff --git
> >> > > > > > > > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > > > index 10b5ddf..951addf 100644
> >> > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
> >> > > > > > > > > > @@ -33,6 +33,7 @@ struct amdgpu_atpx {
> >> > > > > > > > > >
> >> > > > > > > > > >  static struct amdgpu_atpx_priv {
> >> > > > > > > > > >         bool atpx_detected;
> >> > > > > > > > > > +       bool bridge_pm_usable;
> >> > > > > > > > > >         /* handle for device - and atpx */
> >> > > > > > > > > >         acpi_handle dhandle;
> >> > > > > > > > > >         acpi_handle other_handle;
> >> > > > > > > > > > @@ -200,7 +201,11 @@ static int
> >> > > > > > > > > > amdgpu_atpx_validate(struct
> >> > > > > > > > amdgpu_atpx *atpx)
> >> > > > > > > > > >         atpx->is_hybrid = false;
> >> > > > > > > > > >         if (valid_bits & ATPX_MS_HYBRID_GFX_SUPPORTED) {
> >> > > > > > > > > >                 printk("ATPX Hybrid Graphics\n");
> >> > > > > > > > > > -               atpx->functions.power_cntl = false;
> >> > > > > > > > > > +               /*
> >> > > > > > > > > > +                * Disable legacy PM methods only when
> >> > > > > > > > > > pcie
> >> > > port
> >> > > > > PM
> >> > > > > > > is usable,
> >> > > > > > > > > > +                * otherwise the device might fail to
> >> > > > > > > > > > power
> >> > > off
> >> > > > > or
> >> > > > > > > power on.
> >> > > > > > > > > > +                */
> >> > > > > > > > > > +               atpx->functions.power_cntl =
> >> > > > > > > > !amdgpu_atpx_priv.bridge_pm_usable;
> >> > > > > > > > > >                 atpx->is_hybrid = true;
> >> > > > > > > > > >         }
> >> > > > > > > > > >
> >> > > > > > > > > > @@ -471,6 +476,7 @@ static int
> >> > > > > > > > > > amdgpu_atpx_power_state(enum
> >> > > > > > > > vga_switcheroo_client_id id,
> >> > > > > > > > > >   */
> >> > > > > > > > > >  static bool amdgpu_atpx_pci_probe_handle(struct pci_dev
> >> > > *pdev)
> >> > > > > > > > > >  {
> >> > > > > > > > > > +       struct pci_dev *parent_pdev =
> >> > > pci_upstream_bridge(pdev);
> >> > > > > > > > > >         acpi_handle dhandle, atpx_handle;
> >> > > > > > > > > >         acpi_status status;
> >> > > > > > > > > >
> >> > > > > > > > > > @@ -485,6 +491,7 @@ static bool
> >> > > > > > > > amdgpu_atpx_pci_probe_handle(struct pci_dev *pdev)
> >> > > > > > > > > >         }
> >> > > > > > > > > >         amdgpu_atpx_priv.dhandle = dhandle;
> >> > > > > > > > > >         amdgpu_atpx_priv.atpx.handle = atpx_handle;
> >> > > > > > > > > > +       amdgpu_atpx_priv.bridge_pm_usable = parent_pdev
> >> > > > > > > > > > &&
> >> > > > > > > > parent_pdev->bridge_d3;
> >> > > > > > > > > >         return true;
> >> > > > > > > > > >  }
> >> > > > > > > > > >
> >> > > > > > > > > > --
> >> > > > > > > > > > 2.10.2

> From cd21b5055cca49b30b0caaf1107a9aaeb60a447f Mon Sep 17 00:00:00 2001
> From: Alex Deucher <alexander.deucher@amd.com>
> Date: Mon, 28 Nov 2016 17:23:40 -0500
> Subject: [PATCH] drm/radeon: fix check for port PM availability
> 
> The ATPX method does not always exist on the dGPU, it may be located at
> the iGPU. The parent device of the iGPU is the root port for which
> bridge_d3 is false. This accidentally enables the legacy PM method which
> conflicts with port PM and prevented the dGPU from powering on.
> 
> Ported from amdgpu commit:
> drm/amdgpu: fix check for port PM availability
> from Peter Wu.
> 
> Fixes: d3ac31f3b4bf9fad (drm/radeon: fix power state when port pm is unavailable (v2))
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Cc: Peter Wu <peter@lekensteyn.nl>
> Cc: <stable@vger.kernel.org> # 4.8+
> ---
>  drivers/gpu/drm/radeon/radeon_atpx_handler.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_atpx_handler.c b/drivers/gpu/drm/radeon/radeon_atpx_handler.c
> index 4129b12..0ae13cd2 100644
> --- a/drivers/gpu/drm/radeon/radeon_atpx_handler.c
> +++ b/drivers/gpu/drm/radeon/radeon_atpx_handler.c
> @@ -479,7 +479,6 @@ static int radeon_atpx_power_state(enum vga_switcheroo_client_id id,
>   */
>  static bool radeon_atpx_pci_probe_handle(struct pci_dev *pdev)
>  {
> -	struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
>  	acpi_handle dhandle, atpx_handle;
>  	acpi_status status;
>  
> @@ -493,7 +492,6 @@ static bool radeon_atpx_pci_probe_handle(struct pci_dev *pdev)
>  
>  	radeon_atpx_priv.dhandle = dhandle;
>  	radeon_atpx_priv.atpx.handle = atpx_handle;
> -	radeon_atpx_priv.bridge_pm_usable = parent_pdev && parent_pdev->bridge_d3;
>  	return true;
>  }
>  
> @@ -555,11 +553,16 @@ static bool radeon_atpx_detect(void)
>  	struct pci_dev *pdev = NULL;
>  	bool has_atpx = false;
>  	int vga_count = 0;
> +	bool d3_supported = false;
> +	struct pci_dev *parent_pdev;
>  
>  	while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) {
>  		vga_count++;
>  
>  		has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
> +
> +		parent_pdev = pci_upstream_bridge(pdev);
> +		d3_supported |= parent_pdev && parent_pdev->bridge_d3;
>  	}
>  
>  	/* some newer PX laptops mark the dGPU as a non-VGA display device */
> @@ -567,6 +570,9 @@ static bool radeon_atpx_detect(void)
>  		vga_count++;
>  
>  		has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
> +
> +		parent_pdev = pci_upstream_bridge(pdev);
> +		d3_supported |= parent_pdev && parent_pdev->bridge_d3;
>  	}
>  
>  	if (has_atpx && vga_count == 2) {
> @@ -574,6 +580,7 @@ static bool radeon_atpx_detect(void)
>  		printk(KERN_INFO "vga_switcheroo: detected switching method %s handle\n",
>  		       acpi_method_name);
>  		radeon_atpx_priv.atpx_detected = true;
> +		radeon_atpx_priv.bridge_pm_usable = d3_supported;
>  		radeon_atpx_init();
>  		return true;
>  	}
> -- 
> 2.5.5
>