[v3] clk: Restore BYPASS_PLL_CHECK from PLLs

Submitted by Mark Menzynski on Sept. 6, 2019, 11:43 a.m.

Details

Message ID 20190906114303.7559-1-mmenzyns@redhat.com
State New
Headers show
Series "clk: Restore BYPASS_PLL_CHECK from PLLs" ( rev: 1 ) in Nouveau

Not browsing as part of any series.

Commit Message

Mark Menzynski Sept. 6, 2019, 11:43 a.m.
I have looked at problem with Fermi GPUs where changing to higher clock
led to really bad perfomance (with GpuTest 20x worse perfomance) and later also crashes of the nouveau. It seemed
to be affected by Shader Clock in Voltage Entries in the video BIOS. Disabling
BYPASS_PLL_CHECK in CLK0_CTRL seems to completely fix the issue. I have
tried to search this BYPASS_PLL_CHECK in Nvidia traces but seemed it
wasn't used nowhere for CLK settings.

Removing this works fine, but I don't know what it's really for.
Actual bit setting this BYPASS_PLL_CHECK is on 0x10:
	lookup -ac0 0x137000 0x10
	PCLOCK.CLK0_CTRL => { BYPASS_PLL_CHECK | UNK12 = 0 }
Also, disabling this bit on other CLKs doesn't seem to break anything.

v2: add back PLL lock test
v3: add restoring original value after PLL lock test

Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
---
 drm/nouveau/nvkm/subdev/clk/gf100.c | 7 +++++--
 drm/nouveau/nvkm/subdev/clk/gk104.c | 5 ++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drm/nouveau/nvkm/subdev/clk/gf100.c b/drm/nouveau/nvkm/subdev/clk/gf100.c
index 7f67f9f5..a97af0e9 100644
--- a/drm/nouveau/nvkm/subdev/clk/gf100.c
+++ b/drm/nouveau/nvkm/subdev/clk/gf100.c
@@ -368,6 +368,7 @@  gf100_clk_prog_2(struct gf100_clk *clk, int idx)
 	struct gf100_clk_info *info = &clk->eng[idx];
 	struct nvkm_device *device = clk->base.subdev.device;
 	const u32 addr = 0x137000 + (idx * 0x20);
+	bool bypass_state = false;
 	if (idx <= 7) {
 		nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000000);
 		nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000000);
@@ -376,12 +377,14 @@  gf100_clk_prog_2(struct gf100_clk *clk, int idx)
 			nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000001);
 
 			/* Test PLL lock */
+			bypass_state = nvkm_rd32(device, addr + 0x00) & 0x00000010;
 			nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000000);
 			nvkm_msec(device, 2000,
 				if (nvkm_rd32(device, addr + 0x00) & 0x00020000)
 					break;
 			);
-			nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
+			if (bypass_state)
+				nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
 
 			/* Enable sync mode */
 			nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000004);
@@ -476,5 +479,5 @@  gf100_clk_new(struct nvkm_device *device, int index, struct nvkm_clk **pclk)
 		return -ENOMEM;
 	*pclk = &clk->base;
 
-	return nvkm_clk_ctor(&gf100_clk, device, index, false, &clk->base);
+	return nvkm_clk_ctor(&gf100_clk, device, index, true, &clk->base);
 }
diff --git a/drm/nouveau/nvkm/subdev/clk/gk104.c b/drm/nouveau/nvkm/subdev/clk/gk104.c
index 0b37e3da..c9ede404 100644
--- a/drm/nouveau/nvkm/subdev/clk/gk104.c
+++ b/drm/nouveau/nvkm/subdev/clk/gk104.c
@@ -388,6 +388,7 @@  gk104_clk_prog_2(struct gk104_clk *clk, int idx)
 	struct gk104_clk_info *info = &clk->eng[idx];
 	struct nvkm_device *device = clk->base.subdev.device;
 	const u32 addr = 0x137000 + (idx * 0x20);
+	bool bypass_state = false;
 	nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000000);
 	nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000000);
 	if (info->coef) {
@@ -395,12 +396,14 @@  gk104_clk_prog_2(struct gk104_clk *clk, int idx)
 		nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000001);
 
 		/* Test PLL lock */
+		bypass_state = nvkm_rd32(device, addr + 0x00);
 		nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000000);
 		nvkm_msec(device, 2000,
 			if (nvkm_rd32(device, addr + 0x00) & 0x00020000)
 				break;
 		);
-		nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
+		if (bypass_state)
+			nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
 
 		/* Enable sync mode */
 		nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000004);

Comments

(Accidentally replied in person instead of to the list. Here a ~verbatim 
copy of my e-mail earlier)

Briefly, the BYPASS_PLL_CHECK bit disables the PLL locking test when set 
to 1. PLLs should still lock (or not, if the params are wildly 
out-of-bounds), but the test is bypassed and disabled when the bit is set.
This bit was definitely used on NVA3/5/8 by the blob as I confirmed 
ploughing through traces. I recall seeing the same behaviour (disabling) 
on early Fermi's, but I'd have to dig a whole lot deeper through old 
trace to confirm. As for why NVIDIA disables this test after verifying 
the PLL locked, my highly speculative guess back then was just power 
savings. As to why nouveau does it: when in Rome...
The fact that this makes a difference in stability on GF119 makes me 
wonder whether the PLL control register has a different layout on these 
GPUs. The BYPASS_PLL_CHECK bit might no longer be a BYPASS_PLL_CHECK 
bit, but have a completely different meaning. As this gen was 
centimetring towards Kepler, you might want to:
1) investigate behaviour of this bit and the rest of the control bits on 
an idle GF119. Got a machine with an Intel/AMD IGP? Good, stick the 
GF119 in as a secondary GPU and you can trace while messing with the 
VDEC PLL or some other insignificant domain's PLL manually without 
hanging the card or your machine,
2) Look at Kepler code either in nouveau or nvgpu to see if their PLL 
reconfiguration code matches your traces closer,
3) Be cheeky and ask NVIDIA.

Depending on further investigation you may have to create a new 
clk/gf119.c subdev to facilitate potential differences between earlier 
Fermis and this one.

Roy


On 06/09/2019 12:43, Mark Menzynski wrote:
> I have looked at problem with Fermi GPUs where changing to higher clock
> led to really bad perfomance (with GpuTest 20x worse perfomance) and later also crashes of the nouveau. It seemed
> to be affected by Shader Clock in Voltage Entries in the video BIOS. Disabling
> BYPASS_PLL_CHECK in CLK0_CTRL seems to completely fix the issue. I have
> tried to search this BYPASS_PLL_CHECK in Nvidia traces but seemed it
> wasn't used nowhere for CLK settings.
> 
> Removing this works fine, but I don't know what it's really for.
> Actual bit setting this BYPASS_PLL_CHECK is on 0x10:
> 	lookup -ac0 0x137000 0x10
> 	PCLOCK.CLK0_CTRL => { BYPASS_PLL_CHECK | UNK12 = 0 }
> Also, disabling this bit on other CLKs doesn't seem to break anything.
> 
> v2: add back PLL lock test
> v3: add restoring original value after PLL lock test
> 
> Signed-off-by: Mark Menzynski <mmenzyns@redhat.com>
> ---
>   drm/nouveau/nvkm/subdev/clk/gf100.c | 7 +++++--
>   drm/nouveau/nvkm/subdev/clk/gk104.c | 5 ++++-
>   2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drm/nouveau/nvkm/subdev/clk/gf100.c b/drm/nouveau/nvkm/subdev/clk/gf100.c
> index 7f67f9f5..a97af0e9 100644
> --- a/drm/nouveau/nvkm/subdev/clk/gf100.c
> +++ b/drm/nouveau/nvkm/subdev/clk/gf100.c
> @@ -368,6 +368,7 @@ gf100_clk_prog_2(struct gf100_clk *clk, int idx)
>   	struct gf100_clk_info *info = &clk->eng[idx];
>   	struct nvkm_device *device = clk->base.subdev.device;
>   	const u32 addr = 0x137000 + (idx * 0x20);
> +	bool bypass_state = false;
>   	if (idx <= 7) {
>   		nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000000);
>   		nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000000);
> @@ -376,12 +377,14 @@ gf100_clk_prog_2(struct gf100_clk *clk, int idx)
>   			nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000001);
>   
>   			/* Test PLL lock */
> +			bypass_state = nvkm_rd32(device, addr + 0x00) & 0x00000010;
>   			nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000000);
>   			nvkm_msec(device, 2000,
>   				if (nvkm_rd32(device, addr + 0x00) & 0x00020000)
>   					break;
>   			);
> -			nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
> +			if (bypass_state)
> +				nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
>   
>   			/* Enable sync mode */
>   			nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000004);
> @@ -476,5 +479,5 @@ gf100_clk_new(struct nvkm_device *device, int index, struct nvkm_clk **pclk)
>   		return -ENOMEM;
>   	*pclk = &clk->base;
>   
> -	return nvkm_clk_ctor(&gf100_clk, device, index, false, &clk->base);
> +	return nvkm_clk_ctor(&gf100_clk, device, index, true, &clk->base);
>   }
> diff --git a/drm/nouveau/nvkm/subdev/clk/gk104.c b/drm/nouveau/nvkm/subdev/clk/gk104.c
> index 0b37e3da..c9ede404 100644
> --- a/drm/nouveau/nvkm/subdev/clk/gk104.c
> +++ b/drm/nouveau/nvkm/subdev/clk/gk104.c
> @@ -388,6 +388,7 @@ gk104_clk_prog_2(struct gk104_clk *clk, int idx)
>   	struct gk104_clk_info *info = &clk->eng[idx];
>   	struct nvkm_device *device = clk->base.subdev.device;
>   	const u32 addr = 0x137000 + (idx * 0x20);
> +	bool bypass_state = false;
>   	nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000000);
>   	nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000000);
>   	if (info->coef) {
> @@ -395,12 +396,14 @@ gk104_clk_prog_2(struct gk104_clk *clk, int idx)
>   		nvkm_mask(device, addr + 0x00, 0x00000001, 0x00000001);
>   
>   		/* Test PLL lock */
> +		bypass_state = nvkm_rd32(device, addr + 0x00);
>   		nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000000);
>   		nvkm_msec(device, 2000,
>   			if (nvkm_rd32(device, addr + 0x00) & 0x00020000)
>   				break;
>   		);
> -		nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
> +		if (bypass_state)
> +			nvkm_mask(device, addr + 0x00, 0x00000010, 0x00000010);
>   
>   		/* Enable sync mode */
>   		nvkm_mask(device, addr + 0x00, 0x00000004, 0x00000004);
>