[v4,20/37] volt: add coefficients

Submitted by Karol Herbst on April 18, 2016, 7:13 p.m.

Details

Message ID 1461006851-5007-21-git-send-email-nouveau@karolherbst.de
State New
Headers show
Series "Volting/Clocking improvements for Fermi and newer" ( rev: 4 ) in Nouveau

Not browsing as part of any series.

Commit Message

Karol Herbst April 18, 2016, 7:13 p.m.
I am sure that those are a bit different, but while testing the biggest error
compared to nvidia was -1.5%.

Without this change we are most of the time around 10% below nvidias voltage,
so this change causes no harm and improves the situation a lot already.

These coefficients were REed by modifing the voltage map entries and by
calculating the set voltage back until I was able to forecast which voltage
nvidia sets for a given voltage map entry.

v4: use better coefficients and speedo

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
---
 drm/nouveau/nvkm/subdev/volt/base.c | 38 +++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drm/nouveau/nvkm/subdev/volt/base.c b/drm/nouveau/nvkm/subdev/volt/base.c
index cecfac6..5e35d96 100644
--- a/drm/nouveau/nvkm/subdev/volt/base.c
+++ b/drm/nouveau/nvkm/subdev/volt/base.c
@@ -110,13 +110,47 @@  nvkm_volt_map(struct nvkm_volt *volt, u8 id, u8 temp)
 
 	vmap = nvbios_vmap_entry_parse(bios, id, &ver, &len, &info);
 	if (vmap) {
+		s64 result;
+
+		if (volt->speedo < 0)
+			return volt->speedo;
+
+		if (ver == 0x10 || (ver == 0x20 && info.mode == 0)) {
+			result  =  (s64)info.arg[0] / 10;
+			result += ((s64)info.arg[1] * volt->speedo) / 10;
+			result += ((s64)info.arg[2] * volt->speedo * volt->speedo) / 100000;
+		} else if (ver == 0x20) {
+			switch (info.mode) {
+			/* 0x0 handled above! */
+			case 0x1:
+				result =  ((s64)info.arg[0] * 15625) >> 18;
+				result += ((s64)info.arg[1] * volt->speedo * 15625) >> 18;
+				result += ((s64)info.arg[2] * temp * 15625) >> 10;
+				result += ((s64)info.arg[3] * volt->speedo * temp * 15625) >> 18;
+				result += ((s64)info.arg[4] * volt->speedo * volt->speedo * 15625) >> 30;
+				result += ((s64)info.arg[5] * temp * temp * 15625) >> 18;
+				break;
+			case 0x3:
+				result = (info.min + info.max) / 2;
+				break;
+			case 0x2:
+			default:
+				result = info.min;
+				break;
+			}
+		} else {
+			return -ENODEV;
+		}
+
+		result = min(max(result, (s64)info.min), (s64)info.max);
+
 		if (info.link != 0xff) {
 			int ret = nvkm_volt_map(volt, info.link, temp);
 			if (ret < 0)
 				return ret;
-			info.min += ret;
+			result += ret;
 		}
-		return info.min;
+		return result;
 	}
 
 	return id ? id * 10000 : -ENODEV;

Comments

On 18/04/16 22:13, Karol Herbst wrote:
> I am sure that those are a bit different, but while testing the biggest error
> compared to nvidia was -1.5%.

Is this still true? I thought we were *much* closer now.

>
> Without this change we are most of the time around 10% below nvidias voltage,
> so this change causes no harm and improves the situation a lot already.

Yeah, this is definitely not up to date!

>
> These coefficients were REed by modifing the voltage map entries and by
> calculating the set voltage back until I was able to forecast which voltage
> nvidia sets for a given voltage map entry.
>
> v4: use better coefficients and speedo
>
> Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
> ---
>   drm/nouveau/nvkm/subdev/volt/base.c | 38 +++++++++++++++++++++++++++++++++++--
>   1 file changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/drm/nouveau/nvkm/subdev/volt/base.c b/drm/nouveau/nvkm/subdev/volt/base.c
> index cecfac6..5e35d96 100644
> --- a/drm/nouveau/nvkm/subdev/volt/base.c
> +++ b/drm/nouveau/nvkm/subdev/volt/base.c
> @@ -110,13 +110,47 @@ nvkm_volt_map(struct nvkm_volt *volt, u8 id, u8 temp)
>   
>   	vmap = nvbios_vmap_entry_parse(bios, id, &ver, &len, &info);
>   	if (vmap) {
> +		s64 result;
> +
> +		if (volt->speedo < 0)
> +			return volt->speedo;

Hmm, so you will refuse reclocking if the speedo cannot be read... 
Fair-enough, but I would like to see a warning in the kernel logs.

> +
> +		if (ver == 0x10 || (ver == 0x20 && info.mode == 0)) {
> +			result  =  (s64)info.arg[0] / 10;
> +			result += ((s64)info.arg[1] * volt->speedo) / 10;
> +			result += ((s64)info.arg[2] * volt->speedo * volt->speedo) / 100000;
> +		} else if (ver == 0x20) {
> +			switch (info.mode) {
> +			/* 0x0 handled above! */
> +			case 0x1:
> +				result =  ((s64)info.arg[0] * 15625) >> 18;
> +				result += ((s64)info.arg[1] * volt->speedo * 15625) >> 18;
> +				result += ((s64)info.arg[2] * temp * 15625) >> 10;
> +				result += ((s64)info.arg[3] * volt->speedo * temp * 15625) >> 18;
> +				result += ((s64)info.arg[4] * volt->speedo * volt->speedo * 15625) >> 30;
> +				result += ((s64)info.arg[5] * temp * temp * 15625) >> 18;
> +				break;

Well, I can only say that these values got us really really close to 
what nvidia does... On around 10 GPUs... So... until we know better, 
let's stick to them.

Hopefully, we can figure out where this 15625 comes from. This patch is:

Reviewed-by: Martin Peres <martin.peres@free.fr>

> +			case 0x3:
> +				result = (info.min + info.max) / 2;
> +				break;
> +			case 0x2:
> +			default:
> +				result = info.min;
> +				break;
> +			}
> +		} else {
> +			return -ENODEV;
> +		}
> +
> +		result = min(max(result, (s64)info.min), (s64)info.max);
> +
>   		if (info.link != 0xff) {
>   			int ret = nvkm_volt_map(volt, info.link, temp);
>   			if (ret < 0)
>   				return ret;
> -			info.min += ret;
> +			result += ret;
>   		}
> -		return info.min;
> +		return result;
>   	}
>   
>   	return id ? id * 10000 : -ENODEV;
On Tue, Apr 19, 2016 at 5:52 PM, Martin Peres <martin.peres@free.fr> wrote:
>> +                               result =  ((s64)info.arg[0] * 15625) >>
>> 18;
>> +                               result += ((s64)info.arg[1] * volt->speedo
>> * 15625) >> 18;
>> +                               result += ((s64)info.arg[2] * temp *
>> 15625) >> 10;
>> +                               result += ((s64)info.arg[3] * volt->speedo
>> * temp * 15625) >> 18;
>> +                               result += ((s64)info.arg[4] * volt->speedo
>> * volt->speedo * 15625) >> 30;
>> +                               result += ((s64)info.arg[5] * temp * temp
>> * 15625) >> 18;
>> +                               break;
>
>
> Well, I can only say that these values got us really really close to what
> nvidia does... On around 10 GPUs... So... until we know better, let's stick
> to them.
>
> Hopefully, we can figure out where this 15625 comes from. This patch is:

5^6. Really it's 10^6, but it's silly to multiply by 10^6 and then
divide by powers of 2. Instead that's factored out...

  -ilia
On 20/04/16 00:54, Ilia Mirkin wrote:
> On Tue, Apr 19, 2016 at 5:52 PM, Martin Peres <martin.peres@free.fr> wrote:
>>> +                               result =  ((s64)info.arg[0] * 15625) >>
>>> 18;
>>> +                               result += ((s64)info.arg[1] * volt->speedo
>>> * 15625) >> 18;
>>> +                               result += ((s64)info.arg[2] * temp *
>>> 15625) >> 10;
>>> +                               result += ((s64)info.arg[3] * volt->speedo
>>> * temp * 15625) >> 18;
>>> +                               result += ((s64)info.arg[4] * volt->speedo
>>> * volt->speedo * 15625) >> 30;
>>> +                               result += ((s64)info.arg[5] * temp * temp
>>> * 15625) >> 18;
>>> +                               break;
>>
>> Well, I can only say that these values got us really really close to what
>> nvidia does... On around 10 GPUs... So... until we know better, let's stick
>> to them.
>>
>> Hopefully, we can figure out where this 15625 comes from. This patch is:
> 5^6. Really it's 10^6, but it's silly to multiply by 10^6 and then
> divide by powers of 2. Instead that's factored out...

That makes sense. I must have missed this on the IRC discussion.

Major congrats to Karol and you for figuring out all of this!

@Ben: I really think this is the right thing to do, it really was solid 
work for Karol's part.