drm/amd/display: avoid 64-bit division

Submitted by Arnd Bergmann on July 8, 2019, 1:52 p.m.

Details

Message ID 20190708135238.651483-1-arnd@arndb.de
State New
Series "drm/amd/display: avoid 64-bit division" ( rev: 1 ) in DRI devel

Commit Message

Arnd Bergmann July 8, 2019, 1:52 p.m.
On 32-bit architectures, dividing a 64-bit integer in the kernel
leads to a link error:

ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

Change the two recently introduced instances to a multiply+shift
operation that is also much cheaper on 32-bit architectures.
We can do that here, since both of them are really 32-bit numbers
that change a few percent.

Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link")
Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c         | 4 ++--
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

Patch

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index c17db5c144aa..8dbf759eba45 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -3072,8 +3072,8 @@  uint32_t dc_link_bandwidth_kbps(
 		 * but the difference is minimal and is in a safe direction,
 		 * which all works well around potential ambiguity of DP 1.4a spec.
 		 */
-		long long fec_link_bw_kbps = link_bw_kbps * 970LL;
-		link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL);
+		link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000,
+					       link_bw_kbps, 32);
 	}
 #endif
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index b35327bafbc5..70ac8a95d2db 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -2657,7 +2657,7 @@  static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_
 		calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000;
 
 		// FCLK:UCLK ratio is 1.08
-		min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 1000000;
+		min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 1000000, uclk_states[i], 32);
 
 		calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ?
 				min_dcfclk : min_fclk_required_by_uclk;
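
To illustrate the multiply+shift idea used in the dc_link.c hunk, here is a minimal userspace sketch. It is not the kernel code: mul_u64_u32_shr() below is a stand-in written for this example (modelled on the generic helper in <linux/math64.h>), and FEC_SCALE, fec_bw_shift() and fec_bw_div() are hypothetical names. The constant is 0.97 scaled by 2^32 and truncated, so the shifted product can come out one lower than the exact quotient.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in (an assumption for this sketch) for the kernel's
 * mul_u64_u32_shr(): returns (a * mul) >> shift using only 32x32->64
 * multiplies, so no 64-bit division helper such as __udivdi3 is needed.
 * Requires shift <= 32 and a result that fits in 64 bits.
 */
static uint64_t mul_u64_u32_shr(uint64_t a, uint32_t mul, unsigned int shift)
{
	uint32_t al = (uint32_t)a;         /* low 32 bits of a */
	uint32_t ah = (uint32_t)(a >> 32); /* high 32 bits of a */
	uint64_t ret;

	assert(shift <= 32);
	ret = ((uint64_t)al * mul) >> shift;
	if (ah)
		ret += ((uint64_t)ah * mul) << (32 - shift);
	return ret;
}

/* 0.97 in 32.32 fixed point, truncated; evaluated at compile time. */
#define FEC_SCALE ((1ULL << 32) * 970ULL / 1000)

/* Hypothetical helper: FEC overhead via multiply+shift, as in the patch. */
static uint32_t fec_bw_shift(uint32_t link_bw_kbps)
{
	return (uint32_t)mul_u64_u32_shr(FEC_SCALE, link_bw_kbps, 32);
}

/* Hypothetical helper: the same scaling with an exact 64-bit division. */
static uint32_t fec_bw_div(uint32_t link_bw_kbps)
{
	return (uint32_t)((uint64_t)link_bw_kbps * 970 / 1000);
}
```

For an input of 1000, fec_bw_div() returns 970 while fec_bw_shift() returns 969, because the truncated constant is slightly below 0.97 * 2^32. For any 32-bit input the shifted result is never above the exact quotient and never more than one below it.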

Comments

On 7/8/19 9:52 AM, Arnd Bergmann wrote:
> On 32-bit architectures, dividing a 64-bit integer in the kernel
> leads to a link error:
>
> ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
>
> Change the two recently introduced instances to a multiply+shift
> operation that is also much cheaper on 32-bit architectures.
> We can do that here, since both of them are really 32-bit numbers
> that change a few percent.
>
> Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link")
> Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>   drivers/gpu/drm/amd/display/dc/core/dc_link.c         | 4 ++--
>   drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
>   2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index c17db5c144aa..8dbf759eba45 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps(
>   		 * but the difference is minimal and is in a safe direction,
>   		 * which all works well around potential ambiguity of DP 1.4a spec.
>   		 */
> -		long long fec_link_bw_kbps = link_bw_kbps * 970LL;
> -		link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL);
> +		link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000,
> +					       link_bw_kbps, 32);
>   	}
>   #endif
>   
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> index b35327bafbc5..70ac8a95d2db 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_
>   		calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000;
>   
>   		// FCLK:UCLK ratio is 1.08
> -		min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 1000000;
> +		min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 1000000, uclk_states[i], 32);

Even though the mul + shift will be faster here, I would prefer that 
this just be a div_u64 for clarity.

Nicholas Kazlauskas

>   
>   		calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ?
>   				min_dcfclk : min_fclk_required_by_uclk;
>
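
For comparison, a rough userspace sketch of what the div_u64 form suggested above might look like next to the multiply+shift form from the patch. This is an illustration only, not the follow-up patch: div_u64() here is a stand-in for the kernel helper of the same name in <linux/math64.h>, fclk_mhz_div() and fclk_mhz_shift() are hypothetical names, and the input is assumed to be a 32-bit UCLK value in kHz.

```c
#include <stdint.h>

/*
 * Userspace stand-in (an assumption for this sketch) for the kernel's
 * div_u64(): a 64-bit dividend divided by a 32-bit divisor. In the kernel
 * this avoids the libgcc __udivdi3 call on 32-bit architectures.
 */
static uint64_t div_u64(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* Hypothetical helper: FCLK:UCLK ratio of 1.08, kHz in, MHz out, exact. */
static uint32_t fclk_mhz_div(uint32_t uclk_khz)
{
	return (uint32_t)div_u64((uint64_t)uclk_khz * 1080, 1000000);
}

/*
 * Hypothetical helper: the same scaling as a multiply+shift. The constant
 * BIT_ULL(32) * 1080 / 1000000 truncates to 4638564, which fits in 32 bits,
 * so a plain 64-bit multiply cannot overflow here.
 */
static uint32_t fclk_mhz_shift(uint32_t uclk_khz)
{
	const uint64_t scale = (1ULL << 32) * 1080ULL / 1000000;

	return (uint32_t)((scale * uclk_khz) >> 32);
}
```

With an input of 1000000 kHz, fclk_mhz_div() gives 1080 and fclk_mhz_shift() gives 1079. The division form reads more directly as "multiply by 1.08, convert kHz to MHz" and matches the original expression exactly, at the cost of a real 64-by-32 division on 32-bit targets.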