drm/amdgpu: fix addr handling in amdgpu_vm_bo_update_mapping

Submitted by Cui, Flora on Sept. 26, 2016, 7:19 a.m.

Details

Message ID 20160926071900.GF9691@flora
State New
Series "drm/amdgpu: fix addr handling in amdgpu_vm_bo_update_mapping" ( rev: 2 ) in AMD X.Org drivers

Commit Message

Cui, Flora Sept. 26, 2016, 7:19 a.m.
On Sun, Sep 25, 2016 at 11:55:13AM +0200, Christian König wrote:
> From: Christian König <christian.koenig@amd.com>
> 
> Otherwise we will look at the wrong place in the IB when GART
> mappings are split into smaller updates.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 2bb78dc..da31189 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -1017,6 +1017,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
>  						    AMDGPU_GPU_PAGE_SIZE);
>  			pte[i] |= flags;
>  		}
> +		addr = 0;
>  	}
>  
>  	r = amdgpu_sync_fence(adev, &job->sync, exclusive);
> -- 
> 2.5.0
> 
>

IMHO this could fix the vmfault issue.

8<---
From cc7b5618665defd88e2adcd6f735562ecd784298 Mon Sep 17 00:00:00 2001
From: Flora Cui <Flora.Cui@amd.com>
Date: Mon, 26 Sep 2016 15:14:02 +0800
Subject: [PATCH] drm/amdgpu: add ttm_bind in amdgpu_vm_bo_update()

Change-Id: If73d5b06e9188e40250ccdfd1a2a659ed1ef52a6
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 2bb78dc..7f17127 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1166,6 +1166,8 @@  int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 			ttm = container_of(bo_va->bo->tbo.ttm, struct
 					   ttm_dma_tt, ttm);
 			pages_addr = ttm->dma_address;
+			amdgpu_ttm_bind(&bo_va->bo->tbo, mem);
+			addr = (u64)mem->start << PAGE_SHIFT;
 			break;
 
 		case TTM_PL_VRAM:

Comments

Please ignore this patch. It actually reverts the gtt mgr changes.

Yeah, that wouldn't really help; it would just make the problem less
likely to happen again.

Anyway Tom St confirmed that the patch seems to work for the open stack.

Anybody brave enough to throw an rb on this so that I can commit it?

Thanks,
Christian.

I'm reading through amdgpu_vm.c to try to see whether the patch is
correct, but I'm not that familiar with the VM side of things.  It seems
to boil down to calling params->func() with a new dst value of NULL, but
that's where I'm stuck at the moment since I don't know what func() is.
Nothing up to that point looks overtly wrong (like trying to use that
address as a source for a read/write).


Tom
The function called is either amdgpu_vm_do_set_ptes() or 
amdgpu_vm_do_copy_ptes().

But that is actually rather unrelated to how addr is handled locally in
the function being changed.

The point here is that we have already handled the address offset by
passing it to amdgpu_vm_map_gart(), so we shouldn't add the address again.

Regards,
Christian.

Maybe it would be cleaner to just pass 0 to amdgpu_vm_frag_ptes() or remove the parameter altogether.

Alex

Wait, never mind.  It would be good to add a comment that addr needs to be reset to 0 in this case since it's used below in amdgpu_vm_frag_ptes().

The patch is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
