[1/4] drm/amdgpu: Optimization of critical section

Submitted by Xie, AlexBin on June 12, 2017, 8:31 p.m.

Details

Message ID 1497299485-7337-1-git-send-email-AlexBin.Xie@amd.com
State New
Series "Series without cover letter" ( rev: 1 ) in AMD X.Org drivers


Commit Message

Xie, AlexBin June 12, 2017, 8:31 p.m.
This patch moves a loop that unrefs BOs and
several memory-free function calls out of
the critical section.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)


diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index a664987..02c138f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -75,9 +75,12 @@  static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
 		/* Another user may have a reference to this list still */
 		mutex_lock(&list->lock);
 		mutex_unlock(&list->lock);
+		mutex_unlock(&fpriv->bo_list_lock);
 		amdgpu_bo_list_free(list);
 	}
-	mutex_unlock(&fpriv->bo_list_lock);
+	else {
+		mutex_unlock(&fpriv->bo_list_lock);
+	}
 }
 
 static int amdgpu_bo_list_set(struct amdgpu_device *adev,

Comments

On 12.06.2017 at 22:31, Alex Xie wrote:
> This patch moves a loop that unrefs BOs and
> several memory-free function calls out of
> the critical section.
>
> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> index a664987..02c138f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
>   		/* Another user may have a reference to this list still */
>   		mutex_lock(&list->lock);
>   		mutex_unlock(&list->lock);
> +		mutex_unlock(&fpriv->bo_list_lock);
>   		amdgpu_bo_list_free(list);
>   	}
> -	mutex_unlock(&fpriv->bo_list_lock);
> +	else {
> +		mutex_unlock(&fpriv->bo_list_lock);
> +	}

You could move the unlock of bo_list_lock even before the if.
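
For illustration, a rough sketch of that variant (filling in the parts of
the function that the hunk does not show from the snippets quoted later
in this thread, so treat it as an assumption, not the exact code):

static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
	struct amdgpu_bo_list *list;

	mutex_lock(&fpriv->bo_list_lock);
	list = idr_remove(&fpriv->bo_list_handles, id);
	/* the idr entry is gone, so no new lookup can find the list */
	mutex_unlock(&fpriv->bo_list_lock);
	if (list) {
		/* Another user may have a reference to this list still */
		mutex_lock(&list->lock);
		mutex_unlock(&list->lock);
		amdgpu_bo_list_free(list);
	}
}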

But since you pointed it out there is quite a bug in this function:
>                 /* Another user may have a reference to this list still */
>                 mutex_lock(&list->lock);
>                 mutex_unlock(&list->lock);
Not sure if that is still up to date, but that use case used to be 
illegal with mutexes.

Christian.

>   }
>   
>   static int amdgpu_bo_list_set(struct amdgpu_device *adev,
On 13 June 2017 at 09:00, Christian König <deathsimple@vodafone.de> wrote:
> On 12.06.2017 at 22:31, Alex Xie wrote:
>>
>> This patch moves a loop that unrefs BOs and
>> several memory-free function calls out of
>> the critical section.
>>
>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> index a664987..02c138f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv
>> *fpriv, int id)
>>                 /* Another user may have a reference to this list still */
>>                 mutex_lock(&list->lock);
>>                 mutex_unlock(&list->lock);
>> +               mutex_unlock(&fpriv->bo_list_lock);
>>                 amdgpu_bo_list_free(list);
>>         }
>> -       mutex_unlock(&fpriv->bo_list_lock);
>> +       else {
>> +               mutex_unlock(&fpriv->bo_list_lock);
>> +       }
>
>
> You could move the unlock of bo_list_lock even before the if.
>
> But since you pointed it out there is quite a bug in this function:
>>
>>                 /* Another user may have a reference to this list still */
>>                 mutex_lock(&list->lock);
>>                 mutex_unlock(&list->lock);
>
> Not sure if that is still up to date, but that use case used to be illegal
> with mutexes.

Oh wow, it looks like this code has refcounting bugs; there should never
be a reason to lock/unlock to find another user.

Can anyone say krefs?

Dave.
On 2017-06-12 07:00 PM, Christian König wrote:
> On 12.06.2017 at 22:31, Alex Xie wrote:
>> This patch moves a loop that unrefs BOs and
>> several memory-free function calls out of
>> the critical section.
>>
>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> index a664987..02c138f 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct 
>> amdgpu_fpriv *fpriv, int id)
>>           /* Another user may have a reference to this list still */
>>           mutex_lock(&list->lock);
>>           mutex_unlock(&list->lock);
>> +        mutex_unlock(&fpriv->bo_list_lock);
>>           amdgpu_bo_list_free(list);
>>       }
>> -    mutex_unlock(&fpriv->bo_list_lock);
>> +    else {
>> +        mutex_unlock(&fpriv->bo_list_lock);
>> +    }
>
> You could move the unlock of bo_list_lock even before the if.
>
> But since you pointed it out there is quite a bug in this function:
>>                 /* Another user may have a reference to this list 
>> still */
>>                 mutex_lock(&list->lock);
>>                 mutex_unlock(&list->lock);
> Not sure if that is still up to date, but that use case used to be 
> illegal with mutexes.
As I understand this piece of code, this mutex_lock/mutex_unlock
pair is used to make sure all other tasks have finished
accessing the bo list. The other side of this story is in function
amdgpu_bo_list_get. These two pieces of code together make sure
we can safely destroy the bo list.

Otherwise we could easily simplify this locking.
Let me give an example.

In function amdgpu_bo_list_get, assuming we change the code like this:
...
     down_read(&fpriv->bo_list_lock);
     result = idr_find(&fpriv->bo_list_handles, id);
     up_read(&fpriv->bo_list_lock);
     /**Line 1. Task A was scheduled away from CPU**/
     if (result)
         mutex_lock(&result->lock);
...

In function amdgpu_bo_list_destroy, assuming we change the code like this:
...
     down_write(&fpriv->bo_list_lock);
     list = idr_remove(&fpriv->bo_list_handles, id);
     up_write(&fpriv->bo_list_lock);
     if (list) {
         /* Another user may have a reference to this list still */
         mutex_lock(&list->lock);
         mutex_unlock(&list->lock);
         amdgpu_bo_list_free(list);
     }
}

When task A is running in function amdgpu_bo_list_get at Line 1, the CPU
scheduler takes the CPU away from task A.
Task B then runs amdgpu_bo_list_destroy and can run all the way through,
destroying and freeing the mutex.
When task A resumes, the mutex result->lock has already been destroyed
by task B, so task A tries to lock a mutex
which has been destroyed and freed.


>
> Christian.
>
>>   }
>>     static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>
>
On 13.06.2017 at 04:25, Dave Airlie wrote:
> On 13 June 2017 at 09:00, Christian König <deathsimple@vodafone.de> wrote:
>> On 12.06.2017 at 22:31, Alex Xie wrote:
>>> This patch moves a loop that unrefs BOs and
>>> several memory-free function calls out of
>>> the critical section.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index a664987..02c138f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct amdgpu_fpriv
>>> *fpriv, int id)
>>>                  /* Another user may have a reference to this list still */
>>>                  mutex_lock(&list->lock);
>>>                  mutex_unlock(&list->lock);
>>> +               mutex_unlock(&fpriv->bo_list_lock);
>>>                  amdgpu_bo_list_free(list);
>>>          }
>>> -       mutex_unlock(&fpriv->bo_list_lock);
>>> +       else {
>>> +               mutex_unlock(&fpriv->bo_list_lock);
>>> +       }
>>
>> You could move the unlock of bo_list_lock even before the if.
>>
>> But since you pointed it out there is quite a bug in this function:
>>>                  /* Another user may have a reference to this list still */
>>>                  mutex_lock(&list->lock);
>>>                  mutex_unlock(&list->lock);
>> Not sure if that is still up to date, but that use case used to be illegal
>> with mutexes.
> Oh wow, it looks like this code has refcounting bugs; there should never
> be a reason to lock/unlock to find another user.

Actually that is not the issue.

We remove the bo_list object from the idr first, which only works when
nobody else is looking it up at the same time.

Taking and releasing the lock then should make sure that nobody is 
actually using the object any more in another thread.

> Can anyone say krefs?

The problem is rather that unlocking a mutex directly before freeing it
used to be illegal, because the mutex code can still work with the
structure even after releasing the lock.

But the best solution would be to do the whole thing lockless with a 
kref and/or atomic updates.
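
For illustration only, a minimal sketch of the kref direction (the
refcount field and release helper are hypothetical here, not existing
amdgpu code):

#include <linux/kref.h>

/* hypothetical: struct amdgpu_bo_list gains a 'struct kref refcount',
 * initialized with kref_init() when the list is created */

static void amdgpu_bo_list_release(struct kref *kref)
{
	struct amdgpu_bo_list *list =
		container_of(kref, struct amdgpu_bo_list, refcount);

	amdgpu_bo_list_free(list);
}

/* lookup path: take a reference instead of holding the list locked */
	kref_get(&list->refcount);

/* destroy path: just drop the idr's reference; the last put frees,
 * so no lock/unlock trick is needed to wait for other users */
	kref_put(&list->refcount, amdgpu_bo_list_release);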

Regards,
Christian.

>
> Dave.
On 13.06.2017 at 08:29, axie wrote:
> On 2017-06-12 07:00 PM, Christian König wrote:
>> On 12.06.2017 at 22:31, Alex Xie wrote:
>>> This patch moves a loop that unrefs BOs and
>>> several memory-free function calls out of
>>> the critical section.
>>>
>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> index a664987..02c138f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct 
>>> amdgpu_fpriv *fpriv, int id)
>>>           /* Another user may have a reference to this list still */
>>>           mutex_lock(&list->lock);
>>>           mutex_unlock(&list->lock);
>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>>           amdgpu_bo_list_free(list);
>>>       }
>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>> +    else {
>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>> +    }
>>
>> You could move the unlock of bo_list_lock even before the if.
>>
>> But since you pointed it out there is quite a bug in this function:
>>>                 /* Another user may have a reference to this list 
>>> still */
>>>                 mutex_lock(&list->lock);
>>>                 mutex_unlock(&list->lock);
>> Not sure if that is still up to date, but that use case used to be 
>> illegal with mutexes.
> As I understand this piece of code, this mutex_lock/mutex_unlock
> pair is used to make sure all other tasks have finished
> accessing the bo list. The other side of this story is in function
> amdgpu_bo_list_get. These two pieces of code together make sure
> we can safely destroy the bo list.

Yeah, the idea behind the code is correct. But using mutexes in that way
is illegal; see https://lwn.net/Articles/575460/.

I'm not sure if that is still up to date, but in ancient times you 
needed to avoid patterns like this:
mutex_unlock(&obj->lock);
kfree(obj);
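
Roughly, the hazard described in the article is this interleaving (a
compressed illustration, assuming the old mutex fastpath behaviour):

/*
 *   Task 1                          Task 2
 *   mutex_unlock(&obj->lock);
 *     hands over ownership   --->   mutex_lock(&obj->lock);   succeeds
 *                                    mutex_unlock(&obj->lock);
 *                                    kfree(obj);
 *   ...unlock slowpath still
 *   touches obj->lock        --->   use-after-free
 */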

Anyway I suggest we just replace the whole bo_list handling with an RCU
and refcount based implementation. That should avoid the whole locking
for the read-only code path.
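
Just to sketch what the read-only path could then look like (again
hypothetical: it assumes the bo_list gains a kref named 'refcount' and
is freed via kfree_rcu() on the last put):

	rcu_read_lock();
	list = idr_find(&fpriv->bo_list_handles, id);
	/* only take a reference if the object is not already being torn down */
	if (list && !kref_get_unless_zero(&list->refcount))
		list = NULL;
	rcu_read_unlock();

With the final kref_put() freeing the object through kfree_rcu(), a
concurrent reader can never touch freed memory.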

Regards,
Christian.

>
> Otherwise we could easily simplify this locking.
> Let me give an example.
>
> In function amdgpu_bo_list_get, assuming we change the code like this:
> ...
>     down_read(&fpriv->bo_list_lock);
>     result = idr_find(&fpriv->bo_list_handles, id);
>     up_read(&fpriv->bo_list_lock);
>     /**Line 1. Task A was scheduled away from CPU**/
>     if (result)
>         mutex_lock(&result->lock);
> ...
>
> In function amdgpu_bo_list_destroy, assuming we change the code like 
> this:
> ...
>     down_write(&fpriv->bo_list_lock);
>     list = idr_remove(&fpriv->bo_list_handles, id);
>     up_write(&fpriv->bo_list_lock);
>     if (list) {
>         /* Another user may have a reference to this list still */
>         mutex_lock(&list->lock);
>         mutex_unlock(&list->lock);
>         amdgpu_bo_list_free(list);
>     }
> }
>
> When task A is running in function amdgpu_bo_list_get at Line 1, the CPU
> scheduler takes the CPU away from task A.
> Task B then runs amdgpu_bo_list_destroy and can run all the way through,
> destroying and freeing the mutex.
> When task A resumes, the mutex result->lock has already been destroyed
> by task B, so task A tries to lock a mutex
> which has been destroyed and freed.
>
>
>>
>> Christian.
>>
>>>   }
>>>     static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>>
>>
>
On 2017-06-13 06:23 AM, Christian König wrote:
> On 13.06.2017 at 08:29, axie wrote:
>> On 2017-06-12 07:00 PM, Christian König wrote:
>>> On 12.06.2017 at 22:31, Alex Xie wrote:
>>>> This patch moves a loop that unrefs BOs and
>>>> several memory-free function calls out of
>>>> the critical section.
>>>>
>>>> Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 5 ++++-
>>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>>> index a664987..02c138f 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
>>>> @@ -75,9 +75,12 @@ static void amdgpu_bo_list_destroy(struct 
>>>> amdgpu_fpriv *fpriv, int id)
>>>>           /* Another user may have a reference to this list still */
>>>>           mutex_lock(&list->lock);
>>>>           mutex_unlock(&list->lock);
>>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>>>           amdgpu_bo_list_free(list);
>>>>       }
>>>> -    mutex_unlock(&fpriv->bo_list_lock);
>>>> +    else {
>>>> +        mutex_unlock(&fpriv->bo_list_lock);
>>>> +    }
>>>
>>> You could move the unlock of bo_list_lock even before the if.
>>>
>>> But since you pointed it out there is quite a bug in this function:
>>>>                 /* Another user may have a reference to this list 
>>>> still */
>>>>                 mutex_lock(&list->lock);
>>>>                 mutex_unlock(&list->lock);
>>> Not sure if that is still up to date, but that use case used to be 
>>> illegal with mutexes.
>> As I understand this piece of code, this mutex_lock/mutex_unlock
>> pair is used to make sure all other tasks have finished
>> accessing the bo list. The other side of this story is in function
>> amdgpu_bo_list_get. These two pieces of code together make sure
>> we can safely destroy the bo list.
>
> Yeah, the idea behind the code is correct. But using mutexes in that
> way is illegal; see https://lwn.net/Articles/575460/.
>
> I'm not sure if that is still up to date, but in ancient times you 
> needed to avoid patterns like this:
> mutex_unlock(&obj->lock);
> kfree(obj);
>
> Anyway I suggest we just replace the whole bo_list handling with an RCU
> and refcount based implementation. That should avoid the whole locking
> for the read-only code path.
>
> Regards,
> Christian.
>

The article may still be generally applicable, but it does not apply to
amdgpu_bo_list.c, which takes two extra steps to avoid a similar issue.

1. In function amdgpu_bo_list_destroy,
...
     /* Note: bo_list_lock is a mutex or a write lock here. */
     down_write(&fpriv->bo_list_lock);
     list = idr_remove(&fpriv->bo_list_handles, id);
...
With the above statements, no new task (user) can access the bo list.

2. With the above statements, no task can still be waiting for the
bo list's own lock once amdgpu_bo_list_destroy holds bo_list_lock.
Why? In function amdgpu_bo_list_get, other tasks need to hold the
bo_list_lock mutex or read lock in order to wait for result->lock.
amdgpu_bo_list_get(struct amdgpu_fpriv *fpriv, int id)
{
   ...
     down_read(&fpriv->bo_list_lock);
...
         mutex_lock(&result->lock);
...
     up_read(&fpriv->bo_list_lock);
}
So while the amdgpu_bo_list_destroy function is holding bo_list_lock,
no other task is waiting for result->lock.

3.
From the above two points, we know that while the destroy function is
holding bo_list_lock, either zero or one task is holding
list->lock. There is no other scenario.

The following piece of code waits for any existing bo list
task (user) to finish accessing the bo list:
static void amdgpu_bo_list_destroy(struct amdgpu_fpriv *fpriv, int id)
{
...
     down_write(&fpriv->bo_list_lock);
...
         /* Another user may have a reference to this list still */
     /* Wait for the existing task/user to unlock; after this wait,
      * we can safely destroy and free the mutex.  [Alex Bin] */
     mutex_lock(&list->lock);
     mutex_unlock(&list->lock);

         up_write(&fpriv->bo_list_lock);
         amdgpu_bo_list_free(list);
...
NOTE: While we are holding bo_list_lock, there is only one possible
use of list->lock left, that is, the unlock of list->lock in function
amdgpu_bo_list_put.


PLEASE give me an example if I have missed anything.

Thanks for all other comments.

RCU and kref might be a cleaner solution than an rw_semaphore, but that
requires a bigger change to the current implementation.
I will have a look at that later.

Regards,
Alex Bin Xie

>>
>> Otherwise we could easily simplify this locking.
>> Let me give an example.
>>
>> In function amdgpu_bo_list_get, assuming we change the code like this:
>> ...
>>     down_read(&fpriv->bo_list_lock);
>>     result = idr_find(&fpriv->bo_list_handles, id);
>>     up_read(&fpriv->bo_list_lock);
>>     /**Line 1. Task A was scheduled away from CPU**/
>>     if (result)
>>         mutex_lock(&result->lock);
>> ...
>>
>> In function amdgpu_bo_list_destroy, assuming we change the code like 
>> this:
>> ...
>>     down_write(&fpriv->bo_list_lock);
>>     list = idr_remove(&fpriv->bo_list_handles, id);
>>     up_write(&fpriv->bo_list_lock);
>>     if (list) {
>>         /* Another user may have a reference to this list still */
>>         mutex_lock(&list->lock);
>>         mutex_unlock(&list->lock);
>>         amdgpu_bo_list_free(list);
>>     }
>> }
>>
>> When task A is running in function amdgpu_bo_list_get at Line 1, the CPU
>> scheduler takes the CPU away from task A.
>> Task B then runs amdgpu_bo_list_destroy and can run all the way through,
>> destroying and freeing the mutex.
>> When task A resumes, the mutex result->lock has already been destroyed
>> by task B, so task A tries to lock a mutex
>> which has been destroyed and freed.
>>
>>
>>>
>>> Christian.
>>>
>>>>   }
>>>>     static int amdgpu_bo_list_set(struct amdgpu_device *adev,
>>>
>>>
>>
>
>