drm/amdgpu: Fix mutex lock from atomic context.

Submitted by Chen, Guchun on Sept. 11, 2019, 6:54 a.m.

Details

Message ID BYAPR12MB280636CE43902317A10C3E8CF1B10@BYAPR12MB2806.namprd12.prod.outlook.com
State New
Series "drm/amdgpu: Fix mutex lock from atomic context." ( rev: 2 ) in AMD X.Org drivers

Commit Message

Chen, Guchun Sept. 11, 2019, 6:54 a.m.
Also it's irrelevant for this partilcular interrupt as this is generic RAS interrupt and not memory errors specific.
[Guchun] One typo: it should be "particular", not "partilcular". With that fixed, the patch is: Reviewed-by: Guchun Chen <guchun.chen@amd.com>


-----Original Message-----
From: Andrey Grodzovsky <andrey.grodzovsky@amd.com> 
Sent: Wednesday, September 11, 2019 3:41 AM
To: amd-gfx@lists.freedesktop.org
Cc: Chen, Guchun <Guchun.Chen@amd.com>; Zhou1, Tao <Tao.Zhou1@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

Problem:
amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu because writing to EEPROM during ASIC reset was unstable.
But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called directly from ISR context and so locking is not allowed. Also it's irrelevant for this partilcular interrupt as this is generic RAS interrupt and not memory errors specific.

Fix:
Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--
2.7.4

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 012034d..dd5da3c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -504,7 +504,9 @@  static inline int amdgpu_ras_reset_gpu(struct amdgpu_device *adev,
 	/* save bad page to eeprom before gpu reset,
 	 * i2c may be unstable in gpu reset
 	 */
-	amdgpu_ras_reserve_bad_pages(adev);
+	if (in_task())
+		amdgpu_ras_reserve_bad_pages(adev);
+
 	if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
 		schedule_work(&ras->recovery_work);
 	return 0;
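
The guard in the hunk above can be exercised outside the kernel with a small userspace sketch. Everything in it is a mock introduced for illustration and is not part of the driver: mock_in_task stands in for the kernel's in_task(), mock_reserve_bad_pages() for amdgpu_ras_reserve_bad_pages(), and mock_ras_reset_gpu() for the patched amdgpu_ras_reset_gpu(). The assert inside the mocked sleeping function is an assumption that plays the role of a might_sleep()-style check.

```c
/*
 * Userspace sketch of the guard added by this patch. All names prefixed
 * with mock_ are hypothetical stand-ins, not amdgpu or kernel symbols.
 */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the execution context. The kernel's in_task() derives this
 * from preempt_count(); a plain flag is used here for illustration only. */
static bool mock_in_task = true;

static int bad_pages_saved;  /* number of times the sleeping path ran */
static int recovery_queued;  /* number of times recovery work was queued */
static int in_recovery;      /* stands in for ras->in_recovery */

/* Stand-in for amdgpu_ras_reserve_bad_pages(). Per the patch subject the
 * real function takes a mutex, so it may sleep. The assert mimics a
 * "sleeping function called from atomic context" check. */
static void mock_reserve_bad_pages(void)
{
	assert(mock_in_task);
	bad_pages_saved++;
}

/* Stand-in for amdgpu_ras_reset_gpu() after the patch. */
static int mock_ras_reset_gpu(void)
{
	/* Only take the sleeping path when running in task context. */
	if (mock_in_task)
		mock_reserve_bad_pages();

	/* The real code uses atomic_cmpxchg(&ras->in_recovery, 0, 1) and
	 * schedule_work(); a plain compare-and-set is enough for a
	 * single-threaded sketch. */
	if (in_recovery == 0) {
		in_recovery = 1;
		recovery_queued++;
	}
	return 0;
}
```

Calling mock_ras_reset_gpu() with mock_in_task set to false skips the bad-page save but still queues recovery, which mirrors the behaviour the patch intends for the ERREVENT_ATHUB_INTERRUPT path. Note that in_task() only distinguishes task context from hard IRQ, soft IRQ and NMI context; it does not detect a task-context caller that holds a spinlock or has preemption disabled.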