drm/amdgpu/sdma4:fix sdma engine hang

Submitted by Deng, Emily on April 2, 2018, 9:53 a.m.


Message ID CY4PR12MB11252C502EB9D72E1814F4418FA60@CY4PR12MB1125.namprd12.prod.outlook.com
State New
Series "drm/amdgpu/sdma4:fix sdma engine hang" ( rev: 3 ) in AMD X.Org drivers

Commit Message

Deng, Emily April 2, 2018, 9:53 a.m.
Hi Alex,
When the hdp bit is set, the poll reg mem command performs the following sequence:
Write the reference value to reg0,
Poll reg1,
Compare the reference value with (reg1 value & mask) (this detailed behavior was confirmed by the SDMA team),
but in our use case the reference value is not equal to the mask, and may not equal the reg1 value either, so the comparison never passes, which causes the SDMA hang.
   I also verified this behavior with the following test:
   Use the original method to issue the invalidate request and poll for the ack, but pass ref instead of mask as the value to compare against; the SDMA then hangs as well:
void amdgpu_ring_emit_reg_write_reg_wait_helper(struct amdgpu_ring *ring,
                                                uint32_t reg0, uint32_t reg1,
                                                uint32_t ref, uint32_t mask)
{
        amdgpu_ring_emit_wreg(ring, reg0, ref);
        /* original call: amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask); */
        amdgpu_ring_emit_reg_wait(ring, reg1, ref, mask);
}
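
The mismatch described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: poll_passes(), single_packet_passes(), helper_passes() and check_example() are hypothetical names, not driver or firmware functions, and the register values are made up for illustration. It assumes the poll succeeds only when (reg1 value & mask) equals the compare value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed model of the poll: it passes when (reg1 & mask) == compare value. */
bool poll_passes(uint32_t reg1_value, uint32_t compare, uint32_t mask)
{
	return (reg1_value & mask) == compare;
}

/*
 * Single-packet path with the hdp bit set: ref is written to reg0 and is
 * also the value compared against (reg1 & mask).
 */
bool single_packet_passes(uint32_t reg1_value, uint32_t ref, uint32_t mask)
{
	return poll_passes(reg1_value, ref, mask);
}

/*
 * Helper path: ref is written with a separate register write, and the wait
 * then compares (reg1 & mask) against mask, so ref no longer matters here.
 */
bool helper_passes(uint32_t reg1_value, uint32_t ref, uint32_t mask)
{
	(void)ref; /* only used by the separate write, not by the comparison */
	return poll_passes(reg1_value, mask, mask);
}

/* Self-check with illustrative values only (not taken from hardware). */
void check_example(void)
{
	const uint32_t req  = 0x000f0001u; /* hypothetical value written to reg0 */
	const uint32_t mask = 0x00000001u; /* hypothetical ack bit in reg1 */
	const uint32_t ack  = 0x00000001u; /* reg1 once the ack bit is set */

	assert(!single_packet_passes(ack, req, mask)); /* never matches: hang */
	assert(helper_passes(ack, req, mask));         /* matches once acked */
}
```

In this model the single-packet path can only complete when ref happens to equal the masked reg1 value, which matches the hang seen in the test above.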

Best Wishes,
Emily Deng

From: Deucher, Alexander
Sent: Friday, March 30, 2018 8:55 PM
To: Deng, Emily <Emily.Deng@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu/sdma4:fix sdma engine hang

The spec claims it does and we use it for HDP flush...

Acked-by: Alex Deucher <alexander.deucher@amd.com>

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 9ac28b2..84d148d 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1178,13 +1178,6 @@  static void sdma_v4_0_ring_emit_reg_wait(struct amdgpu_ring *ring, uint32_t reg,
         sdma_v4_0_wait_reg_mem(ring, 0, 0, reg, 0, val, mask, 10);
 }
 
-static void sdma_v4_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
-                                                  uint32_t reg0, uint32_t reg1,
-                                                  uint32_t ref, uint32_t mask)
-{
-       sdma_v4_0_wait_reg_mem(ring, 0, 1, reg0, reg1, ref, mask, 10);
-}
-
 static int sdma_v4_0_early_init(void *handle)
 {
         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -1626,7 +1619,7 @@  static const struct amdgpu_ring_funcs sdma_v4_0_ring_funcs = {
         .pad_ib = sdma_v4_0_ring_pad_ib,
         .emit_wreg = sdma_v4_0_ring_emit_wreg,
         .emit_reg_wait = sdma_v4_0_ring_emit_reg_wait,
-       .emit_reg_write_reg_wait = sdma_v4_0_ring_emit_reg_write_reg_wait,
+       .emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
 };
 
 static void sdma_v4_0_set_ring_funcs(struct amdgpu_device *adev)