drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback

Submitted by Marek Olšák on July 2, 2019, 6:29 p.m.

Message ID 20190702182901.22491-1-maraeo@gmail.com
State New
Series "drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback" (rev: 1) in AMD X.Org drivers

Commit Message

From: Marek Olšák <marek.olsak@amd.com>

This use of RELEASE_MEM has Release semantics, which means it should write
back caches but not invalidate them. Invalidations only make sense with
Acquire semantics (ACQUIRE_MEM), or when RELEASE_MEM is used for combined
Acquire-Release semantics, which is a barrier, not a fence.
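
The Release/Acquire split described above has a close counterpart in CPU
memory ordering. The following is only a loose analogy using C11 atomics,
not GPU packet code; the names payload, ready, publish and consume are
hypothetical. The release side makes earlier writes visible before
signaling, and the acquire side is the one responsible for reading fresh
data after it sees the signal.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state, for illustration only. */
static int payload;
static atomic_bool ready;

/*
 * Release side: earlier writes become visible before the flag is set.
 * This loosely corresponds to "write back, then signal the fence".
 */
static void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Acquire side: observe the flag first, then read the data.
 * This loosely corresponds to "invalidate stale copies before reading".
 * Returns false if nothing has been published yet.
 */
static bool consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

In this analogy the producer never needs to discard anything it has
cached; only the consumer has to make sure it does not read stale data.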

The undesirable side effect of doing invalidations for the Release semantic
is that it invalidates caches while shaders are running, because the Release
can execute in the middle of the next IB.

UMDs should use ACQUIRE_MEM at the beginning of IBs. Invalidating caches
for a fence, as this code did, contributes nothing to correctness.
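
As a rough sketch of the resulting flag change, the old and new GCR
masks can be compared directly. The EX_GCR_* bit positions below are
hypothetical stand-ins chosen for illustration; the real
PACKET3_RELEASE_MEM_GCR_* values are defined in the amdgpu headers and
are not reproduced here. GLM_INV stays in the new mask because the
patch notes that it must be set together with GLM_WB.

```c
#include <stdint.h>

/* Hypothetical bit positions, NOT the real hardware encoding. */
enum {
	EX_GCR_GLM_WB  = 1u << 0,
	EX_GCR_GLM_INV = 1u << 1,
	EX_GCR_GLV_INV = 1u << 2,
	EX_GCR_GL1_INV = 1u << 3,
	EX_GCR_GL2_US  = 1u << 4,
	EX_GCR_GL2_INV = 1u << 5,
	EX_GCR_GL2_WB  = 1u << 6,
	EX_GCR_SEQ     = 1u << 7,
};

/* Flags emitted before the patch: writeback plus invalidation. */
static uint32_t gcr_before(void)
{
	return EX_GCR_SEQ | EX_GCR_GL2_WB | EX_GCR_GL2_INV | EX_GCR_GL2_US |
	       EX_GCR_GL1_INV | EX_GCR_GLV_INV | EX_GCR_GLM_INV |
	       EX_GCR_GLM_WB;
}

/* Flags emitted after the patch: writeback only, GLM_INV kept with GLM_WB. */
static uint32_t gcr_after(void)
{
	return EX_GCR_SEQ | EX_GCR_GL2_WB | EX_GCR_GLM_INV | EX_GCR_GLM_WB;
}

/* Bits that the patch removes from the fence packet. */
static uint32_t gcr_dropped(void)
{
	return gcr_before() & ~gcr_after();
}
```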

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 210d24511dc6..a30f5d4913b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4296,25 +4296,21 @@  static void gfx_v10_0_ring_emit_fence(struct amdgpu_ring *ring, u64 addr,
 	bool int_sel = flags & AMDGPU_FENCE_FLAG_INT;
 
 	/* Interrupt not work fine on GFX10.1 model yet. Use fallback instead */
 	if (adev->pdev->device == 0x50)
 		int_sel = false;
 
 	/* RELEASE_MEM - flush caches, send int */
 	amdgpu_ring_write(ring, PACKET3(PACKET3_RELEASE_MEM, 6));
 	amdgpu_ring_write(ring, (PACKET3_RELEASE_MEM_GCR_SEQ |
 				 PACKET3_RELEASE_MEM_GCR_GL2_WB |
-				 PACKET3_RELEASE_MEM_GCR_GL2_INV |
-				 PACKET3_RELEASE_MEM_GCR_GL2_US |
-				 PACKET3_RELEASE_MEM_GCR_GL1_INV |
-				 PACKET3_RELEASE_MEM_GCR_GLV_INV |
-				 PACKET3_RELEASE_MEM_GCR_GLM_INV |
+				 PACKET3_RELEASE_MEM_GCR_GLM_INV | /* must be set with GLM_WB */
 				 PACKET3_RELEASE_MEM_GCR_GLM_WB |
 				 PACKET3_RELEASE_MEM_CACHE_POLICY(3) |
 				 PACKET3_RELEASE_MEM_EVENT_TYPE(CACHE_FLUSH_AND_INV_TS_EVENT) |
 				 PACKET3_RELEASE_MEM_EVENT_INDEX(5)));
 	amdgpu_ring_write(ring, (PACKET3_RELEASE_MEM_DATA_SEL(write64bit ? 2 : 1) |
 				 PACKET3_RELEASE_MEM_INT_SEL(int_sel ? 2 : 0)));
 
 	/*
 	 * the address should be Qword aligned if 64bit write, Dword
 	 * aligned if only send 32bit data low (discard data high)
