drm/i915: Consider HW CSB write pointer before resetting the sw read pointer

Submitted by Michel Thierry on Sept. 23, 2015, 2:43 p.m.

Details

Message ID 1443019430-20792-1-git-send-email-michel.thierry@intel.com
State New

Commit Message

Michel Thierry Sept. 23, 2015, 2:43 p.m.
A previous commit resets the Context Status Buffer (CSB) read pointer in
ring init
    commit c0a03a2e4c4e ("drm/i915: Reset CSB read pointer in ring init")

This is generally correct, but on some platforms (CHT) the hardware
pointer is not reset after suspend/resume. In that case the driver should
read the register value instead of resetting the software read pointer
to 0. Otherwise it processes stale events, which can lead to unwanted
preemptions or worse.

On other platforms (BDW), and also after GPU reset or power-up, the CSB
write pointer (CSBWP) is reset to 0x7 (an invalid value), and in that
case the read pointer should be set to 0.

Signed-off-by: Lei Shen <lei.shen@intel.com>
Signed-off-by: Deepak S <deepak.s@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ff9a481..dd87812 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1479,6 +1479,7 @@  static int gen8_init_common_ring(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	u8 next_context_status_buffer_hw;
 
 	lrc_setup_hardware_status_page(ring,
 				ring->default_context->engine[ring->id].state);
@@ -1496,7 +1497,28 @@  static int gen8_init_common_ring(struct intel_engine_cs *ring)
 		   _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
 		   _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
 	POSTING_READ(RING_MODE_GEN7(ring));
-	ring->next_context_status_buffer = 0;
+
+	/*
+	 * Instead of resetting the Context Status Buffer (CSB) read pointer to
+	 * zero, we need to read the write pointer from hardware and use its
+	 * value because "this register is power context save restored".
+	 * Effectively, these states have been observed:
+	 *
+	 *      | Suspend-to-idle (freeze) | Suspend-to-RAM (mem) |
+	 * BDW  | CSB regs not reset       | CSB regs reset       |
+	 * CHT  | CSB regs not reset       | CSB regs not reset   |
+	 */
+	next_context_status_buffer_hw = I915_READ(RING_CONTEXT_STATUS_PTR(ring)) & 0x07;
+
+	/*
+	 * When the CSB registers are reset (also after power-up / gpu reset),
+	 * CSB write pointer is set to all 1's, which is not valid, use 0 in
+	 * this special case.
+	 */
+	if (next_context_status_buffer_hw == 0x7)
+		next_context_status_buffer_hw = 0;
+
+	ring->next_context_status_buffer = next_context_status_buffer_hw;
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
 
 	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));

Comments

Michel Thierry <michel.thierry@intel.com> writes:

> A previous commit resets the Context Status Buffer (CSB) read pointer in
> ring init
>     commit c0a03a2e4c4e ("drm/i915: Reset CSB read pointer in ring init")
>
> This is generally correct, but this pointer is not reset after
> suspend/resume in some platforms (cht). In this case, the driver should
> read the register value instead of resetting the sw read counter to 0.
> Otherwise we process old events, leading to unwanted pre-emptions or
> something worse.
>
> But in other platforms (bdw) and also during GPU reset or power up, the
> CSBWP is reset to 0x7 (an invalid number), and in this case the read
> pointer should be set to 0.
>
> Signed-off-by: Lei Shen <lei.shen@intel.com>
> Signed-off-by: Deepak S <deepak.s@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index ff9a481..dd87812 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1479,6 +1479,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	u8 next_context_status_buffer_hw;
>  
>  	lrc_setup_hardware_status_page(ring,
>  				ring->default_context->engine[ring->id].state);
> @@ -1496,7 +1497,28 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
>  		   _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
>  		   _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
>  	POSTING_READ(RING_MODE_GEN7(ring));
> -	ring->next_context_status_buffer = 0;
> +
> +	/*
> +	 * Instead of resetting the Context Status Buffer (CSB) read pointer to
> +	 * zero, we need to read the write pointer from hardware and use its
> +	 * value because "this register is power context save restored".
> +	 * Effectively, these states have been observed:
> +	 *
> +	 *      | Suspend-to-idle (freeze) | Suspend-to-RAM (mem) |
> +	 * BDW  | CSB regs not reset       | CSB regs reset       |
> +	 * CHT  | CSB regs not reset       | CSB regs not reset   |
> +	 */
> +	next_context_status_buffer_hw = I915_READ(RING_CONTEXT_STATUS_PTR(ring)) & 0x07;
> +
> +	/*
> +	 * When the CSB registers are reset (also after power-up / gpu reset),
> +	 * CSB write pointer is set to all 1's, which is not valid, use 0 in
> +	 * this special case.
> +	 */
> +	if (next_context_status_buffer_hw == 0x7)
> +		next_context_status_buffer_hw = 0;


If hardware has been reset and we have b111 here, I assume
the first write will be at index zero.

The interrupt code loops on while (read_pointer < write_pointer),
so initializing the next status buffer to zero would mean that
you miss the first write to csb[0].

If b111 is found, I think the correct value is 5.

-Mika
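
The point above can be checked with a small user-space model of the consumer loop. This is only a sketch under assumptions: the thread quotes just the loop condition, so the pre-increment of the read pointer, the wrap at six entries (implied by the suggested value 5), and the names csb_consume and CSB_ENTRIES are illustrative and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed CSB size: indices 0..5, consistent with "the correct value is 5". */
#define CSB_ENTRIES 6

/*
 * Hypothetical model of the interrupt-side loop. read_ptr is the last slot
 * already consumed, write_ptr is the last slot written by hardware.
 * Stores the consumed slot indices in slots[] and returns how many there are.
 */
static inline size_t csb_consume(unsigned int read_ptr, unsigned int write_ptr,
				 unsigned int slots[CSB_ENTRIES])
{
	size_t n = 0;

	assert(read_ptr < CSB_ENTRIES && write_ptr < CSB_ENTRIES);

	/* Unwrap the write pointer when hardware has wrapped past the end. */
	if (read_ptr > write_ptr)
		write_ptr += CSB_ENTRIES;

	while (read_ptr < write_ptr) {
		read_ptr++;                          /* advance first ... */
		slots[n++] = read_ptr % CSB_ENTRIES; /* ... then read that slot */
	}

	return n;
}
```

In this model, after a reset where hardware writes its first event to csb[0], csb_consume(0, 0, slots) consumes nothing, while csb_consume(5, 0, slots) consumes slot 0. That is the reason for starting the read pointer at 5.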


> +
> +	ring->next_context_status_buffer = next_context_status_buffer_hw;
>  	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
>  
>  	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
> -- 
> 2.5.3
>
On 9/25/2015 4:44 PM, Mika Kuoppala wrote:
> Michel Thierry <michel.thierry@intel.com> writes:
>> -	ring->next_context_status_buffer = 0;
>> +
>> +	/*
>> +	 * Instead of resetting the Context Status Buffer (CSB) read pointer to
>> +	 * zero, we need to read the write pointer from hardware and use its
>> +	 * value because "this register is power context save restored".
>> +	 * Effectively, these states have been observed:
>> +	 *
>> +	 *      | Suspend-to-idle (freeze) | Suspend-to-RAM (mem) |
>> +	 * BDW  | CSB regs not reset       | CSB regs reset       |
>> +	 * CHT  | CSB regs not reset       | CSB regs not reset   |
>> +	 */
>> +	next_context_status_buffer_hw = I915_READ(RING_CONTEXT_STATUS_PTR(ring)) & 0x07;
>> +
>> +	/*
>> +	 * When the CSB registers are reset (also after power-up / gpu reset),
>> +	 * CSB write pointer is set to all 1's, which is not valid, use 0 in
>> +	 * this special case.
>> +	 */
>> +	if (next_context_status_buffer_hw == 0x7)
>> +		next_context_status_buffer_hw = 0;
>
>
> If hardware has been reset and we have b111 here, I assume
> the first write will be at index zero.
>
> If we look at the interrupt code there is while (read_pointer <
> write_pointer). Initializing next status buffer to zero
> would mean that you miss the first write to csb[0].
>
> If b111 is found, I think the correct value is 5.

Correct, it needs to be set to 5. Luckily, csb[0] would always be an
Idle-to-Active event, which we ignore.

>
> -Mika
>
>
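
A minimal sketch of the pointer selection with the review feedback applied, decoupled from MMIO so it can be exercised in user space. The helper name csb_initial_read_pointer and the macros CSB_ENTRIES and CSB_PTR_MASK are hypothetical, and the six-entry buffer size is an assumption inferred from the value 5 agreed on above. This is not the revised patch itself.

```c
#include <assert.h>
#include <stdint.h>

#define CSB_ENTRIES  6      /* assumed buffer size: indices 0..5 */
#define CSB_PTR_MASK 0x07u  /* low three bits hold the write pointer */

/*
 * Hypothetical helper: given the raw value read from the context status
 * pointer register, return the initial software read pointer.
 */
static inline uint8_t csb_initial_read_pointer(uint32_t status_ptr_reg)
{
	uint8_t write_ptr = status_ptr_reg & CSB_PTR_MASK;

	/*
	 * All ones means the CSB registers were reset (power-up, GPU reset,
	 * or a suspend state that clears them). Point at the last entry so
	 * that the first hardware write, to csb[0], is not skipped.
	 */
	if (write_ptr == CSB_PTR_MASK)
		write_ptr = CSB_ENTRIES - 1;

	/* A value of 6 is not a valid pointer in this model. */
	assert(write_ptr < CSB_ENTRIES);

	return write_ptr;
}
```

When the registers survive suspend/resume (the CHT case), the helper returns the hardware write pointer unchanged, so events already recorded before suspend are not processed again.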