[v3] drm/i915: Workaround to avoid lite restore with HEAD==TAIL

Submitted by Michel Thierry on April 15, 2015, 4:17 p.m.

Details

Message ID 1429114633-17185-1-git-send-email-michel.thierry@intel.com
State New

Commit Message

Michel Thierry April 15, 2015, 4:17 p.m.
WaIdleLiteRestore is an execlists-only workaround, and requires the driver
to ensure that any context always has HEAD!=TAIL when attempting lite
restore.

Add two extra MI_NOOP instructions at the end of each request, but keep
the request's tail pointing before the MI_NOOPs. We may not need to
execute them, which is why request->tail must be sampled before adding
these extra instructions.

If we submit a context to the ELSP which has previously been submitted,
move the tail pointer past the MI_NOOPs. This ensures HEAD!=TAIL.
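
Illustration (not part of the patch): the tail handling described above can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions. `struct model_ring`, `model_emit_dword`, `model_emit_request` and `model_resubmit_tail` are hypothetical names invented for this example, and the ring size is assumed to be a power of two so that masking with `size - 1` wraps the offset. Only the arithmetic (two 4-byte MI_NOOPs, tail advanced by 8 bytes and masked) mirrors the diff below.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_DWORD_BYTES 4u                      /* one MI_NOOP occupies one dword */
#define MODEL_WA_BYTES (2u * MODEL_DWORD_BYTES)   /* two padding NOOPs = 8 bytes */

/* Hypothetical stand-in for a ring buffer; not the driver's struct. */
struct model_ring {
	uint32_t tail;  /* next write offset, in bytes */
	uint32_t size;  /* ring size in bytes, assumed to be a power of two */
};

/* Advance the write offset by one dword, wrapping at the ring size. */
static void model_emit_dword(struct model_ring *ring)
{
	ring->tail = (ring->tail + MODEL_DWORD_BYTES) & (ring->size - 1);
}

/*
 * Emit a request of ndwords dwords, sample the tail, then append the two
 * padding NOOPs. Returns the sampled tail, which points before the padding.
 */
static uint32_t model_emit_request(struct model_ring *ring, unsigned int ndwords)
{
	uint32_t request_tail;
	unsigned int i;

	assert((ring->size & (ring->size - 1)) == 0);  /* power-of-two check */
	for (i = 0; i < ndwords; i++)
		model_emit_dword(ring);

	request_tail = ring->tail;   /* sampled before the padding is written */
	model_emit_dword(ring);      /* padding NOOP 1 */
	model_emit_dword(ring);      /* padding NOOP 2 */
	return request_tail;
}

/* On resubmission, move the tail past the padding so HEAD != TAIL. */
static uint32_t model_resubmit_tail(uint32_t request_tail, uint32_t ring_size)
{
	return (request_tail + MODEL_WA_BYTES) & (ring_size - 1);
}
```

For example, with a 4096-byte ring and a sampled tail of 4092, `model_resubmit_tail()` wraps around and yields 4.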

v2: Move overallocation to gen8_emit_request, and add a note about
sampling request->tail to the commit message (Chris).

v3: Remove redundant request->tail assignment in __i915_add_request, in
lrc mode this is already set in execlists_context_queue.
Do not add wa implementation details inside gem (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c  |  3 ++-
 drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++++++++-
 2 files changed, 36 insertions(+), 2 deletions(-)

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d5a5a8..980e17c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2400,10 +2400,11 @@  int __i915_add_request(struct intel_engine_cs *ring,
 		ret = ring->add_request(ring);
 		if (ret)
 			return ret;
+
+		request->tail = intel_ring_get_tail(ringbuf);
 	}
 
 	request->head = request_start;
-	request->tail = intel_ring_get_tail(ringbuf);
 
 	/* Whilst this request exists, batch_obj will be on the
 	 * active_list, and so will hold the active reference. Only when this
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f4a5ef9..0296350 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -427,6 +427,26 @@  static void execlists_context_unqueue(struct intel_engine_cs *ring)
 		}
 	}
 
+	if (IS_GEN8(ring->dev) || IS_GEN9(ring->dev)) {
+		/*
+		 * WaIdleLiteRestore: make sure we never cause a lite
+		 * restore with HEAD==TAIL
+		 */
+		if (req0 && req0->elsp_submitted == 1) {
+			/*
+			 * Consume the buffer NOOPs to ensure HEAD != TAIL when
+			 * submitting. elsp_submitted can only be >1 after
+			 * reset, in which case we don't need the workaround as
+			 * a lite restore will not occur.
+			 */
+			struct intel_ringbuffer *ringbuf;
+
+			ringbuf = req0->ctx->engine[ring->id].ringbuf;
+			req0->tail += 8;
+			req0->tail &= ringbuf->size - 1;
+		}
+	}
+
 	WARN_ON(req1 && req1->elsp_submitted);
 
 	execlists_submit_contexts(ring, req0->ctx, req0->tail,
@@ -1289,7 +1309,12 @@  static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
 	u32 cmd;
 	int ret;
 
-	ret = intel_logical_ring_begin(ringbuf, request->ctx, 6);
+	/*
+	 * Reserve space for 2 NOOPs at the end of each request to be
+	 * used as a workaround for not being allowed to do lite
+	 * restore with HEAD==TAIL (WaIdleLiteRestore).
+	 */
+	ret = intel_logical_ring_begin(ringbuf, request->ctx, 8);
 	if (ret)
 		return ret;
 
@@ -1307,6 +1332,14 @@  static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
 	intel_logical_ring_advance_and_submit(ringbuf, request->ctx, request);
 
+	/*
+	 * Here we add two extra NOOPs as padding to avoid
+	 * lite restore of a context with HEAD==TAIL.
+	 */
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
 	return 0;
 }
 

Comments

On Wed, Apr 15, 2015 at 05:17:13PM +0100, Michel Thierry wrote:
> WaIdleLiteRestore is an execlists-only workaround, and requires the driver
> to ensure that any context always has HEAD!=TAIL when attempting lite
> restore.
> 
> Add two extra MI_NOOP instructions at the end of each request, but keep
> the request's tail pointing before the MI_NOOPs. We may not need to
> execute them, which is why request->tail must be sampled before adding
> these extra instructions.
> 
> If we submit a context to the ELSP which has previously been submitted,
> move the tail pointer past the MI_NOOPs. This ensures HEAD!=TAIL.
> 
> v2: Move overallocation to gen8_emit_request, and added note about
> sampling request->tail in commit message (Chris).
> 
> v3: Remove redundant request->tail assignment in __i915_add_request, in
> lrc mode this is already set in execlists_context_queue.
> Do not add wa implementation details inside gem (Chris).
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c  |  3 ++-
>  drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++++++++-
>  2 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3d5a5a8..980e17c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2400,10 +2400,11 @@ int __i915_add_request(struct intel_engine_cs *ring,
>  		ret = ring->add_request(ring);
>  		if (ret)
>  			return ret;
> +
> +		request->tail = intel_ring_get_tail(ringbuf);
>  	}
>  
>  	request->head = request_start;
> -	request->tail = intel_ring_get_tail(ringbuf);
>  
>  	/* Whilst this request exists, batch_obj will be on the
>  	 * active_list, and so will hold the active reference. Only when this
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f4a5ef9..0296350 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -427,6 +427,26 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
>  		}
>  	}
>  
> +	if (IS_GEN8(ring->dev) || IS_GEN9(ring->dev)) {
> +		/*
> +		 * WaIdleLiteRestore: make sure we never cause a lite
> +		 * restore with HEAD==TAIL
> +		 */
> +		if (req0 && req0->elsp_submitted == 1) {
> +			/*
> +			 * Consume the buffer NOOPs to ensure HEAD != TAIL when
> +			 * submitting. elsp_submitted can only be >1 after
> +			 * reset, in which case we don't need the workaround as
> +			 * a lite restore will not occur.

I actually think you can remove the == 1, and hence the comment, since
the wa is safe to apply in that case as well.

/* Apply the wa NOOPS to prevent ring:HEAD == rq:TAIL as we
 * resubmit the request. See gen8_emit_request() for where we
 * prepare the padding after the end of the request.
 */

> +			 */
> +			struct intel_ringbuffer *ringbuf;
> +
> +			ringbuf = req0->ctx->engine[ring->id].ringbuf;
> +			req0->tail += 8;
> +			req0->tail &= ringbuf->size - 1;
> +		}
> +	}
> +
>  	WARN_ON(req1 && req1->elsp_submitted);
>  
>  	execlists_submit_contexts(ring, req0->ctx, req0->tail,
> @@ -1289,7 +1309,12 @@ static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
>  	u32 cmd;
>  	int ret;
>  
> -	ret = intel_logical_ring_begin(ringbuf, request->ctx, 6);
> +	/*
> +	 * Reserve space for 2 NOOPs at the end of each request to be
> +	 * used as a workaround for not being allowed to do lite
> +	 * restore with HEAD==TAIL (WaIdleLiteRestore).
> +	 */
> +	ret = intel_logical_ring_begin(ringbuf, request->ctx, 8);
>  	if (ret)
>  		return ret;
>  
> @@ -1307,6 +1332,14 @@ static int gen8_emit_request(struct intel_ringbuffer *ringbuf,
>  	intel_logical_ring_emit(ringbuf, MI_NOOP);
>  	intel_logical_ring_advance_and_submit(ringbuf, request->ctx, request);
>  
> +	/*
> +	 * Here we add two extra NOOPs as padding to avoid
> +	 * lite restore of a context with HEAD==TAIL.
> +	 */
> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	intel_logical_ring_advance(ringbuf);
> +

Ok, looks better.
-Chris

On 4/15/2015 5:40 PM, Chris Wilson wrote:
> On Wed, Apr 15, 2015 at 05:17:13PM +0100, Michel Thierry wrote:
>> WaIdleLiteRestore is an execlists-only workaround, and requires the driver
>> to ensure that any context always has HEAD!=TAIL when attempting lite
>> restore.
>>
>> Add two extra MI_NOOP instructions at the end of each request, but keep
>> the request's tail pointing before the MI_NOOPs. We may not need to
>> execute them, which is why request->tail must be sampled before adding
>> these extra instructions.
>>
>> If we submit a context to the ELSP which has previously been submitted,
>> move the tail pointer past the MI_NOOPs. This ensures HEAD!=TAIL.
>>
>> v2: Move overallocation to gen8_emit_request, and added note about
>> sampling request->tail in commit message (Chris).
>>
>> v3: Remove redundant request->tail assignment in __i915_add_request, in
>> lrc mode this is already set in execlists_context_queue.
>> Do not add wa implementation details inside gem (Chris).
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c  |  3 ++-
>>   drivers/gpu/drm/i915/intel_lrc.c | 35 ++++++++++++++++++++++++++++++++++-
>>   2 files changed, 36 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 3d5a5a8..980e17c 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2400,10 +2400,11 @@ int __i915_add_request(struct intel_engine_cs *ring,
>>   		ret = ring->add_request(ring);
>>   		if (ret)
>>   			return ret;
>> +
>> +		request->tail = intel_ring_get_tail(ringbuf);
>>   	}
>>   
>>   	request->head = request_start;
>> -	request->tail = intel_ring_get_tail(ringbuf);
>>   
>>   	/* Whilst this request exists, batch_obj will be on the
>>   	 * active_list, and so will hold the active reference. Only when this
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index f4a5ef9..0296350 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -427,6 +427,26 @@ static void execlists_context_unqueue(struct intel_engine_cs *ring)
>>   		}
>>   	}
>>   
>> +	if (IS_GEN8(ring->dev) || IS_GEN9(ring->dev)) {
>> +		/*
>> +		 * WaIdleLiteRestore: make sure we never cause a lite
>> +		 * restore with HEAD==TAIL
>> +		 */
>> +		if (req0 && req0->elsp_submitted == 1) {
>> +			/*
>> +			 * Consume the buffer NOOPs to ensure HEAD != TAIL when
>> +			 * submitting. elsp_submitted can only be >1 after
>> +			 * reset, in which case we don't need the workaround as
>> +			 * a lite restore will not occur.
> I actually think you can remove the == 1, and hence the comment, since
> the wa is safe to apply in that case as well.
>
> /* Apply the wa NOOPS to prevent ring:HEAD == rq:TAIL as we
>   * resubmit the request. See gen8_emit_request() for where we
>   * prepare the padding after the end of the request.
>   */
Yes, it's safe to apply it after the request has been submitted multiple 
times.
I'll change that and update the comment.

Thanks,

-Michel
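
Illustration (not the follow-up patch): a rough standalone model of the change agreed above, where the workaround is applied whenever the request has been submitted before, rather than only when the count is exactly 1. `struct model_request` and `model_apply_wa` are hypothetical names invented for this sketch. The ring size is assumed to be a power of two, and the sketch does not model how often the driver resubmits a request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued request; not the driver's struct. */
struct model_request {
	uint32_t tail;                /* byte offset sampled before the padding NOOPs */
	unsigned int elsp_submitted;  /* how many times it has been submitted */
};

/*
 * Bump the tail past the two padding NOOPs (8 bytes) whenever the request
 * has been submitted before: any non-zero count, not only exactly 1.
 * Returns true if the tail was moved.
 */
static bool model_apply_wa(struct model_request *req, uint32_t ring_size)
{
	if (req == NULL || req->elsp_submitted == 0)
		return false;

	req->tail = (req->tail + 8u) & (ring_size - 1);
	return true;
}
```

A request that has never been submitted keeps its sampled tail, while one with a non-zero count has its tail moved 8 bytes forward, wrapped at the ring size.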

Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6205
-------------------------------------Summary-------------------------------------
Platform    Delta    drm-intel-nightly    Series Applied
PNV                  276/276              276/276
ILK         -2       302/302              300/302
SNB                  318/318              318/318
IVB                  341/341              341/341
BYT                  287/287              287/287
HSW                  395/395              395/395
BDW                  318/318              318/318
-------------------------------------Detailed-------------------------------------
Platform  Test                                                 drm-intel-nightly      Series Applied
*ILK      igt@gem_fenced_exec_thrash@no-spare-fences-busy      PASS(3)                DMESG_WARN(1) PASS(1)
(dmesg patch applied) drm:i915_hangcheck_elapsed[i915]]*ERROR*Hangcheck_timer_elapsed...bsd_ring_idle@Hangcheck timer elapsed... bsd ring idle
 ILK      igt@gem_unfence_active_buffers                       DMESG_WARN(1) PASS(1)  DMESG_WARN(1) PASS(1)
(dmesg patch applied) drm:i915_hangcheck_elapsed[i915]]*ERROR*Hangcheck_timer_elapsed...bsd_ring_idle@Hangcheck timer elapsed... bsd ring idle
Note: Pay particular attention to lines that start with '*'.