[2/2] drm/i915/execlists: Ensure the context is reloaded after a GPU reset

Submitted by Chris Wilson on Sept. 12, 2019, 9:29 a.m.

Details

Message ID 20190912092933.4729-2-chris@chris-wilson.co.uk
State Accepted
Commit a17592effdc16f3a51ef9c1dda30fe5c6d668263
Series "Series without cover letter" ( rev: 1 ) in Intel GFX


Commit Message

Chris Wilson Sept. 12, 2019, 9:29 a.m.
After we manipulate the context to allow replay after a GPU reset, force
that context to be reloaded. This should be a layer of paranoia, for if
the GPU was reset, the context will no longer be resident!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 1 +
 1 file changed, 1 insertion(+)

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index dbc90da2341a..47d766ccea71 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2445,6 +2445,7 @@  static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
 	intel_ring_update_space(ce->ring);
 	__execlists_reset_reg_state(ce, engine);
 	__execlists_update_reg_state(ce, engine);
+	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
 	mutex_release(&ce->pin_mutex.dep_map, 0, _THIS_IP_);
 
 unwind:

Comments

Chris Wilson <chris@chris-wilson.co.uk> writes:

> After we manipulate the context to allow replay after a GPU reset, force
> that context to be reloaded. This should be a layer of paranoia, for if
> the GPU was reset, the context will no longer be resident!
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index dbc90da2341a..47d766ccea71 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2445,6 +2445,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
>  	intel_ring_update_space(ce->ring);
>  	__execlists_reset_reg_state(ce, engine);
>  	__execlists_update_reg_state(ce, engine);
> +	ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */

The CCID should be reset also, but I see no harm in being explicit.

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>  	mutex_release(&ce->pin_mutex.dep_map, 0, _THIS_IP_);
>  
>  unwind:
> -- 
> 2.23.0

Quoting Mika Kuoppala (2019-09-12 12:53:01)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > After we manipulate the context to allow replay after a GPU reset, force
> > that context to be reloaded. This should be a layer of paranoia, for if
> > the GPU was reset, the context will no longer be resident!
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index dbc90da2341a..47d766ccea71 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -2445,6 +2445,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
> >       intel_ring_update_space(ce->ring);
> >       __execlists_reset_reg_state(ce, engine);
> >       __execlists_update_reg_state(ce, engine);
> > +     ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
> 
> The CCID should be reset also, but I see no harm in being explicit.

Yeah, I think it's developing into a healthy enough pattern. If we ever
manipulate anything inside the image itself, we should probably force
the restore. A bit more mulling over that, I like the current comment :)
-Chris
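
The pattern discussed in this thread (rewrite the saved context image, then
flag the descriptor so the hardware reloads it) can be sketched in isolation
as below. This is a minimal userspace illustration, not the i915 code: the
demo_context struct, the demo_* helpers, and the bit position chosen for
DEMO_CTX_DESC_FORCE_RESTORE are all hypothetical stand-ins, and the real
driver keeps far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag; the bit position is illustrative only. */
#define DEMO_CTX_DESC_FORCE_RESTORE (UINT64_C(1) << 2)

/* Hypothetical stand-in for a logical ring context. */
struct demo_context {
	uint64_t lrc_desc; /* descriptor handed to the hardware on submit */
	uint32_t image[4]; /* stand-in for the saved register state */
};

/*
 * Rewrite part of the saved image so the request can be replayed, then
 * set the force-restore flag so a later submit does not rely on any
 * copy of the context the hardware might still consider resident.
 */
static void demo_reset_context(struct demo_context *ce, uint32_t head)
{
	assert(ce != NULL);

	ce->image[0] = head; /* manipulate the image for replay */
	ce->lrc_desc |= DEMO_CTX_DESC_FORCE_RESTORE; /* force a reload */
}

/* Report whether the next submit is flagged to reload the context. */
static bool demo_needs_restore(const struct demo_context *ce)
{
	return (ce->lrc_desc & DEMO_CTX_DESC_FORCE_RESTORE) != 0;
}
```

Because the flag is ORed in, the other descriptor bits are left untouched,
and setting it more than once is harmless.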