[v2] drm/i915: Also perform gpu reset under execlist mode.

Submitted by Wang, Zhi A on July 6, 2015, 4:04 p.m.

Details

Message ID 559AA6FA.2050405@intel.com
State New

Commit Message

Wang, Zhi A July 6, 2015, 4:04 p.m.
Hi Chris and Mika:
     Thanks for the comments. I think resetting the HW on module unload is
a good idea. For now, we should choose a proper position in the module
unload sequence to reset the HW. As a GPU reset is a render engine reset
plus a reset of the ring IMRs (they become all Fs after a full GPU reset),
I think the proper position is after the render and interrupt shutdown
paths.

How about this place?

base object refcount
                  * will be 2 (+1 from object creation and +1 from do_switch()).
                  * i915_gem_context_fini() will be called after gpu_idle() has switched


On 07/03/15 18:52, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
>> On Sat, Jul 04, 2015 at 12:27:34AM +0800, bing.niu@intel.com wrote:
>>> From: "Niu,Bing" <bing.niu@intel.com>
>>>
>>> It was found that i915 does not reset the GPU in execlist mode when
>>> the module is unloaded. This leads to issues when unloading and
>>> reloading the module with a different submission mode, e.g. going from
>>> execlist mode to ring buffer mode, because the HW is not in a reset
>>> state and its registers are not clean.
>>>
>>> Signed-off-by: Niu,Bing <bing.niu@intel.com>
>> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> I think we may end up doing the reset unconditionally in
>> i915_driver_unload() because this argument holds for almost everything
>> we setup. It's a bigger risk because of doing the gpu-reset on more
>> machines, but module-unloading is a "developer feature"!
>
> And after that has been sorted, we should try reset on module load.
>
> This way initial state would be identical to after reset/unload state.
> Now we have this situation that we don't know how much we are leaning on
> bios on state setup.
>
> -Mika
>
>> The only issue is making sure that the reset is ordered appropriately.
>> -Chris
>>
>> --
>> Chris Wilson, Intel Open Source Technology Centre
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Patch

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index c5349fa..aeaf59e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1133,7 +1133,10 @@  int i915_driver_unload(struct drm_device *dev)
         pm_qos_remove_request(&dev_priv->pm_qos);

         i915_global_gtt_cleanup(dev);
-
+       /* The only known way to stop the gpu from accessing the hw context is
+        * to reset it. Do this as the very last operation to avoid confusing
+        * other code, leading to spurious errors. */
+       intel_gpu_reset(dev);
         intel_uncore_fini(dev);
         if (dev_priv->regs != NULL)
                 pci_iounmap(dev->pdev, dev_priv->regs);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a7e58a8..376ee6b 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -373,11 +373,6 @@  void i915_gem_context_fini(struct drm_device *dev)
         int i;

         if (dctx->legacy_hw_ctx.rcs_state) {
-               /* The only known way to stop the gpu from accessing the hw context is
-                * to reset it. Do this as the very last operation to avoid confusing
-                * other code, leading to spurious errors. */
-               intel_gpu_reset(dev);
-
                 /* When default context is created and switched to, 

Comments

On Tue, Jul 07, 2015 at 12:04:10AM +0800, Zhi Wang wrote:
> Hi Chris and Mika:
>     Thanks for the comments. I think resetting the HW on module unload
> is a good idea. For now, we should choose a proper position in the
> module unload sequence to reset the HW. As a GPU reset is a render
> engine reset plus a reset of the ring IMRs (they become all Fs after a
> full GPU reset), I think the proper position is after the render and
> interrupt shutdown paths.
> 
> How about this place?
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index c5349fa..aeaf59e 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1133,7 +1133,10 @@ int i915_driver_unload(struct drm_device *dev)
>         pm_qos_remove_request(&dev_priv->pm_qos);
> 
>         i915_global_gtt_cleanup(dev);
> -
> +       /* The only known way to stop the gpu from accessing the hw context is
> +        * to reset it. Do this as the very last operation to avoid confusing
> +        * other code, leading to spurious errors. */
> +       intel_gpu_reset(dev);

That feels right. The comment is out-of-place now and needs expansion to
include other side effects for which the gpu reset is merited.

But this is a riskier patch since we now start doing unconditional
resets for gen3-gen5. Just requires more soak testing, but I would
prefer it as (1) add execlists reset, (2) combine execlists reset +
power context reset into a single unload reset. That way if we do get a
regression in doing the unload reset we can revert back to execlists
easily.
-Chris
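
The two-stage approach suggested above can be sketched as a standalone C
model rather than driver code. Everything in it is hypothetical and only
illustrates the ordering: struct fake_gpu, fake_gpu_reset(),
unload_step1(), unload_step2() and the step names in the trace are not
part of i915, and the real i915_driver_unload() has many more steps.
Stage (1) is modeled as resetting only when a power context exists or
execlists are in use, which is an assumption about how the earlier
revision behaves. Stage (2) resets unconditionally between the GTT
cleanup and the uncore teardown, which is where the hunk quoted above
places intel_gpu_reset().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for driver state; not struct drm_i915_private. */
struct fake_gpu {
	bool execlists_enabled;  /* models the submission mode */
	bool has_power_context;  /* models legacy_hw_ctx.rcs_state being set */
	int reset_count;         /* how many resets the teardown issued */
	char log[256];           /* ordered trace of teardown steps */
};

/* Append one step name to the trace, separated by '>'. */
static void trace(struct fake_gpu *gpu, const char *step)
{
	assert(strlen(gpu->log) + strlen(step) + 2 < sizeof(gpu->log));
	if (gpu->log[0] != '\0')
		strcat(gpu->log, ">");
	strcat(gpu->log, step);
}

static void fake_gpu_reset(struct fake_gpu *gpu)
{
	gpu->reset_count++;
	trace(gpu, "reset");
}

/* Stage (1): conditional reset at the start of context teardown. */
void unload_step1(struct fake_gpu *gpu)
{
	trace(gpu, "irq_off");
	if (gpu->has_power_context || gpu->execlists_enabled)
		fake_gpu_reset(gpu);
	trace(gpu, "ctx_fini");     /* remainder of context teardown */
	trace(gpu, "gtt_cleanup");
	trace(gpu, "uncore_fini");
}

/* Stage (2): one unconditional reset late in the unload sequence. */
void unload_step2(struct fake_gpu *gpu)
{
	trace(gpu, "irq_off");
	trace(gpu, "ctx_fini");
	trace(gpu, "gtt_cleanup");
	fake_gpu_reset(gpu);        /* after GTT cleanup, before uncore teardown */
	trace(gpu, "uncore_fini");
}
```

Keeping the two stages as separate functions mirrors the reason given
for splitting the patch: if the unconditional reset regresses on older
hardware, only stage (2) needs to be reverted and the conditional reset
of stage (1) remains.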