Gen8 Execlist based Engine reset and recovery support

Submitted by arun.siluvery@linux.intel.com on April 12, 2016, 4:59 p.m.

Details

Reviewer Mika Kuoppala
Submitted April 12, 2016, 4:59 p.m.
Last Updated April 13, 2016, 6:46 p.m.
Revision 1

Cover Letter(s)

Revision 1
This series adds Engine reset functionality from Gen8 onwards. Some of the
prep patches have already been sent and merged; this series follows up with
more of them, plus the implementation patches.

Many thanks to Mika and Chris for their time in review; these patches
have become much simpler than they were originally and are easy to
follow as well. Please review further and provide feedback so that they
can get closer to upstream. We can also get some testing done now.

Tomas Elf originally started the upstreaming effort for Gen8 and I am
continuing it; any mistakes are mine.

These are based on the nightly tree pulled on 11th April.

Arun Siluvery (12):
  drm/i915: Update i915.reset to handle engine resets
  drm/i915/tdr: Extend the idea of reset_counter to engine reset
  drm/i915/tdr: Modify error handler for per engine hang recovery
  drm/i915/tdr: Prepare execlist submission to handle tdr resubmission
    after reset
  drm/i915/tdr: Capture engine state before reset
  drm/i915/tdr: Restore engine state and start after reset
  drm/i915/tdr: Add support for per engine reset recovery
  drm/i915: Extending i915_gem_check_wedge to check engine reset in
    progress
  drm/i915: Port of Added scheduler support to __wait_request() calls
  drm/i915/tdr: Add engine reset count to error state
  drm/i915/tdr: Export reset count info to debugfs
  drm/i915/tdr: Enable Engine reset and recovery support

Mika Kuoppala (1):
  drm/i915: Skip reset request if there is one already

Tomas Elf (1):
  drm/i915: Reinstate hang recovery work queue.

 drivers/gpu/drm/i915/i915_debugfs.c     |  33 ++++
 drivers/gpu/drm/i915/i915_dma.c         |   1 +
 drivers/gpu/drm/i915/i915_drv.c         |  73 +++++++++
 drivers/gpu/drm/i915/i915_drv.h         |  39 ++++-
 drivers/gpu/drm/i915/i915_gem.c         |  96 +++++++++---
 drivers/gpu/drm/i915/i915_gpu_error.c   |   3 +
 drivers/gpu/drm/i915/i915_irq.c         | 262 +++++++++++++++++++++++---------
 drivers/gpu/drm/i915/i915_params.c      |   6 +-
 drivers/gpu/drm/i915/i915_params.h      |   2 +-
 drivers/gpu/drm/i915/intel_display.c    |   4 +-
 drivers/gpu/drm/i915/intel_lrc.c        | 216 ++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.h        |   3 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   7 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  19 +++
 drivers/gpu/drm/i915/intel_uncore.c     |  60 +++++++-
 15 files changed, 714 insertions(+), 110 deletions(-)
    

Tests

Series 5603v1 Gen8 Execlist based Engine reset and recovery support
http://patchwork.freedesktop.org/api/1.0/series/5603/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s3:
                incomplete -> PASS       (hsw-gt2)
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-a:
                pass       -> DMESG-WARN (skl-nuci5)
                pass       -> INCOMPLETE (bdw-nuci7)
        Subgroup hang-read-crc-pipe-b:
                pass       -> DMESG-WARN (skl-i7k-2)
        Subgroup hang-read-crc-pipe-c:
                pass       -> INCOMPLETE (bsw-nuc-2)

bdw-nuci7        total:61   pass:58   dwarn:0   dfail:0   fail:0   skip:2  
bsw-nuc-2        total:48   pass:38   dwarn:0   dfail:0   fail:0   skip:9  
byt-nuc          total:202  pass:164  dwarn:0   dfail:0   fail:0   skip:38 
hsw-brixbox      total:203  pass:179  dwarn:0   dfail:0   fail:0   skip:24 
hsw-gt2          total:203  pass:184  dwarn:0   dfail:0   fail:0   skip:19 
ilk-hp8440p      total:203  pass:135  dwarn:0   dfail:0   fail:0   skip:68 
ivb-t430s        total:203  pass:175  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:203  pass:177  dwarn:1   dfail:0   fail:0   skip:25 
skl-nuci5        total:203  pass:191  dwarn:1   dfail:0   fail:0   skip:11 
snb-dellxps      total:203  pass:165  dwarn:0   dfail:0   fail:0   skip:38 
snb-x220t        total:203  pass:165  dwarn:0   dfail:0   fail:1   skip:37 
BOOT FAILED for bdw-ultra
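
When reading the per-machine summary lines above, note that the listed outcome columns do not always add up to the total. For bdw-nuci7, pass and skip sum to 60 against a total of 61, and that machine also shows one INCOMPLETE subtest in the list above, so the gap appears to correspond to runs that did not finish. The following is a minimal sketch of how such lines could be checked; the helper names parse_summary and unaccounted are hypothetical and not part of the CI tooling.

```python
import re

# Column names as they appear in the summary lines of this report.
FIELDS = ("total", "pass", "dwarn", "dfail", "fail", "skip")

def parse_summary(line):
    """Split one summary line into the machine name and its integer counts."""
    machine, rest = line.split(None, 1)
    counts = {key: int(value) for key, value in re.findall(r"(\w+):\s*(\d+)", rest)}
    return machine, counts

def unaccounted(counts):
    """Tests included in 'total' but not covered by any listed outcome column."""
    return counts["total"] - sum(counts[f] for f in FIELDS if f != "total")

# Two lines copied from the summary above.
lines = [
    "bdw-nuci7        total:61   pass:58   dwarn:0   dfail:0   fail:0   skip:2  ",
    "hsw-gt2          total:203  pass:184  dwarn:0   dfail:0   fail:0   skip:19 ",
]

for line in lines:
    machine, counts = parse_summary(line)
    print(machine, unaccounted(counts))
```

A non-zero result flags a machine whose summary hides at least one test without a final outcome, which is worth cross-checking against the INCOMPLETE entries before trusting the pass count.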

Results at /archive/results/CI_IGT_test/Patchwork_1879/

631ffd2f45bb43964f729e8661532fb115f5eeec drm-intel-nightly: 2016y-04m-13d-13h-00m-18s UTC integration manifest
337ccfd drm/i915/tdr: Enable Engine reset and recovery support
7a1d1c5 drm/i915/tdr: Export reset count info to debugfs
4257315 drm/i915/tdr: Add engine reset count to error state
3a21f41 drm/i915: Port of Added scheduler support to __wait_request() calls
bf4282f drm/i915: Extending i915_gem_check_wedge to check engine reset in progress
7fbe89e drm/i915: Skip reset request if there is one already
18f2c91 drm/i915/tdr: Add support for per engine reset recovery
13f2a80 drm/i915/tdr: Restore engine state and start after reset
106098d drm/i915/tdr: Capture engine state before reset
ffb8f8c drm/i915/tdr: Prepare execlist submission to handle tdr resubmission after reset
4d92de0 drm/i915/tdr: Modify error handler for per engine hang recovery
c687108 drm/i915: Reinstate hang recovery work queue.
8d0b3e1 drm/i915/tdr: Extend the idea of reset_counter to engine reset
aeea231 drm/i915: Update i915.reset to handle engine resets



Test core_prop_blob:
        Subgroup basic:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test drv_hangman:
        Subgroup error-state-basic:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
                fail       -> PASS       (ro-ilk1-i5-650)
Test gem_busy:
        Subgroup basic-blt:
                skip       -> PASS       (ro-bsw-n3050)
Test gem_cs_prefetch:
        Subgroup basic-default:
                incomplete -> SKIP       (ro-bsw-n3050)
Test gem_cs_tlb:
        Subgroup basic-default:
                incomplete -> PASS       (ro-bsw-n3050)
Test gem_ctx_basic:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test gem_exec_basic:
        Subgroup basic-vebox:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup gtt-blt:
                incomplete -> PASS       (ro-bsw-n3050)
        Subgroup gtt-bsd1:
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup gtt-vebox:
                incomplete -> PASS       (ro-bsw-n3050)
        Subgroup readonly-bsd1:
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup readonly-render:
                incomplete -> PASS       (ro-bsw-n3050)
Test gem_exec_store:
        Subgroup basic-bsd1:
                incomplete -> SKIP       (ro-bsw-n3050)
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
Test gem_linear_blits:
        Subgroup basic:
                incomplete -> PASS       (ro-bsw-n3050)
Test gem_mmap:
        Subgroup basic:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup basic-small-bo:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test gem_mmap_gtt:
        Subgroup basic-small-copy-xy:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test gem_ringfill:
        Subgroup basic-default-hang:
                pass       -> INCOMPLETE (ro-bsw-n3050)
Test gem_storedw_loop:
        Subgroup basic-bsd2:
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup basic-vebox:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test gem_sync:
        Subgroup basic-bsd1:
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup basic-render:
                incomplete -> PASS       (ro-bsw-n3050)
Test gem_tiled_fence_blits:
        Subgroup basic:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test kms_addfb_basic:
        Subgroup addfb25-y-tiled:
                incomplete -> PASS       (ro-bsw-n3050)
        Subgroup addfb25-y-tiled-small:
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
        Subgroup small-bo:
                incomplete -> PASS       (ro-bsw-n3050)
        Subgroup unused-pitches:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                skip       -> PASS       (ro-bdw-i5-5250u)
        Subgroup basic-flip-vs-modeset:
                pass       -> DMESG-WARN (fi-skl-i7-6700k)
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> SKIP       (ro-bdw-i5-5250u)
Test kms_force_connector_basic:
        Subgroup force-edid:
                skip       -> INCOMPLETE (ro-bdw-i7-5600u)
Test kms_frontbuffer_tracking:
        Subgroup basic:
                dmesg-fail -> PASS       (ro-skl2-4405Y)
Test kms_pipe_crc_basic:
        Subgroup hang-read-crc-pipe-b:
                pass       -> DMESG-WARN (ro-bdw-i5-5250u)
        Subgroup hang-read-crc-pipe-c:
                pass       -> DMESG-WARN (ro-bdw-i7-5600u)
                pass       -> DMESG-WARN (ro-skl2-4405Y)
                pass       -> DMESG-WARN (fi-skl-i7-6700k)
        Subgroup nonblocking-crc-pipe-c-frame-sequence:
                pass       -> SKIP       (ro-bdw-i5-5250u)
        Subgroup read-crc-pipe-b:
                skip       -> PASS       (ro-bdw-i5-5250u)
        Subgroup suspend-read-crc-pipe-c:
                dmesg-warn -> SKIP       (ro-bdw-i5-5250u)
Test kms_setmode:
        Subgroup basic-clone-single-crtc:
                pass       -> INCOMPLETE (ro-bdw-i7-5600u)
Test kms_sink_crc_basic:
                incomplete -> SKIP       (ro-bsw-n3050)
Test prime_self_import:
        Subgroup basic-llseek-size:
                incomplete -> PASS       (ro-bsw-n3050)

fi-byt-n2820     total:203  pass:165  dwarn:0   dfail:0   fail:0   skip:38 
fi-skl-i7-6700k  total:203  pass:176  dwarn:2   dfail:0   fail:0   skip:25 
ro-bdw-i5-5250u  total:203  pass:184  dwarn:2   dfail:0   fail:0   skip:17 
ro-bdw-i7-5600u  total:203  pass:168  dwarn:1   dfail:0   fail:0   skip:16 
ro-bsw-n3050     total:135  pass:112  dwarn:0   dfail:0   fail:0   skip:22 
ro-byt-n2820     total:203  pass:169  dwarn:0   dfail:0   fail:0   skip:34 
ro-hsw-i3-4010u  total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:23 
ro-hsw-i7-4770r  total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:23 
ro-ilk1-i5-650   total:202  pass:134  dwarn:0   dfail:0   fail:0   skip:68 
ro-ivb-i7-3770   total:203  pass:170  dwarn:0   dfail:0   fail:0   skip:33 
ro-ivb3-i7-3770  total:203  pass:174  dwarn:0   dfail:0   fail:0   skip:29 
ro-skl-i7-6700hq total:49   pass:37   dwarn:8   dfail:0   fail:0   skip:4  
ro-skl2-4405Y    total:203  pass:179  dwarn:2   dfail:0   fail:0   skip:22 
BOOT FAILED for ro-skl-i7-6700hq

Results at /archive/results/CI_IGT_test/RO_Patchwork_528/

2c6ae0c drm-intel-nightly: 2016y-04m-12d-17h-02m-03s UTC integration manifest
757ccb6 drm/i915/tdr: Enable Engine reset and recovery support
787c6f8 drm/i915/tdr: Export reset count info to debugfs
771c79c drm/i915/tdr: Add engine reset count to error state
04f14be drm/i915: Port of Added scheduler support to __wait_request() calls
bb5d02b drm/i915: Extending i915_gem_check_wedge to check engine reset in progress
64d7da3 drm/i915: Skip reset request if there is one already
57b09dc1 drm/i915/tdr: Add support for per engine reset recovery
e1c6d24 drm/i915/tdr: Restore engine state and start after reset
c9dd6a4 drm/i915/tdr: Capture engine state before reset
1414103 drm/i915/tdr: Prepare execlist submission to handle tdr resubmission after reset
d38a71d drm/i915/tdr: Modify error handler for per engine hang recovery
0465d6b drm/i915: Reinstate hang recovery work queue.
13f9250 drm/i915/tdr: Extend the idea of reset_counter to engine reset
3f30226 drm/i915: Update i915.reset to handle engine resets