[3/3] drm/i915/uc: Never fail on HuC firmware errors

Submitted by Michal Wajdeczko on Aug. 18, 2019, 9:52 a.m.

Details

Message ID 20190818095204.31568-4-michal.wajdeczko@intel.com
State Accepted
Commit a8dc0f6d187bcccc7c1a41f0062badbf3d74e44c
Headers show
Series "Don't fail on GuC failure" ( rev: 1 ) in Intel GFX

Not browsing as part of any series.

Commit Message

Michal Wajdeczko Aug. 18, 2019, 9:52 a.m.
There is no need to mark whole GPU as wedged just because
of the custom HuC fw failure as users can always verify
actual HuC firmware status using existing HUC_STATUS ioctl.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/uc/intel_huc.c |  4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_uc.c  | 13 ++-----------
 2 files changed, 5 insertions(+), 12 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index af61ae864ab8..d4625c97b4f9 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -129,9 +129,11 @@  int intel_huc_auth(struct intel_huc *huc)
 	struct intel_guc *guc = &gt->uc.guc;
 	int ret;
 
-	GEM_BUG_ON(!intel_uc_fw_is_loaded(&huc->fw));
 	GEM_BUG_ON(intel_huc_is_authenticated(huc));
 
+	if (!intel_uc_fw_is_loaded(&huc->fw))
+		return -ENOEXEC;
+
 	ret = i915_inject_load_error(gt->i915, -ENXIO);
 	if (ret)
 		goto fail;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 10978e7ff06d..71ee7ab035cc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -457,12 +457,7 @@  int intel_uc_init_hw(struct intel_uc *uc)
 		if (ret)
 			goto err_out;
 
-		if (intel_uc_uses_huc(uc)) {
-			ret = intel_huc_fw_upload(huc);
-			if (ret && intel_uc_fw_is_overridden(&huc->fw))
-				goto err_out;
-		}
-
+		intel_huc_fw_upload(huc);
 		intel_guc_ads_reset(guc);
 		intel_guc_write_params(guc);
 		ret = intel_guc_fw_upload(guc);
@@ -481,11 +476,7 @@  int intel_uc_init_hw(struct intel_uc *uc)
 	if (ret)
 		goto err_log_capture;
 
-	if (intel_uc_fw_is_loaded(&huc->fw)) {
-		ret = intel_huc_auth(huc);
-		if (ret && intel_uc_fw_is_overridden(&huc->fw))
-			goto err_communication;
-	}
+	intel_huc_auth(huc);
 
 	ret = intel_guc_sample_forcewake(guc);
 	if (ret)

Comments

Quoting Michal Wajdeczko (2019-08-18 10:52:04)
> There is no need to mark whole GPU as wedged just because
> of the custom HuC fw failure as users can always verify
> actual HuC firmware status using existing HUC_STATUS ioctl.

If we try and fail, would it not be worth a dev_notice?
 
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>

Either way,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
On Sun, 18 Aug 2019 12:00:11 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Michal Wajdeczko (2019-08-18 10:52:04)
>> There is no need to mark whole GPU as wedged just because
>> of the custom HuC fw failure as users can always verify
>> actual HuC firmware status using existing HUC_STATUS ioctl.
>
> If we try and fail, would it not be worth a dev_notice?

if GuC is ok and if HuC was enabled there will be something like this:

<6> i915 0000:00:02.0: GuC firmware i915/kbl_guc_33.0.0.bin version 33.0  
submission:disabled
<6> i915 0000:00:02.0: HuC firmware i915/kbl_huc_ver02_00_1810.bin version  
2.0 authenticated:no

otherwise we should get:

<5> i915 0000:00:02.0: GuC is uninitialized

>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>
> Either way,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
Quoting Michal Wajdeczko (2019-08-18 11:16:56)
> On Sun, 18 Aug 2019 12:00:11 +0200, Chris Wilson  
> <chris@chris-wilson.co.uk> wrote:
> 
> > Quoting Michal Wajdeczko (2019-08-18 10:52:04)
> >> There is no need to mark whole GPU as wedged just because
> >> of the custom HuC fw failure as users can always verify
> >> actual HuC firmware status using existing HUC_STATUS ioctl.
> >
> > If we try and fail, would it not be worth a dev_notice?
> 
> if GuC is ok and if HuC was enabled there will be something like this:
> 
> <6> i915 0000:00:02.0: GuC firmware i915/kbl_guc_33.0.0.bin version 33.0  
> submission:disabled
> <6> i915 0000:00:02.0: HuC firmware i915/kbl_huc_ver02_00_1810.bin version  
> 2.0 authenticated:no
> 
> otherwise we should get:
> 
> <5> i915 0000:00:02.0: GuC is uninitialized

But can we not fail to load HuC leaving GuC setup? E.g. for the fabled
submission backend?
-Chris