[1/1] drm/i915/uc: Turn on GuC/HuC auto mode again

Submitted by Michal Wajdeczko on Aug. 18, 2019, 3:51 p.m.

Details

Message ID 20190818155109.31360-2-michal.wajdeczko@intel.com
State New
Headers show
Series "Turn on GuC/HuC auto mode again" ( rev: 1 ) in Intel GFX

Not browsing as part of any series.

Commit Message

Michal Wajdeczko Aug. 18, 2019, 3:51 p.m.
We hope that now all issues related to missing uC firmwares
are fixed and that driver can continue to load without GuC
or HuC firmware available and running:

 - missing or corrupted HuC firmware has no impact to driver
   load (clients should continue to use HUC_STATUS to check
   if HuC firmware was loaded and authenticated)

 - missing or corrupted GuC firmware has no impact to driver
   load (unless GuC firmware blob was overridden by the user
   or GuC submission was requested or GuC was previously
   enabled on this system without reboot - then we will mark
   GPU as wedged and continue with KMS only)

References: commit f774f0964919 ("drm/i915/guc: Turn on GuC/HuC auto mode")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c | 2 +-
 drivers/gpu/drm/i915/i915_params.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 296452f9efe4..b4f481e1e6b6 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -146,7 +146,7 @@  i915_param_named_unsafe(edp_vswing, int, 0400,
 i915_param_named_unsafe(enable_guc, int, 0400,
 	"Enable GuC load for GuC submission and/or HuC load. "
 	"Required functionality can be selected using bitmask values. "
-	"(-1=auto, 0=disable [default], 1=GuC submission, 2=HuC load)");
+	"(-1=auto [default], 0=disable, 1=GuC submission, 2=HuC load)");
 
 i915_param_named(guc_log_level, int, 0400,
 	"GuC firmware logging level. Requires GuC to be loaded. "
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index d29ade3b7de6..5736c55694fe 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -54,7 +54,7 @@  struct drm_printer;
 	param(int, disable_power_well, -1) \
 	param(int, enable_ips, 1) \
 	param(int, invert_brightness, 0) \
-	param(int, enable_guc, 0) \
+	param(int, enable_guc, -1) \
 	param(int, guc_log_level, -1) \
 	param(char *, guc_firmware_path, NULL) \
 	param(char *, huc_firmware_path, NULL) \

Comments

On 18/08/2019 18:51, Michal Wajdeczko wrote:
> We hope that now all issues related to missing uC firmwares
> are fixed and that driver can continue to load without GuC
> or HuC firmware available and running:
> 
>  - missing or corrupted HuC firmware has no impact to driver
>    load (clients should continue to use HUC_STATUS to check
>    if HuC firmware was loaded and authenticated)
> 
>  - missing or corrupted GuC firmware has no impact to driver
>    load (unless GuC firmware blob was overridden by the user
>    or GuC submission was requested or GuC was previously
>    enabled on this system without reboot - then we will mark
>    GPU as wedged and continue with KMS only)

Please hold merging this patch, as many more items need to be crossed
off before such a patch can land.

Such items include:

 - Assess all the existing GUC-related bugs, and prove they won't
suddenly get seen by more users
 - add fault injection to the FW loading path
 - add IGT tests to make sure driver behaves well on different FW
loading errors

Martin

> 
> References: commit f774f0964919 ("drm/i915/guc: Turn on GuC/HuC auto mode")
> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_params.c | 2 +-
>  drivers/gpu/drm/i915/i915_params.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index 296452f9efe4..b4f481e1e6b6 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -146,7 +146,7 @@ i915_param_named_unsafe(edp_vswing, int, 0400,
>  i915_param_named_unsafe(enable_guc, int, 0400,
>  	"Enable GuC load for GuC submission and/or HuC load. "
>  	"Required functionality can be selected using bitmask values. "
> -	"(-1=auto, 0=disable [default], 1=GuC submission, 2=HuC load)");
> +	"(-1=auto [default], 0=disable, 1=GuC submission, 2=HuC load)");
>  
>  i915_param_named(guc_log_level, int, 0400,
>  	"GuC firmware logging level. Requires GuC to be loaded. "
> diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
> index d29ade3b7de6..5736c55694fe 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -54,7 +54,7 @@ struct drm_printer;
>  	param(int, disable_power_well, -1) \
>  	param(int, enable_ips, 1) \
>  	param(int, invert_brightness, 0) \
> -	param(int, enable_guc, 0) \
> +	param(int, enable_guc, -1) \
>  	param(int, guc_log_level, -1) \
>  	param(char *, guc_firmware_path, NULL) \
>  	param(char *, huc_firmware_path, NULL) \
>
On Mon, 19 Aug 2019 10:09:15 +0200, Martin Peres  
<martin.peres@linux.intel.com> wrote:

> On 18/08/2019 18:51, Michal Wajdeczko wrote:
>> We hope that now all issues related to missing uC firmwares
>> are fixed and that driver can continue to load without GuC
>> or HuC firmware available and running:
>>
>>  - missing or corrupted HuC firmware has no impact to driver
>>    load (clients should continue to use HUC_STATUS to check
>>    if HuC firmware was loaded and authenticated)
>>
>>  - missing or corrupted GuC firmware has no impact to driver
>>    load (unless GuC firmware blob was overridden by the user
>>    or GuC submission was requested or GuC was previously
>>    enabled on this system without reboot - then we will mark
>>    GPU as wedged and continue with KMS only)
>
> Please hold merging this patch, as many more items need to be crossed
> off before such a patch can land.
>
> Such items include:
>
>  - Assess all the existing GUC-related bugs, and prove they won't
> suddenly get seen by more users

Yep, this is right time to double check that.

>  - add fault injection to the FW loading path

About 40 new failure points were added to firmware loading path by
recent commits like [1] ("drm/i915/uc: Hardening firmware fetch")
and [2] ("drm/i915/uc: Inject probe errors into intel_uc_init_hw")
so I assume we could mark this item as done.

>  - add IGT tests to make sure driver behaves well on different FW
> loading errors

Not sure which if we need another set of tests beyond existing and
already used igt@i915_module_load@reload-with-fault-injection.

Maybe all we need is to add dedicated CI machines that will try to
use GuC but without valid firmware on the system? Let CI team speak
here. Note I was already trying to mimic that scenario with [3] and
[4] and they we all good.

Thanks,
Michal

[1]  
https://cgit.freedesktop.org/drm/drm-tip/commit/?id=5e0a809af2a2b577934ec54c0e50897b806f0732
[2]  
https://cgit.freedesktop.org/drm/drm-tip/commit/?id=5d1ef2b4270de45e1b1b40a00838e3b6196eefd7
[3] https://patchwork.freedesktop.org/series/65378/#rev1
[4] https://patchwork.freedesktop.org/series/65379/#rev1

>
> Martin
>
>>
>> References: commit f774f0964919 ("drm/i915/guc: Turn on GuC/HuC auto  
>> mode")
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> Cc: Jani Nikula <jani.nikula@intel.com>
>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_params.c | 2 +-
>>  drivers/gpu/drm/i915/i915_params.h | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_params.c  
>> b/drivers/gpu/drm/i915/i915_params.c
>> index 296452f9efe4..b4f481e1e6b6 100644
>> --- a/drivers/gpu/drm/i915/i915_params.c
>> +++ b/drivers/gpu/drm/i915/i915_params.c
>> @@ -146,7 +146,7 @@ i915_param_named_unsafe(edp_vswing, int, 0400,
>>  i915_param_named_unsafe(enable_guc, int, 0400,
>>  	"Enable GuC load for GuC submission and/or HuC load. "
>>  	"Required functionality can be selected using bitmask values. "
>> -	"(-1=auto, 0=disable [default], 1=GuC submission, 2=HuC load)");
>> +	"(-1=auto [default], 0=disable, 1=GuC submission, 2=HuC load)");
>>
>>  i915_param_named(guc_log_level, int, 0400,
>>  	"GuC firmware logging level. Requires GuC to be loaded. "
>> diff --git a/drivers/gpu/drm/i915/i915_params.h  
>> b/drivers/gpu/drm/i915/i915_params.h
>> index d29ade3b7de6..5736c55694fe 100644
>> --- a/drivers/gpu/drm/i915/i915_params.h
>> +++ b/drivers/gpu/drm/i915/i915_params.h
>> @@ -54,7 +54,7 @@ struct drm_printer;
>>  	param(int, disable_power_well, -1) \
>>  	param(int, enable_ips, 1) \
>>  	param(int, invert_brightness, 0) \
>> -	param(int, enable_guc, 0) \
>> +	param(int, enable_guc, -1) \
>>  	param(int, guc_log_level, -1) \
>>  	param(char *, guc_firmware_path, NULL) \
>>  	param(char *, huc_firmware_path, NULL) \
On Mon, Aug 19, l we need is to add2019 at 4:15 AM Michal Wajdeczko
<michal.wajdeczko@intel.com> wrote:
>
> On Mon, 19 Aug 2019 10:09:15 +0200, Martin Peres
> <martin.peres@linux.intel.com> wrote:
>
> > On 18/08/2019 18:51, Michal Wajdeczko wrote:
> >> We hope that now all issues related to missing uC firmwares
> >> are fixed and that driver can continue to load without GuC
> >> or HuC firmware available and running:
> >>
> >>  - missing or corrupted HuC firmware has no impact to driver
> >>    load (clients should continue to use HUC_STATUS to check
> >>    if HuC firmware was loaded and authenticated)
> >>
> >>  - missing or corrupted GuC firmware has no impact to driver
> >>    load (unless GuC firmware blob was overridden by the user
> >>    or GuC submission was requested or GuC was previously
> >>    enabled on this system without reboot - then we will mark
> >>    GPU as wedged and continue with KMS only)
> >
> > Please hold merging this patch, as many more items need to be crossed
> > off before such a patch can land.
> >
> > Such items include:
> >
> >  - Assess all the existing GUC-related bugs, and prove they won't
> > suddenly get seen by more users
>
> Yep, this is right time to double check that.
>
> >  - add fault injection to the FW loading path
>
> About 40 new failure points were added to firmware loading path by
> recent commits like [1] ("drm/i915/uc: Hardening firmware fetch")
> and [2] ("drm/i915/uc: Inject probe errors into intel_uc_init_hw")
> so I assume we could mark this item as done.
>
> >  - add IGT tests to make sure driver behaves well on different FW
> > loading errors
>
> Not sure which if we need another set of tests beyond existing and
> already used igt@i915_module_load@reload-with-fault-injection.
>
> Maybe all we need is to add dedicated CI machines that will try to
> use GuC but without valid firmware on the system? Let CI team speak
> here. Note I was already trying to mimic that scenario with [3] and
> [4] and they we all good.

I don't think we should use dedicated CI machines for that. We could
make the test to bind mount an empty
directory before executing in order to hide the firmware, which would
be an easy solution.

In kmod testsuite I do something similar for modules, but I trap all
fs-related calls (I had to do that for other reasons):
https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git/tree/testsuite/path.c

Lucas De Marchi


>
> Thanks,
> Michal
>
> [1]
> https://cgit.freedesktop.org/drm/drm-tip/commit/?id=5e0a809af2a2b577934ec54c0e50897b806f0732
> [2]
> https://cgit.freedesktop.org/drm/drm-tip/commit/?id=5d1ef2b4270de45e1b1b40a00838e3b6196eefd7
> [3] https://patchwork.freedesktop.org/series/65378/#rev1
> [4] https://patchwork.freedesktop.org/series/65379/#rev1
>
> >
> > Martin
> >
> >>
> >> References: commit f774f0964919 ("drm/i915/guc: Turn on GuC/HuC auto
> >> mode")
> >> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> Cc: Jani Nikula <jani.nikula@intel.com>
> >> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/i915_params.c | 2 +-
> >>  drivers/gpu/drm/i915/i915_params.h | 2 +-
> >>  2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_params.c
> >> b/drivers/gpu/drm/i915/i915_params.c
> >> index 296452f9efe4..b4f481e1e6b6 100644
> >> --- a/drivers/gpu/drm/i915/i915_params.c
> >> +++ b/drivers/gpu/drm/i915/i915_params.c
> >> @@ -146,7 +146,7 @@ i915_param_named_unsafe(edp_vswing, int, 0400,
> >>  i915_param_named_unsafe(enable_guc, int, 0400,
> >>      "Enable GuC load for GuC submission and/or HuC load. "
> >>      "Required functionality can be selected using bitmask values. "
> >> -    "(-1=auto, 0=disable [default], 1=GuC submission, 2=HuC load)");
> >> +    "(-1=auto [default], 0=disable, 1=GuC submission, 2=HuC load)");
> >>
> >>  i915_param_named(guc_log_level, int, 0400,
> >>      "GuC firmware logging level. Requires GuC to be loaded. "
> >> diff --git a/drivers/gpu/drm/i915/i915_params.h
> >> b/drivers/gpu/drm/i915/i915_params.h
> >> index d29ade3b7de6..5736c55694fe 100644
> >> --- a/drivers/gpu/drm/i915/i915_params.h
> >> +++ b/drivers/gpu/drm/i915/i915_params.h
> >> @@ -54,7 +54,7 @@ struct drm_printer;
> >>      param(int, disable_power_well, -1) \
> >>      param(int, enable_ips, 1) \
> >>      param(int, invert_brightness, 0) \
> >> -    param(int, enable_guc, 0) \
> >> +    param(int, enable_guc, -1) \
> >>      param(int, guc_log_level, -1) \
> >>      param(char *, guc_firmware_path, NULL) \
> >>      param(char *, huc_firmware_path, NULL) \
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 19/08/2019 21:22, Lucas De Marchi wrote:
> On Mon, Aug 19, l we need is to add2019 at 4:15 AM Michal Wajdeczko
> <michal.wajdeczko@intel.com> wrote:
>>
>> On Mon, 19 Aug 2019 10:09:15 +0200, Martin Peres
>> <martin.peres@linux.intel.com> wrote:
>>
>>> On 18/08/2019 18:51, Michal Wajdeczko wrote:
>>>> We hope that now all issues related to missing uC firmwares
>>>> are fixed and that driver can continue to load without GuC
>>>> or HuC firmware available and running:
>>>>
>>>>  - missing or corrupted HuC firmware has no impact to driver
>>>>    load (clients should continue to use HUC_STATUS to check
>>>>    if HuC firmware was loaded and authenticated)
>>>>
>>>>  - missing or corrupted GuC firmware has no impact to driver
>>>>    load (unless GuC firmware blob was overridden by the user
>>>>    or GuC submission was requested or GuC was previously
>>>>    enabled on this system without reboot - then we will mark
>>>>    GPU as wedged and continue with KMS only)
>>>
>>> Please hold merging this patch, as many more items need to be crossed
>>> off before such a patch can land.
>>>
>>> Such items include:
>>>
>>>  - Assess all the existing GUC-related bugs, and prove they won't
>>> suddenly get seen by more users
>>
>> Yep, this is right time to double check that.
>>
>>>  - add fault injection to the FW loading path
>>
>> About 40 new failure points were added to firmware loading path by
>> recent commits like [1] ("drm/i915/uc: Hardening firmware fetch")
>> and [2] ("drm/i915/uc: Inject probe errors into intel_uc_init_hw")
>> so I assume we could mark this item as done.
>>
>>>  - add IGT tests to make sure driver behaves well on different FW
>>> loading errors
>>
>> Not sure which if we need another set of tests beyond existing and
>> already used igt@i915_module_load@reload-with-fault-injection.
>>
>> Maybe all we need is to add dedicated CI machines that will try to
>> use GuC but without valid firmware on the system? Let CI team speak
>> here. Note I was already trying to mimic that scenario with [3] and
>> [4] and they we all good.
> 
> I don't think we should use dedicated CI machines for that. We could
> make the test to bind mount an empty
> directory before executing in order to hide the firmware, which would
> be an easy solution.
> 
> In kmod testsuite I do something similar for modules, but I trap all
> fs-related calls (I had to do that for other reasons):
> https://git.kernel.org/pub/scm/utils/kernel/kmod/kmod.git/tree/testsuite/path.c

Let's not forget that IGT is meant to be run by anyone, and the more
testing we can do on a developer's machine, the better.

So thank you Lucas for being faster than me to suggest that relying on
dedicated machines is a bad idea, especially when it is such an easy
solution :)

Martin

> 
> Lucas De Marchi
> 
> 
>>
>> Thanks,
>> Michal
>>
>> [1]
>> https://cgit.freedesktop.org/drm/drm-tip/commit/?id=5e0a809af2a2b577934ec54c0e50897b806f0732
>> [2]
>> https://cgit.freedesktop.org/drm/drm-tip/commit/?id=5d1ef2b4270de45e1b1b40a00838e3b6196eefd7
>> [3] https://patchwork.freedesktop.org/series/65378/#rev1
>> [4] https://patchwork.freedesktop.org/series/65379/#rev1
>>
>>>
>>> Martin
>>>
>>>>
>>>> References: commit f774f0964919 ("drm/i915/guc: Turn on GuC/HuC auto
>>>> mode")
>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>>> Cc: Jani Nikula <jani.nikula@intel.com>
>>>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>> ---
>>>>  drivers/gpu/drm/i915/i915_params.c | 2 +-
>>>>  drivers/gpu/drm/i915/i915_params.h | 2 +-
>>>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_params.c
>>>> b/drivers/gpu/drm/i915/i915_params.c
>>>> index 296452f9efe4..b4f481e1e6b6 100644
>>>> --- a/drivers/gpu/drm/i915/i915_params.c
>>>> +++ b/drivers/gpu/drm/i915/i915_params.c
>>>> @@ -146,7 +146,7 @@ i915_param_named_unsafe(edp_vswing, int, 0400,
>>>>  i915_param_named_unsafe(enable_guc, int, 0400,
>>>>      "Enable GuC load for GuC submission and/or HuC load. "
>>>>      "Required functionality can be selected using bitmask values. "
>>>> -    "(-1=auto, 0=disable [default], 1=GuC submission, 2=HuC load)");
>>>> +    "(-1=auto [default], 0=disable, 1=GuC submission, 2=HuC load)");
>>>>
>>>>  i915_param_named(guc_log_level, int, 0400,
>>>>      "GuC firmware logging level. Requires GuC to be loaded. "
>>>> diff --git a/drivers/gpu/drm/i915/i915_params.h
>>>> b/drivers/gpu/drm/i915/i915_params.h
>>>> index d29ade3b7de6..5736c55694fe 100644
>>>> --- a/drivers/gpu/drm/i915/i915_params.h
>>>> +++ b/drivers/gpu/drm/i915/i915_params.h
>>>> @@ -54,7 +54,7 @@ struct drm_printer;
>>>>      param(int, disable_power_well, -1) \
>>>>      param(int, enable_ips, 1) \
>>>>      param(int, invert_brightness, 0) \
>>>> -    param(int, enable_guc, 0) \
>>>> +    param(int, enable_guc, -1) \
>>>>      param(int, guc_log_level, -1) \
>>>>      param(char *, guc_firmware_path, NULL) \
>>>>      param(char *, huc_firmware_path, NULL) \
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
>
On Thu, 22 Aug 2019 08:40:33 +0200, Arkadiusz Hiler  
<arkadiusz.hiler@intel.com> wrote:

> On Mon, Aug 19, 2019 at 11:09:15AM +0300, Martin Peres wrote:
>> On 18/08/2019 18:51, Michal Wajdeczko wrote:
>> > We hope that now all issues related to missing uC firmwares
>> > are fixed and that driver can continue to load without GuC
>> > or HuC firmware available and running:
>> >
>> >  - missing or corrupted HuC firmware has no impact to driver
>> >    load (clients should continue to use HUC_STATUS to check
>> >    if HuC firmware was loaded and authenticated)
>> >
>> >  - missing or corrupted GuC firmware has no impact to driver
>> >    load (unless GuC firmware blob was overridden by the user
>> >    or GuC submission was requested or GuC was previously
>> >    enabled on this system without reboot - then we will mark
>> >    GPU as wedged and continue with KMS only)
>>
>> Please hold merging this patch, as many more items need to be crossed
>> off before such a patch can land.
>>
>> Such items include:
>>
>>  - Assess all the existing GUC-related bugs, and prove they won't
>> suddenly get seen by more users
>>  - add fault injection to the FW loading path
>>  - add IGT tests to make sure driver behaves well on different FW
>> loading errors
>
> If this is going to get enabled by default we should add some tests to
> verify that HuC is indeed loaded and operational. Otherwise this may
> degrade without anyone noticing.

we were discussing such test some time ago [1], but we couldn't get
into final agreement.

[1] https://patchwork.freedesktop.org/series/60800/


>
> Something along those lines:
>     int huc_status = getparam(I915_PARAM_HUC_STATUS);
>
>     assert(MI_PREDICATE(HUC_STATUS) == !!huc_status);
>     skip_on_f(huc_status == 0, "HuC disabled\n");
>
>     assert_f(huc_status == 1, "HuC status is not enabled: %d\n",  
> huc_status);
>     assert(submit_commands_to_check_that_huc_is_operational());
>
>
>
> The issue is that the PARAM_HUC_STATUS won't even work right now because:
>
>     case I915_PARAM_HUC_STATUS:
>     	value = intel_huc_check_status(&i915->gt.uc.huc);
>     	if (value < 0)
>     		return value;
>     	break;
>     /* ... */
>     return 0;

Please note that this return if for ioctl() call status

>
>
>     /**
>      * intel_huc_check_status() - check HuC status
>      * @huc: intel_huc structure
>      *
>      * This function reads status register to verify if HuC
>      * firmware was successfully loaded.
>      *
>      * Returns: 1 if HuC firmware is loaded and verified,
>      * 0 if HuC firmware is not loaded and -ENODEV if HuC
>      * is not present on this platform.
>      */
>
> This is broken - we will get 0 whether it's enabled or disabled.

I don't think so. Negative values returned by this function are simply
used as ioctl() errors, while 0/1 values are returned as 'value' field
that holds reply with actual HuC status:

	if (put_user(value, param->value))
		return -EFAULT;

More coffee?

Michal
On Thu, Aug 22, 2019 at 09:31:07AM +0200, Michal Wajdeczko wrote:
> On Thu, 22 Aug 2019 08:40:33 +0200, Arkadiusz Hiler
> <arkadiusz.hiler@intel.com> wrote:
> 
> > On Mon, Aug 19, 2019 at 11:09:15AM +0300, Martin Peres wrote:
> > > On 18/08/2019 18:51, Michal Wajdeczko wrote:
> > > > We hope that now all issues related to missing uC firmwares
> > > > are fixed and that driver can continue to load without GuC
> > > > or HuC firmware available and running:
> > > >
> > > >  - missing or corrupted HuC firmware has no impact to driver
> > > >    load (clients should continue to use HUC_STATUS to check
> > > >    if HuC firmware was loaded and authenticated)
> > > >
> > > >  - missing or corrupted GuC firmware has no impact to driver
> > > >    load (unless GuC firmware blob was overridden by the user
> > > >    or GuC submission was requested or GuC was previously
> > > >    enabled on this system without reboot - then we will mark
> > > >    GPU as wedged and continue with KMS only)
> > > 
> > > Please hold merging this patch, as many more items need to be crossed
> > > off before such a patch can land.
> > > 
> > > Such items include:
> > > 
> > >  - Assess all the existing GUC-related bugs, and prove they won't
> > > suddenly get seen by more users
> > >  - add fault injection to the FW loading path
> > >  - add IGT tests to make sure driver behaves well on different FW
> > > loading errors
> > 
> > If this is going to get enabled by default we should add some tests to
> > verify that HuC is indeed loaded and operational. Otherwise this may
> > degrade without anyone noticing.
> 
> we were discussing such test some time ago [1], but we couldn't get
> into final agreement.
> 
> [1] https://patchwork.freedesktop.org/series/60800/

Oh, that's a good start. It would be good to land this along the default
enabling of HuC and have the agreement on the "best error codes" by then.

> > Something along those lines:
> >     int huc_status = getparam(I915_PARAM_HUC_STATUS);
> > 
> >     assert(MI_PREDICATE(HUC_STATUS) == !!huc_status);
> >     skip_on_f(huc_status == 0, "HuC disabled\n");
> > 
> >     assert_f(huc_status == 1, "HuC status is not enabled: %d\n",
> > huc_status);
> >     assert(submit_commands_to_check_that_huc_is_operational());
> > 
> > 
> > 
> > The issue is that the PARAM_HUC_STATUS won't even work right now because:
> > 
> >     case I915_PARAM_HUC_STATUS:
> >     	value = intel_huc_check_status(&i915->gt.uc.huc);
> >     	if (value < 0)
> >     		return value;
> >     	break;
> >     /* ... */
> >     return 0;
> 
> Please note that this return if for ioctl() call status
> 
> > 
> > 
> >     /**
> >      * intel_huc_check_status() - check HuC status
> >      * @huc: intel_huc structure
> >      *
> >      * This function reads status register to verify if HuC
> >      * firmware was successfully loaded.
> >      *
> >      * Returns: 1 if HuC firmware is loaded and verified,
> >      * 0 if HuC firmware is not loaded and -ENODEV if HuC
> >      * is not present on this platform.
> >      */
> > 
> > This is broken - we will get 0 whether it's enabled or disabled.
> 
> I don't think so. Negative values returned by this function are simply
> used as ioctl() errors, while 0/1 values are returned as 'value' field
> that holds reply with actual HuC status:
> 
> 	if (put_user(value, param->value))
> 		return -EFAULT;
> 
> More coffee?

My bad. coffee++ would have helped