[RESEND,1/2] drm/i915: make device info bitfield flags bools

Submitted by Jani Nikula on May 13, 2016, 2:04 p.m.

Details

Message ID 1463148278-23193-1-git-send-email-jani.nikula@intel.com
State New
Headers show
Series "Series without cover letter" ( rev: 1 ) in Intel GFX

Not browsing as part of any series.

Commit Message

Jani Nikula May 13, 2016, 2:04 p.m.
This is more robust for assignments and comparisons.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d9d07b70f05c..bb0b6f64000e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -752,7 +752,7 @@  struct intel_csr {
 	func(has_ddi) sep \
 	func(has_fpga_dbg)
 
-#define DEFINE_FLAG(name) u8 name:1
+#define DEFINE_FLAG(name) bool name:1
 #define SEP_SEMICOLON ;
 
 struct intel_device_info {

Comments

On 13/05/16 15:04, Jani Nikula wrote:
> This is more robust for assignments and comparisons.
>
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d9d07b70f05c..bb0b6f64000e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -752,7 +752,7 @@ struct intel_csr {
>   	func(has_ddi) sep \
>   	func(has_fpga_dbg)
>
> -#define DEFINE_FLAG(name) u8 name:1
> +#define DEFINE_FLAG(name) bool name:1
>   #define SEP_SEMICOLON ;
>
>   struct intel_device_info {
>

The churn virus spreads? :)

I tried that but it was negatively impacting the compiler. For some 
reason it increases .text by 2.5k here. Don't see anything obvious, 
would have to look at the code more closely to spot exactly why.

Regards,

Tvrtko
On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote:
> 
> On 13/05/16 15:04, Jani Nikula wrote:
> >This is more robust for assignments and comparisons.
> >
> >Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_drv.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index d9d07b70f05c..bb0b6f64000e 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -752,7 +752,7 @@ struct intel_csr {
> >  	func(has_ddi) sep \
> >  	func(has_fpga_dbg)
> >
> >-#define DEFINE_FLAG(name) u8 name:1
> >+#define DEFINE_FLAG(name) bool name:1
> >  #define SEP_SEMICOLON ;
> >
> >  struct intel_device_info {
> >
> 
> The churn virus spreads? :)
> 
> I tried that but it was negatively impacting the compiler. For some
> reason it increases .text by 2.5k here. Don't see anything obvious,
> would have to look at the code more closely to spot exactly why.

Oh, that's not fun. bool:1 holds such promise for a clear explanation of
the most common form of bitfield.
-Chris
On Fri, 13 May 2016, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote:
>> 
>> On 13/05/16 15:04, Jani Nikula wrote:
>> >This is more robust for assignments and comparisons.
>> >
>> >Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> >---
>> >  drivers/gpu/drm/i915/i915_drv.h | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> >index d9d07b70f05c..bb0b6f64000e 100644
>> >--- a/drivers/gpu/drm/i915/i915_drv.h
>> >+++ b/drivers/gpu/drm/i915/i915_drv.h
>> >@@ -752,7 +752,7 @@ struct intel_csr {
>> >  	func(has_ddi) sep \
>> >  	func(has_fpga_dbg)
>> >
>> >-#define DEFINE_FLAG(name) u8 name:1
>> >+#define DEFINE_FLAG(name) bool name:1
>> >  #define SEP_SEMICOLON ;
>> >
>> >  struct intel_device_info {
>> >
>> 
>> The churn virus spreads? :)
>> 
>> I tried that but it was negatively impacting the compiler. For some
>> reason it increases .text by 2.5k here. Don't see anything obvious,
>> would have to look at the code more closely to spot exactly why.
>
> Oh, that's not fun. bool:1 holds such promise for a clear explanation of
> the most common form of bitfield.

Really a bummer, especially since assigning any positive even number to
unsigned int foo:1 will result in 0.

BR,
Jani.
On 13/05/16 15:47, Jani Nikula wrote:
> On Fri, 13 May 2016, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>> On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote:
>>>
>>> On 13/05/16 15:04, Jani Nikula wrote:
>>>> This is more robust for assignments and comparisons.
>>>>
>>>> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_drv.h | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index d9d07b70f05c..bb0b6f64000e 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -752,7 +752,7 @@ struct intel_csr {
>>>>   	func(has_ddi) sep \
>>>>   	func(has_fpga_dbg)
>>>>
>>>> -#define DEFINE_FLAG(name) u8 name:1
>>>> +#define DEFINE_FLAG(name) bool name:1
>>>>   #define SEP_SEMICOLON ;
>>>>
>>>>   struct intel_device_info {
>>>>
>>>
>>> The churn virus spreads? :)
>>>
>>> I tried that but it was negatively impacting the compiler. For some
>>> reason it increases .text by 2.5k here. Don't see anything obvious,
>>> would have to look at the code more closely to spot exactly why.
>>
>> Oh, that's not fun. bool:1 holds such promise for a clear explanation of
>> the most common form of bitfield.
>
> Really a bummer, especially since assigning any positive even number to
> unsigned int foo:1 will result in 0.

That is a pretty strong argument to go for this rather than make sure 
all places which set them are correct.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
On 13/05/16 16:05, Tvrtko Ursulin wrote:
>
> On 13/05/16 15:47, Jani Nikula wrote:
>> On Fri, 13 May 2016, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>>> On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 13/05/16 15:04, Jani Nikula wrote:
>>>>> This is more robust for assignments and comparisons.
>>>>>
>>>>> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/i915_drv.h | 2 +-
>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>> index d9d07b70f05c..bb0b6f64000e 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>> @@ -752,7 +752,7 @@ struct intel_csr {
>>>>>       func(has_ddi) sep \
>>>>>       func(has_fpga_dbg)
>>>>>
>>>>> -#define DEFINE_FLAG(name) u8 name:1
>>>>> +#define DEFINE_FLAG(name) bool name:1
>>>>>   #define SEP_SEMICOLON ;
>>>>>
>>>>>   struct intel_device_info {
>>>>>
>>>>
>>>> The churn virus spreads? :)
>>>>
>>>> I tried that but it was negatively impacting the compiler. For some
>>>> reason it increases .text by 2.5k here. Don't see anything obvious,
>>>> would have to look at the code more closely to spot exactly why.
>>>
>>> Oh, that's not fun. bool:1 holds such promise for a clear explanation of
>>> the most common form of bitfield.
>>
>> Really a bummer, especially since assigning any positive even number to
>> unsigned int foo:1 will result in 0.
>
> That is a pretty strong argument to go for this rather than make sure
> all places which set them are correct.
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Regards,
> Tvrtko

Unfortunately, the compiler doesn't generate such good code when using 
'bool' vs. 'u8' for these bitfields.

Firstly, it looks like the compiler DOESN'T combine bool-bitfield tests 
such as "if (IS_A() || IS_B())", whereas it does if they're u8. Example:

With bitfields as 'u8'

	if (IS_HASWELL(dev) || IS_BROADWELL(dev))
48 8b 57 28          	mov    0x28(%rdi),%rdx	# dev->dev_priv
0f b6 52 30          	movzbl 0x30(%rdx),%edx	# dev_priv->flag byte
f6 c2 60             	test   $0x60,%dl	# (haswell|broadwell)
0f 85 bb 00 00 00    	jne    12e4		# branch to if-true

	else if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev))
83 e2 18             	and    $0x18,%edx	# test two more flags
0f 84 1c 01 00 00    	je     134e		# skip if neither
                         ... code for VLV/CHV ...
(code for HSW/BDW is out-of-line, at the end of the function)

With bitfields as bools:

48 8b 57 28          	mov    0x28(%rdi),%rdx	# dev->dev_priv
0f b6 52 30          	movzbl 0x30(%rdx),%edx	# dev_priv->flag byte
f6 c2 20             	test   $0x20,%dl	# test one flag
75 09                	jne    1241		# jump if true
f6 c2 40             	test   $0x40,%dl	# test other flag
0f 84 b7 00 00 00    	je     12f8		# skip if not
                         ... code for HSW/BDW ...
(code for VLV/CHV/other is later)

So that would suggest that the compiler generates more efficient code 
for 'u8' (at least it didn't reload the flag byte even in the 'bool' 
case). Here's another case:

With bitfields as bools:

	dev_priv->psr_mmio_base = IS_HASWELL(dev_priv) ?
		HSW_EDP_PSR_BASE : BDW_EDP_PSR_BASE;
0f b6 43 30          	movzbl 0x30(%rbx),%eax	# flags byte
c0 e8 05             	shr    $0x5,%al		# haswell->bit 0
83 e0 01             	and    $0x1,%eax	# leaves bit 0 only
3c 01                	cmp    $0x1,%al		# compare to 1u
19 c0                	sbb    %eax,%eax	# convert to 0/-1
25 00 b0 00 00       	and    $0xb000,%eax	# convert to 0/b000
05 00 48 06 00       	add    $0x64800,%eax	# final result
89 83 e8 20 00 00    	mov    %eax,0x20e8(%rbx)# store it

This is ingenious code, avoiding any branching. Now with 'u8':

0f b6 43 30          	movzbl 0x30(%rbx),%eax
83 e0 20             	and    $0x20,%eax	# leaves bit 5 only
3c 01                	cmp    $0x1,%al		# compare to 1u
19 c0                	sbb    %eax,%eax	# convert to 0/-1
25 00 b0 00 00       	and    $0xb000,%eax	# etc
05 00 48 06 00       	add    $0x64800,%eax
89 83 e8 20 00 00    	mov    %eax,0x20e8(%rbx)

Again it's shorter, because the compiler has been able to extract a 
truth-value without shifting the "haswell" bit down to bit 0 first.

.Dave.
On Mon, 16 May 2016, Dave Gordon <david.s.gordon@intel.com> wrote:
> On 13/05/16 16:05, Tvrtko Ursulin wrote:
>>
>> On 13/05/16 15:47, Jani Nikula wrote:
>>> On Fri, 13 May 2016, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>>>> On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 13/05/16 15:04, Jani Nikula wrote:
>>>>>> This is more robust for assignments and comparisons.
>>>>>>
>>>>>> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/i915_drv.h | 2 +-
>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index d9d07b70f05c..bb0b6f64000e 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -752,7 +752,7 @@ struct intel_csr {
>>>>>>       func(has_ddi) sep \
>>>>>>       func(has_fpga_dbg)
>>>>>>
>>>>>> -#define DEFINE_FLAG(name) u8 name:1
>>>>>> +#define DEFINE_FLAG(name) bool name:1
>>>>>>   #define SEP_SEMICOLON ;
>>>>>>
>>>>>>   struct intel_device_info {
>>>>>>
>>>>>
>>>>> The churn virus spreads? :)
>>>>>
>>>>> I tried that but it was negatively impacting the compiler. For some
>>>>> reason it increases .text by 2.5k here. Don't see anything obvious,
>>>>> would have to look at the code more closely to spot exactly why.
>>>>
>>>> Oh, that's not fun. bool:1 holds such promise for a clear explanation of
>>>> the most common form of bitfield.
>>>
>>> Really a bummer, especially since assigning any positive even number to
>>> unsigned int foo:1 will result in 0.
>>
>> That is a pretty strong argument to go for this rather than make sure
>> all places which set them are correct.
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Regards,
>> Tvrtko
>
> Unfortunately, the compiler doesn't generate such good code when using 
> 'bool' vs. 'u8' for these bitfields.


I've pushed patch 2/2 while the jury seems to be still out for 1/2.

Generally we might prefer bools anyway, but this struct is const in
dev_priv with the idea that this is immutable data, and we override the
const briefly on driver load in intel_device_info_runtime_init().

Checking all usage is not a huge effort, although we might still screw
this up later on...

BR,
Jani.


>
> Firstly, it looks like the compiler DOESN'T combine bool-bitfield tests 
> such as "if (IS_A() || IS_B())", whereas it does if they're u8. Example:
>
> With bitfields as 'u8'
>
> 	if (IS_HASWELL(dev) || IS_BROADWELL(dev))
> 48 8b 57 28          	mov    0x28(%rdi),%rdx	# dev->dev_priv
> 0f b6 52 30          	movzbl 0x30(%rdx),%edx	# dev_priv->flag byte
> f6 c2 60             	test   $0x60,%dl	# (haswell|broadwell)
> 0f 85 bb 00 00 00    	jne    12e4		# branch to if-true
>
> 	else if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev))
> 83 e2 18             	and    $0x18,%edx	# test two more flags
> 0f 84 1c 01 00 00    	je     134e		# skip if neither
>                          ... code for VLV/CHV ...
> (code for HSW/BDW is out-of-line, at the end of the function)
>
> With bitfields as bools:
>
> 48 8b 57 28          	mov    0x28(%rdi),%rdx	# dev->dev_priv
> 0f b6 52 30          	movzbl 0x30(%rdx),%edx	# dev_priv->flag byte
> f6 c2 20             	test   $0x20,%dl	# test one flag
> 75 09                	jne    1241		# jump if true
> f6 c2 40             	test   $0x40,%dl	# test other flag
> 0f 84 b7 00 00 00    	je     12f8		# skip if not
>                          ... code for HSW/BDW ...
> (code for VLV/CHV/other is later)
>
> So that would suggest that the compiler generates more efficient code 
> for 'u8' (at least it didn't reload the flag byte even in the 'bool' 
> case). Here's another case:
>
> With bitfields as bools:
>
> 	dev_priv->psr_mmio_base = IS_HASWELL(dev_priv) ?
> 		HSW_EDP_PSR_BASE : BDW_EDP_PSR_BASE;
> 0f b6 43 30          	movzbl 0x30(%rbx),%eax	# flags byte
> c0 e8 05             	shr    $0x5,%al		# haswell->bit 0
> 83 e0 01             	and    $0x1,%eax	# leaves bit 0 only
> 3c 01                	cmp    $0x1,%al		# compare to 1u
> 19 c0                	sbb    %eax,%eax	# convert to 0/-1
> 25 00 b0 00 00       	and    $0xb000,%eax	# convert to 0/b000
> 05 00 48 06 00       	add    $0x64800,%eax	# final result
> 89 83 e8 20 00 00    	mov    %eax,0x20e8(%rbx)# store it
>
> This is ingenious code, avoiding any branching. Now with 'u8':
>
> 0f b6 43 30          	movzbl 0x30(%rbx),%eax
> 83 e0 20             	and    $0x20,%eax	# leaves bit 5 only
> 3c 01                	cmp    $0x1,%al		# compare to 1u
> 19 c0                	sbb    %eax,%eax	# convert to 0/-1
> 25 00 b0 00 00       	and    $0xb000,%eax	# etc
> 05 00 48 06 00       	add    $0x64800,%eax
> 89 83 e8 20 00 00    	mov    %eax,0x20e8(%rbx)
>
> Again it's shorter, because the compiler has been able to extract a 
> truth-value without shifting the "haswell" bit down to bit 0 first.
>
> .Dave.
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx