[Mesa-dev,25/57] i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap().

Submitted by Francisco Jerez on Sept. 8, 2016, 1:48 a.m.

Details

Message ID 20160908014924.9269-26-currojerez@riseup.net
State New
Headers show
Series "i965/ir: Switch representation of register offsets and sizes to byte units." ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Francisco Jerez Sept. 8, 2016, 1:48 a.m.
Its only use left in the FS back-end should be using regions_overlap()
instead to avoid getting a false negative result in cases where source
and destination overlap but the former starts before the latter in the
VGRF file.
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 42ed131..b06606b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -4820,7 +4820,8 @@  shuffle_64bit_data_for_32bit_write(const fs_builder &bld,
    assert(type_sz(src.type) == 8);
    assert(type_sz(dst.type) == 4);
 
-   assert(!src.in_range(dst, 2 * components * bld.dispatch_width() / 8));
+   assert(!regions_overlap(dst, 2 * dst.component_size(bld.dispatch_width()),
+                           src, src.component_size(bld.dispatch_width())));
 
    for (unsigned i = 0; i < components; i++) {
       const fs_reg component_i = offset(src, bld, i);

Comments

On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
> Its only use left in the FS back-end should be using
> regions_overlap()
> instead to avoid getting a false negative result in cases where
> source
> and destination overlap but the former starts before the latter in
> the
> VGRF file.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 42ed131..b06606b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -4820,7 +4820,8 @@ shuffle_64bit_data_for_32bit_write(const
> fs_builder &bld,
>     assert(type_sz(src.type) == 8);
>     assert(type_sz(dst.type) == 4);
>  
> -   assert(!src.in_range(dst, 2 * components * bld.dispatch_width() /
> 8));
> +   assert(!regions_overlap(dst, 2 *
> dst.component_size(bld.dispatch_width()),
> +                           src,
> src.component_size(bld.dispatch_width())));

I think you need to multiply the sizes of the regions for dst and src
by the number of components being shuffled, like in the original
assert, otherwise we would only be testing if there is overlap for the
first component.

>     for (unsigned i = 0; i < components; i++) {
>        const fs_reg component_i = offset(src, bld, i);
Iago Toral <itoral@igalia.com> writes:

> On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote:
>> Its only use left in the FS back-end should be using
>> regions_overlap()
>> instead to avoid getting a false negative result in cases where
>> source
>> and destination overlap but the former starts before the latter in
>> the
>> VGRF file.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index 42ed131..b06606b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -4820,7 +4820,8 @@ shuffle_64bit_data_for_32bit_write(const
>> fs_builder &bld,
>>     assert(type_sz(src.type) == 8);
>>     assert(type_sz(dst.type) == 4);
>>  
>> -   assert(!src.in_range(dst, 2 * components * bld.dispatch_width() /
>> 8));
>> +   assert(!regions_overlap(dst, 2 *
>> dst.component_size(bld.dispatch_width()),
>> +                           src,
>> src.component_size(bld.dispatch_width())));
>
> I think you need to multiply the sizes of the regions for dst and src
> by the number of components being shuffled, like in the original
> assert, otherwise we would only be testing if there is overlap for the
> first component.
>
Good point, thanks!

>>     for (unsigned i = 0; i < components; i++) {
>>        const fs_reg component_i = offset(src, bld, i);