i965/ir: Switch representation of register offsets and sizes to byte units.

Submitted by Francisco Jerez on Sept. 8, 2016, 1:48 a.m.

Details

Reviewer None
Submitted Sept. 8, 2016, 1:48 a.m.
Last Updated Sept. 8, 2016, 1:52 a.m.
Revision 1

Cover Letter(s)

Revision 1
      This series reworks the representation of register region offsets in
the i965 IR to be universally byte-based instead of the rather awkward
split between reg_offset and subreg_offset we have in the FS back-end
right now, or the reg_offset field currently used in the VEC4 IR which
doesn't allow better granularity than 32B.  The most immediate
motivation is to enable sub-GRF offsets in the VEC4 back-end, which
will be useful for various kinds of lowering and instruction splitting
required for FP64 support on VEC4 platforms.

Patches 01-11 take care of scaling the regs_written and regs_read
instruction methods on both back-ends and the reg_offset register
field of VEC4 IR registers.  The fs_reg::reg_offset and
::subreg_offset fields are also unified into a single register field.
Because this part of the series is rather bulky I've tried to keep the
changes as obvious and functionally equivalent as possible at the cost
of introducing not particularly clever code in some cases that could
be simplified with some knowledge of the context.  Patches 31-46 make
a second pass through the code touched in the first part of the series
in order to get rid of an amount of cruft.

Patches 12-30 address an amount of bugs that became obvious during the
conversion to byte units, some of them seem worrying enough that it
might make sense to back-port them to stable releases.

Patches 47-57 go through the VEC4 back-end and address a number of
issues that would arise in existing optimization passes with
non-GRF-aligned regions, which will be useful for FP64 support.  It's
likely not complete and the handling of sub-GRF offsets doesn't
attempt to be nearly as clever as the FS back-end, but they should
make a substantial improvement over the current situation.

[PATCH 01/57] i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes.
[PATCH 02/57] i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in bytes.
[PATCH 03/57] i965/ir: Remove backend_reg::reg_offset.
[PATCH 04/57] i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes.
[PATCH 05/57] i965/fs: Add wrapper functions for fs_inst::regs_read and ::regs_written.
[PATCH 06/57] i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_written.
[PATCH 07/57] i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes.
[PATCH 08/57] i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in bytes.
[PATCH 09/57] i965/ir: Drop backend_instruction::regs_written field.
[PATCH 10/57] i965/fs: Replace fs_inst::regs_read with ::size_read using byte units.
[PATCH 11/57] i965/vec4: Replace vec4_instruction::regs_read with ::size_read using byte units.
[PATCH 12/57] i965/fs: Return more accurate read size from fs_inst::size_read for IMM and UNIFORM files.
[PATCH 13/57] i965/fs: Return more accurate read size for LINTERP from fs_inst::size_read.
[PATCH 14/57] i965/fs: Handle arbitrary offsets in brw_reg_from_fs_reg for MRF/VGRF registers.
[PATCH 15/57] i965/fs: Handle fixed HW GRF subnr in reg_offset().
[PATCH 16/57] i965/fs: Take into account trailing padding in regs_written() and regs_read().
[PATCH 17/57] i965/fs: Take into account misalignment in regs_written() and regs_read().
[PATCH 18/57] i965/vec4: Take into account misalignment in regs_written() and regs_read().
[PATCH 19/57] i965/fs: Don't consider LOAD_PAYLOAD with sub-GRF offset to behave like a raw copy.
[PATCH 20/57] i965/fs: Don't consider LOAD_PAYLOAD with stride > 1 source to behave like a raw copy.
[PATCH 21/57] i965/fs: Compare full register offsets in cmod propagation pass.
[PATCH 22/57] i965/fs: Fix can_propagate_from() source/destination overlap check.
[PATCH 23/57] i965/fs: Fix LOAD_PAYLOAD handling in register coalesce is_nop_mov().
[PATCH 24/57] i965/fs: Drop fs_inst::overwrites_reg() in favor of regions_overlap().
[PATCH 25/57] i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap().
[PATCH 26/57] i965/vec4: Port regions_overlap() to the vec4 IR.
[PATCH 27/57] i965/vec4: Drop backend_reg::in_range() in favor of regions_overlap().
[PATCH 28/57] i965/fs: Take into account copy register offset during compute-to-mrf.
[PATCH 29/57] i965/fs: Fix bogus sub-MRF offset calculation in compute-to-mrf.
[PATCH 30/57] i965/fs: Switch mask_relative_to() used in compute-to-mrf to byte units.
[PATCH 31/57] i965/fs: Fix signedness of the return value of fs_inst::size_read().
[PATCH 32/57] i965/fs: Simplify byte_offset().
[PATCH 33/57] i965/fs: Simplify get_fpu_lowered_simd_width() by using inequalities instead of rounding.
[PATCH 34/57] i965/fs: Simplify and fix buggy stride/offset calculations using subscript().
[PATCH 35/57] i965/fs: Simplify result_live calculation in dead_code_eliminate().
[PATCH 36/57] i965/fs: Simplify a bunch of fs_inst::size_written calculations by using component_size().
[PATCH 37/57] i965/fs: Simplify copy propagation LOAD_PAYLOAD ACP setup.
[PATCH 38/57] i965/fs: Change region_contained_in() to use byte units.
[PATCH 39/57] i965/fs: Move region_contained_in to the IR header and fix for non-VGRF files.
[PATCH 40/57] i965/fs: Use region_contained_in() in compute-to-mrf coalescing pass.
[PATCH 41/57] i965/fs: Get rid of fs_inst::set_smear().
[PATCH 42/57] i965/fs: Misc simplification.
[PATCH 43/57] i965/fs: Print fs_reg::offset field consistently for all register files.
[PATCH 44/57] i965/vec4: Print src/dst_reg::offset field consistently for all register files.
[PATCH 45/57] i965/ir: Don't print ARF subnr values twice.
[PATCH 46/57] i965/ir: Update several stale comments.
[PATCH 47/57] i965/vec4: Simplify src/dst_reg to brw_reg conversion by using byte_offset().
[PATCH 48/57] i965/vec4: Change opt_vector_float to keep track of the last offset seen in bytes.
[PATCH 49/57] i965/vec4: Check that the write offsets match when setting dependency controls.
[PATCH 50/57] i965/vec4: Compare full register offsets in opt_register_coalesce nop move check.
[PATCH 51/57] i965/vec4: Don't coalesce registers with overlapping writes not matching the MOV source.
[PATCH 52/57] i965/vec4: Assign correct destination offset to rewritten instruction in register coalesce.
[PATCH 53/57] i965/vec4: Compare full register offsets in cmod propagation.
[PATCH 54/57] i965/vec4: Fix copy propagation for non-register-aligned regions.
[PATCH 55/57] i965/vec4: Don't spill non-GRF-aligned register regions.
[PATCH 56/57] i965/vec4: Assert that ATTR regions are register-aligned.
[PATCH 57/57] i965/vec4: Assert that pull constant load offsets are 16B-aligned.
    

Revisions

Patches download mbox

# Name Submitter State A F R T
[Mesa-dev,01/57] i965/fs: Replace fs_reg::reg_offset with fs_reg::offset expressed in bytes. Francisco Jerez Accepted
[Mesa-dev,02/57] i965/vec4: Replace dst/src_reg::reg_offset with dst/src_reg::offset expressed in... Francisco Jerez New
[Mesa-dev,03/57] i965/ir: Remove backend_reg::reg_offset. Francisco Jerez Accepted
[Mesa-dev,04/57] i965/fs: Replace fs_reg::subreg_offset with fs_reg::offset expressed in bytes. Francisco Jerez Accepted
[Mesa-dev,05/57] i965/fs: Add wrapper functions for fs_inst::regs_read and ::regs_written. Francisco Jerez Accepted
[Mesa-dev,06/57] i965/vec4: Add wrapper functions for vec4_instruction::regs_read and ::regs_writ... Francisco Jerez Accepted
[Mesa-dev,07/57] i965/fs: Replace fs_inst::regs_written with ::size_written field in bytes. Francisco Jerez New
[Mesa-dev,08/57] i965/vec4: Replace vec4_instruction::regs_written with ::size_written field in b... Francisco Jerez Accepted
[Mesa-dev,09/57] i965/ir: Drop backend_instruction::regs_written field. Francisco Jerez Accepted
[Mesa-dev,10/57] i965/fs: Replace fs_inst::regs_read with ::size_read using byte units. Francisco Jerez Accepted
[Mesa-dev,11/57] i965/vec4: Replace vec4_instruction::regs_read with ::size_read using byte units. Francisco Jerez Accepted
[Mesa-dev,12/57] i965/fs: Return more accurate read size from fs_inst::size_read for IMM and UNIF... Francisco Jerez Accepted
[Mesa-dev,13/57] i965/fs: Return more accurate read size for LINTERP from fs_inst::size_read. Francisco Jerez Accepted
[Mesa-dev,14/57] i965/fs: Handle arbitrary offsets in brw_reg_from_fs_reg for MRF/VGRF registers. Francisco Jerez Accepted
[Mesa-dev,15/57] i965/fs: Handle fixed HW GRF subnr in reg_offset(). Francisco Jerez Accepted
[Mesa-dev,16/57] i965/fs: Take into account trailing padding in regs_written() and regs_read(). Francisco Jerez Accepted
[Mesa-dev,17/57] i965/fs: Take into account misalignment in regs_written() and regs_read(). Francisco Jerez Accepted
[Mesa-dev,18/57] i965/vec4: Take into account misalignment in regs_written() and regs_read(). Francisco Jerez Accepted
[Mesa-dev,19/57] i965/fs: Don't consider LOAD_PAYLOAD with sub-GRF offset to behave like a raw copy. Francisco Jerez Accepted
[Mesa-dev,20/57] i965/fs: Don't consider LOAD_PAYLOAD with stride > 1 source to behave like a raw... Francisco Jerez Accepted
[Mesa-dev,21/57] i965/fs: Compare full register offsets in cmod propagation pass. Francisco Jerez Accepted
[Mesa-dev,22/57] i965/fs: Fix can_propagate_from() source/destination overlap check. Francisco Jerez Accepted
[Mesa-dev,23/57] i965/fs: Fix LOAD_PAYLOAD handling in register coalesce is_nop_mov(). Francisco Jerez Accepted
[Mesa-dev,24/57] i965/fs: Drop fs_inst::overwrites_reg() in favor of regions_overlap(). Francisco Jerez Accepted
[Mesa-dev,25/57] i965/fs: Stop using fs_reg::in_range() in favor of regions_overlap(). Francisco Jerez New
[Mesa-dev,26/57] i965/vec4: Port regions_overlap() to the vec4 IR. Francisco Jerez Accepted
[Mesa-dev,27/57] i965/vec4: Drop backend_reg::in_range() in favor of regions_overlap(). Francisco Jerez Accepted
[Mesa-dev,28/57] i965/fs: Take into account copy register offset during compute-to-mrf. Francisco Jerez Accepted
[Mesa-dev,29/57] i965/fs: Fix bogus sub-MRF offset calculation in compute-to-mrf. Francisco Jerez Accepted
[Mesa-dev,30/57] i965/fs: Switch mask_relative_to() used in compute-to-mrf to byte units. Francisco Jerez Accepted
[Mesa-dev,31/57] i965/fs: Fix signedness of the return value of fs_inst::size_read(). Francisco Jerez Accepted
[Mesa-dev,32/57] i965/fs: Simplify byte_offset(). Francisco Jerez Accepted
[Mesa-dev,33/57] i965/fs: Simplify get_fpu_lowered_simd_width() by using inequalities instead of ... Francisco Jerez Accepted
[Mesa-dev,34/57] i965/fs: Simplify and fix buggy stride/offset calculations using subscript(). Francisco Jerez Accepted
[Mesa-dev,35/57] i965/fs: Simplify result_live calculation in dead_code_eliminate(). Francisco Jerez Accepted
[Mesa-dev,36/57] i965/fs: Simplify a bunch of fs_inst::size_written calculations by using compone... Francisco Jerez Accepted
[Mesa-dev,37/57] i965/fs: Simplify copy propagation LOAD_PAYLOAD ACP setup. Francisco Jerez Accepted
[Mesa-dev,38/57] i965/fs: Change region_contained_in() to use byte units. Francisco Jerez Accepted
[Mesa-dev,39/57] i965/fs: Move region_contained_in to the IR header and fix for non-VGRF files. Francisco Jerez Accepted
[Mesa-dev,40/57] i965/fs: Use region_contained_in() in compute-to-mrf coalescing pass. Francisco Jerez Accepted
[Mesa-dev,41/57] i965/fs: Get rid of fs_inst::set_smear(). Francisco Jerez Accepted
[Mesa-dev,42/57] i965/fs: Misc simplification. Francisco Jerez Accepted
[Mesa-dev,43/57] i965/fs: Print fs_reg::offset field consistently for all register files. Francisco Jerez Accepted
[Mesa-dev,44/57] i965/vec4: Print src/dst_reg::offset field consistently for all register files. Francisco Jerez Accepted
[Mesa-dev,45/57] i965/ir: Don't print ARF subnr values twice. Francisco Jerez Accepted
[Mesa-dev,46/57] i965/ir: Update several stale comments. Francisco Jerez Accepted
[Mesa-dev,47/57] i965/vec4: Simplify src/dst_reg to brw_reg conversion by using byte_offset(). Francisco Jerez New
[Mesa-dev,48/57] i965/vec4: Change opt_vector_float to keep track of the last offset seen in bytes. Francisco Jerez Accepted
[Mesa-dev,49/57] i965/vec4: Check that the write offsets match when setting dependency controls. Francisco Jerez Accepted
[Mesa-dev,50/57] i965/vec4: Compare full register offsets in opt_register_coalesce nop move check. Francisco Jerez Accepted
[Mesa-dev,51/57] i965/vec4: Don't coalesce registers with overlapping writes not matching the MOV... Francisco Jerez Accepted
[Mesa-dev,52/57] i965/vec4: Assign correct destination offset to rewritten instruction in registe... Francisco Jerez Accepted
[Mesa-dev,53/57] i965/vec4: Compare full register offsets in cmod propagation. Francisco Jerez Accepted
[Mesa-dev,54/57] i965/vec4: Fix copy propagation for non-register-aligned regions. Francisco Jerez Accepted
[Mesa-dev,55/57] i965/vec4: Don't spill non-GRF-aligned register regions. Francisco Jerez Accepted
[Mesa-dev,56/57] i965/vec4: Assert that ATTR regions are register-aligned. Francisco Jerez Accepted
[Mesa-dev,57/57] i965/vec4: Assert that pull constant load offsets are 16B-aligned. Francisco Jerez Accepted