nir/vtn/compiler: first batch of compute support

Submitted by Jason Ekstrand on Feb. 28, 2018, 7:51 p.m.

Details

Reviewer None
Submitted Feb. 28, 2018, 7:51 p.m.
Last Updated March 25, 2018, 3:01 a.m.
Revision 5

Cover Letter(s)

Revision 1
      This is by no means everything needed for clover/OpenCL, but I took
a bit of time this morning to extract some parts of our growing stack
of patches which were plausibly mergeable (or at least not complete
hacks), with the idea that review could start in parallel with further
clover/compute hacking.

In particular, it could be useful to at least (once reviewed) merge the
new nir intrinsics, since those would unblock landing the corresponding
nvir and ir3 patches which implement the new intrinsics.

The new intrinsics are:
 - load_param - for compute kernels the entrypoint can have parameters
   and we need a way to load them
 - load/store_global - for dereferencing pointers

In the case of pointers, I punted on dealing with local vs global
pointers.  AFAIU with amd/nv GPUs local memory can be mapped into a
single address space alongside global pointers, so they might not really
have to care about pointers into local memory.  For ir3, there *are*
different instructions for local vs global.  One option could be to
emulate a flat address space using the high bits of a pointer (at least
that would work in 64b mode, not sure about 32b mode).  The other option is
"fat" pointers (i.e. storing a pointer in a vec2 or two registers, where
one value indicates the pointer type), which at least works OK as long
as everything is in SSA.  I'm not sure how that would work if you had
to store a pointer value to memory, though.  There are bigger fires to
fight, so I punted on that for now.
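
To make the two options above concrete, here is a rough C sketch of what
each encoding could look like.  None of this is from the patches: the
enum, the choice of bit 63 as the tag bit, and all function and struct
names are hypothetical, and it assumes real addresses never use the top
bit in 64b mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address-space tags; these are not Mesa or NIR names. */
enum ptr_space { PTR_SPACE_GLOBAL = 0, PTR_SPACE_LOCAL = 1 };

/* Option 1: emulate a flat address space by keeping the tag in the
 * high bits of a 64-bit pointer.  Assumes bit 63 is never set in a
 * real address. */
#define PTR_TAG_SHIFT 63
#define PTR_ADDR_MASK ((UINT64_C(1) << PTR_TAG_SHIFT) - 1)

static inline uint64_t flat_ptr_make(enum ptr_space space, uint64_t addr)
{
   /* The address must not collide with the tag bit. */
   assert((addr & ~PTR_ADDR_MASK) == 0);
   return ((uint64_t)space << PTR_TAG_SHIFT) | addr;
}

static inline enum ptr_space flat_ptr_space(uint64_t p)
{
   return (enum ptr_space)(p >> PTR_TAG_SHIFT);
}

static inline uint64_t flat_ptr_addr(uint64_t p)
{
   return p & PTR_ADDR_MASK;
}

/* Option 2: a "fat" pointer, i.e. two values travelling together
 * (think vec2 or a register pair), one of which names the space. */
struct fat_ptr {
   uint32_t space;  /* one of enum ptr_space */
   uint32_t offset; /* address or offset within that space */
};

static inline bool fat_ptr_is_local(struct fat_ptr p)
{
   return p.space == PTR_SPACE_LOCAL;
}
```

With either encoding, a backend with separate local/global instructions
would branch on the tag (flat_ptr_space() or fat_ptr_is_local()) to pick
which load or store to emit.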

The last patch, which adds load/store_global support to vtn, is just RFC.
Mostly looking for suggestions on how best to handle "logical" pointers
(ie. gfx/vk shaders) vs "physical" pointers (ie. compute/kernel).

Karol Herbst (5):
  nir: allow 64 bit shifts
  nir: add load_param
  nir: add load/store_global intrinsics
  nir/vtn: implement BuiltInGlobalSize
  compiler: int8/uint8 support

Rob Clark (8):
  nir: kernel entrypoints can have arguments
  nir: expose 'C' wrappers for std430 size/alignment
  nir/vtn: implement SpvOpCopyMemorySized
  nir/vtn: handle WorkGroupSize for kernels
  nir/vtn: add OpLifetime*
  nir/vtn: add OpConvertPtrToU
  nir/vtn: print extension name in fail msg
  RFC: nir/vtn: "raw" pointer support

 src/compiler/builtin_type_macros.h              |  10 +
 src/compiler/glsl/ast_to_hir.cpp                |   2 +
 src/compiler/glsl/ir_clone.cpp                  |   2 +
 src/compiler/glsl/link_uniform_initializers.cpp |   2 +
 src/compiler/glsl_types.cpp                     |  33 ++++
 src/compiler/glsl_types.h                       |   4 +
 src/compiler/nir/nir.h                          |   5 +-
 src/compiler/nir/nir_intrinsics.h               |  10 +-
 src/compiler/nir/nir_lower_io.c                 |  13 +-
 src/compiler/nir/nir_lower_system_values.c      |   8 +
 src/compiler/nir/nir_opcodes.py                 |   6 +-
 src/compiler/nir/nir_search.c                   |   2 +
 src/compiler/nir_types.cpp                      |  24 +++
 src/compiler/nir_types.h                        |  10 +
 src/compiler/shader_enums.c                     |   1 +
 src/compiler/shader_enums.h                     |   2 +
 src/compiler/spirv/spirv_to_nir.c               |  83 +++++++-
 src/compiler/spirv/vtn_alu.c                    |   1 +
 src/compiler/spirv/vtn_private.h                |  21 +-
 src/compiler/spirv/vtn_variables.c              | 247 +++++++++++++++++++++---
 src/intel/compiler/brw_fs.cpp                   |   3 +
 src/intel/compiler/brw_shader.cpp               |   4 +
 src/intel/compiler/brw_vec4_visitor.cpp         |   2 +
 src/mesa/program/ir_to_mesa.cpp                 |   4 +
 src/mesa/state_tracker/st_glsl_types.cpp        |   2 +
 25 files changed, 459 insertions(+), 42 deletions(-)
    
Revision 2
      first series here:
https://lists.freedesktop.org/archives/mesa-dev/2018-February/187275.html

change summary since v1:
 * removed 64 bit shift patch
 * reworked new intrinsics
 * fixed some int8/uint8 issues
 * add handling for CL types
 * add support for lowering loading kernel args in nir_lower_io

The biggest addition since the last series is the code for lowering load_vars
of kernel arguments. For this we have to calculate the size and alignment of
types correctly according to C rules. With that it is possible to lower a
load_var into a load_kernel_param with the correct offset as its source.
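
As a rough illustration of the offset calculation described above, here is
a minimal C sketch that lays out arguments using the usual C rule (align
the running offset up to each argument's alignment, then advance by its
size).  The struct and function names are hypothetical and not taken from
the patches, and whether the real kernel argument buffer is laid out
exactly like this is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one kernel argument type. */
struct arg_type {
   size_t size;  /* size in bytes */
   size_t align; /* alignment in bytes, must be a power of two */
};

/* Round offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (offset + align - 1) & ~(align - 1);
}

/* Writes the byte offset of each argument into offsets[] and returns
 * the number of bytes used by all arguments (without tail padding). */
static size_t layout_kernel_args(const struct arg_type *args, size_t count,
                                 size_t *offsets)
{
   size_t offset = 0;
   for (size_t i = 0; i < count; i++) {
      offset = align_up(offset, args[i].align);
      offsets[i] = offset;
      offset += args[i].size;
   }
   return offset;
}
```

Each resulting offsets[i] is the kind of value that would end up as the
offset source of the load_kernel_param replacing the load_var of
argument i.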

Rob still wants to rework his pointer support patch afaik.

One of the biggest problems we currently have is handling of pointers into
Function Memory to load struct members.

Karol Herbst (9):
  compiler: int8/uint8 support
  vtn: handle SpvExecutionModelKernel
  nir: add load/store_global intrinsics
  nir/vtn: implement BuiltInGlobalSize
  glsl: add packed for struct types
  glsl: add glsl_base_get_byte_size
  RFC glsl: add cl_size and cl_alignment
  nir: add load_kernel_param
  RFC nir/lower_io: lower kernel entry param load_vars to
    load_kernel_param

Rob Clark (9):
  nir/vtn: Use imov where we might have 8 bit types
  nir: kernel entrypoints can have arguments
  nir/vtn: implement SpvOpCopyMemorySized
  nir/vtn: handle WorkGroupSize for kernels
  nir/vtn: add OpLifetime*
  nir/vtn: add OpConvertPtrToU
  nir/vtn: print extension name in fail msg
  nir: use load_local_group_size
  RFC: nir/vtn: "raw" pointer support

 src/compiler/builtin_type_macros.h              |  10 +
 src/compiler/glsl/ast_to_hir.cpp                |   2 +
 src/compiler/glsl/ir_clone.cpp                  |   2 +
 src/compiler/glsl/link_uniform_initializers.cpp |   2 +
 src/compiler/glsl_types.cpp                     |  98 +++++++++-
 src/compiler/glsl_types.h                       |  60 +++++-
 src/compiler/nir/nir.h                          |   5 +-
 src/compiler/nir/nir_intrinsics.h               |  10 +-
 src/compiler/nir/nir_lower_io.c                 |  39 +++-
 src/compiler/nir/nir_lower_system_values.c      |  31 ++-
 src/compiler/nir_types.cpp                      |  29 ++-
 src/compiler/nir_types.h                        |  35 +---
 src/compiler/shader_enums.c                     |   1 +
 src/compiler/shader_enums.h                     |   2 +
 src/compiler/spirv/spirv_to_nir.c               | 128 +++++++++---
 src/compiler/spirv/vtn_alu.c                    |   1 +
 src/compiler/spirv/vtn_private.h                |  28 ++-
 src/compiler/spirv/vtn_variables.c              | 249 +++++++++++++++++++++---
 src/intel/compiler/brw_fs.cpp                   |   3 +
 src/intel/compiler/brw_shader.cpp               |   4 +
 src/intel/compiler/brw_vec4_visitor.cpp         |   2 +
 src/mesa/program/ir_to_mesa.cpp                 |   4 +
 src/mesa/state_tracker/st_glsl_types.cpp        |   2 +
 23 files changed, 650 insertions(+), 97 deletions(-)
    
Revision 3
      second series here:
https://lists.freedesktop.org/archives/mesa-dev/2018-March/188218.html

The main difference from the last series is that I tried to focus on the real
core parts we need to get basic OpenCL support in spirv_to_nir, so that we can
run more or less complex examples.

There are some important core NIR changes and somebody should take a closer
look at those.

Karol Herbst (12):
  nir: add load/store_global intrinsics
  vtn: handle SpvExecutionModelKernel
  glsl: add packed for struct types
  glsl: add glsl_base_get_byte_size
  RFC glsl: add cl_size and cl_alignment
  RFC: nir/vtn: handle constant builtins from kernels
  nir/vtn: pointers can point to cross_workgroup or local memory as well
  nir: specify bit_size when loading system values
  nir/vtn/opencl: support fma
  nir: add load_kernel_param
  RFC nir/lower_io: lower kernel entry param load_vars to
    load_kernel_param
  RFC: nir/vtn: member in struct deref

Rob Clark (7):
  RFC: nir/vtn: "raw" pointer support
  nir/vtn: print extension name in fail msg
  nir/vtn: import OpenCL.std.h
  nir/vtn: initial OpenCL.std extension
  nir/vtn: Handle OpInBoundsPtrAccessChain
  nir: use load_local_group_size
  nir: kernel entrypoints can have arguments

 src/compiler/glsl_types.cpp                      |  65 ++++-
 src/compiler/glsl_types.h                        |  56 +++-
 src/compiler/nir/meson.build                     |   1 +
 src/compiler/nir/nir.h                           |   1 -
 src/compiler/nir/nir_builder.h                   |  10 +-
 src/compiler/nir/nir_intrinsics.h                |   8 +-
 src/compiler/nir/nir_lower_alpha_test.c          |   2 +-
 src/compiler/nir/nir_lower_clip.c                |   3 +-
 src/compiler/nir/nir_lower_io.c                  |  39 ++-
 src/compiler/nir/nir_lower_subgroups.c           |   8 +-
 src/compiler/nir/nir_lower_system_values.c       |  48 ++--
 src/compiler/nir/nir_lower_two_sided_color.c     |   2 +-
 src/compiler/nir/nir_lower_wpos_center.c         |   2 +-
 src/compiler/nir/nir_opcodes.py                  |   3 +-
 src/compiler/nir_types.cpp                       |  17 +-
 src/compiler/nir_types.h                         |  37 +--
 src/compiler/spirv/OpenCL.std.h                  | 211 +++++++++++++++
 src/compiler/spirv/spirv_to_nir.c                | 106 ++++++--
 src/compiler/spirv/vtn_opencl.c                  | 268 +++++++++++++++++++
 src/compiler/spirv/vtn_private.h                 |  35 ++-
 src/compiler/spirv/vtn_subgroup.c                |   2 +-
 src/compiler/spirv/vtn_variables.c               | 313 +++++++++++++++++++----
 src/gallium/auxiliary/nir/tgsi_to_nir.c          |   3 +-
 src/intel/blorp/blorp_blit.c                     |   2 +-
 src/intel/blorp/blorp_clear.c                    |   2 +-
 src/intel/compiler/brw_nir_lower_cs_intrinsics.c |   6 +-
 src/mesa/drivers/dri/i965/brw_tcs.c              |   2 +-
 27 files changed, 1099 insertions(+), 153 deletions(-)
 create mode 100644 src/compiler/spirv/OpenCL.std.h
 create mode 100644 src/compiler/spirv/vtn_opencl.c
    

Revisions