[mesa,1/6] tgsi_build: Fix return of uninitialized memory in tgsi_*_instruction_memory

Submitted by Hans de Goede on March 16, 2016, 9:23 a.m.

Details

Message ID 1458120239-27659-1-git-send-email-hdegoede@redhat.com
State New
Series "Series without cover letter" ( rev: 1 ) in Nouveau


Commit Message

Hans de Goede March 16, 2016, 9:23 a.m.
tgsi_default_instruction_memory / tgsi_build_instruction_memory were
returning uninitialized memory for tgsi_instruction_memory.Texture and
tgsi_instruction_memory.Format. Note 0 means not set, and thus is a
correct default initializer for these.

Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory")
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 src/gallium/auxiliary/tgsi/tgsi_build.c | 4 ++++
 1 file changed, 4 insertions(+)


diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
index a3e659b..7e30bb6 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_build.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
@@ -781,6 +781,8 @@  tgsi_default_instruction_memory( void )
    struct tgsi_instruction_memory instruction_memory;
 
    instruction_memory.Qualifier = 0;
+   instruction_memory.Texture = 0;
+   instruction_memory.Format = 0;
    instruction_memory.Padding = 0;
 
    return instruction_memory;
@@ -796,6 +798,8 @@  tgsi_build_instruction_memory(
    struct tgsi_instruction_memory instruction_memory;
 
    instruction_memory.Qualifier = qualifier;
+   instruction_memory.Texture = 0;
+   instruction_memory.Format = 0;
    instruction_memory.Padding = 0;
    instruction->Memory = 1;
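
The pattern the patch restores can be sketched in isolation. The struct below is a hypothetical stand-in for struct tgsi_instruction_memory (the `demo_` names and the field widths are assumptions for illustration, not copied from Mesa). It shows two ways to make sure no field of a by-value return is left indeterminate: assigning every field explicitly, as the patch does, or using a designated initializer, which zeroes every field that is not named.

```c
/* Hypothetical stand-in for struct tgsi_instruction_memory; the field
 * widths are assumed for illustration only. */
struct demo_instruction_memory {
   unsigned Qualifier : 3;
   unsigned Texture   : 8;
   unsigned Format    : 10;
   unsigned Padding   : 11;
};

/* Style used by the patch: every field is assigned before the struct is
 * returned by value, so the caller never sees indeterminate bits.
 * 0 means "not set" for Texture and Format. */
static struct demo_instruction_memory
demo_default_instruction_memory(void)
{
   struct demo_instruction_memory m;

   m.Qualifier = 0;
   m.Texture = 0;
   m.Format = 0;
   m.Padding = 0;

   return m;
}

/* Alternative style: a designated initializer sets Qualifier and
 * zero-initializes all fields that are not named. */
static struct demo_instruction_memory
demo_build_instruction_memory(unsigned qualifier)
{
   struct demo_instruction_memory m = { .Qualifier = qualifier };

   return m;
}
```

The explicit per-field style matches the surrounding code in tgsi_build.c; the initializer style is only mentioned as a way to avoid the same class of bug when a field is added later.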
 

Comments

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Marek

On Wed, Mar 16, 2016 at 10:23 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> tgsi_default_instruction_memory / tgsi_build_instruction_memory were
> returning uninitialized memory for tgsi_instruction_memory.Texture and
> tgsi_instruction_memory.Format. Note 0 means not set, and thus is a
> correct default initializer for these.
>
> Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory")
> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  src/gallium/auxiliary/tgsi/tgsi_build.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index a3e659b..7e30bb6 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -781,6 +781,8 @@ tgsi_default_instruction_memory( void )
>     struct tgsi_instruction_memory instruction_memory;
>
>     instruction_memory.Qualifier = 0;
> +   instruction_memory.Texture = 0;
> +   instruction_memory.Format = 0;
>     instruction_memory.Padding = 0;
>
>     return instruction_memory;
> @@ -796,6 +798,8 @@ tgsi_build_instruction_memory(
>     struct tgsi_instruction_memory instruction_memory;
>
>     instruction_memory.Qualifier = qualifier;
> +   instruction_memory.Texture = 0;
> +   instruction_memory.Format = 0;
>     instruction_memory.Padding = 0;
>     instruction->Memory = 1;
>
> --
> 2.7.2
>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

On 03/16/2016 10:23 AM, Hans de Goede wrote:
> Use the dst temp variable, previously used only in the TGSI_FILE_OUTPUT
> case, everywhere. This makes the code somewhat easier to read
> and helps avoid going over 80 chars with upcoming changes.
>
> This also brings the dst handling more in line with the src
> handling.
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>   src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 8a1a426..1e91ad3 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -1261,9 +1261,9 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
>         info->numBarriers = 1;
>
>      if (insn.dstCount()) {
> -      if (insn.getDst(0).getFile() == TGSI_FILE_OUTPUT) {
> -         Instruction::DstRegister dst = insn.getDst(0);
> +      Instruction::DstRegister dst = insn.getDst(0);
>
> +      if (dst.getFile() == TGSI_FILE_OUTPUT) {
>            if (dst.isIndirect(0))
>               for (unsigned i = 0; i < info->numOutputs; ++i)
>                  info->out[i].mask = 0xf;
> @@ -1280,11 +1280,11 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
>            if (isEdgeFlagPassthrough(insn))
>               info->io.edgeFlagIn = insn.getSrc(0).getIndex(0);
>         } else
> -      if (insn.getDst(0).getFile() == TGSI_FILE_TEMPORARY) {
> -         if (insn.getDst(0).isIndirect(0))
> -            indirectTempArrays.insert(insn.getDst(0).getArrayId());
> +      if (dst.getFile() == TGSI_FILE_TEMPORARY) {
> +         if (dst.isIndirect(0))
> +            indirectTempArrays.insert(dst.getArrayId());
>         } else
> -      if (insn.getDst(0).getFile() == TGSI_FILE_BUFFER) {
> +      if (dst.getFile() == TGSI_FILE_BUFFER) {
>            info->io.globalAccess |= 0x2;
>         }
>      }
>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

On 03/16/2016 10:23 AM, Hans de Goede wrote:
> Make the store offset handling in CodeEmitterGK110::emitSTORE identical
> to the load offset handling in CodeEmitterGK110::emitLOAD.
>
> This is just a cleanup; it does not cause any functional changes.
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> index 0d7d95e..70f3c3f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> @@ -1655,10 +1655,8 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>         break;
>      }
>
> -   if (i->src(0).getFile() != FILE_MEMORY_GLOBAL)
> -      offset &= 0xffffff;
> -
>      if (code[0] & 0x2) {
> +      offset &= 0xffffff;
>         emitLoadStoreType(i->dType, 0x33);
>         if (i->src(0).getFile() == FILE_MEMORY_LOCAL)
>            emitCachingMode(i->cache, 0x2f);
>
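
A minimal sketch of the masking step touched by the patch quoted above, assuming the mask exists because one encoding only has room for a 24-bit offset. The helper name and the `uses_24bit_field` flag are hypothetical; in the patch, the corresponding condition is `code[0] & 0x2`. In the emitSTORE switch quoted in the later patches of this thread, the global case sets code[0] to 0 while the local and shared cases set bit 0x2, which is presumably why the old file check and the new bit check coincide.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 24 bits of a signed offset when
 * the selected encoding has a 24-bit offset field, and pass the full
 * 32 bits through otherwise. */
static uint32_t
demo_encode_offset(int32_t offset, int uses_24bit_field)
{
   uint32_t bits = (uint32_t)offset;

   if (uses_24bit_field)
      bits &= 0xffffff;

   return bits;
}
```

Without the mask, the sign-extension bits of a negative offset would spill into the bits above the 24-bit field.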
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

On 03/16/2016 10:23 AM, Hans de Goede wrote:
> FILE_MEMORY_GLOBAL is currently only used for buffer handling, as we
> do not yet have (OpenCL) global memory support. Global memory support
> actually requires some different handling during lowering, so rename
> FILE_MEMORY_GLOBAL to FILE_MEMORY_BUFFER to reflect that the current
> code is for buffer handling. This will allow the later (re-)addition
> of FILE_MEMORY_GLOBAL for regular global memory.
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>   src/gallium/drivers/nouveau/codegen/nv50_ir.h                |  2 +-
>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp   | 10 +++++-----
>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp   |  6 +++---
>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp    | 10 +++++-----
>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp    | 12 ++++++------
>   src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp    |  8 ++++----
>   .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp        | 10 +++++-----
>   src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp     |  8 ++++----
>   src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp        |  2 +-
>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp  |  6 +++---
>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp  |  2 +-
>   11 files changed, 38 insertions(+), 38 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index 7b0eb2f..fdc2195 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -332,7 +332,7 @@ enum DataFile
>      FILE_MEMORY_CONST,
>      FILE_SHADER_INPUT,
>      FILE_SHADER_OUTPUT,
> -   FILE_MEMORY_GLOBAL,
> +   FILE_MEMORY_BUFFER,
>      FILE_MEMORY_SHARED,
>      FILE_MEMORY_LOCAL,
>      FILE_SYSTEM_VALUE,
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> index 70f3c3f..02a1101 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> @@ -1641,7 +1641,7 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>      int32_t offset = SDATA(i->src(0)).offset;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: code[1] = 0xe0000000; code[0] = 0x00000000; break;
> +   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
>      case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002; break;
>      case FILE_MEMORY_SHARED:
>         code[0] = 0x00000002;
> @@ -1678,7 +1678,7 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>
>      srcId(i->src(1), 2);
>      srcId(i->src(0).getIndirect(0), 10);
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL &&
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER &&
>          i->src(0).isIndirect(0) &&
>          i->getIndirect(0, 0)->reg.size == 8)
>         code[1] |= 1 << 23;
> @@ -1690,7 +1690,7 @@ CodeEmitterGK110::emitLOAD(const Instruction *i)
>      int32_t offset = SDATA(i->src(0)).offset;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: code[1] = 0xc0000000; code[0] = 0x00000000; break;
> +   case FILE_MEMORY_BUFFER: code[1] = 0xc0000000; code[0] = 0x00000000; break;
>      case FILE_MEMORY_LOCAL:  code[1] = 0x7a000000; code[0] = 0x00000002; break;
>      case FILE_MEMORY_SHARED:
>         code[0] = 0x00000002;
> @@ -1800,7 +1800,7 @@ CodeEmitterGK110::emitMOV(const Instruction *i)
>   static inline bool
>   uses64bitAddress(const Instruction *ldst)
>   {
> -   return ldst->src(0).getFile() == FILE_MEMORY_GLOBAL &&
> +   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>         ldst->src(0).isIndirect(0) &&
>         ldst->getIndirect(0, 0)->reg.size == 8;
>   }
> @@ -1862,7 +1862,7 @@ CodeEmitterGK110::emitCCTL(const Instruction *i)
>
>      code[0] = 0x00000002 | (i->subOp << 2);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>         code[1] = 0x7b000000;
>      } else {
>         code[1] = 0x7c000000;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index e079a57..27f287f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -2417,7 +2417,7 @@ void
>   CodeEmitterGM107::emitCCTL()
>   {
>      unsigned width;
> -   if (insn->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER) {
>         emitInsn(0xef600000);
>         width = 30;
>      } else {
> @@ -2988,7 +2988,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>         case FILE_MEMORY_CONST : emitLDC(); break;
>         case FILE_MEMORY_LOCAL : emitLDL(); break;
>         case FILE_MEMORY_SHARED: emitLDS(); break;
> -      case FILE_MEMORY_GLOBAL: emitLD(); break;
> +      case FILE_MEMORY_BUFFER: emitLD(); break;
>         default:
>            assert(!"invalid load");
>            emitNOP();
> @@ -2999,7 +2999,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>         switch (insn->src(0).getFile()) {
>         case FILE_MEMORY_LOCAL : emitSTL(); break;
>         case FILE_MEMORY_SHARED: emitSTS(); break;
> -      case FILE_MEMORY_GLOBAL: emitST(); break;
> +      case FILE_MEMORY_BUFFER: emitST(); break;
>         default:
>            assert(!"invalid load");
>            emitNOP();
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> index 682a19d..7476e21 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> @@ -662,7 +662,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>         code[0] = 0xd0000001;
>         code[1] = 0x40000000;
>         break;
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>         code[1] = 0x80000000;
>         break;
> @@ -671,7 +671,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>         break;
>      }
>      if (sf == FILE_MEMORY_LOCAL ||
> -       sf == FILE_MEMORY_GLOBAL)
> +       sf == FILE_MEMORY_BUFFER)
>         emitLoadStoreSizeLG(i->sType, 21 + 32);
>
>      setDst(i, 0);
> @@ -679,7 +679,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>      emitFlagsRd(i);
>      emitFlagsWr(i);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>         srcId(*i->src(0).getIndirect(0), 9);
>      } else {
>         setAReg16(i, 0);
> @@ -699,7 +699,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>         code[1] = 0x80c00000;
>         srcId(i->src(1), 32 + 14);
>         break;
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>         code[1] = 0xa0000000;
>         emitLoadStoreSizeLG(i->dType, 21 + 32);
> @@ -737,7 +737,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>         break;
>      }
>
> -   if (f == FILE_MEMORY_GLOBAL)
> +   if (f == FILE_MEMORY_BUFFER)
>         srcId(*i->src(0).getIndirect(0), 9);
>      else
>         setAReg16(i, 0);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index 8b9328b..6236659 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -280,7 +280,7 @@ void
>   CodeEmitterNVC0::setAddressByFile(const ValueRef& src)
>   {
>      switch (src.getFile()) {
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>         srcAddr32(src, 26, 0);
>         break;
>      case FILE_MEMORY_LOCAL:
> @@ -1768,7 +1768,7 @@ CodeEmitterNVC0::emitCachingMode(CacheMode c)
>   static inline bool
>   uses64bitAddress(const Instruction *ldst)
>   {
> -   return ldst->src(0).getFile() == FILE_MEMORY_GLOBAL &&
> +   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>         ldst->src(0).isIndirect(0) &&
>         ldst->getIndirect(0, 0)->reg.size == 8;
>   }
> @@ -1779,7 +1779,7 @@ CodeEmitterNVC0::emitSTORE(const Instruction *i)
>      uint32_t opc;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: opc = 0x90000000; break;
> +   case FILE_MEMORY_BUFFER: opc = 0x90000000; break;
>      case FILE_MEMORY_LOCAL:  opc = 0xc8000000; break;
>      case FILE_MEMORY_SHARED:
>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED) {
> @@ -1828,7 +1828,7 @@ CodeEmitterNVC0::emitLOAD(const Instruction *i)
>      code[0] = 0x00000005;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: opc = 0x80000000; break;
> +   case FILE_MEMORY_BUFFER: opc = 0x80000000; break;
>      case FILE_MEMORY_LOCAL:  opc = 0xc0000000; break;
>      case FILE_MEMORY_SHARED:
>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED) {
> @@ -2090,7 +2090,7 @@ CodeEmitterNVC0::emitCCTL(const Instruction *i)
>   {
>      code[0] = 0x00000005 | (i->subOp << 5);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>         code[1] = 0x98000000;
>         srcAddr32(i->src(0), 28, 2);
>      } else {
> @@ -3121,7 +3121,7 @@ SchedDataCalculator::checkRd(const Value *v, int cycle, int& delay) const
>      case FILE_MEMORY_LOCAL:
>      case FILE_MEMORY_CONST:
>      case FILE_MEMORY_SHARED:
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>      case FILE_SYSTEM_VALUE:
>         // TODO: any restrictions here ?
>         break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 1e91ad3..91879e4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -373,8 +373,8 @@ static nv50_ir::DataFile translateFile(uint file)
>      case TGSI_FILE_PREDICATE:       return nv50_ir::FILE_PREDICATE;
>      case TGSI_FILE_IMMEDIATE:       return nv50_ir::FILE_IMMEDIATE;
>      case TGSI_FILE_SYSTEM_VALUE:    return nv50_ir::FILE_SYSTEM_VALUE;
> -   case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_GLOBAL;
> -   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_GLOBAL;
> +   case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_BUFFER;
> +   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_BUFFER;
>      case TGSI_FILE_SAMPLER:
>      case TGSI_FILE_NULL:
>      default:
> @@ -2191,7 +2191,7 @@ Converter::getResourceBase(const int r)
>
>      switch (r) {
>      case TGSI_RESOURCE_GLOBAL:
> -      sym = new_Symbol(prog, nv50_ir::FILE_MEMORY_GLOBAL, 15);
> +      sym = new_Symbol(prog, nv50_ir::FILE_MEMORY_BUFFER, 15);
>         break;
>      case TGSI_RESOURCE_LOCAL:
>         assert(prog->getType() == Program::TYPE_COMPUTE);
> @@ -2209,7 +2209,7 @@ Converter::getResourceBase(const int r)
>         break;
>      default:
>         sym = new_Symbol(prog,
> -                       nv50_ir::FILE_MEMORY_GLOBAL, code->resources.at(r).slot);
> +                       nv50_ir::FILE_MEMORY_BUFFER, code->resources.at(r).slot);
>         break;
>      }
>      return sym;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index d0936d8..563d7c2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -1141,7 +1141,7 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>         handleSharedATOM(atom);
>         return true;
>      default:
> -      assert(atom->src(0).getFile() == FILE_MEMORY_GLOBAL);
> +      assert(atom->src(0).getFile() == FILE_MEMORY_BUFFER);
>         base = loadResInfo64(ind, atom->getSrc(0)->reg.fileIndex * 16);
>         assert(base->reg.size == 8);
>         if (ptr)
> @@ -1154,7 +1154,7 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>         bld.mkOp1v(OP_RDSV, TYPE_U32, bld.getScratch(), bld.mkSysVal(sv, 0));
>
>      atom->setSrc(0, cloneShallow(func, atom->getSrc(0)));
> -   atom->getSrc(0)->reg.file = FILE_MEMORY_GLOBAL;
> +   atom->getSrc(0)->reg.file = FILE_MEMORY_BUFFER;
>      if (ptr)
>         base = bld.mkOp2v(OP_ADD, TYPE_U32, base, base, ptr);
>      atom->setIndirect(0, 1, NULL);
> @@ -1571,7 +1571,7 @@ NVC0LoweringPass::handleSurfaceOpNVE4(TexInstruction *su)
>         Instruction *red = bld.mkOp(OP_ATOM, su->dType, su->getDef(0));
>         red->subOp = su->subOp;
>         if (!gMemBase)
> -         gMemBase = bld.mkSymbol(FILE_MEMORY_GLOBAL, 0, TYPE_U32, 0);
> +         gMemBase = bld.mkSymbol(FILE_MEMORY_BUFFER, 0, TYPE_U32, 0);
>         red->setSrc(0, gMemBase);
>         red->setSrc(1, su->getSrc(3));
>         if (su->subOp == NV50_IR_SUBOP_ATOM_CAS)
> @@ -1963,7 +1963,7 @@ NVC0LoweringPass::visit(Instruction *i)
>         } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) {
>            assert(prog->getType() == Program::TYPE_TESSELLATION_CONTROL);
>            i->op = OP_VFETCH;
> -      } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +      } else if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>            Value *ind = i->getIndirect(0, 1);
>            Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex * 16);
>            // XXX come up with a way not to do this for EVERY little access but
> @@ -1987,7 +1987,7 @@ NVC0LoweringPass::visit(Instruction *i)
>         break;
>      case OP_ATOM:
>      {
> -      const bool cctl = i->src(0).getFile() == FILE_MEMORY_GLOBAL;
> +      const bool cctl = i->src(0).getFile() == FILE_MEMORY_BUFFER;
>         handleATOM(i);
>         handleCasExch(i, cctl);
>      }
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 66e7b2e..4a96d04 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2580,14 +2580,14 @@ MemoryOpt::runOpt(BasicBlock *bb)
>                ldst->op == OP_BAR ||
>                ldst->op == OP_MEMBAR) {
>               purgeRecords(NULL, FILE_MEMORY_LOCAL);
> -            purgeRecords(NULL, FILE_MEMORY_GLOBAL);
> +            purgeRecords(NULL, FILE_MEMORY_BUFFER);
>               purgeRecords(NULL, FILE_MEMORY_SHARED);
>               purgeRecords(NULL, FILE_SHADER_OUTPUT);
>            } else
>            if (ldst->op == OP_ATOM || ldst->op == OP_CCTL) {
> -            if (ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +            if (ldst->src(0).getFile() == FILE_MEMORY_BUFFER) {
>                  purgeRecords(NULL, FILE_MEMORY_LOCAL);
> -               purgeRecords(NULL, FILE_MEMORY_GLOBAL);
> +               purgeRecords(NULL, FILE_MEMORY_BUFFER);
>                  purgeRecords(NULL, FILE_MEMORY_SHARED);
>               } else {
>                  purgeRecords(NULL, ldst->src(0).getFile());
> @@ -2607,7 +2607,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>            DataFile file = ldst->src(0).getFile();
>
>            // if ld l[]/g[] look for previous store to eliminate the reload
> -         if (file == FILE_MEMORY_GLOBAL || file == FILE_MEMORY_LOCAL) {
> +         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL) {
>               // TODO: shared memory ?
>               rec = findRecord(ldst, false, isAdjacent);
>               if (rec && !isAdjacent)
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> index cfa85ec..73ed753 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> @@ -455,7 +455,7 @@ int Symbol::print(char *buf, size_t size,
>      case FILE_MEMORY_CONST:  c = 'c'; break;
>      case FILE_SHADER_INPUT:  c = 'a'; break;
>      case FILE_SHADER_OUTPUT: c = 'o'; break;
> -   case FILE_MEMORY_GLOBAL: c = 'g'; break;
> +   case FILE_MEMORY_BUFFER: c = 'g'; break;
>      case FILE_MEMORY_SHARED: c = 's'; break;
>      case FILE_MEMORY_LOCAL:  c = 'l'; break;
>      default:
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> index 2c4d7f5..1cd45a2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> @@ -207,7 +207,7 @@ TargetNV50::getFileSize(DataFile file) const
>      case FILE_MEMORY_CONST:  return 65536;
>      case FILE_SHADER_INPUT:  return 0x200;
>      case FILE_SHADER_OUTPUT: return 0x200;
> -   case FILE_MEMORY_GLOBAL: return 0xffffffff;
> +   case FILE_MEMORY_BUFFER: return 0xffffffff;
>      case FILE_MEMORY_SHARED: return 16 << 10;
>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>      case FILE_SYSTEM_VALUE:  return 16;
> @@ -406,7 +406,7 @@ TargetNV50::isAccessSupported(DataFile file, DataType ty) const
>      if (ty == TYPE_B96 || ty == TYPE_NONE)
>         return false;
>      if (typeSizeof(ty) > 4)
> -      return (file == FILE_MEMORY_LOCAL) || (file == FILE_MEMORY_GLOBAL);
> +      return (file == FILE_MEMORY_LOCAL) || (file == FILE_MEMORY_BUFFER);
>      return true;
>   }
>
> @@ -508,7 +508,7 @@ int TargetNV50::getLatency(const Instruction *i) const
>      if (i->op == OP_LOAD) {
>         switch (i->src(0).getFile()) {
>         case FILE_MEMORY_LOCAL:
> -      case FILE_MEMORY_GLOBAL:
> +      case FILE_MEMORY_BUFFER:
>            return 100; // really 400 to 800
>         default:
>            return 22;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> index a03afa8..bda59a5 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> @@ -248,7 +248,7 @@ TargetNVC0::getFileSize(DataFile file) const
>      case FILE_MEMORY_CONST:  return 65536;
>      case FILE_SHADER_INPUT:  return 0x400;
>      case FILE_SHADER_OUTPUT: return 0x400;
> -   case FILE_MEMORY_GLOBAL: return 0xffffffff;
> +   case FILE_MEMORY_BUFFER: return 0xffffffff;
>      case FILE_MEMORY_SHARED: return 16 << 10;
>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>      case FILE_SYSTEM_VALUE:  return 32;
>
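
Why a rename like the one quoted above is behavior-neutral can be sketched with a trimmed-down, hypothetical mirror of the DataFile enum (the `demo_`/`DEMO_` names are assumptions for illustration). The enumerator keeps its position, so its numeric value is unchanged, and every switch that handled the old name handles the new one in the same way, including the 'g' prefix used when printing.

```c
/* Hypothetical, trimmed-down mirror of the DataFile enum after the rename. */
enum demo_data_file {
   DEMO_FILE_MEMORY_CONST,
   DEMO_FILE_MEMORY_BUFFER,   /* same slot the GLOBAL name occupied before */
   DEMO_FILE_MEMORY_SHARED,
   DEMO_FILE_MEMORY_LOCAL
};

/* Mirrors the idea of Symbol::print: the renamed file still prints as 'g'. */
static char
demo_file_prefix(enum demo_data_file f)
{
   switch (f) {
   case DEMO_FILE_MEMORY_CONST:  return 'c';
   case DEMO_FILE_MEMORY_BUFFER: return 'g';
   case DEMO_FILE_MEMORY_SHARED: return 's';
   case DEMO_FILE_MEMORY_LOCAL:  return 'l';
   }
   return '?';
}
```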
Could you please get rid of the cosmetic changes (e.g. the switch ones)?
They don't really improve readability, and in my opinion such changes
should eventually be done in a separate patch.

Other than that, this patch is:

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Yes, this probably won't work as-is for atomic operations, but the
lowering pass is already here, so it should be easy to make it work.

On 03/16/2016 10:23 AM, Hans de Goede wrote:
> Add support for OpenCL global memory buffers. Note that this has only
> been tested with regular loads and stores and likely needs more work
> for e.g. atomic ops.
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>   src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  1 +
>   .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 31 +++++++++++++++++-----
>   .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp |  5 +++-
>   .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp  | 10 ++++---
>   .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 26 +++++++++++++-----
>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 14 +++++++---
>   .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  5 +++-
>   .../drivers/nouveau/codegen/nv50_ir_print.cpp      |  1 +
>   .../nouveau/codegen/nv50_ir_target_nv50.cpp        |  1 +
>   .../nouveau/codegen/nv50_ir_target_nvc0.cpp        |  1 +
>   10 files changed, 74 insertions(+), 21 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index fdc2195..5141fc6 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -333,6 +333,7 @@ enum DataFile
>      FILE_SHADER_INPUT,
>      FILE_SHADER_OUTPUT,
>      FILE_MEMORY_BUFFER,
> +   FILE_MEMORY_GLOBAL,
>      FILE_MEMORY_SHARED,
>      FILE_MEMORY_LOCAL,
>      FILE_SYSTEM_VALUE,
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> index 02a1101..62f1598 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> @@ -1641,8 +1641,15 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>      int32_t offset = SDATA(i->src(0)).offset;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002; break;
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      code[0] = 0x00000000;
> +      code[1] = 0xe0000000;
> +      break;
> +   case FILE_MEMORY_LOCAL:
> +      code[0] = 0x00000002;
> +      code[1] = 0x7a800000;
> +      break;
>      case FILE_MEMORY_SHARED:
>         code[0] = 0x00000002;
>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED)
> @@ -1678,7 +1685,8 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>
>      srcId(i->src(1), 2);
>      srcId(i->src(0).getIndirect(0), 10);
> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER &&
> +   if ((i->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +        i->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>          i->src(0).isIndirect(0) &&
>          i->getIndirect(0, 0)->reg.size == 8)
>         code[1] |= 1 << 23;
> @@ -1690,8 +1698,15 @@ CodeEmitterGK110::emitLOAD(const Instruction *i)
>      int32_t offset = SDATA(i->src(0)).offset;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_BUFFER: code[1] = 0xc0000000; code[0] = 0x00000000; break;
> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a000000; code[0] = 0x00000002; break;
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      code[0] = 0x00000000;
> +      code[1] = 0xc0000000;
> +      break;
> +   case FILE_MEMORY_LOCAL:
> +      code[0] = 0x00000002;
> +      code[1] = 0x7a000000;
> +      break;
>      case FILE_MEMORY_SHARED:
>         code[0] = 0x00000002;
>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED)
> @@ -1800,7 +1815,8 @@ CodeEmitterGK110::emitMOV(const Instruction *i)
>   static inline bool
>   uses64bitAddress(const Instruction *ldst)
>   {
> -   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
> +   return (ldst->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +           ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>         ldst->src(0).isIndirect(0) &&
>         ldst->getIndirect(0, 0)->reg.size == 8;
>   }
> @@ -1862,7 +1878,8 @@ CodeEmitterGK110::emitCCTL(const Instruction *i)
>
>      code[0] = 0x00000002 | (i->subOp << 2);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>         code[1] = 0x7b000000;
>      } else {
>         code[1] = 0x7c000000;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index 27f287f..3fcdc55 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -2417,7 +2417,8 @@ void
>   CodeEmitterGM107::emitCCTL()
>   {
>      unsigned width;
> -   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER) {
> +   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +       insn->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>         emitInsn(0xef600000);
>         width = 30;
>      } else {
> @@ -2989,6 +2990,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>         case FILE_MEMORY_LOCAL : emitLDL(); break;
>         case FILE_MEMORY_SHARED: emitLDS(); break;
>         case FILE_MEMORY_BUFFER: emitLD(); break;
> +      case FILE_MEMORY_GLOBAL: emitLD(); break;
>         default:
>            assert(!"invalid load");
>            emitNOP();
> @@ -3000,6 +3002,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>         case FILE_MEMORY_LOCAL : emitSTL(); break;
>         case FILE_MEMORY_SHARED: emitSTS(); break;
>         case FILE_MEMORY_BUFFER: emitST(); break;
> +      case FILE_MEMORY_GLOBAL: emitST(); break;
>         default:
>            assert(!"invalid load");
>            emitNOP();
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> index 7476e21..2653c82 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> @@ -663,6 +663,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>         code[1] = 0x40000000;
>         break;
>      case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>         code[1] = 0x80000000;
>         break;
> @@ -671,7 +672,8 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>         break;
>      }
>      if (sf == FILE_MEMORY_LOCAL ||
> -       sf == FILE_MEMORY_BUFFER)
> +       sf == FILE_MEMORY_BUFFER ||
> +       sf == FILE_MEMORY_GLOBAL)
>         emitLoadStoreSizeLG(i->sType, 21 + 32);
>
>      setDst(i, 0);
> @@ -679,7 +681,8 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>      emitFlagsRd(i);
>      emitFlagsWr(i);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>         srcId(*i->src(0).getIndirect(0), 9);
>      } else {
>         setAReg16(i, 0);
> @@ -700,6 +703,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>         srcId(i->src(1), 32 + 14);
>         break;
>      case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>         code[1] = 0xa0000000;
>         emitLoadStoreSizeLG(i->dType, 21 + 32);
> @@ -737,7 +741,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>         break;
>      }
>
> -   if (f == FILE_MEMORY_BUFFER)
> +   if (f == FILE_MEMORY_BUFFER || f == FILE_MEMORY_GLOBAL)
>         srcId(*i->src(0).getIndirect(0), 9);
>      else
>         setAReg16(i, 0);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index 6236659..ca475ce 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -281,6 +281,7 @@ CodeEmitterNVC0::setAddressByFile(const ValueRef& src)
>   {
>      switch (src.getFile()) {
>      case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
>         srcAddr32(src, 26, 0);
>         break;
>      case FILE_MEMORY_LOCAL:
> @@ -1768,7 +1769,8 @@ CodeEmitterNVC0::emitCachingMode(CacheMode c)
>   static inline bool
>   uses64bitAddress(const Instruction *ldst)
>   {
> -   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
> +   return (ldst->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +           ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>         ldst->src(0).isIndirect(0) &&
>         ldst->getIndirect(0, 0)->reg.size == 8;
>   }
> @@ -1779,8 +1781,13 @@ CodeEmitterNVC0::emitSTORE(const Instruction *i)
>      uint32_t opc;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_BUFFER: opc = 0x90000000; break;
> -   case FILE_MEMORY_LOCAL:  opc = 0xc8000000; break;
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      opc = 0x90000000;
> +      break;
> +   case FILE_MEMORY_LOCAL:
> +      opc = 0xc8000000;
> +      break;
>      case FILE_MEMORY_SHARED:
>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED) {
>            if (targ->getChipset() >= NVISA_GK104_CHIPSET)
> @@ -1828,8 +1835,13 @@ CodeEmitterNVC0::emitLOAD(const Instruction *i)
>      code[0] = 0x00000005;
>
>      switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_BUFFER: opc = 0x80000000; break;
> -   case FILE_MEMORY_LOCAL:  opc = 0xc0000000; break;
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      opc = 0x80000000;
> +      break;
> +   case FILE_MEMORY_LOCAL:
> +      opc = 0xc0000000;
> +      break;
>      case FILE_MEMORY_SHARED:
>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED) {
>            if (targ->getChipset() >= NVISA_GK104_CHIPSET)
> @@ -2090,7 +2102,8 @@ CodeEmitterNVC0::emitCCTL(const Instruction *i)
>   {
>      code[0] = 0x00000005 | (i->subOp << 5);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>         code[1] = 0x98000000;
>         srcAddr32(i->src(0), 28, 2);
>      } else {
> @@ -3122,6 +3135,7 @@ SchedDataCalculator::checkRd(const Value *v, int cycle, int& delay) const
>      case FILE_MEMORY_CONST:
>      case FILE_MEMORY_SHARED:
>      case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
>      case FILE_SYSTEM_VALUE:
>         // TODO: any restrictions here ?
>         break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 91879e4..c167c4a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -374,7 +374,7 @@ static nv50_ir::DataFile translateFile(uint file)
>      case TGSI_FILE_IMMEDIATE:       return nv50_ir::FILE_IMMEDIATE;
>      case TGSI_FILE_SYSTEM_VALUE:    return nv50_ir::FILE_SYSTEM_VALUE;
>      case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_BUFFER;
> -   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_BUFFER;
> +   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_GLOBAL;
>      case TGSI_FILE_SAMPLER:
>      case TGSI_FILE_NULL:
>      default:
> @@ -1284,7 +1284,9 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
>            if (dst.isIndirect(0))
>               indirectTempArrays.insert(dst.getArrayId());
>         } else
> -      if (dst.getFile() == TGSI_FILE_BUFFER) {
> +      if (dst.getFile() == TGSI_FILE_BUFFER ||
> +          (dst.getFile() == TGSI_FILE_MEMORY &&
> +           memoryFiles[dst.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
>            info->io.globalAccess |= 0x2;
>         }
>      }
> @@ -1295,7 +1297,9 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
>            if (src.isIndirect(0))
>               indirectTempArrays.insert(src.getArrayId());
>         } else
> -      if (src.getFile() == TGSI_FILE_BUFFER) {
> +      if (src.getFile() == TGSI_FILE_BUFFER ||
> +          (src.getFile() == TGSI_FILE_MEMORY &&
> +           memoryFiles[src.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
>            info->io.globalAccess |= (insn.getOpcode() == TGSI_OPCODE_LOAD) ?
>                  0x1 : 0x2;
>         } else
> @@ -1529,6 +1533,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int idx, int c, uint32_t address)
>
>      if (tgsiFile == TGSI_FILE_MEMORY) {
>         switch (code->memoryFiles[fileIdx].mem_type) {
> +      case TGSI_MEMORY_TYPE_GLOBAL:
> +         /* No-op this is the default for TGSI_FILE_MEMORY */
> +         sym->setFile(FILE_MEMORY_GLOBAL);
> +         break;
>         case TGSI_MEMORY_TYPE_SHARED:
>            sym->setFile(FILE_MEMORY_SHARED);
>            break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 4a96d04..84d2944 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2581,6 +2581,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>                ldst->op == OP_MEMBAR) {
>               purgeRecords(NULL, FILE_MEMORY_LOCAL);
>               purgeRecords(NULL, FILE_MEMORY_BUFFER);
> +            purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>               purgeRecords(NULL, FILE_MEMORY_SHARED);
>               purgeRecords(NULL, FILE_SHADER_OUTPUT);
>            } else
> @@ -2588,6 +2589,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>               if (ldst->src(0).getFile() == FILE_MEMORY_BUFFER) {
>                  purgeRecords(NULL, FILE_MEMORY_LOCAL);
>                  purgeRecords(NULL, FILE_MEMORY_BUFFER);
> +               purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>                  purgeRecords(NULL, FILE_MEMORY_SHARED);
>               } else {
>                  purgeRecords(NULL, ldst->src(0).getFile());
> @@ -2607,7 +2609,8 @@ MemoryOpt::runOpt(BasicBlock *bb)
>            DataFile file = ldst->src(0).getFile();
>
>            // if ld l[]/g[] look for previous store to eliminate the reload
> -         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL) {
> +         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL ||
> +             file == FILE_MEMORY_GLOBAL) {
>               // TODO: shared memory ?
>               rec = findRecord(ldst, false, isAdjacent);
>               if (rec && !isAdjacent)
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> index 73ed753..3917768 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> @@ -456,6 +456,7 @@ int Symbol::print(char *buf, size_t size,
>      case FILE_SHADER_INPUT:  c = 'a'; break;
>      case FILE_SHADER_OUTPUT: c = 'o'; break;
>      case FILE_MEMORY_BUFFER: c = 'g'; break;
> +   case FILE_MEMORY_GLOBAL: c = 'g'; break;
>      case FILE_MEMORY_SHARED: c = 's'; break;
>      case FILE_MEMORY_LOCAL:  c = 'l'; break;
>      default:
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> index 1cd45a2..5c60b22 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> @@ -208,6 +208,7 @@ TargetNV50::getFileSize(DataFile file) const
>      case FILE_SHADER_INPUT:  return 0x200;
>      case FILE_SHADER_OUTPUT: return 0x200;
>      case FILE_MEMORY_BUFFER: return 0xffffffff;
> +   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>      case FILE_MEMORY_SHARED: return 16 << 10;
>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>      case FILE_SYSTEM_VALUE:  return 16;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> index bda59a5..9e1e7bf 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> @@ -249,6 +249,7 @@ TargetNVC0::getFileSize(DataFile file) const
>      case FILE_SHADER_INPUT:  return 0x400;
>      case FILE_SHADER_OUTPUT: return 0x400;
>      case FILE_MEMORY_BUFFER: return 0xffffffff;
> +   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>      case FILE_MEMORY_SHARED: return 16 << 10;
>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>      case FILE_SYSTEM_VALUE:  return 32;
>
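The patch quoted above makes every emitter treat FILE_MEMORY_GLOBAL exactly like FILE_MEMORY_BUFFER by repeating the two-way comparison. A minimal sketch of that pattern follows, assuming a reduced stand-in for the DataFile enum. The helper isBufferLike and the struct LdStSketch are hypothetical names used only for illustration; they are not part of the patch or of nouveau.

```cpp
#include <cstdint>

// Reduced stand-in for nv50_ir::DataFile; only the entries needed here.
enum DataFile {
   FILE_MEMORY_BUFFER,
   FILE_MEMORY_GLOBAL,
   FILE_MEMORY_SHARED,
   FILE_MEMORY_LOCAL,
};

// Hypothetical helper (not in the patch): true for the two files that
// the patch routes through the same global load/store encodings.
static inline bool isBufferLike(DataFile f)
{
   return f == FILE_MEMORY_BUFFER || f == FILE_MEMORY_GLOBAL;
}

// Hypothetical flattened view of a load/store instruction.
struct LdStSketch {
   DataFile file;        // file of src(0)
   bool indirect;        // src(0).isIndirect(0)
   unsigned addrRegSize; // size in bytes of the indirect address register
};

// Same condition as uses64bitAddress() in the patch, written with the helper.
static inline bool uses64bitAddress(const LdStSketch &ldst)
{
   return isBufferLike(ldst.file) && ldst.indirect && ldst.addrRegSize == 8;
}
```

Whether such a helper would be preferable to the explicit comparisons is a style question for the driver maintainers; the sketch only shows that both files take the same path.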
On 03/16/2016 10:23 AM, Hans de Goede wrote:
> Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled /
> commented out some of the old resource handling code, but not all of it.
>
> Effectively all of it is dead already: if we ever enter the old code
> paths in handleLOAD / handleSTORE / handleATOM we will get an exception
> due to trying to access the now always zero-sized resources vector.
>
> Make it more explicit that accesses to files other than buffer / memory
> are not supported in these functions, and comment out a whole bunch of
> dead code.
>
> Also remove the magic file-index defines from the old resource code
> from include/pipe/p_shader_tokens.h as those are no longer used now
> (which is a good thing).
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 42 +++++++++++++++-------
>   src/gallium/include/pipe/p_shader_tokens.h         |  9 -----
>   2 files changed, 30 insertions(+), 21 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index c167c4a..115d0bb 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -856,12 +856,14 @@ public:
>      };
>      std::vector<TextureView> textureViews;
>
> +   /*
>      struct Resource {
>         uint8_t target; // TGSI_TEXTURE_*
>         bool raw;
>         uint8_t slot; // $surface index
>      };
>      std::vector<Resource> resources;
> +   */
>
>      struct MemoryFile {
>         uint8_t mem_type; // TGSI_MEMORY_TYPE_*
> @@ -1423,8 +1425,8 @@ private:
>      void handleLIT(Value *dst0[4]);
>      void handleUserClipPlanes();
>
> -   Symbol *getResourceBase(int r);
> -   void getResourceCoords(std::vector<Value *>&, int r, int s);
> +   // Symbol *getResourceBase(int r);
> +   // void getResourceCoords(std::vector<Value *>&, int r, int s);
>
>      void handleLOAD(Value *dst0[4]);
>      void handleSTORE();
> @@ -2169,6 +2171,7 @@ Converter::handleLIT(Value *dst0[4])
>      }
>   }
>
> +/* Keep this around for now as reference when adding img support
>   static inline bool
>   isResourceSpecial(const int r)
>   {
> @@ -2264,6 +2267,7 @@ partitionLoadStore(uint8_t comp[2], uint8_t size[2], uint8_t mask)
>      }
>      return n + 1;
>   }
> +*/
>
>   // For raw loads, granularity is 4 byte.
>   // Usage of the texture read mask on OP_SULDP is not allowed.
> @@ -2274,8 +2278,9 @@ Converter::handleLOAD(Value *dst0[4])
>      int c;
>      std::vector<Value *> off, src, ldv, def;
>
> -   if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER ||
> -       tgsi.getSrc(0).getFile() == TGSI_FILE_MEMORY) {
> +   switch (tgsi.getSrc(0).getFile()) {
> +   case TGSI_FILE_BUFFER:
> +   case TGSI_FILE_MEMORY:
>         for (c = 0; c < 4; ++c) {
>            if (!dst0[c])
>               continue;
> @@ -2295,9 +2300,12 @@ Converter::handleLOAD(Value *dst0[4])
>            if (tgsi.getSrc(0).isIndirect(0))
>               ld->setIndirect(0, 1, fetchSrc(tgsi.getSrc(0).getIndirect(0), 0, 0));
>         }
> -      return;
> +      break;
> +   default:
> +      assert(!"Unsupported srcFile for LOAD");
>      }
>
> +/* Keep this around for now as reference when adding img support
>      getResourceCoords(off, r, 1);
>
>      if (isResourceRaw(code, r)) {
> @@ -2363,6 +2371,7 @@ Converter::handleLOAD(Value *dst0[4])
>      FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
>         if (dst0[c] != def[c])
>            mkMov(dst0[c], def[tgsi.getSrc(0).getSwizzle(c)]);
> +*/
>   }
>
>   // For formatted stores, the write mask on OP_SUSTP can be used.
> @@ -2374,8 +2383,9 @@ Converter::handleSTORE()
>      int c;
>      std::vector<Value *> off, src, dummy;
>
> -   if (tgsi.getDst(0).getFile() == TGSI_FILE_BUFFER ||
> -       tgsi.getDst(0).getFile() == TGSI_FILE_MEMORY) {
> +   switch (tgsi.getDst(0).getFile()) {
> +   case TGSI_FILE_BUFFER:
> +   case TGSI_FILE_MEMORY:
>         for (c = 0; c < 4; ++c) {
>            if (!(tgsi.getDst(0).getMask() & (1 << c)))
>               continue;
> @@ -2396,9 +2406,12 @@ Converter::handleSTORE()
>            if (tgsi.getDst(0).isIndirect(0))
>               st->setIndirect(0, 1, fetchSrc(tgsi.getDst(0).getIndirect(0), 0, 0));
>         }
> -      return;
> +      break;
> +   default:
> +      assert(!"Unsupported dstFile for STORE");
>      }
>
> +/* Keep this around for now as reference when adding img support
>      getResourceCoords(off, r, 0);
>      src = off;
>      const int s = src.size();
> @@ -2446,6 +2459,7 @@ Converter::handleSTORE()
>         mkTex(OP_SUSTP, getResourceTarget(code, r), code->resources[r].slot, 0,
>               dummy, src)->tex.mask = tgsi.getDst(0).getMask();
>      }
> +*/
>   }
>
>   // XXX: These only work on resources with the single-component u32/s32 formats.
> @@ -2460,8 +2474,9 @@ Converter::handleATOM(Value *dst0[4], DataType ty, uint16_t subOp)
>      std::vector<Value *> defv;
>      LValue *dst = getScratch();
>
> -   if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER ||
> -       tgsi.getSrc(0).getFile() == TGSI_FILE_MEMORY) {
> +   switch (tgsi.getSrc(0).getFile()) {
> +   case TGSI_FILE_BUFFER:
> +   case TGSI_FILE_MEMORY:
>         for (int c = 0; c < 4; ++c) {
>            if (!dst0[c])
>               continue;
> @@ -2489,10 +2504,12 @@ Converter::handleATOM(Value *dst0[4], DataType ty, uint16_t subOp)
>         for (int c = 0; c < 4; ++c)
>            if (dst0[c])
>               dst0[c] = dst; // not equal to rDst so handleInstruction will do mkMov
> -      return;
> +      break;
> +   default:
> +      assert(!"Unsupported srcFile for ATOM");
>      }
>
> -
> +/* Keep this around for now as reference when adding img support
>      getResourceCoords(srcv, r, 1);
>
>      if (isResourceSpecial(r)) {
> @@ -2520,6 +2537,7 @@ Converter::handleATOM(Value *dst0[4], DataType ty, uint16_t subOp)
>      for (int c = 0; c < 4; ++c)
>         if (dst0[c])
>            dst0[c] = dst; // not equal to rDst so handleInstruction will do mkMov
> +*/
>   }
>
>   void
> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
> index 65d8ad9..5ef6c30 100644
> --- a/src/gallium/include/pipe/p_shader_tokens.h
> +++ b/src/gallium/include/pipe/p_shader_tokens.h
> @@ -237,15 +237,6 @@ struct tgsi_declaration_array {
>      unsigned Padding : 22;
>   };
>
> -/*
> - * Special resources that don't need to be declared.  They map to the
> - * GLOBAL/LOCAL/PRIVATE/INPUT compute memory spaces.
> - */
> -#define TGSI_RESOURCE_GLOBAL	0x7fff
> -#define TGSI_RESOURCE_LOCAL	0x7ffe
> -#define TGSI_RESOURCE_PRIVATE	0x7ffd
> -#define TGSI_RESOURCE_INPUT	0x7ffc
> -

This should be in a separate patch with "gallium:" as prefix, even if
nouveau is the only driver which somehow uses these constants.

Other than that, the patch looks fine.
And thanks for not removing this resource code, because it could help with
arb_shader_image_load_store. :-)

I have the same comment as on the previous patch: I think the cosmetic
changes should not be here.

>   #define TGSI_IMM_FLOAT32   0
>   #define TGSI_IMM_UINT32    1
>   #define TGSI_IMM_INT32     2
>
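The switch-plus-assert structure that the patch quoted above introduces in handleLOAD, handleSTORE and handleATOM can be sketched as follows. This is a minimal illustration under stated assumptions: the enum TgsiFileSketch and the function handleLoadSketch are hypothetical, and the enum values are placeholders rather than the real TGSI_FILE_* numbers.

```cpp
#include <cassert>

// Stand-in for the TGSI register files; placeholder values only.
enum TgsiFileSketch {
   SKETCH_FILE_TEMPORARY,
   SKETCH_FILE_BUFFER,
   SKETCH_FILE_MEMORY,
};

// Returns true when the access was handled. Buffer and memory files share
// one code path; anything else trips the assert in debug builds instead of
// silently falling into the (now commented-out) resource code.
static bool handleLoadSketch(TgsiFileSketch srcFile)
{
   switch (srcFile) {
   case SKETCH_FILE_BUFFER:
   case SKETCH_FILE_MEMORY:
      // The real code emits one load per enabled destination channel here.
      return true;
   default:
      assert(!"Unsupported srcFile for LOAD");
      return false;
   }
}
```

With NDEBUG defined the assert compiles away and the default branch simply reports the access as unhandled.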
Hi,

On 16-03-16 11:37, Samuel Pitoiset wrote:
> Could you please get rid of the cosmetic changes (e.g. the switch ones)?
> Because this doesn't really improve readability, and in my opinion these changes should eventually be done in a separate patch.

I need at least half of those cosmetic changes, because half of them are not cosmetic, e.g.:

-   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
-   case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002; break;
+   case FILE_MEMORY_BUFFER:
+   case FILE_MEMORY_GLOBAL:
+      code[0] = 0x00000000;
+      code[1] = 0xe0000000;
+      break;
+   case FILE_MEMORY_LOCAL:
+      code[0] = 0x00000002;
+      code[1] = 0x7a800000;
+      break;

The first bit actually changes things to have two case labels for the BUFFER
code. Another way of writing this would be:

+   case FILE_MEMORY_GLOBAL:
     case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
     case FILE_MEMORY_LOCAL: code[1] = 0x7a800000; code[0] = 0x00000002; break;

But that just looks weird. If we have multiple case labels, we should not use
the style of a single-line statement following the case label IMHO, which brings us to:

+   case FILE_MEMORY_BUFFER:
+   case FILE_MEMORY_GLOBAL:
+      code[0] = 0x00000000;
+      code[1] = 0xe0000000;
+      break;

At which point keeping the LOCAL code on a single line looks ugly IMHO:

+   case FILE_MEMORY_BUFFER:
+   case FILE_MEMORY_GLOBAL:
+      code[0] = 0x00000000;
+      code[1] = 0xe0000000;
+      break;
     case FILE_MEMORY_LOCAL: code[1] = 0x7a800000; code[0] = 0x00000002; break;
     case FILE_MEMORY_SHARED:
        code[0] = 0x00000002;
        if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED)

Notice how the FILE_MEMORY_LOCAL case looks weird now.

Note I'm open to fixing this however you like, just explaining why I did it
the way I did it.
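For reference, the multi-line form argued for above can be sketched as a standalone function. This is only an illustration: the enum is a reduced stand-in for the DataFile enum, and selectStoreOpcodeGK110 is a hypothetical name; the opcode words are the ones shown in the GK110 emitSTORE hunk.

```cpp
#include <cstdint>

// Reduced stand-in for nv50_ir::DataFile; only the entries needed here.
enum DataFile {
   FILE_MEMORY_BUFFER,
   FILE_MEMORY_GLOBAL,
   FILE_MEMORY_SHARED,
   FILE_MEMORY_LOCAL,
};

// Fills code[0] and code[1] with the base store encoding for the given file.
// Returns false for files this sketch does not cover (e.g. shared memory,
// which needs the subOp and is handled separately in the real emitter).
static bool selectStoreOpcodeGK110(DataFile f, uint32_t code[2])
{
   switch (f) {
   case FILE_MEMORY_BUFFER:
   case FILE_MEMORY_GLOBAL:
      code[0] = 0x00000000;
      code[1] = 0xe0000000;
      return true;
   case FILE_MEMORY_LOCAL:
      code[0] = 0x00000002;
      code[1] = 0x7a800000;
      return true;
   default:
      return false;
   }
}
```

Both case labels share one body, so BUFFER and GLOBAL stores get identical base encodings.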

> Other than that, this patch is :
>
> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Thanks for the reviews!

Regards,

Hans


> Yes, this probably won't work as is for atomic operations but the lowering pass is already here, so it should be easy to make it work.
>
> On 03/16/2016 10:23 AM, Hans de Goede wrote:
>> Add support for OpenCL global memory buffers. Note this has only
>> been tested with regular loads and stores and likely needs more work
>> for e.g. atomic ops.
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  1 +
>>   .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 31 +++++++++++++++++-----
>>   .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp |  5 +++-
>>   .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp  | 10 ++++---
>>   .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 26 +++++++++++++-----
>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 14 +++++++---
>>   .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  5 +++-
>>   .../drivers/nouveau/codegen/nv50_ir_print.cpp      |  1 +
>>   .../nouveau/codegen/nv50_ir_target_nv50.cpp        |  1 +
>>   .../nouveau/codegen/nv50_ir_target_nvc0.cpp        |  1 +
>>   10 files changed, 74 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>> index fdc2195..5141fc6 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>> @@ -333,6 +333,7 @@ enum DataFile
>>      FILE_SHADER_INPUT,
>>      FILE_SHADER_OUTPUT,
>>      FILE_MEMORY_BUFFER,
>> +   FILE_MEMORY_GLOBAL,
>>      FILE_MEMORY_SHARED,
>>      FILE_MEMORY_LOCAL,
>>      FILE_SYSTEM_VALUE,
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>> index 02a1101..62f1598 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>> @@ -1641,8 +1641,15 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>>      int32_t offset = SDATA(i->src(0)).offset;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
>> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002; break;
>> +   case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>> +      code[0] = 0x00000000;
>> +      code[1] = 0xe0000000;
>> +      break;
>> +   case FILE_MEMORY_LOCAL:
>> +      code[0] = 0x00000002;
>> +      code[1] = 0x7a800000;
>> +      break;
>>      case FILE_MEMORY_SHARED:
>>         code[0] = 0x00000002;
>>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED)
>> @@ -1678,7 +1685,8 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>>
>>      srcId(i->src(1), 2);
>>      srcId(i->src(0).getIndirect(0), 10);
>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER &&
>> +   if ((i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +        i->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>>          i->src(0).isIndirect(0) &&
>>          i->getIndirect(0, 0)->reg.size == 8)
>>         code[1] |= 1 << 23;
>> @@ -1690,8 +1698,15 @@ CodeEmitterGK110::emitLOAD(const Instruction *i)
>>      int32_t offset = SDATA(i->src(0)).offset;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_BUFFER: code[1] = 0xc0000000; code[0] = 0x00000000; break;
>> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a000000; code[0] = 0x00000002; break;
>> +   case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>> +      code[0] = 0x00000000;
>> +      code[1] = 0xc0000000;
>> +      break;
>> +   case FILE_MEMORY_LOCAL:
>> +      code[0] = 0x00000002;
>> +      code[1] = 0x7a000000;
>> +      break;
>>      case FILE_MEMORY_SHARED:
>>         code[0] = 0x00000002;
>>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED)
>> @@ -1800,7 +1815,8 @@ CodeEmitterGK110::emitMOV(const Instruction *i)
>>   static inline bool
>>   uses64bitAddress(const Instruction *ldst)
>>   {
>> -   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>> +   return (ldst->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +           ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>>         ldst->src(0).isIndirect(0) &&
>>         ldst->getIndirect(0, 0)->reg.size == 8;
>>   }
>> @@ -1862,7 +1878,8 @@ CodeEmitterGK110::emitCCTL(const Instruction *i)
>>
>>      code[0] = 0x00000002 | (i->subOp << 2);
>>
>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>         code[1] = 0x7b000000;
>>      } else {
>>         code[1] = 0x7c000000;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> index 27f287f..3fcdc55 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> @@ -2417,7 +2417,8 @@ void
>>   CodeEmitterGM107::emitCCTL()
>>   {
>>      unsigned width;
>> -   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER) {
>> +   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +       insn->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>         emitInsn(0xef600000);
>>         width = 30;
>>      } else {
>> @@ -2989,6 +2990,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>         case FILE_MEMORY_LOCAL : emitLDL(); break;
>>         case FILE_MEMORY_SHARED: emitLDS(); break;
>>         case FILE_MEMORY_BUFFER: emitLD(); break;
>> +      case FILE_MEMORY_GLOBAL: emitLD(); break;
>>         default:
>>            assert(!"invalid load");
>>            emitNOP();
>> @@ -3000,6 +3002,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>         case FILE_MEMORY_LOCAL : emitSTL(); break;
>>         case FILE_MEMORY_SHARED: emitSTS(); break;
>>         case FILE_MEMORY_BUFFER: emitST(); break;
>> +      case FILE_MEMORY_GLOBAL: emitST(); break;
>>         default:
>>            assert(!"invalid load");
>>            emitNOP();
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>> index 7476e21..2653c82 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>> @@ -663,6 +663,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>         code[1] = 0x40000000;
>>         break;
>>      case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>>         code[1] = 0x80000000;
>>         break;
>> @@ -671,7 +672,8 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>         break;
>>      }
>>      if (sf == FILE_MEMORY_LOCAL ||
>> -       sf == FILE_MEMORY_BUFFER)
>> +       sf == FILE_MEMORY_BUFFER ||
>> +       sf == FILE_MEMORY_GLOBAL)
>>         emitLoadStoreSizeLG(i->sType, 21 + 32);
>>
>>      setDst(i, 0);
>> @@ -679,7 +681,8 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>      emitFlagsRd(i);
>>      emitFlagsWr(i);
>>
>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>         srcId(*i->src(0).getIndirect(0), 9);
>>      } else {
>>         setAReg16(i, 0);
>> @@ -700,6 +703,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>>         srcId(i->src(1), 32 + 14);
>>         break;
>>      case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>>         code[1] = 0xa0000000;
>>         emitLoadStoreSizeLG(i->dType, 21 + 32);
>> @@ -737,7 +741,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>>         break;
>>      }
>>
>> -   if (f == FILE_MEMORY_BUFFER)
>> +   if (f == FILE_MEMORY_BUFFER || f == FILE_MEMORY_GLOBAL)
>>         srcId(*i->src(0).getIndirect(0), 9);
>>      else
>>         setAReg16(i, 0);
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>> index 6236659..ca475ce 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>> @@ -281,6 +281,7 @@ CodeEmitterNVC0::setAddressByFile(const ValueRef& src)
>>   {
>>      switch (src.getFile()) {
>>      case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>>         srcAddr32(src, 26, 0);
>>         break;
>>      case FILE_MEMORY_LOCAL:
>> @@ -1768,7 +1769,8 @@ CodeEmitterNVC0::emitCachingMode(CacheMode c)
>>   static inline bool
>>   uses64bitAddress(const Instruction *ldst)
>>   {
>> -   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>> +   return (ldst->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +           ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>>         ldst->src(0).isIndirect(0) &&
>>         ldst->getIndirect(0, 0)->reg.size == 8;
>>   }
>> @@ -1779,8 +1781,13 @@ CodeEmitterNVC0::emitSTORE(const Instruction *i)
>>      uint32_t opc;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_BUFFER: opc = 0x90000000; break;
>> -   case FILE_MEMORY_LOCAL:  opc = 0xc8000000; break;
>> +   case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>> +      opc = 0x90000000;
>> +      break;
>> +   case FILE_MEMORY_LOCAL:
>> +      opc = 0xc8000000;
>> +      break;
>>      case FILE_MEMORY_SHARED:
>>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED) {
>>            if (targ->getChipset() >= NVISA_GK104_CHIPSET)
>> @@ -1828,8 +1835,13 @@ CodeEmitterNVC0::emitLOAD(const Instruction *i)
>>      code[0] = 0x00000005;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_BUFFER: opc = 0x80000000; break;
>> -   case FILE_MEMORY_LOCAL:  opc = 0xc0000000; break;
>> +   case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>> +      opc = 0x80000000;
>> +      break;
>> +   case FILE_MEMORY_LOCAL:
>> +      opc = 0xc0000000;
>> +      break;
>>      case FILE_MEMORY_SHARED:
>>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED) {
>>            if (targ->getChipset() >= NVISA_GK104_CHIPSET)
>> @@ -2090,7 +2102,8 @@ CodeEmitterNVC0::emitCCTL(const Instruction *i)
>>   {
>>      code[0] = 0x00000005 | (i->subOp << 5);
>>
>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>         code[1] = 0x98000000;
>>         srcAddr32(i->src(0), 28, 2);
>>      } else {
>> @@ -3122,6 +3135,7 @@ SchedDataCalculator::checkRd(const Value *v, int cycle, int& delay) const
>>      case FILE_MEMORY_CONST:
>>      case FILE_MEMORY_SHARED:
>>      case FILE_MEMORY_BUFFER:
>> +   case FILE_MEMORY_GLOBAL:
>>      case FILE_SYSTEM_VALUE:
>>         // TODO: any restrictions here ?
>>         break;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> index 91879e4..c167c4a 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> @@ -374,7 +374,7 @@ static nv50_ir::DataFile translateFile(uint file)
>>      case TGSI_FILE_IMMEDIATE:       return nv50_ir::FILE_IMMEDIATE;
>>      case TGSI_FILE_SYSTEM_VALUE:    return nv50_ir::FILE_SYSTEM_VALUE;
>>      case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_BUFFER;
>> -   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_BUFFER;
>> +   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_GLOBAL;
>>      case TGSI_FILE_SAMPLER:
>>      case TGSI_FILE_NULL:
>>      default:
>> @@ -1284,7 +1284,9 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
>>            if (dst.isIndirect(0))
>>               indirectTempArrays.insert(dst.getArrayId());
>>         } else
>> -      if (dst.getFile() == TGSI_FILE_BUFFER) {
>> +      if (dst.getFile() == TGSI_FILE_BUFFER ||
>> +          (dst.getFile() == TGSI_FILE_MEMORY &&
>> +           memoryFiles[dst.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
>>            info->io.globalAccess |= 0x2;
>>         }
>>      }
>> @@ -1295,7 +1297,9 @@ bool Source::scanInstruction(const struct tgsi_full_instruction *inst)
>>            if (src.isIndirect(0))
>>               indirectTempArrays.insert(src.getArrayId());
>>         } else
>> -      if (src.getFile() == TGSI_FILE_BUFFER) {
>> +      if (src.getFile() == TGSI_FILE_BUFFER ||
>> +          (src.getFile() == TGSI_FILE_MEMORY &&
>> +           memoryFiles[src.getIndex(0)].mem_type == TGSI_MEMORY_TYPE_GLOBAL)) {
>>            info->io.globalAccess |= (insn.getOpcode() == TGSI_OPCODE_LOAD) ?
>>                  0x1 : 0x2;
>>         } else
>> @@ -1529,6 +1533,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx, int idx, int c, uint32_t address)
>>
>>      if (tgsiFile == TGSI_FILE_MEMORY) {
>>         switch (code->memoryFiles[fileIdx].mem_type) {
>> +      case TGSI_MEMORY_TYPE_GLOBAL:
>> +         /* No-op this is the default for TGSI_FILE_MEMORY */
>> +         sym->setFile(FILE_MEMORY_GLOBAL);
>> +         break;
>>         case TGSI_MEMORY_TYPE_SHARED:
>>            sym->setFile(FILE_MEMORY_SHARED);
>>            break;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> index 4a96d04..84d2944 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> @@ -2581,6 +2581,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>                ldst->op == OP_MEMBAR) {
>>               purgeRecords(NULL, FILE_MEMORY_LOCAL);
>>               purgeRecords(NULL, FILE_MEMORY_BUFFER);
>> +            purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>>               purgeRecords(NULL, FILE_MEMORY_SHARED);
>>               purgeRecords(NULL, FILE_SHADER_OUTPUT);
>>            } else
>> @@ -2588,6 +2589,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>               if (ldst->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>                  purgeRecords(NULL, FILE_MEMORY_LOCAL);
>>                  purgeRecords(NULL, FILE_MEMORY_BUFFER);
>> +               purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>>                  purgeRecords(NULL, FILE_MEMORY_SHARED);
>>               } else {
>>                  purgeRecords(NULL, ldst->src(0).getFile());
>> @@ -2607,7 +2609,8 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>            DataFile file = ldst->src(0).getFile();
>>
>>            // if ld l[]/g[] look for previous store to eliminate the reload
>> -         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL) {
>> +         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL ||
>> +             file == FILE_MEMORY_GLOBAL) {
>>               // TODO: shared memory ?
>>               rec = findRecord(ldst, false, isAdjacent);
>>               if (rec && !isAdjacent)
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>> index 73ed753..3917768 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>> @@ -456,6 +456,7 @@ int Symbol::print(char *buf, size_t size,
>>      case FILE_SHADER_INPUT:  c = 'a'; break;
>>      case FILE_SHADER_OUTPUT: c = 'o'; break;
>>      case FILE_MEMORY_BUFFER: c = 'g'; break;
>> +   case FILE_MEMORY_GLOBAL: c = 'g'; break;
>>      case FILE_MEMORY_SHARED: c = 's'; break;
>>      case FILE_MEMORY_LOCAL:  c = 'l'; break;
>>      default:
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>> index 1cd45a2..5c60b22 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>> @@ -208,6 +208,7 @@ TargetNV50::getFileSize(DataFile file) const
>>      case FILE_SHADER_INPUT:  return 0x200;
>>      case FILE_SHADER_OUTPUT: return 0x200;
>>      case FILE_MEMORY_BUFFER: return 0xffffffff;
>> +   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>>      case FILE_MEMORY_SHARED: return 16 << 10;
>>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>>      case FILE_SYSTEM_VALUE:  return 16;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> index bda59a5..9e1e7bf 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> @@ -249,6 +249,7 @@ TargetNVC0::getFileSize(DataFile file) const
>>      case FILE_SHADER_INPUT:  return 0x400;
>>      case FILE_SHADER_OUTPUT: return 0x400;
>>      case FILE_MEMORY_BUFFER: return 0xffffffff;
>> +   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>>      case FILE_MEMORY_SHARED: return 16 << 10;
>>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>>      case FILE_SYSTEM_VALUE:  return 32;
>>
Hi,

On 16-03-16 11:45, Samuel Pitoiset wrote:
>
>
> On 03/16/2016 10:23 AM, Hans de Goede wrote:
>> Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled /
>> commented out some of the old resource handling code, but not all of it.
>>
>> Effectively all of it is dead already, if we ever enter the old code
>> paths in handleLOAD / handleSTORE / handleATOM we will get an exception
>> due to trying to access the now always zero-sized resources vector.
>>
>> Make non buffer / memory file accesses not being supported in these
>> functions more explicit and comment out a whole bunch of dead code.
>>
>> Also remove the magic file-index defines from the old resource code
>> from include/pipe/p_shader_tokens.h as those are no longer used now
>> (which is a good thing).
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 42 +++++++++++++++-------
>>   src/gallium/include/pipe/p_shader_tokens.h         |  9 -----
>>   2 files changed, 30 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> index c167c4a..115d0bb 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> @@ -856,12 +856,14 @@ public:
>>      };
>>      std::vector<TextureView> textureViews;
>>
>> +   /*
>>      struct Resource {
>>         uint8_t target; // TGSI_TEXTURE_*
>>         bool raw;
>>         uint8_t slot; // $surface index
>>      };
>>      std::vector<Resource> resources;
>> +   */
>>
>>      struct MemoryFile {
>>         uint8_t mem_type; // TGSI_MEMORY_TYPE_*
>> @@ -1423,8 +1425,8 @@ private:
>>      void handleLIT(Value *dst0[4]);
>>      void handleUserClipPlanes();
>>
>> -   Symbol *getResourceBase(int r);
>> -   void getResourceCoords(std::vector<Value *>&, int r, int s);
>> +   // Symbol *getResourceBase(int r);
>> +   // void getResourceCoords(std::vector<Value *>&, int r, int s);
>>
>>      void handleLOAD(Value *dst0[4]);
>>      void handleSTORE();
>> @@ -2169,6 +2171,7 @@ Converter::handleLIT(Value *dst0[4])
>>      }
>>   }
>>
>> +/* Keep this around for now as reference when adding img support
>>   static inline bool
>>   isResourceSpecial(const int r)
>>   {
>> @@ -2264,6 +2267,7 @@ partitionLoadStore(uint8_t comp[2], uint8_t size[2], uint8_t mask)
>>      }
>>      return n + 1;
>>   }
>> +*/
>>
>>   // For raw loads, granularity is 4 byte.
>>   // Usage of the texture read mask on OP_SULDP is not allowed.
>> @@ -2274,8 +2278,9 @@ Converter::handleLOAD(Value *dst0[4])
>>      int c;
>>      std::vector<Value *> off, src, ldv, def;
>>
>> -   if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER ||
>> -       tgsi.getSrc(0).getFile() == TGSI_FILE_MEMORY) {
>> +   switch (tgsi.getSrc(0).getFile()) {
>> +   case TGSI_FILE_BUFFER:
>> +   case TGSI_FILE_MEMORY:
>>         for (c = 0; c < 4; ++c) {
>>            if (!dst0[c])
>>               continue;
>> @@ -2295,9 +2300,12 @@ Converter::handleLOAD(Value *dst0[4])
>>            if (tgsi.getSrc(0).isIndirect(0))
>>               ld->setIndirect(0, 1, fetchSrc(tgsi.getSrc(0).getIndirect(0), 0, 0));
>>         }
>> -      return;
>> +      break;
>> +   default:
>> +      assert(!"Unsupported srcFile for LOAD");
>>      }
>>
>> +/* Keep this around for now as reference when adding img support
>>      getResourceCoords(off, r, 1);
>>
>>      if (isResourceRaw(code, r)) {
>> @@ -2363,6 +2371,7 @@ Converter::handleLOAD(Value *dst0[4])
>>      FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
>>         if (dst0[c] != def[c])
>>            mkMov(dst0[c], def[tgsi.getSrc(0).getSwizzle(c)]);
>> +*/
>>   }
>>
>>   // For formatted stores, the write mask on OP_SUSTP can be used.
>> @@ -2374,8 +2383,9 @@ Converter::handleSTORE()
>>      int c;
>>      std::vector<Value *> off, src, dummy;
>>
>> -   if (tgsi.getDst(0).getFile() == TGSI_FILE_BUFFER ||
>> -       tgsi.getDst(0).getFile() == TGSI_FILE_MEMORY) {
>> +   switch (tgsi.getDst(0).getFile()) {
>> +   case TGSI_FILE_BUFFER:
>> +   case TGSI_FILE_MEMORY:
>>         for (c = 0; c < 4; ++c) {
>>            if (!(tgsi.getDst(0).getMask() & (1 << c)))
>>               continue;
>> @@ -2396,9 +2406,12 @@ Converter::handleSTORE()
>>            if (tgsi.getDst(0).isIndirect(0))
>>               st->setIndirect(0, 1, fetchSrc(tgsi.getDst(0).getIndirect(0), 0, 0));
>>         }
>> -      return;
>> +      break;
>> +   default:
>> +      assert(!"Unsupported dstFile for STORE");
>>      }
>>
>> +/* Keep this around for now as reference when adding img support
>>      getResourceCoords(off, r, 0);
>>      src = off;
>>      const int s = src.size();
>> @@ -2446,6 +2459,7 @@ Converter::handleSTORE()
>>         mkTex(OP_SUSTP, getResourceTarget(code, r), code->resources[r].slot, 0,
>>               dummy, src)->tex.mask = tgsi.getDst(0).getMask();
>>      }
>> +*/
>>   }
>>
>>   // XXX: These only work on resources with the single-component u32/s32 formats.
>> @@ -2460,8 +2474,9 @@ Converter::handleATOM(Value *dst0[4], DataType ty, uint16_t subOp)
>>      std::vector<Value *> defv;
>>      LValue *dst = getScratch();
>>
>> -   if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER ||
>> -       tgsi.getSrc(0).getFile() == TGSI_FILE_MEMORY) {
>> +   switch (tgsi.getSrc(0).getFile()) {
>> +   case TGSI_FILE_BUFFER:
>> +   case TGSI_FILE_MEMORY:
>>         for (int c = 0; c < 4; ++c) {
>>            if (!dst0[c])
>>               continue;
>> @@ -2489,10 +2504,12 @@ Converter::handleATOM(Value *dst0[4], DataType ty, uint16_t subOp)
>>         for (int c = 0; c < 4; ++c)
>>            if (dst0[c])
>>               dst0[c] = dst; // not equal to rDst so handleInstruction will do mkMov
>> -      return;
>> +      break;
>> +   default:
>> +      assert(!"Unsupported srcFile for ATOM");
>>      }
>>
>> -
>> +/* Keep this around for now as reference when adding img support
>>      getResourceCoords(srcv, r, 1);
>>
>>      if (isResourceSpecial(r)) {
>> @@ -2520,6 +2537,7 @@ Converter::handleATOM(Value *dst0[4], DataType ty, uint16_t subOp)
>>      for (int c = 0; c < 4; ++c)
>>         if (dst0[c])
>>            dst0[c] = dst; // not equal to rDst so handleInstruction will do mkMov
>> +*/
>>   }
>>
>>   void
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>> index 65d8ad9..5ef6c30 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -237,15 +237,6 @@ struct tgsi_declaration_array {
>>      unsigned Padding : 22;
>>   };
>>
>> -/*
>> - * Special resources that don't need to be declared.  They map to the
>> - * GLOBAL/LOCAL/PRIVATE/INPUT compute memory spaces.
>> - */
>> -#define TGSI_RESOURCE_GLOBAL    0x7fff
>> -#define TGSI_RESOURCE_LOCAL    0x7ffe
>> -#define TGSI_RESOURCE_PRIVATE    0x7ffd
>> -#define TGSI_RESOURCE_INPUT    0x7ffc
>> -
>
> This should be in a separate patch with "gallium:" as prefix even if nouveau is the only driver which somehow uses these constants.

Ok, will do.

> Other than that, the patch looks fine.
> And thanks for not removing this resource thing, because it could help for arb_shader_image_load_store. :-)
>
> I have the same comment as the previous patch, I think the cosmetic changes should not be here.

You mean the change from if (... || ...) to switch/case? That is not cosmetic: note the
new default: code path with the assert(). This replaces the implicit assert we had before,
in the form of the old resource code throwing an exception because the code indexed
a zero-sized vector.
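
To make the difference concrete, here is a minimal stand-alone sketch of the two shapes. It is not the Mesa code: the File enum, the Resource struct, and the loadOld / loadNew helpers are hypothetical names for illustration, and the use of at() to get an exception from the empty vector is an assumption about how the old failure showed up.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for the TGSI register files; not Mesa's real enum.
enum class File { Buffer, Memory, Image };

// Hypothetical stand-in for the old per-resource record.
struct Resource { int slot; };

// Old shape: anything that is not BUFFER / MEMORY falls through to the legacy
// path, which indexes a vector that is now always empty. The failure is
// implicit: at() throws std::out_of_range (assumed checked access).
int loadOld(File f, const std::vector<Resource> &resources, int r)
{
   if (f == File::Buffer || f == File::Memory)
      return 0; // buffer / memory path handled here
   return resources.at(r).slot; // legacy path: throws on the empty vector
}

// New shape: the switch names the supported files, and the default case
// states explicitly that everything else is unsupported.
int loadNew(File f)
{
   switch (f) {
   case File::Buffer:
   case File::Memory:
      return 0; // buffer / memory path handled here
   default:
      assert(!"Unsupported srcFile for LOAD");
      return -1; // reached only when assertions are compiled out (NDEBUG)
   }
}
```

With an empty resources vector, loadOld(File::Image, resources, 0) fails with an exception raised far from the real cause, while loadNew(File::Image) stops at an assert whose message names the problem.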

This is the part of the commit described by this bit of the commit msg:

"Make non buffer / memory file accesses not being supported in these
functions more explicit and comment out a whole bunch of dead code."

I could split this into a separate commit if you want me to.

Regards,

Hans
On 03/16/2016 11:45 AM, Hans de Goede wrote:
> Hi,
>
> On 16-03-16 11:37, Samuel Pitoiset wrote:
>> Could you please get rid of the cosmetic changes (eg. the switch ones)?
>> Because this doesn't really improve readability and in my opinion
>> these changes should be eventually done in a separate patch.
>
> I need at least half of those cosmetic changes, because half of them are
> not cosmetic, e.g.:
>
> -   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002; break;
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      code[0] = 0x00000000;
> +      code[1] = 0xe0000000;
> +      break;
> +   case FILE_MEMORY_LOCAL:
> +      code[0] = 0x00000002;
> +      code[1] = 0x7a800000;
> +      break;
>
> The first bit actually changes things to have 2 cases for the BUFFER code;
> another way of writing this would be:
>
> +   case FILE_MEMORY_GLOBAL:
>      case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
>      case FILE_MEMORY_LOCAL: code[1] = 0x7a800000; code[0] = 0x00000002; break;
>
> But that just looks weird, if we have multiple case labels we should not use
> the single line statement following the case label style IMHO, which
> brings us to:
>
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      code[0] = 0x00000000;
> +      code[1] = 0xe0000000;
> +      break;
>
> At which point keeping the LOCAL code looks ugly IMHO:
>
> +   case FILE_MEMORY_BUFFER:
> +   case FILE_MEMORY_GLOBAL:
> +      code[0] = 0x00000000;
> +      code[1] = 0xe0000000;
> +      break;
>      case FILE_MEMORY_LOCAL: code[1] = 0x7a800000; code[0] = 0x00000002; break;
>      case FILE_MEMORY_SHARED:
>         code[0] = 0x00000002;
>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED)
>
> Notice how the FILE_MEMORY_LOCAL case looks weird now.
>
> Note I'm open to fixing this however you like, just explaining why I did it
> the way I did it.

This actually makes more sense, and you have strong arguments. :-)
Feel free to keep this as is, though at first glance it looked weird.
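
For anyone reading along later, the grouped-label style being discussed can be sketched as a stand-alone function. The two encoding words are copied from the GK110 emitSTORE hunk quoted above; the local DataFile enum, the Encoding struct, and the storeEncoding name are only illustrative stand-ins, not the real emitter interface.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for nv50_ir::DataFile, reduced to the files discussed here.
enum DataFile {
   FILE_MEMORY_BUFFER,
   FILE_MEMORY_GLOBAL,
   FILE_MEMORY_LOCAL,
   FILE_OTHER
};

// Hypothetical holder for the two instruction words (code[0] and code[1]).
struct Encoding {
   uint32_t code0;
   uint32_t code1;
};

// Two case labels share one multi-line body, so BUFFER and GLOBAL cannot
// drift apart. LOCAL uses the same multi-line layout for consistency.
Encoding storeEncoding(DataFile file)
{
   Encoding e = {0, 0};

   switch (file) {
   case FILE_MEMORY_BUFFER:
   case FILE_MEMORY_GLOBAL:
      e.code0 = 0x00000000;
      e.code1 = 0xe0000000;
      break;
   case FILE_MEMORY_LOCAL:
      e.code0 = 0x00000002;
      e.code1 = 0x7a800000;
      break;
   default:
      assert(!"file not covered by this sketch");
      break;
   }
   return e;
}
```

Calling storeEncoding(FILE_MEMORY_GLOBAL) returns the same pair of words as storeEncoding(FILE_MEMORY_BUFFER), which is the whole point of the extra label.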

>
>> Other than that, this patch is :
>>
>> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
>
> Thanks for the reviews!

You're welcome.

>
> Regards,
>
> Hans
>
>
>> Yes, this probably won't work as is for atomic operations but the
>> lowering pass is already here, so it should be easy to make it work.
>>
>> On 03/16/2016 10:23 AM, Hans de Goede wrote:
>>> Add support for OpenCL global memory buffers, note this has only
>>> been tested with regular load and stores and likely needs more work
>>> for e.g. atomic ops.
>>>
>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>> ---
>>>   src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  1 +
>>>   .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 31
>>> +++++++++++++++++-----
>>>   .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp |  5 +++-
>>>   .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp  | 10 ++++---
>>>   .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 26
>>> +++++++++++++-----
>>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 14 +++++++---
>>>   .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  5 +++-
>>>   .../drivers/nouveau/codegen/nv50_ir_print.cpp      |  1 +
>>>   .../nouveau/codegen/nv50_ir_target_nv50.cpp        |  1 +
>>>   .../nouveau/codegen/nv50_ir_target_nvc0.cpp        |  1 +
>>>   10 files changed, 74 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>>> index fdc2195..5141fc6 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>>> @@ -333,6 +333,7 @@ enum DataFile
>>>      FILE_SHADER_INPUT,
>>>      FILE_SHADER_OUTPUT,
>>>      FILE_MEMORY_BUFFER,
>>> +   FILE_MEMORY_GLOBAL,
>>>      FILE_MEMORY_SHARED,
>>>      FILE_MEMORY_LOCAL,
>>>      FILE_SYSTEM_VALUE,
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> index 02a1101..62f1598 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>>> @@ -1641,8 +1641,15 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>>>      int32_t offset = SDATA(i->src(0)).offset;
>>>
>>>      switch (i->src(0).getFile()) {
>>> -   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] =
>>> 0x00000000; break;
>>> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] =
>>> 0x00000002; break;
>>> +   case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>> +      code[0] = 0x00000000;
>>> +      code[1] = 0xe0000000;
>>> +      break;
>>> +   case FILE_MEMORY_LOCAL:
>>> +      code[0] = 0x00000002;
>>> +      code[1] = 0x7a800000;
>>> +      break;
>>>      case FILE_MEMORY_SHARED:
>>>         code[0] = 0x00000002;
>>>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED)
>>> @@ -1678,7 +1685,8 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>>>
>>>      srcId(i->src(1), 2);
>>>      srcId(i->src(0).getIndirect(0), 10);
>>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER &&
>>> +   if ((i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +        i->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>>>          i->src(0).isIndirect(0) &&
>>>          i->getIndirect(0, 0)->reg.size == 8)
>>>         code[1] |= 1 << 23;
>>> @@ -1690,8 +1698,15 @@ CodeEmitterGK110::emitLOAD(const Instruction *i)
>>>      int32_t offset = SDATA(i->src(0)).offset;
>>>
>>>      switch (i->src(0).getFile()) {
>>> -   case FILE_MEMORY_BUFFER: code[1] = 0xc0000000; code[0] =
>>> 0x00000000; break;
>>> -   case FILE_MEMORY_LOCAL:  code[1] = 0x7a000000; code[0] =
>>> 0x00000002; break;
>>> +   case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>> +      code[0] = 0x00000000;
>>> +      code[1] = 0xc0000000;
>>> +      break;
>>> +   case FILE_MEMORY_LOCAL:
>>> +      code[0] = 0x00000002;
>>> +      code[1] = 0x7a000000;
>>> +      break;
>>>      case FILE_MEMORY_SHARED:
>>>         code[0] = 0x00000002;
>>>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED)
>>> @@ -1800,7 +1815,8 @@ CodeEmitterGK110::emitMOV(const Instruction *i)
>>>   static inline bool
>>>   uses64bitAddress(const Instruction *ldst)
>>>   {
>>> -   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>>> +   return (ldst->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +           ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>>>         ldst->src(0).isIndirect(0) &&
>>>         ldst->getIndirect(0, 0)->reg.size == 8;
>>>   }
>>> @@ -1862,7 +1878,8 @@ CodeEmitterGK110::emitCCTL(const Instruction *i)
>>>
>>>      code[0] = 0x00000002 | (i->subOp << 2);
>>>
>>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>>         code[1] = 0x7b000000;
>>>      } else {
>>>         code[1] = 0x7c000000;
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>> index 27f287f..3fcdc55 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>>> @@ -2417,7 +2417,8 @@ void
>>>   CodeEmitterGM107::emitCCTL()
>>>   {
>>>      unsigned width;
>>> -   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>> +   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +       insn->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>>         emitInsn(0xef600000);
>>>         width = 30;
>>>      } else {
>>> @@ -2989,6 +2990,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>>         case FILE_MEMORY_LOCAL : emitLDL(); break;
>>>         case FILE_MEMORY_SHARED: emitLDS(); break;
>>>         case FILE_MEMORY_BUFFER: emitLD(); break;
>>> +      case FILE_MEMORY_GLOBAL: emitLD(); break;
>>>         default:
>>>            assert(!"invalid load");
>>>            emitNOP();
>>> @@ -3000,6 +3002,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>>         case FILE_MEMORY_LOCAL : emitSTL(); break;
>>>         case FILE_MEMORY_SHARED: emitSTS(); break;
>>>         case FILE_MEMORY_BUFFER: emitST(); break;
>>> +      case FILE_MEMORY_GLOBAL: emitST(); break;
>>>         default:
>>>            assert(!"invalid load");
>>>            emitNOP();
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>>> index 7476e21..2653c82 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>>> @@ -663,6 +663,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>>         code[1] = 0x40000000;
>>>         break;
>>>      case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>>>         code[1] = 0x80000000;
>>>         break;
>>> @@ -671,7 +672,8 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>>         break;
>>>      }
>>>      if (sf == FILE_MEMORY_LOCAL ||
>>> -       sf == FILE_MEMORY_BUFFER)
>>> +       sf == FILE_MEMORY_BUFFER ||
>>> +       sf == FILE_MEMORY_GLOBAL)
>>>         emitLoadStoreSizeLG(i->sType, 21 + 32);
>>>
>>>      setDst(i, 0);
>>> @@ -679,7 +681,8 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>>      emitFlagsRd(i);
>>>      emitFlagsWr(i);
>>>
>>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>>         srcId(*i->src(0).getIndirect(0), 9);
>>>      } else {
>>>         setAReg16(i, 0);
>>> @@ -700,6 +703,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>>>         srcId(i->src(1), 32 + 14);
>>>         break;
>>>      case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>>>         code[1] = 0xa0000000;
>>>         emitLoadStoreSizeLG(i->dType, 21 + 32);
>>> @@ -737,7 +741,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>>>         break;
>>>      }
>>>
>>> -   if (f == FILE_MEMORY_BUFFER)
>>> +   if (f == FILE_MEMORY_BUFFER || f == FILE_MEMORY_GLOBAL)
>>>         srcId(*i->src(0).getIndirect(0), 9);
>>>      else
>>>         setAReg16(i, 0);
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>>> index 6236659..ca475ce 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>>> @@ -281,6 +281,7 @@ CodeEmitterNVC0::setAddressByFile(const ValueRef&
>>> src)
>>>   {
>>>      switch (src.getFile()) {
>>>      case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>>         srcAddr32(src, 26, 0);
>>>         break;
>>>      case FILE_MEMORY_LOCAL:
>>> @@ -1768,7 +1769,8 @@ CodeEmitterNVC0::emitCachingMode(CacheMode c)
>>>   static inline bool
>>>   uses64bitAddress(const Instruction *ldst)
>>>   {
>>> -   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>>> +   return (ldst->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +           ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) &&
>>>         ldst->src(0).isIndirect(0) &&
>>>         ldst->getIndirect(0, 0)->reg.size == 8;
>>>   }
>>> @@ -1779,8 +1781,13 @@ CodeEmitterNVC0::emitSTORE(const Instruction *i)
>>>      uint32_t opc;
>>>
>>>      switch (i->src(0).getFile()) {
>>> -   case FILE_MEMORY_BUFFER: opc = 0x90000000; break;
>>> -   case FILE_MEMORY_LOCAL:  opc = 0xc8000000; break;
>>> +   case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>> +      opc = 0x90000000;
>>> +      break;
>>> +   case FILE_MEMORY_LOCAL:
>>> +      opc = 0xc8000000;
>>> +      break;
>>>      case FILE_MEMORY_SHARED:
>>>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED) {
>>>            if (targ->getChipset() >= NVISA_GK104_CHIPSET)
>>> @@ -1828,8 +1835,13 @@ CodeEmitterNVC0::emitLOAD(const Instruction *i)
>>>      code[0] = 0x00000005;
>>>
>>>      switch (i->src(0).getFile()) {
>>> -   case FILE_MEMORY_BUFFER: opc = 0x80000000; break;
>>> -   case FILE_MEMORY_LOCAL:  opc = 0xc0000000; break;
>>> +   case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>> +      opc = 0x80000000;
>>> +      break;
>>> +   case FILE_MEMORY_LOCAL:
>>> +      opc = 0xc0000000;
>>> +      break;
>>>      case FILE_MEMORY_SHARED:
>>>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED) {
>>>            if (targ->getChipset() >= NVISA_GK104_CHIPSET)
>>> @@ -2090,7 +2102,8 @@ CodeEmitterNVC0::emitCCTL(const Instruction *i)
>>>   {
>>>      code[0] = 0x00000005 | (i->subOp << 5);
>>>
>>> -   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER ||
>>> +       i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>>>         code[1] = 0x98000000;
>>>         srcAddr32(i->src(0), 28, 2);
>>>      } else {
>>> @@ -3122,6 +3135,7 @@ SchedDataCalculator::checkRd(const Value *v,
>>> int cycle, int& delay) const
>>>      case FILE_MEMORY_CONST:
>>>      case FILE_MEMORY_SHARED:
>>>      case FILE_MEMORY_BUFFER:
>>> +   case FILE_MEMORY_GLOBAL:
>>>      case FILE_SYSTEM_VALUE:
>>>         // TODO: any restrictions here ?
>>>         break;
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> index 91879e4..c167c4a 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> @@ -374,7 +374,7 @@ static nv50_ir::DataFile translateFile(uint file)
>>>      case TGSI_FILE_IMMEDIATE:       return nv50_ir::FILE_IMMEDIATE;
>>>      case TGSI_FILE_SYSTEM_VALUE:    return nv50_ir::FILE_SYSTEM_VALUE;
>>>      case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_BUFFER;
>>> -   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_BUFFER;
>>> +   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_GLOBAL;
>>>      case TGSI_FILE_SAMPLER:
>>>      case TGSI_FILE_NULL:
>>>      default:
>>> @@ -1284,7 +1284,9 @@ bool Source::scanInstruction(const struct
>>> tgsi_full_instruction *inst)
>>>            if (dst.isIndirect(0))
>>>               indirectTempArrays.insert(dst.getArrayId());
>>>         } else
>>> -      if (dst.getFile() == TGSI_FILE_BUFFER) {
>>> +      if (dst.getFile() == TGSI_FILE_BUFFER ||
>>> +          (dst.getFile() == TGSI_FILE_MEMORY &&
>>> +           memoryFiles[dst.getIndex(0)].mem_type ==
>>> TGSI_MEMORY_TYPE_GLOBAL)) {
>>>            info->io.globalAccess |= 0x2;
>>>         }
>>>      }
>>> @@ -1295,7 +1297,9 @@ bool Source::scanInstruction(const struct
>>> tgsi_full_instruction *inst)
>>>            if (src.isIndirect(0))
>>>               indirectTempArrays.insert(src.getArrayId());
>>>         } else
>>> -      if (src.getFile() == TGSI_FILE_BUFFER) {
>>> +      if (src.getFile() == TGSI_FILE_BUFFER ||
>>> +          (src.getFile() == TGSI_FILE_MEMORY &&
>>> +           memoryFiles[src.getIndex(0)].mem_type ==
>>> TGSI_MEMORY_TYPE_GLOBAL)) {
>>>            info->io.globalAccess |= (insn.getOpcode() ==
>>> TGSI_OPCODE_LOAD) ?
>>>                  0x1 : 0x2;
>>>         } else
>>> @@ -1529,6 +1533,10 @@ Converter::makeSym(uint tgsiFile, int fileIdx,
>>> int idx, int c, uint32_t address)
>>>
>>>      if (tgsiFile == TGSI_FILE_MEMORY) {
>>>         switch (code->memoryFiles[fileIdx].mem_type) {
>>> +      case TGSI_MEMORY_TYPE_GLOBAL:
>>> +         /* No-op this is the default for TGSI_FILE_MEMORY */
>>> +         sym->setFile(FILE_MEMORY_GLOBAL);
>>> +         break;
>>>         case TGSI_MEMORY_TYPE_SHARED:
>>>            sym->setFile(FILE_MEMORY_SHARED);
>>>            break;
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> index 4a96d04..84d2944 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>>> @@ -2581,6 +2581,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>>                ldst->op == OP_MEMBAR) {
>>>               purgeRecords(NULL, FILE_MEMORY_LOCAL);
>>>               purgeRecords(NULL, FILE_MEMORY_BUFFER);
>>> +            purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>>>               purgeRecords(NULL, FILE_MEMORY_SHARED);
>>>               purgeRecords(NULL, FILE_SHADER_OUTPUT);
>>>            } else
>>> @@ -2588,6 +2589,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>>               if (ldst->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>>                  purgeRecords(NULL, FILE_MEMORY_LOCAL);
>>>                  purgeRecords(NULL, FILE_MEMORY_BUFFER);
>>> +               purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>>>                  purgeRecords(NULL, FILE_MEMORY_SHARED);
>>>               } else {
>>>                  purgeRecords(NULL, ldst->src(0).getFile());
>>> @@ -2607,7 +2609,8 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>>            DataFile file = ldst->src(0).getFile();
>>>
>>>            // if ld l[]/g[] look for previous store to eliminate the
>>> reload
>>> -         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL) {
>>> +         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL ||
>>> +             file == FILE_MEMORY_GLOBAL) {
>>>               // TODO: shared memory ?
>>>               rec = findRecord(ldst, false, isAdjacent);
>>>               if (rec && !isAdjacent)
>>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>>> index 73ed753..3917768 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>>> @@ -456,6 +456,7 @@ int Symbol::print(char *buf, size_t size,
>>>      case FILE_SHADER_INPUT:  c = 'a'; break;
>>>      case FILE_SHADER_OUTPUT: c = 'o'; break;
>>>      case FILE_MEMORY_BUFFER: c = 'g'; break;
>>> +   case FILE_MEMORY_GLOBAL: c = 'g'; break;
>>>      case FILE_MEMORY_SHARED: c = 's'; break;
>>>      case FILE_MEMORY_LOCAL:  c = 'l'; break;
>>>      default:
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>>> index 1cd45a2..5c60b22 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>>> @@ -208,6 +208,7 @@ TargetNV50::getFileSize(DataFile file) const
>>>      case FILE_SHADER_INPUT:  return 0x200;
>>>      case FILE_SHADER_OUTPUT: return 0x200;
>>>      case FILE_MEMORY_BUFFER: return 0xffffffff;
>>> +   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>>>      case FILE_MEMORY_SHARED: return 16 << 10;
>>>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>>>      case FILE_SYSTEM_VALUE:  return 16;
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>>> index bda59a5..9e1e7bf 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>>> @@ -249,6 +249,7 @@ TargetNVC0::getFileSize(DataFile file) const
>>>      case FILE_SHADER_INPUT:  return 0x400;
>>>      case FILE_SHADER_OUTPUT: return 0x400;
>>>      case FILE_MEMORY_BUFFER: return 0xffffffff;
>>> +   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>>>      case FILE_MEMORY_SHARED: return 16 << 10;
>>>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>>>      case FILE_SYSTEM_VALUE:  return 32;
>>>
On 03/16/2016 11:49 AM, Hans de Goede wrote:
> Hi,
>
> On 16-03-16 11:45, Samuel Pitoiset wrote:
>>
>>
>> On 03/16/2016 10:23 AM, Hans de Goede wrote:
>>> Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses")
>>> disabled /
>>> commented out some of the old resource handling code, but not all of it.
>>>
>>> Effectively all of it is dead already, if we ever enter the old code
>>> paths in handleLOAD / handleSTORE / handleATOM we will get an exception
>>> due to trying to access the now always zero-sized resources vector.
>>>
>>> Make non buffer / memory file accesses not being supported in these
>>> functions more explicit and comment out a whole bunch of dead code.
>>>
>>> Also remove the magic file-index defines from the old resource code
>>> from include/pipe/p_shader_tokens.h as those are no longer used now
>>> (which is a good thing).
>>>
>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>> ---
>>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 42
>>> +++++++++++++++-------
>>>   src/gallium/include/pipe/p_shader_tokens.h         |  9 -----
>>>   2 files changed, 30 insertions(+), 21 deletions(-)
>>>
>>> diff --git
>>> a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> index c167c4a..115d0bb 100644
>>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>>> @@ -856,12 +856,14 @@ public:
>>>      };
>>>      std::vector<TextureView> textureViews;
>>>
>>> +   /*
>>>      struct Resource {
>>>         uint8_t target; // TGSI_TEXTURE_*
>>>         bool raw;
>>>         uint8_t slot; // $surface index
>>>      };
>>>      std::vector<Resource> resources;
>>> +   */
>>>
>>>      struct MemoryFile {
>>>         uint8_t mem_type; // TGSI_MEMORY_TYPE_*
>>> @@ -1423,8 +1425,8 @@ private:
>>>      void handleLIT(Value *dst0[4]);
>>>      void handleUserClipPlanes();
>>>
>>> -   Symbol *getResourceBase(int r);
>>> -   void getResourceCoords(std::vector<Value *>&, int r, int s);
>>> +   // Symbol *getResourceBase(int r);
>>> +   // void getResourceCoords(std::vector<Value *>&, int r, int s);
>>>
>>>      void handleLOAD(Value *dst0[4]);
>>>      void handleSTORE();
>>> @@ -2169,6 +2171,7 @@ Converter::handleLIT(Value *dst0[4])
>>>      }
>>>   }
>>>
>>> +/* Keep this around for now as reference when adding img support
>>>   static inline bool
>>>   isResourceSpecial(const int r)
>>>   {
>>> @@ -2264,6 +2267,7 @@ partitionLoadStore(uint8_t comp[2], uint8_t
>>> size[2], uint8_t mask)
>>>      }
>>>      return n + 1;
>>>   }
>>> +*/
>>>
>>>   // For raw loads, granularity is 4 byte.
>>>   // Usage of the texture read mask on OP_SULDP is not allowed.
>>> @@ -2274,8 +2278,9 @@ Converter::handleLOAD(Value *dst0[4])
>>>      int c;
>>>      std::vector<Value *> off, src, ldv, def;
>>>
>>> -   if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER ||
>>> -       tgsi.getSrc(0).getFile() == TGSI_FILE_MEMORY) {
>>> +   switch (tgsi.getSrc(0).getFile()) {
>>> +   case TGSI_FILE_BUFFER:
>>> +   case TGSI_FILE_MEMORY:
>>>         for (c = 0; c < 4; ++c) {
>>>            if (!dst0[c])
>>>               continue;
>>> @@ -2295,9 +2300,12 @@ Converter::handleLOAD(Value *dst0[4])
>>>            if (tgsi.getSrc(0).isIndirect(0))
>>>               ld->setIndirect(0, 1,
>>> fetchSrc(tgsi.getSrc(0).getIndirect(0), 0, 0));
>>>         }
>>> -      return;
>>> +      break;
>>> +   default:
>>> +      assert(!"Unsupported srcFile for LOAD");
>>>      }
>>>
>>> +/* Keep this around for now as reference when adding img support
>>>      getResourceCoords(off, r, 1);
>>>
>>>      if (isResourceRaw(code, r)) {
>>> @@ -2363,6 +2371,7 @@ Converter::handleLOAD(Value *dst0[4])
>>>      FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi)
>>>         if (dst0[c] != def[c])
>>>            mkMov(dst0[c], def[tgsi.getSrc(0).getSwizzle(c)]);
>>> +*/
>>>   }
>>>
>>>   // For formatted stores, the write mask on OP_SUSTP can be used.
>>> @@ -2374,8 +2383,9 @@ Converter::handleSTORE()
>>>      int c;
>>>      std::vector<Value *> off, src, dummy;
>>>
>>> -   if (tgsi.getDst(0).getFile() == TGSI_FILE_BUFFER ||
>>> -       tgsi.getDst(0).getFile() == TGSI_FILE_MEMORY) {
>>> +   switch (tgsi.getDst(0).getFile()) {
>>> +   case TGSI_FILE_BUFFER:
>>> +   case TGSI_FILE_MEMORY:
>>>         for (c = 0; c < 4; ++c) {
>>>            if (!(tgsi.getDst(0).getMask() & (1 << c)))
>>>               continue;
>>> @@ -2396,9 +2406,12 @@ Converter::handleSTORE()
>>>            if (tgsi.getDst(0).isIndirect(0))
>>>               st->setIndirect(0, 1,
>>> fetchSrc(tgsi.getDst(0).getIndirect(0), 0, 0));
>>>         }
>>> -      return;
>>> +      break;
>>> +   default:
>>> +      assert(!"Unsupported dstFile for STORE");
>>>      }
>>>
>>> +/* Keep this around for now as reference when adding img support
>>>      getResourceCoords(off, r, 0);
>>>      src = off;
>>>      const int s = src.size();
>>> @@ -2446,6 +2459,7 @@ Converter::handleSTORE()
>>>         mkTex(OP_SUSTP, getResourceTarget(code, r),
>>> code->resources[r].slot, 0,
>>>               dummy, src)->tex.mask = tgsi.getDst(0).getMask();
>>>      }
>>> +*/
>>>   }
>>>
>>>   // XXX: These only work on resources with the single-component
>>> u32/s32 formats.
>>> @@ -2460,8 +2474,9 @@ Converter::handleATOM(Value *dst0[4], DataType
>>> ty, uint16_t subOp)
>>>      std::vector<Value *> defv;
>>>      LValue *dst = getScratch();
>>>
>>> -   if (tgsi.getSrc(0).getFile() == TGSI_FILE_BUFFER ||
>>> -       tgsi.getSrc(0).getFile() == TGSI_FILE_MEMORY) {
>>> +   switch (tgsi.getSrc(0).getFile()) {
>>> +   case TGSI_FILE_BUFFER:
>>> +   case TGSI_FILE_MEMORY:
>>>         for (int c = 0; c < 4; ++c) {
>>>            if (!dst0[c])
>>>               continue;
>>> @@ -2489,10 +2504,12 @@ Converter::handleATOM(Value *dst0[4],
>>> DataType ty, uint16_t subOp)
>>>         for (int c = 0; c < 4; ++c)
>>>            if (dst0[c])
>>>               dst0[c] = dst; // not equal to rDst so
>>> handleInstruction will do mkMov
>>> -      return;
>>> +      break;
>>> +   default:
>>> +      assert(!"Unsupported srcFile for ATOM");
>>>      }
>>>
>>> -
>>> +/* Keep this around for now as reference when adding img support
>>>      getResourceCoords(srcv, r, 1);
>>>
>>>      if (isResourceSpecial(r)) {
>>> @@ -2520,6 +2537,7 @@ Converter::handleATOM(Value *dst0[4], DataType
>>> ty, uint16_t subOp)
>>>      for (int c = 0; c < 4; ++c)
>>>         if (dst0[c])
>>>            dst0[c] = dst; // not equal to rDst so handleInstruction
>>> will do mkMov
>>> +*/
>>>   }
>>>
>>>   void
>>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h
>>> b/src/gallium/include/pipe/p_shader_tokens.h
>>> index 65d8ad9..5ef6c30 100644
>>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>>> @@ -237,15 +237,6 @@ struct tgsi_declaration_array {
>>>      unsigned Padding : 22;
>>>   };
>>>
>>> -/*
>>> - * Special resources that don't need to be declared.  They map to the
>>> - * GLOBAL/LOCAL/PRIVATE/INPUT compute memory spaces.
>>> - */
>>> -#define TGSI_RESOURCE_GLOBAL    0x7fff
>>> -#define TGSI_RESOURCE_LOCAL    0x7ffe
>>> -#define TGSI_RESOURCE_PRIVATE    0x7ffd
>>> -#define TGSI_RESOURCE_INPUT    0x7ffc
>>> -
>>
>> This should be in a separate patch with "gallium:" as prefix even if
>> nouveau is the only driver which somehow uses these constants.
>
> Ok, will do.
>
>> Other than that, the patch looks fine.
>> And thanks for not removing this resource thing, because it could help
>> with arb_shader_image_load_store. :-)
>>
>> I have the same comment as the previous patch, I think the cosmetic
>> changes should not be here.
>
> You mean the change from if (... || ...) to switch/case? That is not
> cosmetic; note the new default: code path with the assert(). It replaces
> the implicit assert we had before, in the form of the old resource code
> throwing an exception because it indexed a zero-sized vector.
>
> This is the part of the commit described by this bit of the commit msg:
>
> "Make non buffer / memory file accesses not being supported in these
> functions more explicit and comment out a whole bunch of dead code."
>
> I could split this into a separate commit if you want me to.

Yeah, that seems better.
Thanks.

>
> Regards,
>
> Hans
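
To illustrate the point made above about the switch not being cosmetic, here is a minimal sketch of the two shapes. It is not the driver code: the Sketch* names and the returned strings are hypothetical stand-ins, and the default branch returns a string where the patch uses assert(!"Unsupported srcFile for LOAD"), purely so the behaviour can be observed.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the TGSI file types; not the real Mesa enum.
enum SketchFile { SKETCH_FILE_BUFFER, SKETCH_FILE_MEMORY, SKETCH_FILE_IMAGE };

struct SketchResource { uint8_t slot; };

// Old shape: anything that is not a buffer/memory access falls through to
// the resource path, which indexes a vector that is now always empty.
// The failure is implicit: at() throws std::out_of_range.
std::string oldStyleLoad(SketchFile f,
                         const std::vector<SketchResource> &resources)
{
   if (f == SKETCH_FILE_BUFFER || f == SKETCH_FILE_MEMORY)
      return "buffer load";
   return "resource slot " + std::to_string(resources.at(0).slot);
}

// New shape: the unsupported case is an explicit default branch.
std::string newStyleLoad(SketchFile f)
{
   switch (f) {
   case SKETCH_FILE_BUFFER:
   case SKETCH_FILE_MEMORY:
      return "buffer load";
   default:
      // The patch asserts here; the sketch reports instead of aborting.
      return "unsupported srcFile for LOAD";
   }
}
```

With an empty resources vector, oldStyleLoad() only fails once at() is reached, whereas newStyleLoad() names the unsupported case at the point of dispatch.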
This approach leads to the emitters needing to know about both global and
buffer, even though at that point, they are identical. I was thinking that
in the lowering logic, buffer would just get rewritten as global (with the
offset added), thus not needing any change to the emitters. What do you
think about such an approach?
On Mar 16, 2016 2:24 AM, "Hans de Goede" <hdegoede@redhat.com> wrote:

> FILE_MEMORY_GLOBAL is currently only used for buffer handling, as we
> do not yet have (opencl) global memory support. Global memory support
> actually requires some different handling during lowering, so rename
> FILE_MEMORY_GLOBAL to FILE_MEMORY_BUFFER to reflect that the current
> code is for buffer handling, this will allow the later (re-)addition
> of FILE_MEMORY_GLOBAL for regular global memory.
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h                |  2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp   | 10
> +++++-----
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp   |  6 +++---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp    | 10
> +++++-----
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp    | 12
> ++++++------
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp    |  8 ++++----
>  .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp        | 10
> +++++-----
>  src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp     |  8 ++++----
>  src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp        |  2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp  |  6 +++---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp  |  2 +-
>  11 files changed, 38 insertions(+), 38 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index 7b0eb2f..fdc2195 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -332,7 +332,7 @@ enum DataFile
>     FILE_MEMORY_CONST,
>     FILE_SHADER_INPUT,
>     FILE_SHADER_OUTPUT,
> -   FILE_MEMORY_GLOBAL,
> +   FILE_MEMORY_BUFFER,
>     FILE_MEMORY_SHARED,
>     FILE_MEMORY_LOCAL,
>     FILE_SYSTEM_VALUE,
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> index 70f3c3f..02a1101 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> @@ -1641,7 +1641,7 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>     int32_t offset = SDATA(i->src(0)).offset;
>
>     switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: code[1] = 0xe0000000; code[0] = 0x00000000;
> break;
> +   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000;
> break;
>     case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002;
> break;
>     case FILE_MEMORY_SHARED:
>        code[0] = 0x00000002;
> @@ -1678,7 +1678,7 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>
>     srcId(i->src(1), 2);
>     srcId(i->src(0).getIndirect(0), 10);
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL &&
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER &&
>         i->src(0).isIndirect(0) &&
>         i->getIndirect(0, 0)->reg.size == 8)
>        code[1] |= 1 << 23;
> @@ -1690,7 +1690,7 @@ CodeEmitterGK110::emitLOAD(const Instruction *i)
>     int32_t offset = SDATA(i->src(0)).offset;
>
>     switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: code[1] = 0xc0000000; code[0] = 0x00000000;
> break;
> +   case FILE_MEMORY_BUFFER: code[1] = 0xc0000000; code[0] = 0x00000000;
> break;
>     case FILE_MEMORY_LOCAL:  code[1] = 0x7a000000; code[0] = 0x00000002;
> break;
>     case FILE_MEMORY_SHARED:
>        code[0] = 0x00000002;
> @@ -1800,7 +1800,7 @@ CodeEmitterGK110::emitMOV(const Instruction *i)
>  static inline bool
>  uses64bitAddress(const Instruction *ldst)
>  {
> -   return ldst->src(0).getFile() == FILE_MEMORY_GLOBAL &&
> +   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>        ldst->src(0).isIndirect(0) &&
>        ldst->getIndirect(0, 0)->reg.size == 8;
>  }
> @@ -1862,7 +1862,7 @@ CodeEmitterGK110::emitCCTL(const Instruction *i)
>
>     code[0] = 0x00000002 | (i->subOp << 2);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>        code[1] = 0x7b000000;
>     } else {
>        code[1] = 0x7c000000;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index e079a57..27f287f 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -2417,7 +2417,7 @@ void
>  CodeEmitterGM107::emitCCTL()
>  {
>     unsigned width;
> -   if (insn->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER) {
>        emitInsn(0xef600000);
>        width = 30;
>     } else {
> @@ -2988,7 +2988,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>        case FILE_MEMORY_CONST : emitLDC(); break;
>        case FILE_MEMORY_LOCAL : emitLDL(); break;
>        case FILE_MEMORY_SHARED: emitLDS(); break;
> -      case FILE_MEMORY_GLOBAL: emitLD(); break;
> +      case FILE_MEMORY_BUFFER: emitLD(); break;
>        default:
>           assert(!"invalid load");
>           emitNOP();
> @@ -2999,7 +2999,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>        switch (insn->src(0).getFile()) {
>        case FILE_MEMORY_LOCAL : emitSTL(); break;
>        case FILE_MEMORY_SHARED: emitSTS(); break;
> -      case FILE_MEMORY_GLOBAL: emitST(); break;
> +      case FILE_MEMORY_BUFFER: emitST(); break;
>        default:
>           assert(!"invalid load");
>           emitNOP();
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> index 682a19d..7476e21 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
> @@ -662,7 +662,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>        code[0] = 0xd0000001;
>        code[1] = 0x40000000;
>        break;
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>        code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>        code[1] = 0x80000000;
>        break;
> @@ -671,7 +671,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>        break;
>     }
>     if (sf == FILE_MEMORY_LOCAL ||
> -       sf == FILE_MEMORY_GLOBAL)
> +       sf == FILE_MEMORY_BUFFER)
>        emitLoadStoreSizeLG(i->sType, 21 + 32);
>
>     setDst(i, 0);
> @@ -679,7 +679,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>     emitFlagsRd(i);
>     emitFlagsWr(i);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>        srcId(*i->src(0).getIndirect(0), 9);
>     } else {
>        setAReg16(i, 0);
> @@ -699,7 +699,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>        code[1] = 0x80c00000;
>        srcId(i->src(1), 32 + 14);
>        break;
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>        code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>        code[1] = 0xa0000000;
>        emitLoadStoreSizeLG(i->dType, 21 + 32);
> @@ -737,7 +737,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>        break;
>     }
>
> -   if (f == FILE_MEMORY_GLOBAL)
> +   if (f == FILE_MEMORY_BUFFER)
>        srcId(*i->src(0).getIndirect(0), 9);
>     else
>        setAReg16(i, 0);
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index 8b9328b..6236659 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -280,7 +280,7 @@ void
>  CodeEmitterNVC0::setAddressByFile(const ValueRef& src)
>  {
>     switch (src.getFile()) {
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>        srcAddr32(src, 26, 0);
>        break;
>     case FILE_MEMORY_LOCAL:
> @@ -1768,7 +1768,7 @@ CodeEmitterNVC0::emitCachingMode(CacheMode c)
>  static inline bool
>  uses64bitAddress(const Instruction *ldst)
>  {
> -   return ldst->src(0).getFile() == FILE_MEMORY_GLOBAL &&
> +   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>        ldst->src(0).isIndirect(0) &&
>        ldst->getIndirect(0, 0)->reg.size == 8;
>  }
> @@ -1779,7 +1779,7 @@ CodeEmitterNVC0::emitSTORE(const Instruction *i)
>     uint32_t opc;
>
>     switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: opc = 0x90000000; break;
> +   case FILE_MEMORY_BUFFER: opc = 0x90000000; break;
>     case FILE_MEMORY_LOCAL:  opc = 0xc8000000; break;
>     case FILE_MEMORY_SHARED:
>        if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED) {
> @@ -1828,7 +1828,7 @@ CodeEmitterNVC0::emitLOAD(const Instruction *i)
>     code[0] = 0x00000005;
>
>     switch (i->src(0).getFile()) {
> -   case FILE_MEMORY_GLOBAL: opc = 0x80000000; break;
> +   case FILE_MEMORY_BUFFER: opc = 0x80000000; break;
>     case FILE_MEMORY_LOCAL:  opc = 0xc0000000; break;
>     case FILE_MEMORY_SHARED:
>        if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED) {
> @@ -2090,7 +2090,7 @@ CodeEmitterNVC0::emitCCTL(const Instruction *i)
>  {
>     code[0] = 0x00000005 | (i->subOp << 5);
>
> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>        code[1] = 0x98000000;
>        srcAddr32(i->src(0), 28, 2);
>     } else {
> @@ -3121,7 +3121,7 @@ SchedDataCalculator::checkRd(const Value *v, int
> cycle, int& delay) const
>     case FILE_MEMORY_LOCAL:
>     case FILE_MEMORY_CONST:
>     case FILE_MEMORY_SHARED:
> -   case FILE_MEMORY_GLOBAL:
> +   case FILE_MEMORY_BUFFER:
>     case FILE_SYSTEM_VALUE:
>        // TODO: any restrictions here ?
>        break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index 1e91ad3..91879e4 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -373,8 +373,8 @@ static nv50_ir::DataFile translateFile(uint file)
>     case TGSI_FILE_PREDICATE:       return nv50_ir::FILE_PREDICATE;
>     case TGSI_FILE_IMMEDIATE:       return nv50_ir::FILE_IMMEDIATE;
>     case TGSI_FILE_SYSTEM_VALUE:    return nv50_ir::FILE_SYSTEM_VALUE;
> -   case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_GLOBAL;
> -   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_GLOBAL;
> +   case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_BUFFER;
> +   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_BUFFER;
>     case TGSI_FILE_SAMPLER:
>     case TGSI_FILE_NULL:
>     default:
> @@ -2191,7 +2191,7 @@ Converter::getResourceBase(const int r)
>
>     switch (r) {
>     case TGSI_RESOURCE_GLOBAL:
> -      sym = new_Symbol(prog, nv50_ir::FILE_MEMORY_GLOBAL, 15);
> +      sym = new_Symbol(prog, nv50_ir::FILE_MEMORY_BUFFER, 15);
>        break;
>     case TGSI_RESOURCE_LOCAL:
>        assert(prog->getType() == Program::TYPE_COMPUTE);
> @@ -2209,7 +2209,7 @@ Converter::getResourceBase(const int r)
>        break;
>     default:
>        sym = new_Symbol(prog,
> -                       nv50_ir::FILE_MEMORY_GLOBAL, code->resources.at
> (r).slot);
> +                       nv50_ir::FILE_MEMORY_BUFFER, code->resources.at
> (r).slot);
>        break;
>     }
>     return sym;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> index d0936d8..563d7c2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
> @@ -1141,7 +1141,7 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>        handleSharedATOM(atom);
>        return true;
>     default:
> -      assert(atom->src(0).getFile() == FILE_MEMORY_GLOBAL);
> +      assert(atom->src(0).getFile() == FILE_MEMORY_BUFFER);
>        base = loadResInfo64(ind, atom->getSrc(0)->reg.fileIndex * 16);
>        assert(base->reg.size == 8);
>        if (ptr)
> @@ -1154,7 +1154,7 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>        bld.mkOp1v(OP_RDSV, TYPE_U32, bld.getScratch(), bld.mkSysVal(sv,
> 0));
>
>     atom->setSrc(0, cloneShallow(func, atom->getSrc(0)));
> -   atom->getSrc(0)->reg.file = FILE_MEMORY_GLOBAL;
> +   atom->getSrc(0)->reg.file = FILE_MEMORY_BUFFER;
>     if (ptr)
>        base = bld.mkOp2v(OP_ADD, TYPE_U32, base, base, ptr);
>     atom->setIndirect(0, 1, NULL);
> @@ -1571,7 +1571,7 @@ NVC0LoweringPass::handleSurfaceOpNVE4(TexInstruction
> *su)
>        Instruction *red = bld.mkOp(OP_ATOM, su->dType, su->getDef(0));
>        red->subOp = su->subOp;
>        if (!gMemBase)
> -         gMemBase = bld.mkSymbol(FILE_MEMORY_GLOBAL, 0, TYPE_U32, 0);
> +         gMemBase = bld.mkSymbol(FILE_MEMORY_BUFFER, 0, TYPE_U32, 0);
>        red->setSrc(0, gMemBase);
>        red->setSrc(1, su->getSrc(3));
>        if (su->subOp == NV50_IR_SUBOP_ATOM_CAS)
> @@ -1963,7 +1963,7 @@ NVC0LoweringPass::visit(Instruction *i)
>        } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) {
>           assert(prog->getType() == Program::TYPE_TESSELLATION_CONTROL);
>           i->op = OP_VFETCH;
> -      } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +      } else if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>           Value *ind = i->getIndirect(0, 1);
>           Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex *
> 16);
>           // XXX come up with a way not to do this for EVERY little access
> but
> @@ -1987,7 +1987,7 @@ NVC0LoweringPass::visit(Instruction *i)
>        break;
>     case OP_ATOM:
>     {
> -      const bool cctl = i->src(0).getFile() == FILE_MEMORY_GLOBAL;
> +      const bool cctl = i->src(0).getFile() == FILE_MEMORY_BUFFER;
>        handleATOM(i);
>        handleCasExch(i, cctl);
>     }
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> index 66e7b2e..4a96d04 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
> @@ -2580,14 +2580,14 @@ MemoryOpt::runOpt(BasicBlock *bb)
>               ldst->op == OP_BAR ||
>               ldst->op == OP_MEMBAR) {
>              purgeRecords(NULL, FILE_MEMORY_LOCAL);
> -            purgeRecords(NULL, FILE_MEMORY_GLOBAL);
> +            purgeRecords(NULL, FILE_MEMORY_BUFFER);
>              purgeRecords(NULL, FILE_MEMORY_SHARED);
>              purgeRecords(NULL, FILE_SHADER_OUTPUT);
>           } else
>           if (ldst->op == OP_ATOM || ldst->op == OP_CCTL) {
> -            if (ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) {
> +            if (ldst->src(0).getFile() == FILE_MEMORY_BUFFER) {
>                 purgeRecords(NULL, FILE_MEMORY_LOCAL);
> -               purgeRecords(NULL, FILE_MEMORY_GLOBAL);
> +               purgeRecords(NULL, FILE_MEMORY_BUFFER);
>                 purgeRecords(NULL, FILE_MEMORY_SHARED);
>              } else {
>                 purgeRecords(NULL, ldst->src(0).getFile());
> @@ -2607,7 +2607,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>           DataFile file = ldst->src(0).getFile();
>
>           // if ld l[]/g[] look for previous store to eliminate the reload
> -         if (file == FILE_MEMORY_GLOBAL || file == FILE_MEMORY_LOCAL) {
> +         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL) {
>              // TODO: shared memory ?
>              rec = findRecord(ldst, false, isAdjacent);
>              if (rec && !isAdjacent)
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> index cfa85ec..73ed753 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> @@ -455,7 +455,7 @@ int Symbol::print(char *buf, size_t size,
>     case FILE_MEMORY_CONST:  c = 'c'; break;
>     case FILE_SHADER_INPUT:  c = 'a'; break;
>     case FILE_SHADER_OUTPUT: c = 'o'; break;
> -   case FILE_MEMORY_GLOBAL: c = 'g'; break;
> +   case FILE_MEMORY_BUFFER: c = 'g'; break;
>     case FILE_MEMORY_SHARED: c = 's'; break;
>     case FILE_MEMORY_LOCAL:  c = 'l'; break;
>     default:
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> index 2c4d7f5..1cd45a2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> @@ -207,7 +207,7 @@ TargetNV50::getFileSize(DataFile file) const
>     case FILE_MEMORY_CONST:  return 65536;
>     case FILE_SHADER_INPUT:  return 0x200;
>     case FILE_SHADER_OUTPUT: return 0x200;
> -   case FILE_MEMORY_GLOBAL: return 0xffffffff;
> +   case FILE_MEMORY_BUFFER: return 0xffffffff;
>     case FILE_MEMORY_SHARED: return 16 << 10;
>     case FILE_MEMORY_LOCAL:  return 48 << 10;
>     case FILE_SYSTEM_VALUE:  return 16;
> @@ -406,7 +406,7 @@ TargetNV50::isAccessSupported(DataFile file, DataType
> ty) const
>     if (ty == TYPE_B96 || ty == TYPE_NONE)
>        return false;
>     if (typeSizeof(ty) > 4)
> -      return (file == FILE_MEMORY_LOCAL) || (file == FILE_MEMORY_GLOBAL);
> +      return (file == FILE_MEMORY_LOCAL) || (file == FILE_MEMORY_BUFFER);
>     return true;
>  }
>
> @@ -508,7 +508,7 @@ int TargetNV50::getLatency(const Instruction *i) const
>     if (i->op == OP_LOAD) {
>        switch (i->src(0).getFile()) {
>        case FILE_MEMORY_LOCAL:
> -      case FILE_MEMORY_GLOBAL:
> +      case FILE_MEMORY_BUFFER:
>           return 100; // really 400 to 800
>        default:
>           return 22;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> index a03afa8..bda59a5 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> @@ -248,7 +248,7 @@ TargetNVC0::getFileSize(DataFile file) const
>     case FILE_MEMORY_CONST:  return 65536;
>     case FILE_SHADER_INPUT:  return 0x400;
>     case FILE_SHADER_OUTPUT: return 0x400;
> -   case FILE_MEMORY_GLOBAL: return 0xffffffff;
> +   case FILE_MEMORY_BUFFER: return 0xffffffff;
>     case FILE_MEMORY_SHARED: return 16 << 10;
>     case FILE_MEMORY_LOCAL:  return 48 << 10;
>     case FILE_SYSTEM_VALUE:  return 32;
> --
> 2.7.2
>
>
On 16.03.2016 04:23, Hans de Goede wrote:
> tgsi_default_instruction_memory / tgsi_build_instruction_memory were
> returning uninitialized memory for tgsi_instruction_memory.Texture and
> tgsi_instruction_memory.Format. Note 0 means not set, and thus is a
> correct default initializer for these.
>
> Fixes: 3243b6fc97 ("tgsi: add Texture and Format to tgsi_instruction_memory")
> Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

Thanks.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

> ---
>   src/gallium/auxiliary/tgsi/tgsi_build.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c b/src/gallium/auxiliary/tgsi/tgsi_build.c
> index a3e659b..7e30bb6 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c
> @@ -781,6 +781,8 @@ tgsi_default_instruction_memory( void )
>      struct tgsi_instruction_memory instruction_memory;
>
>      instruction_memory.Qualifier = 0;
> +   instruction_memory.Texture = 0;
> +   instruction_memory.Format = 0;
>      instruction_memory.Padding = 0;
>
>      return instruction_memory;
> @@ -796,6 +798,8 @@ tgsi_build_instruction_memory(
>      struct tgsi_instruction_memory instruction_memory;
>
>      instruction_memory.Qualifier = qualifier;
> +   instruction_memory.Texture = 0;
> +   instruction_memory.Format = 0;
>      instruction_memory.Padding = 0;
>      instruction->Memory = 1;
>
>
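
For readers following the fix quoted above, a minimal sketch of the pattern: a bitfield token returned by value must have every field assigned, otherwise the unassigned bits are indeterminate. The struct below is a hypothetical stand-in; the field names follow the patch, but the bit widths are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for tgsi_instruction_memory; widths are assumed.
struct SketchInstructionMemory {
   unsigned Qualifier : 3;
   unsigned Texture   : 8;
   unsigned Format    : 10;
   unsigned Padding   : 11;
};

// Shape of the fixed function: every field is assigned before returning.
// Before the fix, Texture and Format were left unassigned.
SketchInstructionMemory sketchDefaultInstructionMemory()
{
   SketchInstructionMemory m;

   m.Qualifier = 0;
   m.Texture = 0; // 0 means "not set", per the commit message
   m.Format = 0;  // likewise
   m.Padding = 0;

   return m;
}

// Alternative available in C++ (not what the C patch does):
// value-initialization zeroes all fields at once.
SketchInstructionMemory sketchZeroedInstructionMemory()
{
   SketchInstructionMemory m = {};
   return m;
}
```

Assigning fields one by one matches the existing style in tgsi_build.c; the second function is only shown as a way to avoid missing a field when new ones are added.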
Hi,

On 16-03-16 15:55, Ilia Mirkin wrote:
> This approach leads to the emitters needing to know about both global and
> buffer, even though at that point, they are identical. I was thinking that
> in the lowering logic, buffer would just get rewritten as global (with the
> offset added), thus not needing any change to the emitters. What do you
> think about such an approach?

I was actually thinking the same, but I was a bit afraid I might break
something by doing that. I'm willing to try though, but the result is
going to need some extra testing by others I believe.

Questions:

1) Any tests I can run to test the buffer paths ?

2) So the resulting patch, which would replace this one, and make most
    of the "nouveau: codegen: Add support for OpenCL global memory buffers"
    unnecessary would look something like this:
2a) Add FILE_MEMORY_BUFFER as nv50_ir::FILE_* type
2b) Use it in nv50_ir_from_tgsi.cpp instead of GLOBAL
2c) Use it in nv50_ir_lowering_ to check for buffer accesses,
     and when adding the offset change the file_type to GLOBAL

Right ?

Regards,

Hans
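
A rough sketch of steps 2a-2c above, under heavy simplification. None of these types are the real nv50_ir classes; the Sketch* names are hypothetical, and the table of base addresses stands in for whatever the lowering pass actually loads at run time. The point is only that once lowering has rewritten BUFFER to GLOBAL, the emitter needs no BUFFER case.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// 2a) A separate BUFFER file next to GLOBAL (hypothetical enum).
enum SketchDataFile {
   SKETCH_MEMORY_BUFFER,
   SKETCH_MEMORY_GLOBAL,
   SKETCH_MEMORY_LOCAL
};

struct SketchAccess {
   SketchDataFile file;
   uint32_t fileIndex; // which buffer the access refers to
   uint64_t address;   // offset inside the buffer before lowering
};

// 2c) Lowering: add the buffer's base address to the offset and retag
// the access as GLOBAL. bufferBase is an assumed lookup table.
void lowerBufferToGlobal(SketchAccess &a,
                         const std::vector<uint64_t> &bufferBase)
{
   if (a.file != SKETCH_MEMORY_BUFFER)
      return;
   a.address += bufferBase.at(a.fileIndex);
   a.file = SKETCH_MEMORY_GLOBAL;
   a.fileIndex = 0;
}

// Emitter: unchanged, it only knows GLOBAL and LOCAL.
std::string emitLoad(const SketchAccess &a)
{
   switch (a.file) {
   case SKETCH_MEMORY_GLOBAL: return "ld g[]";
   case SKETCH_MEMORY_LOCAL:  return "ld l[]";
   default:                   return "invalid load";
   }
}
```

Step 2b corresponds to the front end producing SKETCH_MEMORY_BUFFER accesses; a BUFFER access that reaches emitLoad() without being lowered hits the default case, which is where the real emitters assert.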


> On Mar 16, 2016 2:24 AM, "Hans de Goede" <hdegoede@redhat.com> wrote:
>
>> FILE_MEMORY_GLOBAL is currently only used for buffer handling, as we
>> do not yet have (opencl) global memory support. Global memory support
>> actually requires some different handling during lowering, so rename
>> FILE_MEMORY_GLOBAL to FILE_MEMORY_BUFFER to reflect that the current
>> code is for buffer handling, this will allow the later (re-)addition
>> of FILE_MEMORY_GLOBAL for regular global memory.
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   src/gallium/drivers/nouveau/codegen/nv50_ir.h                |  2 +-
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp   | 10 +++++-----
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp   |  6 +++---
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp    | 10 +++++-----
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp    | 12 ++++++------
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp    |  8 ++++----
>>   .../drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp        | 10 +++++-----
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp     |  8 ++++----
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp        |  2 +-
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp  |  6 +++---
>>   src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp  |  2 +-
>>   11 files changed, 38 insertions(+), 38 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>> index 7b0eb2f..fdc2195 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
>> @@ -332,7 +332,7 @@ enum DataFile
>>      FILE_MEMORY_CONST,
>>      FILE_SHADER_INPUT,
>>      FILE_SHADER_OUTPUT,
>> -   FILE_MEMORY_GLOBAL,
>> +   FILE_MEMORY_BUFFER,
>>      FILE_MEMORY_SHARED,
>>      FILE_MEMORY_LOCAL,
>>      FILE_SYSTEM_VALUE,
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>> index 70f3c3f..02a1101 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
>> @@ -1641,7 +1641,7 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>>      int32_t offset = SDATA(i->src(0)).offset;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_GLOBAL: code[1] = 0xe0000000; code[0] = 0x00000000; break;
>> +   case FILE_MEMORY_BUFFER: code[1] = 0xe0000000; code[0] = 0x00000000; break;
>>      case FILE_MEMORY_LOCAL:  code[1] = 0x7a800000; code[0] = 0x00000002; break;
>>      case FILE_MEMORY_SHARED:
>>         code[0] = 0x00000002;
>> @@ -1678,7 +1678,7 @@ CodeEmitterGK110::emitSTORE(const Instruction *i)
>>
>>      srcId(i->src(1), 2);
>>      srcId(i->src(0).getIndirect(0), 10);
>> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL &&
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER &&
>>          i->src(0).isIndirect(0) &&
>>          i->getIndirect(0, 0)->reg.size == 8)
>>         code[1] |= 1 << 23;
>> @@ -1690,7 +1690,7 @@ CodeEmitterGK110::emitLOAD(const Instruction *i)
>>      int32_t offset = SDATA(i->src(0)).offset;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_GLOBAL: code[1] = 0xc0000000; code[0] = 0x00000000; break;
>> +   case FILE_MEMORY_BUFFER: code[1] = 0xc0000000; code[0] = 0x00000000; break;
>>      case FILE_MEMORY_LOCAL:  code[1] = 0x7a000000; code[0] = 0x00000002; break;
>>      case FILE_MEMORY_SHARED:
>>         code[0] = 0x00000002;
>> @@ -1800,7 +1800,7 @@ CodeEmitterGK110::emitMOV(const Instruction *i)
>>   static inline bool
>>   uses64bitAddress(const Instruction *ldst)
>>   {
>> -   return ldst->src(0).getFile() == FILE_MEMORY_GLOBAL &&
>> +   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>>         ldst->src(0).isIndirect(0) &&
>>         ldst->getIndirect(0, 0)->reg.size == 8;
>>   }
>> @@ -1862,7 +1862,7 @@ CodeEmitterGK110::emitCCTL(const Instruction *i)
>>
>>      code[0] = 0x00000002 | (i->subOp << 2);
>>
>> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>         code[1] = 0x7b000000;
>>      } else {
>>         code[1] = 0x7c000000;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> index e079a57..27f287f 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
>> @@ -2417,7 +2417,7 @@ void
>>   CodeEmitterGM107::emitCCTL()
>>   {
>>      unsigned width;
>> -   if (insn->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>> +   if (insn->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>         emitInsn(0xef600000);
>>         width = 30;
>>      } else {
>> @@ -2988,7 +2988,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>         case FILE_MEMORY_CONST : emitLDC(); break;
>>         case FILE_MEMORY_LOCAL : emitLDL(); break;
>>         case FILE_MEMORY_SHARED: emitLDS(); break;
>> -      case FILE_MEMORY_GLOBAL: emitLD(); break;
>> +      case FILE_MEMORY_BUFFER: emitLD(); break;
>>         default:
>>            assert(!"invalid load");
>>            emitNOP();
>> @@ -2999,7 +2999,7 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>>         switch (insn->src(0).getFile()) {
>>         case FILE_MEMORY_LOCAL : emitSTL(); break;
>>         case FILE_MEMORY_SHARED: emitSTS(); break;
>> -      case FILE_MEMORY_GLOBAL: emitST(); break;
>> +      case FILE_MEMORY_BUFFER: emitST(); break;
>>         default:
>>            assert(!"invalid load");
>>            emitNOP();
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>> index 682a19d..7476e21 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp
>> @@ -662,7 +662,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>         code[0] = 0xd0000001;
>>         code[1] = 0x40000000;
>>         break;
>> -   case FILE_MEMORY_GLOBAL:
>> +   case FILE_MEMORY_BUFFER:
>>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>>         code[1] = 0x80000000;
>>         break;
>> @@ -671,7 +671,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>         break;
>>      }
>>      if (sf == FILE_MEMORY_LOCAL ||
>> -       sf == FILE_MEMORY_GLOBAL)
>> +       sf == FILE_MEMORY_BUFFER)
>>         emitLoadStoreSizeLG(i->sType, 21 + 32);
>>
>>      setDst(i, 0);
>> @@ -679,7 +679,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i)
>>      emitFlagsRd(i);
>>      emitFlagsWr(i);
>>
>> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>         srcId(*i->src(0).getIndirect(0), 9);
>>      } else {
>>         setAReg16(i, 0);
>> @@ -699,7 +699,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>>         code[1] = 0x80c00000;
>>         srcId(i->src(1), 32 + 14);
>>         break;
>> -   case FILE_MEMORY_GLOBAL:
>> +   case FILE_MEMORY_BUFFER:
>>         code[0] = 0xd0000001 | (i->getSrc(0)->reg.fileIndex << 16);
>>         code[1] = 0xa0000000;
>>         emitLoadStoreSizeLG(i->dType, 21 + 32);
>> @@ -737,7 +737,7 @@ CodeEmitterNV50::emitSTORE(const Instruction *i)
>>         break;
>>      }
>>
>> -   if (f == FILE_MEMORY_GLOBAL)
>> +   if (f == FILE_MEMORY_BUFFER)
>>         srcId(*i->src(0).getIndirect(0), 9);
>>      else
>>         setAReg16(i, 0);
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>> index 8b9328b..6236659 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
>> @@ -280,7 +280,7 @@ void
>>   CodeEmitterNVC0::setAddressByFile(const ValueRef& src)
>>   {
>>      switch (src.getFile()) {
>> -   case FILE_MEMORY_GLOBAL:
>> +   case FILE_MEMORY_BUFFER:
>>         srcAddr32(src, 26, 0);
>>         break;
>>      case FILE_MEMORY_LOCAL:
>> @@ -1768,7 +1768,7 @@ CodeEmitterNVC0::emitCachingMode(CacheMode c)
>>   static inline bool
>>   uses64bitAddress(const Instruction *ldst)
>>   {
>> -   return ldst->src(0).getFile() == FILE_MEMORY_GLOBAL &&
>> +   return ldst->src(0).getFile() == FILE_MEMORY_BUFFER &&
>>         ldst->src(0).isIndirect(0) &&
>>         ldst->getIndirect(0, 0)->reg.size == 8;
>>   }
>> @@ -1779,7 +1779,7 @@ CodeEmitterNVC0::emitSTORE(const Instruction *i)
>>      uint32_t opc;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_GLOBAL: opc = 0x90000000; break;
>> +   case FILE_MEMORY_BUFFER: opc = 0x90000000; break;
>>      case FILE_MEMORY_LOCAL:  opc = 0xc8000000; break;
>>      case FILE_MEMORY_SHARED:
>>         if (i->subOp == NV50_IR_SUBOP_STORE_UNLOCKED) {
>> @@ -1828,7 +1828,7 @@ CodeEmitterNVC0::emitLOAD(const Instruction *i)
>>      code[0] = 0x00000005;
>>
>>      switch (i->src(0).getFile()) {
>> -   case FILE_MEMORY_GLOBAL: opc = 0x80000000; break;
>> +   case FILE_MEMORY_BUFFER: opc = 0x80000000; break;
>>      case FILE_MEMORY_LOCAL:  opc = 0xc0000000; break;
>>      case FILE_MEMORY_SHARED:
>>         if (i->subOp == NV50_IR_SUBOP_LOAD_LOCKED) {
>> @@ -2090,7 +2090,7 @@ CodeEmitterNVC0::emitCCTL(const Instruction *i)
>>   {
>>      code[0] = 0x00000005 | (i->subOp << 5);
>>
>> -   if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>> +   if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>         code[1] = 0x98000000;
>>         srcAddr32(i->src(0), 28, 2);
>>      } else {
>> @@ -3121,7 +3121,7 @@ SchedDataCalculator::checkRd(const Value *v, int cycle, int& delay) const
>>      case FILE_MEMORY_LOCAL:
>>      case FILE_MEMORY_CONST:
>>      case FILE_MEMORY_SHARED:
>> -   case FILE_MEMORY_GLOBAL:
>> +   case FILE_MEMORY_BUFFER:
>>      case FILE_SYSTEM_VALUE:
>>         // TODO: any restrictions here ?
>>         break;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> index 1e91ad3..91879e4 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
>> @@ -373,8 +373,8 @@ static nv50_ir::DataFile translateFile(uint file)
>>      case TGSI_FILE_PREDICATE:       return nv50_ir::FILE_PREDICATE;
>>      case TGSI_FILE_IMMEDIATE:       return nv50_ir::FILE_IMMEDIATE;
>>      case TGSI_FILE_SYSTEM_VALUE:    return nv50_ir::FILE_SYSTEM_VALUE;
>> -   case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_GLOBAL;
>> -   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_GLOBAL;
>> +   case TGSI_FILE_BUFFER:          return nv50_ir::FILE_MEMORY_BUFFER;
>> +   case TGSI_FILE_MEMORY:          return nv50_ir::FILE_MEMORY_BUFFER;
>>      case TGSI_FILE_SAMPLER:
>>      case TGSI_FILE_NULL:
>>      default:
>> @@ -2191,7 +2191,7 @@ Converter::getResourceBase(const int r)
>>
>>      switch (r) {
>>      case TGSI_RESOURCE_GLOBAL:
>> -      sym = new_Symbol(prog, nv50_ir::FILE_MEMORY_GLOBAL, 15);
>> +      sym = new_Symbol(prog, nv50_ir::FILE_MEMORY_BUFFER, 15);
>>         break;
>>      case TGSI_RESOURCE_LOCAL:
>>         assert(prog->getType() == Program::TYPE_COMPUTE);
>> @@ -2209,7 +2209,7 @@ Converter::getResourceBase(const int r)
>>         break;
>>      default:
>>         sym = new_Symbol(prog,
>> -                       nv50_ir::FILE_MEMORY_GLOBAL, code->resources.at(r).slot);
>> +                       nv50_ir::FILE_MEMORY_BUFFER, code->resources.at(r).slot);
>>         break;
>>      }
>>      return sym;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>> index d0936d8..563d7c2 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
>> @@ -1141,7 +1141,7 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>>         handleSharedATOM(atom);
>>         return true;
>>      default:
>> -      assert(atom->src(0).getFile() == FILE_MEMORY_GLOBAL);
>> +      assert(atom->src(0).getFile() == FILE_MEMORY_BUFFER);
>>         base = loadResInfo64(ind, atom->getSrc(0)->reg.fileIndex * 16);
>>         assert(base->reg.size == 8);
>>         if (ptr)
>> @@ -1154,7 +1154,7 @@ NVC0LoweringPass::handleATOM(Instruction *atom)
>>         bld.mkOp1v(OP_RDSV, TYPE_U32, bld.getScratch(), bld.mkSysVal(sv, 0));
>>
>>      atom->setSrc(0, cloneShallow(func, atom->getSrc(0)));
>> -   atom->getSrc(0)->reg.file = FILE_MEMORY_GLOBAL;
>> +   atom->getSrc(0)->reg.file = FILE_MEMORY_BUFFER;
>>      if (ptr)
>>         base = bld.mkOp2v(OP_ADD, TYPE_U32, base, base, ptr);
>>      atom->setIndirect(0, 1, NULL);
>> @@ -1571,7 +1571,7 @@ NVC0LoweringPass::handleSurfaceOpNVE4(TexInstruction *su)
>>         Instruction *red = bld.mkOp(OP_ATOM, su->dType, su->getDef(0));
>>         red->subOp = su->subOp;
>>         if (!gMemBase)
>> -         gMemBase = bld.mkSymbol(FILE_MEMORY_GLOBAL, 0, TYPE_U32, 0);
>> +         gMemBase = bld.mkSymbol(FILE_MEMORY_BUFFER, 0, TYPE_U32, 0);
>>         red->setSrc(0, gMemBase);
>>         red->setSrc(1, su->getSrc(3));
>>         if (su->subOp == NV50_IR_SUBOP_ATOM_CAS)
>> @@ -1963,7 +1963,7 @@ NVC0LoweringPass::visit(Instruction *i)
>>         } else if (i->src(0).getFile() == FILE_SHADER_OUTPUT) {
>>            assert(prog->getType() == Program::TYPE_TESSELLATION_CONTROL);
>>            i->op = OP_VFETCH;
>> -      } else if (i->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>> +      } else if (i->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>            Value *ind = i->getIndirect(0, 1);
>>            Value *ptr = loadResInfo64(ind, i->getSrc(0)->reg.fileIndex * 16);
>>            // XXX come up with a way not to do this for EVERY little access but
>> @@ -1987,7 +1987,7 @@ NVC0LoweringPass::visit(Instruction *i)
>>         break;
>>      case OP_ATOM:
>>      {
>> -      const bool cctl = i->src(0).getFile() == FILE_MEMORY_GLOBAL;
>> +      const bool cctl = i->src(0).getFile() == FILE_MEMORY_BUFFER;
>>         handleATOM(i);
>>         handleCasExch(i, cctl);
>>      }
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> index 66e7b2e..4a96d04 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
>> @@ -2580,14 +2580,14 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>                ldst->op == OP_BAR ||
>>                ldst->op == OP_MEMBAR) {
>>               purgeRecords(NULL, FILE_MEMORY_LOCAL);
>> -            purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>> +            purgeRecords(NULL, FILE_MEMORY_BUFFER);
>>               purgeRecords(NULL, FILE_MEMORY_SHARED);
>>               purgeRecords(NULL, FILE_SHADER_OUTPUT);
>>            } else
>>            if (ldst->op == OP_ATOM || ldst->op == OP_CCTL) {
>> -            if (ldst->src(0).getFile() == FILE_MEMORY_GLOBAL) {
>> +            if (ldst->src(0).getFile() == FILE_MEMORY_BUFFER) {
>>                  purgeRecords(NULL, FILE_MEMORY_LOCAL);
>> -               purgeRecords(NULL, FILE_MEMORY_GLOBAL);
>> +               purgeRecords(NULL, FILE_MEMORY_BUFFER);
>>                  purgeRecords(NULL, FILE_MEMORY_SHARED);
>>               } else {
>>                  purgeRecords(NULL, ldst->src(0).getFile());
>> @@ -2607,7 +2607,7 @@ MemoryOpt::runOpt(BasicBlock *bb)
>>            DataFile file = ldst->src(0).getFile();
>>
>>            // if ld l[]/g[] look for previous store to eliminate the reload
>> -         if (file == FILE_MEMORY_GLOBAL || file == FILE_MEMORY_LOCAL) {
>> +         if (file == FILE_MEMORY_BUFFER || file == FILE_MEMORY_LOCAL) {
>>               // TODO: shared memory ?
>>               rec = findRecord(ldst, false, isAdjacent);
>>               if (rec && !isAdjacent)
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>> index cfa85ec..73ed753 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
>> @@ -455,7 +455,7 @@ int Symbol::print(char *buf, size_t size,
>>      case FILE_MEMORY_CONST:  c = 'c'; break;
>>      case FILE_SHADER_INPUT:  c = 'a'; break;
>>      case FILE_SHADER_OUTPUT: c = 'o'; break;
>> -   case FILE_MEMORY_GLOBAL: c = 'g'; break;
>> +   case FILE_MEMORY_BUFFER: c = 'g'; break;
>>      case FILE_MEMORY_SHARED: c = 's'; break;
>>      case FILE_MEMORY_LOCAL:  c = 'l'; break;
>>      default:
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>> index 2c4d7f5..1cd45a2 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
>> @@ -207,7 +207,7 @@ TargetNV50::getFileSize(DataFile file) const
>>      case FILE_MEMORY_CONST:  return 65536;
>>      case FILE_SHADER_INPUT:  return 0x200;
>>      case FILE_SHADER_OUTPUT: return 0x200;
>> -   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>> +   case FILE_MEMORY_BUFFER: return 0xffffffff;
>>      case FILE_MEMORY_SHARED: return 16 << 10;
>>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>>      case FILE_SYSTEM_VALUE:  return 16;
>> @@ -406,7 +406,7 @@ TargetNV50::isAccessSupported(DataFile file, DataType ty) const
>>      if (ty == TYPE_B96 || ty == TYPE_NONE)
>>         return false;
>>      if (typeSizeof(ty) > 4)
>> -      return (file == FILE_MEMORY_LOCAL) || (file == FILE_MEMORY_GLOBAL);
>> +      return (file == FILE_MEMORY_LOCAL) || (file == FILE_MEMORY_BUFFER);
>>      return true;
>>   }
>>
>> @@ -508,7 +508,7 @@ int TargetNV50::getLatency(const Instruction *i) const
>>      if (i->op == OP_LOAD) {
>>         switch (i->src(0).getFile()) {
>>         case FILE_MEMORY_LOCAL:
>> -      case FILE_MEMORY_GLOBAL:
>> +      case FILE_MEMORY_BUFFER:
>>            return 100; // really 400 to 800
>>         default:
>>            return 22;
>> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> index a03afa8..bda59a5 100644
>> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
>> @@ -248,7 +248,7 @@ TargetNVC0::getFileSize(DataFile file) const
>>      case FILE_MEMORY_CONST:  return 65536;
>>      case FILE_SHADER_INPUT:  return 0x400;
>>      case FILE_SHADER_OUTPUT: return 0x400;
>> -   case FILE_MEMORY_GLOBAL: return 0xffffffff;
>> +   case FILE_MEMORY_BUFFER: return 0xffffffff;
>>      case FILE_MEMORY_SHARED: return 16 << 10;
>>      case FILE_MEMORY_LOCAL:  return 48 << 10;
>>      case FILE_SYSTEM_VALUE:  return 32;
>> --
>> 2.7.2
>>
>>
>
On Wed, Mar 16, 2016 at 2:14 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
> On 16-03-16 15:55, Ilia Mirkin wrote:
>>
>> This approach leads to the emitters needing to know about both global and
>> buffer, even though at that point, they are identical. I was thinking that
>> in the lowering logic, buffer would just get rewritten as global (with the
>> offset added), thus not needing any change to the emitters. What do you
>> think about such an approach?
>
>
> I was actually thinking the same, but I was a bit afraid I might break
> something by doing that. I'm willing to try, though I believe the result
> will need some extra testing by others.
>
> Questions:
>
> 1) Are there any tests I can run to exercise the buffer paths?

bin/arb_shader_storage_buffer_object-* (in piglit)

Also a ton of stuff in dEQP, although some of it fails. (--deqp-case='*ssbo*')

>
> 2) So the resulting patch, which would replace this one and make most
>    of the "nouveau: codegen: Add support for OpenCL global memory buffers"
>    patch unnecessary, would look something like this:
> 2a) Add FILE_MEMORY_BUFFER as a new nv50_ir::FILE_* type
> 2b) Use it in nv50_ir_from_tgsi.cpp instead of GLOBAL
> 2c) Use it in nv50_ir_lowering_ to check for buffer accesses,
>     and when adding the offset change the file type to GLOBAL
>
> Right?

Sounds good to me!

  -ilia
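
For illustration, the plan agreed on above (2a to 2c) could be sketched
roughly as follows. This is a toy model under stated assumptions, not Mesa
code: MemRef, fromTgsiBuffer, lowerBufferAccess, emitterAcceptsAsGlobal and
the bufferBase table are hypothetical names invented for this sketch. The
real lowering pass operates on nv50_ir instructions and fetches the buffer
base with loadResInfo64, as seen in the quoted diff.

```cpp
// Toy model only: these types mimic the shape of the idea, not the real
// nv50_ir classes. All names below are hypothetical.
#include <cassert>
#include <cstdint>

enum DataFile {
   FILE_MEMORY_BUFFER,   // 2a) new file type, produced by the TGSI front end
   FILE_MEMORY_GLOBAL,   // raw global memory, the only one emitters handle
   FILE_MEMORY_LOCAL,
};

struct MemRef {
   DataFile file;
   int      fileIndex;   // buffer slot (meaningful only for BUFFER)
   uint64_t address;     // buffer-relative offset before lowering,
                         // absolute address after lowering
};

// 2b) The front end tags buffer accesses as BUFFER instead of GLOBAL.
static MemRef fromTgsiBuffer(int slot, uint64_t offset)
{
   return MemRef{FILE_MEMORY_BUFFER, slot, offset};
}

// 2c) Lowering: only BUFFER accesses get the base address added, and the
// reference is then retagged as GLOBAL. Other files are left untouched.
static void lowerBufferAccess(MemRef &ref, const uint64_t *bufferBase)
{
   if (ref.file != FILE_MEMORY_BUFFER)
      return;
   ref.address += bufferBase[ref.fileIndex];
   ref.file = FILE_MEMORY_GLOBAL;
   ref.fileIndex = 0;
}

// Stand-in for an emitter check: emitters never need to know about BUFFER,
// because lowering has rewritten it by the time they run.
static bool emitterAcceptsAsGlobal(const MemRef &ref)
{
   return ref.file == FILE_MEMORY_GLOBAL;
}
```

With this split, a true global access (for example from OpenCL) would enter
as FILE_MEMORY_GLOBAL, skip the base-address addition in lowering, and reach
the unchanged emitters the same way a lowered buffer access does.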