[Mesa-dev,2/2] i965: Workaround for hardware bug in multi-stream transform feedback.

Submitted by Iago Toral Quiroga on June 30, 2014, 8:36 a.m.

Details

Message ID 1404117365-18900-3-git-send-email-itoral@igalia.com
State Superseded

Commit Message

Iago Toral Quiroga June 30, 2014, 8:36 a.m.
There is a hardware bug that breaks TF recording in multi-stream mode when
emitting vertices to a stream that does not have any varyings to record. If
that happens, and the vertex is the last vertex emitted by the geometry shader,
varying data is not recorded for any stream and primitive queries (transform
feedback written and primitives generated) return 0 for all streams.

This problem could affect a likely scenario where the application uses stream0
only for rendering (no TF) and other streams to capture TF, since in this case
the application would be emitting vertices to stream0 but would not have any
varyings to record for it.

If the vertex is not the last one emitted by the geometry shader, TF will work
fine, but primitive count on that stream will be 0. This is not what we want but
is the expected behavior according to the Ivy Bridge documentation:

    "8.3 Stream Output Function:

    If a stream has no SO_DECL state defined (NumEntries is 0), incoming
    objects targeting that stream are effectively ignored. As there is no
    attempt to perform stream output, overflow detection is neither
    required nor performed."

These issues can be worked around by configuring at least one fake declaration
for such streams and redirecting their output to a disabled transform feedback
buffer (so that no recording actually happens). This workaround is not perfect
though, since we need to have at least one unused (disabled) TF buffer
available.
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 59 ++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
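
For readers who want to try the redirection logic outside the driver, the
workaround described in the commit message can be sketched as standalone C.
This is only an illustration under stated assumptions, not the driver code:
MAX_TF_BUFFERS, find_disabled_buffer() and apply_fake_decls() are hypothetical
names, the value 4 for MAX_VERTEX_STREAMS is assumed here, and a buffer is
treated as disabled when its stride is 0, as the patch does. Unlike the patch,
the sketch looks up the disabled buffer eagerly rather than on first use.

```c
#define MAX_VERTEX_STREAMS 4   /* assumed value for this sketch */
#define MAX_TF_BUFFERS 4       /* hypothetical name; the patch hard-codes 4 */

/* Return the index of the first TF buffer whose stride is 0 (treated as
 * disabled), or -1 if every buffer is in use.
 */
static int
find_disabled_buffer(const unsigned stride[MAX_TF_BUFFERS])
{
   for (int j = 0; j < MAX_TF_BUFFERS; j++) {
      if (stride[j] == 0)
         return j;
   }
   return -1;
}

/* Give every stream that has no declarations one fake declaration and route
 * its output to a disabled buffer. Returns the number of streams patched;
 * 0 means either no stream needed the workaround or no disabled buffer was
 * available.
 */
static int
apply_fake_decls(int decls[MAX_VERTEX_STREAMS],
                 unsigned buffer_mask[MAX_VERTEX_STREAMS],
                 const unsigned stride[MAX_TF_BUFFERS])
{
   int disabled_buffer = find_disabled_buffer(stride);
   int patched = 0;

   if (disabled_buffer < 0)
      return 0;

   for (int i = 0; i < MAX_VERTEX_STREAMS; i++) {
      if (decls[i] > 0)
         continue;   /* stream already records varyings */
      decls[i] = 1;
      buffer_mask[i] = 1u << disabled_buffer;
      patched++;
   }
   return patched;
}
```

With decls = {0, 2, 0, 1} and strides = {16, 8, 0, 0}, streams 0 and 2 each
get one fake declaration and a buffer mask selecting buffer 2. When all four
strides are non-zero, nothing is changed, which is the limitation the commit
message mentions.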

Patch

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 1aae659..50169a9 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -189,6 +189,65 @@  gen7_upload_3dstate_so_decl_list(struct brw_context *brw,
          max_decls = decls[stream_id];
    }
 
+   /* There is a hardware bug that breaks TF recording in multi-stream mode
+    * when emitting vertices to a stream that does not have any varyings to
+    * record. If that happens, and the vertex is the last vertex emitted by the
+    * geometry shader, varying data is not recorded for any stream and
+    * primitive queries (transform feedback written and primitives generated)
+    * return 0 for all streams.
+    *
+    * This problem could affect a likely scenario where the application uses
+    * stream0 only for rendering (no TF) and other streams to capture TF, since
+    * in this case the application would be emitting vertices to stream0 but
+    * would not have any varyings to record for it.
+    *
+    * If the vertex is not the last one emitted by the geometry shader, TF will
+    * work fine, but primitive count on that stream will be 0. This is not what
+    * we want but is the expected behavior according to the Ivy Bridge
+    * documentation:
+    *
+    *    "8.3 Stream Output Function:
+    *
+    *    If a stream has no SO_DECL state defined (NumEntries is 0), incoming
+    *    objects targeting that stream are effectively ignored. As there is no
+    *    attempt to perform stream output, overflow detection is neither
+    *    required nor performed."
+    *
+    * These issues can be worked around by configuring at least one fake
+    * declaration for such streams and redirecting their output to a disabled
+    * transform feedback buffer (so that no recording actually happens).
+    * This workaround is not perfect though, since we need to have at least one
+    * unused (disabled) TF buffer available.
+    */
+   if (ctx->GeometryProgram._Current &&
+       ctx->GeometryProgram._Current->UsesStreams) {
+      int disabled_buffer = -1;
+      bool tf_buffers_checked = false;
+      for (int i = 0; i < MAX_VERTEX_STREAMS; i++) {
+         if (decls[i] > 0)
+            continue;
+         /* This stream has no varyings to record: see if we have at least one
+          * disabled TF buffer available
+          */
+         if (!tf_buffers_checked) {
+            tf_buffers_checked = true;
+            for (int j = 0; j < 4; j++) {
+               if (linked_xfb_info->BufferStride[j] == 0) {
+                  disabled_buffer = j;
+                  break;
+               }
+            }
+         }
+         if (disabled_buffer < 0)
+            break;
+         /* We have at least one disabled TF buffer, so redirect output from
+          * this stream to it and add a fake declaration
+          */
+         decls[i] = 1;
+         buffer_mask[i] = 1 << disabled_buffer;
+      }
+   }
+
    BEGIN_BATCH(max_decls * 2 + 3);
    OUT_BATCH(_3DSTATE_SO_DECL_LIST << 16 | (max_decls * 2 + 1));