[v4,12/18] i965/blorp: Update the fast clear color address.

Submitted by Rafael Antognolli on March 8, 2018, 4:49 p.m.

Details

Message ID 20180308164911.15375-13-rafael.antognolli@intel.com
State New
Headers show
Series "Use clear color address in surface state." ( rev: 3 ) in Mesa

Not browsing as part of any series.

Commit Message

Rafael Antognolli March 8, 2018, 4:49 p.m.
On Gen10, whenever we do a fast clear, blorp will update the clear color
state buffer for us, as long as we set the clear color address
correctly.

However, on a hiz clear, if the surface is already on the fast clear
state we skip the actual fast clear operation and, before gen10, only
updated the miptree. On gen10+ we need to update the clear value state
buffer too, since blorp will not be doing a fast clear and updating it
for us.

v4:
 - do not use clear_value_size in the for loop
 - Get the address of the clear color from the aux buffer or the
 clear_color_bo, depending on which one is available.
 - let core blorp update the clear color, but also update it when we
 skip a fast clear depth.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 11 +++++++++++
 src/mesa/drivers/dri/i965/brw_clear.c | 22 ++++++++++++++++++++++
 2 files changed, 33 insertions(+)

Patch hide | download patch | download mbox

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index ffd957fb866..914aeeace7a 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -185,6 +185,17 @@  blorp_surf_for_miptree(struct brw_context *brw,
 
       surf->aux_addr.buffer = aux_buf->bo;
       surf->aux_addr.offset = aux_buf->offset;
+
+      if (devinfo->gen >= 10) {
+         /* If we have a CCS surface and clear_color_bo set, use that bo as
+          * storage for the indirect clear color. Otherwise, use the extra
+          * space at the end of the aux_buffer.
+          */
+         surf->clear_color_addr = (struct blorp_address) {
+            .buffer = aux_buf->clear_color_bo,
+            .offset = aux_buf->clear_color_offset,
+         };
+      }
    } else {
       surf->aux_addr = (struct blorp_address) {
          .buffer = NULL,
diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
index 8aa83722ee9..63c0b241898 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -108,6 +108,7 @@  brw_fast_clear_depth(struct gl_context *ctx)
    struct intel_mipmap_tree *mt = depth_irb->mt;
    struct gl_renderbuffer_attachment *depth_att = &fb->Attachment[BUFFER_DEPTH];
    const struct gen_device_info *devinfo = &brw->screen->devinfo;
+   bool same_clear_value = true;
 
    if (devinfo->gen < 6)
       return false;
@@ -213,6 +214,7 @@  brw_fast_clear_depth(struct gl_context *ctx)
       }
 
       intel_miptree_set_depth_clear_value(ctx, mt, clear_value);
+      same_clear_value = false;
    }
 
    bool need_clear = false;
@@ -232,6 +234,26 @@  brw_fast_clear_depth(struct gl_context *ctx)
        * state then simply updating the miptree fast clear value is sufficient
        * to change their clear value.
        */
+      if (devinfo->gen >= 10 && !same_clear_value) {
+         /* Before gen10, it was enough to just update the clear value in the
+          * miptree. But on gen10+, we let blorp update the clear value state
+          * buffer when doing a fast clear. Since we are skipping the fast
+          * clear here, we need to update the clear color ourselves.
+          */
+         uint32_t clear_offset = mt->hiz_buf->clear_color_offset;
+         union isl_color_value clear_color = { .f32 = { clear_value, } };
+
+         /* We can't update the clear color while the hardware is still using
+          * the previous one for a resolve or sampling from it. So make sure
+          * that there's no pending commands at this point.
+          */
+         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
+         for (int i = 0; i < 4; i++) {
+            brw_store_data_imm32(brw, mt->hiz_buf->clear_color_bo,
+                                 clear_offset + i * 4, clear_color.u32[i]);
+         }
+         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
+      }
       return true;
    }
 

Comments

For the subject, instead of 'address', what about something like:

i965/blorp: Update the fast clear value buffer

On 2018-03-08 08:49:05, Rafael Antognolli wrote:
> On Gen10, whenever we do a fast clear, blorp will update the clear color
> state buffer for us, as long as we set the clear color address
> correctly.
> 
> However, on a hiz clear, if the surface is already on the fast clear
> state we skip the actual fast clear operation and, before gen10, only
> updated the miptree. On gen10+ we need to update the clear value state
> buffer too, since blorp will not be doing a fast clear and updating it
> for us.
> 
> v4:
>  - do not use clear_value_size in the for loop
>  - Get the address of the clear color from the aux buffer or the
>  clear_color_bo, depending on which one is available.
>  - let core blorp update the clear color, but also update it when we
>  skip a fast clear depth.
> 
> Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 11 +++++++++++
>  src/mesa/drivers/dri/i965/brw_clear.c | 22 ++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index ffd957fb866..914aeeace7a 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -185,6 +185,17 @@ blorp_surf_for_miptree(struct brw_context *brw,
>  
>        surf->aux_addr.buffer = aux_buf->bo;
>        surf->aux_addr.offset = aux_buf->offset;
> +
> +      if (devinfo->gen >= 10) {
> +         /* If we have a CCS surface and clear_color_bo set, use that bo as
> +          * storage for the indirect clear color. Otherwise, use the extra
> +          * space at the end of the aux_buffer.
> +          */
> +         surf->clear_color_addr = (struct blorp_address) {
> +            .buffer = aux_buf->clear_color_bo,
> +            .offset = aux_buf->clear_color_offset,
> +         };
> +      }
>     } else {
>        surf->aux_addr = (struct blorp_address) {
>           .buffer = NULL,
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
> index 8aa83722ee9..63c0b241898 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -108,6 +108,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
>     struct intel_mipmap_tree *mt = depth_irb->mt;
>     struct gl_renderbuffer_attachment *depth_att = &fb->Attachment[BUFFER_DEPTH];
>     const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   bool same_clear_value = true;
>  
>     if (devinfo->gen < 6)
>        return false;
> @@ -213,6 +214,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
>        }
>  
>        intel_miptree_set_depth_clear_value(ctx, mt, clear_value);
> +      same_clear_value = false;
>     }
>  
>     bool need_clear = false;
> @@ -232,6 +234,26 @@ brw_fast_clear_depth(struct gl_context *ctx)
>         * state then simply updating the miptree fast clear value is sufficient
>         * to change their clear value.
>         */
> +      if (devinfo->gen >= 10 && !same_clear_value) {
> +         /* Before gen10, it was enough to just update the clear value in the
> +          * miptree. But on gen10+, we let blorp update the clear value state
> +          * buffer when doing a fast clear. Since we are skipping the fast
> +          * clear here, we need to update the clear color ourselves.
> +          */
> +         uint32_t clear_offset = mt->hiz_buf->clear_color_offset;
> +         union isl_color_value clear_color = { .f32 = { clear_value, } };
> +
> +         /* We can't update the clear color while the hardware is still using
> +          * the previous one for a resolve or sampling from it. So make sure
> +          * that there's no pending commands at this point.
> +          */
> +         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);

I think we talked about potential performance concerns over the stall,
but we decided it was probably unlikely that an application would
clear the buffer multiple times with different values.

I just wanted to mention it in case anyone else has other opinions on
it.

11 - 12 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

> +         for (int i = 0; i < 4; i++) {
> +            brw_store_data_imm32(brw, mt->hiz_buf->clear_color_bo,
> +                                 clear_offset + i * 4, clear_color.u32[i]);
> +         }
> +         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> +      }
>        return true;
>     }
>  
> -- 
> 2.14.3
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
On Thu, Mar 8, 2018 at 8:49 AM, Rafael Antognolli <
rafael.antognolli@intel.com> wrote:

> On Gen10, whenever we do a fast clear, blorp will update the clear color
> state buffer for us, as long as we set the clear color address
> correctly.
>
> However, on a hiz clear, if the surface is already on the fast clear
> state we skip the actual fast clear operation and, before gen10, only
> updated the miptree. On gen10+ we need to update the clear value state
> buffer too, since blorp will not be doing a fast clear and updating it
> for us.
>
> v4:
>  - do not use clear_value_size in the for loop
>  - Get the address of the clear color from the aux buffer or the
>  clear_color_bo, depending on which one is available.
>  - let core blorp update the clear color, but also update it when we
>  skip a fast clear depth.
>
> Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 11 +++++++++++
>  src/mesa/drivers/dri/i965/brw_clear.c | 22 ++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
> b/src/mesa/drivers/dri/i965/brw_blorp.c
> index ffd957fb866..914aeeace7a 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -185,6 +185,17 @@ blorp_surf_for_miptree(struct brw_context *brw,
>
>        surf->aux_addr.buffer = aux_buf->bo;
>        surf->aux_addr.offset = aux_buf->offset;
> +
> +      if (devinfo->gen >= 10) {
> +         /* If we have a CCS surface and clear_color_bo set, use that bo
> as
> +          * storage for the indirect clear color. Otherwise, use the extra
> +          * space at the end of the aux_buffer.
> +          */
> +         surf->clear_color_addr = (struct blorp_address) {
> +            .buffer = aux_buf->clear_color_bo,
> +            .offset = aux_buf->clear_color_offset,
> +         };
> +      }
>     } else {
>        surf->aux_addr = (struct blorp_address) {
>           .buffer = NULL,
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c
> b/src/mesa/drivers/dri/i965/brw_clear.c
> index 8aa83722ee9..63c0b241898 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -108,6 +108,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
>     struct intel_mipmap_tree *mt = depth_irb->mt;
>     struct gl_renderbuffer_attachment *depth_att =
> &fb->Attachment[BUFFER_DEPTH];
>     const struct gen_device_info *devinfo = &brw->screen->devinfo;
> +   bool same_clear_value = true;
>
>     if (devinfo->gen < 6)
>        return false;
> @@ -213,6 +214,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
>        }
>
>        intel_miptree_set_depth_clear_value(ctx, mt, clear_value);
> +      same_clear_value = false;
>     }
>
>     bool need_clear = false;
> @@ -232,6 +234,26 @@ brw_fast_clear_depth(struct gl_context *ctx)
>         * state then simply updating the miptree fast clear value is
> sufficient
>         * to change their clear value.
>         */
> +      if (devinfo->gen >= 10 && !same_clear_value) {
> +         /* Before gen10, it was enough to just update the clear value in
> the
> +          * miptree. But on gen10+, we let blorp update the clear value
> state
> +          * buffer when doing a fast clear. Since we are skipping the fast
> +          * clear here, we need to update the clear color ourselves.
> +          */
> +         uint32_t clear_offset = mt->hiz_buf->clear_color_offset;
> +         union isl_color_value clear_color = { .f32 = { clear_value, } };
> +
> +         /* We can't update the clear color while the hardware is still
> using
> +          * the previous one for a resolve or sampling from it. So make
> sure
> +          * that there's no pending commands at this point.
> +          */
> +         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
>

This is fun... First off, this can only happen in the case where you have
two clears with no rendering in between so we shouldn't have any pending
rendering or resolves.  However, we may have pending texturing.  In that
case, I think we need a RT cache flush and CS stall before and state and
sampler cache invalidates afterwards.  We also need a test. :-)

This is enough of a crazy edge case, that I'm happy to let the patches
progress before the test has been written but we do need to write the test.


> +         for (int i = 0; i < 4; i++) {
> +            brw_store_data_imm32(brw, mt->hiz_buf->clear_color_bo,
> +                                 clear_offset + i * 4,
> clear_color.u32[i]);
> +         }
> +         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_
> INVALIDATE);
> +      }
>        return true;
>     }
>
> --
> 2.14.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
On Tue, Mar 27, 2018 at 11:16:37AM -0700, Jason Ekstrand wrote:
> On Thu, Mar 8, 2018 at 8:49 AM, Rafael Antognolli <rafael.antognolli@intel.com>
> wrote:
> 
>     On Gen10, whenever we do a fast clear, blorp will update the clear color
>     state buffer for us, as long as we set the clear color address
>     correctly.
> 
>     However, on a hiz clear, if the surface is already on the fast clear
>     state we skip the actual fast clear operation and, before gen10, only
>     updated the miptree. On gen10+ we need to update the clear value state
>     buffer too, since blorp will not be doing a fast clear and updating it
>     for us.
> 
>     v4:
>      - do not use clear_value_size in the for loop
>      - Get the address of the clear color from the aux buffer or the
>      clear_color_bo, depending on which one is available.
>      - let core blorp update the clear color, but also update it when we
>      skip a fast clear depth.
> 
>     Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
>     ---
>      src/mesa/drivers/dri/i965/brw_blorp.c | 11 +++++++++++
>      src/mesa/drivers/dri/i965/brw_clear.c | 22 ++++++++++++++++++++++
>      2 files changed, 33 insertions(+)
> 
>     diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/
>     i965/brw_blorp.c
>     index ffd957fb866..914aeeace7a 100644
>     --- a/src/mesa/drivers/dri/i965/brw_blorp.c
>     +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
>     @@ -185,6 +185,17 @@ blorp_surf_for_miptree(struct brw_context *brw,
> 
>            surf->aux_addr.buffer = aux_buf->bo;
>            surf->aux_addr.offset = aux_buf->offset;
>     +
>     +      if (devinfo->gen >= 10) {
>     +         /* If we have a CCS surface and clear_color_bo set, use that bo
>     as
>     +          * storage for the indirect clear color. Otherwise, use the extra
>     +          * space at the end of the aux_buffer.
>     +          */
>     +         surf->clear_color_addr = (struct blorp_address) {
>     +            .buffer = aux_buf->clear_color_bo,
>     +            .offset = aux_buf->clear_color_offset,
>     +         };
>     +      }
>         } else {
>            surf->aux_addr = (struct blorp_address) {
>               .buffer = NULL,
>     diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/
>     i965/brw_clear.c
>     index 8aa83722ee9..63c0b241898 100644
>     --- a/src/mesa/drivers/dri/i965/brw_clear.c
>     +++ b/src/mesa/drivers/dri/i965/brw_clear.c
>     @@ -108,6 +108,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
>         struct intel_mipmap_tree *mt = depth_irb->mt;
>         struct gl_renderbuffer_attachment *depth_att = &fb->Attachment
>     [BUFFER_DEPTH];
>         const struct gen_device_info *devinfo = &brw->screen->devinfo;
>     +   bool same_clear_value = true;
> 
>         if (devinfo->gen < 6)
>            return false;
>     @@ -213,6 +214,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
>            }
> 
>            intel_miptree_set_depth_clear_value(ctx, mt, clear_value);
>     +      same_clear_value = false;
>         }
> 
>         bool need_clear = false;
>     @@ -232,6 +234,26 @@ brw_fast_clear_depth(struct gl_context *ctx)
>             * state then simply updating the miptree fast clear value is
>     sufficient
>             * to change their clear value.
>             */
>     +      if (devinfo->gen >= 10 && !same_clear_value) {
>     +         /* Before gen10, it was enough to just update the clear value in
>     the
>     +          * miptree. But on gen10+, we let blorp update the clear value
>     state
>     +          * buffer when doing a fast clear. Since we are skipping the fast
>     +          * clear here, we need to update the clear color ourselves.
>     +          */
>     +         uint32_t clear_offset = mt->hiz_buf->clear_color_offset;
>     +         union isl_color_value clear_color = { .f32 = { clear_value, } };
>     +
>     +         /* We can't update the clear color while the hardware is still
>     using
>     +          * the previous one for a resolve or sampling from it. So make
>     sure
>     +          * that there's no pending commands at this point.
>     +          */
>     +         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_CS_STALL);
> 
> 
> This is fun... First off, this can only happen in the case where you have two
> clears with no rendering in between so we shouldn't have any pending rendering
> or resolves.  However, we may have pending texturing.  In that case, I think we
> need a RT cache flush and CS stall before and state and sampler cache
> invalidates afterwards.  We also need a test. :-)
> 
> This is enough of a crazy edge case, that I'm happy to let the patches progress
> before the test has been written but we do need to write the test.

I think I found a test that failed without those flushes, I can double
check it and let you know... would that be enough of a test case?

>     +         for (int i = 0; i < 4; i++) {
>     +            brw_store_data_imm32(brw, mt->hiz_buf->clear_color_bo,
>     +                                 clear_offset + i * 4, clear_color.u32
>     [i]);
>     +         }
>     +         brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_
>     INVALIDATE);
>     +      }
>            return true;
>         }
>    
>     --
>     2.14.3
> 
>     _______________________________________________
>     mesa-dev mailing list
>     mesa-dev@lists.freedesktop.org
>     https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
>