[3/8] anv: Make the clear state buffer 64 bytes aligned.

Submitted by Rafael Antognolli on Dec. 15, 2017, 10:53 p.m.

Details

Message ID 20171215225335.28009-3-rafael.antognolli@intel.com
State New
Headers show
Series "Series without cover letter" ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Rafael Antognolli Dec. 15, 2017, 10:53 p.m.
On Gen10+, if we use the clear state address field in the surface state
instead of the clear color directly, there's a restriction that the
address must point to the lower part of a 64 byte cache-line.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
---
 src/intel/vulkan/anv_private.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index b7bde4b8ce6..43cbf065724 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -2490,7 +2490,17 @@  anv_fast_clear_state_entry_size(const struct anv_device *device)
     * GPU memcpy operations.
     */
    assert(device->isl_dev.ss.clear_value_size % 4 == 0);
-   return device->isl_dev.ss.clear_value_size + 4;
+
+   const unsigned entry_size = device->isl_dev.ss.clear_value_size + 4;
+   /* On Gen10+, we use the clear color address of the surface to point to this
+    * buffer directly. However, according to the bspec:
+    *
+    *    The memory layout of the clear color pointed to by this address is a
+    *    value stored in the lower-order bytes of a 64-byte cache-line.
+    *
+    * So add some padding here for Gen10+.
+    */
+   return device->info.gen >= 10 ? ALIGN(entry_size, 64) : entry_size;
 }
 
 static inline struct anv_address

Comments

On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
> On Gen10+, if we use the clear state address field in the surface state
> instead of the clear color directly, there's a restriction that the
> address must point to the lower part of a 64 byte cache-line.
> 
> Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> ---
>  src/intel/vulkan/anv_private.h | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index b7bde4b8ce6..43cbf065724 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct anv_device *device)
>      * GPU memcpy operations.
>      */
>     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> -   return device->isl_dev.ss.clear_value_size + 4;
> +
> +   const unsigned entry_size = device->isl_dev.ss.clear_value_size + 4;
> +   /* On Gen10+, we use the clear color address of the surface to point to this
> +    * buffer directly. However, according to the bspec:
> +    *
> +    *    The memory layout of the clear color pointed to by this address is a
> +    *    value stored in the lower-order bytes of a 64-byte cache-line.
> +    *
> +    * So add some padding here for Gen10+.
> +    */

I don't see any indication that the upper bytes may be modified by the
hardware. For that reason, I think we can assume that the image that
precedes this entry is at least 64 bytes and avoid padding the entry.

> +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) : entry_size;


>  }
>  
>  static inline struct anv_address
> -- 
> 2.14.3
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery <nanleychery@gmail.com> wrote:

> On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
> > On Gen10+, if we use the clear state address field in the surface state
> > instead of the clear color directly, there's a restriction that the
> > address must point to the lower part of a 64 byte cache-line.
> >
> > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> > ---
> >  src/intel/vulkan/anv_private.h | 12 +++++++++++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> private.h
> > index b7bde4b8ce6..43cbf065724 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct
> anv_device *device)
> >      * GPU memcpy operations.
> >      */
> >     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > -   return device->isl_dev.ss.clear_value_size + 4;
> > +
> > +   const unsigned entry_size = device->isl_dev.ss.clear_value_size + 4;
> > +   /* On Gen10+, we use the clear color address of the surface to point
> to this
> > +    * buffer directly. However, according to the bspec:
> > +    *
> > +    *    The memory layout of the clear color pointed to by this
> address is a
> > +    *    value stored in the lower-order bytes of a 64-byte cache-line.
> > +    *
> > +    * So add some padding here for Gen10+.
> > +    */
>
> I don't see any indication that the upper bytes may be modified by the
> hardware. For that reason, I think we can assume that the image that
> precedes this entry is at least 64 bytes and avoid padding the entry.


I'm not sure what you mean by this.


> > +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) : entry_size;
>
>
> >  }
> >
> >  static inline struct anv_address
> > --
> > 2.14.3
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote:
> On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery <nanleychery@gmail.com> wrote:
> 
>     On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
>     > On Gen10+, if we use the clear state address field in the surface state
>     > instead of the clear color directly, there's a restriction that the
>     > address must point to the lower part of a 64 byte cache-line.
>     >
>     > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
>     > ---
>     >  src/intel/vulkan/anv_private.h | 12 +++++++++++-
>     >  1 file changed, 11 insertions(+), 1 deletion(-)
>     >
>     > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
>     private.h
>     > index b7bde4b8ce6..43cbf065724 100644
>     > --- a/src/intel/vulkan/anv_private.h
>     > +++ b/src/intel/vulkan/anv_private.h
>     > @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct
>     anv_device *device)
>     >      * GPU memcpy operations.
>     >      */
>     >     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
>     > -   return device->isl_dev.ss.clear_value_size + 4;
>     > +
>     > +   const unsigned entry_size = device->isl_dev.ss.clear_value_size + 4;
>     > +   /* On Gen10+, we use the clear color address of the surface to point
>     to this
>     > +    * buffer directly. However, according to the bspec:
>     > +    *
>     > +    *    The memory layout of the clear color pointed to by this address
>     is a
>     > +    *    value stored in the lower-order bytes of a 64-byte cache-line.
>     > +    *
>     > +    * So add some padding here for Gen10+.
>     > +    */
> 
>     I don't see any indication that the upper bytes may be modified by the
>     hardware. For that reason, I think we can assume that the image that
>     precedes this entry is at least 64 bytes and avoid padding the entry.
> 
> 
> I'm not sure what you mean by this.

Hmm... maybe my comment is confusing, but the idea is to add padding to
the entry, so if we have multiple entries (one per level), all of them
are aligned. I can try to find a better way to guarantee that.

Without this code, I was hitting the assert in patch 04 on some tests.

>     > +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) : entry_size;
> 
> 
>     >  }
>     >
>     >  static inline struct anv_address
>     > --
>     > 2.14.3
>     >
>     > _______________________________________________
>     > mesa-dev mailing list
>     > mesa-dev@lists.freedesktop.org
>     > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>     _______________________________________________
>     mesa-dev mailing list
>     mesa-dev@lists.freedesktop.org
>     https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
>
On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote:
> On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery <nanleychery@gmail.com> wrote:
> 
> > On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
> > > On Gen10+, if we use the clear state address field in the surface state
> > > instead of the clear color directly, there's a restriction that the
> > > address must point to the lower part of a 64 byte cache-line.
> > >
> > > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> > > ---
> > >  src/intel/vulkan/anv_private.h | 12 +++++++++++-
> > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > private.h
> > > index b7bde4b8ce6..43cbf065724 100644
> > > --- a/src/intel/vulkan/anv_private.h
> > > +++ b/src/intel/vulkan/anv_private.h
> > > @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct
> > anv_device *device)
> > >      * GPU memcpy operations.
> > >      */
> > >     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > > -   return device->isl_dev.ss.clear_value_size + 4;
> > > +
> > > +   const unsigned entry_size = device->isl_dev.ss.clear_value_size + 4;
> > > +   /* On Gen10+, we use the clear color address of the surface to point
> > to this
> > > +    * buffer directly. However, according to the bspec:
> > > +    *
> > > +    *    The memory layout of the clear color pointed to by this
> > address is a
> > > +    *    value stored in the lower-order bytes of a 64-byte cache-line.
> > > +    *
> > > +    * So add some padding here for Gen10+.
> > > +    */
> >
> > I don't see any indication that the upper bytes may be modified by the
> > hardware. For that reason, I think we can assume that the image that
> > precedes this entry is at least 64 bytes and avoid padding the entry.
> 
> 
> I'm not sure what you mean by this.
> 
> 

My bad, I meant to say: 
   I don't see any indication that the upper bytes of the cacheline may
   be modified by the hardware. I think we can also assume that the
   image that precedes this entry is at least 64 bytes. For these
   reasons we should be able to avoid padding the entry.

> > > +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) : entry_size;
> >
> >
> > >  }
> > >
> > >  static inline struct anv_address
> > > --
> > > 2.14.3
> > >
> > > _______________________________________________
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
On Tue, Jan 9, 2018 at 10:33 AM, Nanley Chery <nanleychery@gmail.com> wrote:

> On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote:
> > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery <nanleychery@gmail.com>
> wrote:
> >
> > > On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
> > > > On Gen10+, if we use the clear state address field in the surface
> state
> > > > instead of the clear color directly, there's a restriction that the
> > > > address must point to the lower part of a 64 byte cache-line.
> > > >
> > > > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> > > > ---
> > > >  src/intel/vulkan/anv_private.h | 12 +++++++++++-
> > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > > private.h
> > > > index b7bde4b8ce6..43cbf065724 100644
> > > > --- a/src/intel/vulkan/anv_private.h
> > > > +++ b/src/intel/vulkan/anv_private.h
> > > > @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct
> > > anv_device *device)
> > > >      * GPU memcpy operations.
> > > >      */
> > > >     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > > > -   return device->isl_dev.ss.clear_value_size + 4;
> > > > +
> > > > +   const unsigned entry_size = device->isl_dev.ss.clear_value_size
> + 4;
> > > > +   /* On Gen10+, we use the clear color address of the surface to
> point
> > > to this
> > > > +    * buffer directly. However, according to the bspec:
> > > > +    *
> > > > +    *    The memory layout of the clear color pointed to by this
> > > address is a
> > > > +    *    value stored in the lower-order bytes of a 64-byte
> cache-line.
> > > > +    *
> > > > +    * So add some padding here for Gen10+.
> > > > +    */
> > >
> > > I don't see any indication that the upper bytes may be modified by the
> > > hardware. For that reason, I think we can assume that the image that
> > > precedes this entry is at least 64 bytes and avoid padding the entry.
> >
> >
> > I'm not sure what you mean by this.
> >
> >
>
> My bad, I meant to say:
>    I don't see any indication that the upper bytes of the cacheline may
>    be modified by the hardware. I think we can also assume that the
>    image that precedes this entry is at least 64 bytes. For these
>    reasons we should be able to avoid padding the entry.
>

Yes, that means that we are free to put other stuff (such as resolve
tracking information) after the clear color.  However, for images with
multiple LODs, we need to have the individual per-LOD entries aligned.

--Jason


> > > > +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) :
> entry_size;
> > >
> > >
> > > >  }
> > > >
> > > >  static inline struct anv_address
> > > > --
> > > > 2.14.3
> > > >
> > > > _______________________________________________
> > > > mesa-dev mailing list
> > > > mesa-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > _______________________________________________
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > >
>
On Mon, Jan 08, 2018 at 04:33:25PM -0800, Rafael Antognolli wrote:
> On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote:
> > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery <nanleychery@gmail.com> wrote:
> > 
> >     On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
> >     > On Gen10+, if we use the clear state address field in the surface state
> >     > instead of the clear color directly, there's a restriction that the
> >     > address must point to the lower part of a 64 byte cache-line.
> >     >
> >     > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> >     > ---
> >     >  src/intel/vulkan/anv_private.h | 12 +++++++++++-
> >     >  1 file changed, 11 insertions(+), 1 deletion(-)
> >     >
> >     > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> >     private.h
> >     > index b7bde4b8ce6..43cbf065724 100644
> >     > --- a/src/intel/vulkan/anv_private.h
> >     > +++ b/src/intel/vulkan/anv_private.h
> >     > @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct
> >     anv_device *device)
> >     >      * GPU memcpy operations.
> >     >      */
> >     >     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> >     > -   return device->isl_dev.ss.clear_value_size + 4;
> >     > +
> >     > +   const unsigned entry_size = device->isl_dev.ss.clear_value_size + 4;
> >     > +   /* On Gen10+, we use the clear color address of the surface to point
> >     to this
> >     > +    * buffer directly. However, according to the bspec:
> >     > +    *
> >     > +    *    The memory layout of the clear color pointed to by this address
> >     is a
> >     > +    *    value stored in the lower-order bytes of a 64-byte cache-line.
> >     > +    *
> >     > +    * So add some padding here for Gen10+.
> >     > +    */
> > 
> >     I don't see any indication that the upper bytes may be modified by the
> >     hardware. For that reason, I think we can assume that the image that
> >     precedes this entry is at least 64 bytes and avoid padding the entry.
> > 
> > 
> > I'm not sure what you mean by this.
> 
> Hmm... maybe my comment is confusing, but the idea is to add padding to
> the entry, so if we have multiple entries (one per level), all of them
> are aligned. I can try to find a better way to guarantee that.
> 
> Without this code, I was hitting the assert in patch 04 on some tests.
> 

Oh, okay. That makes sense. This solution looks good to me. Perhaps we
want the title to mention that we specifically want the buffer entries
to be 64-byte aligned.

> >     > +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) : entry_size;
> > 
> > 
> >     >  }
> >     >
> >     >  static inline struct anv_address
> >     > --
> >     > 2.14.3
> >     >
> >     > _______________________________________________
> >     > mesa-dev mailing list
> >     > mesa-dev@lists.freedesktop.org
> >     > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >     _______________________________________________
> >     mesa-dev mailing list
> >     mesa-dev@lists.freedesktop.org
> >     https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> >
On Tue, Jan 09, 2018 at 11:26:26AM -0800, Jason Ekstrand wrote:
> On Tue, Jan 9, 2018 at 10:33 AM, Nanley Chery <nanleychery@gmail.com> wrote:
> 
> > On Mon, Jan 08, 2018 at 04:03:47PM -0800, Jason Ekstrand wrote:
> > > On Mon, Jan 8, 2018 at 3:00 PM, Nanley Chery <nanleychery@gmail.com>
> > wrote:
> > >
> > > > On Fri, Dec 15, 2017 at 02:53:30PM -0800, Rafael Antognolli wrote:
> > > > > On Gen10+, if we use the clear state address field in the surface
> > state
> > > > > instead of the clear color directly, there's a restriction that the
> > > > > address must point to the lower part of a 64 byte cache-line.
> > > > >
> > > > > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> > > > > ---
> > > > >  src/intel/vulkan/anv_private.h | 12 +++++++++++-
> > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> > > > private.h
> > > > > index b7bde4b8ce6..43cbf065724 100644
> > > > > --- a/src/intel/vulkan/anv_private.h
> > > > > +++ b/src/intel/vulkan/anv_private.h
> > > > > @@ -2490,7 +2490,17 @@ anv_fast_clear_state_entry_size(const struct
> > > > anv_device *device)
> > > > >      * GPU memcpy operations.
> > > > >      */
> > > > >     assert(device->isl_dev.ss.clear_value_size % 4 == 0);
> > > > > -   return device->isl_dev.ss.clear_value_size + 4;
> > > > > +
> > > > > +   const unsigned entry_size = device->isl_dev.ss.clear_value_size
> > + 4;
> > > > > +   /* On Gen10+, we use the clear color address of the surface to
> > point
> > > > to this
> > > > > +    * buffer directly. However, according to the bspec:
> > > > > +    *
> > > > > +    *    The memory layout of the clear color pointed to by this
> > > > address is a
> > > > > +    *    value stored in the lower-order bytes of a 64-byte
> > cache-line.
> > > > > +    *
> > > > > +    * So add some padding here for Gen10+.
> > > > > +    */
> > > >
> > > > I don't see any indication that the upper bytes may be modified by the
> > > > hardware. For that reason, I think we can assume that the image that
> > > > precedes this entry is at least 64 bytes and avoid padding the entry.
> > >
> > >
> > > I'm not sure what you mean by this.
> > >
> > >
> >
> > My bad, I meant to say:
> >    I don't see any indication that the upper bytes of the cacheline may
> >    be modified by the hardware. I think we can also assume that the
> >    image that precedes this entry is at least 64 bytes. For these
> >    reasons we should be able to avoid padding the entry.
> >
> 
> Yes, that means that we are free to put other stuff (such as resolve
> tracking information) after the clear color.  However, for images with
> multiple LODs, we need to have the individual per-LOD entries aligned.
> 
> --Jason
> 
> 

Got it. Thanks.

> > > > > +   return device->info.gen >= 10 ? ALIGN(entry_size, 64) :
> > entry_size;
> > > >
> > > >
> > > > >  }
> > > > >
> > > > >  static inline struct anv_address
> > > > > --
> > > > > 2.14.3
> > > > >
> > > > > _______________________________________________
> > > > > mesa-dev mailing list
> > > > > mesa-dev@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > > _______________________________________________
> > > > mesa-dev mailing list
> > > > mesa-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > > >
> >