[v2,3/3] i965/gen9: Add workarounds for object preemption.

Submitted by Rafael Antognolli on Oct. 29, 2018, 5:19 p.m.

Details

Message ID 20181029171954.25263-4-rafael.antognolli@intel.com
State New
Headers show
Series "Add object level preemption to i965." ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Rafael Antognolli Oct. 29, 2018, 5:19 p.m.
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.

We implement this by adding a new atom that tracks BRW_NEW_PRIMITIVE.
Whenever it happens, we check the current type of primitive and
enable/disable object preemption.

For now, we just ignore blorp.  The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 47 +++++++++++++++++++
 1 file changed, 47 insertions(+)

Patch hide | download patch | download mbox

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 740cb0c4d2e..3a01bab1ae1 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -5563,6 +5563,50 @@  static const struct brw_tracked_state genX(blend_constant_color) = {
 
 /* ---------------------------------------------------------------------- */
 
+#if GEN_GEN == 9
+
+/**
+ * Implement workarounds for preemption:
+ *    - WaDisableMidObjectPreemptionForGSLineStripAdj
+ *    - WaDisableMidObjectPreemptionForTrifanOrPolygon
+ */
+static void
+gen9_emit_preempt_wa(struct brw_context *brw)
+{
+   /* WaDisableMidObjectPreemptionForGSLineStripAdj
+    *
+    *    WA: Disable mid-draw preemption when draw-call is a linestrip_adj and
+    *    GS is enabled.
+    */
+   bool object_preemption =
+      !(brw->primitive == _3DPRIM_LINESTRIP_ADJ && brw->gs.enabled);
+
+   /* WaDisableMidObjectPreemptionForTrifanOrPolygon
+    *
+    *    TriFan miscompare in Execlist Preemption test. Cut index that is on a
+    *    previous context. End the previous, the resume another context with a
+    *    tri-fan or polygon, and the vertex count is corrupted. If we prempt
+    *    again we will cause corruption.
+    *
+    *    WA: Disable mid-draw preemption when draw-call has a tri-fan.
+    */
+   object_preemption =
+      object_preemption && !(brw->primitive == _3DPRIM_TRIFAN);
+
+   brw_enable_obj_preemption(brw, object_preemption);
+}
+
+static const struct brw_tracked_state gen9_preempt_wa = {
+   .dirty = {
+      .mesa = 0,
+      .brw = BRW_NEW_PRIMITIVE | BRW_NEW_GEOMETRY_PROGRAM,
+   },
+   .emit = gen9_emit_preempt_wa,
+};
+#endif
+
+/* ---------------------------------------------------------------------- */
+
 void
 genX(init_atoms)(struct brw_context *brw)
 {
@@ -5867,6 +5911,9 @@  genX(init_atoms)(struct brw_context *brw)
 
       &genX(cut_index),
       &gen8_pma_fix,
+#if GEN_GEN == 9
+      &gen9_preempt_wa,
+#endif
    };
 #endif
 

Comments

On Monday, October 29, 2018 10:19:54 AM PDT Rafael Antognolli wrote:
> Gen9 hardware requires some workarounds to disable preemption depending
> on the type of primitive being emitted.
> 
> We implement this by adding a new atom that tracks BRW_NEW_PRIMITIVE.
> Whenever it happens, we check the current type of primitive and
> enable/disable object preemption.
> 
> For now, we just ignore blorp.  The only primitive it emits is
> 3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
> safely leave preemption enabled when it happens. Or it will be disabled
> by a previous 3DPRIMITIVE, which should be fine too.
> 
> Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 47 +++++++++++++++++++
>  1 file changed, 47 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 740cb0c4d2e..3a01bab1ae1 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -5563,6 +5563,50 @@ static const struct brw_tracked_state genX(blend_constant_color) = {
>  
>  /* ---------------------------------------------------------------------- */
>  
> +#if GEN_GEN == 9
> +
> +/**

 * Enable or disable preemption based on the current primitive type.
 * (This should only be necessary on Gen9 hardware, not Gen10+.)
 *

> + * Implement workarounds for preemption:
> + *    - WaDisableMidObjectPreemptionForGSLineStripAdj
> + *    - WaDisableMidObjectPreemptionForTrifanOrPolygon
> + */
> +static void
> +gen9_emit_preempt_wa(struct brw_context *brw)
> +{

I think this might be a bit easier to follow as

   bool object_preemption = true;

   if (brw->primitive == _3DPRIM_LINESTRIP_ADJ && brw->gs.enabled)
      object_preemption = false;

   if (brw->primitive == _3DPRIM_TRIFAN)
      object_preemption = false;

   brw_enable_obj_preemption(brw, object_preemption);

(with the comments of course.)

Do we need any stalling when whacking CS_CHICKEN1...?

Looking through the workarounds list, I believe that we also need to
disable mid-object preemption for _3DPRIM_LINELOOP (Gen9 WA #0816).

We may need to disable it if instance_count > 0 in the 3DPRIMITIVE
(Gen9 WA #0798).

We may also need to disable autostripping by whacking some chicken
registers if it's enabled (Gen9 WA #0799).  Which would be lame,
because that's likely a useful optimization.  I guess we could disable
preemption for TRILIST and LINELIST as well to be safe.

And GPGPU preemption looks like a mile long list of workarounds,
so let's not try it on Gen9.

> +   /* WaDisableMidObjectPreemptionForGSLineStripAdj
> +    *
> +    *    WA: Disable mid-draw preemption when draw-call is a linestrip_adj and
> +    *    GS is enabled.
> +    */
> +   bool object_preemption =
> +      !(brw->primitive == _3DPRIM_LINESTRIP_ADJ && brw->gs.enabled);
> +
> +   /* WaDisableMidObjectPreemptionForTrifanOrPolygon
> +    *
> +    *    TriFan miscompare in Execlist Preemption test. Cut index that is on a
> +    *    previous context. End the previous, the resume another context with a
> +    *    tri-fan or polygon, and the vertex count is corrupted. If we prempt
> +    *    again we will cause corruption.
> +    *
> +    *    WA: Disable mid-draw preemption when draw-call has a tri-fan.
> +    */
> +   object_preemption =
> +      object_preemption && !(brw->primitive == _3DPRIM_TRIFAN);
> +
> +   brw_enable_obj_preemption(brw, object_preemption);
> +}
> +
> +static const struct brw_tracked_state gen9_preempt_wa = {
> +   .dirty = {
> +      .mesa = 0,
> +      .brw = BRW_NEW_PRIMITIVE | BRW_NEW_GEOMETRY_PROGRAM,
> +   },
> +   .emit = gen9_emit_preempt_wa,
> +};
> +#endif
> +
> +/* ---------------------------------------------------------------------- */
> +
>  void
>  genX(init_atoms)(struct brw_context *brw)
>  {
> @@ -5867,6 +5911,9 @@ genX(init_atoms)(struct brw_context *brw)
>  
>        &genX(cut_index),
>        &gen8_pma_fix,
> +#if GEN_GEN == 9
> +      &gen9_preempt_wa,
> +#endif
>     };
>  #endif
>  
>
On Tue, Oct 30, 2018 at 04:32:54PM -0700, Kenneth Graunke wrote:
> On Monday, October 29, 2018 10:19:54 AM PDT Rafael Antognolli wrote:
> > Gen9 hardware requires some workarounds to disable preemption depending
> > on the type of primitive being emitted.
> > 
> > We implement this by adding a new atom that tracks BRW_NEW_PRIMITIVE.
> > Whenever it happens, we check the current type of primitive and
> > enable/disable object preemption.
> > 
> > For now, we just ignore blorp.  The only primitive it emits is
> > 3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
> > safely leave preemption enabled when it happens. Or it will be disabled
> > by a previous 3DPRIMITIVE, which should be fine too.
> > 
> > Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > ---
> >  src/mesa/drivers/dri/i965/genX_state_upload.c | 47 +++++++++++++++++++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > index 740cb0c4d2e..3a01bab1ae1 100644
> > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> > @@ -5563,6 +5563,50 @@ static const struct brw_tracked_state genX(blend_constant_color) = {
> >  
> >  /* ---------------------------------------------------------------------- */
> >  
> > +#if GEN_GEN == 9
> > +
> > +/**
> 
>  * Enable or disable preemption based on the current primitive type.
>  * (This should only be necessary on Gen9 hardware, not Gen10+.)
>  *
> 
> > + * Implement workarounds for preemption:
> > + *    - WaDisableMidObjectPreemptionForGSLineStripAdj
> > + *    - WaDisableMidObjectPreemptionForTrifanOrPolygon
> > + */
> > +static void
> > +gen9_emit_preempt_wa(struct brw_context *brw)
> > +{
> 
> I think this might be a bit easier to follow as
> 
>    bool object_preemption = true;
> 
>    if (brw->primitive == _3DPRIM_LINESTRIP_ADJ && brw->gs.enabled)
>       object_preemption = false;
> 
>    if (brw->primitive == _3DPRIM_TRIFAN)
>       object_preemption = false;
> 
>    brw_enable_obj_preemption(brw, object_preemption);
> 
> (with the comments of course.)
> 
> Do we need any stalling when whacking CS_CHICKEN1...?

Hmmm... there's this:

"A fixed function pipe flush is required before modifying this field"

in the programming notes. I'm not sure what that is, but I assume it's
some type of PIPE_CONTROL?

> Looking through the workarounds list, I believe that we also need to
> disable mid-object preemption for _3DPRIM_LINELOOP (Gen9 WA #0816).
> 
> We may need to disable it if instance_count > 0 in the 3DPRIMITIVE
> (Gen9 WA #0798).

Ack.

> We may also need to disable autostripping by whacking some chicken
> registers if it's enabled (Gen9 WA #0799).  Which would be lame,

Looking again at #0799, it seems it's only applicable up to C0 on SKL,
and B0 on BXT. So maybe we should be fine here? Or just disable it on
BXT?

> because that's likely a useful optimization.  I guess we could disable
> preemption for TRILIST and LINELIST as well to be safe.

Is this because of the autostripping mentioned above, or some other
workaround? Or just your impression?

I can update it to include that, but just want to be sure that it's
still applicable, once we figure the thing about #0799.

> 
> And GPGPU preemption looks like a mile long list of workarounds,
> so let's not try it on Gen9.

Fair enough, thanks a lot for the review!

> > +   /* WaDisableMidObjectPreemptionForGSLineStripAdj
> > +    *
> > +    *    WA: Disable mid-draw preemption when draw-call is a linestrip_adj and
> > +    *    GS is enabled.
> > +    */
> > +   bool object_preemption =
> > +      !(brw->primitive == _3DPRIM_LINESTRIP_ADJ && brw->gs.enabled);
> > +
> > +   /* WaDisableMidObjectPreemptionForTrifanOrPolygon
> > +    *
> > +    *    TriFan miscompare in Execlist Preemption test. Cut index that is on a
> > +    *    previous context. End the previous, the resume another context with a
> > +    *    tri-fan or polygon, and the vertex count is corrupted. If we prempt
> > +    *    again we will cause corruption.
> > +    *
> > +    *    WA: Disable mid-draw preemption when draw-call has a tri-fan.
> > +    */
> > +   object_preemption =
> > +      object_preemption && !(brw->primitive == _3DPRIM_TRIFAN);
> > +
> > +   brw_enable_obj_preemption(brw, object_preemption);
> > +}
> > +
> > +static const struct brw_tracked_state gen9_preempt_wa = {
> > +   .dirty = {
> > +      .mesa = 0,
> > +      .brw = BRW_NEW_PRIMITIVE | BRW_NEW_GEOMETRY_PROGRAM,
> > +   },
> > +   .emit = gen9_emit_preempt_wa,
> > +};
> > +#endif
> > +
> > +/* ---------------------------------------------------------------------- */
> > +
> >  void
> >  genX(init_atoms)(struct brw_context *brw)
> >  {
> > @@ -5867,6 +5911,9 @@ genX(init_atoms)(struct brw_context *brw)
> >  
> >        &genX(cut_index),
> >        &gen8_pma_fix,
> > +#if GEN_GEN == 9
> > +      &gen9_preempt_wa,
> > +#endif
> >     };
> >  #endif
> >  
> > 
>
On Wednesday, October 31, 2018 11:15:28 AM PDT Rafael Antognolli wrote:
> On Tue, Oct 30, 2018 at 04:32:54PM -0700, Kenneth Graunke wrote:
> > On Monday, October 29, 2018 10:19:54 AM PDT Rafael Antognolli wrote:
> > Do we need any stalling when whacking CS_CHICKEN1...?
> 
> Hmmm... there's this:
> 
> "A fixed function pipe flush is required before modifying this field"
> 
> in the programming notes. I'm not sure what that is, but I assume it's
> some type of PIPE_CONTROL?

Yeah.  I'm not honestly sure what kind - "fixed function pipe flush"
isn't a thing.  Nobody ever uses wording that corresponds to actual
mechanics of the hardware. :(

Maybe this would work:

    brw_emit_end_of_pipe_sync(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH);

> > We may also need to disable autostripping by whacking some chicken
> > registers if it's enabled (Gen9 WA #0799).  Which would be lame,
> 
> Looking again at #0799, it seems it's only applicable up to C0 on SKL,
> and B0 on BXT. So maybe we should be fine here? Or just disable it on
> BXT?

You're right, I misread that.  (I saw the Gen8 "FROM" tag and didn't
notice that it uses "UNTIL" on Gen9...)  Nothing to do here.

> > because that's likely a useful optimization.  I guess we could disable
> > preemption for TRILIST and LINELIST as well to be safe.
> 
> Is this because of the autostripping mentioned above, or some other
> workaround? Or just your impression?
> 
> I can update it to include that, but just want to be sure that it's
> still applicable, once we figure the thing about #0799.

Autostrip converts TRILIST/LINELIST into TRISTRIP/LINESTRIP, so
the idea would be to avoid preemption for anything that hits the
autostrip feature.  But, no need, as you noted above.

--Ken
On Wed, Oct 31, 2018 at 04:27:31PM -0700, Kenneth Graunke wrote:
> On Wednesday, October 31, 2018 11:15:28 AM PDT Rafael Antognolli wrote:
> > On Tue, Oct 30, 2018 at 04:32:54PM -0700, Kenneth Graunke wrote:
> > > On Monday, October 29, 2018 10:19:54 AM PDT Rafael Antognolli wrote:
> > > Do we need any stalling when whacking CS_CHICKEN1...?
> > 
> > Hmmm... there's this:
> > 
> > "A fixed function pipe flush is required before modifying this field"
> > 
> > in the programming notes. I'm not sure what that is, but I assume it's
> > some type of PIPE_CONTROL?
> 
> Yeah.  I'm not honestly sure what kind - "fixed function pipe flush"
> isn't a thing.  Nobody ever uses wording that corresponds to actual
> mechanics of the hardware. :(
> 
> Maybe this would work:
> 
>     brw_emit_end_of_pipe_sync(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH);

Hey Ken,

Ressurrecting this old thread... I just noticed that I had this in patch
2/3 (inside brw_enable_obj_preemption()):

 brw_emit_pipe_control_flush(brw, PIPE_CONTROL_FLUSH_ENABLE);

That's bit 7 of the PIPE_CONTROL, and from the docs:

 "Hardware on parsing PIPECONTROL command with Pipe Control Flush Enable
 set will wait for all the outstanding post sync operations
 corresponding to previously executed PIPECONTROL commands are complete
 before making forward progress."

Do you think that's maybe what they meant? And in that case, I guess
maybe I would need a PIPE_CONTROL with post sync operation right before
this one, right?

> > > We may also need to disable autostripping by whacking some chicken
> > > registers if it's enabled (Gen9 WA #0799).  Which would be lame,
> > 
> > Looking again at #0799, it seems it's only applicable up to C0 on SKL,
> > and B0 on BXT. So maybe we should be fine here? Or just disable it on
> > BXT?
> 
> You're right, I misread that.  (I saw the Gen8 "FROM" tag and didn't
> notice that it uses "UNTIL" on Gen9...)  Nothing to do here.
> 
> > > because that's likely a useful optimization.  I guess we could disable
> > > preemption for TRILIST and LINELIST as well to be safe.
> > 
> > Is this because of the autostripping mentioned above, or some other
> > workaround? Or just your impression?
> > 
> > I can update it to include that, but just want to be sure that it's
> > still applicable, once we figure the thing about #0799.
> 
> Autostrip converts TRILIST/LINELIST into TRISTRIP/LINESTRIP, so
> the idea would be to avoid preemption for anything that hits the
> autostrip feature.  But, no need, as you noted above.
> 
> --Ken
On Thursday, December 13, 2018 5:00:43 PM PST Rafael Antognolli wrote:
> On Wed, Oct 31, 2018 at 04:27:31PM -0700, Kenneth Graunke wrote:
> > On Wednesday, October 31, 2018 11:15:28 AM PDT Rafael Antognolli wrote:
> > > On Tue, Oct 30, 2018 at 04:32:54PM -0700, Kenneth Graunke wrote:
> > > > On Monday, October 29, 2018 10:19:54 AM PDT Rafael Antognolli wrote:
> > > > Do we need any stalling when whacking CS_CHICKEN1...?
> > > 
> > > Hmmm... there's this:
> > > 
> > > "A fixed function pipe flush is required before modifying this field"
> > > 
> > > in the programming notes. I'm not sure what that is, but I assume it's
> > > some type of PIPE_CONTROL?
> > 
> > Yeah.  I'm not honestly sure what kind - "fixed function pipe flush"
> > isn't a thing.  Nobody ever uses wording that corresponds to actual
> > mechanics of the hardware. :(
> > 
> > Maybe this would work:
> > 
> >     brw_emit_end_of_pipe_sync(brw, PIPE_CONTROL_RENDER_TARGET_FLUSH);
> 
> Hey Ken,
> 
> Ressurrecting this old thread... I just noticed that I had this in patch
> 2/3 (inside brw_enable_obj_preemption()):
> 
>  brw_emit_pipe_control_flush(brw, PIPE_CONTROL_FLUSH_ENABLE);
> 
> That's bit 7 of the PIPE_CONTROL, and from the docs:
> 
>  "Hardware on parsing PIPECONTROL command with Pipe Control Flush Enable
>  set will wait for all the outstanding post sync operations
>  corresponding to previously executed PIPECONTROL commands are complete
>  before making forward progress."
> 
> Do you think that's maybe what they meant? And in that case, I guess
> maybe I would need a PIPE_CONTROL with post sync operation right before
> this one, right?

I'm not sure.  A post-sync write immediate to the workaround buffer
followed by a pipe control flush enable sounds plausible.  My earlier
suggestion of render target flush as an end-of-pipe sync sounds
plausible as well.  I think what you sent in v3 is fine until we find
otherwise.