[1/3] panfrost: Make sure a clear does not re-use a pre-existing batch

Submitted by Boris Brezillon on Sept. 20, 2019, 2:53 p.m.

Details

Message ID 20190920145339.27781-1-boris.brezillon@collabora.com
State New
Headers show
Series "Series without cover letter" ( rev: 1 ) in Mesa

Not browsing as part of any series.

Commit Message

Boris Brezillon Sept. 20, 2019, 2:53 p.m.
glClear()s are expected to be the first thing GL apps do before drawing
new things. If there's already an existing batch targetting the same
FBO that has draws attached to it, we should make sure the new clear
gets a new batch assigned to guaranteed that the FB content is actually
cleared with the requested color/depth/stencil values.

We create a panfrost_get_fresh_batch_for_fbo() helper for that and
call it from panfrost_clear().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 src/gallium/drivers/panfrost/pan_context.c |  2 +-
 src/gallium/drivers/panfrost/pan_job.c     | 21 +++++++++++++++++++++
 src/gallium/drivers/panfrost/pan_job.h     |  3 +++
 3 files changed, 25 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index ac01461a07fe..b2f2a9da7a51 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -162,7 +162,7 @@  panfrost_clear(
         double depth, unsigned stencil)
 {
         struct panfrost_context *ctx = pan_context(pipe);
-        struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
+        struct panfrost_batch *batch = panfrost_get_fresh_batch_for_fbo(ctx);
 
         panfrost_batch_add_fbo_bos(batch);
         panfrost_batch_clear(batch, buffers, color, depth, stencil);
diff --git a/src/gallium/drivers/panfrost/pan_job.c b/src/gallium/drivers/panfrost/pan_job.c
index d8330bc133a6..4ec2aa0565d7 100644
--- a/src/gallium/drivers/panfrost/pan_job.c
+++ b/src/gallium/drivers/panfrost/pan_job.c
@@ -298,6 +298,27 @@  panfrost_get_batch_for_fbo(struct panfrost_context *ctx)
         return batch;
 }
 
+struct panfrost_batch *
+panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
+{
+        struct panfrost_batch *batch;
+
+        batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
+
+        /* The batch has no draw/clear queued, let's return it directly.
+         * Note that it's perfectly fine to re-use a batch with an
+         * existing clear, we'll just update it with the new clear request.
+         */
+        if (!batch->last_job.gpu)
+                return batch;
+
+        /* Otherwise, we need to freeze the existing one and instantiate a new
+         * one.
+         */
+        panfrost_freeze_batch(batch);
+        return panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
+}
+
 static bool
 panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence)
 {
diff --git a/src/gallium/drivers/panfrost/pan_job.h b/src/gallium/drivers/panfrost/pan_job.h
index e1b1f56a2e64..0bd78bba267a 100644
--- a/src/gallium/drivers/panfrost/pan_job.h
+++ b/src/gallium/drivers/panfrost/pan_job.h
@@ -172,6 +172,9 @@  panfrost_batch_fence_reference(struct panfrost_batch_fence *batch);
 struct panfrost_batch *
 panfrost_get_batch_for_fbo(struct panfrost_context *ctx);
 
+struct panfrost_batch *
+panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx);
+
 void
 panfrost_batch_init(struct panfrost_context *ctx);
 

Comments

To be clear, if we have a batch and do the following operations:

	clear red
	draw 1
	clear green
	draw 2
	flush

All we should see is #2 on a green background, which this patch handles
by the second clear invalidating all the clears/draws that came before
it (provided there is no flush in between). 

I might just be tripped up by the "freeze" name. That really means throw
away / free here, I guess?

Provided that's the idea (and we're not somehow saving the original draw
1), it's Reviewed-by A R <a.r@c.com>

On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> glClear()s are expected to be the first thing GL apps do before drawing
> new things. If there's already an existing batch targetting the same
> FBO that has draws attached to it, we should make sure the new clear
> gets a new batch assigned to guaranteed that the FB content is actually
> cleared with the requested color/depth/stencil values.
> 
> We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> call it from panfrost_clear().
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  src/gallium/drivers/panfrost/pan_context.c |  2 +-
>  src/gallium/drivers/panfrost/pan_job.c     | 21 +++++++++++++++++++++
>  src/gallium/drivers/panfrost/pan_job.h     |  3 +++
>  3 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
> index ac01461a07fe..b2f2a9da7a51 100644
> --- a/src/gallium/drivers/panfrost/pan_context.c
> +++ b/src/gallium/drivers/panfrost/pan_context.c
> @@ -162,7 +162,7 @@ panfrost_clear(
>          double depth, unsigned stencil)
>  {
>          struct panfrost_context *ctx = pan_context(pipe);
> -        struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> +        struct panfrost_batch *batch = panfrost_get_fresh_batch_for_fbo(ctx);
>  
>          panfrost_batch_add_fbo_bos(batch);
>          panfrost_batch_clear(batch, buffers, color, depth, stencil);
> diff --git a/src/gallium/drivers/panfrost/pan_job.c b/src/gallium/drivers/panfrost/pan_job.c
> index d8330bc133a6..4ec2aa0565d7 100644
> --- a/src/gallium/drivers/panfrost/pan_job.c
> +++ b/src/gallium/drivers/panfrost/pan_job.c
> @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context *ctx)
>          return batch;
>  }
>  
> +struct panfrost_batch *
> +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> +{
> +        struct panfrost_batch *batch;
> +
> +        batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> +
> +        /* The batch has no draw/clear queued, let's return it directly.
> +         * Note that it's perfectly fine to re-use a batch with an
> +         * existing clear, we'll just update it with the new clear request.
> +         */
> +        if (!batch->last_job.gpu)
> +                return batch;
> +
> +        /* Otherwise, we need to freeze the existing one and instantiate a new
> +         * one.
> +         */
> +        panfrost_freeze_batch(batch);
> +        return panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> +}
> +
>  static bool
>  panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence)
>  {
> diff --git a/src/gallium/drivers/panfrost/pan_job.h b/src/gallium/drivers/panfrost/pan_job.h
> index e1b1f56a2e64..0bd78bba267a 100644
> --- a/src/gallium/drivers/panfrost/pan_job.h
> +++ b/src/gallium/drivers/panfrost/pan_job.h
> @@ -172,6 +172,9 @@ panfrost_batch_fence_reference(struct panfrost_batch_fence *batch);
>  struct panfrost_batch *
>  panfrost_get_batch_for_fbo(struct panfrost_context *ctx);
>  
> +struct panfrost_batch *
> +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx);
> +
>  void
>  panfrost_batch_init(struct panfrost_context *ctx);
>  
> -- 
> 2.21.0
On Fri, 20 Sep 2019 15:45:33 -0400
Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:

> To be clear, if we have a batch and do the following operations:
> 
> 	clear red
> 	draw 1
> 	clear green
> 	draw 2
> 	flush
> 
> All we should see is #2 on a green background, which this patch handles
> by the second clear invalidating all the clears/draws that came before
> it (provided there is no flush in between). 
> 
> I might just be tripped up by the "freeze" name. That really means throw
> away / free here, I guess?

Nope. Freeze means "stop queuing new draws to this batch". I guess we
could free the batch as well if the result of the previous draws/clear
are really overwritten by this new clear, but that implies checking the
new clear flags to make sure they target the same depth/stencil/color
combination. On the other hand, I'm wondering if it's really the
driver's job to try to optimize silly things the apps might do. I mean,
the sequence you describe does not look like a wise thing to do since
the "clear red+draw 1" end up being overwritten by "clear green + draw
2".

> 
> Provided that's the idea (and we're not somehow saving the original draw
> 1), it's Reviewed-by A R <a.r@c.com>
> 
> On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> > glClear()s are expected to be the first thing GL apps do before drawing
> > new things. If there's already an existing batch targetting the same
> > FBO that has draws attached to it, we should make sure the new clear
> > gets a new batch assigned to guaranteed that the FB content is actually
> > cleared with the requested color/depth/stencil values.
> > 
> > We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> > call it from panfrost_clear().
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> >  src/gallium/drivers/panfrost/pan_context.c |  2 +-
> >  src/gallium/drivers/panfrost/pan_job.c     | 21 +++++++++++++++++++++
> >  src/gallium/drivers/panfrost/pan_job.h     |  3 +++
> >  3 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
> > index ac01461a07fe..b2f2a9da7a51 100644
> > --- a/src/gallium/drivers/panfrost/pan_context.c
> > +++ b/src/gallium/drivers/panfrost/pan_context.c
> > @@ -162,7 +162,7 @@ panfrost_clear(
> >          double depth, unsigned stencil)
> >  {
> >          struct panfrost_context *ctx = pan_context(pipe);
> > -        struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> > +        struct panfrost_batch *batch = panfrost_get_fresh_batch_for_fbo(ctx);
> >  
> >          panfrost_batch_add_fbo_bos(batch);
> >          panfrost_batch_clear(batch, buffers, color, depth, stencil);
> > diff --git a/src/gallium/drivers/panfrost/pan_job.c b/src/gallium/drivers/panfrost/pan_job.c
> > index d8330bc133a6..4ec2aa0565d7 100644
> > --- a/src/gallium/drivers/panfrost/pan_job.c
> > +++ b/src/gallium/drivers/panfrost/pan_job.c
> > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context *ctx)
> >          return batch;
> >  }
> >  
> > +struct panfrost_batch *
> > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> > +{
> > +        struct panfrost_batch *batch;
> > +
> > +        batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > +
> > +        /* The batch has no draw/clear queued, let's return it directly.
> > +         * Note that it's perfectly fine to re-use a batch with an
> > +         * existing clear, we'll just update it with the new clear request.
> > +         */
> > +        if (!batch->last_job.gpu)
> > +                return batch;
> > +
> > +        /* Otherwise, we need to freeze the existing one and instantiate a new
> > +         * one.
> > +         */
> > +        panfrost_freeze_batch(batch);
> > +        return panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > +}
> > +
> >  static bool
> >  panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence)
> >  {
> > diff --git a/src/gallium/drivers/panfrost/pan_job.h b/src/gallium/drivers/panfrost/pan_job.h
> > index e1b1f56a2e64..0bd78bba267a 100644
> > --- a/src/gallium/drivers/panfrost/pan_job.h
> > +++ b/src/gallium/drivers/panfrost/pan_job.h
> > @@ -172,6 +172,9 @@ panfrost_batch_fence_reference(struct panfrost_batch_fence *batch);
> >  struct panfrost_batch *
> >  panfrost_get_batch_for_fbo(struct panfrost_context *ctx);
> >  
> > +struct panfrost_batch *
> > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx);
> > +
> >  void
> >  panfrost_batch_init(struct panfrost_context *ctx);
> >  
> > -- 
> > 2.21.0
> > To be clear, if we have a batch and do the following operations:
> > 
> > 	clear red
> > 	draw 1
> > 	clear green
> > 	draw 2
> > 	flush
> > 
> > All we should see is #2 on a green background, which this patch handles
> > by the second clear invalidating all the clears/draws that came before
> > it (provided there is no flush in between). 
> > 
> > I might just be tripped up by the "freeze" name. That really means throw
> > away / free here, I guess?
> 
> Nope. Freeze means "stop queuing new draws to this batch". I guess we
> could free the batch as well if the result of the previous draws/clear
> are really overwritten by this new clear, but that implies checking the
> new clear flags to make sure they target the same depth/stencil/color
> combination. On the other hand, I'm wondering if it's really the
> driver's job to try to optimize silly things the apps might do. I mean,
> the sequence you describe does not look like a wise thing to do since
> the "clear red+draw 1" end up being overwritten by "clear green + draw
> 2".

I'm quite confused how this patch works, then.

A few thoughts: if the app clears all buffers in the middle, then yes
it's silly and yes we may as well optimize it out. (Should that be a thing GL
drivers have to do? I mean, if the other drivers are too...)

If the sequence is more like:

	clear all buffers
	draw 1
	clear color buffer (preserve depth stencil)
	draw 2
	flush

That second clear should really be done by drawing a full screen quad,
just like if we were wallpapering, except loading its colour from a
uniform instead of a texture.

Similarly, a depth-only clear mid-frame can be emulated by drawing a
full-screen quad with the gl_Position.zw components juryrigged to the
desired depth components, and disabling colour draws by setting the
colour mask to 0x0. That also means you can skip having any shader at
all (literally set the shader pointer to 0x0) so that's faster.

Finally, a stencil-only clear can occur similarly playing tricks with
the stencil test parameters.

I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
on our behalf, but I have not researched it extensively.

In any case, for a partial clear mid-frame, we would much rather do:

	clear all buffers
	draw 1
	draw fullscreen quad (disable Z/S writes)
	draw 2
	flush

than what it sounds like this patch does

	clear all buffers
	draw 1
	flush

	clear all buffers
	wallpaper
	draw 2
	flush

Please do correct me if I've misunderstood the logic here.

> 
> > 
> > Provided that's the idea (and we're not somehow saving the original draw
> > 1), it's Reviewed-by A R <a.r@c.com>
> > 
> > On Fri, Sep 20, 2019 at 04:53:37PM +0200, Boris Brezillon wrote:
> > > glClear()s are expected to be the first thing GL apps do before drawing
> > > new things. If there's already an existing batch targetting the same
> > > FBO that has draws attached to it, we should make sure the new clear
> > > gets a new batch assigned to guaranteed that the FB content is actually
> > > cleared with the requested color/depth/stencil values.
> > > 
> > > We create a panfrost_get_fresh_batch_for_fbo() helper for that and
> > > call it from panfrost_clear().
> > > 
> > > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > ---
> > >  src/gallium/drivers/panfrost/pan_context.c |  2 +-
> > >  src/gallium/drivers/panfrost/pan_job.c     | 21 +++++++++++++++++++++
> > >  src/gallium/drivers/panfrost/pan_job.h     |  3 +++
> > >  3 files changed, 25 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
> > > index ac01461a07fe..b2f2a9da7a51 100644
> > > --- a/src/gallium/drivers/panfrost/pan_context.c
> > > +++ b/src/gallium/drivers/panfrost/pan_context.c
> > > @@ -162,7 +162,7 @@ panfrost_clear(
> > >          double depth, unsigned stencil)
> > >  {
> > >          struct panfrost_context *ctx = pan_context(pipe);
> > > -        struct panfrost_batch *batch = panfrost_get_batch_for_fbo(ctx);
> > > +        struct panfrost_batch *batch = panfrost_get_fresh_batch_for_fbo(ctx);
> > >  
> > >          panfrost_batch_add_fbo_bos(batch);
> > >          panfrost_batch_clear(batch, buffers, color, depth, stencil);
> > > diff --git a/src/gallium/drivers/panfrost/pan_job.c b/src/gallium/drivers/panfrost/pan_job.c
> > > index d8330bc133a6..4ec2aa0565d7 100644
> > > --- a/src/gallium/drivers/panfrost/pan_job.c
> > > +++ b/src/gallium/drivers/panfrost/pan_job.c
> > > @@ -298,6 +298,27 @@ panfrost_get_batch_for_fbo(struct panfrost_context *ctx)
> > >          return batch;
> > >  }
> > >  
> > > +struct panfrost_batch *
> > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx)
> > > +{
> > > +        struct panfrost_batch *batch;
> > > +
> > > +        batch = panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > > +
> > > +        /* The batch has no draw/clear queued, let's return it directly.
> > > +         * Note that it's perfectly fine to re-use a batch with an
> > > +         * existing clear, we'll just update it with the new clear request.
> > > +         */
> > > +        if (!batch->last_job.gpu)
> > > +                return batch;
> > > +
> > > +        /* Otherwise, we need to freeze the existing one and instantiate a new
> > > +         * one.
> > > +         */
> > > +        panfrost_freeze_batch(batch);
> > > +        return panfrost_get_batch(ctx, &ctx->pipe_framebuffer);
> > > +}
> > > +
> > >  static bool
> > >  panfrost_batch_fence_is_signaled(struct panfrost_batch_fence *fence)
> > >  {
> > > diff --git a/src/gallium/drivers/panfrost/pan_job.h b/src/gallium/drivers/panfrost/pan_job.h
> > > index e1b1f56a2e64..0bd78bba267a 100644
> > > --- a/src/gallium/drivers/panfrost/pan_job.h
> > > +++ b/src/gallium/drivers/panfrost/pan_job.h
> > > @@ -172,6 +172,9 @@ panfrost_batch_fence_reference(struct panfrost_batch_fence *batch);
> > >  struct panfrost_batch *
> > >  panfrost_get_batch_for_fbo(struct panfrost_context *ctx);
> > >  
> > > +struct panfrost_batch *
> > > +panfrost_get_fresh_batch_for_fbo(struct panfrost_context *ctx);
> > > +
> > >  void
> > >  panfrost_batch_init(struct panfrost_context *ctx);
> > >  
> > > -- 
> > > 2.21.0
On Sun, 22 Sep 2019 08:38:30 -0400
Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:

> > > To be clear, if we have a batch and do the following operations:
> > > 
> > > 	clear red
> > > 	draw 1
> > > 	clear green
> > > 	draw 2
> > > 	flush
> > > 
> > > All we should see is #2 on a green background, which this patch handles
> > > by the second clear invalidating all the clears/draws that came before
> > > it (provided there is no flush in between). 
> > > 
> > > I might just be tripped up by the "freeze" name. That really means throw
> > > away / free here, I guess?  
> > 
> > Nope. Freeze means "stop queuing new draws to this batch". I guess we
> > could free the batch as well if the result of the previous draws/clear
> > are really overwritten by this new clear, but that implies checking the
> > new clear flags to make sure they target the same depth/stencil/color
> > combination. On the other hand, I'm wondering if it's really the
> > driver's job to try to optimize silly things the apps might do. I mean,
> > the sequence you describe does not look like a wise thing to do since
> > the "clear red+draw 1" end up being overwritten by "clear green + draw
> > 2".  
> 
> I'm quite confused how this patch works, then.
> 
> A few thoughts: if the app clears all buffers in the middle, then yes
> it's silly and yes we may as well optimize it out. (Should that be a thing GL
> drivers have to do? I mean, if the other drivers are too...)
> 
> If the sequence is more like:
> 
> 	clear all buffers
> 	draw 1
> 	clear color buffer (preserve depth stencil)
> 	draw 2
> 	flush
> 
> That second clear should really be done by drawing a full screen quad,
> just like if we were wallpapering, except loading its colour from a
> uniform instead of a texture.
> 
> Similarly, a depth-only clear mid-frame can be emulated by drawing a
> full-screen quad with the gl_Position.zw components juryrigged to the
> desired depth components, and disabling colour draws by setting the
> colour mask to 0x0. That also means you can skip having any shader at
> all (literally set the shader pointer to 0x0) so that's faster.
> 
> Finally, a stencil-only clear can occur similarly playing tricks with
> the stencil test parameters.
> 
> I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
> on our behalf, but I have not researched it extensively.

AFAIU, mesa/st does not transform the clear into a quad-draw for
depth/stencil only clears, it only does that for the color(s)-masked
case. That's certainly something we can do in Panfrost if we want to
avoid creating a new batch in such situations though.

> 
> In any case, for a partial clear mid-frame, we would much rather do:
> 
> 	clear all buffers
> 	draw 1
> 	draw fullscreen quad (disable Z/S writes)
> 	draw 2
> 	flush
> 
> than what it sounds like this patch does
> 
> 	clear all buffers
> 	draw 1
> 	flush
> 
> 	clear all buffers
> 	wallpaper
> 	draw 2
> 	flush
> 
> Please do correct me if I've misunderstood the logic here.


There's no flush introduced by this patch, it just adds a dep between
batch 2 and 1:

batch1:	clear all buffers
 	draw 1

batch2: clear all buffers
	wallpaper
	draw 2

	flush (actually flushes batch 1 and 2)

with batch2 --depends-on--> batch1
On Sun, 22 Sep 2019 15:24:10 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> On Sun, 22 Sep 2019 08:38:30 -0400
> Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote:
> 
> > > > To be clear, if we have a batch and do the following operations:
> > > > 
> > > > 	clear red
> > > > 	draw 1
> > > > 	clear green
> > > > 	draw 2
> > > > 	flush
> > > > 
> > > > All we should see is #2 on a green background, which this patch handles
> > > > by the second clear invalidating all the clears/draws that came before
> > > > it (provided there is no flush in between). 
> > > > 
> > > > I might just be tripped up by the "freeze" name. That really means throw
> > > > away / free here, I guess?    
> > > 
> > > Nope. Freeze means "stop queuing new draws to this batch". I guess we
> > > could free the batch as well if the result of the previous draws/clear
> > > are really overwritten by this new clear, but that implies checking the
> > > new clear flags to make sure they target the same depth/stencil/color
> > > combination. On the other hand, I'm wondering if it's really the
> > > driver's job to try to optimize silly things the apps might do. I mean,
> > > the sequence you describe does not look like a wise thing to do since
> > > the "clear red+draw 1" end up being overwritten by "clear green + draw
> > > 2".    
> > 
> > I'm quite confused how this patch works, then.
> > 
> > A few thoughts: if the app clears all buffers in the middle, then yes
> > it's silly and yes we may as well optimize it out. (Should that be a thing GL
> > drivers have to do? I mean, if the other drivers are too...)
> > 
> > If the sequence is more like:
> > 
> > 	clear all buffers
> > 	draw 1
> > 	clear color buffer (preserve depth stencil)
> > 	draw 2
> > 	flush
> > 
> > That second clear should really be done by drawing a full screen quad,
> > just like if we were wallpapering, except loading its colour from a
> > uniform instead of a texture.
> > 
> > Similarly, a depth-only clear mid-frame can be emulated by drawing a
> > full-screen quad with the gl_Position.zw components juryrigged to the
> > desired depth components, and disabling colour draws by setting the
> > colour mask to 0x0. That also means you can skip having any shader at
> > all (literally set the shader pointer to 0x0) so that's faster.
> > 
> > Finally, a stencil-only clear can occur similarly playing tricks with
> > the stencil test parameters.
> > 
> > I suspect u_blitter or mesa/st is capable of doing these sorts of tricks
> > on our behalf, but I have not researched it extensively.  
> 
> AFAIU, mesa/st does not transform the clear into a quad-draw for
> depth/stencil only clears, it only does that for the color(s)-masked
> case. That's certainly something we can do in Panfrost if we want to
> avoid creating a new batch in such situations though.

One more thing: optimization of the above scenario is probably
something we'll want to have at some point, but I think the current
patch is worth applying in the meantime. All this patch does is
enforcing ordering of clears/draws to make sure the end result matches
users expectation.

> 
> > 
> > In any case, for a partial clear mid-frame, we would much rather do:
> > 
> > 	clear all buffers
> > 	draw 1
> > 	draw fullscreen quad (disable Z/S writes)
> > 	draw 2
> > 	flush
> > 
> > than what it sounds like this patch does
> > 
> > 	clear all buffers
> > 	draw 1
> > 	flush
> > 
> > 	clear all buffers
> > 	wallpaper
> > 	draw 2
> > 	flush
> > 
> > Please do correct me if I've misunderstood the logic here.  
> 
> 
> There's no flush introduced by this patch, it just adds a dep between
> batch 2 and 1:
> 
> batch1:	clear all buffers
>  	draw 1
> 
> batch2: clear all buffers
> 	wallpaper
> 	draw 2
> 
> 	flush (actually flushes batch 1 and 2)
> 
> with batch2 --depends-on--> batch1