[v5] etnaviv: fix resource usage tracking across different pipe_context's

Submitted by Boris Brezillon on Feb. 22, 2019, 9:30 a.m.

Details

Message ID 20190222103003.30b4080f@collabora.com
State New
Headers show
Series "etnaviv: fix resource usage tracking across different pipe_context's" ( rev: 6 ) in Mesa

Not browsing as part of any series.

Commit Message

Boris Brezillon Feb. 22, 2019, 9:30 a.m.
On Thu, 21 Feb 2019 23:29:53 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> Christian, Marek,
> 
> On Wed, 30 Jan 2019 05:28:14 +0100
> Marek Vasut <marex@denx.de> wrote:
> 
> > From: Christian Gmeiner <christian.gmeiner@gmail.com>
> > 
> > A pipe_resource can be shared by all the pipe_context's hanging off the
> > same pipe_screen.  
> 
> We seem to be impacted by the problem you're fixing here, but, while
> this patch definitely make things much better, the problem does not
> seem to be entirely fixed (at least in our case).
> 
> A bit more context: we have Qt App using QtWebEngine to render html
> content. When we call QtWebEngine::initialize(), which as for effect
> to allow shared GL contexts, we sometimes notice that part of the web
> page is mis-rendered. That does not happen when we omit the
> QtWebEngine::initialize() call.
> As said above, this patch make those rendering issues less likely to
> happen, but we still have the problem from time to time. So I thought
> I'd share my guesses about what could cause these issues before
> debugging it further.
> 
> First thing I noticed: I couldn't reproduce the problem with [1]
> applied (+ a tiny change forcing CPU-based tiling no matter the size of
> the region to tile/untile). So, my guess is that it's related to how we
> handle GPU-based tiling/untiling.
> Also noticed that we're testing the rsc->status here [2] without the
> screen->lock held, and there might be a race with another thread calling
> resource_used(). We might also lack a resource_read(ctx, &src->base)
> here [3]. But even after fixing those problems, the rendering issues
> are not gone.

I tested again with the following diff applied on top of your patch, and
the remaining rendering issues we had seem to be gone (don't know what I
messed up in my previous tests :-/).

--->8---

Patch hide | download patch | download mbox

diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c b/src/gallium/drivers/etnaviv/etnaviv_rs.c
index fc4f65dbeee1..b8c8b96a6f72 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
@@ -729,6 +729,7 @@  etna_try_rs_blit(struct pipe_context *pctx,
 
    etna_submit_rs_state(ctx, &copy_to_screen);
    resource_written(ctx, &dst->base);
+   resource_read(ctx, &src->base);
    dst->seqno++;
    dst->levels[blit_info->dst.level].ts_valid = false;
    ctx->dirty |= ETNA_DIRTY_DERIVE_TS;
diff --git a/src/gallium/drivers/etnaviv/etnaviv_transfer.c b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
index a3013e624ead..e4b2ac605e63 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_transfer.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
@@ -356,6 +356,7 @@  etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
     * transfers without a temporary resource.
     */
    if (trans->rsc || !(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
+      struct etna_screen *screen = ctx->screen;
       uint32_t prep_flags = 0;
 
       /*
@@ -364,11 +365,13 @@  etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
        * current GPU usage (reads must wait for GPU writes, writes must have
        * exclusive access to the buffer).
        */
+      mtx_lock(&screen->lock);
       if ((trans->rsc && (etna_resource(trans->rsc)->status & ETNA_PENDING_WRITE)) ||
           (!trans->rsc &&
            (((usage & PIPE_TRANSFER_READ) && (rsc->status & ETNA_PENDING_WRITE)) ||
            ((usage & PIPE_TRANSFER_WRITE) && rsc->status))))
          pctx->flush(pctx, NULL, 0);
+      mtx_unlock(&screen->lock);
 
       if (usage & PIPE_TRANSFER_READ)
          prep_flags |= DRM_ETNA_PREP_READ;

Comments

On 2/22/19 10:30 AM, Boris Brezillon wrote:
> On Thu, 21 Feb 2019 23:29:53 +0100
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
>> Christian, Marek,
>>
>> On Wed, 30 Jan 2019 05:28:14 +0100
>> Marek Vasut <marex@denx.de> wrote:
>>
>>> From: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>
>>> A pipe_resource can be shared by all the pipe_context's hanging off the
>>> same pipe_screen.  
>>
>> We seem to be impacted by the problem you're fixing here, but, while
>> this patch definitely make things much better, the problem does not
>> seem to be entirely fixed (at least in our case).
>>
>> A bit more context: we have Qt App using QtWebEngine to render html
>> content. When we call QtWebEngine::initialize(), which as for effect
>> to allow shared GL contexts, we sometimes notice that part of the web
>> page is mis-rendered. That does not happen when we omit the
>> QtWebEngine::initialize() call.
>> As said above, this patch make those rendering issues less likely to
>> happen, but we still have the problem from time to time. So I thought
>> I'd share my guesses about what could cause these issues before
>> debugging it further.
>>
>> First thing I noticed: I couldn't reproduce the problem with [1]
>> applied (+ a tiny change forcing CPU-based tiling no matter the size of
>> the region to tile/untile). So, my guess is that it's related to how we
>> handle GPU-based tiling/untiling.
>> Also noticed that we're testing the rsc->status here [2] without the
>> screen->lock held, and there might be a race with another thread calling
>> resource_used(). We might also lack a resource_read(ctx, &src->base)
>> here [3]. But even after fixing those problems, the rendering issues
>> are not gone.
> 
> I tested again with the following diff applied on top of your patch, and
> the remaining rendering issues we had seem to be gone (don't know what I
> messed up in my previous tests :-/).
> 
> --->8---
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> index fc4f65dbeee1..b8c8b96a6f72 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> @@ -729,6 +729,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
>  
>     etna_submit_rs_state(ctx, &copy_to_screen);
>     resource_written(ctx, &dst->base);
> +   resource_read(ctx, &src->base);
>     dst->seqno++;
>     dst->levels[blit_info->dst.level].ts_valid = false;
>     ctx->dirty |= ETNA_DIRTY_DERIVE_TS;
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_transfer.c b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
> index a3013e624ead..e4b2ac605e63 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_transfer.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_transfer.c
> @@ -356,6 +356,7 @@ etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
>      * transfers without a temporary resource.
>      */
>     if (trans->rsc || !(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
> +      struct etna_screen *screen = ctx->screen;
>        uint32_t prep_flags = 0;
>  
>        /*
> @@ -364,11 +365,13 @@ etna_transfer_map(struct pipe_context *pctx, struct pipe_resource *prsc,
>         * current GPU usage (reads must wait for GPU writes, writes must have
>         * exclusive access to the buffer).
>         */
> +      mtx_lock(&screen->lock);
>        if ((trans->rsc && (etna_resource(trans->rsc)->status & ETNA_PENDING_WRITE)) ||
>            (!trans->rsc &&
>             (((usage & PIPE_TRANSFER_READ) && (rsc->status & ETNA_PENDING_WRITE)) ||
>             ((usage & PIPE_TRANSFER_WRITE) && rsc->status))))
>           pctx->flush(pctx, NULL, 0);
> +      mtx_unlock(&screen->lock);
>  
>        if (usage & PIPE_TRANSFER_READ)
>           prep_flags |= DRM_ETNA_PREP_READ;
> 

On iMX6Q

Tested-by: Marek Vasut <marex@denx.de>

with multiple custom applications. I don't see any breakage.
On 2/22/19 10:30 AM, Boris Brezillon wrote:
> On Thu, 21 Feb 2019 23:29:53 +0100
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
>> Christian, Marek,
>>
>> On Wed, 30 Jan 2019 05:28:14 +0100
>> Marek Vasut <marex@denx.de> wrote:
>>
>>> From: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>
>>> A pipe_resource can be shared by all the pipe_context's hanging off the
>>> same pipe_screen.  
>>
>> We seem to be impacted by the problem you're fixing here, but, while
>> this patch definitely make things much better, the problem does not
>> seem to be entirely fixed (at least in our case).
>>
>> A bit more context: we have Qt App using QtWebEngine to render html
>> content. When we call QtWebEngine::initialize(), which as for effect
>> to allow shared GL contexts, we sometimes notice that part of the web
>> page is mis-rendered. That does not happen when we omit the
>> QtWebEngine::initialize() call.
>> As said above, this patch make those rendering issues less likely to
>> happen, but we still have the problem from time to time. So I thought
>> I'd share my guesses about what could cause these issues before
>> debugging it further.
>>
>> First thing I noticed: I couldn't reproduce the problem with [1]
>> applied (+ a tiny change forcing CPU-based tiling no matter the size of
>> the region to tile/untile). So, my guess is that it's related to how we
>> handle GPU-based tiling/untiling.
>> Also noticed that we're testing the rsc->status here [2] without the
>> screen->lock held, and there might be a race with another thread calling
>> resource_used(). We might also lack a resource_read(ctx, &src->base)
>> here [3]. But even after fixing those problems, the rendering issues
>> are not gone.
> 
> I tested again with the following diff applied on top of your patch, and
> the remaining rendering issues we had seem to be gone (don't know what I
> messed up in my previous tests :-/).
> 
> --->8---
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> index fc4f65dbeee1..b8c8b96a6f72 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> @@ -729,6 +729,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
>  
>     etna_submit_rs_state(ctx, &copy_to_screen);
>     resource_written(ctx, &dst->base);
> +   resource_read(ctx, &src->base);
>     dst->seqno++;
>     dst->levels[blit_info->dst.level].ts_valid = false;
>     ctx->dirty |= ETNA_DIRTY_DERIVE_TS;
While the V6 of this patch [1] is now in mesa upstream, this hunk is
missing from the V6 . Any special reason for that or is this just an
omission ?

[1] https://patchwork.freedesktop.org/patch/287603/
Am Sa., 16. März 2019 um 20:55 Uhr schrieb Marek Vasut <marex@denx.de>:
>
> On 2/22/19 10:30 AM, Boris Brezillon wrote:
> > On Thu, 21 Feb 2019 23:29:53 +0100
> > Boris Brezillon <boris.brezillon@collabora.com> wrote:
> >
> >> Christian, Marek,
> >>
> >> On Wed, 30 Jan 2019 05:28:14 +0100
> >> Marek Vasut <marex@denx.de> wrote:
> >>
> >>> From: Christian Gmeiner <christian.gmeiner@gmail.com>
> >>>
> >>> A pipe_resource can be shared by all the pipe_context's hanging off the
> >>> same pipe_screen.
> >>
> >> We seem to be impacted by the problem you're fixing here, but, while
> >> this patch definitely make things much better, the problem does not
> >> seem to be entirely fixed (at least in our case).
> >>
> >> A bit more context: we have Qt App using QtWebEngine to render html
> >> content. When we call QtWebEngine::initialize(), which as for effect
> >> to allow shared GL contexts, we sometimes notice that part of the web
> >> page is mis-rendered. That does not happen when we omit the
> >> QtWebEngine::initialize() call.
> >> As said above, this patch make those rendering issues less likely to
> >> happen, but we still have the problem from time to time. So I thought
> >> I'd share my guesses about what could cause these issues before
> >> debugging it further.
> >>
> >> First thing I noticed: I couldn't reproduce the problem with [1]
> >> applied (+ a tiny change forcing CPU-based tiling no matter the size of
> >> the region to tile/untile). So, my guess is that it's related to how we
> >> handle GPU-based tiling/untiling.
> >> Also noticed that we're testing the rsc->status here [2] without the
> >> screen->lock held, and there might be a race with another thread calling
> >> resource_used(). We might also lack a resource_read(ctx, &src->base)
> >> here [3]. But even after fixing those problems, the rendering issues
> >> are not gone.
> >
> > I tested again with the following diff applied on top of your patch, and
> > the remaining rendering issues we had seem to be gone (don't know what I
> > messed up in my previous tests :-/).
> >
> > --->8---
> > diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> > index fc4f65dbeee1..b8c8b96a6f72 100644
> > --- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
> > +++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
> > @@ -729,6 +729,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
> >
> >     etna_submit_rs_state(ctx, &copy_to_screen);
> >     resource_written(ctx, &dst->base);
> > +   resource_read(ctx, &src->base);
> >     dst->seqno++;
> >     dst->levels[blit_info->dst.level].ts_valid = false;
> >     ctx->dirty |= ETNA_DIRTY_DERIVE_TS;
> While the V6 of this patch [1] is now in mesa upstream, this hunk is
> missing from the V6 . Any special reason for that or is this just an
> omission ?
>

This and an other change was done in separates commits - see:

https://cgit.freedesktop.org/mesa/mesa/commit/?id=7244e768040859126a5b74023f587beaa8bef81c
https://cgit.freedesktop.org/mesa/mesa/commit/?id=c56e73449698c31fe72b99a95b7e4ecdb2985b73

So everything went in.
On 3/17/19 10:48 AM, Christian Gmeiner wrote:
> Am Sa., 16. März 2019 um 20:55 Uhr schrieb Marek Vasut <marex@denx.de>:
>>
>> On 2/22/19 10:30 AM, Boris Brezillon wrote:
>>> On Thu, 21 Feb 2019 23:29:53 +0100
>>> Boris Brezillon <boris.brezillon@collabora.com> wrote:
>>>
>>>> Christian, Marek,
>>>>
>>>> On Wed, 30 Jan 2019 05:28:14 +0100
>>>> Marek Vasut <marex@denx.de> wrote:
>>>>
>>>>> From: Christian Gmeiner <christian.gmeiner@gmail.com>
>>>>>
>>>>> A pipe_resource can be shared by all the pipe_context's hanging off the
>>>>> same pipe_screen.
>>>>
>>>> We seem to be impacted by the problem you're fixing here, but, while
>>>> this patch definitely make things much better, the problem does not
>>>> seem to be entirely fixed (at least in our case).
>>>>
>>>> A bit more context: we have Qt App using QtWebEngine to render html
>>>> content. When we call QtWebEngine::initialize(), which as for effect
>>>> to allow shared GL contexts, we sometimes notice that part of the web
>>>> page is mis-rendered. That does not happen when we omit the
>>>> QtWebEngine::initialize() call.
>>>> As said above, this patch make those rendering issues less likely to
>>>> happen, but we still have the problem from time to time. So I thought
>>>> I'd share my guesses about what could cause these issues before
>>>> debugging it further.
>>>>
>>>> First thing I noticed: I couldn't reproduce the problem with [1]
>>>> applied (+ a tiny change forcing CPU-based tiling no matter the size of
>>>> the region to tile/untile). So, my guess is that it's related to how we
>>>> handle GPU-based tiling/untiling.
>>>> Also noticed that we're testing the rsc->status here [2] without the
>>>> screen->lock held, and there might be a race with another thread calling
>>>> resource_used(). We might also lack a resource_read(ctx, &src->base)
>>>> here [3]. But even after fixing those problems, the rendering issues
>>>> are not gone.
>>>
>>> I tested again with the following diff applied on top of your patch, and
>>> the remaining rendering issues we had seem to be gone (don't know what I
>>> messed up in my previous tests :-/).
>>>
>>> --->8---
>>> diff --git a/src/gallium/drivers/etnaviv/etnaviv_rs.c b/src/gallium/drivers/etnaviv/etnaviv_rs.c
>>> index fc4f65dbeee1..b8c8b96a6f72 100644
>>> --- a/src/gallium/drivers/etnaviv/etnaviv_rs.c
>>> +++ b/src/gallium/drivers/etnaviv/etnaviv_rs.c
>>> @@ -729,6 +729,7 @@ etna_try_rs_blit(struct pipe_context *pctx,
>>>
>>>     etna_submit_rs_state(ctx, &copy_to_screen);
>>>     resource_written(ctx, &dst->base);
>>> +   resource_read(ctx, &src->base);
>>>     dst->seqno++;
>>>     dst->levels[blit_info->dst.level].ts_valid = false;
>>>     ctx->dirty |= ETNA_DIRTY_DERIVE_TS;
>> While the V6 of this patch [1] is now in mesa upstream, this hunk is
>> missing from the V6 . Any special reason for that or is this just an
>> omission ?
>>
> 
> This and an other change was done in separates commits - see:
> 
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=7244e768040859126a5b74023f587beaa8bef81c
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=c56e73449698c31fe72b99a95b7e4ecdb2985b73
> 
> So everything went in.

Ah, thanks :)

Do you plan to backport those to mesa 18.3.5 too ?