[1/5] drm: don't block fb changes for async plane updates

Submitted by Helen Koike on March 4, 2019, 2:49 p.m.

Details

Message ID 20190304144909.6267-2-helen.koike@collabora.com
State New
Series "drm: Fix fb changes for async updates" ( rev: 1 ) in DRI devel

Commit Message

Helen Koike March 4, 2019, 2:49 p.m.
In the case of a normal sync update, the preparation of framebuffers (be
it calling drm_atomic_helper_prepare_planes() or doing setups with
drm_framebuffer_get()) is performed in the new_state and the respective
cleanups are performed in the old_state.

In the case of async updates, the preparation is also done in the
new_state, but the cleanups are done in the new_state as well (because
updates are performed in place, i.e. in the current state).

The current code blocks async updates when the fb is changed, turning
async updates into sync updates, slowing down cursor updates and
introducing regressions in igt tests with errors of type:

"CRITICAL: completed 97 cursor updated in a period of 30 flips, we
expect to complete approximately 15360 updates, with the threshold set
at 7680"

Fb changes in async updates were prevented to avoid the following scenario:

- Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
- Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
- Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 (wrong)

Here fb2 is prepared once but cleaned up twice.

To solve this problem, instead of blocking async fb changes, we
place the old framebuffer in the new_state object, so when the code
performs cleanups in the new_state it cleans up the old_fb, and we
have the following scenario instead:

- Async update, oldfb = NULL, newfb = fb1, prepare fb1, no cleanup
- Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb1
- Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2

Here the calls to prepare/cleanup are balanced.
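
To make the bookkeeping concrete, the two scenarios above can be replayed
in a small user-space model. This is only a sketch: struct fb,
struct plane_state, prepare_fb(), cleanup_fb(), async_update(),
sync_commit() and run_scenario() are hypothetical stand-ins invented for
illustration, not DRM structures or helpers, and locking, reference
counting and hardware programming are left out. Each framebuffer carries
a counter of prepare calls minus cleanup calls, so a negative value
indicates an extra cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a framebuffer: tracks prepare/cleanup balance. */
struct fb {
	int balance;	/* prepare calls minus cleanup calls */
};

/* Hypothetical stand-in for a plane state: only the fb pointer matters. */
struct plane_state {
	struct fb *fb;
};

/* Final counters for the two framebuffers used in the scenario. */
struct balance {
	int fb1;
	int fb2;
};

static void prepare_fb(struct fb *fb)
{
	if (fb)
		fb->balance++;
}

static void cleanup_fb(struct fb *fb)
{
	if (fb)
		fb->balance--;
}

/* Async path: prepare and cleanup both operate on new_state. */
static void async_update(struct plane_state *cur, struct fb *newfb,
			 bool swap_fbs)
{
	struct plane_state new_state = { .fb = newfb };
	struct fb *oldfb = cur->fb;

	prepare_fb(new_state.fb);
	cur->fb = new_state.fb;		/* in-place update of the current state */
	if (swap_fbs)
		new_state.fb = oldfb;	/* hand the old fb to new_state */
	cleanup_fb(new_state.fb);
}

/* Sync path: prepare on the new state, cleanup on the old state. */
static void sync_commit(struct plane_state *cur, struct fb *newfb)
{
	struct plane_state old_state = *cur;

	prepare_fb(newfb);
	cur->fb = newfb;
	cleanup_fb(old_state.fb);
}

/* Replays: async NULL -> fb1, async fb1 -> fb2, sync fb2 -> fb1. */
struct balance run_scenario(bool swap_fbs)
{
	struct fb fb1 = { 0 }, fb2 = { 0 };
	struct plane_state cur = { .fb = NULL };

	async_update(&cur, &fb1, swap_fbs);
	async_update(&cur, &fb2, swap_fbs);
	sync_commit(&cur, &fb1);

	return (struct balance){ .fb1 = fb1.balance, .fb2 = fb2.balance };
}
```

In this model, run_scenario(false) leaves fb2 at -1 (one prepare, two
cleanups), while run_scenario(true) leaves fb2 at 0 and fb1, the
framebuffer still on the plane, at 1.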

Cc: <stable@vger.kernel.org> # v4.14+: 25dc194b34dd: drm: Block fb changes for async plane updates
Fixes: 25dc194b34dd ("drm: Block fb changes for async plane updates")
Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>

---
Hello,

As mentioned in the cover letter,
I tested on rockchip and on i915 (with a patch I am still working on for
replacing cursors by async update), with igt plane_cursor_legacy and
kms_cursor_legacy, and I didn't see any regressions.
I couldn't test on MSM and AMD because I don't have the hardware (and I am
having some issues testing on vc4), so I would appreciate it if anyone could
help me test those.

I also think it would be a better solution if, instead of doing in-place
updates in the current state, the async path were equivalent to a
synchronous update, i.e., modifying new_state and performing a flip.
IMHO, the only difference between sync and async should be that an async
update doesn't wait for vblank and applies the changes immediately to the hw,
but the code path could be almost the same.
But for now I think this solution is ok (swapping new_fb/old_fb), and
then we can adjust things little by little. What do you think?

Thanks!
Helen

 drivers/gpu/drm/drm_atomic_helper.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

Patch

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 540a77a2ade9..e7eb96f1efc2 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1608,15 +1608,6 @@  int drm_atomic_helper_async_check(struct drm_device *dev,
 	    old_plane_state->crtc != new_plane_state->crtc)
 		return -EINVAL;
 
-	/*
-	 * FIXME: Since prepare_fb and cleanup_fb are always called on
-	 * the new_plane_state for async updates we need to block framebuffer
-	 * changes. This prevents use of a fb that's been cleaned up and
-	 * double cleanups from occuring.
-	 */
-	if (old_plane_state->fb != new_plane_state->fb)
-		return -EINVAL;
-
 	funcs = plane->helper_private;
 	if (!funcs->atomic_async_update)
 		return -EINVAL;
@@ -1657,6 +1648,9 @@  void drm_atomic_helper_async_commit(struct drm_device *dev,
 	int i;
 
 	for_each_new_plane_in_state(state, plane, plane_state, i) {
+		struct drm_framebuffer *new_fb = plane_state->fb;
+		struct drm_framebuffer *old_fb = plane->state->fb;
+
 		funcs = plane->helper_private;
 		funcs->atomic_async_update(plane, plane_state);
 
@@ -1665,11 +1659,17 @@  void drm_atomic_helper_async_commit(struct drm_device *dev,
 		 * plane->state in-place, make sure at least common
 		 * properties have been properly updated.
 		 */
-		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
+		WARN_ON_ONCE(plane->state->fb != new_fb);
 		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
 		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
 		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
 		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
+
+		/*
+		 * Make sure the FBs have been swapped so that cleanups in the
+		 * new_state performs a cleanup in the old FB.
+		 */
+		WARN_ON_ONCE(plane_state->fb != old_fb);
 	}
 }
 EXPORT_SYMBOL(drm_atomic_helper_async_commit);

Comments

On 3/4/19 9:49 AM, Helen Koike wrote:
> @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
>   		 * plane->state in-place, make sure at least common
>   		 * properties have been properly updated.
>   		 */
> -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> +		WARN_ON_ONCE(plane->state->fb != new_fb);
>   		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
>   		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
>   		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
>   		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> +
> +		/*
> +		 * Make sure the FBs have been swapped so that cleanups in the
> +		 * new_state performs a cleanup in the old FB.
> +		 */
> +		WARN_ON_ONCE(plane_state->fb != old_fb);

I personally think this approach is fine and the WARN_ONs are good for
catching drivers that want to use these in the future.

I do think it would be good to add something to the function docs that 
explains this requirement and the issue that it addresses. It's a little 
unintuitive to require that the old fb is placed into the new state, but 
it makes sense as a workaround to this problem.

Nicholas Kazlauskas

>   	}
>   }
>   EXPORT_SYMBOL(drm_atomic_helper_async_commit);
>
On Mon,  4 Mar 2019 11:49:05 -0300
Helen Koike <helen.koike@collabora.com> wrote:

> @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
>  		 * plane->state in-place, make sure at least common
>  		 * properties have been properly updated.
>  		 */
> -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> +		WARN_ON_ONCE(plane->state->fb != new_fb);
>  		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
>  		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
>  		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
>  		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> +
> +		/*
> +		 * Make sure the FBs have been swapped so that cleanups in the
> +		 * new_state performs a cleanup in the old FB.
> +		 */
> +		WARN_ON_ONCE(plane_state->fb != old_fb);

Looks like this patch should go last in the series if you want to keep
things bisectable, otherwise you'll have a WARN_ON() backtrace in the
drivers you're fixing in the following patches.

>  	}
>  }
>  EXPORT_SYMBOL(drm_atomic_helper_async_commit);
Hello Nicholas,

On Mon, 4 Mar 2019 15:46:49 +0000
"Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:

> On 3/4/19 9:49 AM, Helen Koike wrote:
> > @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> >   		 * plane->state in-place, make sure at least common
> >   		 * properties have been properly updated.
> >   		 */
> > -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> > +		WARN_ON_ONCE(plane->state->fb != new_fb);
> >   		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
> >   		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
> >   		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
> >   		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> > +
> > +		/*
> > +		 * Make sure the FBs have been swapped so that cleanups in the
> > +		 * new_state performs a cleanup in the old FB.
> > +		 */
> > +		WARN_ON_ONCE(plane_state->fb != old_fb);  
> 
> I personally think this approach is fine and the WARN_ON s are good for 
> catching drivers that want to use these in the future.

Well, I agree this change is the way to go for a short-term solution
to relax the old_fb == new_fb constraint, but I keep thinking this whole
"update plane_state in place" approach is a recipe for trouble and just
makes things more complicated for drivers for no obvious reason. Look at
the VC4 implementation [1] if you need proof that things can get messy
pretty quickly.

All these state-field-copying steps could be skipped if the core were
simply swapping the old/new states, as is done in the sync update path.

[1] https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/gpu/drm/vc4/vc4_plane.c#L878

> 
> I do think it would be good to add something to the function docs that 
> explains this requirement and the issue that it addresses. It's a little 
> unintuitive to require that the old fb is placed into the new state, but 
> it makes sense as a workaround to this problem.
> 
> Nicholas Kazlauskas
> 
> >   	}
> >   }
> >   EXPORT_SYMBOL(drm_atomic_helper_async_commit);
> >   
>
On 3/11/19 6:06 AM, Boris Brezillon wrote:
> Well, I agree this change is the way to go for a short-term solution
> to relax the old_fb == new_fb constraint, but I keep thinking this whole
> "update plane_state in place" is a recipe for trouble and just make
> things more complicated for drivers for no obvious reasons. Look at the
> VC4 implem [1] if you need a proof that things can get messy pretty
> quickly.
>
> All this state-fields-copying steps could be skipped if the core was
> simply swapping the old/new states as is done in the sync update path.
>
> [1]https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/gpu/drm/vc4/vc4_plane.c#L878

I completely agree with this view FWIW. I had a discussion with Daniel
about this when I posted the original "block FB changes" patch.

- The plane object needs to be locked in order for async state to be updated
- Blocking commit work holds the lock for the plane, so the async update
  won't happen
- Non-blocking commit work that's still ongoing won't have hw_done
  signaled, and drm_atomic_helper_async_check will block the async update

So this looks safe in theory, with the exception of the call to
drm_atomic_helper_cleanup_planes occurring after hw_done is signaled.

I believe that the behavior of this function still remains the same even 
if plane->state is swapped to something else during the call (since 
old_plane_state should never be equal to plane->state if the commit 
succeeded and the plane is in the commit), but I'm not sure that's 
something we'd want to rely on.

I think other than that issue, you could probably just:

drm_atomic_helper_prepare_planes(...);
drm_atomic_helper_swap_state(...);
drm_atomic_state_get(state);
drm_atomic_helper_async_commit(...);
drm_atomic_helper_cleanup_planes(dev, state);

and it would work as expected. But there still may be other things I'm 
missing or haven't considered here.
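
As a rough illustration of why a swap-based path keeps prepare and
cleanup balanced, the ordering above can be modeled in user space.
Everything in this sketch is a hypothetical stand-in (struct fb,
struct plane_state, struct plane, struct commit and the functions
operating on them are invented here and are not the DRM helpers of
similar names). It mirrors only the prepare, swap, cleanup ordering and
leaves out locking, state reference counting and hw_done signaling,
which are the open questions raised above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical framebuffer: counts prepare calls minus cleanup calls. */
struct fb {
	int balance;
};

struct plane_state {
	struct fb *fb;
};

struct plane {
	struct plane_state *state;	/* currently active state */
};

/*
 * Hypothetical commit object: holds the new state before the swap and
 * the old state after it, like the state object in the sync path.
 */
struct commit {
	struct plane *plane;
	struct plane_state *state;
};

/* Final counters for the two framebuffers used in the replay. */
struct balance {
	int fb1;
	int fb2;
};

static void prepare_planes(struct commit *c)
{
	if (c->state->fb)
		c->state->fb->balance++;
}

static void swap_state(struct commit *c)
{
	struct plane_state *old = c->plane->state;

	c->plane->state = c->state;
	c->state = old;
}

static void cleanup_planes(struct commit *c)
{
	if (c->state->fb)
		c->state->fb->balance--;
}

/* One update following the prepare, swap, commit, cleanup ordering. */
static void update_via_swap(struct plane *plane, struct plane_state *new_state)
{
	struct commit c = { .plane = plane, .state = new_state };

	prepare_planes(&c);
	swap_state(&c);
	/* hardware would be programmed from plane->state at this point */
	cleanup_planes(&c);	/* c.state now refers to the old state */
}

/* Replays NULL -> fb1 -> fb2 -> fb1, every step through the swap path. */
struct balance replay_with_state_swap(void)
{
	struct fb fb1 = { 0 }, fb2 = { 0 };
	struct plane_state s0 = { .fb = NULL };
	struct plane_state s1 = { .fb = &fb1 };
	struct plane_state s2 = { .fb = &fb2 };
	struct plane_state s3 = { .fb = &fb1 };
	struct plane plane = { .state = &s0 };

	update_via_swap(&plane, &s1);
	update_via_swap(&plane, &s2);
	update_via_swap(&plane, &s3);

	return (struct balance){ .fb1 = fb1.balance, .fb2 = fb2.balance };
}
```

In this model every cleanup lands on the state that was just replaced,
so fb2 ends at 0 and fb1, which is still on the plane, ends at 1,
without any driver-side copying or swapping of individual fields.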

Nicholas Kazlauskas

>
>>
>> I do think it would be good to add something to the function docs that
>> explains this requirement and the issue that it addresses. It's a little
>> unintuitive to require that the old fb is placed into the new state, but
>> it makes sense as a workaround to this problem.
>>
>> Nicholas Kazlauskas
>>
>>>    	}
>>>    }
>>>    EXPORT_SYMBOL(drm_atomic_helper_async_commit);
>>>
>>
>
On Mon, 11 Mar 2019 13:15:23 +0000
"Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:

> On 3/11/19 6:06 AM, Boris Brezillon wrote:
> > Hello Nicholas,
> > 
> > On Mon, 4 Mar 2019 15:46:49 +0000
> > "Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:
> >   
> >> On 3/4/19 9:49 AM, Helen Koike wrote:  
> >>> In the case of a normal sync update, the preparation of framebuffers (be
> >>> it calling drm_atomic_helper_prepare_planes() or doing setups with
> >>> drm_framebuffer_get()) are performed in the new_state and the respective
> >>> cleanups are performed in the old_state.
> >>>
> >>> In the case of async updates, the preparation is also done in the
> >>> new_state but the cleanups are done in the new_state (because updates
> >>> are performed in place, i.e. in the current state).
> >>>
> >>> The current code blocks async updates when the fb is changed, turning
> >>> async updates into sync updates, slowing down cursor updates and
> >>> introducing regressions in igt tests with errors of type:
> >>>
> >>> "CRITICAL: completed 97 cursor updated in a period of 30 flips, we
> >>> expect to complete approximately 15360 updates, with the threshold set
> >>> at 7680"
> >>>
> >>> Fb changes in async updates were prevented to avoid the following scenario:
> >>>
> >>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
> >>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
> >>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 (wrong)
> >>> Where we have a single call to prepare fb2 but double cleanup call to fb2.
> >>>
> >>> To solve the above problems, instead of blocking async fb changes, we
> >>> place the old framebuffer in the new_state object, so when the code
> >>> performs cleanups in the new_state it will cleanup the old_fb and we
> >>> will have the following scenario instead:
> >>>
> >>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, no cleanup
> >>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb1
> >>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
> >>>
> >>> Where calls to prepare/cleanup are balanced.
> >>>
> >>> Cc: <stable@vger.kernel.org> # v4.14+: 25dc194b34dd: drm: Block fb changes for async plane updates
> >>> Fixes: 25dc194b34dd ("drm: Block fb changes for async plane updates")
> >>> Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
> >>> Signed-off-by: Helen Koike <helen.koike@collabora.com>
> >>>
> >>> ---
> >>> Hello,
> >>>
> >>> As mentioned in the cover letter,
> >>> I tested on the rockchip and on i915 (with a patch I am still working on for
> >>> replacing cursors by async update), with igt plane_cursor_legacy and
> >>> kms_cursor_legacy and I didn't see any regressions.
> >>> I couldn't test on MSM and AMD because I don't have the hardware (and I am
> >>> having some issues testing on vc4) and I would appreciate if anyone could help
> >>> me testing those.
> >>>
> >>> I also think it would be a better solution if, instead of having async
> >>> do in-place updates in the current state, the async path were
> >>> equivalent to a synchronous update, i.e., modifying new_state and
> >>> performing a flip.
> >>> IMHO, the only difference between sync and async should be that async update
> >>> doesn't wait for vblank and applies the changes immediately to the hw,
> >>> but the code path could be almost the same.
> >>> But for now I think this solution is ok (swapping new_fb/old_fb), and
> >>> then we can adjust things little by little, what do you think?
> >>>
> >>> Thanks!
> >>> Helen
> >>>
> >>>    drivers/gpu/drm/drm_atomic_helper.c | 20 ++++++++++----------
> >>>    1 file changed, 10 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> >>> index 540a77a2ade9..e7eb96f1efc2 100644
> >>> --- a/drivers/gpu/drm/drm_atomic_helper.c
> >>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> >>> @@ -1608,15 +1608,6 @@ int drm_atomic_helper_async_check(struct drm_device *dev,
> >>>    	    old_plane_state->crtc != new_plane_state->crtc)
> >>>    		return -EINVAL;
> >>>    
> >>> -	/*
> >>> -	 * FIXME: Since prepare_fb and cleanup_fb are always called on
> >>> -	 * the new_plane_state for async updates we need to block framebuffer
> >>> -	 * changes. This prevents use of a fb that's been cleaned up and
> >>> -	 * double cleanups from occuring.
> >>> -	 */
> >>> -	if (old_plane_state->fb != new_plane_state->fb)
> >>> -		return -EINVAL;
> >>> -
> >>>    	funcs = plane->helper_private;
> >>>    	if (!funcs->atomic_async_update)
> >>>    		return -EINVAL;
> >>> @@ -1657,6 +1648,9 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> >>>    	int i;
> >>>    
> >>>    	for_each_new_plane_in_state(state, plane, plane_state, i) {
> >>> +		struct drm_framebuffer *new_fb = plane_state->fb;
> >>> +		struct drm_framebuffer *old_fb = plane->state->fb;
> >>> +
> >>>    		funcs = plane->helper_private;
> >>>    		funcs->atomic_async_update(plane, plane_state);
> >>>    
> >>> @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> >>>    		 * plane->state in-place, make sure at least common
> >>>    		 * properties have been properly updated.
> >>>    		 */
> >>> -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> >>> +		WARN_ON_ONCE(plane->state->fb != new_fb);
> >>>    		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
> >>>    		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
> >>>    		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
> >>>    		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> >>> +
> >>> +		/*
> >>> +		 * Make sure the FBs have been swapped so that cleanups in the
> >>> +		 * new_state performs a cleanup in the old FB.
> >>> +		 */
> >>> +		WARN_ON_ONCE(plane_state->fb != old_fb);  
> >>
> >> I personally think this approach is fine and the WARN_ON s are good for
> >> catching drivers that want to use these in the future.  
> > 
> > Well, I agree this change is the way to go for a short-term solution
> > to relax the old_fb == new_fb constraint, but I keep thinking this whole
> > "update plane_state in place" is a recipe for trouble and just make
> > things more complicated for drivers for no obvious reasons. Look at the
> > VC4 implem [1] if you need a proof that things can get messy pretty
> > quickly.
> > 
> > All this state-fields-copying steps could be skipped if the core was
> > simply swapping the old/new states as is done in the sync update path.
> > 
> > [1]https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/gpu/drm/vc4/vc4_plane.c#L878  
> 
> I completely agree with this view FWIW. I had a discussion with Daniel 
> about this when I had posted the original block FB changes patch.
> 
> - The plane object needs to be locked in order for async state to be updated
> - Blocking commit work holds the lock for the plane, async update won't 
> happen
> - Non-blocking commit work that's still ongoing won't have hw_done 
> signaled and drm_atomic_helper_async_check will block the async update
> 
> So this looks safe in theory, with the exception of the call to 
> drm_atomic_helper_cleanup_planes occurring after hw_done is signaled.

Isn't it also the case in the sync update path?

> 
> I believe that the behavior of this function still remains the same even 
> if plane->state is swapped to something else during the call (since 
> old_plane_state should never be equal to plane->state if the commit 
> succeeded and the plane is in the commit), but I'm not sure that's 
> something we'd want to rely on.
> 
> I think other than that issue, you could probably just:
> 
> drm_atomic_helper_prepare_planes(...);
> drm_atomic_helper_swap_state(...);
> drm_atomic_state_get(state);

Why do we need a state_get() here? AFAICT, it's done this way in the
sync update path because of the non-blocking semantics, where the state
might be released by the caller before it's been applied by the commit
worker.

> drm_atomic_helper_async_commit(...);
> drm_atomic_helper_cleanup_planes(dev, state);
> 
> and it would work as expected. But there still may be other things I'm 
> missing or haven't considered here.

Actually, when I said we could swap states, I was not necessarily
thinking about re-using drm_atomic_helper_swap_state(), but rather about
swapping states directly in drm_atomic_helper_async_commit():

	for_each_oldnew_plane_in_state(state, plane, old_plane_state,
				       new_plane_state, i) {
		WARN_ON(plane->state != old_plane_state);
		old_plane_state->state = state;
		new_plane_state->state = NULL;
		state->planes[i].state = old_plane_state;
		plane->state = new_plane_state;

		funcs = plane->helper_private;
		funcs->atomic_async_update(plane, new_plane_state);
	}

This way we would avoid the WARN_ON() lines we have in
drm_atomic_helper_async_commit() to check that things have been
properly updated in-place, and we would also get rid of the driver
code copying the plane_state properties that can change during an async
update.

But, as you said, I might be missing other potential issues.
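
For illustration only, the pointer-swap idea can be modelled in plain
user-space C. Everything below (the toy_* types and helpers) is
hypothetical and only mimics the prepare/cleanup bookkeeping discussed in
this thread; it is not the DRM helper code. The sketch assumes one prepare
per new state and one cleanup per old state, and replays the
NULL -> fb1 -> fb2 -> fb1 sequence from the commit message:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the DRM objects; not the real kernel structures. */
struct toy_fb {
	int prepared;	/* outstanding prepare calls not yet cleaned up */
};

struct toy_plane_state {
	struct toy_fb *fb;
};

struct toy_plane {
	struct toy_plane_state *state;	/* currently active state */
};

static void toy_prepare(struct toy_plane_state *s)
{
	if (s->fb)
		s->fb->prepared++;
}

static void toy_cleanup(struct toy_plane_state *s)
{
	if (s->fb) {
		/* A cleanup without a matching prepare is the bug to avoid. */
		assert(s->fb->prepared > 0);
		s->fb->prepared--;
	}
}

/*
 * Async commit that swaps the state pointers instead of copying fields
 * in place: prepare runs on the new state, cleanup on the old one.
 * Returns the old state, which the caller now owns.
 */
static struct toy_plane_state *
toy_async_commit_swap(struct toy_plane *plane,
		      struct toy_plane_state *new_state)
{
	struct toy_plane_state *old_state = plane->state;

	toy_prepare(new_state);
	plane->state = new_state;
	toy_cleanup(old_state);
	return old_state;
}

/* Replays NULL -> fb1 -> fb2 -> fb1; returns 1 if the counts stay balanced. */
static int toy_scenario_is_balanced(void)
{
	struct toy_fb fb1 = { 0 }, fb2 = { 0 };
	struct toy_plane_state s0 = { NULL }, s1 = { &fb1 };
	struct toy_plane_state s2 = { &fb2 }, s3 = { &fb1 };
	struct toy_plane plane = { &s0 };

	toy_async_commit_swap(&plane, &s1);
	if (fb1.prepared != 1 || fb2.prepared != 0)
		return 0;
	toy_async_commit_swap(&plane, &s2);
	if (fb1.prepared != 0 || fb2.prepared != 1)
		return 0;
	toy_async_commit_swap(&plane, &s3);
	return fb1.prepared == 1 && fb2.prepared == 0;
}
```

In this model no driver-side field copying is needed, because the framebuffer
that gets cleaned up is always the one attached to the state being retired.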
On Mon, Mar 11, 2019 at 03:20:09PM +0100, Boris Brezillon wrote:
> On Mon, 11 Mar 2019 13:15:23 +0000
> "Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:
> 
> > On 3/11/19 6:06 AM, Boris Brezillon wrote:
> > > Hello Nicholas,
> > > 
> > > On Mon, 4 Mar 2019 15:46:49 +0000
> > > "Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:
> > >   
> > >> On 3/4/19 9:49 AM, Helen Koike wrote:  
> > >>> In the case of a normal sync update, the preparation of framebuffers (be
> > >>> it calling drm_atomic_helper_prepare_planes() or doing setups with
> > >>> drm_framebuffer_get()) are performed in the new_state and the respective
> > >>> cleanups are performed in the old_state.
> > >>>
> > >>> In the case of async updates, the preparation is also done in the
> > >>> new_state but the cleanups are done in the new_state (because updates
> > >>> are performed in place, i.e. in the current state).
> > >>>
> > >>> The current code blocks async updates when the fb is changed, turning
> > >>> async updates into sync updates, slowing down cursor updates and
> > >>> introducing regressions in igt tests with errors of type:
> > >>>
> > >>> "CRITICAL: completed 97 cursor updated in a period of 30 flips, we
> > >>> expect to complete approximately 15360 updates, with the threshold set
> > >>> at 7680"
> > >>>
> > >>> Fb changes in async updates were prevented to avoid the following scenario:
> > >>>
> > >>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
> > >>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
> > >>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 (wrong)
> > >>> Where we have a single call to prepare fb2 but double cleanup call to fb2.
> > >>>
> > >>> To solve the above problems, instead of blocking async fb changes, we
> > >>> place the old framebuffer in the new_state object, so when the code
> > >>> performs cleanups in the new_state it will cleanup the old_fb and we
> > >>> will have the following scenario instead:
> > >>>
> > >>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, no cleanup
> > >>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb1
> > >>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
> > >>>
> > >>> Where calls to prepare/cleanup are balanced.
> > >>>
> > >>> Cc: <stable@vger.kernel.org> # v4.14+: 25dc194b34dd: drm: Block fb changes for async plane updates
> > >>> Fixes: 25dc194b34dd ("drm: Block fb changes for async plane updates")
> > >>> Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
> > >>> Signed-off-by: Helen Koike <helen.koike@collabora.com>
> > >>>
> > >>> ---
> > >>> Hello,
> > >>>
> > >>> As mentioned in the cover letter,
> > >>> I tested on the rockchip and on i915 (with a patch I am still working on for
> > >>> replacing cursors by async update), with igt plane_cursor_legacy and
> > >>> kms_cursor_legacy and I didn't see any regressions.
> > >>> I couldn't test on MSM and AMD because I don't have the hardware (and I am
> > >>> having some issues testing on vc4) and I would appreciate if anyone could help
> > >>> me testing those.
> > >>>
> > >>> I also think it would be a better solution if, instead of having async
> > >>> do in-place updates in the current state, the async path were
> > >>> equivalent to a synchronous update, i.e., modifying new_state and
> > >>> performing a flip.
> > >>> IMHO, the only difference between sync and async should be that async update
> > >>> doesn't wait for vblank and applies the changes immediately to the hw,
> > >>> but the code path could be almost the same.
> > >>> But for now I think this solution is ok (swapping new_fb/old_fb), and
> > >>> then we can adjust things little by little, what do you think?
> > >>>
> > >>> Thanks!
> > >>> Helen
> > >>>
> > >>>    drivers/gpu/drm/drm_atomic_helper.c | 20 ++++++++++----------
> > >>>    1 file changed, 10 insertions(+), 10 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > >>> index 540a77a2ade9..e7eb96f1efc2 100644
> > >>> --- a/drivers/gpu/drm/drm_atomic_helper.c
> > >>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > >>> @@ -1608,15 +1608,6 @@ int drm_atomic_helper_async_check(struct drm_device *dev,
> > >>>    	    old_plane_state->crtc != new_plane_state->crtc)
> > >>>    		return -EINVAL;
> > >>>    
> > >>> -	/*
> > >>> -	 * FIXME: Since prepare_fb and cleanup_fb are always called on
> > >>> -	 * the new_plane_state for async updates we need to block framebuffer
> > >>> -	 * changes. This prevents use of a fb that's been cleaned up and
> > >>> -	 * double cleanups from occuring.
> > >>> -	 */
> > >>> -	if (old_plane_state->fb != new_plane_state->fb)
> > >>> -		return -EINVAL;
> > >>> -
> > >>>    	funcs = plane->helper_private;
> > >>>    	if (!funcs->atomic_async_update)
> > >>>    		return -EINVAL;
> > >>> @@ -1657,6 +1648,9 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> > >>>    	int i;
> > >>>    
> > >>>    	for_each_new_plane_in_state(state, plane, plane_state, i) {
> > >>> +		struct drm_framebuffer *new_fb = plane_state->fb;
> > >>> +		struct drm_framebuffer *old_fb = plane->state->fb;
> > >>> +
> > >>>    		funcs = plane->helper_private;
> > >>>    		funcs->atomic_async_update(plane, plane_state);
> > >>>    
> > >>> @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> > >>>    		 * plane->state in-place, make sure at least common
> > >>>    		 * properties have been properly updated.
> > >>>    		 */
> > >>> -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> > >>> +		WARN_ON_ONCE(plane->state->fb != new_fb);
> > >>>    		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
> > >>>    		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
> > >>>    		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
> > >>>    		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> > >>> +
> > >>> +		/*
> > >>> +		 * Make sure the FBs have been swapped so that cleanups in the
> > >>> +		 * new_state performs a cleanup in the old FB.
> > >>> +		 */
> > >>> +		WARN_ON_ONCE(plane_state->fb != old_fb);  
> > >>
> > >> I personally think this approach is fine and the WARN_ON s are good for
> > >> catching drivers that want to use these in the future.  
> > > 
> > > Well, I agree this change is the way to go for a short-term solution
> > > to relax the old_fb == new_fb constraint, but I keep thinking this whole
> > > "update plane_state in place" is a recipe for trouble and just make
> > > things more complicated for drivers for no obvious reasons. Look at the
> > > VC4 implem [1] if you need a proof that things can get messy pretty
> > > quickly.
> > > 
> > > All this state-fields-copying steps could be skipped if the core was
> > > simply swapping the old/new states as is done in the sync update path.
> > > 
> > > [1]https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/gpu/drm/vc4/vc4_plane.c#L878  
> > 
> > I completely agree with this view FWIW. I had a discussion with Daniel 
> > about this when I had posted the original block FB changes patch.
> > 
> > - The plane object needs to be locked in order for async state to be updated
> > - Blocking commit work holds the lock for the plane, async update won't 
> > happen
> > - Non-blocking commit work that's still ongoing won't have hw_done 
> > signaled and drm_atomic_helper_async_check will block the async update
> > 
> > So this looks safe in theory, with the exception of the call to 
> > drm_atomic_helper_cleanup_planes occurring after hw_done is signaled.
> 
> Isn't it also the case in the sync update path?
> 
> > 
> > I believe that the behavior of this function still remains the same even 
> > if plane->state is swapped to something else during the call (since 
> > old_plane_state should never be equal to plane->state if the commit 
> > succeeded and the plane is in the commit), but I'm not sure that's 
> > something we'd want to rely on.
> > 
> > I think other than that issue, you could probably just:
> > 
> > drm_atomic_helper_prepare_planes(...);
> > drm_atomic_helper_swap_state(...);
> > drm_atomic_state_get(state);
> 
> Why do we need a state_get() here? AFAICT, it's done this way in the
> sync update path because of the non-blocking semantics, where the state
> might be released by the caller before it's been applied by the commit
> worker.
> 
> > drm_atomic_helper_async_commit(...);
> > drm_atomic_helper_cleanup_planes(dev, state);
> > 
> > and it would work as expected. But there still may be other things I'm 
> > missing or haven't considered here.
> 
> Actually, when I said we could swap states, I was not necessarily
> thinking about re-using drm_atomic_helper_swap_state(), but rather about
> swapping states directly in drm_atomic_helper_async_commit():
> 
> 	for_each_oldnew_plane_in_state(state, plane, old_plane_state,
> 				       new_plane_state, i) {
> 		WARN_ON(plane->state != old_plane_state);
> 		old_plane_state->state = state;
> 		new_plane_state->state = NULL;
> 		state->planes[i].state = old_plane_state;
> 		plane->state = new_plane_state;
> 
> 		funcs = plane->helper_private;
> 		funcs->atomic_async_update(plane, new_plane_state);
> 	}
> 
> This way we would avoid the WARN_ON() lines we have in
> drm_atomic_helper_async_commit() to check that things have been
> properly updated in-place, and we would also get rid of the driver
> code copying the plane_state properties that can change during an async
> update.
> 
> But, as you said, I might be missing other potential issues.

Ok I dug around again, and I think I reconstructed the problem again.

The issue is the lifetimes of state structs. The nonblocking commit worker
doesn't hold a reference onto the new states at all. The only reason those
new states cannot disappear is that the next atomic commit touching the
same states waits for crtc_commit.hw_done before it pushes its own update
through (and then goes and releases those state structures).

The old state has no such issue, since each commit takes ownership of the
old state and then releases it, which it can do any time after hw_done.

Now with the current async code that's no issue, because we do check for
hw_done. The trouble is that hw_done is a kernel-internal implementation
detail. The only thing userspace can observe is flip_done, and that's
what's used for -EBUSY for normal page-flips. For cursor this kinda
doesn't matter, because these two should be fairly close together (in most
cases hw_done even happens before flip_done, but that depends upon the
driver). So the occasional silent fallback to a synchronous commit doesn't
really matter.
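
A rough sketch of the hw_done gate described above, again as a user-space
toy model. The sketch_* names are made up for the example, and the real
check in drm_atomic_helper_async_check() may differ in detail; the only
point is that an async update is refused while an earlier commit touching
the same plane has not signaled hw_done:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-commit tracking (not struct drm_crtc_commit). */
struct sketch_commit {
	bool hw_done;	/* set once the hardware has been programmed */
};

struct sketch_plane_state {
	struct sketch_commit *commit;	/* commit that last touched the plane, or NULL */
};

/*
 * Refuse the async update while an earlier commit on the same plane has
 * not signaled hw_done yet; the caller is then expected to fall back to
 * a regular synchronous commit.
 */
static int sketch_async_check(const struct sketch_plane_state *old_state)
{
	if (old_state->commit && !old_state->commit->hw_done)
		return -EBUSY;
	return 0;
}

/* Self-check: a pending commit blocks, a finished or absent one does not. */
static void sketch_demo(void)
{
	struct sketch_commit pending = { .hw_done = false };
	struct sketch_commit finished = { .hw_done = true };
	struct sketch_plane_state busy = { .commit = &pending };
	struct sketch_plane_state idle = { .commit = &finished };
	struct sketch_plane_state fresh = { .commit = NULL };

	assert(sketch_async_check(&busy) == -EBUSY);
	assert(sketch_async_check(&idle) == 0);
	assert(sketch_async_check(&fresh) == 0);
}
```

Note that the gate keys on hw_done, which userspace cannot observe, so from
the outside a refused async update just looks like a slower commit.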

What we could do is just wait for hw_done for async commits, but that's
kinda not cool either since it blocks (again cursor is ill-defined enough
that it doesn't matter). And pushing async updates to a worker means we
need to greatly extend the crtc_commit tracking (at least to each plane
state). I think most of that exists now, since we had to add it anyway for
planes which can be reassigned between crtcs.

tldr; maybe we can do the full swapping now?

I agree it feels like the cleaner solution, but we'd definitely need a pile of
igt tests to make sure we can mix&match between async and sync commits and
nothing blows up. And sync commits need to use reassignment of planes to
different crtcs plus nonblocking commit (I think amd hw can do all that,
or at least I've seen prep patches).
-Daniel
On Mon, Mar 04, 2019 at 03:46:49PM +0000, Kazlauskas, Nicholas wrote:
> On 3/4/19 9:49 AM, Helen Koike wrote:
> > In the case of a normal sync update, the preparation of framebuffers (be
> > it calling drm_atomic_helper_prepare_planes() or doing setups with
> > drm_framebuffer_get()) are performed in the new_state and the respective
> > cleanups are performed in the old_state.
> > 
> > In the case of async updates, the preparation is also done in the
> > new_state but the cleanups are done in the new_state (because updates
> > are performed in place, i.e. in the current state).
> > 
> > The current code blocks async updates when the fb is changed, turning
> > async updates into sync updates, slowing down cursor updates and
> > introducing regressions in igt tests with errors of type:
> > 
> > "CRITICAL: completed 97 cursor updated in a period of 30 flips, we
> > expect to complete approximately 15360 updates, with the threshold set
> > at 7680"
> > 
> > Fb changes in async updates were prevented to avoid the following scenario:
> > 
> > - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
> > - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
> > - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 (wrong)
> > Where we have a single call to prepare fb2 but double cleanup call to fb2.
> > 
> > To solve the above problems, instead of blocking async fb changes, we
> > place the old framebuffer in the new_state object, so when the code
> > performs cleanups in the new_state it will cleanup the old_fb and we
> > will have the following scenario instead:
> > 
> > - Async update, oldfb = NULL, newfb = fb1, prepare fb1, no cleanup
> > - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb1
> > - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
> > 
> > Where calls to prepare/cleanup are balanced.
> > 
> > Cc: <stable@vger.kernel.org> # v4.14+: 25dc194b34dd: drm: Block fb changes for async plane updates
> > Fixes: 25dc194b34dd ("drm: Block fb changes for async plane updates")
> > Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Signed-off-by: Helen Koike <helen.koike@collabora.com>
> > 
> > ---
> > Hello,
> > 
> > As mentioned in the cover letter,
> > I tested on the rockchip and on i915 (with a patch I am still working on for
> > replacing cursors by async update), with igt plane_cursor_legacy and
> > kms_cursor_legacy and I didn't see any regressions.
> > I couldn't test on MSM and AMD because I don't have the hardware (and I am
> > having some issues testing on vc4) and I would appreciate if anyone could help
> > me testing those.
> > 
> > I also think it would be a better solution if, instead of having async
> > do in-place updates in the current state, the async path were
> > equivalent to a synchronous update, i.e., modifying new_state and
> > performing a flip.
> > IMHO, the only difference between sync and async should be that async update
> > doesn't wait for vblank and applies the changes immediately to the hw,
> > but the code path could be almost the same.
> > But for now I think this solution is ok (swapping new_fb/old_fb), and
> > then we can adjust things little by little, what do you think?
> > 
> > Thanks!
> > Helen
> > 
> >   drivers/gpu/drm/drm_atomic_helper.c | 20 ++++++++++----------
> >   1 file changed, 10 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 540a77a2ade9..e7eb96f1efc2 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -1608,15 +1608,6 @@ int drm_atomic_helper_async_check(struct drm_device *dev,
> >   	    old_plane_state->crtc != new_plane_state->crtc)
> >   		return -EINVAL;
> >   
> > -	/*
> > -	 * FIXME: Since prepare_fb and cleanup_fb are always called on
> > -	 * the new_plane_state for async updates we need to block framebuffer
> > -	 * changes. This prevents use of a fb that's been cleaned up and
> > -	 * double cleanups from occuring.
> > -	 */
> > -	if (old_plane_state->fb != new_plane_state->fb)
> > -		return -EINVAL;
> > -
> >   	funcs = plane->helper_private;
> >   	if (!funcs->atomic_async_update)
> >   		return -EINVAL;
> > @@ -1657,6 +1648,9 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> >   	int i;
> >   
> >   	for_each_new_plane_in_state(state, plane, plane_state, i) {
> > +		struct drm_framebuffer *new_fb = plane_state->fb;
> > +		struct drm_framebuffer *old_fb = plane->state->fb;
> > +
> >   		funcs = plane->helper_private;
> >   		funcs->atomic_async_update(plane, plane_state);
> >   
> > @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> >   		 * plane->state in-place, make sure at least common
> >   		 * properties have been properly updated.
> >   		 */
> > -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> > +		WARN_ON_ONCE(plane->state->fb != new_fb);
> >   		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
> >   		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
> >   		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
> >   		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> > +
> > +		/*
> > +		 * Make sure the FBs have been swapped so that cleanups in the
> > +		 * new_state performs a cleanup in the old FB.
> > +		 */
> > +		WARN_ON_ONCE(plane_state->fb != old_fb);
> 
> I personally think this approach is fine and the WARN_ON s are good for 
> catching drivers that want to use these in the future.
> 
> I do think it would be good to add something to the function docs that 
> explains this requirement and the issue that it addresses. It's a little 
> unintuitive to require that the old fb is placed into the new state, but 
> it makes sense as a workaround to this problem.

Agreed.

And yeah this looks like a reasonable short-term fix.
-Daniel

> 
> Nicholas Kazlauskas
> 
> >   	}
> >   }
> >   EXPORT_SYMBOL(drm_atomic_helper_async_commit);
> > 
>
On Mon, Mar 11, 2019 at 08:51:27PM +0100, Daniel Vetter wrote:
> On Mon, Mar 11, 2019 at 03:20:09PM +0100, Boris Brezillon wrote:
> > On Mon, 11 Mar 2019 13:15:23 +0000
> > "Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:
> > 
> > > On 3/11/19 6:06 AM, Boris Brezillon wrote:
> > > > Hello Nicholas,
> > > > 
> > > > On Mon, 4 Mar 2019 15:46:49 +0000
> > > > "Kazlauskas, Nicholas" <Nicholas.Kazlauskas@amd.com> wrote:
> > > >   
> > > >> On 3/4/19 9:49 AM, Helen Koike wrote:  
> > > >>> In the case of a normal sync update, the preparation of framebuffers (be
> > > >>> it calling drm_atomic_helper_prepare_planes() or doing setups with
> > > >>> drm_framebuffer_get()) are performed in the new_state and the respective
> > > >>> cleanups are performed in the old_state.
> > > >>>
> > > >>> In the case of async updates, the preparation is also done in the
> > > >>> new_state but the cleanups are done in the new_state (because updates
> > > >>> are performed in place, i.e. in the current state).
> > > >>>
> > > >>> The current code blocks async updates when the fb is changed, turning
> > > >>> async updates into sync updates, slowing down cursor updates and
> > > >>> introducing regressions in igt tests with errors of type:
> > > >>>
> > > >>> "CRITICAL: completed 97 cursor updated in a period of 30 flips, we
> > > >>> expect to complete approximately 15360 updates, with the threshold set
> > > >>> at 7680"
> > > >>>
> > > >>> Fb changes in async updates were prevented to avoid the following scenario:
> > > >>>
> > > >>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
> > > >>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
> > > >>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 (wrong)
> > > >>> Where we have a single call to prepare fb2 but double cleanup call to fb2.
> > > >>>
> > > >>> To solve the above problems, instead of blocking async fb changes, we
> > > >>> place the old framebuffer in the new_state object, so when the code
> > > >>> performs cleanups in the new_state it will cleanup the old_fb and we
> > > >>> will have the following scenario instead:
> > > >>>
> > > >>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, no cleanup
> > > >>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb1
> > > >>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
> > > >>>
> > > >>> Where calls to prepare/cleanup are balanced.
> > > >>>
> > > >>> Cc: <stable@vger.kernel.org> # v4.14+: 25dc194b34dd: drm: Block fb changes for async plane updates
> > > >>> Fixes: 25dc194b34dd ("drm: Block fb changes for async plane updates")
> > > >>> Suggested-by: Boris Brezillon <boris.brezillon@collabora.com>
> > > >>> Signed-off-by: Helen Koike <helen.koike@collabora.com>
> > > >>>
> > > >>> ---
> > > >>> Hello,
> > > >>>
> > > >>> As mentioned in the cover letter,
> > > >>> I tested on the rockchip and on i915 (with a patch I am still working on for
> > > >>> replacing cursors by async update), with igt plane_cursor_legacy and
> > > >>> kms_cursor_legacy and I didn't see any regressions.
> > > >>> I couldn't test on MSM and AMD because I don't have the hardware (and I am
> > > >>> having some issues testing on vc4) and I would appreciate if anyone could help
> > > >>> me testing those.
> > > >>>
> > > >>> I also think it would be a better solution if, instead of having async
> > > >>> do in-place updates in the current state, the async path were
> > > >>> equivalent to a synchronous update, i.e., modifying new_state and
> > > >>> performing a flip.
> > > >>> IMHO, the only difference between sync and async should be that async update
> > > >>> doesn't wait for vblank and applies the changes immediately to the hw,
> > > >>> but the code path could be almost the same.
> > > >>> But for now I think this solution is ok (swapping new_fb/old_fb), and
> > > >>> then we can adjust things little by little, what do you think?
> > > >>>
> > > >>> Thanks!
> > > >>> Helen
> > > >>>
> > > >>>    drivers/gpu/drm/drm_atomic_helper.c | 20 ++++++++++----------
> > > >>>    1 file changed, 10 insertions(+), 10 deletions(-)
> > > >>>
> > > >>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > >>> index 540a77a2ade9..e7eb96f1efc2 100644
> > > >>> --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > >>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > >>> @@ -1608,15 +1608,6 @@ int drm_atomic_helper_async_check(struct drm_device *dev,
> > > >>>    	    old_plane_state->crtc != new_plane_state->crtc)
> > > >>>    		return -EINVAL;
> > > >>>    
> > > >>> -	/*
> > > >>> -	 * FIXME: Since prepare_fb and cleanup_fb are always called on
> > > >>> -	 * the new_plane_state for async updates we need to block framebuffer
> > > >>> -	 * changes. This prevents use of a fb that's been cleaned up and
> > > >>> -	 * double cleanups from occuring.
> > > >>> -	 */
> > > >>> -	if (old_plane_state->fb != new_plane_state->fb)
> > > >>> -		return -EINVAL;
> > > >>> -
> > > >>>    	funcs = plane->helper_private;
> > > >>>    	if (!funcs->atomic_async_update)
> > > >>>    		return -EINVAL;
> > > >>> @@ -1657,6 +1648,9 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> > > >>>    	int i;
> > > >>>    
> > > >>>    	for_each_new_plane_in_state(state, plane, plane_state, i) {
> > > >>> +		struct drm_framebuffer *new_fb = plane_state->fb;
> > > >>> +		struct drm_framebuffer *old_fb = plane->state->fb;
> > > >>> +
> > > >>>    		funcs = plane->helper_private;
> > > >>>    		funcs->atomic_async_update(plane, plane_state);
> > > >>>    
> > > >>> @@ -1665,11 +1659,17 @@ void drm_atomic_helper_async_commit(struct drm_device *dev,
> > > >>>    		 * plane->state in-place, make sure at least common
> > > >>>    		 * properties have been properly updated.
> > > >>>    		 */
> > > >>> -		WARN_ON_ONCE(plane->state->fb != plane_state->fb);
> > > >>> +		WARN_ON_ONCE(plane->state->fb != new_fb);
> > > >>>    		WARN_ON_ONCE(plane->state->crtc_x != plane_state->crtc_x);
> > > >>>    		WARN_ON_ONCE(plane->state->crtc_y != plane_state->crtc_y);
> > > >>>    		WARN_ON_ONCE(plane->state->src_x != plane_state->src_x);
> > > >>>    		WARN_ON_ONCE(plane->state->src_y != plane_state->src_y);
> > > >>> +
> > > >>> +		/*
> > > >>> +		 * Make sure the FBs have been swapped so that cleanups in the
> > > >>> +		 * new_state performs a cleanup in the old FB.
> > > >>> +		 */
> > > >>> +		WARN_ON_ONCE(plane_state->fb != old_fb);  
> > > >>
> > > >> I personally think this approach is fine and the WARN_ONs are good for
> > > >> catching drivers that want to use these in the future.
> > > > 
> > > > Well, I agree this change is the way to go for a short-term solution
> > > > to relax the old_fb == new_fb constraint, but I keep thinking this whole
> > > > > "update plane_state in place" is a recipe for trouble and just makes
> > > > > things more complicated for drivers for no obvious reasons. Look at the
> > > > > VC4 implem [1] if you need proof that things can get messy pretty
> > > > > quickly.
> > > > > 
> > > > > All these state-fields-copying steps could be skipped if the core was
> > > > > simply swapping the old/new states as is done in the sync update path.
> > > > 
> > > > [1]https://elixir.bootlin.com/linux/v5.0-rc7/source/drivers/gpu/drm/vc4/vc4_plane.c#L878  
> > > 
> > > I completely agree with this view FWIW. I had a discussion with Daniel 
> > > about this when I had posted the original block FB changes patch.
> > > 
> > > - The plane object needs to be locked in order for async state to be updated
> > > - Blocking commit work holds the lock for the plane, async update won't 
> > > happen
> > > - Non-blocking commit work that's still ongoing won't have hw_done 
> > > signaled and drm_atomic_helper_async_check will block the async update
> > > 
> > > So this looks safe in theory, with the exception of the call to 
> > > > drm_atomic_helper_cleanup_planes occurring after hw_done is signaled.
> > 
> > Isn't it also the case in the sync update path?
> > 
> > > 
> > > I believe that the behavior of this function still remains the same even 
> > > if plane->state is swapped to something else during the call (since 
> > > old_plane_state should never be equal to plane->state if the commit 
> > > succeeded and the plane is in the commit), but I'm not sure that's 
> > > something we'd want to rely on.
> > > 
> > > I think other than that issue, you could probably just:
> > > 
> > > drm_atomic_helper_prepare_planes(...);
> > > drm_atomic_helper_swap_state(...);
> > > drm_atomic_state_get(state);
> > 
> > Why do we need a state_get() here? AFAICT, it's done this way in the
> > sync update path because of the non-blocking semantic where the state
> > might be released by the caller before it's been applied by the commit
> > worker.
> > 
> > > drm_atomic_helper_async_commit(...);
> > > drm_atomic_helper_cleanup_planes(dev, state);
> > > 
> > > and it would work as expected. But there still may be other things I'm 
> > > missing or haven't considered here.
> > 
> > Actually, when I said we could swap states, I was not necessarily
> > thinking about re-using drm_atomic_helper_swap_state(), but instead
> > swap states directly in drm_atomic_helper_async_commit():
> > 
> > 	for_each_oldnew_plane_in_state(state, plane, old_plane_state,
> > 				       new_plane_state, i) {
> > 		WARN_ON(plane->state != old_plane_state);
> > 		old_plane_state->state = state;
> > 		new_plane_state->state = NULL;
> > 		state->planes[i].state = old_plane_state;
> > 		plane->state = new_plane_state;
> > 
> > 		funcs = plane->helper_private;
> > 		funcs->atomic_async_update(plane, new_plane_state);
> > 	}
> > 
> > This way we would avoid the WARN_ON() lines we have in
> > drm_atomic_helper_async_commit() to check that things have been
> > properly updated in-place, and we would also get rid of the driver
> > code copying the plane_state property that can change during an async
> > update.
> > 
> > But, as you said, I might be missing other potential issues.
> 
> Ok I dug around again, and I think I reconstructed the problem again.
> 
> The issue is the lifetimes of state structs. The nonblocking commit worker
> doesn't hold a reference onto the new states at all. The only reason those
> new states cannot disappear is that the next atomic commit touching the
> same states waits for crtc_commit.hw_done before it pushes its own update
> through (and then goes and releases those state structures).
> 
> The old state has no such issue, since each commit takes ownership of the
> old state and then releases it. And can do that any time after hw_done.
> 
> Now with the current async code that's no issue, because we do check for
> hw_done. The trouble is that hw_done is a kernel-internal implementation
> detail. The only thing userspace can observe is flip_done, and that's
> what's used for -EBUSY for normal page-flips. For cursor this kinda
> doesn't matter, because these two should be fairly close together (in most
> cases hw_done even happens before flip_done, but that depends upon the
> driver). So the occasional silent fallback to a synchronous commit doesn't
> really matter.
> 
> What we could do is just wait for hw_done for async commits, but that's
> kinda not cool either since it blocks (again cursor is ill-defined enough
> that it doesn't matter). And pushing async updates to a worker means we
> need to greatly extend the crtc_commit tracking (at least to each plane
> state). I think most of that exists now, since we had to add it anyway for
> planes which can be reassigned between crtcs.
> 
> tldr; maybe we can do the full swapping now?
> 
> I agree it feels like the cleaner solution, but definitely need a pile of
> igt tests to make sure we can mix&match between async and sync commits and
> nothing blows up. And sync commits need to use reassignment of planes to
> different crtcs plus nonblocking commit (I think amd hw can do all that,
> or at least I've seen prep patches).

Another upshot of the in-place approach: It forces verbosity :-)

Every value you have to manually update is also a value you have to write
to hw somewhere, and you need to audit that that write is ok from an async
pov. Making async too similar to sync commits might tempt people to just
share the same code for everything, and then async isn't really any
better. But I'm not sure how real a concern that is, and whether
that justifies the verbosity ...
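
For illustration only (this is not part of the patch): a minimal userspace
sketch of what the in-place approach asks of a driver, assuming toy stand-ins
for the DRM structures. All toy_* names are hypothetical; the checks in
toy_async_commit() mirror the WARN_ON_ONCE() lines from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the DRM structures; not the kernel definitions. */
struct toy_fb { int id; };

struct toy_plane_state {
	struct toy_fb *fb;
	int crtc_x, crtc_y;
	unsigned int src_x, src_y;
};

struct toy_plane {
	struct toy_plane_state *state;	/* current state, updated in place */
};

/*
 * Hypothetical driver hook. Every field is copied by hand (each one would
 * also need a matching hw write), and the framebuffers are swapped so that
 * new_state ends up holding the old fb for the later cleanup.
 */
static void toy_async_update(struct toy_plane *plane,
			     struct toy_plane_state *new_state)
{
	struct toy_fb *old_fb = plane->state->fb;

	plane->state->crtc_x = new_state->crtc_x;
	plane->state->crtc_y = new_state->crtc_y;
	plane->state->src_x = new_state->src_x;
	plane->state->src_y = new_state->src_y;

	plane->state->fb = new_state->fb;
	new_state->fb = old_fb;
}

/* Returns 1 if every post-update check holds, 0 otherwise. */
static int toy_async_commit(struct toy_plane *plane,
			    struct toy_plane_state *new_state)
{
	struct toy_fb *new_fb = new_state->fb;
	struct toy_fb *old_fb = plane->state->fb;

	toy_async_update(plane, new_state);

	return plane->state->fb == new_fb &&
	       plane->state->crtc_x == new_state->crtc_x &&
	       plane->state->crtc_y == new_state->crtc_y &&
	       plane->state->src_x == new_state->src_x &&
	       plane->state->src_y == new_state->src_y &&
	       new_state->fb == old_fb;	/* fbs were swapped */
}
```

Forgetting any one of the copies in toy_async_update() makes
toy_async_commit() return 0, which is the verbosity trade-off in a nutshell.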
-Daniel
On Mon, 11 Mar 2019 20:51:27 +0100
Daniel Vetter <daniel@ffwll.ch> wrote:

> Ok I dug around again, and I think I reconstructed the problem again.

Great!

> 
> The issue is the lifetimes of state structs. The nonblocking commit worker
> doesn't hold a reference onto the new states at all. The only reason those
> new states cannot disappear is that the next atomic commit touching the
> same states waits for crtc_commit.hw_done before it pushes its own update
> through (and then goes and releases those state structures).

By disappear I guess you mean when they're replaced in plane->state by a
subsequent atomic commit that places them in the old_state slot and
releases them as part of the drm_atomic_state_put() call when returning
from a non-blocking atomic update. Any reason we couldn't retain
new_state refs until we're done manipulating them to overcome this
problem?

> 
> The old state has no such issue, since each commit takes ownership of the
> old state and then releases it. And can do that any time after hw_done.

I'd expect the wait on hw_done to be needed anyway for async commits
going after sync ones. As the comment says, if we don't wait for
hw_done, the async update settings might be overridden by the sync
update ones.

> 
> Now with the current async code that's no issue, because we do check for
> hw_done. The trouble is that hw_done is a kernel-internal implementation
> detail. The only thing userspace can observe is flip_done, and that's
> what's used for -EBUSY for normal page-flips. For cursor this kinda
> doesn't matter, because these two should be fairly close together (in most
> cases hw_done even happens before flip_done, but that depends upon the
> driver). So the occasional silent fallback to a synchronous commit doesn't
> really matter.
> 
> What we could do is just wait for hw_done for async commits, but that's
> kinda not cool either since it blocks (again cursor is ill-defined enough
> that it doesn't matter). And pushing async updates to a worker means we
> need to greatly extend the crtc_commit tracking (at least to each plane
> state). I think most of that exists now, since we had to add it anyway for
> planes which can be reassigned between crtcs.

To be honest, I don't know what the semantics of async commit should be.
Does async (update things between 2 VBLANKS at the risk of causing
tearing) necessarily imply non-blocking (return before the update is
actually pushed to the HW)?

> 
> tldr; maybe we can do the full swapping now?
> 
> I agree it feels like the cleaner solution, but definitely need a pile of
> igt tests to make sure we can mix&match between async and sync commits and
> nothing blows up. And sync commits need to use reassignment of planes to
> different crtcs plus nonblocking commit (I think amd hw can do all that,
> or at least I've seen prep patches).

Yes, that'd be great to have that in place, especially if we want to
expose async atomic commits to userspace (right now it's only used for
legacy cursor updates).
On Tue, Mar 12, 2019 at 10:32:09AM +0100, Boris Brezillon wrote:
> On Mon, 11 Mar 2019 20:51:27 +0100
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > Ok I dug around again, and I think I reconstructed the problem again.
> 
> Great!
> 
> > 
> > The issue is the lifetimes of state structs. The nonblocking commit worker
> > doesn't hold a reference onto the new states at all. The only reason those
> > new states cannot disappear is that the next atomic commit touching the
> > same states waits for crtc_commit.hw_done before it pushes its own update
> > through (and then goes and releases those state structures).
> 
> By disappear I guess you mean when they're replaced in plane->state by a
> subsequent atomic commit that places them in the old_state slot and
> releases them as part of the drm_atomic_state_put() call when returning
> from a non-blocking atomic update. Any reason we couldn't retain
> new_state refs until we're done manipulating them to overcome this
> problem?

They're not refcounted. The idea behind that is that since state updates
for a given object are supposed to be strictly ordered, it should be clear
who owns it and when it's ok to release the old state.
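
A rough sketch of that ownership rule, as a userspace toy model rather than
kernel code. The toy_* names and the int hw_done flag are assumptions made
for the example (the kernel tracks this with completions in the crtc_commit):

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins; not the kernel definitions. */
struct toy_state { int value; };
struct toy_object { struct toy_state *state; };

struct toy_commit {
	struct toy_state *old_state;	/* owned by this commit after the swap */
	int hw_done;			/* set once hw no longer uses old_state */
};

/* Install new_state on the object; the commit takes over the old one. */
static void toy_swap(struct toy_object *obj, struct toy_commit *commit,
		     struct toy_state *new_state)
{
	commit->old_state = obj->state;
	commit->hw_done = 0;
	obj->state = new_state;
}

/*
 * Release the old state. No refcount is needed because exactly one commit
 * owns it, but the release is only legal after hw_done.
 * Returns 0 on success, -1 if called too early.
 */
static int toy_commit_cleanup(struct toy_commit *commit)
{
	if (!commit->hw_done)
		return -1;

	free(commit->old_state);
	commit->old_state = NULL;
	return 0;
}
```

Note that nothing here protects the new state: it stays alive only because
the next commit waits before swapping it out, which is the lifetime problem
described earlier in the thread.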

> > The old state has no such issue, since each commit takes ownership of the
> > old state and then releases it. And can do that any time after hw_done.
> 
> I'd expect the wait on hw_done to be needed anyway for async commits
> going after sync ones. As the comment says, if we don't wait for
> hw_done, the async update settings might be overridden by the sync
> update ones.

Yup. Nonblocking commits do the same, but in the worker thread, so not
holding up anything.

> > Now with the current async code that's no issue, because we do check for
> > hw_done. The trouble is that hw_done is a kernel-internal implementation
> > detail. The only thing userspace can observe is flip_done, and that's
> > what's used for -EBUSY for normal page-flips. For cursor this kinda
> > doesn't matter, because these two should be fairly close together (in most
> > cases hw_done even happens before flip_done, but that depends upon the
> > driver). So the occasional silent fallback to a synchronous commit doesn't
> > really matter.
> > 
> > What we could do is just wait for hw_done for async commits, but that's
> > kinda not cool either since it blocks (again cursor is ill-defined enough
> > that it doesn't matter). And pushing async updates to a worker means we
> > need to greatly extend the crtc_commit tracking (at least to each plane
> > state). I think most of that exists now, since we had to add it anyway for
> > planes which can be reassigned between crtcs.
> 
> To be honest, I don't know what the semantics of async commit should be.
> Does async (update things between 2 VBLANKS at the risk of causing
> tearing) necessarily imply non-blocking (return before the update is
> actually pushed to the HW)?

I think so. At least for cursor we want "fast". In a way nonblocking
commit can also block (locks, kmalloc), but generally shouldn't.

There have been discussions to expose async flips on all planes (not just
for cursors and for primary flips for amdgpu), but since no one typed the
userspace I have no idea what good semantics for all the interactions
between sync/async and blocking/nonblocking should be. Throw in
allow-modeset/flip-only for even more fun.

otoh we already have combinations that don't work reliably, e.g.
allow-modeset and nonblocking is not a good idea, since a modeset can pull
in additional crtcs, which will then make subsequent pageflips on those
other crtcs fail with -EBUSY. And current atomic doesn't tell userspace
when this happens.

So if we make async good enough for cursors and legacy async page-flip and
leave everything else undefined behaviour, I think that's good enough.

Now the question is whether "waiting for hw_done" is too much blocking,
and that might very much depend upon the driver. I think for most drivers
it should be ok.
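
For reference, the hw_done vs. flip_done distinction discussed above can
be sketched as a toy model in plain C. The toy_* names are hypothetical
and this is not the actual helper code; it only assumes what this thread
states, namely that normal page-flips are rejected with -EBUSY based on
flip_done, while the async path checks hw_done and silently falls back
to a synchronous commit when that check fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two completion points of a previous commit. */
struct toy_crtc_commit {
	bool hw_done;	/* kernel-internal: hardware programming finished */
	bool flip_done;	/* userspace-visible: the flip completed */
};

/* Normal nonblocking page-flip: rejected while the last flip is pending. */
static int toy_flip_check(const struct toy_crtc_commit *prev)
{
	if (prev && !prev->flip_done)
		return -EBUSY;
	return 0;
}

/* Async update: only allowed once the previous commit reached hw_done. */
static int toy_async_check(const struct toy_crtc_commit *prev)
{
	if (prev && !prev->hw_done)
		return -EBUSY;
	return 0;
}

enum toy_path { TOY_PATH_ASYNC, TOY_PATH_SYNC };

/* A failed async check is not reported; the update just goes sync. */
static enum toy_path toy_pick_path(const struct toy_crtc_commit *prev)
{
	return toy_async_check(prev) == 0 ? TOY_PATH_ASYNC : TOY_PATH_SYNC;
}

static void toy_example(void)
{
	/* Driver signalled flip_done before hw_done. */
	struct toy_crtc_commit prev = { .hw_done = false, .flip_done = true };

	assert(toy_flip_check(&prev) == 0);
	assert(toy_pick_path(&prev) == TOY_PATH_SYNC);
}
```

toy_example() shows the case where userspace has already seen flip_done,
so a page-flip would be accepted, yet the async update still takes the
synchronous path because hw_done has not been signalled.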

> > tldr; maybe we can do the full swapping now?
> > 
> > I agree it feels like the cleaner solution, but we definitely need a pile of
> > igt tests to make sure we can mix&match between async and sync commits and
> > nothing blows up. And sync commits need to use reassignment of planes to
> > different crtcs plus nonblocking commit (I think amd hw can do all that,
> > or at least I've seen prep patches).
> 
> Yes, that'd be great to have that in place, especially if we want to
> expose async atomic commits to userspace (right now it's only used for
> legacy cursor updates).

legacy cursor + async page flip I think right now.
-Daniel