[3/3] drm/amd/display: Don't replace the dc_state for fast updates

Submitted by Kazlauskas, Nicholas on July 31, 2019, 4:26 p.m.

Details

Message ID 20190731162604.28509-3-nicholas.kazlauskas@amd.com
State Accepted
Commit bd200d190f45b62c006d1ad0a63eeffd87db7a47
Headers show
Series "Series without cover letter" ( rev: 1 ) in AMD X.Org drivers

Browsing this patch as part of:
"Series without cover letter" rev 1 in AMD X.Org drivers
<< prev patch [3/3] next patch >>

Commit Message

Kazlauskas, Nicholas July 31, 2019, 4:26 p.m.
[Why]
DRM private objects have no hw_done/flip_done fencing mechanism on their
own and cannot be used to sequence commits accordingly.

When issuing commits that don't touch the same set of hardware resources
like page-flips on different CRTCs we can run into the issue below
because of this:

1. Client requests non-blocking Commit #1, has a new dc_state #1,
state is swapped, commit tail is deferred to work queue

2. Client requests non-blocking Commit #2, has a new dc_state #2,
state is swapped, commit tail is deferred to work queue

3. Commit #2 work starts, commit tail finishes,
atomic state is cleared, dc_state #1 is freed

4. Commit #1 work starts,
commit tail encounters null pointer deref on dc_state #1

In order to change the DC state as in the private object we need to
ensure that we wait for all outstanding commits to finish and that
any other pending commits must wait for the current one to finish as
well.

We do this for MEDIUM and FULL updates. But not for FAST updates, nor
would we want to since it would cause stuttering from the delays.

FAST updates that go through dm_determine_update_type_for_commit always
create a new dc_state and lock the DRM private object if there are
any changed planes.

We need the old state to validate, but we don't actually need the new
state here.

[How]
If the commit isn't a full update then the use after free can be
resolved by simply discarding the new state entirely and retaining
the existing one instead.

With this change the sequence above can be reexamined. Commit #2 will
still free Commit #1's reference, but before this happens we actually
added an additional reference as part of Commit #2.

If an update comes in during this that needs to change the dc_state
it will need to wait on Commit #1 and Commit #2 to finish. Then it'll
swap the state, finish the work in commit tail and drop the last
reference on Commit #2's dc_state.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204181
Cc: Leo Li <sunpeng.li@amd.com>
Cc: David Francis <david.francis@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +++++++++++++++++++
 1 file changed, 23 insertions(+)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 4c90662e9fa2..fe5291b16193 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7288,6 +7288,29 @@  static int amdgpu_dm_atomic_check(struct drm_device *dev,
 			ret = -EINVAL;
 			goto fail;
 		}
+	} else {
+		/*
+		 * The commit is a fast update. Fast updates shouldn't change
+		 * the DC context, affect global validation, and can have their
+		 * commit work done in parallel with other commits not touching
+		 * the same resource. If we have a new DC context as part of
+		 * the DM atomic state from validation we need to free it and
+		 * retain the existing one instead.
+		 */
+		struct dm_atomic_state *new_dm_state, *old_dm_state;
+
+		new_dm_state = dm_atomic_get_new_state(state);
+		old_dm_state = dm_atomic_get_old_state(state);
+
+		if (new_dm_state && old_dm_state) {
+			if (new_dm_state->context)
+				dc_release_state(new_dm_state->context);
+
+			new_dm_state->context = old_dm_state->context;
+
+			if (old_dm_state->context)
+				dc_retain_state(old_dm_state->context);
+		}
 	}
 
 	/* Must be success */

Comments

On Wed, Jul 31, 2019 at 12:26 PM Nicholas Kazlauskas
<nicholas.kazlauskas@amd.com> wrote:
>
> [Why]
> DRM private objects have no hw_done/flip_done fencing mechanism on their
> own and cannot be used to sequence commits accordingly.
>
> When issuing commits that don't touch the same set of hardware resources
> like page-flips on different CRTCs we can run into the issue below
> because of this:
>
> 1. Client requests non-blocking Commit #1, has a new dc_state #1,
> state is swapped, commit tail is deferred to work queue
>
> 2. Client requests non-blocking Commit #2, has a new dc_state #2,
> state is swapped, commit tail is deferred to work queue
>
> 3. Commit #2 work starts, commit tail finishes,
> atomic state is cleared, dc_state #1 is freed
>
> 4. Commit #1 work starts,
> commit tail encounters null pointer deref on dc_state #1
>
> In order to change the DC state as in the private object we need to
> ensure that we wait for all outstanding commits to finish and that
> any other pending commits must wait for the current one to finish as
> well.
>
> We do this for MEDIUM and FULL updates. But not for FAST updates, nor
> would we want to since it would cause stuttering from the delays.
>
> FAST updates that go through dm_determine_update_type_for_commit always
> create a new dc_state and lock the DRM private object if there are
> any changed planes.
>
> We need the old state to validate, but we don't actually need the new
> state here.
>
> [How]
> If the commit isn't a full update then the use after free can be
> resolved by simply discarding the new state entirely and retaining
> the existing one instead.
>
> With this change the sequence above can be reexamined. Commit #2 will
> still free Commit #1's reference, but before this happens we actually
> added an additional reference as part of Commit #2.
>
> If an update comes in during this that needs to change the dc_state
> it will need to wait on Commit #1 and Commit #2 to finish. Then it'll
> swap the state, finish the work in commit tail and drop the last
> reference on Commit #2's dc_state.
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204181
> Cc: Leo Li <sunpeng.li@amd.com>
> Cc: David Francis <david.francis@amd.com>
> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>

Series is:
Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +++++++++++++++++++
>  1 file changed, 23 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 4c90662e9fa2..fe5291b16193 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -7288,6 +7288,29 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
>                         ret = -EINVAL;
>                         goto fail;
>                 }
> +       } else {
> +               /*
> +                * The commit is a fast update. Fast updates shouldn't change
> +                * the DC context, affect global validation, and can have their
> +                * commit work done in parallel with other commits not touching
> +                * the same resource. If we have a new DC context as part of
> +                * the DM atomic state from validation we need to free it and
> +                * retain the existing one instead.
> +                */
> +               struct dm_atomic_state *new_dm_state, *old_dm_state;
> +
> +               new_dm_state = dm_atomic_get_new_state(state);
> +               old_dm_state = dm_atomic_get_old_state(state);
> +
> +               if (new_dm_state && old_dm_state) {
> +                       if (new_dm_state->context)
> +                               dc_release_state(new_dm_state->context);
> +
> +                       new_dm_state->context = old_dm_state->context;
> +
> +                       if (old_dm_state->context)
> +                               dc_retain_state(old_dm_state->context);
> +               }
>         }
>
>         /* Must be success */
> --
> 2.17.1
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

I forgot to add it in this commit description, but I do know the actual 
regression commit was.

Fixes: 004b3938e637 ("drm/amd/display: Check scaling info when determing 
update type")

This started creating new dc_state on fast updates, but we need them for 
validation.

I'll add this to the commit body and push.

Thanks for the review!

Nicholas Kazlauskas

On 8/1/19 3:25 PM, Francis, David wrote:
> Series is:

> 

> Reviewed-by: David Francis <david.francis@amd.com>

> 

> ------------------------------------------------------------------------

> *From:* Alex Deucher <alexdeucher@gmail.com>

> *Sent:* August 1, 2019 2:58:56 PM

> *To:* Kazlauskas, Nicholas <Nicholas.Kazlauskas@amd.com>

> *Cc:* amd-gfx list <amd-gfx@lists.freedesktop.org>; Li, Sun peng (Leo) 

> <Sunpeng.Li@amd.com>; Francis, David <David.Francis@amd.com>

> *Subject:* Re: [PATCH 3/3] drm/amd/display: Don't replace the dc_state 

> for fast updates

> On Wed, Jul 31, 2019 at 12:26 PM Nicholas Kazlauskas

> <nicholas.kazlauskas@amd.com> wrote:

>>

>> [Why]

>> DRM private objects have no hw_done/flip_done fencing mechanism on their

>> own and cannot be used to sequence commits accordingly.

>>

>> When issuing commits that don't touch the same set of hardware resources

>> like page-flips on different CRTCs we can run into the issue below

>> because of this:

>>

>> 1. Client requests non-blocking Commit #1, has a new dc_state #1,

>> state is swapped, commit tail is deferred to work queue

>>

>> 2. Client requests non-blocking Commit #2, has a new dc_state #2,

>> state is swapped, commit tail is deferred to work queue

>>

>> 3. Commit #2 work starts, commit tail finishes,

>> atomic state is cleared, dc_state #1 is freed

>>

>> 4. Commit #1 work starts,

>> commit tail encounters null pointer deref on dc_state #1

>>

>> In order to change the DC state as in the private object we need to

>> ensure that we wait for all outstanding commits to finish and that

>> any other pending commits must wait for the current one to finish as

>> well.

>>

>> We do this for MEDIUM and FULL updates. But not for FAST updates, nor

>> would we want to since it would cause stuttering from the delays.

>>

>> FAST updates that go through dm_determine_update_type_for_commit always

>> create a new dc_state and lock the DRM private object if there are

>> any changed planes.

>>

>> We need the old state to validate, but we don't actually need the new

>> state here.

>>

>> [How]

>> If the commit isn't a full update then the use after free can be

>> resolved by simply discarding the new state entirely and retaining

>> the existing one instead.

>>

>> With this change the sequence above can be reexamined. Commit #2 will

>> still free Commit #1's reference, but before this happens we actually

>> added an additional reference as part of Commit #2.

>>

>> If an update comes in during this that needs to change the dc_state

>> it will need to wait on Commit #1 and Commit #2 to finish. Then it'll

>> swap the state, finish the work in commit tail and drop the last

>> reference on Commit #2's dc_state.

>>

>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204181

>> Cc: Leo Li <sunpeng.li@amd.com>

>> Cc: David Francis <david.francis@amd.com>

>> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>

> 

> Series is:

> Acked-by: Alex Deucher <alexander.deucher@amd.com>

> 

>> ---

>>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 +++++++++++++++++++

>>  1 file changed, 23 insertions(+)

>>

>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

>> index 4c90662e9fa2..fe5291b16193 100644

>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

>> @@ -7288,6 +7288,29 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,

>>                         ret = -EINVAL;

>>                         goto fail;

>>                 }

>> +       } else {

>> +               /*

>> +                * The commit is a fast update. Fast updates shouldn't change

>> +                * the DC context, affect global validation, and can have their

>> +                * commit work done in parallel with other commits not touching

>> +                * the same resource. If we have a new DC context as part of

>> +                * the DM atomic state from validation we need to free it and

>> +                * retain the existing one instead.

>> +                */

>> +               struct dm_atomic_state *new_dm_state, *old_dm_state;

>> +

>> +               new_dm_state = dm_atomic_get_new_state(state);

>> +               old_dm_state = dm_atomic_get_old_state(state);

>> +

>> +               if (new_dm_state && old_dm_state) {

>> +                       if (new_dm_state->context)

>> +                               dc_release_state(new_dm_state->context);

>> +

>> +                       new_dm_state->context = old_dm_state->context;

>> +

>> +                       if (old_dm_state->context)

>> +                               dc_retain_state(old_dm_state->context);

>> +               }

>>         }

>>

>>         /* Must be success */

>> --

>> 2.17.1

>>

>> _______________________________________________

>> amd-gfx mailing list

>> amd-gfx@lists.freedesktop.org

>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx