[7/9] drm/i915: Fix random aux transaction failures.

Submitted by Rodrigo Vivi on Nov. 26, 2015, 12:04 a.m.

Details

Message ID 1448496245-1495-8-git-send-email-rodrigo.vivi@intel.com
State New
Series "Organize aux retries v3" ( rev: 1 ) in Intel GFX


Commit Message

Rodrigo Vivi Nov. 26, 2015, 12:04 a.m.
Aux communications, mainly on sink_crc, were failing randomly and
often on recent platforms. The first solution was to use
intel_dp_dpcd_read_wake, but then it was suggested to move the
retries to the drm level.

Since the drm level was already taking care of retries, and we didn't
want to throw random retries at that level, the second solution was
to put the retries in the aux_transfer layer, which was nacked.

So I realized we had many retries in different places and started
to organize them a bit. During this reorganization I noticed that we
weren't handling at all the case where the message size was zero,
and this was exactly the case that was affecting sink_crc.

We also weren't respecting BSpec, which says that message sizes of 0
or > 20 are forbidden.

We still have no clue why we are getting this forbidden value there.
For now we need to handle it anyway, so we return -EBUSY and let the
drm level take care of it with the retries that are already in place.

v2: Print a debug message when this case is reached, as suggested
    by Jani.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Daniel Stone <daniels@collabora.com> # SKL
---
 drivers/gpu/drm/i915/intel_dp.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)


diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 54e85f5..a02bfa1 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -899,6 +899,19 @@  done:
 	/* Unload any bytes sent back from the other side */
 	recv_bytes = ((status & DP_AUX_CH_CTL_MESSAGE_SIZE_MASK) >>
 		      DP_AUX_CH_CTL_MESSAGE_SIZE_SHIFT);
+
+	/*
+	 * By BSpec: "Message sizes of 0 or >20 are not allowed."
+	 * We have no idea what happened, so we return -EBUSY and let
+	 * the drm layer take care of the necessary retries.
+	 */
+	if (recv_bytes == 0 || recv_bytes > 20) {
+		DRM_DEBUG_KMS("Forbidden recv_bytes = %d on aux transaction\n",
+			      recv_bytes);
+		ret = -EBUSY;
+		goto out;
+	}
+
 	if (recv_bytes > recv_size)
 		recv_bytes = recv_size;
 

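The check added above can be tried outside the driver with a small standalone sketch. This is only an illustration, not the i915 code: the mask and shift values (a 5-bit size field starting at bit 20) are assumptions made for the example, and the helpers aux_recv_bytes() and aux_recv_bytes_retry() are hypothetical names that do not exist in the kernel. The retry loop is a stand-in for the retries the commit message says are already in place at the drm level.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed layout for this sketch: 5-bit message size at bits 24:20. */
#define AUX_MESSAGE_SIZE_SHIFT	20
#define AUX_MESSAGE_SIZE_MASK	(0x1fu << AUX_MESSAGE_SIZE_SHIFT)
/* Upper bound quoted from BSpec in the patch: sizes of 0 or >20 are invalid. */
#define AUX_MAX_MESSAGE_SIZE	20

/* The size field must be wide enough to hold the largest legal value. */
static_assert(AUX_MAX_MESSAGE_SIZE <=
	      (AUX_MESSAGE_SIZE_MASK >> AUX_MESSAGE_SIZE_SHIFT),
	      "message size field too narrow");

/*
 * Hypothetical helper: extract the received byte count from a status
 * word. Returns the count (1..20), or -EBUSY for a forbidden size so
 * that the caller can retry the transaction.
 */
static int aux_recv_bytes(uint32_t status)
{
	int recv_bytes = (int)((status & AUX_MESSAGE_SIZE_MASK) >>
			       AUX_MESSAGE_SIZE_SHIFT);

	if (recv_bytes == 0 || recv_bytes > AUX_MAX_MESSAGE_SIZE)
		return -EBUSY;

	return recv_bytes;
}

/*
 * Hypothetical caller-side retry: each element of status[] stands in
 * for the status word read back after one transaction attempt. Stops
 * at the first attempt that does not return -EBUSY.
 */
static int aux_recv_bytes_retry(const uint32_t *status, int attempts)
{
	int ret = -EBUSY;

	for (int i = 0; i < attempts && ret == -EBUSY; i++)
		ret = aux_recv_bytes(status[i]);

	return ret;
}
```

With these assumptions, a status word whose size field is 0 or 21..31 yields -EBUSY, while a field of 1..20 is returned unchanged and would then be clamped to recv_size as in the existing code.
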
Comments

Hi Jani,

Is this version OK now?
Do you think we could go with just this patch for now, and continue the
retry rework and kill the read_wake function later?

Thanks,
Rodrigo.
