[2/2] drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX

Submitted by Lyude Paul on July 25, 2019, 7:40 p.m.

Details

Message ID 20190725194005.16572-3-lyude@redhat.com
State New
Series "drm/nouveau: i2c over DP AUX fixes" ( rev: 1 ) in Nouveau

Commit Message

Lyude Paul July 25, 2019, 7:40 p.m.
While I had thought I had fixed this issue in:

commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")

It turns out that while I did fix the error messages I was seeing on my
P50 when trying to access i2c busses with the GPU in runtime suspend, I
missed one important detail that was mentioned in the bug report this
commit was supposed to fix: that the CPU would only lock up when trying
to access i2c busses _on connected devices_ _while the GPU is not in
runtime suspend_. Whoops. That definitely explains why I was not able to
get my machine to hang with i2c bus interactions until now, as plugging
my P50 into its dock with an HDMI monitor connected allowed me to
finally reproduce this locally.

Now that I have managed to reproduce this issue properly, it looks like
the problem is much simpler than it looks. It turns out that some
connected devices, such as MST laptop docks, will actually ACK i2c reads
even if no data was actually read:

[  275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
[  275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
[  275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
[  275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[  275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[  275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000

Because we don't handle the situation of i2c ack without any data, we
end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
value of cnt always remains at 0. This finally properly explains how
this could result in a CPU hang like the ones observed in the
aforementioned commit.

So, fix this by retrying transactions if no data is written or received,
and give up and fail the transaction if we continue to not write or
receive any data after 32 retries.
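
For illustration only, the retry policy described above can be modelled in
plain userspace C. This is a minimal sketch, not the driver code: the names
bounded_xfer(), xfer_fn, xfer_ack_no_data() and xfer_short() are hypothetical
stand-ins, and only the 16-byte chunk size and the limit of 32 attempts are
taken from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define AUX_CHUNK_MAX   16  /* largest payload per transaction, as in the patch */
#define AUX_RETRY_LIMIT 32  /* attempts allowed per chunk, as in the patch */

/*
 * Hypothetical transfer callback (an assumption, not the nvkm interface):
 * returns <0 on error; on success it writes the number of bytes actually
 * moved back through *cnt, which may be 0 for an ACK without data.
 */
typedef int (*xfer_fn)(void *ctx, uint8_t *buf, uint8_t *cnt);

/*
 * Move `remaining` bytes in chunks. A chunk that transfers nothing is
 * retried; after AUX_RETRY_LIMIT attempts without progress the whole
 * transfer fails with -EIO instead of looping forever.
 */
static int bounded_xfer(xfer_fn xfer, void *ctx, uint8_t *buf, size_t remaining)
{
	assert(xfer != NULL);

	while (remaining) {
		uint8_t cnt = 0;
		unsigned int retries;

		for (retries = 0; retries < AUX_RETRY_LIMIT && !cnt; retries++) {
			/* Re-arm the request size on every attempt. */
			cnt = remaining > AUX_CHUNK_MAX ? AUX_CHUNK_MAX
							: (uint8_t)remaining;
			int ret = xfer(ctx, buf, &cnt);
			if (ret < 0)
				return ret;
		}
		/* Defensive check added for this sketch; not part of the patch. */
		if (!cnt || cnt > remaining)
			return -EIO;

		buf += cnt;
		remaining -= cnt;
	}
	return 0;
}

/* Stub sink that ACKs every request but never moves any data. */
static int xfer_ack_no_data(void *ctx, uint8_t *buf, uint8_t *cnt)
{
	unsigned int *calls = ctx;

	(void)buf;
	(*calls)++;
	*cnt = 0;
	return 0;
}

/* Stub sink that moves at most 4 bytes per transaction. */
static int xfer_short(void *ctx, uint8_t *buf, uint8_t *cnt)
{
	unsigned int *calls = ctx;
	uint8_t i;

	(*calls)++;
	if (*cnt > 4)
		*cnt = 4;
	for (i = 0; i < *cnt; i++)
		buf[i] = 0xAB;
	return 0;
}
```

With the always-empty stub, bounded_xfer() gives up with -EIO after exactly 32
calls, whereas a loop that only advances by cnt would never terminate. With
the short-transfer stub it still completes, because every chunk makes some
progress.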

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c | 24 +++++++++++++------
 1 file changed, 17 insertions(+), 7 deletions(-)

Patch

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c
index b4e7404fe660..a11637b0f6cc 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/i2c/aux.c
@@ -40,8 +40,7 @@  nvkm_i2c_aux_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
 		u8 *ptr = msg->buf;
 
 		while (remaining) {
-			u8 cnt = (remaining > 16) ? 16 : remaining;
-			u8 cmd;
+			u8 cnt, retries, cmd;
 
 			if (msg->flags & I2C_M_RD)
 				cmd = 1;
@@ -51,10 +50,19 @@  nvkm_i2c_aux_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
 			if (mcnt || remaining > 16)
 				cmd |= 4; /* MOT */
 
-			ret = aux->func->xfer(aux, true, cmd, msg->addr, ptr, &cnt);
-			if (ret < 0) {
-				nvkm_i2c_aux_release(aux);
-				return ret;
+			for (retries = 0, cnt = 0;
+			     retries < 32 && !cnt;
+			     retries++) {
+				cnt = min_t(u8, remaining, 16);
+				ret = aux->func->xfer(aux, true, cmd,
+						      msg->addr, ptr, &cnt);
+				if (ret < 0)
+					goto out;
+			}
+			if (!cnt) {
+				AUX_TRACE(aux, "no data after 32 retries");
+				ret = -EIO;
+				goto out;
 			}
 
 			ptr += cnt;
@@ -64,8 +72,10 @@  nvkm_i2c_aux_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
 		msg++;
 	}
 
+	ret = num;
+out:
 	nvkm_i2c_aux_release(aux);
-	return num;
+	return ret;
 }
 
 static u32

Comments

Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.2.2, v5.1.19, v4.19.60, v4.14.134, v4.9.186, v4.4.186.

v5.2.2: Build OK!
v5.1.19: Build OK!
v4.19.60: Build OK!
v4.14.134: Build OK!
v4.9.186: Failed to apply! Possible dependencies:
    1af5c410cc0c ("drm/nouveau/i2c: modify aux interface to return length actually transferred")

v4.4.186: Failed to apply! Possible dependencies:
    1af5c410cc0c ("drm/nouveau/i2c: modify aux interface to return length actually transferred")
    2ed95a4c65a3 ("drm/nouveau: recognise GM200 chipset")
    7568b1067181 ("drm/nouveau/nvif: split out display interface definitions")
    7d2813c437a0 ("drm/nouveau/ltc/gm204: split implementation from gm107")
    db1eb528462f ("drm/nouveau: s/gm204/gm200/ in a number of places")
    e3d26d086092 ("drm/nouveau/ibus/gm204: split implementation from gk104")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks,
Sasha

On Fri, 2019-07-26 at 14:17 +0000, Sasha Levin wrote:
> Hi,
> 
> [This is an automated email]
> 
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
> 
> The bot has tested the following trees: v5.2.2, v5.1.19, v4.19.60,
> v4.14.134, v4.9.186, v4.4.186.
> 
> v5.2.2: Build OK!
> v5.1.19: Build OK!
> v4.19.60: Build OK!
> v4.14.134: Build OK!
> v4.9.186: Failed to apply! Possible dependencies:
>     1af5c410cc0c ("drm/nouveau/i2c: modify aux interface to return length
> actually transferred")

skip v4.9
> 
> v4.4.186: Failed to apply! Possible dependencies:
>     1af5c410cc0c ("drm/nouveau/i2c: modify aux interface to return length
> actually transferred")
>     2ed95a4c65a3 ("drm/nouveau: recognise GM200 chipset")
>     7568b1067181 ("drm/nouveau/nvif: split out display interface
> definitions")
>     7d2813c437a0 ("drm/nouveau/ltc/gm204: split implementation from gm107")
>     db1eb528462f ("drm/nouveau: s/gm204/gm200/ in a number of places")
>     e3d26d086092 ("drm/nouveau/ibus/gm204: split implementation from gk104")
> 
> 
and skip v4.4

> NOTE: The patch will not be queued to stable trees until it is upstream.
> 
> How should we proceed with this patch?
> 
> --
> Thanks,
> Sasha