drm/etnaviv: lock MMU while dumping core

Submitted by Lucas Stach on May 22, 2019, 9:55 a.m.

Details

Message ID 20190522095514.7000-1-l.stach@pengutronix.de
State Accepted
Commit 1396500d673bd027683a0609ff84dca7eb6ea2e7
Headers show
Series "drm/etnaviv: lock MMU while dumping core" ( rev: 1 ) in DRI devel

Not browsing as part of any series.

Commit Message

Lucas Stach May 22, 2019, 9:55 a.m.
The devcoredump needs to operate on a stable state of the MMU while
it is writing the MMU state to the coredump. The missing lock
allowed both the userspace submit, as well as the GPU job finish
paths to mutate the MMU state while a coredump is under way.

Fixes: a8c21a5451d8 (drm/etnaviv: add initial etnaviv DRM driver)
Reported-by: David Jander <david@protonic.nl>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: David Jander <david@protonic.nl>
---
 drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++
 1 file changed, 5 insertions(+)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
index 33854c94cb85..515515ef24f9 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
@@ -125,6 +125,8 @@  void etnaviv_core_dump(struct etnaviv_gpu *gpu)
 		return;
 	etnaviv_dump_core = false;
 
+	mutex_lock(&gpu->mmu->lock);
+
 	mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
 
 	/* We always dump registers, mmu, ring and end marker */
@@ -167,6 +169,7 @@  void etnaviv_core_dump(struct etnaviv_gpu *gpu)
 	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
 			       PAGE_KERNEL);
 	if (!iter.start) {
+		mutex_unlock(&gpu->mmu->lock);
 		dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
 		return;
 	}
@@ -234,6 +237,8 @@  void etnaviv_core_dump(struct etnaviv_gpu *gpu)
 					 obj->base.size);
 	}
 
+	mutex_unlock(&gpu->mmu->lock);
+
 	etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
 
 	dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);

Comments

Hi Lucas,

Seems I'm not getting a reply, so I'm hijacking one of your more
recent patches in the hope of attracting your attention.

A while back I sent a fix for a regression that recently occurred
with etnaviv, where the kernel spat out a warning when importing
buffers into etnaviv.  You apparently merged this as "development",
queuing it up for the last merge window.

Since it is a regression (although not directly attributable to
etnaviv), please ensure that it is merged into stable kernels.

Thanks.

On Wed, May 22, 2019 at 11:55:14AM +0200, Lucas Stach wrote:
> The devcoredump needs to operate on a stable state of the MMU while
> it is writing the MMU state to the coredump. The missing lock
> allowed both the userspace submit, as well as the GPU job finish
> paths to mutate the MMU state while a coredump is under way.
> 
> Fixes: a8c21a5451d8 (drm/etnaviv: add initial etnaviv DRM driver)
> Reported-by: David Jander <david@protonic.nl>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> Tested-by: David Jander <david@protonic.nl>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> index 33854c94cb85..515515ef24f9 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> @@ -125,6 +125,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  		return;
>  	etnaviv_dump_core = false;
>  
> +	mutex_lock(&gpu->mmu->lock);
> +
>  	mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
>  
>  	/* We always dump registers, mmu, ring and end marker */
> @@ -167,6 +169,7 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
>  			       PAGE_KERNEL);
>  	if (!iter.start) {
> +		mutex_unlock(&gpu->mmu->lock);
>  		dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
>  		return;
>  	}
> @@ -234,6 +237,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  					 obj->base.size);
>  	}
>  
> +	mutex_unlock(&gpu->mmu->lock);
> +
>  	etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
>  
>  	dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);
> -- 
> 2.20.1
> 
>
On Wed, 2019-05-22 at 11:55 +0200, Lucas Stach wrote:
> The devcoredump needs to operate on a stable state of the MMU while
> it is writing the MMU state to the coredump. The missing lock
> allowed both the userspace submit, as well as the GPU job finish
> paths to mutate the MMU state while a coredump is under way.
> 
> Fixes: a8c21a5451d8 (drm/etnaviv: add initial etnaviv DRM driver)
> Reported-by: David Jander <david@protonic.nl>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> Tested-by: David Jander <david@protonic.nl>

Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>

> ---
>  drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> index 33854c94cb85..515515ef24f9 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> @@ -125,6 +125,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  		return;
>  	etnaviv_dump_core = false;
>  
> +	mutex_lock(&gpu->mmu->lock);
> +
>  	mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
>  
>  	/* We always dump registers, mmu, ring and end marker */
> @@ -167,6 +169,7 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
>  			       PAGE_KERNEL);
>  	if (!iter.start) {
> +		mutex_unlock(&gpu->mmu->lock);
>  		dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
>  		return;
>  	}
> @@ -234,6 +237,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  					 obj->base.size);
>  	}
>  
> +	mutex_unlock(&gpu->mmu->lock);
> +
>  	etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
>  
>  	dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);

regards
Philipp
On Wed, May 22, 2019 at 11:55:14AM +0200, Lucas Stach wrote:
> The devcoredump needs to operate on a stable state of the MMU while
> it is writing the MMU state to the coredump. The missing lock
> allowed both the userspace submit, as well as the GPU job finish
> paths to mutate the MMU state while a coredump is under way.

This locking does little to solve this problem.  We actually rely on the
GPU being deadlocked at this point - we aren't expecting the GPU to make
any progress.  The only thing that can realistically happen is for
userspace to submit a new job, but adding this locking does little to
avoid that.

> Fixes: a8c21a5451d8 (drm/etnaviv: add initial etnaviv DRM driver)
> Reported-by: David Jander <david@protonic.nl>
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> Tested-by: David Jander <david@protonic.nl>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> index 33854c94cb85..515515ef24f9 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> @@ -125,6 +125,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  		return;
>  	etnaviv_dump_core = false;
>  
> +	mutex_lock(&gpu->mmu->lock);
> +
>  	mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
>  
>  	/* We always dump registers, mmu, ring and end marker */
> @@ -167,6 +169,7 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
>  			       PAGE_KERNEL);
>  	if (!iter.start) {
> +		mutex_unlock(&gpu->mmu->lock);
>  		dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
>  		return;
>  	}
> @@ -234,6 +237,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
>  					 obj->base.size);
>  	}
>  
> +	mutex_unlock(&gpu->mmu->lock);
> +

All that this lock does is prevent the lists from changing while we
build up what we're going to write out.  At this point, you drop the
lock, before we've written anything out to the coredump.  Things
can now change, including the ring buffer.

>  	etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
>  
>  	dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);

Here we write out the data, which includes the command buffers, ring
buffers, BOs, etc.  The data in the ring buffer can still change
because the lock has been dropped.

However, all those objects should be pinned, so none of them should
go away: things like the command buffers that have been submitted
should be immutable at this point (if they aren't, it could well be
a reason why the GPU has gone awol.)

It is also not nice to hold the lock over the _big_ vmalloc() which
may take some time.

Have you actually seen problems here, or is this just theoretical?
Am Freitag, den 24.05.2019, 16:31 +0100 schrieb Russell King - ARM Linux admin:
> On Wed, May 22, 2019 at 11:55:14AM +0200, Lucas Stach wrote:
> > The devcoredump needs to operate on a stable state of the MMU while
> > it is writing the MMU state to the coredump. The missing lock
> > allowed both the userspace submit, as well as the GPU job finish
> > paths to mutate the MMU state while a coredump is under way.
> 
> This locking does little to solve this problem.  We actually rely on the
> GPU being deadlocked at this point - we aren't expecting the GPU to make
> any progress.  The only thing that can realistically happen is for
> userspace to submit a new job, but adding this locking does little to
> avoid that.

The GPU is dead at the point where we do the dump. But there is nothing
that would stop the workers that clean up finished jobs before the GPU
stopped execution to manipulate the MMU mapping list, which happens
when buffers become unreferenced due to the finished GPU operation.
Also new mappings can be added to the MMU due to a userspace submit,
even if the GPU is blocked.

> > Fixes: a8c21a5451d8 (drm/etnaviv: add initial etnaviv DRM driver)
> > > > Reported-by: David Jander <david@protonic.nl>
> > > > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> > > > Tested-by: David Jander <david@protonic.nl>
> > ---
> >  drivers/gpu/drm/etnaviv/etnaviv_dump.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > index 33854c94cb85..515515ef24f9 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > @@ -125,6 +125,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> > > >  		return;
> > > >  	etnaviv_dump_core = false;
> >  
> > > > +	mutex_lock(&gpu->mmu->lock);
> > +
> > > >  	mmu_size = etnaviv_iommu_dump_size(gpu->mmu);
> >  
> > > >  	/* We always dump registers, mmu, ring and end marker */
> > @@ -167,6 +169,7 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> > > >  	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
> > > >  			       PAGE_KERNEL);
> > > >  	if (!iter.start) {
> > > > +		mutex_unlock(&gpu->mmu->lock);
> > > >  		dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
> > > >  		return;
> > > >  	}
> > @@ -234,6 +237,8 @@ void etnaviv_core_dump(struct etnaviv_gpu *gpu)
> > > >  					 obj->base.size);
> > > >  	}
> >  
> > > > +	mutex_unlock(&gpu->mmu->lock);
> > +
> 
> All that this lock does is prevent the lists from changing while we
> build up what we're going to write out.  At this point, you drop the
> lock, before we've written anything out to the coredump.  Things
> can now change, including the ring buffer.

At this point we finished moving all the dump data into the vmalloced
memory allocated earlier. Why wound we need to hold the lock after this
is finished? Nothing is going to touch the MMU mappings after that
point.

> >  	etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
> >  
> >  	dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);
> 
> Here we write out the data, which includes the command buffers, ring
> buffers, BOs, etc.  The data in the ring buffer can still change
> because the lock has been dropped.
> 
> However, all those objects should be pinned, so none of them should
> go away: things like the command buffers that have been submitted
> should be immutable at this point (if they aren't, it could well be
> a reason why the GPU has gone awol.)

We do not have a big etnaviv lock that would prevent cleanup or new
submit work to make progress while the GPU is busy or hung. All those
operations are able to mutate the MMU state, even when the GPU is
locked up. The only things that are guaranteed to be stable are the
objects referenced by the hanging job, which is also why I think we
should dump only the hanging job and the GPU state, but that's a much
bigger patch. I've kept this one small, so it can be applied to the
stable kernels without any conflicts.

> It is also not nice to hold the lock over the _big_ vmalloc() which
> may take some time.

At the time we detect the GPU lockup, we've already lost a lot of GPU
not executing pending jobs. I don't really care about blocking a
userspace submit a bit longer while the dump and recovery is under way.

> Have you actually seen problems here, or is this just theoretical?

This fixes a real kernel crashing issue as we overrun the vmalloced
buffer when new BOs are added into the MMU between the time we go over
the mappings to get the dump size and actually moving the BO data into
the dump. This has been reported and tracked down by David.

Regards,
Lucas