drm/i915/fbdev: Serialise early hotplug events with async fbdev config

Submitted by Chris Wilson on Nov. 25, 2017, 7:41 p.m.

Details

Message ID 20171125194155.355-1-chris@chris-wilson.co.uk
State Accepted
Commit ad88d7fc6c032ddfb32b8d496a070ab71de3a64f
Headers show
Series "drm/i915/fbdev: Serialise early hotplug events with async fbdev config" ( rev: 1 ) in Intel GFX

Not browsing as part of any series.

Commit Message

Chris Wilson Nov. 25, 2017, 7:41 p.m.
As both the hotplug event and fbdev configuration run asynchronously, it
is possible for them to run concurrently. If configuration fails, we were
freeing the fbdev causing a use-after-free in the hotplug event.

<7>[ 3069.935211] [drm:intel_fb_initial_config [i915]] Not using firmware configuration
<7>[ 3069.935225] [drm:drm_setup_crtcs] looking for cmdline mode on connector 77
<7>[ 3069.935229] [drm:drm_setup_crtcs] looking for preferred mode on connector 77 0
<7>[ 3069.935233] [drm:drm_setup_crtcs] found mode 3200x1800
<7>[ 3069.935236] [drm:drm_setup_crtcs] picking CRTCs for 8192x8192 config
<7>[ 3069.935253] [drm:drm_setup_crtcs] desired mode 3200x1800 set on crtc 43 (0,0)
<7>[ 3069.935323] [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
<4>[ 3069.967737] general protection fault: 0000 [#1] PREEMPT SMP
<0>[ 3069.977453] ---------------------------------
<4>[ 3069.977457] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mei_me mii prime_numbers mei i2c_hid pinctrl_geminilake pinctrl_intel [last unloaded: i915]
<4>[ 3069.977492] CPU: 1 PID: 15414 Comm: kworker/1:0 Tainted: G     U          4.14.0-CI-CI_DRM_3388+ #1
<4>[ 3069.977497] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
<4>[ 3069.977508] Workqueue: events output_poll_execute
<4>[ 3069.977512] task: ffff880177734e40 task.stack: ffffc90001fe4000
<4>[ 3069.977519] RIP: 0010:__lock_acquire+0x109/0x1b60
<4>[ 3069.977523] RSP: 0018:ffffc90001fe7bb0 EFLAGS: 00010002
<4>[ 3069.977526] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000282 RCX: 0000000000000000
<4>[ 3069.977530] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880170d4efd0
<4>[ 3069.977534] RBP: ffffc90001fe7c70 R08: 0000000000000001 R09: 0000000000000000
<4>[ 3069.977538] R10: 0000000000000000 R11: ffffffff81899609 R12: ffff880170d4efd0
<4>[ 3069.977542] R13: ffff880177734e40 R14: 0000000000000001 R15: 0000000000000000
<4>[ 3069.977547] FS:  0000000000000000(0000) GS:ffff88017fc80000(0000) knlGS:0000000000000000
<4>[ 3069.977551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3069.977555] CR2: 00007f7e8b7bcf04 CR3: 0000000003e0f000 CR4: 00000000003406e0
<4>[ 3069.977559] Call Trace:
<4>[ 3069.977565]  ? mark_held_locks+0x64/0x90
<4>[ 3069.977571]  ? _raw_spin_unlock_irq+0x24/0x50
<4>[ 3069.977575]  ? _raw_spin_unlock_irq+0x24/0x50
<4>[ 3069.977579]  ? trace_hardirqs_on_caller+0xde/0x1c0
<4>[ 3069.977583]  ? _raw_spin_unlock_irq+0x2f/0x50
<4>[ 3069.977588]  ? finish_task_switch+0xa5/0x210
<4>[ 3069.977592]  ? lock_acquire+0xaf/0x200
<4>[ 3069.977596]  lock_acquire+0xaf/0x200
<4>[ 3069.977600]  ? __mutex_lock+0x5e9/0x9b0
<4>[ 3069.977604]  _raw_spin_lock+0x2a/0x40
<4>[ 3069.977608]  ? __mutex_lock+0x5e9/0x9b0
<4>[ 3069.977612]  __mutex_lock+0x5e9/0x9b0
<4>[ 3069.977616]  ? drm_fb_helper_hotplug_event.part.19+0x16/0xa0
<4>[ 3069.977621]  ? drm_fb_helper_hotplug_event.part.19+0x16/0xa0
<4>[ 3069.977625]  drm_fb_helper_hotplug_event.part.19+0x16/0xa0
<4>[ 3069.977630]  output_poll_execute+0x8d/0x180
<4>[ 3069.977635]  process_one_work+0x22e/0x660
<4>[ 3069.977640]  worker_thread+0x48/0x3a0
<4>[ 3069.977644]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
<4>[ 3069.977649]  kthread+0x102/0x140
<4>[ 3069.977653]  ? process_one_work+0x660/0x660
<4>[ 3069.977657]  ? kthread_create_on_node+0x40/0x40
<4>[ 3069.977662]  ret_from_fork+0x27/0x40
<4>[ 3069.977666] Code: 8d 62 f8 c3 49 81 3c 24 e0 fa 3c 82 41 be 00 00 00 00 45 0f 45 f0 83 fe 01 77 86 89 f0 49 8b 44 c4 08 48 85 c0 0f 84 76 ff ff ff <f0> ff 80 38 01 00 00 8b 1d 62 f9 e8 01 45 8b 85 b8 08 00 00 85
<1>[ 3069.977707] RIP: __lock_acquire+0x109/0x1b60 RSP: ffffc90001fe7bb0
<4>[ 3069.977712] ---[ end trace 4ad012eb3af62df7 ]---

In order to keep the dev_priv->ifbdev alive after failure, we have to
avoid the free and leave it empty until we unload the module. Then we
can use intel_fbdev_sync() to serialise the hotplug event with the
configuration. The serialisation between the two was removed in commit
934458c2c95d ("Revert "drm/i915: Fix races on fbdev""), but the use
after free is much older, commit 366e39b4d2c5 ("drm/i915: Tear down fbdev
if initialization fails")

Fixes: 366e39b4d2c5 ("drm/i915: Tear down fbdev if initialization fails")
Fixes: 934458c2c95d ("Revert "drm/i915: Fix races on fbdev"")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_fbdev.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index b8af35187d22..ea96682568e8 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -697,10 +697,8 @@  static void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
 
 	/* Due to peculiar init order wrt to hpd handling this is separate. */
 	if (drm_fb_helper_initial_config(&ifbdev->helper,
-					 ifbdev->preferred_bpp)) {
+					 ifbdev->preferred_bpp))
 		intel_fbdev_unregister(to_i915(ifbdev->helper.dev));
-		intel_fbdev_fini(to_i915(ifbdev->helper.dev));
-	}
 }
 
 void intel_fbdev_initial_config_async(struct drm_device *dev)
@@ -800,7 +798,11 @@  void intel_fbdev_output_poll_changed(struct drm_device *dev)
 {
 	struct intel_fbdev *ifbdev = to_i915(dev)->fbdev;
 
-	if (ifbdev)
+	if (!ifbdev)
+		return;
+
+	intel_fbdev_sync(ifbdev);
+	if (ifbdev->vma)
 		drm_fb_helper_hotplug_event(&ifbdev->helper);
 }
 

Comments

On Sat, Nov 25, 2017 at 07:41:55PM +0000, Chris Wilson wrote:
> As both the hotplug event and fbdev configuration run asynchronously, it
> is possible for them to run concurrently. If configuration fails, we were
> freeing the fbdev causing a use-after-free in the hotplug event.

That'll teach me to muck around in this complicated driver. :-)

IIUC, the issue is that ifbdev is briefly non-NULL and the if clause
happens to be executed when it's non-NULL and it becomes NULL upon
or during execution of intel_fbdev_output_poll_changed(), is that
correct?

Wouldn't the proper solution be to set ifbdev only after configuration
was successful, i.e. somewhere at the end of intelfb_create()?
With a memory barrier in case intel_fbdev_output_poll_changed is running
on a different CPU?


> In order to keep the dev_priv->ifbdev alive after failure, we have to
> avoid the free and leave it empty until we unload the module.

Well, that seems to defeat the goal stated in the commit message of
366e39b4d2c5 to free up the memory if fbdev initialization failed.
Not that it's a big deal for me personally, just noting. :-)

Thanks,

Lukas
Quoting Lukas Wunner (2017-11-25 21:03:35)
> On Sat, Nov 25, 2017 at 07:41:55PM +0000, Chris Wilson wrote:
> > As both the hotplug event and fbdev configuration run asynchronously, it
> > is possible for them to run concurrently. If configuration fails, we were
> > freeing the fbdev causing a use-after-free in the hotplug event.
> 
> That'll teach me to muck around in this complicated driver. :-)
> 
> IIUC, the issue is that ifbdev is briefly non-NULL and the if clause
> happens to be executed when it's non-NULL and it becomes NULL upon
> or during execution of intel_fbdev_output_poll_changed(), is that
> correct?

Yes.
 
> Wouldn't the proper solution be to set ifbdev only after configuration
> was successful, i.e. somewhere at the end of intelfb_create()?
> With a memory barrier in case intel_fbdev_output_poll_changed is running
> on a different CPU?

I was looking at if we could easily do that; trying to look at various
methods we could defer the calls back to fbdev until after it was
registered (because that is the crux of the issue imo). I didn't see
anything as easy as using the existing one-off synchronisation.

> > In order to keep the dev_priv->ifbdev alive after failure, we have to
> > avoid the free and leave it empty until we unload the module.
> 
> Well, that seems to defeat the goal stated in the commit message of
> 366e39b4d2c5 to free up the memory if fbdev initialization failed.
> Not that it's a big deal for me personally, just noting. :-)

I know, but it should at least be a small bit of unused allocation, and
we already expect to pay the cost of that allocation for normal use
cases. It may be the straw that breaks the camel's back, hopefully no
one notices in the calamity. What we might do then is just pull the
struct into dev_priv under ifdef FBDEV.
-Chris
On Sat, Nov 25, 2017 at 07:41:55PM +0000, Chris Wilson wrote:
> @@ -697,10 +697,8 @@ static void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
>  
>  	/* Due to peculiar init order wrt to hpd handling this is separate. */
>  	if (drm_fb_helper_initial_config(&ifbdev->helper,
> -					 ifbdev->preferred_bpp)) {
> +					 ifbdev->preferred_bpp))
>  		intel_fbdev_unregister(to_i915(ifbdev->helper.dev));
> -		intel_fbdev_fini(to_i915(ifbdev->helper.dev));
> -	}
>  }

Hm, the race at hand would be solved by the intel_fbdev_sync() below,
or am I missing something?  Still wondering why it's necessary to
leave the fbdev around...


> @@ -800,7 +798,11 @@ void intel_fbdev_output_poll_changed(struct drm_device *dev)
>  {
>  	struct intel_fbdev *ifbdev = to_i915(dev)->fbdev;
>  
> -	if (ifbdev)
> +	if (!ifbdev)
> +		return;
> +
> +	intel_fbdev_sync(ifbdev);
> +	if (ifbdev->vma)
>  		drm_fb_helper_hotplug_event(&ifbdev->helper);
>  }

This hunk looks good, as you note the synchronization was already there
but had to be reverted because I failed to notice that a "+ 1" needs to
be added to the cookie.  You did a much better job than me understanding
how the async API works with 43cee314345a.

However the "if (ifbdev->vma)" looks a bit fishy, ifbdev could be NULL
(e.g. if BIOS fb was too small but intelfb_alloc() failed) so I think
this might lead to a null pointer deref.  Does it make a difference
if we check for ifbdev versus ifbdev->vma?  I also notice that you
added a check for ifbdev->vma with 15727ed0d944 but Daniel later
removed it with 88be58be886f.

I guess a check *is* necessary here because fbdev initialization might
have failed, but I'd just check for "if (ifbdev)".

Thanks & have a pleasant Sunday afternoon.

Lukas
Quoting Lukas Wunner (2017-11-26 11:49:19)
> On Sat, Nov 25, 2017 at 07:41:55PM +0000, Chris Wilson wrote:
> > @@ -697,10 +697,8 @@ static void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
> >  
> >       /* Due to peculiar init order wrt to hpd handling this is separate. */
> >       if (drm_fb_helper_initial_config(&ifbdev->helper,
> > -                                      ifbdev->preferred_bpp)) {
> > +                                      ifbdev->preferred_bpp))
> >               intel_fbdev_unregister(to_i915(ifbdev->helper.dev));
> > -             intel_fbdev_fini(to_i915(ifbdev->helper.dev));
> > -     }
> >  }
> 
> Hm, the race at hand would be solved by the intel_fbdev_sync() below,
> or am I missing something?  Still wondering why it's necessary to
> leave the fbdev around...

The race is solved, but if we do free ifbdev, we can't dereference
ifbdev prior to the sync; and we store the async cookie inside ifbdev.
Bleugh. Catch 22.

> 
> 
> > @@ -800,7 +798,11 @@ void intel_fbdev_output_poll_changed(struct drm_device *dev)
> >  {
> >       struct intel_fbdev *ifbdev = to_i915(dev)->fbdev;
> >  
> > -     if (ifbdev)
> > +     if (!ifbdev)
> > +             return;
> > +
> > +     intel_fbdev_sync(ifbdev);
> > +     if (ifbdev->vma)
> >               drm_fb_helper_hotplug_event(&ifbdev->helper);
> >  }
> 
> This hunk looks good, as you note the synchronization was already there
> but had to be reverted because I failed to notice that a "+ 1" needs to
> be added to the cookie.  You did a much better job than me understanding
> how the async API works with 43cee314345a.
> 
> However the "if (ifbdev->vma)" looks a bit fishy, ifbdev could be NULL
> (e.g. if BIOS fb was too small but intelfb_alloc() failed) so I think
> this might lead to a null pointer deref.  Does it make a difference
> if we check for ifbdev versus ifbdev->vma?  I also notice that you
> added a check for ifbdev->vma with 15727ed0d944 but Daniel later
> removed it with 88be58be886f.

We know that ifbdev is non-NULL and can't become NULL until fini. So
after the sync point, we want to ask the question of whether the config
was successful, for that I used to use ->fb which now replaced by ->vma.
 
> I guess a check *is* necessary here because fbdev initialization might
> have failed, but I'd just check for "if (ifbdev)".

ifbdev = i915->fbdev;
if (ifbdev)
	intel_fbdev_sync(ifbdev);
ifbdev = i915->fbdev;
if (ifbdev)
	drm_fb_helper_hotplug_event(&ifbdev->helper);

See the problem with randomly freeing ifbdev partway through and only
using a fence?
-Chris
On Sat, Nov 25, 2017 at 07:41:55PM +0000, Chris Wilson wrote:
> As both the hotplug event and fbdev configuration run asynchronously, it
> is possible for them to run concurrently. If configuration fails, we were
> freeing the fbdev causing a use-after-free in the hotplug event.
> 
> <7>[ 3069.935211] [drm:intel_fb_initial_config [i915]] Not using firmware configuration
> <7>[ 3069.935225] [drm:drm_setup_crtcs] looking for cmdline mode on connector 77
> <7>[ 3069.935229] [drm:drm_setup_crtcs] looking for preferred mode on connector 77 0
> <7>[ 3069.935233] [drm:drm_setup_crtcs] found mode 3200x1800
> <7>[ 3069.935236] [drm:drm_setup_crtcs] picking CRTCs for 8192x8192 config
> <7>[ 3069.935253] [drm:drm_setup_crtcs] desired mode 3200x1800 set on crtc 43 (0,0)
> <7>[ 3069.935323] [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
> <4>[ 3069.967737] general protection fault: 0000 [#1] PREEMPT SMP
> <0>[ 3069.977453] ---------------------------------
> <4>[ 3069.977457] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mei_me mii prime_numbers mei i2c_hid pinctrl_geminilake pinctrl_intel [last unloaded: i915]
> <4>[ 3069.977492] CPU: 1 PID: 15414 Comm: kworker/1:0 Tainted: G     U          4.14.0-CI-CI_DRM_3388+ #1
> <4>[ 3069.977497] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
> <4>[ 3069.977508] Workqueue: events output_poll_execute
> <4>[ 3069.977512] task: ffff880177734e40 task.stack: ffffc90001fe4000
> <4>[ 3069.977519] RIP: 0010:__lock_acquire+0x109/0x1b60
> <4>[ 3069.977523] RSP: 0018:ffffc90001fe7bb0 EFLAGS: 00010002
> <4>[ 3069.977526] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000282 RCX: 0000000000000000
> <4>[ 3069.977530] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880170d4efd0
> <4>[ 3069.977534] RBP: ffffc90001fe7c70 R08: 0000000000000001 R09: 0000000000000000
> <4>[ 3069.977538] R10: 0000000000000000 R11: ffffffff81899609 R12: ffff880170d4efd0
> <4>[ 3069.977542] R13: ffff880177734e40 R14: 0000000000000001 R15: 0000000000000000
> <4>[ 3069.977547] FS:  0000000000000000(0000) GS:ffff88017fc80000(0000) knlGS:0000000000000000
> <4>[ 3069.977551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4>[ 3069.977555] CR2: 00007f7e8b7bcf04 CR3: 0000000003e0f000 CR4: 00000000003406e0
> <4>[ 3069.977559] Call Trace:
> <4>[ 3069.977565]  ? mark_held_locks+0x64/0x90
> <4>[ 3069.977571]  ? _raw_spin_unlock_irq+0x24/0x50
> <4>[ 3069.977575]  ? _raw_spin_unlock_irq+0x24/0x50
> <4>[ 3069.977579]  ? trace_hardirqs_on_caller+0xde/0x1c0
> <4>[ 3069.977583]  ? _raw_spin_unlock_irq+0x2f/0x50
> <4>[ 3069.977588]  ? finish_task_switch+0xa5/0x210
> <4>[ 3069.977592]  ? lock_acquire+0xaf/0x200
> <4>[ 3069.977596]  lock_acquire+0xaf/0x200
> <4>[ 3069.977600]  ? __mutex_lock+0x5e9/0x9b0
> <4>[ 3069.977604]  _raw_spin_lock+0x2a/0x40
> <4>[ 3069.977608]  ? __mutex_lock+0x5e9/0x9b0
> <4>[ 3069.977612]  __mutex_lock+0x5e9/0x9b0
> <4>[ 3069.977616]  ? drm_fb_helper_hotplug_event.part.19+0x16/0xa0
> <4>[ 3069.977621]  ? drm_fb_helper_hotplug_event.part.19+0x16/0xa0
> <4>[ 3069.977625]  drm_fb_helper_hotplug_event.part.19+0x16/0xa0
> <4>[ 3069.977630]  output_poll_execute+0x8d/0x180
> <4>[ 3069.977635]  process_one_work+0x22e/0x660
> <4>[ 3069.977640]  worker_thread+0x48/0x3a0
> <4>[ 3069.977644]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> <4>[ 3069.977649]  kthread+0x102/0x140
> <4>[ 3069.977653]  ? process_one_work+0x660/0x660
> <4>[ 3069.977657]  ? kthread_create_on_node+0x40/0x40
> <4>[ 3069.977662]  ret_from_fork+0x27/0x40
> <4>[ 3069.977666] Code: 8d 62 f8 c3 49 81 3c 24 e0 fa 3c 82 41 be 00 00 00 00 45 0f 45 f0 83 fe 01 77 86 89 f0 49 8b 44 c4 08 48 85 c0 0f 84 76 ff ff ff <f0> ff 80 38 01 00 00 8b 1d 62 f9 e8 01 45 8b 85 b8 08 00 00 85
> <1>[ 3069.977707] RIP: __lock_acquire+0x109/0x1b60 RSP: ffffc90001fe7bb0
> <4>[ 3069.977712] ---[ end trace 4ad012eb3af62df7 ]---
> 
> In order to keep the dev_priv->ifbdev alive after failure, we have to
> avoid the free and leave it empty until we unload the module. Then we
> can use intel_fbdev_sync() to serialise the hotplug event with the
> configuration. The serialisation between the two was removed in commit
> 934458c2c95d ("Revert "drm/i915: Fix races on fbdev""), but the use
> after free is much older, commit 366e39b4d2c5 ("drm/i915: Tear down fbdev
> if initialization fails")
> 
> Fixes: 366e39b4d2c5 ("drm/i915: Tear down fbdev if initialization fails")
> Fixes: 934458c2c95d ("Revert "drm/i915: Fix races on fbdev"")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: stable@vger.kernel.org

Patch looks correct to me, though keeping the ifbdev around even if it
failed to initialize is regrettable.  FWIW,

Reviewed-by: Lukas Wunner <lukas@wunner.de>

Thanks,

Lukas

> ---
>  drivers/gpu/drm/i915/intel_fbdev.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index b8af35187d22..ea96682568e8 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -697,10 +697,8 @@ static void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
>  
>  	/* Due to peculiar init order wrt to hpd handling this is separate. */
>  	if (drm_fb_helper_initial_config(&ifbdev->helper,
> -					 ifbdev->preferred_bpp)) {
> +					 ifbdev->preferred_bpp))
>  		intel_fbdev_unregister(to_i915(ifbdev->helper.dev));
> -		intel_fbdev_fini(to_i915(ifbdev->helper.dev));
> -	}
>  }
>  
>  void intel_fbdev_initial_config_async(struct drm_device *dev)
> @@ -800,7 +798,11 @@ void intel_fbdev_output_poll_changed(struct drm_device *dev)
>  {
>  	struct intel_fbdev *ifbdev = to_i915(dev)->fbdev;
>  
> -	if (ifbdev)
> +	if (!ifbdev)
> +		return;
> +
> +	intel_fbdev_sync(ifbdev);
> +	if (ifbdev->vma)
>  		drm_fb_helper_hotplug_event(&ifbdev->helper);
>  }
>  
> -- 
> 2.15.0
>
Quoting Lukas Wunner (2017-11-26 12:20:32)
> On Sat, Nov 25, 2017 at 07:41:55PM +0000, Chris Wilson wrote:
> > As both the hotplug event and fbdev configuration run asynchronously, it
> > is possible for them to run concurrently. If configuration fails, we were
> > freeing the fbdev causing a use-after-free in the hotplug event.
> > 
> > <7>[ 3069.935211] [drm:intel_fb_initial_config [i915]] Not using firmware configuration
> > <7>[ 3069.935225] [drm:drm_setup_crtcs] looking for cmdline mode on connector 77
> > <7>[ 3069.935229] [drm:drm_setup_crtcs] looking for preferred mode on connector 77 0
> > <7>[ 3069.935233] [drm:drm_setup_crtcs] found mode 3200x1800
> > <7>[ 3069.935236] [drm:drm_setup_crtcs] picking CRTCs for 8192x8192 config
> > <7>[ 3069.935253] [drm:drm_setup_crtcs] desired mode 3200x1800 set on crtc 43 (0,0)
> > <7>[ 3069.935323] [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
> > <4>[ 3069.967737] general protection fault: 0000 [#1] PREEMPT SMP
> > <0>[ 3069.977453] ---------------------------------
> > <4>[ 3069.977457] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mei_me mii prime_numbers mei i2c_hid pinctrl_geminilake pinctrl_intel [last unloaded: i915]
> > <4>[ 3069.977492] CPU: 1 PID: 15414 Comm: kworker/1:0 Tainted: G     U          4.14.0-CI-CI_DRM_3388+ #1
> > <4>[ 3069.977497] Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
> > <4>[ 3069.977508] Workqueue: events output_poll_execute
> > <4>[ 3069.977512] task: ffff880177734e40 task.stack: ffffc90001fe4000
> > <4>[ 3069.977519] RIP: 0010:__lock_acquire+0x109/0x1b60
> > <4>[ 3069.977523] RSP: 0018:ffffc90001fe7bb0 EFLAGS: 00010002
> > <4>[ 3069.977526] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000282 RCX: 0000000000000000
> > <4>[ 3069.977530] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880170d4efd0
> > <4>[ 3069.977534] RBP: ffffc90001fe7c70 R08: 0000000000000001 R09: 0000000000000000
> > <4>[ 3069.977538] R10: 0000000000000000 R11: ffffffff81899609 R12: ffff880170d4efd0
> > <4>[ 3069.977542] R13: ffff880177734e40 R14: 0000000000000001 R15: 0000000000000000
> > <4>[ 3069.977547] FS:  0000000000000000(0000) GS:ffff88017fc80000(0000) knlGS:0000000000000000
> > <4>[ 3069.977551] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > <4>[ 3069.977555] CR2: 00007f7e8b7bcf04 CR3: 0000000003e0f000 CR4: 00000000003406e0
> > <4>[ 3069.977559] Call Trace:
> > <4>[ 3069.977565]  ? mark_held_locks+0x64/0x90
> > <4>[ 3069.977571]  ? _raw_spin_unlock_irq+0x24/0x50
> > <4>[ 3069.977575]  ? _raw_spin_unlock_irq+0x24/0x50
> > <4>[ 3069.977579]  ? trace_hardirqs_on_caller+0xde/0x1c0
> > <4>[ 3069.977583]  ? _raw_spin_unlock_irq+0x2f/0x50
> > <4>[ 3069.977588]  ? finish_task_switch+0xa5/0x210
> > <4>[ 3069.977592]  ? lock_acquire+0xaf/0x200
> > <4>[ 3069.977596]  lock_acquire+0xaf/0x200
> > <4>[ 3069.977600]  ? __mutex_lock+0x5e9/0x9b0
> > <4>[ 3069.977604]  _raw_spin_lock+0x2a/0x40
> > <4>[ 3069.977608]  ? __mutex_lock+0x5e9/0x9b0
> > <4>[ 3069.977612]  __mutex_lock+0x5e9/0x9b0
> > <4>[ 3069.977616]  ? drm_fb_helper_hotplug_event.part.19+0x16/0xa0
> > <4>[ 3069.977621]  ? drm_fb_helper_hotplug_event.part.19+0x16/0xa0
> > <4>[ 3069.977625]  drm_fb_helper_hotplug_event.part.19+0x16/0xa0
> > <4>[ 3069.977630]  output_poll_execute+0x8d/0x180
> > <4>[ 3069.977635]  process_one_work+0x22e/0x660
> > <4>[ 3069.977640]  worker_thread+0x48/0x3a0
> > <4>[ 3069.977644]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
> > <4>[ 3069.977649]  kthread+0x102/0x140
> > <4>[ 3069.977653]  ? process_one_work+0x660/0x660
> > <4>[ 3069.977657]  ? kthread_create_on_node+0x40/0x40
> > <4>[ 3069.977662]  ret_from_fork+0x27/0x40
> > <4>[ 3069.977666] Code: 8d 62 f8 c3 49 81 3c 24 e0 fa 3c 82 41 be 00 00 00 00 45 0f 45 f0 83 fe 01 77 86 89 f0 49 8b 44 c4 08 48 85 c0 0f 84 76 ff ff ff <f0> ff 80 38 01 00 00 8b 1d 62 f9 e8 01 45 8b 85 b8 08 00 00 85
> > <1>[ 3069.977707] RIP: __lock_acquire+0x109/0x1b60 RSP: ffffc90001fe7bb0
> > <4>[ 3069.977712] ---[ end trace 4ad012eb3af62df7 ]---
> > 
> > In order to keep the dev_priv->ifbdev alive after failure, we have to
> > avoid the free and leave it empty until we unload the module. Then we
> > can use intel_fbdev_sync() to serialise the hotplug event with the
> > configuration. The serialisation between the two was removed in commit
> > 934458c2c95d ("Revert "drm/i915: Fix races on fbdev""), but the use
> > after free is much older, commit 366e39b4d2c5 ("drm/i915: Tear down fbdev
> > if initialization fails")
> > 
> > Fixes: 366e39b4d2c5 ("drm/i915: Tear down fbdev if initialization fails")
> > Fixes: 934458c2c95d ("Revert "drm/i915: Fix races on fbdev"")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Lukas Wunner <lukas@wunner.de>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: stable@vger.kernel.org
> 
> Patch looks correct to me, though keeping the ifbdev around even if it
> failed to initialize is regrettable.

Add a note to say this is decidedly not ideal.

> FWIW,
> 
> Reviewed-by: Lukas Wunner <lukas@wunner.de>

Much appreciated, thanks for taking the time to dig through this code
again.
-Chris