[v3,1/4] drm/vc4: Wait for display list synchronization when completing commit

Submitted by Paul Kocialkowski on Jan. 8, 2019, 2:50 p.m.

Details

Message ID 20190108145056.2276-2-paul.kocialkowski@bootlin.com
State New
Headers show
Series "drm/vc4: Add a load tracker" ( rev: 2 ) in DRI devel

Not browsing as part of any series.

Commit Message

Paul Kocialkowski Jan. 8, 2019, 2:50 p.m.
During an atomic commit, the HVS is configured with a display list
for the channel matching the associated CRTC. The Pixel Valve (CRTC)
and encoder are also configured for the new setup at that time.
While the Pixel Valve and encoder are reconfigured synchronously, the
HVS is only reconfigured after the display list address (DISPLIST) has
been updated to the current display list address (DISPLACTX), which is
the responsibility of the hardware.

The time frame during which the HVS is still running on its previous
configuration but the CRTC and encoder have been reconfigured already
can lead to a number of synchronization issues. They will eventually
cause errors reported on the FIFOs, such as underruns.

With underrun detection enabled (from Boris Brezillon's series), this
leads to unreliable underrun detection with random false positives.

To ensure a coherent state, wait for each enabled channel of the HVS
to synchronize its current display list address. This fixes the issue
of random underrun reporting on commits.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
 drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
 drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
 drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
 4 files changed, 22 insertions(+)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index c24b078f0593..955f157f5ad0 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -772,6 +772,7 @@  void vc4_irq_reset(struct drm_device *dev);
 extern struct platform_driver vc4_hvs_driver;
 void vc4_hvs_dump_state(struct drm_device *dev);
 int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
+void vc4_hvs_sync_dlist(struct drm_device *dev);
 
 /* vc4_kms.c */
 int vc4_kms_load(struct drm_device *dev);
diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 5d8c749c9749..1ba60b8e0c2d 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -166,6 +166,23 @@  static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
 	return 0;
 }
 
+void vc4_hvs_sync_dlist(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
+		if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
+			continue;
+
+		ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
+			       HVS_READ(SCALER_DISPLISTX(i)), 1000);
+		WARN(ret, "Timeout waiting for channel %d display list sync\n",
+		     i);
+	}
+}
+
 static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
 {
 	struct platform_device *pdev = to_platform_device(dev);
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index 0490edb192a1..2d66a2b57a91 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -155,6 +155,8 @@  vc4_atomic_complete_commit(struct drm_atomic_state *state)
 
 	drm_atomic_helper_commit_hw_done(state);
 
+	vc4_hvs_sync_dlist(dev);
+
 	drm_atomic_helper_wait_for_flip_done(dev, state);
 
 	drm_atomic_helper_cleanup_planes(dev, state);
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index 931088014272..50c653309aec 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -212,6 +212,8 @@ 
 
 #define PV_HACT_ACT				0x30
 
+#define SCALER_CHANNELS_COUNT			3
+
 #define SCALER_DISPCTRL                         0x00000000
 /* Global register for clock gating the HVS */
 # define SCALER_DISPCTRL_ENABLE			BIT(31)

Comments

On Tue, Jan 8, 2019 at 3:51 PM Paul Kocialkowski
<paul.kocialkowski@bootlin.com> wrote:
>
> During an atomic commit, the HVS is configured with a display list
> for the channel matching the associated CRTC. The Pixel Valve (CRTC)
> and encoder are also configured for the new setup at that time.
> While the Pixel Valve and encoder are reconfigured synchronously, the
> HVS is only reconfigured after the display list address (DISPLIST) has
> been updated to the current display list address (DISPLACTX), which is
> the responsibility of the hardware.
>
> The time frame during which the HVS is still running on its previous
> configuration but the CRTC and encoder have been reconfigured already
> can lead to a number of synchronization issues. They will eventually
> cause errors reported on the FIFOs, such as underruns.
>
> With underrun detection enabled (from Boris Brezillon's series), this
> leads to unreliable underrun detection with random false positives.
>
> To ensure a coherent state, wait for each enabled channel of the HVS
> to synchronize its current display list address. This fixes the issue
> of random underrun reporting on commits.
>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
>  drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
>  drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
>  4 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index c24b078f0593..955f157f5ad0 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -772,6 +772,7 @@ void vc4_irq_reset(struct drm_device *dev);
>  extern struct platform_driver vc4_hvs_driver;
>  void vc4_hvs_dump_state(struct drm_device *dev);
>  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> +void vc4_hvs_sync_dlist(struct drm_device *dev);
>
>  /* vc4_kms.c */
>  int vc4_kms_load(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> index 5d8c749c9749..1ba60b8e0c2d 100644
> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> @@ -166,6 +166,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
>         return 0;
>  }
>
> +void vc4_hvs_sync_dlist(struct drm_device *dev)
> +{
> +       struct vc4_dev *vc4 = to_vc4_dev(dev);
> +       unsigned int i;
> +       int ret;
> +
> +       for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
> +               if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
> +                       continue;
> +
> +               ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
> +                              HVS_READ(SCALER_DISPLISTX(i)), 1000);
> +               WARN(ret, "Timeout waiting for channel %d display list sync\n",
> +                    i);
> +       }
> +}
> +
>  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>  {
>         struct platform_device *pdev = to_platform_device(dev);
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index 0490edb192a1..2d66a2b57a91 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -155,6 +155,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>
>         drm_atomic_helper_commit_hw_done(state);
>
> +       vc4_hvs_sync_dlist(dev);

From your description I'd have guessed you want this between when you
update the planes and the crtc, so somewhere between commit_planes()
and commit_modeset_enables(). At least I have no idea how waiting here
can prevent underruns, by this point there's no further hw programming
happening. Only exception is if you have an IOMMU which can fault, in
which case the cleanup_planes might remove the buffers prematurely.
But if that's the problem, then your semantics of the flip_done event
are wrong - when flip_done is signalled, the hw must have stopped
scanning out the old planes, since userspace expects to be able to
start overwriting/reusing them.
-Daniel

> +
>         drm_atomic_helper_wait_for_flip_done(dev, state);
>
>         drm_atomic_helper_cleanup_planes(dev, state);
> diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
> index 931088014272..50c653309aec 100644
> --- a/drivers/gpu/drm/vc4/vc4_regs.h
> +++ b/drivers/gpu/drm/vc4/vc4_regs.h
> @@ -212,6 +212,8 @@
>
>  #define PV_HACT_ACT                            0x30
>
> +#define SCALER_CHANNELS_COUNT                  3
> +
>  #define SCALER_DISPCTRL                         0x00000000
>  /* Global register for clock gating the HVS */
>  # define SCALER_DISPCTRL_ENABLE                        BIT(31)
> --
> 2.20.1
>
Hi Daniel,

On Tue, 2019-01-08 at 19:21 +0100, Daniel Vetter wrote:
> On Tue, Jan 8, 2019 at 3:51 PM Paul Kocialkowski
> <paul.kocialkowski@bootlin.com> wrote:
> > During an atomic commit, the HVS is configured with a display list
> > for the channel matching the associated CRTC. The Pixel Valve (CRTC)
> > and encoder are also configured for the new setup at that time.
> > While the Pixel Valve and encoder are reconfigured synchronously, the
> > HVS is only reconfigured after the display list address (DISPLIST) has
> > been updated to the current display list address (DISPLACTX), which is
> > the responsibility of the hardware.
> > 
> > The time frame during which the HVS is still running on its previous
> > configuration but the CRTC and encoder have been reconfigured already
> > can lead to a number of synchronization issues. They will eventually
> > cause errors reported on the FIFOs, such as underruns.
> > 
> > With underrun detection enabled (from Boris Brezillon's series), this
> > leads to unreliable underrun detection with random false positives.
> > 
> > To ensure a coherent state, wait for each enabled channel of the HVS
> > to synchronize its current display list address. This fixes the issue
> > of random underrun reporting on commits.
> > 
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> >  drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
> >  drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
> >  drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
> >  drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
> >  4 files changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> > index c24b078f0593..955f157f5ad0 100644
> > --- a/drivers/gpu/drm/vc4/vc4_drv.h
> > +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> > @@ -772,6 +772,7 @@ void vc4_irq_reset(struct drm_device *dev);
> >  extern struct platform_driver vc4_hvs_driver;
> >  void vc4_hvs_dump_state(struct drm_device *dev);
> >  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> > +void vc4_hvs_sync_dlist(struct drm_device *dev);
> > 
> >  /* vc4_kms.c */
> >  int vc4_kms_load(struct drm_device *dev);
> > diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> > index 5d8c749c9749..1ba60b8e0c2d 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> > @@ -166,6 +166,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
> >         return 0;
> >  }
> > 
> > +void vc4_hvs_sync_dlist(struct drm_device *dev)
> > +{
> > +       struct vc4_dev *vc4 = to_vc4_dev(dev);
> > +       unsigned int i;
> > +       int ret;
> > +
> > +       for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
> > +               if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
> > +                       continue;
> > +
> > +               ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
> > +                              HVS_READ(SCALER_DISPLISTX(i)), 1000);
> > +               WARN(ret, "Timeout waiting for channel %d display list sync\n",
> > +                    i);
> > +       }
> > +}
> > +
> >  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
> >  {
> >         struct platform_device *pdev = to_platform_device(dev);
> > diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> > index 0490edb192a1..2d66a2b57a91 100644
> > --- a/drivers/gpu/drm/vc4/vc4_kms.c
> > +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> > @@ -155,6 +155,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
> > 
> >         drm_atomic_helper_commit_hw_done(state);
> > 
> > +       vc4_hvs_sync_dlist(dev);
> 
> From your description I'd have guessed you want this between when you
> update the planes and the crtc, so somewhere between commit_planes()
> and commit_modeset_enables(). At least I have no idea how waiting here
> can prevent underruns, by this point there's no further hw programming
> happening.

One thing that I did not mention is that the display list (that
configures the planes) is only set at crtc_enable time (and taken into
account by the hardware later).

However, even calling vc4_hvs_sync_dlist right at the end of
crtc_enable doesn't do either (the old display list just sticks). It
only seems to work after the HDMI encoder enable step and I don't know
any good reason why.

I didn't find any description of when that dlist sync mechanism is
supposed to take place and what particular event triggers it. Perhaps
it is triggered by a signal originating from the encoder? If anyone has
insight on the hardware, feel free to shed some light here :)

Cheers and thanks for the review,

Paul

> Only exception is if you have an IOMMU which can fault, in
> which case the cleanup_planes might remove the buffers prematurely.
> But if that's the problem, then your semantics of the flip_done event
> are wrong - when flip_done is signalled, the hw must have stopped
> scanning out the old planes, since userspace expects to be able to
> start overwriting/reusing them.
> -Daniel
> 
> > +
> >         drm_atomic_helper_wait_for_flip_done(dev, state);
> > 
> >         drm_atomic_helper_cleanup_planes(dev, state);
> > diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
> > index 931088014272..50c653309aec 100644
> > --- a/drivers/gpu/drm/vc4/vc4_regs.h
> > +++ b/drivers/gpu/drm/vc4/vc4_regs.h
> > @@ -212,6 +212,8 @@
> > 
> >  #define PV_HACT_ACT                            0x30
> > 
> > +#define SCALER_CHANNELS_COUNT                  3
> > +
> >  #define SCALER_DISPCTRL                         0x00000000
> >  /* Global register for clock gating the HVS */
> >  # define SCALER_DISPCTRL_ENABLE                        BIT(31)
> > --
> > 2.20.1
> > 
> 
>
On Wed, Jan 09, 2019 at 05:52:20PM +0100, Paul Kocialkowski wrote:
> Hi Daniel,
> 
> On Tue, 2019-01-08 at 19:21 +0100, Daniel Vetter wrote:
> > On Tue, Jan 8, 2019 at 3:51 PM Paul Kocialkowski
> > <paul.kocialkowski@bootlin.com> wrote:
> > > During an atomic commit, the HVS is configured with a display list
> > > for the channel matching the associated CRTC. The Pixel Valve (CRTC)
> > > and encoder are also configured for the new setup at that time.
> > > While the Pixel Valve and encoder are reconfigured synchronously, the
> > > HVS is only reconfigured after the display list address (DISPLIST) has
> > > been updated to the current display list address (DISPLACTX), which is
> > > the responsibility of the hardware.
> > > 
> > > The time frame during which the HVS is still running on its previous
> > > configuration but the CRTC and encoder have been reconfigured already
> > > can lead to a number of synchronization issues. They will eventually
> > > cause errors reported on the FIFOs, such as underruns.
> > > 
> > > With underrun detection enabled (from Boris Brezillon's series), this
> > > leads to unreliable underrun detection with random false positives.
> > > 
> > > To ensure a coherent state, wait for each enabled channel of the HVS
> > > to synchronize its current display list address. This fixes the issue
> > > of random underrun reporting on commits.
> > > 
> > > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > ---
> > >  drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
> > >  drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
> > >  drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
> > >  drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
> > >  4 files changed, 22 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> > > index c24b078f0593..955f157f5ad0 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_drv.h
> > > +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> > > @@ -772,6 +772,7 @@ void vc4_irq_reset(struct drm_device *dev);
> > >  extern struct platform_driver vc4_hvs_driver;
> > >  void vc4_hvs_dump_state(struct drm_device *dev);
> > >  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> > > +void vc4_hvs_sync_dlist(struct drm_device *dev);
> > > 
> > >  /* vc4_kms.c */
> > >  int vc4_kms_load(struct drm_device *dev);
> > > diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> > > index 5d8c749c9749..1ba60b8e0c2d 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> > > @@ -166,6 +166,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
> > >         return 0;
> > >  }
> > > 
> > > +void vc4_hvs_sync_dlist(struct drm_device *dev)
> > > +{
> > > +       struct vc4_dev *vc4 = to_vc4_dev(dev);
> > > +       unsigned int i;
> > > +       int ret;
> > > +
> > > +       for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
> > > +               if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
> > > +                       continue;
> > > +
> > > +               ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
> > > +                              HVS_READ(SCALER_DISPLISTX(i)), 1000);
> > > +               WARN(ret, "Timeout waiting for channel %d display list sync\n",
> > > +                    i);
> > > +       }
> > > +}
> > > +
> > >  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
> > >  {
> > >         struct platform_device *pdev = to_platform_device(dev);
> > > diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> > > index 0490edb192a1..2d66a2b57a91 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_kms.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> > > @@ -155,6 +155,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
> > > 
> > >         drm_atomic_helper_commit_hw_done(state);
> > > 
> > > +       vc4_hvs_sync_dlist(dev);
> > 
> > From your description I'd have guessed you want this between when you
> > update the planes and the crtc, so somewhere between commit_planes()
> > and commit_modeset_enables(). At least I have no idea how waiting here
> > can prevent underruns, by this point there's no further hw programming
> > happening.
> 
> One thing that I did not mention is that the display list (that
> configures the planes) is only set at crtc_enable time (and taken into
> account by the hardware later).
> 
> However, even calling vc4_hvs_sync_dlist right at the end of
> crtc_enable doesn't do either (the old display list just sticks). It
> only seems to work after the HDMI encoder enable step and I don't know
> any good reason why.
> 
> I didn't find any description of when that dlist sync mechanism is
> supposed to take place and what particular event triggers it. Perhaps
> it is triggered by a signal originating from the encoder? If anyone has
> insight on the hardware, feel free to shed some light here :)

Maybe my concern wasn't clear: I have no idea why you need this exactly
and how your hw works. Only thing I meant to highlight is that since all
you're doing is wait a bit, then the only reason I can come up with why
that wait does anything is cleanup_planes() later on. And if that's the
case, then you also need to sufficiently delay the flip_done signalling to
userspace (i.e. sending out the crtc_state->event vblank event).

But I'm really not understanding what the hw does and how your patch here
helps at all. It just looked really strange from a atomic kms pov.
-Daniel

> 
> Cheers and thanks for the review,
> 
> Paul
> 
> > Only exception is if you have an IOMMU which can fault, in
> > which case the cleanup_planes might remove the buffers prematurely.
> > But if that's the problem, then your semantics of the flip_done event
> > are wrong - when flip_done is signalled, the hw must have stopped
> > scanning out the old planes, since userspace expects to be able to
> > start overwriting/reusing them.
> > -Daniel
> > 
> > > +
> > >         drm_atomic_helper_wait_for_flip_done(dev, state);
> > > 
> > >         drm_atomic_helper_cleanup_planes(dev, state);
> > > diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
> > > index 931088014272..50c653309aec 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_regs.h
> > > +++ b/drivers/gpu/drm/vc4/vc4_regs.h
> > > @@ -212,6 +212,8 @@
> > > 
> > >  #define PV_HACT_ACT                            0x30
> > > 
> > > +#define SCALER_CHANNELS_COUNT                  3
> > > +
> > >  #define SCALER_DISPCTRL                         0x00000000
> > >  /* Global register for clock gating the HVS */
> > >  # define SCALER_DISPCTRL_ENABLE                        BIT(31)
> > > --
> > > 2.20.1
> > > 
> > 
> > 
> -- 
> Paul Kocialkowski, Bootlin
> Embedded Linux and kernel engineering
> https://bootlin.com
>
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> During an atomic commit, the HVS is configured with a display list
> for the channel matching the associated CRTC. The Pixel Valve (CRTC)
> and encoder are also configured for the new setup at that time.
> While the Pixel Valve and encoder are reconfigured synchronously, the
> HVS is only reconfigured after the display list address (DISPLIST) has
> been updated to the current display list address (DISPLACTX), which is
> the responsibility of the hardware.
>
> The time frame during which the HVS is still running on its previous
> configuration but the CRTC and encoder have been reconfigured already
> can lead to a number of synchronization issues. They will eventually
> cause errors reported on the FIFOs, such as underruns.
>
> With underrun detection enabled (from Boris Brezillon's series), this
> leads to unreliable underrun detection with random false positives.
>
> To ensure a coherent state, wait for each enabled channel of the HVS
> to synchronize its current display list address. This fixes the issue
> of random underrun reporting on commits.
>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
>  drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
>  drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
>  drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
>  4 files changed, 22 insertions(+)
>
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index c24b078f0593..955f157f5ad0 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -772,6 +772,7 @@ void vc4_irq_reset(struct drm_device *dev);
>  extern struct platform_driver vc4_hvs_driver;
>  void vc4_hvs_dump_state(struct drm_device *dev);
>  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> +void vc4_hvs_sync_dlist(struct drm_device *dev);
>  
>  /* vc4_kms.c */
>  int vc4_kms_load(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> index 5d8c749c9749..1ba60b8e0c2d 100644
> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> @@ -166,6 +166,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
>  	return 0;
>  }
>  
> +void vc4_hvs_sync_dlist(struct drm_device *dev)
> +{
> +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> +	unsigned int i;
> +	int ret;
> +
> +	for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
> +		if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
> +			continue;
> +
> +		ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
> +			       HVS_READ(SCALER_DISPLISTX(i)), 1000);
> +		WARN(ret, "Timeout waiting for channel %d display list sync\n",
> +		     i);
> +	}
> +}
> +
>  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>  {
>  	struct platform_device *pdev = to_platform_device(dev);
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index 0490edb192a1..2d66a2b57a91 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -155,6 +155,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  
>  	drm_atomic_helper_commit_hw_done(state);
>  
> +	vc4_hvs_sync_dlist(dev);
> +
>  	drm_atomic_helper_wait_for_flip_done(dev, state);

wait_for_flip_done should already be waiting for DISPLACTX to match our
crtc's display list, though, right (see vc4_crtc_handle_page_flip())?
This seems like a no-op to me.
Hi,

On Wed, 2019-01-23 at 10:34 -0800, Eric Anholt wrote:
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> 
> > During an atomic commit, the HVS is configured with a display list
> > for the channel matching the associated CRTC. The Pixel Valve (CRTC)
> > and encoder are also configured for the new setup at that time.
> > While the Pixel Valve and encoder are reconfigured synchronously, the
> > HVS is only reconfigured after the display list address (DISPLIST) has
> > been updated to the current display list address (DISPLACTX), which is
> > the responsibility of the hardware.
> > 
> > The time frame during which the HVS is still running on its previous
> > configuration but the CRTC and encoder have been reconfigured already
> > can lead to a number of synchronization issues. They will eventually
> > cause errors reported on the FIFOs, such as underruns.
> > 
> > With underrun detection enabled (from Boris Brezillon's series), this
> > leads to unreliable underrun detection with random false positives.
> > 
> > To ensure a coherent state, wait for each enabled channel of the HVS
> > to synchronize its current display list address. This fixes the issue
> > of random underrun reporting on commits.
> > 
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> >  drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
> >  drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
> >  drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
> >  drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
> >  4 files changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> > index c24b078f0593..955f157f5ad0 100644
> > --- a/drivers/gpu/drm/vc4/vc4_drv.h
> > +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> > @@ -772,6 +772,7 @@ void vc4_irq_reset(struct drm_device *dev);
> >  extern struct platform_driver vc4_hvs_driver;
> >  void vc4_hvs_dump_state(struct drm_device *dev);
> >  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
> > +void vc4_hvs_sync_dlist(struct drm_device *dev);
> >  
> >  /* vc4_kms.c */
> >  int vc4_kms_load(struct drm_device *dev);
> > diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> > index 5d8c749c9749..1ba60b8e0c2d 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> > @@ -166,6 +166,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
> >  	return 0;
> >  }
> >  
> > +void vc4_hvs_sync_dlist(struct drm_device *dev)
> > +{
> > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
> > +	unsigned int i;
> > +	int ret;
> > +
> > +	for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
> > +		if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
> > +			continue;
> > +
> > +		ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
> > +			       HVS_READ(SCALER_DISPLISTX(i)), 1000);
> > +		WARN(ret, "Timeout waiting for channel %d display list sync\n",
> > +		     i);
> > +	}
> > +}
> > +
> >  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
> >  {
> >  	struct platform_device *pdev = to_platform_device(dev);
> > diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> > index 0490edb192a1..2d66a2b57a91 100644
> > --- a/drivers/gpu/drm/vc4/vc4_kms.c
> > +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> > @@ -155,6 +155,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
> >  
> >  	drm_atomic_helper_commit_hw_done(state);
> >  
> > +	vc4_hvs_sync_dlist(dev);
> > +
> >  	drm_atomic_helper_wait_for_flip_done(dev, state);
> 
> wait_for_flip_done should already be waiting for DISPLACTX to match our
> crtc's display list, though, right (see vc4_crtc_handle_page_flip())?
> This seems like a no-op to me.

Right, at this stage it's pretty much a no-op. It does make a
difference when bringing-in vc4_hvs_unmask_underrun, because this
vc4_hvs_sync_dlist call comes before it.

When the display lists are not yet synchronized (and differ), an
underrun occurs so it will be reported although it's not related to the
new display list configuration. Waiting for the display lists to sync
before unmasking the interrupt prevents that.

I think waiting for the page flip to unmask the underrun interrupt
would be too late because the HVS should have finished processing the
new dlist by then (but maybe I'm mistaken about that), so we would miss
the underrun indication.

Cheers,

Paul
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> Hi,
>
> On Wed, 2019-01-23 at 10:34 -0800, Eric Anholt wrote:
>> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
>> 
>> > During an atomic commit, the HVS is configured with a display list
>> > for the channel matching the associated CRTC. The Pixel Valve (CRTC)
>> > and encoder are also configured for the new setup at that time.
>> > While the Pixel Valve and encoder are reconfigured synchronously, the
>> > HVS is only reconfigured after the display list address (DISPLIST) has
>> > been updated to the current display list address (DISPLACTX), which is
>> > the responsibility of the hardware.
>> > 
>> > The time frame during which the HVS is still running on its previous
>> > configuration but the CRTC and encoder have been reconfigured already
>> > can lead to a number of synchronization issues. They will eventually
>> > cause errors reported on the FIFOs, such as underruns.
>> > 
>> > With underrun detection enabled (from Boris Brezillon's series), this
>> > leads to unreliable underrun detection with random false positives.
>> > 
>> > To ensure a coherent state, wait for each enabled channel of the HVS
>> > to synchronize its current display list address. This fixes the issue
>> > of random underrun reporting on commits.
>> > 
>> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
>> > ---
>> >  drivers/gpu/drm/vc4/vc4_drv.h  |  1 +
>> >  drivers/gpu/drm/vc4/vc4_hvs.c  | 17 +++++++++++++++++
>> >  drivers/gpu/drm/vc4/vc4_kms.c  |  2 ++
>> >  drivers/gpu/drm/vc4/vc4_regs.h |  2 ++
>> >  4 files changed, 22 insertions(+)
>> > 
>> > diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
>> > index c24b078f0593..955f157f5ad0 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_drv.h
>> > +++ b/drivers/gpu/drm/vc4/vc4_drv.h
>> > @@ -772,6 +772,7 @@ void vc4_irq_reset(struct drm_device *dev);
>> >  extern struct platform_driver vc4_hvs_driver;
>> >  void vc4_hvs_dump_state(struct drm_device *dev);
>> >  int vc4_hvs_debugfs_regs(struct seq_file *m, void *unused);
>> > +void vc4_hvs_sync_dlist(struct drm_device *dev);
>> >  
>> >  /* vc4_kms.c */
>> >  int vc4_kms_load(struct drm_device *dev);
>> > diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
>> > index 5d8c749c9749..1ba60b8e0c2d 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_hvs.c
>> > +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
>> > @@ -166,6 +166,23 @@ static int vc4_hvs_upload_linear_kernel(struct vc4_hvs *hvs,
>> >  	return 0;
>> >  }
>> >  
>> > +void vc4_hvs_sync_dlist(struct drm_device *dev)
>> > +{
>> > +	struct vc4_dev *vc4 = to_vc4_dev(dev);
>> > +	unsigned int i;
>> > +	int ret;
>> > +
>> > +	for (i = 0; i < SCALER_CHANNELS_COUNT; i++) {
>> > +		if (!(HVS_READ(SCALER_DISPCTRLX(i)) & SCALER_DISPCTRLX_ENABLE))
>> > +			continue;
>> > +
>> > +		ret = wait_for(HVS_READ(SCALER_DISPLACTX(i)) ==
>> > +			       HVS_READ(SCALER_DISPLISTX(i)), 1000);
>> > +		WARN(ret, "Timeout waiting for channel %d display list sync\n",
>> > +		     i);
>> > +	}
>> > +}
>> > +
>> >  static int vc4_hvs_bind(struct device *dev, struct device *master, void *data)
>> >  {
>> >  	struct platform_device *pdev = to_platform_device(dev);
>> > diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
>> > index 0490edb192a1..2d66a2b57a91 100644
>> > --- a/drivers/gpu/drm/vc4/vc4_kms.c
>> > +++ b/drivers/gpu/drm/vc4/vc4_kms.c
>> > @@ -155,6 +155,8 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>> >  
>> >  	drm_atomic_helper_commit_hw_done(state);
>> >  
>> > +	vc4_hvs_sync_dlist(dev);
>> > +
>> >  	drm_atomic_helper_wait_for_flip_done(dev, state);
>> 
>> wait_for_flip_done should already be waiting for DISPLACTX to match our
>> crtc's display list, though, right (see vc4_crtc_handle_page_flip())?
>> This seems like a no-op to me.
>
> Right, at this stage it's pretty much a no-op. It does make a
> difference when bringing-in vc4_hvs_unmask_underrun, because this
> vc4_hvs_sync_dlist call comes before it.
>
> When the display lists are not yet synchronized (and differ), an
> underrun occurs so it will be reported although it's not related to the
> new display list configuration. Waiting for the display lists to sync
> before unmasking the interrupt prevents that.
>
> I think waiting for the page flip to unmask the underrun interrupt
> would be too late because the HVS should have finished processing the
> new dlist by then (but maybe I'm mistaken about that), so we would miss
> the underrun indication.

For load tracking I'm mainly concerned about making sure that we don't
have long-standing underruns -- if the first frame after a modeset is a
little wonky, you probably won't notice it.

Getting a clean modeset sequence would be really nice to have,
just... don't block on that.