[4/5] drm/vmwgfx: remove custom ioctl io encoding check

Submitted by Emil Velikov on May 22, 2019, 4:41 p.m.

Details

Message ID 20190522164119.24139-4-emil.l.velikov@gmail.com
State New
Headers show
Series "Series without cover letter" ( rev: 1 ) in DRI devel

Not browsing as part of any series.

Commit Message

Emil Velikov May 22, 2019, 4:41 p.m.
From: Emil Velikov <emil.velikov@collabora.com>

Drop the custom ioctl io encoding check - core drm does it for us.

Cc: "VMware Graphics" <linux-graphics-maintainer@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 9 ---------
 1 file changed, 9 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 2cb6ae219e43..f65542639b55 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1147,9 +1147,6 @@  static long vmw_generic_ioctl(struct file *filp, unsigned int cmd,
 				return -EACCES;
 		}
 
-		if (unlikely(ioctl->cmd != cmd))
-			goto out_io_encoding;
-
 		flags = ioctl->flags;
 	} else if (!drm_ioctl_flags(nr, &flags))
 		return -EINVAL;
@@ -1169,12 +1166,6 @@  static long vmw_generic_ioctl(struct file *filp, unsigned int cmd,
 		ttm_read_unlock(&vmaster->lock);
 
 	return ret;
-
-out_io_encoding:
-	DRM_ERROR("Invalid command format, ioctl %d\n",
-		  nr - DRM_COMMAND_BASE);
-
-	return -EINVAL;
 }
 
 static long vmw_unlocked_ioctl(struct file *filp, unsigned int cmd,

Comments

Hi, Emil,

On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> From: Emil Velikov <emil.velikov@collabora.com>

> 

> Drop the custom ioctl io encoding check - core drm does it for us.


I fail to see where the core does this, or do I miss something?
Thanks,
Thomas


> 

> Cc: "VMware Graphics" <linux-graphics-maintainer@vmware.com>

> Cc: Thomas Hellstrom <thellstrom@vmware.com>

> Cc: Daniel Vetter <daniel@ffwll.ch>

> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

> ---

>  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 9 ---------

>  1 file changed, 9 deletions(-)

> 

> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c

> b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c

> index 2cb6ae219e43..f65542639b55 100644

> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c

> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c

> @@ -1147,9 +1147,6 @@ static long vmw_generic_ioctl(struct file

> *filp, unsigned int cmd,

>  				return -EACCES;

>  		}

>  

> -		if (unlikely(ioctl->cmd != cmd))

> -			goto out_io_encoding;

> -

>  		flags = ioctl->flags;

>  	} else if (!drm_ioctl_flags(nr, &flags))

>  		return -EINVAL;

> @@ -1169,12 +1166,6 @@ static long vmw_generic_ioctl(struct file

> *filp, unsigned int cmd,

>  		ttm_read_unlock(&vmaster->lock);

>  

>  	return ret;

> -

> -out_io_encoding:

> -	DRM_ERROR("Invalid command format, ioctl %d\n",

> -		  nr - DRM_COMMAND_BASE);

> -

> -	return -EINVAL;

>  }

>  

>  static long vmw_unlocked_ioctl(struct file *filp, unsigned int cmd,
On 2019/05/23, Thomas Hellstrom wrote:
> Hi, Emil,
> 
> On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > From: Emil Velikov <emil.velikov@collabora.com>
> > 
> > Drop the custom ioctl io encoding check - core drm does it for us.
> 
> I fail to see where the core does this, or do I miss something?

drm_ioctl() allows for the encoding to be changed and attributes that only the
appropriate size is copied in/out of the kernel.

Technically the function is more relaxed relative to the vmwgfx check, yet
seems perfectly reasonable.

Is there any corner-case that isn't but should be handled in drm_ioctl()?

-Emil
On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> On 2019/05/23, Thomas Hellstrom wrote:

> > Hi, Emil,

> > 

> > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:

> > > From: Emil Velikov <emil.velikov@collabora.com>

> > > 

> > > Drop the custom ioctl io encoding check - core drm does it for

> > > us.

> > 

> > I fail to see where the core does this, or do I miss something?

> 

> drm_ioctl() allows for the encoding to be changed and attributes that

> only the

> appropriate size is copied in/out of the kernel.

> 

> Technically the function is more relaxed relative to the vmwgfx

> check, yet

> seems perfectly reasonable.

> 

> Is there any corner-case that isn't but should be handled in

> drm_ioctl()?


I'd like to turn the question around and ask whether there's a reason
we should relax the vmwgfx test? In the past it has trapped quite a few
user-space errors.

Thanks,
Thomas



> 

> -Emil
On 2019/05/24, Thomas Hellstrom wrote:
> On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> > On 2019/05/23, Thomas Hellstrom wrote:
> > > Hi, Emil,
> > > 
> > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > > From: Emil Velikov <emil.velikov@collabora.com>
> > > > 
> > > > Drop the custom ioctl io encoding check - core drm does it for
> > > > us.
> > > 
> > > I fail to see where the core does this, or do I miss something?
> > 
> > drm_ioctl() allows for the encoding to be changed and attributes that
> > only the
> > appropriate size is copied in/out of the kernel.
> > 
> > Technically the function is more relaxed relative to the vmwgfx
> > check, yet
> > seems perfectly reasonable.
> > 
> > Is there any corner-case that isn't but should be handled in
> > drm_ioctl()?
> 
> I'd like to turn the question around and ask whether there's a reason
> we should relax the vmwgfx test? In the past it has trapped quite a few
> user-space errors.
> 
The way I see it either:
 - the check, as-is, is unnessesary, or
 - it is needed, and we should do something equivalent for all of DRM

We had a very long brainstorming session with a colleague and we could not see
any cases where this would cause a problem. If you recall anything concrete
please let me know - I would be more than happy to take a closer look.

Thanks
Emil
Hi, Emil

On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:
> On 2019/05/24, Thomas Hellstrom wrote:

> > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:

> > > On 2019/05/23, Thomas Hellstrom wrote:

> > > > Hi, Emil,

> > > > 

> > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:

> > > > > From: Emil Velikov <emil.velikov@collabora.com>

> > > > > 

> > > > > Drop the custom ioctl io encoding check - core drm does it

> > > > > for

> > > > > us.

> > > > 

> > > > I fail to see where the core does this, or do I miss something?

> > > 

> > > drm_ioctl() allows for the encoding to be changed and attributes

> > > that

> > > only the

> > > appropriate size is copied in/out of the kernel.

> > > 

> > > Technically the function is more relaxed relative to the vmwgfx

> > > check, yet

> > > seems perfectly reasonable.

> > > 

> > > Is there any corner-case that isn't but should be handled in

> > > drm_ioctl()?

> > 

> > I'd like to turn the question around and ask whether there's a

> > reason

> > we should relax the vmwgfx test? In the past it has trapped quite a

> > few

> > user-space errors.

> > 

> The way I see it either:

>  - the check, as-is, is unnessesary, or

>  - it is needed, and we should do something equivalent for all of DRM

> 

> We had a very long brainstorming session with a colleague and we

> could not see

> any cases where this would cause a problem. If you recall anything

> concrete

> please let me know - I would be more than happy to take a closer

> look.


If you have a good reason to drop an ioctl sanity check, I'd be
perfectly happy to do it. To me, a good reason even includes "I have a
non-open-source customer having problems with this check" because of
reason etc. etc. as long as I have a way to evaluate those reasons and
determine if they're valid or not. "No other drm driver nor the core is
doing this" is NOT a valid reason to me. In particular if the check is
not affecting performance. So unless you provide additional reasons to
drop this check, it's a solid NAK from my side.

Thanks,
Thomas


> 

> Thanks

> Emil
On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote:
> Hi, Emil

> 

> On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:

> > On 2019/05/24, Thomas Hellstrom wrote:

> > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:

> > > > On 2019/05/23, Thomas Hellstrom wrote:

> > > > > Hi, Emil,

> > > > > 

> > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:

> > > > > > From: Emil Velikov <emil.velikov@collabora.com>

> > > > > > 

> > > > > > Drop the custom ioctl io encoding check - core drm does it

> > > > > > for

> > > > > > us.

> > > > > 

> > > > > I fail to see where the core does this, or do I miss

> > > > > something?

> > > > 

> > > > drm_ioctl() allows for the encoding to be changed and

> > > > attributes

> > > > that

> > > > only the

> > > > appropriate size is copied in/out of the kernel.

> > > > 

> > > > Technically the function is more relaxed relative to the vmwgfx

> > > > check, yet

> > > > seems perfectly reasonable.

> > > > 

> > > > Is there any corner-case that isn't but should be handled in

> > > > drm_ioctl()?

> > > 

> > > I'd like to turn the question around and ask whether there's a

> > > reason

> > > we should relax the vmwgfx test? In the past it has trapped quite

> > > a

> > > few

> > > user-space errors.

> > > 

> > The way I see it either:

> >  - the check, as-is, is unnessesary, or

> >  - it is needed, and we should do something equivalent for all of

> > DRM

> > 

> > We had a very long brainstorming session with a colleague and we

> > could not see

> > any cases where this would cause a problem. If you recall anything

> > concrete

> > please let me know - I would be more than happy to take a closer

> > look.

> 

> If you have a good reason to drop an ioctl sanity check, I'd be

> perfectly happy to do it. To me, a good reason even includes "I have

> a

> non-open-source customer having problems with this check" because of

> reason etc. etc. as long as I have a way to evaluate those reasons

> and

> determine if they're valid or not. "No other drm driver nor the core

> is

> doing this" is NOT a valid reason to me. In particular if the check

> is

> not affecting performance. So unless you provide additional reasons

> to

> drop this check, it's a solid NAK from my side.


To clarify my point of view a bit, this check is useful to early catch
userspace using incorrect flags and sizes, which otherwise might make
it out to distros and after that, introducing a check like this would
be impossible, since it might break old user-space. For the same reason
it would probably be very difficult to introduce it in core drm. 

Thanks,
Thomas



> 

> Thanks,

> Thomas

> 

> 

> > Thanks

> > Emil
On 2019/05/25, Thomas Hellstrom wrote:
> On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote:
> > Hi, Emil
> > 
> > On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:
> > > On 2019/05/24, Thomas Hellstrom wrote:
> > > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:
> > > > > On 2019/05/23, Thomas Hellstrom wrote:
> > > > > > Hi, Emil,
> > > > > > 
> > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:
> > > > > > > From: Emil Velikov <emil.velikov@collabora.com>
> > > > > > > 
> > > > > > > Drop the custom ioctl io encoding check - core drm does it
> > > > > > > for
> > > > > > > us.
> > > > > > 
> > > > > > I fail to see where the core does this, or do I miss
> > > > > > something?
> > > > > 
> > > > > drm_ioctl() allows for the encoding to be changed and
> > > > > attributes
> > > > > that
> > > > > only the
> > > > > appropriate size is copied in/out of the kernel.
> > > > > 
> > > > > Technically the function is more relaxed relative to the vmwgfx
> > > > > check, yet
> > > > > seems perfectly reasonable.
> > > > > 
> > > > > Is there any corner-case that isn't but should be handled in
> > > > > drm_ioctl()?
> > > > 
> > > > I'd like to turn the question around and ask whether there's a
> > > > reason
> > > > we should relax the vmwgfx test? In the past it has trapped quite
> > > > a
> > > > few
> > > > user-space errors.
> > > > 
> > > The way I see it either:
> > >  - the check, as-is, is unnessesary, or
> > >  - it is needed, and we should do something equivalent for all of
> > > DRM
> > > 
> > > We had a very long brainstorming session with a colleague and we
> > > could not see
> > > any cases where this would cause a problem. If you recall anything
> > > concrete
> > > please let me know - I would be more than happy to take a closer
> > > look.
> > 
> > If you have a good reason to drop an ioctl sanity check, I'd be
> > perfectly happy to do it. To me, a good reason even includes "I have
> > a
> > non-open-source customer having problems with this check" because of
> > reason etc. etc. as long as I have a way to evaluate those reasons
> > and
> > determine if they're valid or not. "No other drm driver nor the core
> > is
> > doing this" is NOT a valid reason to me. In particular if the check
> > is
> > not affecting performance. So unless you provide additional reasons
> > to
> > drop this check, it's a solid NAK from my side.
> 
> To clarify my point of view a bit, this check is useful to early catch
> userspace using incorrect flags and sizes, which otherwise might make
> it out to distros and after that, introducing a check like this would
> be impossible, since it might break old user-space. For the same reason
> it would probably be very difficult to introduce it in core drm. 
> 
I think we might be talking past each other, let's take a step back:

 - as of previous patch, all of vmwgfx ioctls size is consistently
handled by the core
 - handling of featue flags, as always, is responsibility of the driver
ifself
 - with this patch, ioctl direction is also handled by core.

Here core ensures we only copy in/out as much data as the kernel
implementation can handle.


Let's consider the following real world example - msm and virtio_gpu.

An in field of an _IOW ioctl becomes in/out aka _IORW ioctl.
 - we add a flag to annotate/request the out, as always invalid flags
are return -EINVAL
 - we change the ioctl encoding

As currently handled by core DRM, old kernel/new userspace and
vice-versa works just fine. Sadly, vmwgfx will error out, while it could
be avoided.

As said above, I'll gladly adjust core and/or others, if this relaxed
approach causes an issue somewhere. A specific use-case, real or
hypothetical will be appreciated.


All this is part of an "evil" plan of mine, to port cool features from
vmwgfx to core and effectively remove the vmw_generic_ioctl() wrapper.


Hope the bigger picture is clearer now, if not please let me know.

Thanks
Emil
Hi, Emil,

On Mon, 2019-05-27 at 10:08 +0100, Emil Velikov wrote:
> On 2019/05/25, Thomas Hellstrom wrote:

> > On Sat, 2019-05-25 at 00:39 +0200, Thomas Hellström wrote:

> > > Hi, Emil

> > > 

> > > On Fri, 2019-05-24 at 16:26 +0100, Emil Velikov wrote:

> > > > On 2019/05/24, Thomas Hellstrom wrote:

> > > > > On Fri, 2019-05-24 at 13:14 +0100, Emil Velikov wrote:

> > > > > > On 2019/05/23, Thomas Hellstrom wrote:

> > > > > > > Hi, Emil,

> > > > > > > 

> > > > > > > On Wed, 2019-05-22 at 17:41 +0100, Emil Velikov wrote:

> > > > > > > > From: Emil Velikov <emil.velikov@collabora.com>

> > > > > > > > 

> > > > > > > > Drop the custom ioctl io encoding check - core drm does

> > > > > > > > it

> > > > > > > > for

> > > > > > > > us.

> > > > > > > 

> > > > > > > I fail to see where the core does this, or do I miss

> > > > > > > something?

> > > > > > 

> > > > > > drm_ioctl() allows for the encoding to be changed and

> > > > > > attributes

> > > > > > that

> > > > > > only the

> > > > > > appropriate size is copied in/out of the kernel.

> > > > > > 

> > > > > > Technically the function is more relaxed relative to the

> > > > > > vmwgfx

> > > > > > check, yet

> > > > > > seems perfectly reasonable.

> > > > > > 

> > > > > > Is there any corner-case that isn't but should be handled

> > > > > > in

> > > > > > drm_ioctl()?

> > > > > 

> > > > > I'd like to turn the question around and ask whether there's

> > > > > a

> > > > > reason

> > > > > we should relax the vmwgfx test? In the past it has trapped

> > > > > quite

> > > > > a

> > > > > few

> > > > > user-space errors.

> > > > > 

> > > > The way I see it either:

> > > >  - the check, as-is, is unnessesary, or

> > > >  - it is needed, and we should do something equivalent for all

> > > > of

> > > > DRM

> > > > 

> > > > We had a very long brainstorming session with a colleague and

> > > > we

> > > > could not see

> > > > any cases where this would cause a problem. If you recall

> > > > anything

> > > > concrete

> > > > please let me know - I would be more than happy to take a

> > > > closer

> > > > look.

> > > 

> > > If you have a good reason to drop an ioctl sanity check, I'd be

> > > perfectly happy to do it. To me, a good reason even includes "I

> > > have

> > > a

> > > non-open-source customer having problems with this check" because

> > > of

> > > reason etc. etc. as long as I have a way to evaluate those

> > > reasons

> > > and

> > > determine if they're valid or not. "No other drm driver nor the

> > > core

> > > is

> > > doing this" is NOT a valid reason to me. In particular if the

> > > check

> > > is

> > > not affecting performance. So unless you provide additional

> > > reasons

> > > to

> > > drop this check, it's a solid NAK from my side.

> > 

> > To clarify my point of view a bit, this check is useful to early

> > catch

> > userspace using incorrect flags and sizes, which otherwise might

> > make

> > it out to distros and after that, introducing a check like this

> > would

> > be impossible, since it might break old user-space. For the same

> > reason

> > it would probably be very difficult to introduce it in core drm. 

> > 

> I think we might be talking past each other, let's take a step back:

> 

>  - as of previous patch, all of vmwgfx ioctls size is consistently

> handled by the core


I don't think I follow you here, AFAICT patch 3/5 only affects and
relaxes the execbuf checking (and in fact a little more than I would
like)?

>  - handling of featue flags, as always, is responsibility of the

> driver

> ifself

>  - with this patch, ioctl direction is also handled by core.

> 

> Here core ensures we only copy in/out as much data as the kernel

> implementation can handle.

> 

> 

> Let's consider the following real world example - msm and virtio_gpu.

> 

> An in field of an _IOW ioctl becomes in/out aka _IORW ioctl.

>  - we add a flag to annotate/request the out, as always invalid flags

> are return -EINVAL

>  - we change the ioctl encoding

> 

> As currently handled by core DRM, old kernel/new userspace and

> vice-versa works just fine. Sadly, vmwgfx will error out, while it

> could

> be avoided.


IMO basically we have a tradeoff between strict checking in this case,
and new user-space vs old kernel "hazzle-free" transition in the
relaxed case. 

> 

> As said above, I'll gladly adjust core and/or others, if this relaxed

> approach causes an issue somewhere. A specific use-case, real or

> hypothetical will be appreciated.


To me there are two important reasons to keep the strict approach.

1) Avoid user-space mistakes early in the development cycle. We can't
distinguish between buggy user-space and "new" user-space. This is
important because of [a]) below.

2) Catch a lot of fuzzer combinations and error out early instead of
forwarding them to the ioctl function where they may cause harm.

I think the new user-space vs old kernel can be handled nicely in user-
space with feature flags or API versions. That's the way we've handled
them up to now?

[a] But you probably can't move the stricter approach to core drm, or
even to core drm ioctls, since that may break old user-space we might
not even know about. Relaxing this now may put the vmwgfx-specific
ioctls in the same situation. I'd like to have this check also in core
drm, but alas, I think it's impossible.

> 

> 

> All this is part of an "evil" plan of mine, to port cool features

> from

> vmwgfx to core and effectively remove the vmw_generic_ioctl()

> wrapper.


I'm not completely sure I understand what role the vmw_generic_ioctl()
wrapper plays in this? I mean, the advantage of a helper approach
instead of a middle layer approach is that you are free to use the
helpers and amend them if desirable. If everybody is forced to use
exactly the same helpers, then we're basically back at the middle layer
approach?

/Thomas

> 

> 

> Hope the bigger picture is clearer now, if not please let me know.

> 

> Thanks

> Emil
Hi Thomas,

On 2019/05/27, Thomas Hellstrom wrote:

> > I think we might be talking past each other, let's take a step back:
> > 
> >  - as of previous patch, all of vmwgfx ioctls size is consistently
> > handled by the core
> 
> I don't think I follow you here, AFAICT patch 3/5 only affects and
> relaxes the execbuf checking (and in fact a little more than I would
> like)?
> 
Precisely, it makes execbuf ioctl behave like all other ioctls - both
vmwgfx and rest of drm.

> >  - handling of featue flags, as always, is responsibility of the
> > driver
> > ifself
> >  - with this patch, ioctl direction is also handled by core.
> > 
> > Here core ensures we only copy in/out as much data as the kernel
> > implementation can handle.
> > 
> > 
> > Let's consider the following real world example - msm and virtio_gpu.
> > 
> > An in field of an _IOW ioctl becomes in/out aka _IORW ioctl.
> >  - we add a flag to annotate/request the out, as always invalid flags
> > are return -EINVAL
> >  - we change the ioctl encoding
> > 
> > As currently handled by core DRM, old kernel/new userspace and
> > vice-versa works just fine. Sadly, vmwgfx will error out, while it
> > could
> > be avoided.
> 
> IMO basically we have a tradeoff between strict checking in this case,
> and new user-space vs old kernel "hazzle-free" transition in the
> relaxed case. 
> 
Precisely. If I read the code correctly, ATM new userspace will fail
against old kernels. Unless userspace writes two versions of the ioctl -
with with each encoding.

> > 
> > As said above, I'll gladly adjust core and/or others, if this relaxed
> > approach causes an issue somewhere. A specific use-case, real or
> > hypothetical will be appreciated.
> 
> To me there are two important reasons to keep the strict approach.
> 
> 1) Avoid user-space mistakes early in the development cycle. We can't
> distinguish between buggy user-space and "new" user-space. This is
> important because of [a]) below.
> 
Can you provide a concrete example, please?

> 2) Catch a lot of fuzzer combinations and error out early instead of
> forwarding them to the ioctl function where they may cause harm.
> 
Struggling to see why this is a problem? At some point the fuzzer will
get past this first line of defence, so we want to make the rest of the
ioctl is robust.


> I think the new user-space vs old kernel can be handled nicely in user-
> space with feature flags or API versions. That's the way we've handled
> them up to now?
> 
How is a feature flag doing to help if the encoding changes from _IOW
to _IORW?


Thanks
Emil
On 5/27/19 2:35 PM, Emil Velikov wrote:
> Hi Thomas,
>
> On 2019/05/27, Thomas Hellstrom wrote:
>
>>> I think we might be talking past each other, let's take a step back:
>>>
>>>   - as of previous patch, all of vmwgfx ioctls size is consistently
>>> handled by the core
>> I don't think I follow you here, AFAICT patch 3/5 only affects and
>> relaxes the execbuf checking (and in fact a little more than I would
>> like)?
>>
> Precisely, it makes execbuf ioctl behave like all other ioctls - both
> vmwgfx and rest of drm.

But we're still enforcing a non-relaxed size check for the other vmwgfx 
private ioctls, right? Which is relaxed, together with the directions, 
in this commit?

(Not that it matters much to the discussion, though).

>
>>>   - handling of featue flags, as always, is responsibility of the
>>> driver
>>> ifself
>>>   - with this patch, ioctl direction is also handled by core.
>>>
>>> Here core ensures we only copy in/out as much data as the kernel
>>> implementation can handle.
>>>
>>>
>>> Let's consider the following real world example - msm and virtio_gpu.
>>>
>>> An in field of an _IOW ioctl becomes in/out aka _IORW ioctl.
>>>   - we add a flag to annotate/request the out, as always invalid flags
>>> are return -EINVAL
>>>   - we change the ioctl encoding
>>>
>>> As currently handled by core DRM, old kernel/new userspace and
>>> vice-versa works just fine. Sadly, vmwgfx will error out, while it
>>> could
>>> be avoided.
>> IMO basically we have a tradeoff between strict checking in this case,
>> and new user-space vs old kernel "hazzle-free" transition in the
>> relaxed case.
>>
> Precisely. If I read the code correctly, ATM new userspace will fail
> against old kernels. Unless userspace writes two versions of the ioctl -
> with with each encoding.
>
>>> As said above, I'll gladly adjust core and/or others, if this relaxed
>>> approach causes an issue somewhere. A specific use-case, real or
>>> hypothetical will be appreciated.
>> To me there are two important reasons to keep the strict approach.
>>
>> 1) Avoid user-space mistakes early in the development cycle. We can't
>> distinguish between buggy user-space and "new" user-space. This is
>> important because of [a]) below.
>>
> Can you provide a concrete example, please?

OK, let's say you were developing fence wait functionality. Like 
vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the 
wait never timed out as it should. The reason turn out to be that 
signals were restarting the waits with the original timeout. So you 
change the ioctl from W to RW and add a kernel-computed time to the 
argument. Everything is fine, except that you forget to change this in a 
user-space application somewhere.

So now what happens, is that that user-space bug can live on undetected 
as in 1), and that means you can never go back and implement a stricter 
check because that would completely break old user-space.

The current code will trap (and has historically trapped) code like 
this. That's mainly why I'm reluctant to give it up, but I guess it can 
be conditionally compiled in for debug purposes.

>
>> 2) Catch a lot of fuzzer combinations and error out early instead of
>> forwarding them to the ioctl function where they may cause harm.
>>
> Struggling to see why this is a problem? At some point the fuzzer will
> get past this first line of defence, so we want to make the rest of the
> ioctl is robust.
>
>
>> I think the new user-space vs old kernel can be handled nicely in user-
>> space with feature flags or API versions. That's the way we've handled
>> them up to now?
>>
> How is a feature flag doing to help if the encoding changes from _IOW
> to _IORW?

Ah, you're referring to old user-space new kernel? Yes, I was probably 
reading a bit too fast. Sorry about that.

So we're basically landing in a tradeoff between trapping problems like 
above, and hazzle-free ioctl argument definition change.

OK, so I'm ok with that as long as there is a way we can compile in 
strict checking, which will likely has to be as a vmwgfx-specific wrapper.

/Thomas


>
>
> Thanks
> Emil
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 2019/05/27, Thomas Hellstrom wrote:
> On 5/27/19 2:35 PM, Emil Velikov wrote:
> > Hi Thomas,
> > 
> > On 2019/05/27, Thomas Hellstrom wrote:
> > 
> > > > I think we might be talking past each other, let's take a step back:
> > > > 
> > > >   - as of previous patch, all of vmwgfx ioctls size is consistently
> > > > handled by the core
> > > I don't think I follow you here, AFAICT patch 3/5 only affects and
> > > relaxes the execbuf checking (and in fact a little more than I would
> > > like)?
> > > 
> > Precisely, it makes execbuf ioctl behave like all other ioctls - both
> > vmwgfx and rest of drm.
> 
> But we're still enforcing a non-relaxed size check for the other vmwgfx
> private ioctls, right? Which is relaxed, together with the directions, in
> this commit?
> 
Regardless of the patch, all !execbuf vmwgfx ioctls use the related size
checking from core drm.

> (Not that it matters much to the discussion, though).
> 
Agreed.

> > 
> > > >   - handling of featue flags, as always, is responsibility of the
> > > > driver
> > > > ifself
> > > >   - with this patch, ioctl direction is also handled by core.
> > > > 
> > > > Here core ensures we only copy in/out as much data as the kernel
> > > > implementation can handle.
> > > > 
> > > > 
> > > > Let's consider the following real world example - msm and virtio_gpu.
> > > > 
> > > > An in field of an _IOW ioctl becomes in/out aka _IORW ioctl.
> > > >   - we add a flag to annotate/request the out, as always invalid flags
> > > > are return -EINVAL
> > > >   - we change the ioctl encoding
> > > > 
> > > > As currently handled by core DRM, old kernel/new userspace and
> > > > vice-versa works just fine. Sadly, vmwgfx will error out, while it
> > > > could
> > > > be avoided.
> > > IMO basically we have a tradeoff between strict checking in this case,
> > > and new user-space vs old kernel "hazzle-free" transition in the
> > > relaxed case.
> > > 
> > Precisely. If I read the code correctly, ATM new userspace will fail
> > against old kernels. Unless userspace writes two versions of the ioctl -
> > with with each encoding.
> > 
> > > > As said above, I'll gladly adjust core and/or others, if this relaxed
> > > > approach causes an issue somewhere. A specific use-case, real or
> > > > hypothetical will be appreciated.
> > > To me there are two important reasons to keep the strict approach.
> > > 
> > > 1) Avoid user-space mistakes early in the development cycle. We can't
> > > distinguish between buggy user-space and "new" user-space. This is
> > > important because of [a]) below.
> > > 
> > Can you provide a concrete example, please?
> 
> OK, let's say you were developing fence wait functionality. Like
> vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait
> never timed out as it should. The reason turn out to be that signals were
> restarting the waits with the original timeout. So you change the ioctl from
> W to RW and add a kernel-computed time to the argument. Everything is fine,
> except that you forget to change this in a user-space application somewhere.
> 
> So now what happens, is that that user-space bug can live on undetected as
> in 1), and that means you can never go back and implement a stricter check
> because that would completely break old user-space.
> 
If I understand you correctly, the W -> RW change in unnecessary. Yet
the only negative effect that I can see is the copy_to_user() overhead.

The copy should be negligible, yet it "feels" silly.

Is there anything more serious that I've missed?


Having a closer look - vmwgfx (et al) seems to stand out, such that it
does not provide a UABI define including the encoding. Hence it sort of
duplicates that in userspace, by using the explicit drmCommand*

Guess I could follow the suggestion in vmwgfx_drv.c move the defines to
UABI, sync header and update mesa/xf86-video-vmwgfx.

What do you think - yes, or please don't?

> The current code will trap (and has historically trapped) code like this.
> That's mainly why I'm reluctant to give it up, but I guess it can be
> conditionally compiled in for debug purposes.
> 
This piece here, is the holly grail. I'll go further and suggest:

 - add a strict encoding and size check, behind a config toggle
 - make it a core drm thing and drop the custom vmwgfx one

Will keep it disabled by default - but will clearly document Kconfig and
docs that devs should toggle it to catch bugs.

> > 
> > > 2) Catch a lot of fuzzer combinations and error out early instead of
> > > forwarding them to the ioctl function where they may cause harm.
> > > 
> > Struggling to see why this is a problem? At some point the fuzzer will
> > get past this first line of defence, so we want to make the rest of the
> > ioctl is robust.
> > 
> > 
> > > I think the new user-space vs old kernel can be handled nicely in user-
> > > space with feature flags or API versions. That's the way we've handled
> > > them up to now?
> > > 
> > How is a feature flag doing to help if the encoding changes from _IOW
> > to _IORW?
> 
> Ah, you're referring to old user-space new kernel? Yes, I was probably
> reading a bit too fast. Sorry about that.
> 
> So we're basically landing in a tradeoff between trapping problems like
> above, and hazzle-free ioctl argument definition change.
> 
> OK, so I'm ok with that as long as there is a way we can compile in strict
> checking, which will likely has to be as a vmwgfx-specific wrapper.
> 
Ack, I'll proceed with the debug toggle suggestion.

Thank you for the insightful input.
Emil
On 5/27/19 5:27 PM, Emil Velikov wrote:
> On 2019/05/27, Thomas Hellstrom wrote:
>> On 5/27/19 2:35 PM, Emil Velikov wrote:
>>> Hi Thomas,
>>>
>>> On 2019/05/27, Thomas Hellstrom wrote:
>>>
>>>>> I think we might be talking past each other, let's take a step back:
>>>>>
>>>>>    - as of previous patch, all of vmwgfx ioctls size is consistently
>>>>> handled by the core
>>>> I don't think I follow you here, AFAICT patch 3/5 only affects and
>>>> relaxes the execbuf checking (and in fact a little more than I would
>>>> like)?
>>>>
>>> Precisely, it makes execbuf ioctl behave like all other ioctls - both
>>> vmwgfx and rest of drm.
>> But we're still enforcing a non-relaxed size check for the other vmwgfx
>> private ioctls, right? Which is relaxed, together with the directions, in
>> this commit?
>>
> Regardless of the patch, all !execbuf vmwgfx ioctls use the related size
> checking from core drm.

Well it does, but since we (before this patch) enforce ioctl->cmd == 
cmd, we also enforce
_IOC_SIZE(ioctl->cmd) == _IOC_SIZE(cmd), which makes the core check 
pointless, or am I missing something?

>
>> (Not that it matters much to the discussion, though).
>>
> Agreed.
>
>>>
...
>>> Can you provide a concrete example, please?
>> OK, let's say you were developing fence wait functionality. Like
>> vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait
>> never timed out as it should. The reason turn out to be that signals were
>> restarting the waits with the original timeout. So you change the ioctl from
>> W to RW and add a kernel-computed time to the argument. Everything is fine,
>> except that you forget to change this in a user-space application somewhere.
>>
>> So now what happens, is that that user-space bug can live on undetected as
>> in 1), and that means you can never go back and implement a stricter check
>> because that would completely break old user-space.
>>
> If I understand you correctly, the W -> RW change in unnecessary. Yet
> the only negative effect that I can see is the copy_to_user() overhead.
>
> The copy should be negligible, yet it "feels" silly.
>
> Is there anything more serious that I've missed?

Well the point is in this case, that the write was necessary, but the 
code would work sort of OK anyway. It updated a kernel "cookie" to make 
sure the timeout would be correct even with the next call repetition. 
Now if an old header was floating around, there might be clients using 
it. And with the current core checks that typically wouldn't get 
noticed. With the check we'd immediately notice and abort. It feels a 
little like moving from ANSI C to K&R... :-)

>
>
> Having a closer look - vmwgfx (et al) seems to stand out, such that it
> does not provide a UABI define including the encoding. Hence it sort of
> duplicates that in userspace, by using the explicit drmCommand*
>
> Guess I could follow the suggestion in vmwgfx_drv.c move the defines to
> UABI, sync header and update mesa/xf86-video-vmwgfx.
>
> What do you think - yes, or please don't?

Please hold on for a while, and I'll discuss it internally.

>
>> The current code will trap (and has historically trapped) code like this.
>> That's mainly why I'm reluctant to give it up, but I guess it can be
>> conditionally compiled in for debug purposes.
>>
> This piece here, is the holly grail. I'll go further and suggest:
>
>   - add a strict encoding and size check, behind a config toggle
>   - make it a core drm thing and drop the custom vmwgfx one
>
> Will keep it disabled by default - but will clearly document Kconfig and
> docs that devs should toggle it to catch bugs.

Sounds good, but IIRC the reason why I kept it only to driver-private 
ioctls, is that there were errors with the drm ioctls. But that was a 
long time ago so I might remember incorrectly, or user-space has been fixed.

>
>>>> 2) Catch a lot of fuzzer combinations and error out early instead of
>>>> forwarding them to the ioctl function where they may cause harm.
>>>>
>>> Struggling to see why this is a problem? At some point the fuzzer will
>>> get past this first line of defence, so we want to make the rest of the
>>> ioctl is robust.
>>>
>>>
>>>> I think the new user-space vs old kernel can be handled nicely in user-
>>>> space with feature flags or API versions. That's the way we've handled
>>>> them up to now?
>>>>
>>> How is a feature flag doing to help if the encoding changes from _IOW
>>> to _IORW?
>> Ah, you're referring to old user-space new kernel? Yes, I was probably
>> reading a bit too fast. Sorry about that.
>>
>> So we're basically landing in a tradeoff between trapping problems like
>> above, and hazzle-free ioctl argument definition change.
>>
>> OK, so I'm ok with that as long as there is a way we can compile in strict
>> checking, which will likely has to be as a vmwgfx-specific wrapper.
>>
> Ack, I'll proceed with the debug toggle suggestion.

Great.


>
> Thank you for the insightful input.
> Emil

Thanks,

Thomas
On 2019/05/27, Thomas Hellstrom wrote:
> On 5/27/19 5:27 PM, Emil Velikov wrote:
> > On 2019/05/27, Thomas Hellstrom wrote:
> > > On 5/27/19 2:35 PM, Emil Velikov wrote:
> > > > Hi Thomas,
> > > > 
> > > > On 2019/05/27, Thomas Hellstrom wrote:
> > > > 
> > > > > > I think we might be talking past each other, let's take a step back:
> > > > > > 
> > > > > >    - as of previous patch, all of vmwgfx ioctls size is consistently
> > > > > > handled by the core
> > > > > I don't think I follow you here, AFAICT patch 3/5 only affects and
> > > > > relaxes the execbuf checking (and in fact a little more than I would
> > > > > like)?
> > > > > 
> > > > Precisely, it makes execbuf ioctl behave like all other ioctls - both
> > > > vmwgfx and rest of drm.
> > > But we're still enforcing a non-relaxed size check for the other vmwgfx
> > > private ioctls, right? Which is relaxed, together with the directions, in
> > > this commit?
> > > 
> > Regardless of the patch, all !execbuf vmwgfx ioctls use the related size
> > checking from core drm.
> 
> Well it does, but since we (before this patch) enforce ioctl->cmd == cmd, we
> also enforce
> _IOC_SIZE(ioctl->cmd) == _IOC_SIZE(cmd), which makes the core check
> pointless, or am I missing something?
> 
You're spot on - I never looked at the _IOC_SIZE declaration. I was
assuming some other magic.


> > 
> > > (Not that it matters much to the discussion, though).
> > > 
> > Agreed.
> > 
> > > > 
> ...
> > > > Can you provide a concrete example, please?
> > > OK, let's say you were developing fence wait functionality. Like
> > > vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait
> > > never timed out as it should. The reason turn out to be that signals were
> > > restarting the waits with the original timeout. So you change the ioctl from
> > > W to RW and add a kernel-computed time to the argument. Everything is fine,
> > > except that you forget to change this in a user-space application somewhere.
> > > 
> > > So now what happens, is that that user-space bug can live on undetected as
> > > in 1), and that means you can never go back and implement a stricter check
> > > because that would completely break old user-space.
> > > 
> > If I understand you correctly, the W -> RW change in unnecessary. Yet
> > the only negative effect that I can see is the copy_to_user() overhead.
> > 
> > The copy should be negligible, yet it "feels" silly.
> > 
> > Is there anything more serious that I've missed?
> 
> Well the point is in this case, that the write was necessary, but the code
> would work sort of OK anyway. It updated a kernel "cookie" to make sure the
> timeout would be correct even with the next call repetition. Now if an old
> header was floating around, there might be clients using it. And with the
> current core checks that typically wouldn't get noticed. With the check we'd
> immediately notice and abort. It feels a little like moving from ANSI C to
> K&R... :-)
> 
Technically, the kernel (or any function really) should write output
arguments only when needed. Agree though - it's sometimes annoying or
inconvenient.

For the ANSI C to K&R comment - sure, only if one cares about backward
compat. If they don't - flip the config toggle and carry on ;-)

> > 
> > 
> > Having a closer look - vmwgfx (et al) seems to stand out, such that it
> > does not provide a UABI define including the encoding. Hence it sort of
> > duplicates that in userspace, by using the explicit drmCommand*
> > 
> > Guess I could follow the suggestion in vmwgfx_drv.c move the defines to
> > UABI, sync header and update mesa/xf86-video-vmwgfx.
> > 
> > What do you think - yes, or please don't?
> 
> Please hold on for a while, and I'll discuss it internally.
> 
Ack.

> > 
> > > The current code will trap (and has historically trapped) code like this.
> > > That's mainly why I'm reluctant to give it up, but I guess it can be
> > > conditionally compiled in for debug purposes.
> > > 
> > This piece here, is the holly grail. I'll go further and suggest:
> > 
> >   - add a strict encoding and size check, behind a config toggle
> >   - make it a core drm thing and drop the custom vmwgfx one
> > 
> > Will keep it disabled by default - but will clearly document Kconfig and
> > docs that devs should toggle it to catch bugs.
> 
> Sounds good, but IIRC the reason why I kept it only to driver-private
> ioctls, is that there were errors with the drm ioctls. But that was a long
> time ago so I might remember incorrectly, or user-space has been fixed.
> 
Good point - will have a quick look and send patches if needed.

Thanks
Emil