[v3,1/4] fs, net: Standardize on file_receive helper to move fds across processes

Submitted by Sargun Dhillon on June 3, 2020, 1:10 a.m.

Details

Message ID 20200603011044.7972-2-sargun@sargun.me
State New
Series "Add seccomp notifier ioctl that enables adding fds"
Headers show

Commit Message

Sargun Dhillon June 3, 2020, 1:10 a.m.
Previously there were two chunks of code where the logic to receive file
descriptors was duplicated in net. The compat version of copying
file descriptors via SCM_RIGHTS did not have logic to update cgroups.
Logic to change the cgroup data was added in:
commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")

This was not copied to the compat path. This commit fixes that, and thus
should be cherry-picked into stable.

This introduces a helper (file_receive) which encapsulates the logic for
handling calling security hooks as well as manipulating cgroup information.
This helper can then be used other places in the kernel where file
descriptors are copied between processes

I tested cgroup classid setting on both the compat (x32) path, and the
native path to ensure that when moving the file descriptor the classid
is set.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jann Horn <jannh@google.com>,
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tycho Andersen <tycho@tycho.ws>
Cc: stable@vger.kernel.org
Cc: cgroups@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
 include/linux/file.h |  1 +
 net/compat.c         | 10 +++++-----
 net/core/scm.c       | 14 ++++----------
 4 files changed, 45 insertions(+), 15 deletions(-)

Patch hide | download patch | download mbox

diff --git a/fs/file.c b/fs/file.c
index abb8b7081d7a..5afd76fca8c2 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -18,6 +18,9 @@ 
 #include <linux/bitops.h>
 #include <linux/spinlock.h>
 #include <linux/rcupdate.h>
+#include <net/sock.h>
+#include <net/netprio_cgroup.h>
+#include <net/cls_cgroup.h>
 
 unsigned int sysctl_nr_open __read_mostly = 1024*1024;
 unsigned int sysctl_nr_open_min = BITS_PER_LONG;
@@ -931,6 +934,38 @@  int replace_fd(unsigned fd, struct file *file, unsigned flags)
 	return err;
 }
 
+/*
+ * File Receive - Receive a file from another process
+ *
+ * This function is designed to receive files from other tasks. It encapsulates
+ * logic around security and cgroups. The file descriptor provided must be a
+ * freshly allocated (unused) file descriptor.
+ *
+ * This helper does not consume a reference to the file, so the caller must put
+ * their reference.
+ *
+ * Returns 0 upon success.
+ */
+int file_receive(int fd, struct file *file)
+{
+	struct socket *sock;
+	int err;
+
+	err = security_file_receive(file);
+	if (err)
+		return err;
+
+	fd_install(fd, get_file(file));
+
+	sock = sock_from_file(file, &err);
+	if (sock) {
+		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
+		sock_update_classid(&sock->sk->sk_cgrp_data);
+	}
+
+	return 0;
+}
+
 static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
 {
 	int err = -EBADF;
diff --git a/include/linux/file.h b/include/linux/file.h
index 142d102f285e..7b56dc23e560 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -94,4 +94,5 @@  extern void fd_install(unsigned int fd, struct file *file);
 extern void flush_delayed_fput(void);
 extern void __fput_sync(struct file *);
 
+extern int file_receive(int fd, struct file *file);
 #endif /* __LINUX_FILE_H */
diff --git a/net/compat.c b/net/compat.c
index 4bed96e84d9a..8ac0e7e09208 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -293,9 +293,6 @@  void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
 
 	for (i = 0, cmfptr = (int __user *) CMSG_COMPAT_DATA(cm); i < fdmax; i++, cmfptr++) {
 		int new_fd;
-		err = security_file_receive(fp[i]);
-		if (err)
-			break;
 		err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & kmsg->msg_flags
 					  ? O_CLOEXEC : 0);
 		if (err < 0)
@@ -306,8 +303,11 @@  void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
 			put_unused_fd(new_fd);
 			break;
 		}
-		/* Bump the usage count and install the file. */
-		fd_install(new_fd, get_file(fp[i]));
+		err = file_receive(new_fd, fp[i]);
+		if (err) {
+			put_unused_fd(new_fd);
+			break;
+		}
 	}
 
 	if (i > 0) {
diff --git a/net/core/scm.c b/net/core/scm.c
index dc6fed1f221c..ba93abf2881b 100644
--- a/net/core/scm.c
+++ b/net/core/scm.c
@@ -303,11 +303,7 @@  void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
 	for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); i<fdmax;
 	     i++, cmfptr++)
 	{
-		struct socket *sock;
 		int new_fd;
-		err = security_file_receive(fp[i]);
-		if (err)
-			break;
 		err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & msg->msg_flags
 					  ? O_CLOEXEC : 0);
 		if (err < 0)
@@ -318,13 +314,11 @@  void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
 			put_unused_fd(new_fd);
 			break;
 		}
-		/* Bump the usage count and install the file. */
-		sock = sock_from_file(fp[i], &err);
-		if (sock) {
-			sock_update_netprioidx(&sock->sk->sk_cgrp_data);
-			sock_update_classid(&sock->sk->sk_cgrp_data);
+		err = file_receive(new_fd, fp[i]);
+		if (err) {
+			put_unused_fd(new_fd);
+			break;
 		}
-		fd_install(new_fd, get_file(fp[i]));
 	}
 
 	if (i > 0)

Comments

Christian Brauner June 4, 2020, 1:24 a.m.
On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> Previously there were two chunks of code where the logic to receive file
> descriptors was duplicated in net. The compat version of copying
> file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> Logic to change the cgroup data was added in:
> commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> 
> This was not copied to the compat path. This commit fixes that, and thus
> should be cherry-picked into stable.
> 
> This introduces a helper (file_receive) which encapsulates the logic for
> handling calling security hooks as well as manipulating cgroup information.
> This helper can then be used other places in the kernel where file
> descriptors are copied between processes
> 
> I tested cgroup classid setting on both the compat (x32) path, and the
> native path to ensure that when moving the file descriptor the classid
> is set.
> 
> Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> Suggested-by: Kees Cook <keescook@chromium.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Jann Horn <jannh@google.com>,
> Cc: John Fastabend <john.r.fastabend@intel.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Tycho Andersen <tycho@tycho.ws>
> Cc: stable@vger.kernel.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
>  include/linux/file.h |  1 +
>  net/compat.c         | 10 +++++-----
>  net/core/scm.c       | 14 ++++----------
>  4 files changed, 45 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/file.c b/fs/file.c
> index abb8b7081d7a..5afd76fca8c2 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -18,6 +18,9 @@
>  #include <linux/bitops.h>
>  #include <linux/spinlock.h>
>  #include <linux/rcupdate.h>
> +#include <net/sock.h>
> +#include <net/netprio_cgroup.h>
> +#include <net/cls_cgroup.h>
>  
>  unsigned int sysctl_nr_open __read_mostly = 1024*1024;
>  unsigned int sysctl_nr_open_min = BITS_PER_LONG;
> @@ -931,6 +934,38 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
>  	return err;
>  }
>  
> +/*
> + * File Receive - Receive a file from another process
> + *
> + * This function is designed to receive files from other tasks. It encapsulates
> + * logic around security and cgroups. The file descriptor provided must be a
> + * freshly allocated (unused) file descriptor.
> + *
> + * This helper does not consume a reference to the file, so the caller must put
> + * their reference.
> + *
> + * Returns 0 upon success.
> + */
> +int file_receive(int fd, struct file *file)

This is all just a remote version of fd_install(), yet it deviates from
fd_install()'s semantics and naming. That's not great imho. What about
naming this something like:

fd_install_received()

and move the get_file() out of there so it has the same semantics as
fd_install(). It seems rather dangerous to have a function like
fd_install() that consumes a reference once it returned and another
version of this that is basically the same thing but doesn't consume a
reference because it takes its own. Seems an invitation for confusion.
Does that make sense?

> +{
> +	struct socket *sock;
> +	int err;
> +
> +	err = security_file_receive(file);
> +	if (err)
> +		return err;
> +
> +	fd_install(fd, get_file(file));
> +
> +	sock = sock_from_file(file, &err);
> +	if (sock) {
> +		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> +		sock_update_classid(&sock->sk->sk_cgrp_data);
> +	}
> +
> +	return 0;
> +}
> +
>  static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
>  {
>  	int err = -EBADF;
> diff --git a/include/linux/file.h b/include/linux/file.h
> index 142d102f285e..7b56dc23e560 100644
> --- a/include/linux/file.h
> +++ b/include/linux/file.h
> @@ -94,4 +94,5 @@ extern void fd_install(unsigned int fd, struct file *file);
>  extern void flush_delayed_fput(void);
>  extern void __fput_sync(struct file *);
>  
> +extern int file_receive(int fd, struct file *file);
>  #endif /* __LINUX_FILE_H */
> diff --git a/net/compat.c b/net/compat.c
> index 4bed96e84d9a..8ac0e7e09208 100644
> --- a/net/compat.c
> +++ b/net/compat.c
> @@ -293,9 +293,6 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
>  
>  	for (i = 0, cmfptr = (int __user *) CMSG_COMPAT_DATA(cm); i < fdmax; i++, cmfptr++) {
>  		int new_fd;
> -		err = security_file_receive(fp[i]);
> -		if (err)
> -			break;
>  		err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & kmsg->msg_flags
>  					  ? O_CLOEXEC : 0);
>  		if (err < 0)
> @@ -306,8 +303,11 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
>  			put_unused_fd(new_fd);
>  			break;
>  		}
> -		/* Bump the usage count and install the file. */
> -		fd_install(new_fd, get_file(fp[i]));
> +		err = file_receive(new_fd, fp[i]);
> +		if (err) {
> +			put_unused_fd(new_fd);
> +			break;
> +		}
>  	}
>  
>  	if (i > 0) {
> diff --git a/net/core/scm.c b/net/core/scm.c
> index dc6fed1f221c..ba93abf2881b 100644
> --- a/net/core/scm.c
> +++ b/net/core/scm.c
> @@ -303,11 +303,7 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
>  	for (i=0, cmfptr=(__force int __user *)CMSG_DATA(cm); i<fdmax;
>  	     i++, cmfptr++)
>  	{
> -		struct socket *sock;
>  		int new_fd;
> -		err = security_file_receive(fp[i]);
> -		if (err)
> -			break;
>  		err = get_unused_fd_flags(MSG_CMSG_CLOEXEC & msg->msg_flags
>  					  ? O_CLOEXEC : 0);
>  		if (err < 0)
> @@ -318,13 +314,11 @@ void scm_detach_fds(struct msghdr *msg, struct scm_cookie *scm)
>  			put_unused_fd(new_fd);
>  			break;
>  		}
> -		/* Bump the usage count and install the file. */
> -		sock = sock_from_file(fp[i], &err);
> -		if (sock) {
> -			sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> -			sock_update_classid(&sock->sk->sk_cgrp_data);
> +		err = file_receive(new_fd, fp[i]);
> +		if (err) {
> +			put_unused_fd(new_fd);
> +			break;
>  		}
> -		fd_install(new_fd, get_file(fp[i]));
>  	}
>  
>  	if (i > 0)
> -- 
> 2.25.1
>
Kees Cook June 4, 2020, 2:22 a.m.
On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > Previously there were two chunks of code where the logic to receive file
> > descriptors was duplicated in net. The compat version of copying
> > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > Logic to change the cgroup data was added in:
> > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > 
> > This was not copied to the compat path. This commit fixes that, and thus
> > should be cherry-picked into stable.
> > 
> > This introduces a helper (file_receive) which encapsulates the logic for
> > handling calling security hooks as well as manipulating cgroup information.
> > This helper can then be used other places in the kernel where file
> > descriptors are copied between processes
> > 
> > I tested cgroup classid setting on both the compat (x32) path, and the
> > native path to ensure that when moving the file descriptor the classid
> > is set.
> > 
> > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > Suggested-by: Kees Cook <keescook@chromium.org>
> > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > Cc: David S. Miller <davem@davemloft.net>
> > Cc: Jann Horn <jannh@google.com>,
> > Cc: John Fastabend <john.r.fastabend@intel.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Tycho Andersen <tycho@tycho.ws>
> > Cc: stable@vger.kernel.org
> > Cc: cgroups@vger.kernel.org
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> >  include/linux/file.h |  1 +
> >  net/compat.c         | 10 +++++-----
> >  net/core/scm.c       | 14 ++++----------
> >  4 files changed, 45 insertions(+), 15 deletions(-)
> > 
> > diff --git a/fs/file.c b/fs/file.c
> > index abb8b7081d7a..5afd76fca8c2 100644
> > --- a/fs/file.c
> > +++ b/fs/file.c
> > @@ -18,6 +18,9 @@
> >  #include <linux/bitops.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/rcupdate.h>
> > +#include <net/sock.h>
> > +#include <net/netprio_cgroup.h>
> > +#include <net/cls_cgroup.h>
> >  
> >  unsigned int sysctl_nr_open __read_mostly = 1024*1024;
> >  unsigned int sysctl_nr_open_min = BITS_PER_LONG;
> > @@ -931,6 +934,38 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
> >  	return err;
> >  }
> >  
> > +/*
> > + * File Receive - Receive a file from another process
> > + *
> > + * This function is designed to receive files from other tasks. It encapsulates
> > + * logic around security and cgroups. The file descriptor provided must be a
> > + * freshly allocated (unused) file descriptor.
> > + *
> > + * This helper does not consume a reference to the file, so the caller must put
> > + * their reference.
> > + *
> > + * Returns 0 upon success.
> > + */
> > +int file_receive(int fd, struct file *file)
> 
> This is all just a remote version of fd_install(), yet it deviates from
> fd_install()'s semantics and naming. That's not great imho. What about
> naming this something like:
> 
> fd_install_received()
> 
> and move the get_file() out of there so it has the same semantics as
> fd_install(). It seems rather dangerous to have a function like
> fd_install() that consumes a reference once it returned and another
> version of this that is basically the same thing but doesn't consume a
> reference because it takes its own. Seems an invitation for confusion.
> Does that make sense?

We have some competing opinions on this, I guess. What I really don't
like is the copy/pasting of the get_unused_fd_flags() and
put_unused_fd() needed by (nearly) all the callers. If it's a helper, it
should help. Specifically, I'd like to see this:

int file_receive(int fd, unsigned long flags, struct file *file,
		 int __user *fdptr)
{
	struct socket *sock;
	int err;

	err = security_file_receive(file);
	if (err)
		return err;

	if (fd < 0) {
		/* Install new fd. */
		int new_fd;

		err = get_unused_fd_flags(flags);
		if (err < 0)
			return err;
		new_fd = err;

		/* Copy fd to any waiting user memory. */
		if (fdptr) {
			err = put_user(new_fd, fdptr);
			if (err < 0) {
				put_unused_fd(new_fd);
				return err;
			}
		}
		fd_install(new_fd, get_file(file));
		fd = new_fd;
	} else {
		/* Replace existing fd. */
		err = replace_fd(fd, file, flags);
		if (err)
			return err;
	}

	/* Bump the cgroup usage counts. */
	sock = sock_from_file(fd, &err);
	if (sock) {
		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
		sock_update_classid(&sock->sk->sk_cgrp_data);
	}

	return fd;
}

If everyone else *really* prefers keeping the get_unused_fd_flags() /
put_unused_fd() stuff outside the helper, then I guess I'll give up,
but I think it is MUCH cleaner this way -- all 4 users trim down lots
of code duplication.
Sargun Dhillon June 4, 2020, 3:39 a.m.
On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > Previously there were two chunks of code where the logic to receive file
> > descriptors was duplicated in net. The compat version of copying
> > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > Logic to change the cgroup data was added in:
> > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > 
> > This was not copied to the compat path. This commit fixes that, and thus
> > should be cherry-picked into stable.
> > 
> > This introduces a helper (file_receive) which encapsulates the logic for
> > handling calling security hooks as well as manipulating cgroup information.
> > This helper can then be used other places in the kernel where file
> > descriptors are copied between processes
> > 
> > I tested cgroup classid setting on both the compat (x32) path, and the
> > native path to ensure that when moving the file descriptor the classid
> > is set.
> > 
> > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > Suggested-by: Kees Cook <keescook@chromium.org>
> > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > Cc: David S. Miller <davem@davemloft.net>
> > Cc: Jann Horn <jannh@google.com>,
> > Cc: John Fastabend <john.r.fastabend@intel.com>
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Tycho Andersen <tycho@tycho.ws>
> > Cc: stable@vger.kernel.org
> > Cc: cgroups@vger.kernel.org
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> >  include/linux/file.h |  1 +
> >  net/compat.c         | 10 +++++-----
> >  net/core/scm.c       | 14 ++++----------
> >  4 files changed, 45 insertions(+), 15 deletions(-)
> > 
> > diff --git a/fs/file.c b/fs/file.c
> > index abb8b7081d7a..5afd76fca8c2 100644
> > --- a/fs/file.c
> > +++ b/fs/file.c
> > @@ -18,6 +18,9 @@
> >  #include <linux/bitops.h>
> >  #include <linux/spinlock.h>
> >  #include <linux/rcupdate.h>
> > +#include <net/sock.h>
> > +#include <net/netprio_cgroup.h>
> > +#include <net/cls_cgroup.h>
> >  
> >  unsigned int sysctl_nr_open __read_mostly = 1024*1024;
> >  unsigned int sysctl_nr_open_min = BITS_PER_LONG;
> > @@ -931,6 +934,38 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
> >  	return err;
> >  }
> >  
> > +/*
> > + * File Receive - Receive a file from another process
> > + *
> > + * This function is designed to receive files from other tasks. It encapsulates
> > + * logic around security and cgroups. The file descriptor provided must be a
> > + * freshly allocated (unused) file descriptor.
> > + *
> > + * This helper does not consume a reference to the file, so the caller must put
> > + * their reference.
> > + *
> > + * Returns 0 upon success.
> > + */
> > +int file_receive(int fd, struct file *file)
> 
> This is all just a remote version of fd_install(), yet it deviates from
> fd_install()'s semantics and naming. That's not great imho. What about
> naming this something like:
> 
> fd_install_received()
> 
> and move the get_file() out of there so it has the same semantics as
> fd_install(). It seems rather dangerous to have a function like
> fd_install() that consumes a reference once it returned and another
> version of this that is basically the same thing but doesn't consume a
> reference because it takes its own. Seems an invitation for confusion.
> Does that make sense?
> 
You're right. The reason for the difference in my mind is that fd_install
always succeeds, whereas file_receive can fail. It's easier to do something
like:
fd_install(fd, get_file(f))
vs.
if (file_receive(fd, get_file(f))
	fput(f);

Alternatively, if the reference was always consumed, it is somewhat
easier.

I'm fine either way, but just explaining my reasoning for the difference
in behaviour.
Sargun Dhillon June 4, 2020, 5:20 a.m.
On Wed, Jun 03, 2020 at 07:22:57PM -0700, Kees Cook wrote:
> On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> > On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > > Previously there were two chunks of code where the logic to receive file
> > > descriptors was duplicated in net. The compat version of copying
> > > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > > Logic to change the cgroup data was added in:
> > > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > > 
> > > This was not copied to the compat path. This commit fixes that, and thus
> > > should be cherry-picked into stable.
> > > 
> > > This introduces a helper (file_receive) which encapsulates the logic for
> > > handling calling security hooks as well as manipulating cgroup information.
> > > This helper can then be used other places in the kernel where file
> > > descriptors are copied between processes
> > > 
> > > I tested cgroup classid setting on both the compat (x32) path, and the
> > > native path to ensure that when moving the file descriptor the classid
> > > is set.
> > > 
> > > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > > Suggested-by: Kees Cook <keescook@chromium.org>
> > > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > > Cc: David S. Miller <davem@davemloft.net>
> > > Cc: Jann Horn <jannh@google.com>,
> > > Cc: John Fastabend <john.r.fastabend@intel.com>
> > > Cc: Tejun Heo <tj@kernel.org>
> > > Cc: Tycho Andersen <tycho@tycho.ws>
> > > Cc: stable@vger.kernel.org
> > > Cc: cgroups@vger.kernel.org
> > > Cc: linux-fsdevel@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> > >  include/linux/file.h |  1 +
> > >  net/compat.c         | 10 +++++-----
> > >  net/core/scm.c       | 14 ++++----------
> > >  4 files changed, 45 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/fs/file.c b/fs/file.c
> > > index abb8b7081d7a..5afd76fca8c2 100644
> > > --- a/fs/file.c
> > > +++ b/fs/file.c
> > > @@ -18,6 +18,9 @@
> > >  #include <linux/bitops.h>
> > >  #include <linux/spinlock.h>
> > >  #include <linux/rcupdate.h>
> > > +#include <net/sock.h>
> > > +#include <net/netprio_cgroup.h>
> > > +#include <net/cls_cgroup.h>
> > >  
> > >  unsigned int sysctl_nr_open __read_mostly = 1024*1024;
> > >  unsigned int sysctl_nr_open_min = BITS_PER_LONG;
> > > @@ -931,6 +934,38 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
> > >  	return err;
> > >  }
> > >  
> > > +/*
> > > + * File Receive - Receive a file from another process
> > > + *
> > > + * This function is designed to receive files from other tasks. It encapsulates
> > > + * logic around security and cgroups. The file descriptor provided must be a
> > > + * freshly allocated (unused) file descriptor.
> > > + *
> > > + * This helper does not consume a reference to the file, so the caller must put
> > > + * their reference.
> > > + *
> > > + * Returns 0 upon success.
> > > + */
> > > +int file_receive(int fd, struct file *file)
> > 
> > This is all just a remote version of fd_install(), yet it deviates from
> > fd_install()'s semantics and naming. That's not great imho. What about
> > naming this something like:
> > 
> > fd_install_received()
> > 
> > and move the get_file() out of there so it has the same semantics as
> > fd_install(). It seems rather dangerous to have a function like
> > fd_install() that consumes a reference once it returned and another
> > version of this that is basically the same thing but doesn't consume a
> > reference because it takes its own. Seems an invitation for confusion.
> > Does that make sense?
> 
> We have some competing opinions on this, I guess. What I really don't
> like is the copy/pasting of the get_unused_fd_flags() and
> put_unused_fd() needed by (nearly) all the callers. If it's a helper, it
> should help. Specifically, I'd like to see this:
> 
> int file_receive(int fd, unsigned long flags, struct file *file,
> 		 int __user *fdptr)
> {
> 	struct socket *sock;
> 	int err;
> 
> 	err = security_file_receive(file);
> 	if (err)
> 		return err;
> 
> 	if (fd < 0) {
> 		/* Install new fd. */
> 		int new_fd;
> 
> 		err = get_unused_fd_flags(flags);
> 		if (err < 0)
> 			return err;
> 		new_fd = err;
> 
> 		/* Copy fd to any waiting user memory. */
> 		if (fdptr) {
> 			err = put_user(new_fd, fdptr);
> 			if (err < 0) {
> 				put_unused_fd(new_fd);
> 				return err;
> 			}
> 		}
> 		fd_install(new_fd, get_file(file));
> 		fd = new_fd;
> 	} else {
> 		/* Replace existing fd. */
> 		err = replace_fd(fd, file, flags);
> 		if (err)
> 			return err;
> 	}
> 
> 	/* Bump the cgroup usage counts. */
> 	sock = sock_from_file(fd, &err);
> 	if (sock) {
> 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> 		sock_update_classid(&sock->sk->sk_cgrp_data);
> 	}
> 
> 	return fd;
> }
> 
> If everyone else *really* prefers keeping the get_unused_fd_flags() /
> put_unused_fd() stuff outside the helper, then I guess I'll give up,
> but I think it is MUCH cleaner this way -- all 4 users trim down lots
> of code duplication.
> 
> -- 
> Kees Cook
This seems weird that the function has two different return mechanisms
depending on the value of fdptr, especially given that behaviour is
only invoked by SCM, whereas the other callers (addfd, and pidfd_getfd)
just want the FD value returned.

Won't this produce a "bad" result, if the user does:

struct msghdr msg = {};
struct cmsghdr *cmsg;
struct iovec io = {
	.iov_base = &c,
	.iov_len = 1,
};

msg.msg_iov = &io;
msg.msg_iovlen = 1;
msg.msg_control = NULL;
msg.msg_controllen = sizeof(buf);

recvmsg(sock, &msg, 0);
----

This will end up installing the FD, but it will efault, when
scm_detach_fds tries to fill out the rest of the info. 

I mean, we can easily solve this with a null pointer check
in scm_detach_fds, but my fear is that user n will forget
to do this, and make a mistake.

Maybe it would be nice to have:

/* Receives file descriptor and installs it in userspace at uptr. */
static inline intfile_receive_user(struct file *file, unsigned long flags,
				   int __user *fdptr)
{
	if (fdptr == NULL)
		return -EFAULT;

	return __file_receive(-1, flags, file, uptr);
}

And then just let pidfd_getfd, and seccomp_addfd call __file_receive
directly, or offer a different helper like:

static inline file_receive(long fd, struct *file, unsigned long flags)
{
	return __file_receive(fd, flags, file, NULL);
}
Christian Brauner June 4, 2020, 12:52 p.m.
On Wed, Jun 03, 2020 at 07:22:57PM -0700, Kees Cook wrote:
> On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> > On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > > Previously there were two chunks of code where the logic to receive file
> > > descriptors was duplicated in net. The compat version of copying
> > > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > > Logic to change the cgroup data was added in:
> > > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > > 
> > > This was not copied to the compat path. This commit fixes that, and thus
> > > should be cherry-picked into stable.
> > > 
> > > This introduces a helper (file_receive) which encapsulates the logic for
> > > handling calling security hooks as well as manipulating cgroup information.
> > > This helper can then be used other places in the kernel where file
> > > descriptors are copied between processes
> > > 
> > > I tested cgroup classid setting on both the compat (x32) path, and the
> > > native path to ensure that when moving the file descriptor the classid
> > > is set.
> > > 
> > > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > > Suggested-by: Kees Cook <keescook@chromium.org>
> > > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > > Cc: David S. Miller <davem@davemloft.net>
> > > Cc: Jann Horn <jannh@google.com>,
> > > Cc: John Fastabend <john.r.fastabend@intel.com>
> > > Cc: Tejun Heo <tj@kernel.org>
> > > Cc: Tycho Andersen <tycho@tycho.ws>
> > > Cc: stable@vger.kernel.org
> > > Cc: cgroups@vger.kernel.org
> > > Cc: linux-fsdevel@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > ---
> > >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> > >  include/linux/file.h |  1 +
> > >  net/compat.c         | 10 +++++-----
> > >  net/core/scm.c       | 14 ++++----------
> > >  4 files changed, 45 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/fs/file.c b/fs/file.c
> > > index abb8b7081d7a..5afd76fca8c2 100644
> > > --- a/fs/file.c
> > > +++ b/fs/file.c
> > > @@ -18,6 +18,9 @@
> > >  #include <linux/bitops.h>
> > >  #include <linux/spinlock.h>
> > >  #include <linux/rcupdate.h>
> > > +#include <net/sock.h>
> > > +#include <net/netprio_cgroup.h>
> > > +#include <net/cls_cgroup.h>
> > >  
> > >  unsigned int sysctl_nr_open __read_mostly = 1024*1024;
> > >  unsigned int sysctl_nr_open_min = BITS_PER_LONG;
> > > @@ -931,6 +934,38 @@ int replace_fd(unsigned fd, struct file *file, unsigned flags)
> > >  	return err;
> > >  }
> > >  
> > > +/*
> > > + * File Receive - Receive a file from another process
> > > + *
> > > + * This function is designed to receive files from other tasks. It encapsulates
> > > + * logic around security and cgroups. The file descriptor provided must be a
> > > + * freshly allocated (unused) file descriptor.
> > > + *
> > > + * This helper does not consume a reference to the file, so the caller must put
> > > + * their reference.
> > > + *
> > > + * Returns 0 upon success.
> > > + */
> > > +int file_receive(int fd, struct file *file)
> > 
> > This is all just a remote version of fd_install(), yet it deviates from
> > fd_install()'s semantics and naming. That's not great imho. What about
> > naming this something like:
> > 
> > fd_install_received()
> > 
> > and move the get_file() out of there so it has the same semantics as
> > fd_install(). It seems rather dangerous to have a function like
> > fd_install() that consumes a reference once it returned and another
> > version of this that is basically the same thing but doesn't consume a
> > reference because it takes its own. Seems an invitation for confusion.
> > Does that make sense?
> 
> We have some competing opinions on this, I guess. What I really don't
> like is the copy/pasting of the get_unused_fd_flags() and
> put_unused_fd() needed by (nearly) all the callers. If it's a helper, it
> should help. Specifically, I'd like to see this:
> 
> int file_receive(int fd, unsigned long flags, struct file *file,
> 		 int __user *fdptr)

I still fail to see what this whole put_user() handling buys us at all
and why this function needs to be anymore complicated then simply:

fd_install_received(int fd, struct file *file)
{
	security_file_receive(file);
 
 	sock = sock_from_file(fd, &err);
 	if (sock) {
 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
 		sock_update_classid(&sock->sk->sk_cgrp_data);
 	}

	fd_install();
	return;
}

exactly like fd_install() but for received files.

For scm you can fail somewhere in the middle of putting any number of
file descriptors so you're left in a state with only a subset of
requested file descriptors installed so it's not really useful there.
And if you manage to install an fd but then fail to put_user() it
userspace can simply check it's fds via proc and has to anyway on any
scm message error. If you fail an scm message userspace better check
their fds.
For seccomp maybe but even there I doubt it and I still maintain that
userspace screwing this up is on them which is how we do this most of
the time. And for pidfd_getfd() this whole put_user() thing doesn't
matter at all.

It's much easier and clearer if we simply have a fd_install() -
fd_install_received() parallelism where we follow an established
convention. _But_ if that blocks you from making this generic enough
then at least the replace_fd() vs fd_install() logic seems it shouldn't
be in there. 

And the function name really needs to drive home the point that it
installs an fd into the tasks fdtable no matter what version you go
with. file_receive() is really not accurate enough for this at all.

> {
> 	struct socket *sock;
> 	int err;
> 
> 	err = security_file_receive(file);
> 	if (err)
> 		return err;
> 
> 	if (fd < 0) {
> 		/* Install new fd. */
> 		int new_fd;
> 
> 		err = get_unused_fd_flags(flags);
> 		if (err < 0)
> 			return err;
> 		new_fd = err;
> 
> 		/* Copy fd to any waiting user memory. */
> 		if (fdptr) {
> 			err = put_user(new_fd, fdptr);
> 			if (err < 0) {
> 				put_unused_fd(new_fd);
> 				return err;
> 			}
> 		}
> 		fd_install(new_fd, get_file(file));
> 		fd = new_fd;
> 	} else {
> 		/* Replace existing fd. */
> 		err = replace_fd(fd, file, flags);
> 		if (err)
> 			return err;
> 	}
> 
> 	/* Bump the cgroup usage counts. */
> 	sock = sock_from_file(fd, &err);
> 	if (sock) {
> 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> 		sock_update_classid(&sock->sk->sk_cgrp_data);
> 	}
> 
> 	return fd;
> }
> 
> If everyone else *really* prefers keeping the get_unused_fd_flags() /
> put_unused_fd() stuff outside the helper, then I guess I'll give up,
> but I think it is MUCH cleaner this way -- all 4 users trim down lots
> of code duplication.
> 
> -- 
> Kees Cook
David Laight June 4, 2020, 1:28 p.m.
From: Christian Brauner
> Sent: 04 June 2020 13:52
..
> For scm you can fail somewhere in the middle of putting any number of
> file descriptors so you're left in a state with only a subset of
> requested file descriptors installed so it's not really useful there.
> And if you manage to install an fd but then fail to put_user() it
> userspace can simply check it's fds via proc and has to anyway on any
> scm message error. If you fail an scm message userspace better check
> their fds.

There is a similar error path in the sctp 'peeloff' code.
If the put_user() fails it currently closes the fd before
returning -EFAULT.

I'm not at all sure this is helpful.
The application can't tell whether the SIGSEGV happened on the
copyin of the parameters or the copyout of the result.

ISTM that if the application passes an address that cannot
be written to it deserves what it gets - typically an fd it
doesn't know the number of.

What is important is that the kernel data is consistent.
So when the process exits the fd is closed and all the resources
are released.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Sargun Dhillon June 5, 2020, 7:54 a.m.
On Thu, Jun 04, 2020 at 02:52:26PM +0200, Christian Brauner wrote:
> On Wed, Jun 03, 2020 at 07:22:57PM -0700, Kees Cook wrote:
> > On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> > > On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > > > Previously there were two chunks of code where the logic to receive file
> > > > descriptors was duplicated in net. The compat version of copying
> > > > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > > > Logic to change the cgroup data was added in:
> > > > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > > > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > > > 
> > > > This was not copied to the compat path. This commit fixes that, and thus
> > > > should be cherry-picked into stable.
> > > > 
> > > > This introduces a helper (file_receive) which encapsulates the logic for
> > > > handling calling security hooks as well as manipulating cgroup information.
> > > > This helper can then be used other places in the kernel where file
> > > > descriptors are copied between processes
> > > > 
> > > > I tested cgroup classid setting on both the compat (x32) path, and the
> > > > native path to ensure that when moving the file descriptor the classid
> > > > is set.
> > > > 
> > > > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > > > Suggested-by: Kees Cook <keescook@chromium.org>
> > > > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > > > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > > > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > > > Cc: David S. Miller <davem@davemloft.net>
> > > > Cc: Jann Horn <jannh@google.com>,
> > > > Cc: John Fastabend <john.r.fastabend@intel.com>
> > > > Cc: Tejun Heo <tj@kernel.org>
> > > > Cc: Tycho Andersen <tycho@tycho.ws>
> > > > Cc: stable@vger.kernel.org
> > > > Cc: cgroups@vger.kernel.org
> > > > Cc: linux-fsdevel@vger.kernel.org
> > > > Cc: linux-kernel@vger.kernel.org
> > > > ---
> > > >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> > > >  include/linux/file.h |  1 +
> > > >  net/compat.c         | 10 +++++-----
> > > >  net/core/scm.c       | 14 ++++----------
> > > >  4 files changed, 45 insertions(+), 15 deletions(-)
> > > > 
> > > 
> > > This is all just a remote version of fd_install(), yet it deviates from
> > > fd_install()'s semantics and naming. That's not great imho. What about
> > > naming this something like:
> > > 
> > > fd_install_received()
> > > 
> > > and move the get_file() out of there so it has the same semantics as
> > > fd_install(). It seems rather dangerous to have a function like
> > > fd_install() that consumes a reference once it returned and another
> > > version of this that is basically the same thing but doesn't consume a
> > > reference because it takes its own. Seems an invitation for confusion.
> > > Does that make sense?
> > 
> > We have some competing opinions on this, I guess. What I really don't
> > like is the copy/pasting of the get_unused_fd_flags() and
> > put_unused_fd() needed by (nearly) all the callers. If it's a helper, it
> > should help. Specifically, I'd like to see this:
> > 
> > int file_receive(int fd, unsigned long flags, struct file *file,
> > 		 int __user *fdptr)
> 
> I still fail to see what this whole put_user() handling buys us at all
> and why this function needs to be anymore complicated then simply:
> 
> fd_install_received(int fd, struct file *file)
> {
> 	security_file_receive(file);
>  
>  	sock = sock_from_file(fd, &err);
>  	if (sock) {
>  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
>  		sock_update_classid(&sock->sk->sk_cgrp_data);
>  	}
> 
> 	fd_install();
> 	return;
> }
> 
> exactly like fd_install() but for received files.
> 
> For scm you can fail somewhere in the middle of putting any number of
> file descriptors so you're left in a state with only a subset of
> requested file descriptors installed so it's not really useful there.
> And if you manage to install an fd but then fail to put_user() it
> userspace can simply check it's fds via proc and has to anyway on any
> scm message error. If you fail an scm message userspace better check
> their fds.
> For seccomp maybe but even there I doubt it and I still maintain that
> userspace screwing this up is on them which is how we do this most of
> the time. And for pidfd_getfd() this whole put_user() thing doesn't
> matter at all.
> 
> It's much easier and clearer if we simply have a fd_install() -
> fd_install_received() parallelism where we follow an established
> convention. _But_ if that blocks you from making this generic enough
> then at least the replace_fd() vs fd_install() logic seems it shouldn't
> be in there. 
> 
> And the function name really needs to drive home the point that it
> installs an fd into the tasks fdtable no matter what version you go
> with. file_receive() is really not accurate enough for this at all.
> 
> > {
> > 	struct socket *sock;
> > 	int err;
> > 
> > 	err = security_file_receive(file);
> > 	if (err)
> > 		return err;
> > 
> > 	if (fd < 0) {
> > 		/* Install new fd. */
> > 		int new_fd;
> > 
> > 		err = get_unused_fd_flags(flags);
> > 		if (err < 0)
> > 			return err;
> > 		new_fd = err;
> > 
> > 		/* Copy fd to any waiting user memory. */
> > 		if (fdptr) {
> > 			err = put_user(new_fd, fdptr);
> > 			if (err < 0) {
> > 				put_unused_fd(new_fd);
> > 				return err;
> > 			}
> > 		}
> > 		fd_install(new_fd, get_file(file));
> > 		fd = new_fd;
> > 	} else {
> > 		/* Replace existing fd. */
> > 		err = replace_fd(fd, file, flags);
> > 		if (err)
> > 			return err;
> > 	}
> > 
> > 	/* Bump the cgroup usage counts. */
> > 	sock = sock_from_file(fd, &err);
> > 	if (sock) {
> > 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> > 		sock_update_classid(&sock->sk->sk_cgrp_data);
> > 	}
> > 
> > 	return fd;
> > }
> > 
> > If everyone else *really* prefers keeping the get_unused_fd_flags() /
> > put_unused_fd() stuff outside the helper, then I guess I'll give up,
> > but I think it is MUCH cleaner this way -- all 4 users trim down lots
> > of code duplication.
> > 
> > -- 
> > Kees Cook
How about this:


static int do_dup2(struct files_struct *files,
	struct file *file, unsigned fd, unsigned flags)
__releases(&files->file_lock)
{
	struct file *tofree;
	struct fdtable *fdt;

	...

	/*
	 * New bit, allowing the file to be null. Doesn't have the same
	 * "sanity check" bits from __alloc_fd
	 */
	if (likely(file))
		get_file(file);
	rcu_assign_pointer(fdt->fd[fd], file);

	__set_open_fd(fd, fdt);

	...
}

/*
 * File Receive - Receive a file from another process
 *
 * Encapsulates the logic to handle receiving a file from another task. It
 * does not install the file descriptor. That is delegated to the user. If
 * an error occurs that results in the file descriptor not being installed,
 * they must put_unused_fd.
 *
 * fd should be >= 0 if you intend on replacing a file descriptor, or
 * alternatively -1 if you want file_receive to allocate an FD for you
 *
 * Returns the fd number on success.
 * Returns negative error code on failure.
 *
 */
int file_receive(int fd, unsigned int flags, struct file *file)
{
	int err;
	struct socket *sock;
	struct files_struct *files = current->files;

	err = security_file_receive(file);
	if (err)
		return err;

	if (fd >= 0) {
		if (fd >= rlimit(RLIMIT_NOFILE))
			return -EBADF;

		spin_lock(&files->file_lock);
		err = expand_files(files, fd);
		if (err < 0) {
			goto out_unlock;
		}

		err = do_dup2(files, NULL, fd, flags);
		if (err)
			return err;
	} else {
		fd = get_unused_fd_flags(flags);
		if (fd < 0)
			return fd;
	}

	sock = sock_from_file(file, &err);
	if (sock) {
		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
		sock_update_classid(&sock->sk->sk_cgrp_data);
	}

	return fd;

out_unlock:
	spin_unlock(&files->file_lock);
	return err;
}

---

then the code in scm.c:
err = file_receive(-1, flags, fp[i]);
if (err < 0)
	break;

new_fd = err;
err = put_user(new_fd, cmfptr);
if (err) {
	put_unused_fd(new_fd);
	break;
}

/* Bump the usage count and install the file. */
fd_install(new_fd, get_file(fp[i]));

And addfd:
ret = file_receive(addfd->fd, addfd->flags, addfd->file);
if (ret >= 0)
	fd_install(ret, get_file(addfd->file));
addfd->ret = ret;

----

This way there is:
1. No "put_user" logic in file_receive
2. Minimal (single) branching logic, unless there's something in between
   the file_receive and installing the FD, such as put_user.
3. Doesn't implement fd_install, so there's no ambiguity about it being
   file_install_received vs. just the receive logic.
Kees Cook June 9, 2020, 7:43 p.m.
On Fri, Jun 05, 2020 at 07:54:36AM +0000, Sargun Dhillon wrote:
> On Thu, Jun 04, 2020 at 02:52:26PM +0200, Christian Brauner wrote:
> > On Wed, Jun 03, 2020 at 07:22:57PM -0700, Kees Cook wrote:
> > > On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> > > > On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > > > > Previously there were two chunks of code where the logic to receive file
> > > > > descriptors was duplicated in net. The compat version of copying
> > > > > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > > > > Logic to change the cgroup data was added in:
> > > > > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > > > > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > > > > 
> > > > > This was not copied to the compat path. This commit fixes that, and thus
> > > > > should be cherry-picked into stable.
> > > > > 
> > > > > This introduces a helper (file_receive) which encapsulates the logic for
> > > > > handling calling security hooks as well as manipulating cgroup information.
> > > > > This helper can then be used other places in the kernel where file
> > > > > descriptors are copied between processes
> > > > > 
> > > > > I tested cgroup classid setting on both the compat (x32) path, and the
> > > > > native path to ensure that when moving the file descriptor the classid
> > > > > is set.
> > > > > 
> > > > > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > > > > Suggested-by: Kees Cook <keescook@chromium.org>
> > > > > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > > > > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > > > > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > Cc: Jann Horn <jannh@google.com>,
> > > > > Cc: John Fastabend <john.r.fastabend@intel.com>
> > > > > Cc: Tejun Heo <tj@kernel.org>
> > > > > Cc: Tycho Andersen <tycho@tycho.ws>
> > > > > Cc: stable@vger.kernel.org
> > > > > Cc: cgroups@vger.kernel.org
> > > > > Cc: linux-fsdevel@vger.kernel.org
> > > > > Cc: linux-kernel@vger.kernel.org
> > > > > ---
> > > > >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> > > > >  include/linux/file.h |  1 +
> > > > >  net/compat.c         | 10 +++++-----
> > > > >  net/core/scm.c       | 14 ++++----------
> > > > >  4 files changed, 45 insertions(+), 15 deletions(-)
> > > > > 
> > > > 
> > > > This is all just a remote version of fd_install(), yet it deviates from
> > > > fd_install()'s semantics and naming. That's not great imho. What about
> > > > naming this something like:
> > > > 
> > > > fd_install_received()
> > > > 
> > > > and move the get_file() out of there so it has the same semantics as
> > > > fd_install(). It seems rather dangerous to have a function like
> > > > fd_install() that consumes a reference once it returned and another
> > > > version of this that is basically the same thing but doesn't consume a
> > > > reference because it takes its own. Seems an invitation for confusion.
> > > > Does that make sense?
> > > 
> > > We have some competing opinions on this, I guess. What I really don't
> > > like is the copy/pasting of the get_unused_fd_flags() and
> > > put_unused_fd() needed by (nearly) all the callers. If it's a helper, it
> > > should help. Specifically, I'd like to see this:
> > > 
> > > int file_receive(int fd, unsigned long flags, struct file *file,
> > > 		 int __user *fdptr)
> > 
> > I still fail to see what this whole put_user() handling buys us at all
> > and why this function needs to be anymore complicated then simply:
> > 
> > fd_install_received(int fd, struct file *file)
> > {
> > 	security_file_receive(file);
> >  
> >  	sock = sock_from_file(fd, &err);
> >  	if (sock) {
> >  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> >  		sock_update_classid(&sock->sk->sk_cgrp_data);
> >  	}
> > 
> > 	fd_install();
> > 	return;
> > }
> > 
> > exactly like fd_install() but for received files.
> > 
> > For scm you can fail somewhere in the middle of putting any number of
> > file descriptors so you're left in a state with only a subset of
> > requested file descriptors installed so it's not really useful there.
> > And if you manage to install an fd but then fail to put_user() it
> > userspace can simply check it's fds via proc and has to anyway on any
> > scm message error. If you fail an scm message userspace better check
> > their fds.
> > For seccomp maybe but even there I doubt it and I still maintain that
> > userspace screwing this up is on them which is how we do this most of
> > the time. And for pidfd_getfd() this whole put_user() thing doesn't
> > matter at all.
> > 
> > It's much easier and clearer if we simply have a fd_install() -
> > fd_install_received() parallelism where we follow an established
> > convention. _But_ if that blocks you from making this generic enough
> > then at least the replace_fd() vs fd_install() logic seems it shouldn't
> > be in there. 
> > 
> > And the function name really needs to drive home the point that it
> > installs an fd into the tasks fdtable no matter what version you go
> > with. file_receive() is really not accurate enough for this at all.
> > 
> > > {
> > > 	struct socket *sock;
> > > 	int err;
> > > 
> > > 	err = security_file_receive(file);
> > > 	if (err)
> > > 		return err;
> > > 
> > > 	if (fd < 0) {
> > > 		/* Install new fd. */
> > > 		int new_fd;
> > > 
> > > 		err = get_unused_fd_flags(flags);
> > > 		if (err < 0)
> > > 			return err;
> > > 		new_fd = err;
> > > 
> > > 		/* Copy fd to any waiting user memory. */
> > > 		if (fdptr) {
> > > 			err = put_user(new_fd, fdptr);
> > > 			if (err < 0) {
> > > 				put_unused_fd(new_fd);
> > > 				return err;
> > > 			}
> > > 		}
> > > 		fd_install(new_fd, get_file(file));
> > > 		fd = new_fd;
> > > 	} else {
> > > 		/* Replace existing fd. */
> > > 		err = replace_fd(fd, file, flags);
> > > 		if (err)
> > > 			return err;
> > > 	}
> > > 
> > > 	/* Bump the cgroup usage counts. */
> > > 	sock = sock_from_file(fd, &err);
> > > 	if (sock) {
> > > 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> > > 		sock_update_classid(&sock->sk->sk_cgrp_data);
> > > 	}
> > > 
> > > 	return fd;
> > > }
> > > 
> > > If everyone else *really* prefers keeping the get_unused_fd_flags() /
> > > put_unused_fd() stuff outside the helper, then I guess I'll give up,
> > > but I think it is MUCH cleaner this way -- all 4 users trim down lots
> > > of code duplication.
> > > 
> > > -- 
> > > Kees Cook
> How about this:
> 
> 
> static int do_dup2(struct files_struct *files,
> 	struct file *file, unsigned fd, unsigned flags)
> __releases(&files->file_lock)
> {
> 	struct file *tofree;
> 	struct fdtable *fdt;
> 
> 	...
> 
> 	/*
> 	 * New bit, allowing the file to be null. Doesn't have the same
> 	 * "sanity check" bits from __alloc_fd
> 	 */
> 	if (likely(file))
> 		get_file(file);
> 	rcu_assign_pointer(fdt->fd[fd], file);
> 
> 	__set_open_fd(fd, fdt);

IIUC, this means we can get the fdt into a state of an open fd with a
NULL file... is that okay? That feels like something Al might rebel at.
:)

> 
> 	...
> }
> 
> /*
>  * File Receive - Receive a file from another process
>  *
>  * Encapsulates the logic to handle receiving a file from another task. It
>  * does not install the file descriptor. That is delegated to the user. If
>  * an error occurs that results in the file descriptor not being installed,
>  * they must put_unused_fd.
>  *
>  * fd should be >= 0 if you intend on replacing a file descriptor, or
>  * alternatively -1 if you want file_receive to allocate an FD for you
>  *
>  * Returns the fd number on success.
>  * Returns negative error code on failure.
>  *
>  */
> int file_receive(int fd, unsigned int flags, struct file *file)
> {
> 	int err;
> 	struct socket *sock;
> 	struct files_struct *files = current->files;
> 
> 	err = security_file_receive(file);
> 	if (err)
> 		return err;
> 
> 	if (fd >= 0) {
> 		if (fd >= rlimit(RLIMIT_NOFILE))
> 			return -EBADF;
> 
> 		spin_lock(&files->file_lock);
> 		err = expand_files(files, fd);
> 		if (err < 0) {
> 			goto out_unlock;
> 		}
> 
> 		err = do_dup2(files, NULL, fd, flags);
> 		if (err)
> 			return err;

This seems like we're duplicating some checks and missing others -- I
really think we need to do this using the existing primitives. But I'd
really like some review or commentary from Al. We can do this a bunch of
ways, and I'd like to know which way looks best to him. :(

> This way there is:
> 1. No "put_user" logic in file_receive
> 2. Minimal (single) branching logic, unless there's something in between
>    the file_receive and installing the FD, such as put_user.
> 3. Doesn't implement fd_install, so there's no ambiguity about it being
>    file_install_received vs. just the receive logic.

I still wonder if we should refactor SCM_RIGHTS to just delay put_user
failures, which would simplify a bunch. It's a behavior change, but it
seems from Al and Jann that it just shouldn't matter. (And if it does,
we'll hear about it.)
Christian Brauner June 9, 2020, 8:03 p.m.
On Tue, Jun 09, 2020 at 12:43:48PM -0700, Kees Cook wrote:
> On Fri, Jun 05, 2020 at 07:54:36AM +0000, Sargun Dhillon wrote:
> > On Thu, Jun 04, 2020 at 02:52:26PM +0200, Christian Brauner wrote:
> > > On Wed, Jun 03, 2020 at 07:22:57PM -0700, Kees Cook wrote:
> > > > On Thu, Jun 04, 2020 at 03:24:52AM +0200, Christian Brauner wrote:
> > > > > On Tue, Jun 02, 2020 at 06:10:41PM -0700, Sargun Dhillon wrote:
> > > > > > Previously there were two chunks of code where the logic to receive file
> > > > > > descriptors was duplicated in net. The compat version of copying
> > > > > > file descriptors via SCM_RIGHTS did not have logic to update cgroups.
> > > > > > Logic to change the cgroup data was added in:
> > > > > > commit 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
> > > > > > commit d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
> > > > > > 
> > > > > > This was not copied to the compat path. This commit fixes that, and thus
> > > > > > should be cherry-picked into stable.
> > > > > > 
> > > > > > This introduces a helper (file_receive) which encapsulates the logic for
> > > > > > handling calling security hooks as well as manipulating cgroup information.
> > > > > > This helper can then be used other places in the kernel where file
> > > > > > descriptors are copied between processes
> > > > > > 
> > > > > > I tested cgroup classid setting on both the compat (x32) path, and the
> > > > > > native path to ensure that when moving the file descriptor the classid
> > > > > > is set.
> > > > > > 
> > > > > > Signed-off-by: Sargun Dhillon <sargun@sargun.me>
> > > > > > Suggested-by: Kees Cook <keescook@chromium.org>
> > > > > > Cc: Al Viro <viro@zeniv.linux.org.uk>
> > > > > > Cc: Christian Brauner <christian.brauner@ubuntu.com>
> > > > > > Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>
> > > > > > Cc: David S. Miller <davem@davemloft.net>
> > > > > > Cc: Jann Horn <jannh@google.com>,
> > > > > > Cc: John Fastabend <john.r.fastabend@intel.com>
> > > > > > Cc: Tejun Heo <tj@kernel.org>
> > > > > > Cc: Tycho Andersen <tycho@tycho.ws>
> > > > > > Cc: stable@vger.kernel.org
> > > > > > Cc: cgroups@vger.kernel.org
> > > > > > Cc: linux-fsdevel@vger.kernel.org
> > > > > > Cc: linux-kernel@vger.kernel.org
> > > > > > ---
> > > > > >  fs/file.c            | 35 +++++++++++++++++++++++++++++++++++
> > > > > >  include/linux/file.h |  1 +
> > > > > >  net/compat.c         | 10 +++++-----
> > > > > >  net/core/scm.c       | 14 ++++----------
> > > > > >  4 files changed, 45 insertions(+), 15 deletions(-)
> > > > > > 
> > > > > 
> > > > > This is all just a remote version of fd_install(), yet it deviates from
> > > > > fd_install()'s semantics and naming. That's not great imho. What about
> > > > > naming this something like:
> > > > > 
> > > > > fd_install_received()
> > > > > 
> > > > > and move the get_file() out of there so it has the same semantics as
> > > > > fd_install(). It seems rather dangerous to have a function like
> > > > > fd_install() that consumes a reference once it returned and another
> > > > > version of this that is basically the same thing but doesn't consume a
> > > > > reference because it takes its own. Seems an invitation for confusion.
> > > > > Does that make sense?
> > > > 
> > > > We have some competing opinions on this, I guess. What I really don't
> > > > like is the copy/pasting of the get_unused_fd_flags() and
> > > > put_unused_fd() needed by (nearly) all the callers. If it's a helper, it
> > > > should help. Specifically, I'd like to see this:
> > > > 
> > > > int file_receive(int fd, unsigned long flags, struct file *file,
> > > > 		 int __user *fdptr)
> > > 
> > > I still fail to see what this whole put_user() handling buys us at all
> > > and why this function needs to be anymore complicated then simply:
> > > 
> > > fd_install_received(int fd, struct file *file)
> > > {
> > > 	security_file_receive(file);
> > >  
> > >  	sock = sock_from_file(fd, &err);
> > >  	if (sock) {
> > >  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> > >  		sock_update_classid(&sock->sk->sk_cgrp_data);
> > >  	}
> > > 
> > > 	fd_install();
> > > 	return;
> > > }
> > > 
> > > exactly like fd_install() but for received files.
> > > 
> > > For scm you can fail somewhere in the middle of putting any number of
> > > file descriptors so you're left in a state with only a subset of
> > > requested file descriptors installed so it's not really useful there.
> > > And if you manage to install an fd but then fail to put_user() it
> > > userspace can simply check it's fds via proc and has to anyway on any
> > > scm message error. If you fail an scm message userspace better check
> > > their fds.
> > > For seccomp maybe but even there I doubt it and I still maintain that
> > > userspace screwing this up is on them which is how we do this most of
> > > the time. And for pidfd_getfd() this whole put_user() thing doesn't
> > > matter at all.
> > > 
> > > It's much easier and clearer if we simply have a fd_install() -
> > > fd_install_received() parallelism where we follow an established
> > > convention. _But_ if that blocks you from making this generic enough
> > > then at least the replace_fd() vs fd_install() logic seems it shouldn't
> > > be in there. 
> > > 
> > > And the function name really needs to drive home the point that it
> > > installs an fd into the tasks fdtable no matter what version you go
> > > with. file_receive() is really not accurate enough for this at all.
> > > 
> > > > {
> > > > 	struct socket *sock;
> > > > 	int err;
> > > > 
> > > > 	err = security_file_receive(file);
> > > > 	if (err)
> > > > 		return err;
> > > > 
> > > > 	if (fd < 0) {
> > > > 		/* Install new fd. */
> > > > 		int new_fd;
> > > > 
> > > > 		err = get_unused_fd_flags(flags);
> > > > 		if (err < 0)
> > > > 			return err;
> > > > 		new_fd = err;
> > > > 
> > > > 		/* Copy fd to any waiting user memory. */
> > > > 		if (fdptr) {
> > > > 			err = put_user(new_fd, fdptr);
> > > > 			if (err < 0) {
> > > > 				put_unused_fd(new_fd);
> > > > 				return err;
> > > > 			}
> > > > 		}
> > > > 		fd_install(new_fd, get_file(file));
> > > > 		fd = new_fd;
> > > > 	} else {
> > > > 		/* Replace existing fd. */
> > > > 		err = replace_fd(fd, file, flags);
> > > > 		if (err)
> > > > 			return err;
> > > > 	}
> > > > 
> > > > 	/* Bump the cgroup usage counts. */
> > > > 	sock = sock_from_file(fd, &err);
> > > > 	if (sock) {
> > > > 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> > > > 		sock_update_classid(&sock->sk->sk_cgrp_data);
> > > > 	}
> > > > 
> > > > 	return fd;
> > > > }
> > > > 
> > > > If everyone else *really* prefers keeping the get_unused_fd_flags() /
> > > > put_unused_fd() stuff outside the helper, then I guess I'll give up,
> > > > but I think it is MUCH cleaner this way -- all 4 users trim down lots
> > > > of code duplication.
> > > > 
> > > > -- 
> > > > Kees Cook
> > How about this:
> > 
> > 
> > static int do_dup2(struct files_struct *files,
> > 	struct file *file, unsigned fd, unsigned flags)
> > __releases(&files->file_lock)
> > {
> > 	struct file *tofree;
> > 	struct fdtable *fdt;
> > 
> > 	...
> > 
> > 	/*
> > 	 * New bit, allowing the file to be null. Doesn't have the same
> > 	 * "sanity check" bits from __alloc_fd
> > 	 */
> > 	if (likely(file))
> > 		get_file(file);
> > 	rcu_assign_pointer(fdt->fd[fd], file);
> > 
> > 	__set_open_fd(fd, fdt);
> 
> IIUC, this means we can get the fdt into a state of an open fd with a
> NULL file... is that okay? That feels like something Al might rebel at.
> :)
> 
> > 
> > 	...
> > }
> > 
> > /*
> >  * File Receive - Receive a file from another process
> >  *
> >  * Encapsulates the logic to handle receiving a file from another task. It
> >  * does not install the file descriptor. That is delegated to the user. If
> >  * an error occurs that results in the file descriptor not being installed,
> >  * they must put_unused_fd.
> >  *
> >  * fd should be >= 0 if you intend on replacing a file descriptor, or
> >  * alternatively -1 if you want file_receive to allocate an FD for you
> >  *
> >  * Returns the fd number on success.
> >  * Returns negative error code on failure.
> >  *
> >  */
> > int file_receive(int fd, unsigned int flags, struct file *file)
> > {
> > 	int err;
> > 	struct socket *sock;
> > 	struct files_struct *files = current->files;
> > 
> > 	err = security_file_receive(file);
> > 	if (err)
> > 		return err;
> > 
> > 	if (fd >= 0) {
> > 		if (fd >= rlimit(RLIMIT_NOFILE))
> > 			return -EBADF;
> > 
> > 		spin_lock(&files->file_lock);
> > 		err = expand_files(files, fd);
> > 		if (err < 0) {
> > 			goto out_unlock;
> > 		}
> > 
> > 		err = do_dup2(files, NULL, fd, flags);
> > 		if (err)
> > 			return err;
> 
> This seems like we're duplicating some checks and missing others -- I
> really think we need to do this using the existing primitives. But I'd
> really like some review or commentary from Al. We can do this a bunch of
> ways, and I'd like to know which way looks best to him. :(
> 
> > This way there is:
> > 1. No "put_user" logic in file_receive
> > 2. Minimal (single) branching logic, unless there's something in between
> >    the file_receive and installing the FD, such as put_user.
> > 3. Doesn't implement fd_install, so there's no ambiguity about it being
> >    file_install_received vs. just the receive logic.
> 
> I still wonder if we should refactor SCM_RIGHTS to just delay put_user
> failures, which would simplify a bunch. It's a behavior change, but it

I'm looking at __scm_install_fd() and I wonder what specifically you
mean by that? The put_user() seems to be placed such that the install
occurrs only if it succeeded. Sure, it only handles a single fd but
whatever. Userspace knows that already. Just look at systemd when a msg
fails:

void cmsg_close_all(struct msghdr *mh) {
        struct cmsghdr *cmsg;

        assert(mh);

        CMSG_FOREACH(cmsg, mh)
                if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
                        close_many((int*) CMSG_DATA(cmsg), (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int));
}

The only reasonable scenario for this whole mess I can think of is sm like (pseudo code):

fd_install_received(int fd, struct file *file)
{
 	sock = sock_from_file(fd, &err);
 	if (sock) {
 		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
 		sock_update_classid(&sock->sk->sk_cgrp_data);
 	}

	fd_install();
}

error = 0;
fdarray = malloc(fdmax);
for (i = 0; i < fdmax; i++) {
	fdarray[i] = get_unused_fd_flags(o_flags);
	if (fdarray[i] < 0) {
		error = -EBADF;
		break;
	}

	error = security_file_receive(file);
	if (error)
		break;

	error = put_user(fd_array[i], ufd);
	if (error)
		break;
}

for (i = 0; i < fdmax; i++) {
	if (error) {
		/* ignore errors */
		put_user(-EBADF, ufd); /* If this put_user() fails and the first one succeeded userspace might now close an fd it didn't intend to. */
		put_unused_fd(fdarray[i]);
	} else {
		fd_install_received(fdarray[i], file);
	}
}

Christian
Kees Cook June 9, 2020, 8:55 p.m.
On Tue, Jun 09, 2020 at 10:03:46PM +0200, Christian Brauner wrote:
> I'm looking at __scm_install_fd() and I wonder what specifically you
> mean by that? The put_user() seems to be placed such that the install
> occurrs only if it succeeded. Sure, it only handles a single fd but
> whatever. Userspace knows that already. Just look at systemd when a msg
> fails:
> 
> void cmsg_close_all(struct msghdr *mh) {
>         struct cmsghdr *cmsg;
> 
>         assert(mh);
> 
>         CMSG_FOREACH(cmsg, mh)
>                 if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
>                         close_many((int*) CMSG_DATA(cmsg), (cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int));
> }
> 
> The only reasonable scenario for this whole mess I can think of is sm like (pseudo code):
> 
> fd_install_received(int fd, struct file *file)
> {
>  	sock = sock_from_file(fd, &err);
>  	if (sock) {
>  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
>  		sock_update_classid(&sock->sk->sk_cgrp_data);
>  	}
> 
> 	fd_install();
> }
> 
> error = 0;
> fdarray = malloc(fdmax);
> for (i = 0; i < fdmax; i++) {
> 	fdarray[i] = get_unused_fd_flags(o_flags);
> 	if (fdarray[i] < 0) {
> 		error = -EBADF;
> 		break;
> 	}
> 
> 	error = security_file_receive(file);
> 	if (error)
> 		break;
> 
> 	error = put_user(fd_array[i], ufd);
> 	if (error)
> 		break;
> }
> 
> for (i = 0; i < fdmax; i++) {
> 	if (error) {
> 		/* ignore errors */
> 		put_user(-EBADF, ufd); /* If this put_user() fails and the first one succeeded userspace might now close an fd it didn't intend to. */
> 		put_unused_fd(fdarray[i]);
> 	} else {
> 		fd_install_received(fdarray[i], file);
> 	}
> }

I see 4 cases of the same code pattern (get_unused_fd_flags(),
sock_update_*(), fd_install()), one of them has this difficult put_user()
in the middle, and one of them has a potential replace_fd() instead of
the get_used/fd_install. So, to me, it makes sense to have a helper that
encapsulates the common work that each of those call sites has to do,
which I keep cringing at all these suggestions that leave portions of it
outside the helper.

If it's too ugly to keep the put_user() in the helper, then we can try
what was suggested earlier, and just totally rework the failure path for
SCM_RIGHTS.

LOL. And while we were debating this, hch just went and cleaned stuff
up:

2618d530dd8b ("net/scm: cleanup scm_detach_fds")

So, um, yeah, now my proposal is actually even closer to what we already
have there. We just add the replace_fd() logic to __scm_install_fd() and
we're done with it.
Christian Brauner June 9, 2020, 9:27 p.m.
On June 9, 2020 10:55:42 PM GMT+02:00, Kees Cook <keescook@chromium.org> wrote:
>On Tue, Jun 09, 2020 at 10:03:46PM +0200, Christian Brauner wrote:
>> I'm looking at __scm_install_fd() and I wonder what specifically you
>> mean by that? The put_user() seems to be placed such that the install
>> occurrs only if it succeeded. Sure, it only handles a single fd but
>> whatever. Userspace knows that already. Just look at systemd when a
>msg
>> fails:
>> 
>> void cmsg_close_all(struct msghdr *mh) {
>>         struct cmsghdr *cmsg;
>> 
>>         assert(mh);
>> 
>>         CMSG_FOREACH(cmsg, mh)
>>                 if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type
>== SCM_RIGHTS)
>>                         close_many((int*) CMSG_DATA(cmsg),
>(cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int));
>> }
>> 
>> The only reasonable scenario for this whole mess I can think of is sm
>like (pseudo code):
>> 
>> fd_install_received(int fd, struct file *file)
>> {
>>  	sock = sock_from_file(fd, &err);
>>  	if (sock) {
>>  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
>>  		sock_update_classid(&sock->sk->sk_cgrp_data);
>>  	}
>> 
>> 	fd_install();
>> }
>> 
>> error = 0;
>> fdarray = malloc(fdmax);
>> for (i = 0; i < fdmax; i++) {
>> 	fdarray[i] = get_unused_fd_flags(o_flags);
>> 	if (fdarray[i] < 0) {
>> 		error = -EBADF;
>> 		break;
>> 	}
>> 
>> 	error = security_file_receive(file);
>> 	if (error)
>> 		break;
>> 
>> 	error = put_user(fd_array[i], ufd);
>> 	if (error)
>> 		break;
>> }
>> 
>> for (i = 0; i < fdmax; i++) {
>> 	if (error) {
>> 		/* ignore errors */
>> 		put_user(-EBADF, ufd); /* If this put_user() fails and the first
>one succeeded userspace might now close an fd it didn't intend to. */
>> 		put_unused_fd(fdarray[i]);
>> 	} else {
>> 		fd_install_received(fdarray[i], file);
>> 	}
>> }
>
>I see 4 cases of the same code pattern (get_unused_fd_flags(),
>sock_update_*(), fd_install()), one of them has this difficult
>put_user()
>in the middle, and one of them has a potential replace_fd() instead of
>the get_used/fd_install. So, to me, it makes sense to have a helper
>that
>encapsulates the common work that each of those call sites has to do,
>which I keep cringing at all these suggestions that leave portions of
>it
>outside the helper.
>
>If it's too ugly to keep the put_user() in the helper, then we can try
>what was suggested earlier, and just totally rework the failure path
>for
>SCM_RIGHTS.
>
>LOL. And while we were debating this, hch just went and cleaned stuff
>up:
>
>2618d530dd8b ("net/scm: cleanup scm_detach_fds")
>
>So, um, yeah, now my proposal is actually even closer to what we
>already
>have there. We just add the replace_fd() logic to __scm_install_fd()
>and
>we're done with it.

Cool, you have a link? :)

Christian
Kees Cook June 10, 2020, 5:27 a.m.
On Tue, Jun 09, 2020 at 11:27:30PM +0200, Christian Brauner wrote:
> On June 9, 2020 10:55:42 PM GMT+02:00, Kees Cook <keescook@chromium.org> wrote:
> >LOL. And while we were debating this, hch just went and cleaned stuff up:
> >
> >2618d530dd8b ("net/scm: cleanup scm_detach_fds")
> >
> >So, um, yeah, now my proposal is actually even closer to what we already
> >have there. We just add the replace_fd() logic to __scm_install_fd() and
> >we're done with it.
> 
> Cool, you have a link? :)

How about this:

https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/seccomp/addfd/v3.1&id=bb94586b9e7cc88e915536c2e9fb991a97b62416
Sargun Dhillon June 10, 2020, 8:12 a.m.
On Tue, Jun 09, 2020 at 10:27:54PM -0700, Kees Cook wrote:
> On Tue, Jun 09, 2020 at 11:27:30PM +0200, Christian Brauner wrote:
> > On June 9, 2020 10:55:42 PM GMT+02:00, Kees Cook <keescook@chromium.org> wrote:
> > >LOL. And while we were debating this, hch just went and cleaned stuff up:
> > >
> > >2618d530dd8b ("net/scm: cleanup scm_detach_fds")
> > >
> > >So, um, yeah, now my proposal is actually even closer to what we already
> > >have there. We just add the replace_fd() logic to __scm_install_fd() and
> > >we're done with it.
> > 
> > Cool, you have a link? :)
> 
> How about this:
> 
Thank you.
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/seccomp/addfd/v3.1&id=bb94586b9e7cc88e915536c2e9fb991a97b62416
> 
> -- 
> Kees Cook

+		if (ufd) {
+			error = put_user(new_fd, ufd);
+			if (error) {
+				put_unused_fd(new_fd);
+				return error;
+			}
+ 		}
I'm fairly sure this introduces a bug[1] if the user does:

struct msghdr msg = {};
struct cmsghdr *cmsg;
struct iovec io = {
	.iov_base = &c,
	.iov_len = 1,
};

msg.msg_iov = &io;
msg.msg_iovlen = 1;
msg.msg_control = NULL;
msg.msg_controllen = sizeof(buf);

recvmsg(sock, &msg, 0);

They will have the FD installed, no error message, but FD number wont be written 
to memory AFAICT. If two FDs are passed, you will get an efault. They will both
be installed, but memory wont be written to. Maybe instead of 0, make it a
poison pointer, or -1 instead?

-----
As an aside, all of this junk should be dropped:
+	ret = get_user(size, &uaddfd->size);
+	if (ret)
+		return ret;
+
+	ret = copy_struct_from_user(&addfd, sizeof(addfd), uaddfd, size);
+	if (ret)
+		return ret;

and the size member of the seccomp_notif_addfd struct. I brought this up 
off-list with Tycho that ioctls have the size of the struct embedded in them. We 
should just use that. The ioctl definition is based on this[2]:
#define _IOC(dir,type,nr,size) \
	(((dir)  << _IOC_DIRSHIFT) | \
	 ((type) << _IOC_TYPESHIFT) | \
	 ((nr)   << _IOC_NRSHIFT) | \
	 ((size) << _IOC_SIZESHIFT))


We should just use copy_from_user for now. In the future, we can either 
introduce new ioctl names for new structs, or extract the size dynamically from 
the ioctl (and mask it out on the switch statement in seccomp_notify_ioctl.

----
+#define SECCOMP_IOCTL_NOTIF_ADDFD	SECCOMP_IOR(3,	\
+						struct seccomp_notif_addfd)

Lastly, what I believe to be a small mistake, it should be SECCOMP_IOW, based on 
the documentation in ioctl.h -- "_IOW means userland is writing and kernel is 
reading."


[1]: https://lore.kernel.org/lkml/20200604052040.GA16501@ircssh-2.c.rugged-nimbus-611.internal/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/asm-generic/ioctl.h?id=v5.7#n69
David Laight June 10, 2020, 8:48 a.m.
From: Sargun Dhillon
> Sent: 10 June 2020 09:13
> 
> On Tue, Jun 09, 2020 at 10:27:54PM -0700, Kees Cook wrote:
> > On Tue, Jun 09, 2020 at 11:27:30PM +0200, Christian Brauner wrote:
> > > On June 9, 2020 10:55:42 PM GMT+02:00, Kees Cook <keescook@chromium.org> wrote:
> > > >LOL. And while we were debating this, hch just went and cleaned stuff up:
> > > >
> > > >2618d530dd8b ("net/scm: cleanup scm_detach_fds")
> > > >
> > > >So, um, yeah, now my proposal is actually even closer to what we already
> > > >have there. We just add the replace_fd() logic to __scm_install_fd() and
> > > >we're done with it.
> > >
> > > Cool, you have a link? :)
> >
> > How about this:
> >
> Thank you.
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/seccomp/addfd/v3.1&id=b
> b94586b9e7cc88e915536c2e9fb991a97b62416
> >
> > --
> > Kees Cook
> 
> +		if (ufd) {
> +			error = put_user(new_fd, ufd);
> +			if (error) {
> +				put_unused_fd(new_fd);
> +				return error;
> +			}
> + 		}
> I'm fairly sure this introduces a bug[1] if the user does:
> 
> struct msghdr msg = {};
> struct cmsghdr *cmsg;
> struct iovec io = {
> 	.iov_base = &c,
> 	.iov_len = 1,
> };
> 
> msg.msg_iov = &io;
> msg.msg_iovlen = 1;
> msg.msg_control = NULL;
> msg.msg_controllen = sizeof(buf);
> 
> recvmsg(sock, &msg, 0);
> 
> They will have the FD installed, no error message, but FD number wont be written
> to memory AFAICT. If two FDs are passed, you will get an efault. They will both
> be installed, but memory wont be written to. Maybe instead of 0, make it a
> poison pointer, or -1 instead?

IMHO if the buffer isn't big enough the nothing should happen.
(or maybe a few of the fds be returned and the others left for later.)

OTOH if the user passed an invalid address then installing the fd
and returning EFAULT (and hence SIGSEGV) seems reasonable.
Properly written apps just don't do that.

In essence the 'copy_to_user' is done by the wrapper code.
The code filling in the CMSG buffer can be considered to be
writing a kernel buffer.

IIRC other kernels (eg NetBSD) do the copies for ioctl() requests
in the ioctl syscall wrapper.
The IOW/IOR/IOWR flags have to be right.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Christian Brauner June 10, 2020, 9:30 a.m.
On Tue, Jun 09, 2020 at 11:27:30PM +0200, Christian Brauner wrote:
> On June 9, 2020 10:55:42 PM GMT+02:00, Kees Cook <keescook@chromium.org> wrote:
> >On Tue, Jun 09, 2020 at 10:03:46PM +0200, Christian Brauner wrote:
> >> I'm looking at __scm_install_fd() and I wonder what specifically you
> >> mean by that? The put_user() seems to be placed such that the install
> >> occurrs only if it succeeded. Sure, it only handles a single fd but
> >> whatever. Userspace knows that already. Just look at systemd when a
> >msg
> >> fails:
> >> 
> >> void cmsg_close_all(struct msghdr *mh) {
> >>         struct cmsghdr *cmsg;
> >> 
> >>         assert(mh);
> >> 
> >>         CMSG_FOREACH(cmsg, mh)
> >>                 if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type
> >== SCM_RIGHTS)
> >>                         close_many((int*) CMSG_DATA(cmsg),
> >(cmsg->cmsg_len - CMSG_LEN(0)) / sizeof(int));
> >> }
> >> 
> >> The only reasonable scenario for this whole mess I can think of is sm
> >like (pseudo code):
> >> 
> >> fd_install_received(int fd, struct file *file)
> >> {
> >>  	sock = sock_from_file(fd, &err);
> >>  	if (sock) {
> >>  		sock_update_netprioidx(&sock->sk->sk_cgrp_data);
> >>  		sock_update_classid(&sock->sk->sk_cgrp_data);
> >>  	}
> >> 
> >> 	fd_install();
> >> }
> >> 
> >> error = 0;
> >> fdarray = malloc(fdmax);
> >> for (i = 0; i < fdmax; i++) {
> >> 	fdarray[i] = get_unused_fd_flags(o_flags);
> >> 	if (fdarray[i] < 0) {
> >> 		error = -EBADF;
> >> 		break;
> >> 	}
> >> 
> >> 	error = security_file_receive(file);
> >> 	if (error)
> >> 		break;
> >> 
> >> 	error = put_user(fd_array[i], ufd);
> >> 	if (error)
> >> 		break;
> >> }
> >> 
> >> for (i = 0; i < fdmax; i++) {
> >> 	if (error) {
> >> 		/* ignore errors */
> >> 		put_user(-EBADF, ufd); /* If this put_user() fails and the first
> >one succeeded userspace might now close an fd it didn't intend to. */
> >> 		put_unused_fd(fdarray[i]);
> >> 	} else {
> >> 		fd_install_received(fdarray[i], file);
> >> 	}
> >> }
> >
> >I see 4 cases of the same code pattern (get_unused_fd_flags(),
> >sock_update_*(), fd_install()), one of them has this difficult
> >put_user()
> >in the middle, and one of them has a potential replace_fd() instead of
> >the get_used/fd_install. So, to me, it makes sense to have a helper
> >that
> >encapsulates the common work that each of those call sites has to do,
> >which I keep cringing at all these suggestions that leave portions of
> >it
> >outside the helper.
> >
> >If it's too ugly to keep the put_user() in the helper, then we can try
> >what was suggested earlier, and just totally rework the failure path
> >for
> >SCM_RIGHTS.
> >
> >LOL. And while we were debating this, hch just went and cleaned stuff
> >up:
> >
> >2618d530dd8b ("net/scm: cleanup scm_detach_fds")
> >
> >So, um, yeah, now my proposal is actually even closer to what we
> >already
> >have there. We just add the replace_fd() logic to __scm_install_fd()
> >and
> >we're done with it.
> 
> Cool, you have a link? :)

For the record, as I didn't see this yesterday since I was already
looking at a kernel with Christoph's changes. His changes just move the
logic that was already there before into a separate helper.

So effectively nothing has changed semantically in the scm code at all.

This is why I was asking yesterday what you meant by reworking the scm
code's put_user() logic as it seems obviously correct as it is done now.

Christian
Kees Cook June 10, 2020, 5:10 p.m.
On Wed, Jun 10, 2020 at 08:12:38AM +0000, Sargun Dhillon wrote:
> On Tue, Jun 09, 2020 at 10:27:54PM -0700, Kees Cook wrote:
> > On Tue, Jun 09, 2020 at 11:27:30PM +0200, Christian Brauner wrote:
> > > On June 9, 2020 10:55:42 PM GMT+02:00, Kees Cook <keescook@chromium.org> wrote:
> > > >LOL. And while we were debating this, hch just went and cleaned stuff up:
> > > >
> > > >2618d530dd8b ("net/scm: cleanup scm_detach_fds")
> > > >
> > > >So, um, yeah, now my proposal is actually even closer to what we already
> > > >have there. We just add the replace_fd() logic to __scm_install_fd() and
> > > >we're done with it.
> > > 
> > > Cool, you have a link? :)
> > 
> > How about this:
> > 
> Thank you.
> > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=devel/seccomp/addfd/v3.1&id=bb94586b9e7cc88e915536c2e9fb991a97b62416
> > 
> > -- 
> > Kees Cook
> 
> +		if (ufd) {
> +			error = put_user(new_fd, ufd);
> +			if (error) {
> +				put_unused_fd(new_fd);
> +				return error;
> +			}
> + 		}
> I'm fairly sure this introduces a bug[1] if the user does:

Ah, sorry, I missed this before I posted my "v3.2" tree link.

> 
> struct msghdr msg = {};
> struct cmsghdr *cmsg;
> struct iovec io = {
> 	.iov_base = &c,
> 	.iov_len = 1,
> };
> 
> msg.msg_iov = &io;
> msg.msg_iovlen = 1;
> msg.msg_control = NULL;
> msg.msg_controllen = sizeof(buf);
> 
> recvmsg(sock, &msg, 0);
> 
> They will have the FD installed, no error message, but FD number wont be written 
> to memory AFAICT. If two FDs are passed, you will get an efault. They will both
> be installed, but memory wont be written to. Maybe instead of 0, make it a
> poison pointer, or -1 instead?

Hmmm. I see what you mean -- SCM_RIGHTS effectively _requires_ a valid
__user pointer, so we can't use NULL to indicate "we don't want this".
I'm not sure I can pass this through directly at all, though.

> -----
> As an aside, all of this junk should be dropped:
> +	ret = get_user(size, &uaddfd->size);
> +	if (ret)
> +		return ret;
> +
> +	ret = copy_struct_from_user(&addfd, sizeof(addfd), uaddfd, size);
> +	if (ret)
> +		return ret;
> 
> and the size member of the seccomp_notif_addfd struct. I brought this up 
> off-list with Tycho that ioctls have the size of the struct embedded in them. We 
> should just use that. The ioctl definition is based on this[2]:
> #define _IOC(dir,type,nr,size) \
> 	(((dir)  << _IOC_DIRSHIFT) | \
> 	 ((type) << _IOC_TYPESHIFT) | \
> 	 ((nr)   << _IOC_NRSHIFT) | \
> 	 ((size) << _IOC_SIZESHIFT))
> 
> 
> We should just use copy_from_user for now. In the future, we can either 
> introduce new ioctl names for new structs, or extract the size dynamically from 
> the ioctl (and mask it out on the switch statement in seccomp_notify_ioctl.

Okay, sounds good.

> ----
> +#define SECCOMP_IOCTL_NOTIF_ADDFD	SECCOMP_IOR(3,	\
> +						struct seccomp_notif_addfd)
> 
> Lastly, what I believe to be a small mistake, it should be SECCOMP_IOW, based on 
> the documentation in ioctl.h -- "_IOW means userland is writing and kernel is 
> reading."

Okay, let me tweak things and get a "v3.3". ;)
Kees Cook June 11, 2020, 3:02 a.m.
On Wed, Jun 10, 2020 at 08:48:45AM +0000, David Laight wrote:
> From: Sargun Dhillon
> > Sent: 10 June 2020 09:13
> In essence the 'copy_to_user' is done by the wrapper code.
> The code filling in the CMSG buffer can be considered to be
> writing a kernel buffer.
> 
> IIRC other kernels (eg NetBSD) do the copies for ioctl() requests
> in the ioctl syscall wrapper.
> The IOW/IOR/IOWR flags have to be right.

Yeah, this seems like it'd make a lot more sense (and would have easily
caught the IOR/IOW issue pointed out later in the thread). I wonder how
insane it would be to try to fix that globally in the kernel...
David Laight June 11, 2020, 7:51 a.m.
From: Kees Cook
> Sent: 11 June 2020 04:03
...
> > IIRC other kernels (eg NetBSD) do the copies for ioctl() requests
> > in the ioctl syscall wrapper.
> > The IOW/IOR/IOWR flags have to be right.
> 
> Yeah, this seems like it'd make a lot more sense (and would have easily
> caught the IOR/IOW issue pointed out later in the thread). I wonder how
> insane it would be to try to fix that globally in the kernel...

Seems like a good idea to me.
(Even though I'll need to fix our 'out of tree' modules.)

Unlike [sg]etsockopt() at least the buffer is bounded to 1k.

But you'd really need to add new kernel_ioctl() entry points
before deprecating the existing ones a release or two later.

With a bit of luck there aren't any drivers ported from SYSV that
just treat the ioctl command as a 32bit transparent value and
the argument as an integer.

I actually suspect that BSD added IOW (etc) in the 16bit to 32bit port.
The kernel copies being moved to the syscall stub at the same time.
Since Linux has only ever been 32bit and uses IOW is it actually odd
that Linus didn't do the copies in the stub.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)