[03/11] fs: Allow superblock owner to change ownership of inodes

Submitted by Dongsu Park on Dec. 22, 2017, 2:32 p.m.

Details

Message ID ac3d34002d7690f6ca5928b57b7fc4d707104b04.1512041070.git.dongsu@kinvolk.io
State New
Series "FUSE mounts from non-init user namespaces"
Headers show

Commit Message

Dongsu Park Dec. 22, 2017, 2:32 p.m.
From: Eric W. Biederman <ebiederm@xmission.com>

Allow users with CAP_SYS_CHOWN over the superblock of a filesystem to
chown files.  Ordinarily the capable_wrt_inode_uidgid check is
sufficient to allow access to files but when the underlying filesystem
has uids or gids that don't map to the current user namespace it is
not enough, so the chown permission checks need to be extended to
allow this case.

Calling chown on filesystem nodes whose uid or gid don't map is
necessary if those nodes are going to be modified as writing back
inodes which contain uids or gids that don't map is likely to cause
filesystem corruption of the uid or gid fields.

Once chown has been called the existing capable_wrt_inode_uidgid
checks are sufficient, to allow the owner of a superblock to do anything
the global root user can do with an appropriate set of capabilities.

For the proc filesystem this relaxation of permissions is not safe, as
some files are owned by users (particularly GLOBAL_ROOT_UID) outside
of the control of the mounter of the proc and that would be unsafe to
grant chown access to.  So update setattr on proc to disallow changing
files whose uids or gids are outside of proc's s_user_ns.

The original version of this patch was written by: Seth Forshee.  I
have rewritten and rethought this patch enough so it's really not the
same thing (certainly it needs a different description), but he
deserves credit for getting out there and getting the conversation
started, and finding the potential gotcha's and putting up with my
semi-paranoid feedback.

Patch v4 is available: https://patchwork.kernel.org/patch/8944611/

Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Inspired-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
[saf: Resolve conflicts caused by s/inode_change_ok/setattr_prepare/]
Signed-off-by: Dongsu Park <dongsu@kinvolk.io>
---
 fs/attr.c             | 34 ++++++++++++++++++++++++++--------
 fs/proc/base.c        |  7 +++++++
 fs/proc/generic.c     |  7 +++++++
 fs/proc/proc_sysctl.c |  7 +++++++
 4 files changed, 47 insertions(+), 8 deletions(-)

Patch hide | download patch | download mbox

diff --git a/fs/attr.c b/fs/attr.c
index 12ffdb6f..bf8e94f3 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -18,6 +18,30 @@ 
 #include <linux/evm.h>
 #include <linux/ima.h>
 
+static bool chown_ok(const struct inode *inode, kuid_t uid)
+{
+	if (uid_eq(current_fsuid(), inode->i_uid) &&
+	    uid_eq(uid, inode->i_uid))
+		return true;
+	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+		return true;
+	if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
+		return true;
+	return false;
+}
+
+static bool chgrp_ok(const struct inode *inode, kgid_t gid)
+{
+	if (uid_eq(current_fsuid(), inode->i_uid) &&
+	    (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
+		return true;
+	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+		return true;
+	if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
+		return true;
+	return false;
+}
+
 /**
  * setattr_prepare - check if attribute changes to a dentry are allowed
  * @dentry:	dentry to check
@@ -52,17 +76,11 @@  int setattr_prepare(struct dentry *dentry, struct iattr *attr)
 		goto kill_priv;
 
 	/* Make sure a caller can chown. */
-	if ((ia_valid & ATTR_UID) &&
-	    (!uid_eq(current_fsuid(), inode->i_uid) ||
-	     !uid_eq(attr->ia_uid, inode->i_uid)) &&
-	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+	if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
 		return -EPERM;
 
 	/* Make sure caller can chgrp. */
-	if ((ia_valid & ATTR_GID) &&
-	    (!uid_eq(current_fsuid(), inode->i_uid) ||
-	    (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
-	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
+	if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
 		return -EPERM;
 
 	/* Make sure a caller can chmod. */
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 31934cb9..9d50ec92 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -665,10 +665,17 @@  int proc_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int error;
 	struct inode *inode = d_inode(dentry);
+	struct user_namespace *s_user_ns;
 
 	if (attr->ia_valid & ATTR_MODE)
 		return -EPERM;
 
+	/* Don't let anyone mess with weird proc files */
+	s_user_ns = inode->i_sb->s_user_ns;
+	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
+	    !kgid_has_mapping(s_user_ns, inode->i_gid))
+		return -EPERM;
+
 	error = setattr_prepare(dentry, attr);
 	if (error)
 		return error;
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 793a6757..527d46c8 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -106,8 +106,15 @@  static int proc_notify_change(struct dentry *dentry, struct iattr *iattr)
 {
 	struct inode *inode = d_inode(dentry);
 	struct proc_dir_entry *de = PDE(inode);
+	struct user_namespace *s_user_ns;
 	int error;
 
+	/* Don't let anyone mess with weird proc files */
+	s_user_ns = inode->i_sb->s_user_ns;
+	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
+	    !kgid_has_mapping(s_user_ns, inode->i_gid))
+		return -EPERM;
+
 	error = setattr_prepare(dentry, iattr);
 	if (error)
 		return error;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c5cbbdff..0f9562d1 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -802,11 +802,18 @@  static int proc_sys_permission(struct inode *inode, int mask)
 static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	struct inode *inode = d_inode(dentry);
+	struct user_namespace *s_user_ns;
 	int error;
 
 	if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
 		return -EPERM;
 
+	/* Don't let anyone mess with weird proc files */
+	s_user_ns = inode->i_sb->s_user_ns;
+	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
+	    !kgid_has_mapping(s_user_ns, inode->i_gid))
+		return -EPERM;
+
 	error = setattr_prepare(dentry, attr);
 	if (error)
 		return error;

Comments

Serge E. Hallyn Dec. 23, 2017, 3:17 a.m.
On Fri, Dec 22, 2017 at 03:32:27PM +0100, Dongsu Park wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
> 
> Allow users with CAP_SYS_CHOWN over the superblock of a filesystem to

Note it is CAP_CHOWN

> chown files.  Ordinarily the capable_wrt_inode_uidgid check is
> sufficient to allow access to files but when the underlying filesystem
> has uids or gids that don't map to the current user namespace it is
> not enough, so the chown permission checks need to be extended to
> allow this case.
> 
> Calling chown on filesystem nodes whose uid or gid don't map is
> necessary if those nodes are going to be modified as writing back
> inodes which contain uids or gids that don't map is likely to cause
> filesystem corruption of the uid or gid fields.
> 
> Once chown has been called the existing capable_wrt_inode_uidgid
> checks are sufficient, to allow the owner of a superblock to do anything
> the global root user can do with an appropriate set of capabilities.
> 
> For the proc filesystem this relaxation of permissions is not safe, as
> some files are owned by users (particularly GLOBAL_ROOT_UID) outside
> of the control of the mounter of the proc and that would be unsafe to
> grant chown access to.  So update setattr on proc to disallow changing
> files whose uids or gids are outside of proc's s_user_ns.
> 
> The original version of this patch was written by: Seth Forshee.  I
> have rewritten and rethought this patch enough so it's really not the
> same thing (certainly it needs a different description), but he
> deserves credit for getting out there and getting the conversation
> started, and finding the potential gotcha's and putting up with my
> semi-paranoid feedback.
> 
> Patch v4 is available: https://patchwork.kernel.org/patch/8944611/
> 
> Cc: linux-fsdevel@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Inspired-by: Seth Forshee <seth.forshee@canonical.com>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> [saf: Resolve conflicts caused by s/inode_change_ok/setattr_prepare/]
> Signed-off-by: Dongsu Park <dongsu@kinvolk.io>

Reviewed-by: Serge Hallyn <serge@hallyn.com>

> ---
>  fs/attr.c             | 34 ++++++++++++++++++++++++++--------
>  fs/proc/base.c        |  7 +++++++
>  fs/proc/generic.c     |  7 +++++++
>  fs/proc/proc_sysctl.c |  7 +++++++
>  4 files changed, 47 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/attr.c b/fs/attr.c
> index 12ffdb6f..bf8e94f3 100644
> --- a/fs/attr.c
> +++ b/fs/attr.c
> @@ -18,6 +18,30 @@
>  #include <linux/evm.h>
>  #include <linux/ima.h>
>  
> +static bool chown_ok(const struct inode *inode, kuid_t uid)
> +{
> +	if (uid_eq(current_fsuid(), inode->i_uid) &&
> +	    uid_eq(uid, inode->i_uid))
> +		return true;
> +	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +		return true;
> +	if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
> +		return true;
> +	return false;
> +}
> +
> +static bool chgrp_ok(const struct inode *inode, kgid_t gid)
> +{
> +	if (uid_eq(current_fsuid(), inode->i_uid) &&
> +	    (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
> +		return true;
> +	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +		return true;
> +	if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
> +		return true;
> +	return false;
> +}
> +
>  /**
>   * setattr_prepare - check if attribute changes to a dentry are allowed
>   * @dentry:	dentry to check
> @@ -52,17 +76,11 @@ int setattr_prepare(struct dentry *dentry, struct iattr *attr)
>  		goto kill_priv;
>  
>  	/* Make sure a caller can chown. */
> -	if ((ia_valid & ATTR_UID) &&
> -	    (!uid_eq(current_fsuid(), inode->i_uid) ||
> -	     !uid_eq(attr->ia_uid, inode->i_uid)) &&
> -	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +	if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
>  		return -EPERM;
>  
>  	/* Make sure caller can chgrp. */
> -	if ((ia_valid & ATTR_GID) &&
> -	    (!uid_eq(current_fsuid(), inode->i_uid) ||
> -	    (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
> -	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +	if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
>  		return -EPERM;
>  
>  	/* Make sure a caller can chmod. */
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 31934cb9..9d50ec92 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -665,10 +665,17 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>  	int error;
>  	struct inode *inode = d_inode(dentry);
> +	struct user_namespace *s_user_ns;
>  
>  	if (attr->ia_valid & ATTR_MODE)
>  		return -EPERM;
>  
> +	/* Don't let anyone mess with weird proc files */
> +	s_user_ns = inode->i_sb->s_user_ns;
> +	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
> +	    !kgid_has_mapping(s_user_ns, inode->i_gid))
> +		return -EPERM;
> +
>  	error = setattr_prepare(dentry, attr);
>  	if (error)
>  		return error;
> diff --git a/fs/proc/generic.c b/fs/proc/generic.c
> index 793a6757..527d46c8 100644
> --- a/fs/proc/generic.c
> +++ b/fs/proc/generic.c
> @@ -106,8 +106,15 @@ static int proc_notify_change(struct dentry *dentry, struct iattr *iattr)
>  {
>  	struct inode *inode = d_inode(dentry);
>  	struct proc_dir_entry *de = PDE(inode);
> +	struct user_namespace *s_user_ns;
>  	int error;
>  
> +	/* Don't let anyone mess with weird proc files */
> +	s_user_ns = inode->i_sb->s_user_ns;
> +	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
> +	    !kgid_has_mapping(s_user_ns, inode->i_gid))
> +		return -EPERM;
> +
>  	error = setattr_prepare(dentry, iattr);
>  	if (error)
>  		return error;
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index c5cbbdff..0f9562d1 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -802,11 +802,18 @@ static int proc_sys_permission(struct inode *inode, int mask)
>  static int proc_sys_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>  	struct inode *inode = d_inode(dentry);
> +	struct user_namespace *s_user_ns;
>  	int error;
>  
>  	if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
>  		return -EPERM;
>  
> +	/* Don't let anyone mess with weird proc files */
> +	s_user_ns = inode->i_sb->s_user_ns;
> +	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
> +	    !kgid_has_mapping(s_user_ns, inode->i_gid))
> +		return -EPERM;
> +
>  	error = setattr_prepare(dentry, attr);
>  	if (error)
>  		return error;
> -- 
> 2.13.6
Luis R. Rodriguez Jan. 5, 2018, 7:24 p.m.
On Fri, Dec 22, 2017 at 03:32:27PM +0100, Dongsu Park wrote:
> diff --git a/fs/attr.c b/fs/attr.c
> index 12ffdb6f..bf8e94f3 100644
> --- a/fs/attr.c
> +++ b/fs/attr.c
> @@ -18,6 +18,30 @@
>  #include <linux/evm.h>
>  #include <linux/ima.h>
>  
> +static bool chown_ok(const struct inode *inode, kuid_t uid)
> +{
> +	if (uid_eq(current_fsuid(), inode->i_uid) &&
> +	    uid_eq(uid, inode->i_uid))
> +		return true;
> +	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +		return true;
> +	if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
> +		return true;
> +	return false;
> +}
> +
> +static bool chgrp_ok(const struct inode *inode, kgid_t gid)
> +{
> +	if (uid_eq(current_fsuid(), inode->i_uid) &&
> +	    (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
> +		return true;
> +	if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +		return true;
> +	if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
> +		return true;
> +	return false;
> +}
> +
>  /**
>   * setattr_prepare - check if attribute changes to a dentry are allowed
>   * @dentry:	dentry to check
> @@ -52,17 +76,11 @@ int setattr_prepare(struct dentry *dentry, struct iattr *attr)
>  		goto kill_priv;
>  
>  	/* Make sure a caller can chown. */
> -	if ((ia_valid & ATTR_UID) &&
> -	    (!uid_eq(current_fsuid(), inode->i_uid) ||
> -	     !uid_eq(attr->ia_uid, inode->i_uid)) &&
> -	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +	if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
>  		return -EPERM;

I think this patch would read much better and easier to review if it was
split up by first adding the helpers, and then extending them afterwards.

>  
>  	/* Make sure caller can chgrp. */
> -	if ((ia_valid & ATTR_GID) &&
> -	    (!uid_eq(current_fsuid(), inode->i_uid) ||
> -	    (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
> -	    !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
> +	if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
>  		return -EPERM;
>  
>  	/* Make sure a caller can chmod. */
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 31934cb9..9d50ec92 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -665,10 +665,17 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>  	int error;
>  	struct inode *inode = d_inode(dentry);
> +	struct user_namespace *s_user_ns;
>  
>  	if (attr->ia_valid & ATTR_MODE)
>  		return -EPERM;
>  
> +	/* Don't let anyone mess with weird proc files */
> +	s_user_ns = inode->i_sb->s_user_ns;
> +	if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
> +	    !kgid_has_mapping(s_user_ns, inode->i_gid))
> +		return -EPERM;
> +
>  	error = setattr_prepare(dentry, attr);
>  	if (error)
>  		return error;

Are we sure proc is the only special one? How was it observed first that this was
require for proc? Has anyone tried fuzzing by trying this op with a slew of other
filesystems on all files?

  Luis
Dongsu Park Jan. 9, 2018, 3:10 p.m.
Hi,

On Fri, Jan 5, 2018 at 8:24 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Fri, Dec 22, 2017 at 03:32:27PM +0100, Dongsu Park wrote:
>> diff --git a/fs/attr.c b/fs/attr.c
>> index 12ffdb6f..bf8e94f3 100644
>> --- a/fs/attr.c
>> +++ b/fs/attr.c
>> @@ -18,6 +18,30 @@
>>  #include <linux/evm.h>
>>  #include <linux/ima.h>
>>
>> +static bool chown_ok(const struct inode *inode, kuid_t uid)
>> +{
>> +     if (uid_eq(current_fsuid(), inode->i_uid) &&
>> +         uid_eq(uid, inode->i_uid))
>> +             return true;
>> +     if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
>> +             return true;
>> +     if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
>> +             return true;
>> +     return false;
>> +}
>> +
>> +static bool chgrp_ok(const struct inode *inode, kgid_t gid)
>> +{
>> +     if (uid_eq(current_fsuid(), inode->i_uid) &&
>> +         (in_group_p(gid) || gid_eq(gid, inode->i_gid)))
>> +             return true;
>> +     if (capable_wrt_inode_uidgid(inode, CAP_CHOWN))
>> +             return true;
>> +     if (ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN))
>> +             return true;
>> +     return false;
>> +}
>> +
>>  /**
>>   * setattr_prepare - check if attribute changes to a dentry are allowed
>>   * @dentry:  dentry to check
>> @@ -52,17 +76,11 @@ int setattr_prepare(struct dentry *dentry, struct iattr *attr)
>>               goto kill_priv;
>>
>>       /* Make sure a caller can chown. */
>> -     if ((ia_valid & ATTR_UID) &&
>> -         (!uid_eq(current_fsuid(), inode->i_uid) ||
>> -          !uid_eq(attr->ia_uid, inode->i_uid)) &&
>> -         !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
>> +     if ((ia_valid & ATTR_UID) && !chown_ok(inode, attr->ia_uid))
>>               return -EPERM;
>
> I think this patch would read much better and easier to review if it was
> split up by first adding the helpers, and then extending them afterwards.

I'm fine with splitting it up into multiple patches, if the original author
Eric agrees.

>>       /* Make sure caller can chgrp. */
>> -     if ((ia_valid & ATTR_GID) &&
>> -         (!uid_eq(current_fsuid(), inode->i_uid) ||
>> -         (!in_group_p(attr->ia_gid) && !gid_eq(attr->ia_gid, inode->i_gid))) &&
>> -         !capable_wrt_inode_uidgid(inode, CAP_CHOWN))
>> +     if ((ia_valid & ATTR_GID) && !chgrp_ok(inode, attr->ia_gid))
>>               return -EPERM;
>>
>>       /* Make sure a caller can chmod. */
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index 31934cb9..9d50ec92 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -665,10 +665,17 @@ int proc_setattr(struct dentry *dentry, struct iattr *attr)
>>  {
>>       int error;
>>       struct inode *inode = d_inode(dentry);
>> +     struct user_namespace *s_user_ns;
>>
>>       if (attr->ia_valid & ATTR_MODE)
>>               return -EPERM;
>>
>> +     /* Don't let anyone mess with weird proc files */
>> +     s_user_ns = inode->i_sb->s_user_ns;
>> +     if (!kuid_has_mapping(s_user_ns, inode->i_uid) ||
>> +         !kgid_has_mapping(s_user_ns, inode->i_gid))
>> +             return -EPERM;
>> +
>>       error = setattr_prepare(dentry, attr);
>>       if (error)
>>               return error;
>
> Are we sure proc is the only special one? How was it observed first that this was
> require for proc? Has anyone tried fuzzing by trying this op with a slew of other
> filesystems on all files?

From my limited knowledge about procfs, I suppose that procfs is a little
different from ordinary filesystems. Procfs is not exactly namespaced,
it has many inconsistencies. Some files under /proc should be owned by the
global root, regardless of user namespaces. That's why we need to handle such
special cases for proc. As it has been historically like that since the
beginning, it's hard to change it fundamentally.

However, you have good points. Other than procfs, there could be other
filesystems that have potential issues when relaxing privileges. Question is
how we can be sure that there's no hidden issues. From my understanding,
usually we could run testsuites like LTP
(https://github.com/linux-test-project/ltp.git) to avoid such regressions.
Today I have run LTP tests for fs & containers, with the patchset included.
It seemed to work fine without failures. Obviously it doesn't mean that it's
completely bug-free, when we are talking about unknown issues.
Please let me know if there are other good ways to figure out potential issues.

Thanks,
Dongsu

>   Luis
Luis R. Rodriguez Jan. 9, 2018, 5:23 p.m.
On Tue, Jan 09, 2018 at 04:10:54PM +0100, Dongsu Park wrote:
> On Fri, Jan 5, 2018 at 8:24 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Fri, Dec 22, 2017 at 03:32:27PM +0100, Dongsu Park wrote:
> > I think this patch would read much better and easier to review if it was
> > split up by first adding the helpers, and then extending them afterwards.
> 
> I'm fine with splitting it up into multiple patches, if the original author
> Eric agrees.

Great.

> > Are we sure proc is the only special one? How was it observed first that this was
> > require for proc? Has anyone tried fuzzing by trying this op with a slew of other
> > filesystems on all files?
>
> Please let me know if there are other good ways to figure out potential issues.

I think the trick would be to create a test which mimicks the issue and then try to
mount and run the test against as many filesystems as we support. So would developing
a test be possible here?

  Luis
Miklos Szeredi Feb. 13, 2018, 1:18 p.m.
On Fri, Dec 22, 2017 at 3:32 PM, Dongsu Park <dongsu@kinvolk.io> wrote:
> From: Eric W. Biederman <ebiederm@xmission.com>
>
> Allow users with CAP_SYS_CHOWN over the superblock of a filesystem to
> chown files.  Ordinarily the capable_wrt_inode_uidgid check is
> sufficient to allow access to files but when the underlying filesystem
> has uids or gids that don't map to the current user namespace it is
> not enough, so the chown permission checks need to be extended to
> allow this case.
>
> Calling chown on filesystem nodes whose uid or gid don't map is
> necessary if those nodes are going to be modified as writing back
> inodes which contain uids or gids that don't map is likely to cause
> filesystem corruption of the uid or gid fields.

How can the filesystem be corrupted if chown is denied?

It is not clear to me what the purpose of this patch is or what the
exact usecase this is fixing.

Thanks,
Miklos
Eric W. Biederman Feb. 16, 2018, 10 p.m.
Miklos Szeredi <mszeredi@redhat.com> writes:

> On Fri, Dec 22, 2017 at 3:32 PM, Dongsu Park <dongsu@kinvolk.io> wrote:
>> From: Eric W. Biederman <ebiederm@xmission.com>
>>
>> Allow users with CAP_SYS_CHOWN over the superblock of a filesystem to
>> chown files.  Ordinarily the capable_wrt_inode_uidgid check is
>> sufficient to allow access to files but when the underlying filesystem
>> has uids or gids that don't map to the current user namespace it is
>> not enough, so the chown permission checks need to be extended to
>> allow this case.
>>
>> Calling chown on filesystem nodes whose uid or gid don't map is
>> necessary if those nodes are going to be modified as writing back
>> inodes which contain uids or gids that don't map is likely to cause
>> filesystem corruption of the uid or gid fields.
>
> How can the filesystem be corrupted if chown is denied?
>
> It is not clear to me what the purpose of this patch is or what the
> exact usecase this is fixing.

It isn't a fix and we can delay this one and similar patches
that enable things until we are certain all of the necessary
restrictions are in place.  This is not essential for safely getting
fully unprivileged mounting of fuse to work.

The overall strategy has been to handle as many of the generic concerns
at the vfs level as possible to separate filesystem concerns and generic
concerns.

In this case the generic concern is what happens when the uid is read
from the filesystem and it gets mapped to INVALID_UID and then the inode
for that file is written back.

That is a trap for the unwary filesystem implementation and not a case
that I think anyone will actually care about.  It is just not useful
to mount a filesystem and to not map some of it's ids.   So the generic
vfs code just denies writes to files like show with uid of INVALID_UID
or gid of INVALID_GID.  Just to ensure that problems don't show up.

This patch gets through those defenses.

Eric