[v2,02/24] exec: Simplify unshare_files

Submitted by Eric W. Biederman on Nov. 20, 2020, 11:14 p.m.

Details

Message ID 20201120231441.29911-2-ebiederm@xmission.com
State New
Series "exec: Move unshare_files and guarantee files_struct.count is correct"
Headers show

Commit Message

Eric W. Biederman Nov. 20, 2020, 11:14 p.m.
Now that exec no longer needs to return the unshared files to their
previous value there is no reason to return displaced.

Instead when unshare_fd creates a copy of the file table, call
put_files_struct before returning from unshare_files.

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
v1: https://lkml.kernel.org/r/20200817220425.9389-2-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/coredump.c           |  5 +----
 fs/exec.c               |  5 +----
 include/linux/fdtable.h |  2 +-
 kernel/fork.c           | 12 ++++++------
 4 files changed, 9 insertions(+), 15 deletions(-)

Patch hide | download patch | download mbox

diff --git a/fs/coredump.c b/fs/coredump.c
index 0cd9056d79cc..abf807235262 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -585,7 +585,6 @@  void do_coredump(const kernel_siginfo_t *siginfo)
 	int ispipe;
 	size_t *argv = NULL;
 	int argc = 0;
-	struct files_struct *displaced;
 	/* require nonrelative corefile path and be extra careful */
 	bool need_suid_safe = false;
 	bool core_dumped = false;
@@ -791,11 +790,9 @@  void do_coredump(const kernel_siginfo_t *siginfo)
 	}
 
 	/* get us an unshared descriptor table; almost always a no-op */
-	retval = unshare_files(&displaced);
+	retval = unshare_files();
 	if (retval)
 		goto close_fail;
-	if (displaced)
-		put_files_struct(displaced);
 	if (!dump_interrupted()) {
 		/*
 		 * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would
diff --git a/fs/exec.c b/fs/exec.c
index fa788988efe9..14fae2ec1c9d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1238,7 +1238,6 @@  void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec)
 int begin_new_exec(struct linux_binprm * bprm)
 {
 	struct task_struct *me = current;
-	struct files_struct *displaced;
 	int retval;
 
 	/* Once we are committed compute the creds */
@@ -1259,11 +1258,9 @@  int begin_new_exec(struct linux_binprm * bprm)
 		goto out;
 
 	/* Ensure the files table is not shared. */
-	retval = unshare_files(&displaced);
+	retval = unshare_files();
 	if (retval)
 		goto out;
-	if (displaced)
-		put_files_struct(displaced);
 
 	/*
 	 * Must be called _before_ exec_mmap() as bprm->mm is
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index a32bf47c593e..f46a084b60b2 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -109,7 +109,7 @@  struct task_struct;
 struct files_struct *get_files_struct(struct task_struct *);
 void put_files_struct(struct files_struct *fs);
 void reset_files_struct(struct files_struct *);
-int unshare_files(struct files_struct **);
+int unshare_files(void);
 struct files_struct *dup_fd(struct files_struct *, unsigned, int *) __latent_entropy;
 void do_close_on_exec(struct files_struct *);
 int iterate_fd(struct files_struct *, unsigned,
diff --git a/kernel/fork.c b/kernel/fork.c
index 32083db7a2a2..837b546528c8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -3023,21 +3023,21 @@  SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
  *	the exec layer of the kernel.
  */
 
-int unshare_files(struct files_struct **displaced)
+int unshare_files(void)
 {
 	struct task_struct *task = current;
-	struct files_struct *copy = NULL;
+	struct files_struct *old, *copy = NULL;
 	int error;
 
 	error = unshare_fd(CLONE_FILES, NR_OPEN_MAX, &copy);
-	if (error || !copy) {
-		*displaced = NULL;
+	if (error || !copy)
 		return error;
-	}
-	*displaced = task->files;
+
+	old = task->files;
 	task_lock(task);
 	task->files = copy;
 	task_unlock(task);
+	put_files_struct(old);
 	return 0;
 }
 

Comments

Oleg Nesterov Nov. 23, 2020, 5:50 p.m.
I'll try to actually read this series tomorrow but I see nothing wrong
after the quick glance.

Just one off-topic question,

On 11/20, Eric W. Biederman wrote:
>
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -585,7 +585,6 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>  	int ispipe;
>  	size_t *argv = NULL;
>  	int argc = 0;
> -	struct files_struct *displaced;
>  	/* require nonrelative corefile path and be extra careful */
>  	bool need_suid_safe = false;
>  	bool core_dumped = false;
> @@ -791,11 +790,9 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>  	}
>
>  	/* get us an unshared descriptor table; almost always a no-op */
> -	retval = unshare_files(&displaced);
> +	retval = unshare_files();

Can anyone explain why does do_coredump() need unshare_files() at all?

Oleg.
Eric W. Biederman Nov. 23, 2020, 6:04 p.m.
Oleg Nesterov <oleg@redhat.com> writes:

> I'll try to actually read this series tomorrow but I see nothing wrong
> after the quick glance.
>
> Just one off-topic question,
>
> On 11/20, Eric W. Biederman wrote:
>>
>> --- a/fs/coredump.c
>> +++ b/fs/coredump.c
>> @@ -585,7 +585,6 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>>  	int ispipe;
>>  	size_t *argv = NULL;
>>  	int argc = 0;
>> -	struct files_struct *displaced;
>>  	/* require nonrelative corefile path and be extra careful */
>>  	bool need_suid_safe = false;
>>  	bool core_dumped = false;
>> @@ -791,11 +790,9 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>>  	}
>>
>>  	/* get us an unshared descriptor table; almost always a no-op */
>> -	retval = unshare_files(&displaced);
>> +	retval = unshare_files();
>
> Can anyone explain why does do_coredump() need unshare_files() at all?

A very good question.  I see the commit where this change came in
179e037fc137 ("do_coredump(): make sure that descriptor table isn't
shared").  But that doesn't give much information.

I see the obvious that unshare_files keeps us from racing with process
that share files that are not part of the core dump.  However in
fs/coredump.c and fs/binfmt_elf.c I don't see anywhere that we actually
dump the set of files the program has open so I don't know what the
point of preventing races is.

So I we can verify that none of the other core dump handlers cares and
remove unshare_files entirely from do_coredump.

Eric
Linus Torvalds Nov. 23, 2020, 6:25 p.m.
On Mon, Nov 23, 2020 at 9:52 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> Can anyone explain why does do_coredump() need unshare_files() at all?

Hmm. It goes back to 2012, and it's placed just before calling
"->core_dump()", so I assume some core dumping function messed with
the file table back when..

I can't see anything like that currently.

The alternative is that core-dumping just keeps the file table around
for a long while, and thus files don't actually close in a timely
manner. So it might not be a "correctness" issue as much as a latency
issue.

             Linus
Eric W. Biederman Nov. 23, 2020, 8:54 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Mon, Nov 23, 2020 at 9:52 AM Oleg Nesterov <oleg@redhat.com> wrote:
>>
>> Can anyone explain why does do_coredump() need unshare_files() at all?
>
> Hmm. It goes back to 2012, and it's placed just before calling
> "->core_dump()", so I assume some core dumping function messed with
> the file table back when..

I don't see that in the code from back then.  Until 2009 binfmt_elf did
install a file descriptor in the file descriptor table e7b9b550f53e
("Don't mess with descriptor table in load_elf_binary()").  But that was
gone for 3 years by the time the wrapper was added 

> I can't see anything like that currently.
>
> The alternative is that core-dumping just keeps the file table around
> for a long while, and thus files don't actually close in a timely
> manner. So it might not be a "correctness" issue as much as a latency
> issue.

That doesn't work out.  Unsharing the file descriptor table would
increase the latency because it guarantees that the file descritpors
have a reference until the core dump is finished.  While continuing to
share the file descriptor table would allow the file descriptors to be
closed.

Continue to look but I don't see a reason for that unshare.

Eric
Al Viro Dec. 7, 2020, 10:32 p.m.
On Mon, Nov 23, 2020 at 10:25:13AM -0800, Linus Torvalds wrote:
> On Mon, Nov 23, 2020 at 9:52 AM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Can anyone explain why does do_coredump() need unshare_files() at all?
> 
> Hmm. It goes back to 2012, and it's placed just before calling
> "->core_dump()", so I assume some core dumping function messed with
> the file table back when..
> 
> I can't see anything like that currently.
> 
> The alternative is that core-dumping just keeps the file table around
> for a long while, and thus files don't actually close in a timely
> manner. So it might not be a "correctness" issue as much as a latency
> issue.

IIRC, it was "weird architecture hooks might be playing silly buggers
with some per-descriptor information they want in coredumps, better
make sure it can't change under them"; it doesn't cost much and
it reduced the analysis surface nicely.

Had been a while ago, so the memories might be faulty...  Anyway, that
reasoning seems to be applicable right now - rather than keeping an
eye on coredump logics on random architectures that might be looking
at descriptor table in unsafe way, just make sure they have a stable
private table and be done with that.

How much is simplified by not doing it there, anyway?