[v2,14/24] file: Implement task_lookup_next_fd_rcu

Submitted by Eric W. Biederman on Nov. 20, 2020, 11:14 p.m.


Message ID 20201120231441.29911-14-ebiederm@xmission.com
State New
Series "exec: Move unshare_files and guarantee files_struct.count is correct"
Headers show

Commit Message

Eric W. Biederman Nov. 20, 2020, 11:14 p.m.
As a companion to fget_task and task_lookup_fd_rcu implement
task_lookup_next_fd_rcu that will return the struct file for the first
file descriptor number that is equal or greater than the fd argument
value, or NULL if there is no such struct file.

This allows file descriptors of foreign processes to be iterated
through safely, without needed to increment the count on files_struct.

Some concern[1] has been expressed that this function takes the task_lock
for each iteration and thus for each file descriptor.  This place
where this function will be called in a commonly used code path is for
listing /proc/<pid>/fd.  I did some small benchmarks and did not see
any measurable performance differences.  For ordinary users ls is
likely to stat each of the directory entries and tid_fd_mode called
from tid_fd_revalidae has always taken the task lock for each file
descriptor.  So this does not look like it will be a big change in

At some point is will probably be worth changing put_files_struct to
free files_struct after an rcu grace period so that task_lock won't be
needed at all.

[1] https://lkml.kernel.org/r/20200817220425.9389-10-ebiederm@xmission.com
v1: https://lkml.kernel.org/r/20200817220425.9389-9-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
 fs/file.c               | 21 +++++++++++++++++++++
 include/linux/fdtable.h |  1 +
 2 files changed, 22 insertions(+)

Patch hide | download patch | download mbox

diff --git a/fs/file.c b/fs/file.c
index 6448523ca29e..23b888a4acbe 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -880,6 +880,27 @@  struct file *task_lookup_fd_rcu(struct task_struct *task, unsigned int fd)
 	return file;
+struct file *task_lookup_next_fd_rcu(struct task_struct *task, unsigned int *ret_fd)
+	/* Must be called with rcu_read_lock held */
+	struct files_struct *files;
+	unsigned int fd = *ret_fd;
+	struct file *file = NULL;
+	task_lock(task);
+	files = task->files;
+	if (files) {
+		for (; fd < files_fdtable(files)->max_fds; fd++) {
+			file = files_lookup_fd_rcu(files, fd);
+			if (file)
+				break;
+		}
+	}
+	task_unlock(task);
+	*ret_fd = fd;
+	return file;
  * Lightweight file lookup - no refcnt increment if fd table isn't shared.
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index a0558ae9b40c..cf6c52dae3a1 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -111,6 +111,7 @@  static inline struct file *lookup_fd_rcu(unsigned int fd)
 struct file *task_lookup_fd_rcu(struct task_struct *task, unsigned int fd);
+struct file *task_lookup_next_fd_rcu(struct task_struct *task, unsigned int *fd);
 struct task_struct;