[v2,18/30] files: Choose file master with enough permissions

Submitted by Kirill Tkhai on June 16, 2017, 9:05 a.m.

Details

Message ID 20170616090556.gqu4stku6vlidpg3@virtuozzo.com
State Accepted
Series "Support sockets leaked to child user_ns task"
Commit 8c2b8a675ff29baa033383b306237c5659a1e63b
Headers show

Commit Message

Kirill Tkhai June 16, 2017, 9:05 a.m.
On Thu, Jun 15, 2017 at 12:16, Cyrill Gorcunov wrote:
> On Thu, Jun 15, 2017 at 12:10:43PM +0300, Kirill Tkhai wrote:
> > On Wed, Jun 14, 2017 at 23:32, Andrei Vagin wrote:
> > > On Wed, Jun 07, 2017 at 02:28:53PM +0300, Kirill Tkhai wrote:
> > > > 1)Find such fle, and link it at the beginning of list.
> > > > 2)Order by pid, where possible, if it does not contradict (1)
> > > 
> > > Why do we need to order by pid?
> > 
> > This was initially, and I left the logic. As I know,
> > it's need for epoll, to place master in parent task.
> > 
> > CC: gorcunov@virtuozzo.com
> > Cyrill, could you please say, why we need this, if you remember?
> 
> I think it's the same as in bug we met.
> ---
> commit 2df9c9dc6e0b926aaba00138e3e66295ebea76ce
> Author: Cyrill Gorcunov <gorcunov@virtuozzo.com>
> Date:   Mon Apr 3 18:38:55 2017 +0300
> 
>     vz7: files -- Select proper master fd when collecting fd
>     
>     When choosing the master file which gonna be sending file
>     descriptor to the children we must not only look into
>     their PIDs but consider process tree relations, in particular
>     the child of a process might be choosen as a master and
>     epoll restore will fail because target files are simply
>     not present in child tree.
>     
>     |  31964  31964  31964       epoll
>     |    585  31964  31964           epoll
>     |    586  31964  31964           epoll
>     |...
>     | (04.797121)    585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)
>     
>     That's because the target files are blong to 31964 and not
>     present in child 585, but because PID wrapp happened it
>     has been chosen as a leader which is of course wrong.
>     
>     https://jira.sw.ru/browse/PSBM-63355
>     
>     Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>

[PATCH v3 18/30]files: Choose file master with enough permissions

1)Find such fle, and link it at the beginning of list.
2)Order by pid, where possible, if it does not contradict (1)
3)If there is no a master, leave fdesc in fake_master_head.

v3: Describe pid order reasons

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 criu/files.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/criu/files.c b/criu/files.c
index 8c2ba78a6..5200a2ea8 100644
--- a/criu/files.c
+++ b/criu/files.c
@@ -84,6 +84,13 @@  int file_desc_add(struct file_desc *d, u32 id, struct file_desc_ops *ops)
 	hlist_add_head(&d->hash, &file_desc_hash[id % FDESC_HASH_SIZE]);
 	if (d->ops->get_user_ns) {
 		d->ops->get_user_ns(d, &file_user_ns, &d->setns_userns);
+		/*
+		 * Hash file_desc in the fake_master_list. Later it will
+		 * be removed in collect_desc_fle() from there, if a fle
+		 * with enough permissions is found.
+		 */
+		if (d->setns_userns)
+			list_add(&d->fake_master_list, &fake_master_head);
 	}
 	return 0; /* this is to make tail-calls in collect_one_foo look nice */
 }
@@ -717,12 +724,49 @@  static struct fdinfo_list_entry *alloc_fle(int pid, FdinfoEntry *fe)
 static void collect_desc_fle(struct fdinfo_list_entry *new_le, struct file_desc *fdesc)
 {
 	struct fdinfo_list_entry *le;
+	struct ns_id *task_ns = NULL;
+	bool first = true;
 
 	new_le->desc = fdesc;
 
-	list_for_each_entry(le, &fdesc->fd_info_head, desc_list)
+	/*
+	 * First fle in fdesc->fd_info_head list (i.e., master)
+	 * must have enough permissions to restore file.
+	 * Sort the rest of list and cases by pid if it's possible
+	 * (this is a fix for epolls, which gives a little more
+	 * possibility, the master file is belonged to parent
+	 * process, and the parent more probably can restore epoll).
+	 */
+	list_for_each_entry(le, &fdesc->fd_info_head, desc_list) {
+		if (!fdesc->setns_userns || !first ||
+		    le->task->ids->user_ns_id == new_le->task->ids->user_ns_id)
+			goto compare_pid;
+
+		task_ns = lookup_ns_by_id(new_le->task->ids->user_ns_id, &user_ns_desc);
+		if (is_subns(fdesc->setns_userns, task_ns))
+			break;
+		task_ns = NULL;
+
+		if (list_empty(&fdesc->fake_master_list)) {
+			/* Current master has perms, leave it on the place */
+			first = false;
+			continue;
+		}
+compare_pid:
 		if (pid_rst_prio(new_le->pid, le->pid))
 			break;
+		first = false;
+	}
+
+	if (fdesc->setns_userns && list_empty(&fdesc->fd_info_head)) {
+		/* First fle is hashing */
+		task_ns = lookup_ns_by_id(new_le->task->ids->user_ns_id, &user_ns_desc);
+		if (!is_subns(fdesc->setns_userns, task_ns))
+			task_ns = NULL;
+	}
+	if (task_ns)
+		list_del_init(&fdesc->fake_master_list);
+
 	list_add_tail(&new_le->desc_list, &le->desc_list);
 }