[2/2,v2] criu: restore -- Wait for userns being prepared before moving into cgroups

Submitted by Kirill Gorkunov on April 20, 2016, 4:13 p.m.

Details

Message ID 20160420161329.GA8387@uranus.sw.swsoft.com
State Rejected
Series "criu: restore -- Wait for userns completion"
Headers show

Patch hide | download patch | download mbox

diff --git a/criu/cr-restore.c b/criu/cr-restore.c
index 4bce020..cda3d8b 100644
--- a/criu/cr-restore.c
+++ b/criu/cr-restore.c
@@ -1320,9 +1320,23 @@  static inline int fork_with_pid(struct pstree_item *item)
 
 
 	if (item == root_item) {
+		int retval = 0;
+
 		item->pid.real = ret;
 		pr_debug("PID: real %d virt %d\n",
 				item->pid.real, item->pid.virt);
+		if (root_ns_mask & CLONE_NEWUSER) {
+			/*
+			 * uid_map and gid_map must be filled from a parent user namespace.
+			 * prepare_userns_creds() must be called after filling mappings.
+			 */
+			retval = prepare_userns(item);
+			if (retval)
+				pr_err("Can't prepare userns");
+		}
+		complete(&task_entries->userns_completion);
+		if (retval)
+			kill(ret, SIGKILL);
 	}
 
 	if (opts.pidfile && root_item == item) {
@@ -1663,6 +1677,9 @@  static int restore_task_with_children(void *_arg)
 			goto err;
 	}
 
+	if (!current->parent)
+		wait_for_completion(&task_entries->userns_completion);
+
 	/*
 	 * Call this _before_ forking to optimize cgroups
 	 * restore -- if all tasks live in one set of cgroups
@@ -2074,13 +2091,6 @@  static int restore_root_task(struct pstree_item *init)
 		}
 	}
 
-	/*
-	 * uid_map and gid_map must be filled from a parent user namespace.
-	 * prepare_userns_creds() must be called after filling mappings.
-	 */
-	if ((root_ns_mask & CLONE_NEWUSER) && prepare_userns(init))
-		goto out_kill;
-
 	pr_info("Wait until namespaces are created\n");
 	ret = restore_wait_inprogress_tasks();
 	if (ret)
@@ -2262,6 +2272,7 @@  int prepare_task_entries(void)
 	atomic_set(&task_entries->nr_zombies, 0);
 	futex_set(&task_entries->start, CR_STATE_RESTORE_NS);
 	mutex_init(&task_entries->userns_sync_lock);
+	init_completion(&task_entries->userns_completion);
 
 	return 0;
 }
diff --git a/criu/include/rst_info.h b/criu/include/rst_info.h
index 562e2f0..8975095 100644
--- a/criu/include/rst_info.h
+++ b/criu/include/rst_info.h
@@ -12,6 +12,7 @@  struct task_entries {
 	futex_t start;
 	atomic_t cr_err;
 	mutex_t userns_sync_lock;
+	completion_t userns_completion;
 };
 
 struct fdt {

Comments

Pavel Emelianov April 21, 2016, 12:27 p.m.
On 04/20/2016 07:13 PM, Cyrill Gorcunov wrote:
> On Wed, Apr 20, 2016 at 07:09:07PM +0300, Cyrill Gorcunov wrote:
>> When user namespace are stepping in we should wait for their preparation
>> to complete before start using userns daemon (internally the kernel
>> checks for uids and if uids are not set -EINVAL will be returned
>> when usersn calls for sendmsg()).
>>
>> Thus use completion and wait for uid maps being written first.

We already have stages to divide restore process into steps. In particular,
there's a separate stage with only root task inside (CR_STAGE_RESTORE_NS).

-- Pavel
Cyrill Gorcunov April 21, 2016, 12:36 p.m.
On Thu, Apr 21, 2016 at 03:27:35PM +0300, Pavel Emelyanov wrote:
> On 04/20/2016 07:13 PM, Cyrill Gorcunov wrote:
> > On Wed, Apr 20, 2016 at 07:09:07PM +0300, Cyrill Gorcunov wrote:
> >> When user namespace are stepping in we should wait for their preparation
> >> to complete before start using userns daemon (internally the kernel
> >> checks for uids and if uids are not set -EINVAL will be returned
> >> when usersn calls for sendmsg()).
> >>
> >> Thus use completion and wait for uid maps being written first.
> 
> We already have stages to divide restore process into steps. In particular,
> there's a separate stage with only root task inside (CR_STAGE_RESTORE_NS).

Andrew already posted shorter version. Thanks.