[Devel,rh7] ms/cgroup: fix rmdir EBUSY regression in 3.11

Submitted by Andrey Ryabinin on Jan. 11, 2017, 9:30 a.m.

Details

Message ID 20170111093049.5893-1-aryabinin@virtuozzo.com
State New
Series "ms/cgroup: fix rmdir EBUSY regression in 3.11"

Commit Message

Andrey Ryabinin Jan. 11, 2017, 9:30 a.m.
From: Hugh Dickins <hughd@google.com>

commit bb78a92f47696b2da49f2692b6a9fa56d07c444a upstream.

On 3.11-rc we are seeing cgroup directories left behind when they should
have been removed.  Here's a trivial reproducer:

cd /sys/fs/cgroup/memory
mkdir parent parent/child; rmdir parent/child parent
rmdir: failed to remove `parent': Device or resource busy

It's because cgroup_destroy_locked() (step 1 of destruction) leaves
cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
destruction) remove it; but step 2 is run by work queue, which may not
yet have removed the children when parent destruction checks the list.

Fix that by checking through a non-empty list of children: if every one
of them has already been marked CGRP_DEAD, then it's safe to proceed:
those children are invisible to userspace, and should not obstruct rmdir.
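
As an illustration only, the check described above can be modelled in userspace
C. This is a minimal sketch, not the kernel code: struct toy_cgroup, its fields
and toy_destroy_check() are hypothetical names invented for the example, a plain
singly linked list stands in for the RCU-protected ->children list, and the
"dead" flag stands in for CGRP_DEAD.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup: only what the check needs. */
struct toy_cgroup {
	int count;                        /* models atomic cgrp->count */
	bool dead;                        /* models the CGRP_DEAD flag */
	struct toy_cgroup *first_child;   /* models the ->children list head */
	struct toy_cgroup *next_sibling;  /* models the ->sibling linkage */
};

/*
 * Return 0 if the group may be destroyed, -EBUSY otherwise.
 * A non-empty children list is acceptable as long as every child
 * on it has already been marked dead.
 */
static int toy_destroy_check(const struct toy_cgroup *cgrp)
{
	const struct toy_cgroup *child;

	assert(cgrp != NULL);

	if (cgrp->count)
		return -EBUSY;

	for (child = cgrp->first_child; child; child = child->next_sibling) {
		if (!child->dead)
			return -EBUSY;  /* a live child still blocks removal */
	}

	return 0;
}
```

In this model, a parent whose only child is still linked but already marked
dead passes the check, which corresponds to the "rmdir parent/child parent"
case that used to fail.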

(I didn't see any reason to keep the cgrp->children checks under the
unrelated css_set_lock, so moved them out.)

tj: Flattened nested ifs a bit and updated comment so that it's
    correct on both for-3.11-fixes and for-3.12.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

https://jira.sw.ru/browse/PSBM-53314

[aryabinin: s/cgroup_is_dead()/cgroup_is_removed()]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
 kernel/cgroup.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Patch

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1c047b9..6aafc51 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4432,11 +4432,29 @@  static int cgroup_destroy_locked(struct cgroup *cgrp)
 	struct dentry *d = cgrp->dentry;
 	struct cgroup_event *event, *tmp;
 	struct cgroup_subsys *ss;
+	struct cgroup *child;
+	bool empty;
 
 	lockdep_assert_held(&d->d_inode->i_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
-	if (atomic_read(&cgrp->count) || !list_empty(&cgrp->children))
+	if (atomic_read(&cgrp->count))
+		return -EBUSY;
+
+	/*
+	 * Make sure there's no live children.  We can't test ->children
+	 * emptiness as dead children linger on it while being destroyed;
+	 * otherwise, "rmdir parent/child parent" may fail with -EBUSY.
+	 */
+	empty = true;
+	rcu_read_lock();
+	list_for_each_entry_rcu(child, &cgrp->children, sibling) {
+		empty = cgroup_is_removed(child);
+		if (!empty)
+			break;
+	}
+	rcu_read_unlock();
+	if (!empty)
 		return -EBUSY;
 
 	/*

Comments

Cyrill Gorcunov Jan. 11, 2017, 9:37 a.m.
On Wed, Jan 11, 2017 at 12:30:49PM +0300, Andrey Ryabinin wrote:
> 
> [aryabinin: s/cgroup_is_dead()/cgroup_is_removed()]
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>

Thank you!