[Devel,RHEL7,COMMIT] ms/cgroup: fix rmdir EBUSY regression in 3.11

Submitted by Konstantin Khorenko on Jan. 11, 2017, 10:20 a.m.

Details

Message ID: 201701111020.v0BAKU1f002972@finist_cl7.x64_64.work.ct
State: New
Series: "ms/cgroup: fix rmdir EBUSY regression in 3.11"

Commit Message

The commit has been pushed to "branch-rh7-3.10.0-514.vz7.27.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-514.vz7.27.8
------>
commit 34697deacd12106ba28bc7e44efe5221de278ff8
Author: Hugh Dickins <hughd@google.com>
Date:   Wed Jan 11 14:20:30 2017 +0400

    ms/cgroup: fix rmdir EBUSY regression in 3.11
    
    commit bb78a92f47696b2da49f2692b6a9fa56d07c444a upstream.
    
    On 3.11-rc we are seeing cgroup directories left behind when they should
    have been removed.  Here's a trivial reproducer:
    
    cd /sys/fs/cgroup/memory
    mkdir parent parent/child; rmdir parent/child parent
    rmdir: failed to remove `parent': Device or resource busy
    
    It's because cgroup_destroy_locked() (step 1 of destruction) leaves
    cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
    destruction) remove it; but step 2 is run by work queue, which may not
    yet have removed the children when parent destruction checks the list.
    
    Fix that by checking through a non-empty list of children: if every one
    of them has already been marked CGRP_DEAD, then it's safe to proceed:
    those children are invisible to userspace, and should not obstruct rmdir.
    
    (I didn't see any reason to keep the cgrp->children checks under the
    unrelated css_set_lock, so moved them out.)
    
    tj: Flattened nested ifs a bit and updated comment so that it's
        correct on both for-3.11-fixes and for-3.12.
    
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
    
    https://jira.sw.ru/browse/PSBM-53314
    
    [aryabinin: s/cgroup_is_dead()/cgroup_is_removed()]
    Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 kernel/cgroup.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
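
Note (illustration only, not part of the patch): the difference between the
old and new emptiness checks described in the commit message can be sketched
as a small user-space model. Every name below (struct model_cgroup,
model_add_child, model_busy_old, model_busy_new) is hypothetical and is not
kernel API; the model ignores locking and RCU and only shows why a child that
is already marked dead but still linked on the parent's list should not
produce EBUSY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct model_cgroup {
	bool dead;                      /* set by step 1 of destruction */
	struct model_cgroup *children;  /* head of this group's child list */
	struct model_cgroup *sibling;   /* next child of the same parent */
};

/* Link child at the head of parent's child list. */
static void model_add_child(struct model_cgroup *parent,
			    struct model_cgroup *child)
{
	assert(parent && child);
	child->sibling = parent->children;
	parent->children = child;
}

/* Old check: any child still linked, dead or alive, blocks removal. */
static bool model_busy_old(const struct model_cgroup *cgrp)
{
	return cgrp->children != NULL;
}

/* New check: only a child not yet marked dead blocks removal. */
static bool model_busy_new(const struct model_cgroup *cgrp)
{
	const struct model_cgroup *child;

	for (child = cgrp->children; child; child = child->sibling)
		if (!child->dead)
			return true;
	return false;
}
```

In this model, marking the only child dead while leaving it linked (the state
between step 1 and step 2 of destruction) keeps model_busy_old() true but
makes model_busy_new() false, which corresponds to "rmdir parent/child parent"
succeeding after the fix.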


diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7c185a0..5ea44e1 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4434,11 +4434,29 @@  static int cgroup_destroy_locked(struct cgroup *cgrp)
 	struct dentry *d = cgrp->dentry;
 	struct cgroup_event *event, *tmp;
 	struct cgroup_subsys *ss;
+	struct cgroup *child;
+	bool empty;
 
 	lockdep_assert_held(&d->d_inode->i_mutex);
 	lockdep_assert_held(&cgroup_mutex);
 
-	if (atomic_read(&cgrp->count) || !list_empty(&cgrp->children))
+	if (atomic_read(&cgrp->count))
+		return -EBUSY;
+
+	/*
+	 * Make sure there's no live children.  We can't test ->children
+	 * emptiness as dead children linger on it while being destroyed;
+	 * otherwise, "rmdir parent/child parent" may fail with -EBUSY.
+	 */
+	empty = true;
+	rcu_read_lock();
+	list_for_each_entry_rcu(child, &cgrp->children, sibling) {
+		empty = cgroup_is_removed(child);
+		if (!empty)
+			break;
+	}
+	rcu_read_unlock();
+	if (!empty)
 		return -EBUSY;
 
 	/*