Message ID | 20200608100417.8442-1-aryabinin@virtuozzo.com |
---|---|
State | New |
Series | "kernel/cgroup: Fix cgroups leaking" |
Headers | show |
diff --git a/kernel/cgroup.c b/kernel/cgroup.c index fc3148d37610..9fdba793d4f3 100644 --- a/kernel/cgroup.c +++ b/kernel/cgroup.c @@ -4249,13 +4249,9 @@ static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp) if (ss->css_online) ret = ss->css_online(cgrp); - if (!ret) { + if (!ret) cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE; - refcount_inc(&cgrp->online_cnt); - if (cgrp->parent) - refcount_inc(&cgrp->parent->online_cnt); - } return ret; } @@ -4483,6 +4479,9 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry, if (err) goto err_destroy; + refcount_inc(&cgrp->online_cnt); + refcount_inc(&parent->online_cnt); + mutex_unlock(&cgroup_mutex); mutex_unlock(&cgrp->dentry->d_inode->i_mutex);
On 08.06.2020 13:04, Andrey Ryabinin wrote: > There are cgroups with several css'es like, e.g. cpu,cpuacct. > The increment of the cgroup->online_cnt counter happens in the online_css() > which is called per each css on cgroup onlining. This leads to leaking such > cgroups as ->online_cnt can't become zero. > > Move refcount increments to the upper level in cgroup_create() to fix this. > > The bug is the result of incorrect backport of the upstream patch. The major > difference is that the upstream has css lifetime decoupled from the cgroup > lifetime. ->online_cnt counter is in css and controls the order of css > offline online execution. While in our kernel css lifetime is tied to the > cgroup lifetime and the ->online_cnt counter is in the cgroup struct and > controls the order cgroup's online/offline execution. > > https://jira.sw.ru/browse/PSBM-104524 > Fixes: 5734a6a5e621 ("ms/cgroup: make sure a parent css isn't offlined before its children") > Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> > --- > kernel/cgroup.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/kernel/cgroup.c b/kernel/cgroup.c > index fc3148d37610..9fdba793d4f3 100644 > --- a/kernel/cgroup.c > +++ b/kernel/cgroup.c > @@ -4249,13 +4249,9 @@ static int online_css(struct cgroup_subsys *ss, struct cgroup *cgrp) > > if (ss->css_online) > ret = ss->css_online(cgrp); > - if (!ret) { > + if (!ret) > cgrp->subsys[ss->subsys_id]->flags |= CSS_ONLINE; > > - refcount_inc(&cgrp->online_cnt); > - if (cgrp->parent) > - refcount_inc(&cgrp->parent->online_cnt); > - } > return ret; > } > > @@ -4483,6 +4479,9 @@ static long cgroup_create(struct cgroup *parent, struct dentry *dentry, > if (err) > goto err_destroy; > > + refcount_inc(&cgrp->online_cnt); > + refcount_inc(&parent->online_cnt); > + > mutex_unlock(&cgroup_mutex); > mutex_unlock(&cgrp->dentry->d_inode->i_mutex); > >
There are cgroups with several css'es like, e.g. cpu,cpuacct. The increment of the cgroup->online_cnt counter happens in the online_css() which is called per each css on cgroup onlining. This leads to leaking such cgroups as ->online_cnt can't become zero. Move refcount increments to the upper level in cgroup_create() to fix this. The bug is the result of incorrect backport of the upstream patch. The major difference is that the upstream has css lifetime decoupled from the cgroup lifetime. ->online_cnt counter is in css and controls the order of css offline online execution. While in our kernel css lifetime is tied to the cgroup lifetime and the ->online_cnt counter is in the cgroup struct and controls the order cgroup's online/offline execution. https://jira.sw.ru/browse/PSBM-104524 Fixes: 5734a6a5e621 ("ms/cgroup: make sure a parent css isn't offlined before its children") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> --- kernel/cgroup.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)