[RHEL,v22,14/14] ve/cgroup: At cgroup_mark(unmark)_ve_roots skip non-virtualized roots

Submitted by Valeriy Vdovin on July 31, 2020, 12:54 p.m.

Details

Message ID 1596200072-167946-15-git-send-email-valeriy.vdovin@virtuozzo.com
State New
Series "Make release_agent per-cgroup property. Run release_agent in proper ve."

Commit Message

Valeriy Vdovin July 31, 2020, 12:54 p.m.
During container start, the container manager (such as vzctl) may not
virtualize every cgroup hierarchy. By virtualizing a cgroup hierarchy I
mean creating a sub-directory within a particular mounted cgroup. When a
container starts, it walks the css_set of its init process to list all
affiliated cgroups and performs actions on each. Non-virtualized cgroups
are also present in init's css_set, however, and they must not be touched
from inside any non-root ve.

Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
---
 kernel/cgroup.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)


diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3305032..aefe40b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -4449,6 +4449,18 @@  static struct cftype *get_cftype_by_name(const char *name)
 }
 
 #ifdef CONFIG_VE
+static inline bool is_virtualized_cgroup(struct cgroup *cgrp)
+{
+	lockdep_assert_held(&cgroup_mutex);
+	if (cgrp->root->subsys_mask)
+		return true;
+
+	if (!strcmp(cgrp->root->name, "systemd"))
+		return true;
+
+	return false;
+}
+
 int cgroup_mark_ve_roots(struct ve_struct *ve)
 {
 	struct cgroup *cgrp, *tmp;
@@ -4464,6 +4476,17 @@  int cgroup_mark_ve_roots(struct ve_struct *ve)
 	mutex_lock(&cgroup_mutex);
 	for_each_active_root(root) {
 		cgrp = css_cgroup_from_root(ve->root_css_set, root);
+
+		/*
+		 * At container start, vzctl creates special cgroups to serve
+		 * as virtualized cgroup roots. They are bind-mounted on top
+		 * of original cgroup mount point in container namespace. But
+		 * not all cgroup mounts undergo this procedure. We should
+		 * skip cgroup mounts that are not virtualized.
+		 */
+		if (!is_virtualized_cgroup(cgrp))
+			continue;
+
 		rcu_assign_pointer(cgrp->ve_owner, ve);
 		set_bit(CGRP_VE_ROOT, &cgrp->flags);
 
@@ -4513,6 +4536,14 @@  void cgroup_unmark_ve_roots(struct ve_struct *ve)
 	mutex_lock(&cgroup_mutex);
 	for_each_active_root(root) {
 		cgrp = css_cgroup_from_root(ve->root_css_set, root);
+
+		/*
+		 * For this line see comments in
+		 * cgroup_mark_ve_roots
+		 */
+		if (!is_virtualized_cgroup(cgrp))
+			continue;
+
 		dget(cgrp->dentry);
 		list_add_tail(&cgrp->cft_q_node, &pending);
 	}
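
The filtering done by the hunks above can be illustrated outside the kernel.
What follows is only a userspace sketch, not the patch itself: struct toy_root
and the toy_* helpers are hypothetical stand-ins for the kernel's cgroup root
and for cgroup_mark_ve_roots()/cgroup_unmark_ve_roots(), and the cgroup_mutex
locking and RCU handling are left out. It assumes, as the patch does, that a
root counts as virtualized when it has controllers bound (non-zero
subsys_mask) or is the named "systemd" hierarchy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cgroup hierarchy root; not the kernel struct. */
struct toy_root {
	unsigned long subsys_mask;	/* non-zero when controllers are bound */
	const char *name;		/* hierarchy name, "" when unnamed */
	bool ve_root;			/* models the CGRP_VE_ROOT flag */
};

/* Same decision as is_virtualized_cgroup() in the patch, minus locking. */
static bool toy_is_virtualized(const struct toy_root *root)
{
	if (root->subsys_mask)
		return true;
	return strcmp(root->name, "systemd") == 0;
}

/* Mark every virtualized root and return how many were marked. */
static size_t toy_mark_ve_roots(struct toy_root *roots, size_t n)
{
	size_t i, marked = 0;

	assert(roots != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!toy_is_virtualized(&roots[i]))
			continue;	/* leave non-virtualized roots untouched */
		roots[i].ve_root = true;
		marked++;
	}
	return marked;
}

/* Unmark with the same filter, so only roots that were marked change. */
static size_t toy_unmark_ve_roots(struct toy_root *roots, size_t n)
{
	size_t i, unmarked = 0;

	assert(roots != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!toy_is_virtualized(&roots[i]))
			continue;
		roots[i].ve_root = false;
		unmarked++;
	}
	return unmarked;
}
```

Using one predicate in both loops keeps mark and unmark symmetric: a root
that was skipped at container start is also skipped at container stop.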

Comments

Kirill Tkhai July 31, 2020, 2:43 p.m.
On 31.07.2020 15:54, Valeriy Vdovin wrote:
> During container start, the container manager (such as vzctl) may not
> virtualize every cgroup hierarchy. By virtualizing a cgroup hierarchy I
> mean creating a sub-directory within a particular mounted cgroup. When a
> container starts, it walks the css_set of its init process to list all
> affiliated cgroups and performs actions on each. Non-virtualized cgroups
> are also present in init's css_set, however, and they must not be touched
> from inside any non-root ve.
> 
1

Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>

> Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
> ---
>  kernel/cgroup.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 3305032..aefe40b 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -4449,6 +4449,18 @@ static struct cftype *get_cftype_by_name(const char *name)
>  }
>  
>  #ifdef CONFIG_VE
> +static inline bool is_virtualized_cgroup(struct cgroup *cgrp)
> +{
> +	lockdep_assert_held(&cgroup_mutex);
> +	if (cgrp->root->subsys_mask)
> +		return true;
> +
> +	if (!strcmp(cgrp->root->name, "systemd"))
> +		return true;
> +
> +	return false;
> +}
> +
>  int cgroup_mark_ve_roots(struct ve_struct *ve)
>  {
>  	struct cgroup *cgrp, *tmp;
> @@ -4464,6 +4476,17 @@ int cgroup_mark_ve_roots(struct ve_struct *ve)
>  	mutex_lock(&cgroup_mutex);
>  	for_each_active_root(root) {
>  		cgrp = css_cgroup_from_root(ve->root_css_set, root);
> +
> +		/*
> +		 * At container start, vzctl creates special cgroups to serve
> +		 * as virtualized cgroup roots. They are bind-mounted on top
> +		 * of original cgroup mount point in container namespace. But
> +		 * not all cgroup mounts undergo this procedure. We should
> +		 * skip cgroup mounts that are not virtualized.
> +		 */
> +		if (!is_virtualized_cgroup(cgrp))
> +			continue;
> +
>  		rcu_assign_pointer(cgrp->ve_owner, ve);
>  		set_bit(CGRP_VE_ROOT, &cgrp->flags);
>  
> @@ -4513,6 +4536,14 @@ void cgroup_unmark_ve_roots(struct ve_struct *ve)
>  	mutex_lock(&cgroup_mutex);
>  	for_each_active_root(root) {
>  		cgrp = css_cgroup_from_root(ve->root_css_set, root);
> +
> +		/*
> +		 * For this line see comments in
> +		 * cgroup_mark_ve_roots
> +		 */
> +		if (!is_virtualized_cgroup(cgrp))
> +			continue;
> +
>  		dget(cgrp->dentry);
>  		list_add_tail(&cgrp->cft_q_node, &pending);
>  	}
>