[RHEL7,COMMIT] cgroup/ve: at container start only check virtualizable cgroups.

Submitted by Vasily Averin on March 4, 2021, 8:19 a.m.

Details

Message ID 202103040819.1248JDNm016734@vz7build.vvs.sw.ru
State New
Series "cgroup/ve: at container start only check virtualizable cgroups."

Commit Message

The commit is pushed to "branch-rh7-3.10.0-1160.15.2.vz7.173.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1160.15.2.vz7.173.4
------>
commit a0cdc1cf72c4823b9cb34432d9fa75b995074f6f
Author: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
Date:   Thu Mar 4 11:19:13 2021 +0300

    cgroup/ve: at container start only check virtualizable cgroups.
    
    The commit named in the Fixes tag below prevented a situation where a
    task tried to start a container without first creating the right cgroup
    context for it.
    
    The logic behind that check was:
    - there is a set of cgroups that will be virtualized during container
      start.
    - to do that, these cgroups will be modified.
    - the cgroups chosen for modification are those in the starting task's
      css set.
    - it is invalid and forbidden to modify cgroups that are located at the
      root of a cgroup hierarchy.
    - therefore we have to check the whole css set to see if it has cgroups
      with no parent (an indication of root) and forbid the whole procedure
      if at least one cgroup matches.
    
    The bug in this behaviour was:
    - there are cases when there are non-virtualizable cgroup mounts.
    - these are named cgroups that do not have any cgroup subsystems bound
      to them.
    - the one exception is the named cgroup "systemd", which is
      virtualizable.
    - therefore container starters do not have to make nested cgroups
      for these types of non-virtualizable cgroup hierarchies.
    - therefore there can be named cgroups with parent == NULL in the css
      set of a starting task; they will not pass the check, and container
      start will fail.
    
    Fix the bug by checking only those cgroups in the css set that are
    virtualizable. We already have a helper for this check that is used a bit
    later, in cgroup_mark_ve_roots, so let's use it.
    
    Fixes: 105332edc47c ("ve/cgroup: At container start check ve's css_set for host-level cgroups.")
    https://jira.sw.ru/browse/PSBM-125040
    Signed-off-by: Valeriy Vdovin <valeriy.vdovin@virtuozzo.com>
---
 kernel/cgroup.c | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)
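
For readers without the kernel tree at hand, the behaviour change can be
sketched as a small userspace model. This is only an illustration under stated
assumptions: the model_* types and functions are hypothetical stand-ins, not
the kernel's struct cgroup or struct css_set, the locking is left out, and the
skip of the dummy rootnode hierarchy is omitted. The only_virtualized flag
switches between the check before this patch (false) and after it (true).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a cgroup hierarchy root. */
struct model_root {
	unsigned long subsys_mask;	/* non-zero when subsystems are bound */
	const char *name;		/* hierarchy name, "" when unnamed */
};

/* Hypothetical, simplified stand-in for a cgroup. */
struct model_cgroup {
	const struct model_root *root;
	const struct model_cgroup *parent;	/* NULL for the top cgroup */
};

/*
 * Same decision as is_virtualized_cgroup() in the patch: a hierarchy is
 * virtualizable if it has bound subsystems or is the named "systemd" one.
 */
static bool model_is_virtualized(const struct model_cgroup *cgrp)
{
	if (cgrp->root->subsys_mask)
		return true;

	return strcmp(cgrp->root->name, "systemd") == 0;
}

/*
 * Model of css_has_host_cgroups(): report whether the set contains a
 * top-level (parent == NULL) cgroup. With only_virtualized set, cgroups
 * from non-virtualizable hierarchies are skipped, as the patch does.
 */
static bool model_has_host_cgroups(const struct model_cgroup *const *set,
				   size_t n, bool only_virtualized)
{
	for (size_t i = 0; i < n; i++) {
		if (only_virtualized && !model_is_virtualized(set[i]))
			continue;

		if (!set[i]->parent)
			return true;
	}
	return false;
}
```

In this model, a task that sits in nested cgroups for a subsystem-bound
hierarchy and for "systemd", but in the top cgroup of some other named
hierarchy, is rejected when only_virtualized is false and accepted when it is
true. A task still sitting in the top "systemd" cgroup is rejected either way.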

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 85d281e..b6408e6 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -658,6 +658,19 @@  static struct cgroup *css_cgroup_from_root(struct css_set *css_set,
 	return res;
 }
 
+#ifdef CONFIG_VE
+static inline bool is_virtualized_cgroup(struct cgroup *cgrp)
+{
+	lockdep_assert_held(&cgroup_mutex);
+	if (cgrp->root->subsys_mask)
+		return true;
+
+	if (!strcmp(cgrp->root->name, "systemd"))
+		return true;
+
+	return false;
+}
+
 /*
  * Iterate all cgroups in a given css_set and check if it is a top cgroup
  * of it's hierarchy.
@@ -674,6 +687,9 @@  static inline bool css_has_host_cgroups(struct css_set *css_set)
 		if (link->cgrp->root == &rootnode)
 			continue;
 
+		if (!is_virtualized_cgroup(link->cgrp))
+			continue;
+
 		if (!link->cgrp->parent) {
 			read_unlock(&css_set_lock);
 			return true;
@@ -682,6 +698,8 @@  static inline bool css_has_host_cgroups(struct css_set *css_set)
 	read_unlock(&css_set_lock);
 	return false;
 }
+#endif
+
 
 /*
  * Return the cgroup for "task" from the given hierarchy. Must be
@@ -4628,18 +4646,6 @@  static struct cftype *get_cftype_by_name(const char *name)
 }
 
 #ifdef CONFIG_VE
-static inline bool is_virtualized_cgroup(struct cgroup *cgrp)
-{
-	lockdep_assert_held(&cgroup_mutex);
-	if (cgrp->root->subsys_mask)
-		return true;
-
-	if (!strcmp(cgrp->root->name, "systemd"))
-		return true;
-
-	return false;
-}
-
 int cgroup_mark_ve_roots(struct ve_struct *ve)
 {
 	struct cgroup *cgrp, *tmp;