[RHEL7,COMMIT] mm/memcg: Fix race on mem cgroup offline/kmem deactivation.

Submitted by Konstantin Khorenko on May 12, 2020, 12:58 p.m.

Details

Message ID 202005121258.04CCw2wi030710@finist-ce7.sw.ru
State New
Series "mm/memcg: Fix race on mem cgroup offline/kmem deactivation."
Headers show

Commit Message

Konstantin Khorenko May 12, 2020, 12:58 p.m.
The commit is pushed to "branch-rh7-3.10.0-1127.vz7.150.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1127.vz7.150.9
------>
commit 583d034b51b6a14432e8b95678223ff8af829f00
Author: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date:   Tue May 12 15:58:01 2020 +0300

    mm/memcg: Fix race on mem cgroup offline/kmem deactivation.
    
    The following race could happen
    
      Initial state: kmem = 0, cgroup is online
    
                 CPU0:
       charge_kmem(1); //kmem = 1
       memcg_uncharge_kmem()
       {
       ...
            kmem = page_counter_uncharge(&memcg->kmem, nr_pages); //kmem = 0
    
                 CPU1:
    
        charge_kmem(1); //kmem=1 again
        ....
        mem_cgroup_css_offline();
           memcg_deactivate_kmem() {
               ...
               css_get();
               memcg_kmem_mark_dead(memcg);
               if (page_counter_read(&memcg->kmem)) //kmem=1, thus return
                       return;
           }
                   CPU0:
    
            /* Not down to 0 */
            if (kmem)
                    return;
            if (memcg_kmem_test_and_clear_dead(memcg))
                    css_put(&memcg->css);              //kill memcg with kmem != 0
    
       }
    
    Fix this by making a fake kmem charge on cgroup creation, and uncharging it
    back during the offline stage. This way we guarantee that kmem is never 0
    while cgroup is online, so the race above isn't possible anymore.
    
    https://jira.sw.ru/browse/PSBM-98148
    
    Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Suggested-by: Kirill Tkhai <ktkhai@virtuozzo.com>
    Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 mm/memcontrol.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e82c59f6a9840..c3586e8e27ca2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4831,6 +4831,43 @@  static int __memcg_activate_kmem(struct mem_cgroup *memcg,
 	memcg->kmemcg_id = memcg_id;
 	set_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
 	set_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
+
+	/*
+	 * Fake charge to keep kmem > 0 while cgroup is online. This prevents
+	 * premature cgroup destruction due to the following race:
+	 * CPU0:
+	 *       charge_kmem(1); //kmem = 1
+	 *       memcg_uncharge_kmem()
+	 *       {
+	 *       ...
+	 *            kmem = page_counter_uncharge(&memcg->kmem, nr_pages); //kmem = 0
+	 *
+	 *    ----------------------------------------------------------------------
+	 *
+	 *                 CPU1:
+	 *
+	 *        charge_kmem(1); //kmem=1 again
+	 *        ....
+	 *        mem_cgroup_css_offline();
+	 *           memcg_deactivate_kmem() {
+	 *               ...
+	 *               css_get();
+	 *               memcg_kmem_mark_dead(memcg);
+	 *               if (page_counter_read(&memcg->kmem)) //kmem=1, thus return
+	 *                       return;
+	 *           }
+	 *    ----------------------------------------------------------------------
+	 *
+	 *                   CPU0:
+	 *
+	 *            if (kmem)
+	 *                    return;
+	 *            if (memcg_kmem_test_and_clear_dead(memcg))
+	 *                    css_put(&memcg->css);              //kill memcg with kmem != 0
+	 *
+	 *       }
+	 */
+	page_counter_charge(&memcg->kmem, 1);
 out:
 	return err;
 }
@@ -6149,7 +6186,8 @@  static void memcg_deactivate_kmem(struct mem_cgroup *memcg)
 
 	memcg_kmem_mark_dead(memcg);
 
-	if (page_counter_read(&memcg->kmem))
+	/* Uncharge fake charge from __memcg_activate_kmem() */
+	if (page_counter_uncharge(&memcg->kmem, 1))
 		return;
 
 	/*