[RHEL7,COMMIT] mm/memcg: Don't stuck in mem_cgroup_reparent_charges() forever

Submitted by Konstantin Khorenko on June 28, 2018, 2:48 p.m.

Details

Message ID 201806281448.w5SEmpRm020215@finist_ce7.work
State New
Series "mm/memcg: Don't stuck in mem_cgroup_reparent_charges() forever"

Commit Message

Konstantin Khorenko June 28, 2018, 2:48 p.m.
The commit is pushed to "branch-rh7-3.10.0-862.3.2.vz7.61.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-862.3.2.vz7.61.6
------>
commit e53270d201af7d1cbf86cd65ed35a479f0901b86
Author: Andrey Ryabinin <aryabinin@virtuozzo.com>
Date:   Thu Jun 28 17:48:50 2018 +0300

    mm/memcg: Don't stuck in mem_cgroup_reparent_charges() forever
    
    mem_cgroup_reparent_charges() is supposed to bring all non-kmem charges
    down to zero. Sometimes this doesn't happen because of a bug in memcg
    accounting, a leak, etc. When this happens, mem_cgroup_reparent_charges()
    loops forever under the global cgroup_mutex, causing a lot of trouble for
    the whole system. Instead of looping endlessly, make several retries,
    then WARN() and break out of the loop if reparenting was unsuccessful.
    
    https://jira.sw.ru/browse/PSBM-86092
    Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
 mm/memcontrol.c | 8 ++++++++
 1 file changed, 8 insertions(+)
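
For illustration only, the bounded-retry idea from the commit message can be sketched in user space as follows. This is a minimal sketch, not kernel code: struct toy_counters, toy_reparent_pass(), toy_reparent_charges() and the RETRY_LIMIT value are hypothetical stand-ins for memcg->memory, memcg->kmem, the reparenting pass and MEM_CGROUP_RECLAIM_RETRIES. Unlike the patch, the sketch checks the counters before consuming a retry, so it only reports a failure when charges are really left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for MEM_CGROUP_RECLAIM_RETRIES; the value is an assumption. */
#define RETRY_LIMIT 5

/* Toy counters standing in for memcg->memory and memcg->kmem. */
struct toy_counters {
	long memory;	/* all charges, kmem included */
	long kmem;	/* kernel-memory charges, never reparented */
	long leaked;	/* simulated accounting bug: charges that never drain */
};

/* One reparenting pass: every drainable non-kmem charge goes away. */
static void toy_reparent_pass(struct toy_counters *c)
{
	c->memory = c->kmem + c->leaked;
}

/*
 * Retry the pass a bounded number of times instead of looping forever.
 * Returns true once no non-kmem charges remain, false after giving up.
 */
static bool toy_reparent_charges(struct toy_counters *c)
{
	int nr_retries = RETRY_LIMIT;

	assert(c != NULL);

	do {
		toy_reparent_pass(c);
		if (c->memory - c->kmem <= 0)
			return true;
	} while (--nr_retries > 0);

	/* User-space analogue of the WARN() in the patch. */
	fprintf(stderr, "giving up: memory %ld > kmem %ld\n",
		c->memory, c->kmem);
	return false;
}
```

With leaked set to 0 the first pass succeeds and the function returns true; with a non-zero leaked value it prints the warning after RETRY_LIMIT passes and returns false instead of spinning.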


diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 59cf47972f9e..755c09e050a7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4010,6 +4010,7 @@  static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
  */
 static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
 {
+	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
 	int node, zid;
 
 	do {
@@ -4030,6 +4031,11 @@  static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
 		memcg_oom_recover(memcg);
 		cond_resched();
 
+		if (WARN(!--nr_retries, "memory %ld > kmem %ld\n",
+				page_counter_read(&memcg->memory),
+				page_counter_read(&memcg->kmem)))
+			break;
+
 		/*
 		 * Kernel memory may not necessarily be trackable to a specific
 		 * process. So they are not migrated, and therefore we can't
@@ -4044,6 +4050,8 @@  static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
 		 */
 	} while (page_counter_read(&memcg->memory) -
 		 page_counter_read(&memcg->kmem) > 0);
+
+	WARN_ON(!nr_retries);
 }
 
 /*