[RHEL7,COMMIT] ms/mm/vmscan.c: iterate only over charged shrinkers during memcg shrink_slab()

Submitted by Konstantin Khorenko on Sept. 5, 2018, 9:37 a.m.

Details

Message ID 201809050937.w859bETK010398@finist_ce7.work
State New
Series "Port "Improve shrink_slab() scalability" patchset"

Commit Message

Konstantin Khorenko Sept. 5, 2018, 9:37 a.m.
The commit is pushed to "branch-rh7-3.10.0-862.11.6.vz7.71.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-862.11.6.vz7.71.8
------>
commit 9db6d3c48a4ee8c3ad64293e09a766af5f80ec6e
Author: Kirill Tkhai <ktkhai@virtuozzo.com>
Date:   Wed Sep 5 12:37:13 2018 +0300

    ms/mm/vmscan.c: iterate only over charged shrinkers during memcg shrink_slab()
    
    ms commit b0dedc49a2da
    
    Using the preparations made in previous patches, in the case of a memcg
    shrink we can skip shrinkers that are not set in the memcg's shrinkers
    bitmap.  To do that, we separate the iterations over memcg-aware and
    !memcg-aware shrinkers, and memcg-aware shrinkers are chosen via
    for_each_set_bit() from the bitmap.  On big nodes with many isolated
    environments, this gives a significant performance improvement.  See
    the next patches for the details.
    
    Note that this patch does not take empty memcg shrinkers into account,
    since we never clear a bitmap bit once it has been set.  Their shrinkers
    will be called again, with no objects shrunk as a result.  Handling of
    that case is provided by the next patches.
    
    [ktkhai@virtuozzo.com: v9]
    Link: http://lkml.kernel.org/r/153112558507.4097.12713813335683345488.stgit@localhost.localdomain
    Link: http://lkml.kernel.org/r/153063066653.1818.976035462801487910.stgit@localhost.localdomain
    Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
    
    Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
    Tested-by: Shakeel Butt <shakeelb@google.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Guenter Roeck <linux@roeck-us.net>
    Cc: "Huang, Ying" <ying.huang@intel.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Josef Bacik <jbacik@fb.com>
    Cc: Li RongQing <lirongqing@baidu.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Matthias Kaehlcke <mka@chromium.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Philippe Ombredanne <pombredanne@nexb.com>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Sahitya Tummala <stummala@codeaurora.org>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Waiman Long <longman@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
    
    =====================
    Patchset description:
    
    Port "Improve shrink_slab() scalability" patchset
    
    https://jira.sw.ru/browse/PSBM-88027
    
    This is a backport of the patchset improving the performance
    of overcommitted containers with many memcgs and mounts.
    The original set is in Linus' tree and was merged in 4.19-rc1.
    
    Kirill Tkhai (12):
          mm: assign id to every memcg-aware shrinker
          mm/memcontrol.c: move up for_each_mem_cgroup{, _tree} defines
          mm, memcg: assign memcg-aware shrinkers bitmap to memcg
          fs: propagate shrinker::id to list_lru
          mm/list_lru.c: add memcg argument to list_lru_from_kmem()
          mm/list_lru: pass dst_memcg argument to memcg_drain_list_lru_node()
          mm/list_lru.c: pass lru argument to memcg_drain_list_lru_node()
          mm/list_lru.c: set bit in memcg shrinker bitmap on first list_lru item appearance
          mm/memcontrol.c: export mem_cgroup_is_root()
          mm/vmscan.c: iterate only over charged shrinkers during memcg shrink_slab()
          mm: add SHRINK_EMPTY shrinker methods return value
          mm/vmscan.c: clear shrinker bit if there are no objects related to memcg
    
    Vladimir Davydov (1):
          mm/vmscan.c: generalize shrink_slab() calls in shrink_node()
---
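Illustration only, not part of the commit: the bitmap walk that the commit
message describes can be modelled in userspace roughly as below. Every
sketch_* name, SKETCH_NR_MAX and the registry array are hypothetical
stand-ins (for shrinker_nr_max and shrinker_idr respectively), and the
loop tests each bit in turn rather than using the kernel's
for_each_set_bit(). It only shows the idea of calling just the shrinkers
whose bit is set and clearing bits of shrinkers that are no longer
registered.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for shrinker_nr_max. */
#define SKETCH_NR_MAX 128
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_LONGS \
	((SKETCH_NR_MAX + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Hypothetical model of a per-memcg, per-node shrinker bitmap. */
struct sketch_map {
	unsigned long map[SKETCH_LONGS];
};

/* Hypothetical scan callback: returns the number of objects freed. */
typedef unsigned long (*sketch_scan_fn)(int id);

static void sketch_set_bit(struct sketch_map *m, int id)
{
	m->map[id / SKETCH_BITS_PER_LONG] |= 1UL << (id % SKETCH_BITS_PER_LONG);
}

static void sketch_clear_bit(struct sketch_map *m, int id)
{
	m->map[id / SKETCH_BITS_PER_LONG] &= ~(1UL << (id % SKETCH_BITS_PER_LONG));
}

static int sketch_test_bit(const struct sketch_map *m, int id)
{
	return (int)((m->map[id / SKETCH_BITS_PER_LONG] >>
		      (id % SKETCH_BITS_PER_LONG)) & 1UL);
}

/* Example callback that always reports five freed objects. */
static unsigned long sketch_scan_five(int id)
{
	(void)id;
	return 5;
}

/*
 * Visit only the ids whose bit is set.  A set bit with no registered
 * callback is treated as stale and cleared, as the patch does when
 * idr_find() returns NULL.  *calls counts how many callbacks ran.
 */
static unsigned long sketch_shrink_memcg(struct sketch_map *m,
					 sketch_scan_fn *registry, int *calls)
{
	unsigned long freed = 0;
	int id;

	for (id = 0; id < SKETCH_NR_MAX; id++) {
		if (!sketch_test_bit(m, id))
			continue;
		if (!registry[id]) {
			sketch_clear_bit(m, id);
			continue;
		}
		(*calls)++;
		freed += registry[id](id);
	}
	return freed;
}
```

With bits 3, 70 and 100 set and callbacks registered only for ids 3 and
70, the walk runs two callbacks and clears bit 100; every other id costs
a bit test and nothing more.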
 include/linux/memcontrol.h |  2 ++
 mm/memcontrol.c            |  6 +++++
 mm/vmscan.c                | 58 +++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 65 insertions(+), 1 deletion(-)
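
Illustration only, not part of the commit: the last hunk changes which
shrinkers the generic loop in shrink_slab() skips. The sketch below is one
reading of that hunk, with hypothetical names (sketch_memcg,
SKETCH_MEMCG_AWARE, sketch_skip_old, sketch_skip_new) and an arbitrary
flag value. It compares the old and new skip conditions so the difference
can be checked case by case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value standing in for SHRINKER_MEMCG_AWARE. */
#define SKETCH_MEMCG_AWARE 0x1u

/* Hypothetical opaque memcg; only its NULL-ness matters here. */
struct sketch_memcg {
	int unused;
};

/* Old condition: skip !memcg-aware shrinkers when a memcg is given. */
static bool sketch_skip_old(const struct sketch_memcg *memcg,
			    unsigned int flags)
{
	return memcg && !(flags & SKETCH_MEMCG_AWARE);
}

/*
 * New condition: skip whenever "a memcg is given" and "the shrinker is
 * memcg-aware" disagree.  So with memcg == NULL only !memcg-aware
 * shrinkers run, and with a memcg only memcg-aware ones run.
 */
static bool sketch_skip_new(const struct sketch_memcg *memcg,
			    unsigned int flags)
{
	return !!memcg != !!(flags & SKETCH_MEMCG_AWARE);
}
```

The two conditions differ in exactly one case: memcg == NULL with a
memcg-aware shrinker, which the old condition ran and the new one skips.
Since non-root memcgs now return early through shrink_slab_memcg(), a
non-NULL memcg reaching this loop is the root memcg.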


diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 05078fe37034..0fafaa8c285f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -642,6 +642,8 @@  static __always_inline struct mem_cgroup *mem_cgroup_from_kmem(void *ptr)
 extern int memcg_expand_shrinker_maps(int new_id);
 extern void memcg_set_shrinker_bit(struct mem_cgroup *memcg,
 				   int nid, int shrinker_id);
+
+extern struct memcg_shrinker_map *memcg_nid_shrinker_map(struct mem_cgroup *memcg, int nid);
 #else
 #define for_each_memcg_cache_index(_idx)	\
 	for (; NULL; )
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3eaa16032d54..dec66d859df8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -831,6 +831,12 @@  void memcg_set_shrinker_bit(struct mem_cgroup *memcg, int nid, int shrinker_id)
 		rcu_read_unlock();
 	}
 }
+
+struct memcg_shrinker_map *memcg_nid_shrinker_map(struct mem_cgroup *memcg, int nid)
+{
+	return rcu_dereference_protected(memcg->info.nodeinfo[nid]->shrinker_map,
+					 true /* shrinker_rwsem */);
+}
 #else
 static void disarm_kmem_keys(struct mem_cgroup *memcg)
 {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a0ed282035b9..df792f5444f7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -430,6 +430,59 @@  static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
 	return freed;
 }
 
+#ifdef CONFIG_MEMCG_KMEM
+static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
+			struct mem_cgroup *memcg, int priority)
+{
+	struct memcg_shrinker_map *map;
+	unsigned long freed = 0;
+	int ret, i;
+
+	if (!memcg_kmem_enabled())
+		return 0;
+
+	if (!down_read_trylock(&shrinker_rwsem))
+		return 0;
+
+	map = memcg_nid_shrinker_map(memcg, nid);
+
+	if (unlikely(!map))
+		goto unlock;
+
+	for_each_set_bit(i, map->map, shrinker_nr_max) {
+		struct shrink_control sc = {
+			.gfp_mask = gfp_mask,
+			.nid = nid,
+			.memcg = memcg,
+		};
+		struct shrinker *shrinker;
+
+		shrinker = idr_find(&shrinker_idr, i);
+		if (unlikely(!shrinker)) {
+			clear_bit(i, map->map);
+			continue;
+		}
+
+		ret = do_shrink_slab(&sc, shrinker, priority);
+		freed += ret;
+
+		if (rwsem_is_contended(&shrinker_rwsem)) {
+			freed = freed ? : 1;
+			break;
+		}
+	}
+unlock:
+	up_read(&shrinker_rwsem);
+	return freed;
+}
+#else /* CONFIG_MEMCG_KMEM */
+static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
+			struct mem_cgroup *memcg, int priority)
+{
+	return 0;
+}
+#endif /* CONFIG_MEMCG_KMEM */
+
 /**
  * shrink_slab - shrink slab caches
  * @gfp_mask: allocation context
@@ -464,6 +517,9 @@  static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 	if (unlikely(test_tsk_thread_flag(current, TIF_MEMDIE)))
 		return 0;
 
+	if (memcg && !mem_cgroup_is_root(memcg))
+		return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
+
 	if (!down_read_trylock(&shrinker_rwsem)) {
 		/*
 		 * If we would return 0, our callers would understand that we
@@ -483,7 +539,7 @@  static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 			.for_drop_caches = for_drop_caches,
 		};
 
-		if (memcg && !(shrinker->flags & SHRINKER_MEMCG_AWARE))
+		if (!!memcg != !!(shrinker->flags & SHRINKER_MEMCG_AWARE))
 			continue;
 
 		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))