[RHEL7,COMMIT] ms/slab, slub: skip unnecessary kasan_cache_shutdown()

Submitted by Konstantin Khorenko on Nov. 24, 2018, 11:52 a.m.


Message ID 201811241152.wAOBqdpJ029799@finist-ce7.sw.ru
State New
Series "ms/kasan: depend on CONFIG_SLUB_DEBUG"

Commit Message

The commit is pushed to "branch-rh7-3.10.0-862.20.2.vz7.73.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-862.20.2.vz7.73.8
------>
commit 1146c87b15893ab6fbe8bf563e84dc8ab9edcc87
Author: Shakeel Butt <shakeelb@google.com>
Date:   Thu Apr 5 16:21:57 2018 -0700

    ms/slab, slub: skip unnecessary kasan_cache_shutdown()
    
    The kasan quarantine is designed to delay freeing slab objects to catch
    use-after-free.  The quarantine can be large (several percent of machine
    memory size).  When kmem_caches are deleted related objects are flushed
    from the quarantine but this requires scanning the entire quarantine
    which can be very slow.  We have seen the kernel busily working on this
    while holding slab_mutex and badly affecting cache_reaper, slabinfo
    readers and memcg kmem cache creations.
    
    It can be easily reproduced by the following script:
    
            yes . | head -1000000 | xargs stat > /dev/null
            for i in `seq 1 10`; do
                    seq 500 | (cd /cg/memory && xargs mkdir)
                    seq 500 | xargs -I{} sh -c 'echo $BASHPID > \
                            /cg/memory/{}/tasks && exec stat .' > /dev/null
                    seq 500 | (cd /cg/memory && xargs rmdir)
            done
    
    The busy stack:
        kasan_cache_shutdown
        shutdown_cache
        memcg_destroy_kmem_caches
        mem_cgroup_css_free
        css_free_rwork_fn
        process_one_work
        worker_thread
        kthread
        ret_from_fork
    
    This patch is based on the observation that if the kmem_cache to be
    destroyed is empty then there should not be any objects of this cache in
    the quarantine.
    
    Without the patch the script got stuck for a couple of hours.  With the
    patch the script completed within a second.
    
    Link: http://lkml.kernel.org/r/20180327230603.54721-1-shakeelb@google.com
    Signed-off-by: Shakeel Butt <shakeelb@google.com>
    Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
    Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
    Acked-by: Christoph Lameter <cl@linux.com>
    Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Greg Thelen <gthelen@google.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    
    May be fixes: https://jira.sw.ru/browse/PSBM-89434
    and https://pmc.acronis.com/browse/VSTOR-17956
    
    (cherry picked from commit f9e13c0a5a33d1eaec374d6d4dab53a4f72756a0)
    Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
    
    Backport note: for_each_kmem_cache_node() is not available in the RHEL7
    kernel, so use the old-style for_each_online_node() instead.
---
 mm/kasan/kasan.c |  3 ++-
 mm/slab.c        | 12 ++++++++++++
 mm/slab.h        |  1 +
 mm/slub.c        | 15 +++++++++++++++
 4 files changed, 30 insertions(+), 1 deletion(-)


diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index be5f7d2152b2..b536e2b94897 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -411,7 +411,8 @@  void kasan_cache_shrink(struct kmem_cache *cache)
 
 void kasan_cache_shutdown(struct kmem_cache *cache)
 {
-	quarantine_remove_cache(cache);
+	if (!__kmem_cache_empty(cache))
+		quarantine_remove_cache(cache);
 }
 
 size_t kasan_metadata_size(struct kmem_cache *cache)
diff --git a/mm/slab.c b/mm/slab.c
index bcfd2a84b3ec..9399da9d723f 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2524,6 +2524,18 @@  static int drain_freelist(struct kmem_cache *cache,
 	return nr_freed;
 }
 
+bool __kmem_cache_empty(struct kmem_cache *s)
+{
+	int node;
+	struct kmem_cache_node *n;
+
+	for_each_kmem_cache_node(s, node, n)
+		if (!list_empty(&n->slabs_full) ||
+		    !list_empty(&n->slabs_partial))
+			return false;
+	return true;
+}
+
 int __kmem_cache_shrink(struct kmem_cache *cachep, bool deactivate)
 {
 	int ret = 0, i = 0;
diff --git a/mm/slab.h b/mm/slab.h
index 292fccf68182..4b00493445d8 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -125,6 +125,7 @@  __kmem_cache_alias(const char *name, size_t size, size_t align,
 
 #define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
 
+bool __kmem_cache_empty(struct kmem_cache *);
 int __kmem_cache_shutdown(struct kmem_cache *);
 void __kmem_cache_release(struct kmem_cache *);
 int __kmem_cache_shrink(struct kmem_cache *, bool);
diff --git a/mm/slub.c b/mm/slub.c
index f918c77bee17..d5b0bc1fcd56 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3476,6 +3476,21 @@  static void free_partial(struct kmem_cache *s, struct kmem_cache_node *n)
 	spin_unlock_irq(&n->list_lock);
 }
 
+bool __kmem_cache_empty(struct kmem_cache *s)
+{
+	int node;
+
+	for_each_online_node(node) {
+		struct kmem_cache_node *n = get_node(s, node);
+
+		if (!n)
+			continue;
+		if (n->nr_partial || slabs_node(s, node))
+			return false;
+	}
+	return true;
+}
+
 /*
  * Release all resources used by a slab cache.
  */