[vz7,v2] mm/slab: use monotonic serial number for naming per memcg caches

Submitted by Konstantin Khorenko on Oct. 15, 2018, 10:15 a.m.

Details

Message ID 20181015101528.23569-1-khorenko@virtuozzo.com
State New
Series "mm/slab: use monotonic serial number for naming per memcg caches"
Headers show

Commit Message

Konstantin Khorenko Oct. 15, 2018, 10:15 a.m.
Currently, we use mem_cgroup->kmemcg_id to guarantee kmem_cache->name
uniqueness. Unfortunately kmemcg_id release happens on cgroup offline
when kmem caches are  still alive and their sysfs entries are not removed.

Thus it's possible to start/suspend/resume/stop a Container and get into
situation when mem cgroup is offline (thus mem_cgroup::kmemcg_id is
released), but not freed and corresponding slab caches are alive along
with their sysfs entries like:
  /sys/kernel/slab/filp/cgroup/filp($KMEMCG_ID:$MEMCG_NAME)

If we start a Container after that, new memory cgroup is created and
gets same kmemcg_id, new kmem_cache is created and sysfs_slab_add()
fails due to duplicate name provided which triggers a warning:

  WARNING: CPU: 0 PID: 31346 at lib/kobject.c:239
    kobject_add_internal+0x6ef/0x910
  kobject_add_internal failed for kmalloc-512(3:100) with -EEXIST, don't
    try to register things with the same name in the same directory.

In older kernels we use css_id(mem_cgroup_css(memcg)), but now cssid is
dropped.
In mainstream monotonic counter css->serial_nr is used instead,
so let's add monotonic serial_nr to kmem_cache->name as well.
No need to store is anywhere as kmem_cache has a link to memcg.

https://jira.sw.ru/browse/PSBM-89047

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>

v2: Holding slab_mutex makes atomic variable redundant.
    No need to store serial number assigned to a kmem_cache,
    because kmem_cache has a link to memcg.

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
---
 mm/slab_common.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Patch hide | download patch | download mbox

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 1b2a7d208596..b08f48541f4a 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -421,6 +421,7 @@  static void do_kmem_cache_release(struct list_head *release,
 void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 			     struct kmem_cache *root_cache)
 {
+	static u64 serial_nr_cursor = 0;
 	static char memcg_name_buf[NAME_MAX + 1]; /* protected by slab_mutex */
 	struct memcg_cache_array *arr;
 	struct kmem_cache *s = NULL;
@@ -453,8 +454,14 @@  void memcg_create_kmem_cache(struct mem_cgroup *memcg,
 	strlcpy(memcg_name_buf, cgroup_name(mem_cgroup_css(memcg)->cgroup),
 		NAME_MAX + 1);
 	rcu_read_unlock();
-	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
-			       idx, memcg_name_buf);
+
+	/*
+	 * Monotonically increasing serial number in the name guarantees
+	 * kmem_cache name uniqueness even if a mem cgroup goes offline,
+	 * releases kmemcg_id and new mem cgroup gets same kmemcg_id.
+	 */
+	cache_name = kasprintf(GFP_KERNEL, "%s(%llu:%s)", root_cache->name,
+			       serial_nr_cursor++, memcg_name_buf);
 	if (!cache_name)
 		goto out_unlock;
 

Comments

Kirill Tkhai Oct. 15, 2018, 12:12 p.m.
On 15.10.2018 13:15, Konstantin Khorenko wrote:
> Currently, we use mem_cgroup->kmemcg_id to guarantee kmem_cache->name
> uniqueness. Unfortunately kmemcg_id release happens on cgroup offline
> when kmem caches are  still alive and their sysfs entries are not removed.
> 
> Thus it's possible to start/suspend/resume/stop a Container and get into
> situation when mem cgroup is offline (thus mem_cgroup::kmemcg_id is
> released), but not freed and corresponding slab caches are alive along
> with their sysfs entries like:
>   /sys/kernel/slab/filp/cgroup/filp($KMEMCG_ID:$MEMCG_NAME)
> 
> If we start a Container after that, new memory cgroup is created and
> gets same kmemcg_id, new kmem_cache is created and sysfs_slab_add()
> fails due to duplicate name provided which triggers a warning:
> 
>   WARNING: CPU: 0 PID: 31346 at lib/kobject.c:239
>     kobject_add_internal+0x6ef/0x910
>   kobject_add_internal failed for kmalloc-512(3:100) with -EEXIST, don't
>     try to register things with the same name in the same directory.
> 
> In older kernels we use css_id(mem_cgroup_css(memcg)), but now cssid is
> dropped.
> In mainstream monotonic counter css->serial_nr is used instead,
> so let's add monotonic serial_nr to kmem_cache->name as well.
> No need to store is anywhere as kmem_cache has a link to memcg.
> 
> https://jira.sw.ru/browse/PSBM-89047
> 
> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
> 
> v2: Holding slab_mutex makes atomic variable redundant.
>     No need to store serial number assigned to a kmem_cache,
>     because kmem_cache has a link to memcg.
> 
> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>

Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>

> ---
>  mm/slab_common.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 1b2a7d208596..b08f48541f4a 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -421,6 +421,7 @@ static void do_kmem_cache_release(struct list_head *release,
>  void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  			     struct kmem_cache *root_cache)
>  {
> +	static u64 serial_nr_cursor = 0;
>  	static char memcg_name_buf[NAME_MAX + 1]; /* protected by slab_mutex */
>  	struct memcg_cache_array *arr;
>  	struct kmem_cache *s = NULL;
> @@ -453,8 +454,14 @@ void memcg_create_kmem_cache(struct mem_cgroup *memcg,
>  	strlcpy(memcg_name_buf, cgroup_name(mem_cgroup_css(memcg)->cgroup),
>  		NAME_MAX + 1);
>  	rcu_read_unlock();
> -	cache_name = kasprintf(GFP_KERNEL, "%s(%d:%s)", root_cache->name,
> -			       idx, memcg_name_buf);
> +
> +	/*
> +	 * Monotonically increasing serial number in the name guarantees
> +	 * kmem_cache name uniqueness even if a mem cgroup goes offline,
> +	 * releases kmemcg_id and new mem cgroup gets same kmemcg_id.
> +	 */
> +	cache_name = kasprintf(GFP_KERNEL, "%s(%llu:%s)", root_cache->name,
> +			       serial_nr_cursor++, memcg_name_buf);
>  	if (!cache_name)
>  		goto out_unlock;
>  
>