[Devel,RHEL7,COMMIT] ms/mm: memcontrol: only mark charged pages with PageKmemcg

Submitted by Konstantin Khorenko on Jan. 16, 2017, 4:27 p.m.

Details

Message ID 201701161627.v0GGRHTI029270@finist_cl7.x64_64.work.ct
State New
Series "Series without cover letter"

Commit Message

Konstantin Khorenko Jan. 16, 2017, 4:27 p.m.
The commit is pushed to "branch-rh7-3.10.0-514.vz7.27.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-514.vz7.27.10
------>
commit cd4b4b8807ac3545017b3153ced5e81b27aa9346
Author: Vladimir Davydov <vdavydov@virtuozzo.com>
Date:   Mon Jan 16 20:27:17 2017 +0400

    ms/mm: memcontrol: only mark charged pages with PageKmemcg
    
    To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
    which sets page->_mapcount to -512.  Currently, we set/clear PageKmemcg
    in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
    with __GFP_ACCOUNT, including those that aren't actually charged to any
    cgroup, i.e. allocated from the root cgroup context.  To avoid overhead
    in case cgroups are not used, we only do that if memcg_kmem_enabled() is
    true.  The latter is set iff there are kmem-enabled memory cgroups
    (online or offline).  The root cgroup is not considered kmem-enabled.
    
    As a result, if a page is allocated with __GFP_ACCOUNT for the root
    cgroup when there are kmem-enabled memory cgroups and is freed after all
    kmem-enabled memory cgroups were removed, e.g.
    
      # no memory cgroups have been created yet, create one
      mkdir /sys/fs/cgroup/memory/test
      # run something allocating pages with __GFP_ACCOUNT, e.g.
      # a program using pipe
      dmesg | tail
      # remove the memory cgroup
      rmdir /sys/fs/cgroup/memory/test
    
    we'll get a bad page state bug complaining about page->_mapcount != -1:
    
      BUG: Bad page state in process swapper/0  pfn:1fd945c
      page:ffffea007f651700 count:0 mapcount:-511 mapping:          (null) index:0x0
      flags: 0x1000000000000000()
    
    To avoid that, let's mark with PageKmemcg only those pages that are
    actually charged to and hence pin a non-root memory cgroup.
    
    Fixes: 4949148ad433 ("mm: charge/uncharge kmemcg from generic page allocator paths")
    Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    
    https://jira.sw.ru/browse/PSBM-51558
    (cherry picked from commit c4159a75b64c0e67caededf4d7372c1b58a5f42a)
    Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
 mm/memcontrol.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0183a9c..dc83f4e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7001,8 +7001,10 @@  static void uncharge_list(struct list_head *page_list)
 			else
 				nr_file += nr_pages;
 			pgpgout++;
-		} else
+		} else {
 			nr_kmem += 1 << compound_order(page);
+			__ClearPageKmemcg(page);
+		}
 
 		if (pc->flags & PCG_MEM)
 			nr_mem += nr_pages;
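
The failure described in the commit message, and the effect of clearing the mark on uncharge, can be modelled in userspace. The following is only a sketch under stated assumptions: `struct toy_page`, `old_alloc()`, `old_free()`, `new_alloc()`, `new_free()`, `reported_mapcount()` and `page_state_ok()` are hypothetical names invented for illustration and are not kernel APIs. The only values taken from the message are -512 (the PageKmemcg mark), -1 (the expected `_mapcount` of a freed page), and the reported `mapcount:-511`, which is consistent with the dump printing `_mapcount + 1`.

```c
#include <stdbool.h>

/* Values from the commit message: a freed page must have _mapcount == -1,
 * and PageKmemcg stores -512 in _mapcount. */
#define MAPCOUNT_FREE   (-1)
#define MAPCOUNT_KMEMCG (-512)

/* Hypothetical, stripped-down stand-in for struct page. */
struct toy_page {
    int  _mapcount;
    bool charged;   /* true if the page pins a non-root memory cgroup */
};

/* Assumption: the bad-page dump shows _mapcount + 1, hence "mapcount:-511". */
int reported_mapcount(const struct toy_page *p)
{
    return p->_mapcount + 1;
}

/* Old scheme: mark on allocation whenever kmem accounting is enabled,
 * whether or not the page was actually charged. */
void old_alloc(struct toy_page *p, bool kmem_enabled, bool charged)
{
    p->_mapcount = MAPCOUNT_FREE;
    p->charged = charged;
    if (kmem_enabled)
        p->_mapcount = MAPCOUNT_KMEMCG;
}

/* Old scheme: clear on free only if kmem accounting is still enabled. */
void old_free(struct toy_page *p, bool kmem_enabled)
{
    if (kmem_enabled)
        p->_mapcount = MAPCOUNT_FREE;
    p->charged = false;
}

/* New scheme: mark only pages that are actually charged. */
void new_alloc(struct toy_page *p, bool charged)
{
    p->_mapcount = charged ? MAPCOUNT_KMEMCG : MAPCOUNT_FREE;
    p->charged = charged;
}

/* New scheme: clear the mark when the page is uncharged. */
void new_free(struct toy_page *p)
{
    if (p->charged)
        p->_mapcount = MAPCOUNT_FREE;
    p->charged = false;
}

/* The "bad page state" check: a freed page must have _mapcount == -1. */
bool page_state_ok(const struct toy_page *p)
{
    return p->_mapcount == MAPCOUNT_FREE;
}
```

In this model, calling `old_alloc(&p, true, false)` followed by `old_free(&p, false)` reproduces the scenario from the commit message: an uncharged root-cgroup page is marked while a kmem-enabled cgroup exists and is freed after the last one is gone, so the mark is never cleared. With `new_alloc()` and `new_free()` the mark is tied to the charge itself, so the global enabled state at free time no longer matters.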