[rh7] mm/memcg: Bypass charges to offlined cgroup.

Submitted by Andrey Ryabinin on July 13, 2018, 11:28 a.m.

Details

Message ID 20180713112800.21720-1-aryabinin@virtuozzo.com
State New
Series "mm/memcg: Bypass charges to offlined cgroup."
Headers show

Commit Message

Andrey Ryabinin July 13, 2018, 11:28 a.m.
Charges to offlined cgroup can happen. E.g. swapped
KSM page may charge to offlined cgroup. If such charge
races with reparenting during the offlining process
it may result in hang in mem_cgroup_reparent_charges()

Let's say we have cgroup A and cgroup B. A is parent of
B, B is offlined already, and A in the process of offlining.
During offline of A, we walk through all childs of A
and reparent their charges. After that, we reparent the A itself
If charge to B cgroup will appear after the reparenting
of B and before reparenting A, the reparanting of A will hang.
Charge to B also increases ->memory counter of its parent A,
so mem_cgroup_reparent_charges(A) will never satisfy the condition
'A->memory - A->kmem > 0'
which is required to break the loop.

https://jira.sw.ru/browse/PSBM-86092
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
 mm/memcontrol.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

Patch hide | download patch | download mbox

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 59cf47972f9e..8b979d88045c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2909,6 +2909,28 @@  static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
 	return -EINTR;
 
 done_restock:
+
+	/*
+	 * Cancel charge in case if cgroup was offlined while we were here,
+	 * otherwise we can get a pending user memory charge to an offline
+	 * cgroup, which might race with reparanting in mem_cgroup_css_offline()
+	 * and result in hang.
+	 *
+	 * Note, no need to issue an explicit barrier here, because a
+	 * successful charge implies full memory barrier.
+	 */
+	if (unlikely(memcg->is_offline)) {
+		page_counter_uncharge(&memcg->memory, batch);
+		if (do_swap_account)
+			page_counter_uncharge(&memcg->memsw, batch);
+		if (cache_charge)
+			page_counter_uncharge(&memcg->cache, nr_pages);
+		if (kmem_charge)
+			page_counter_uncharge(&memcg->kmem, nr_pages);
+
+		goto bypass;
+	}
+
 	if (batch > nr_pages)
 		refill_stock(memcg, batch - nr_pages);
 done:

Comments

Konstantin Khorenko July 13, 2018, 12:34 p.m.
Do we need it to RK as well for older kernels?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 07/13/2018 02:28 PM, Andrey Ryabinin wrote:
> Charges to offlined cgroup can happen. E.g. swapped
> KSM page may charge to offlined cgroup. If such charge
> races with reparenting during the offlining process
> it may result in hang in mem_cgroup_reparent_charges()
>
> Let's say we have cgroup A and cgroup B. A is parent of
> B, B is offlined already, and A in the process of offlining.
> During offline of A, we walk through all childs of A
> and reparent their charges. After that, we reparent the A itself
> If charge to B cgroup will appear after the reparenting
> of B and before reparenting A, the reparanting of A will hang.
> Charge to B also increases ->memory counter of its parent A,
> so mem_cgroup_reparent_charges(A) will never satisfy the condition
> 'A->memory - A->kmem > 0'
> which is required to break the loop.
>
> https://jira.sw.ru/browse/PSBM-86092
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> ---
>  mm/memcontrol.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 59cf47972f9e..8b979d88045c 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2909,6 +2909,28 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
>  	return -EINTR;
>
>  done_restock:
> +
> +	/*
> +	 * Cancel charge in case if cgroup was offlined while we were here,
> +	 * otherwise we can get a pending user memory charge to an offline
> +	 * cgroup, which might race with reparanting in mem_cgroup_css_offline()
> +	 * and result in hang.
> +	 *
> +	 * Note, no need to issue an explicit barrier here, because a
> +	 * successful charge implies full memory barrier.
> +	 */
> +	if (unlikely(memcg->is_offline)) {
> +		page_counter_uncharge(&memcg->memory, batch);
> +		if (do_swap_account)
> +			page_counter_uncharge(&memcg->memsw, batch);
> +		if (cache_charge)
> +			page_counter_uncharge(&memcg->cache, nr_pages);
> +		if (kmem_charge)
> +			page_counter_uncharge(&memcg->kmem, nr_pages);
> +
> +		goto bypass;
> +	}
> +
>  	if (batch > nr_pages)
>  		refill_stock(memcg, batch - nr_pages);
>  done:
>
Andrey Ryabinin July 13, 2018, 12:39 p.m.
On 07/13/2018 03:34 PM, Konstantin Khorenko wrote:
> Do we need it to RK as well for older kernels?
> 

Yes.

> -- 
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
> On 07/13/2018 02:28 PM, Andrey Ryabinin wrote:
>> Charges to offlined cgroup can happen. E.g. swapped
>> KSM page may charge to offlined cgroup. If such charge
>> races with reparenting during the offlining process
>> it may result in hang in mem_cgroup_reparent_charges()
>>
>> Let's say we have cgroup A and cgroup B. A is parent of
>> B, B is offlined already, and A in the process of offlining.
>> During offline of A, we walk through all childs of A
>> and reparent their charges. After that, we reparent the A itself
>> If charge to B cgroup will appear after the reparenting
>> of B and before reparenting A, the reparanting of A will hang.
>> Charge to B also increases ->memory counter of its parent A,
>> so mem_cgroup_reparent_charges(A) will never satisfy the condition
>> 'A->memory - A->kmem > 0'
>> which is required to break the loop.
>>
>> https://jira.sw.ru/browse/PSBM-86092
>> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
>> ---
>>  mm/memcontrol.c | 22 ++++++++++++++++++++++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 59cf47972f9e..8b979d88045c 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2909,6 +2909,28 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask, bool kmem_charge
>>      return -EINTR;
>>
>>  done_restock:
>> +
>> +    /*
>> +     * Cancel charge in case if cgroup was offlined while we were here,
>> +     * otherwise we can get a pending user memory charge to an offline
>> +     * cgroup, which might race with reparanting in mem_cgroup_css_offline()
>> +     * and result in hang.
>> +     *
>> +     * Note, no need to issue an explicit barrier here, because a
>> +     * successful charge implies full memory barrier.
>> +     */
>> +    if (unlikely(memcg->is_offline)) {
>> +        page_counter_uncharge(&memcg->memory, batch);
>> +        if (do_swap_account)
>> +            page_counter_uncharge(&memcg->memsw, batch);
>> +        if (cache_charge)
>> +            page_counter_uncharge(&memcg->cache, nr_pages);
>> +        if (kmem_charge)
>> +            page_counter_uncharge(&memcg->kmem, nr_pages);
>> +
>> +        goto bypass;
>> +    }
>> +
>>      if (batch > nr_pages)
>>          refill_stock(memcg, batch - nr_pages);
>>  done:
>>