[Devel,rh7] mm/zswap: fix potential deadlock in zswap_frontswap_store()

Submitted by Andrey Ryabinin on March 28, 2017, 9:17 a.m.

Details

Message ID 20170328091724.17401-1-aryabinin@virtuozzo.com
State New
Series "mm/zswap: fix potential deadlock in zswap_frontswap_store()"
Headers show

Commit Message

Andrey Ryabinin March 28, 2017, 9:17 a.m.
zswap_frontswap_store() is called during memory reclaim, therefore
it can't use __GFP_FS, otherwise we may renter into fs code and deadlock.
zswap_frontswap_store() also shouldn't do __GFP_IO, to avoid reentering
into zswap code itself.

 WARNING: at mm/slub.c:1250 slab_pre_alloc_hook.isra.42.part.43+0x15/0x17()
...
 Call Trace:
 [<ffffffff8163413f>] dump_stack+0x19/0x1b
 [<ffffffff8107b620>] warn_slowpath_common+0x70/0xb0
 [<ffffffff8107b76a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff816311c8>] slab_pre_alloc_hook.isra.42.part.43+0x15/0x17
 [<ffffffff811d7235>] kmem_cache_alloc+0x55/0x220
 [<ffffffff811c4a09>] zswap_frontswap_store+0xe9/0x320
 [<ffffffff811c3d5b>] __frontswap_store+0x7b/0x110
 [<ffffffff811bf6a3>] swap_writepage+0x23/0x80
 [<ffffffff8118fff2>] shrink_page_list+0x4b2/0xa80
 [<ffffffff81190c1b>] shrink_inactive_list+0x1fb/0x6c0
 [<ffffffff811918b5>] shrink_lruvec+0x395/0x800
 [<ffffffff81191e0f>] shrink_zone+0xef/0x2d0
 [<ffffffff81192390>] do_try_to_free_pages+0x170/0x530
 [<ffffffff81192825>] try_to_free_pages+0xd5/0x160
 [<ffffffff81185aa5>] __alloc_pages_nodemask+0x8a5/0xc10
 [<ffffffff811cb1da>] alloc_pages_current+0xaa/0x170
 [<ffffffff811803de>] __get_free_pages+0xe/0x50
 [<ffffffffa029a5aa>] jbd2_alloc+0x8a/0x90 [jbd2]
 [<ffffffffa028fab4>] do_get_write_access+0x1d4/0x4d0 [jbd2]
 [<ffffffffa028fdd7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
 [<ffffffffa02e8e0b>] __ext4_journal_get_write_access+0x3b/0x80 [ext4]
 [<ffffffffa02efccd>] ext4_mb_mark_diskspace_used+0x7d/0x4f0 [ext4]
 [<ffffffffa02f156d>] ext4_mb_new_blocks+0x36d/0x620 [ext4]
 [<ffffffffa02e601d>] ext4_ext_map_blocks+0x49d/0xed0 [ext4]
 [<ffffffffa02b2689>] ext4_map_blocks+0x179/0x590 [ext4]
 [<ffffffffa02b5bc2>] ext4_writepages+0x692/0xd60 [ext4]
 [<ffffffff8118828e>] do_writepages+0x1e/0x40
 [<ffffffff812249da>] __writeback_single_inode+0x7a/0x410
 [<ffffffff812257d7>] writeback_sb_inodes+0x2c7/0x540
 [<ffffffff81225aef>] __writeback_inodes_wb+0x9f/0xd0
 [<ffffffff81225d83>] wb_writeback+0x263/0x320
 [<ffffffff8122821d>] bdi_writeback_workfn+0x11d/0x420
 [<ffffffff8109ef4b>] process_one_work+0x17b/0x470
 [<ffffffff8109fd1b>] worker_thread+0x11b/0x400
 [<ffffffff810a779f>] kthread+0xcf/0xe0
 [<ffffffff81644918>] ret_from_fork+0x58/0x90
 ---[ end trace 892c68052e9c3d3f ]---

Also add __GFP_NOWARN as failures here is fine.

https://jira.sw.ru/browse/PSBM-58279
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
---
 mm/zswap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/mm/zswap.c b/mm/zswap.c
index e2f8c7e..c582024 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -665,7 +665,7 @@  static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	}
 
 	/* allocate entry */
-	entry = zswap_entry_cache_alloc(GFP_KERNEL);
+	entry = zswap_entry_cache_alloc(GFP_NOIO | __GFP_NOWARN);
 	if (!entry) {
 		zswap_reject_kmemcache_fail++;
 		ret = -ENOMEM;

Comments

Konstantin Khorenko March 29, 2017, 12:32 p.m.
Andrey, did you send it to mainstream as well?

Please post a link here.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 03/28/2017 12:17 PM, Andrey Ryabinin wrote:
> zswap_frontswap_store() is called during memory reclaim, therefore
> it can't use __GFP_FS, otherwise we may renter into fs code and deadlock.
> zswap_frontswap_store() also shouldn't do __GFP_IO, to avoid reentering
> into zswap code itself.
>
>  WARNING: at mm/slub.c:1250 slab_pre_alloc_hook.isra.42.part.43+0x15/0x17()
> ...
>  Call Trace:
>  [<ffffffff8163413f>] dump_stack+0x19/0x1b
>  [<ffffffff8107b620>] warn_slowpath_common+0x70/0xb0
>  [<ffffffff8107b76a>] warn_slowpath_null+0x1a/0x20
>  [<ffffffff816311c8>] slab_pre_alloc_hook.isra.42.part.43+0x15/0x17
>  [<ffffffff811d7235>] kmem_cache_alloc+0x55/0x220
>  [<ffffffff811c4a09>] zswap_frontswap_store+0xe9/0x320
>  [<ffffffff811c3d5b>] __frontswap_store+0x7b/0x110
>  [<ffffffff811bf6a3>] swap_writepage+0x23/0x80
>  [<ffffffff8118fff2>] shrink_page_list+0x4b2/0xa80
>  [<ffffffff81190c1b>] shrink_inactive_list+0x1fb/0x6c0
>  [<ffffffff811918b5>] shrink_lruvec+0x395/0x800
>  [<ffffffff81191e0f>] shrink_zone+0xef/0x2d0
>  [<ffffffff81192390>] do_try_to_free_pages+0x170/0x530
>  [<ffffffff81192825>] try_to_free_pages+0xd5/0x160
>  [<ffffffff81185aa5>] __alloc_pages_nodemask+0x8a5/0xc10
>  [<ffffffff811cb1da>] alloc_pages_current+0xaa/0x170
>  [<ffffffff811803de>] __get_free_pages+0xe/0x50
>  [<ffffffffa029a5aa>] jbd2_alloc+0x8a/0x90 [jbd2]
>  [<ffffffffa028fab4>] do_get_write_access+0x1d4/0x4d0 [jbd2]
>  [<ffffffffa028fdd7>] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
>  [<ffffffffa02e8e0b>] __ext4_journal_get_write_access+0x3b/0x80 [ext4]
>  [<ffffffffa02efccd>] ext4_mb_mark_diskspace_used+0x7d/0x4f0 [ext4]
>  [<ffffffffa02f156d>] ext4_mb_new_blocks+0x36d/0x620 [ext4]
>  [<ffffffffa02e601d>] ext4_ext_map_blocks+0x49d/0xed0 [ext4]
>  [<ffffffffa02b2689>] ext4_map_blocks+0x179/0x590 [ext4]
>  [<ffffffffa02b5bc2>] ext4_writepages+0x692/0xd60 [ext4]
>  [<ffffffff8118828e>] do_writepages+0x1e/0x40
>  [<ffffffff812249da>] __writeback_single_inode+0x7a/0x410
>  [<ffffffff812257d7>] writeback_sb_inodes+0x2c7/0x540
>  [<ffffffff81225aef>] __writeback_inodes_wb+0x9f/0xd0
>  [<ffffffff81225d83>] wb_writeback+0x263/0x320
>  [<ffffffff8122821d>] bdi_writeback_workfn+0x11d/0x420
>  [<ffffffff8109ef4b>] process_one_work+0x17b/0x470
>  [<ffffffff8109fd1b>] worker_thread+0x11b/0x400
>  [<ffffffff810a779f>] kthread+0xcf/0xe0
>  [<ffffffff81644918>] ret_from_fork+0x58/0x90
>  ---[ end trace 892c68052e9c3d3f ]---
>
> Also add __GFP_NOWARN as failures here is fine.
>
> https://jira.sw.ru/browse/PSBM-58279
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> ---
>  mm/zswap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index e2f8c7e..c582024 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -665,7 +665,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
>  	}
>
>  	/* allocate entry */
> -	entry = zswap_entry_cache_alloc(GFP_KERNEL);
> +	entry = zswap_entry_cache_alloc(GFP_NOIO | __GFP_NOWARN);
>  	if (!entry) {
>  		zswap_reject_kmemcache_fail++;
>  		ret = -ENOMEM;
>
Andrey Ryabinin March 31, 2017, 3:34 p.m.
On 03/29/2017 03:32 PM, Konstantin Khorenko wrote:
> Andrey, did you send it to mainstream as well?
> 
> Please post a link here.
> 

https://lkml.kernel.org/r/20170331153009.11397-1-aryabinin@virtuozzo.com

Note it slightly different, after looking at this code again I decided to made small cleanup.
I could sync our code if/when it will be accepted.
Konstantin Khorenko April 3, 2017, 9:52 a.m.
On 03/31/2017 06:34 PM, Andrey Ryabinin wrote:
>
>
> On 03/29/2017 03:32 PM, Konstantin Khorenko wrote:
>> Andrey, did you send it to mainstream as well?
>>
>> Please post a link here.
>>
>
> https://lkml.kernel.org/r/20170331153009.11397-1-aryabinin@virtuozzo.com
>
> Note it slightly different, after looking at this code again I decided to made small cleanup.
> I could sync our code if/when it will be accepted.

Good! Yes, please, sync the code once the patch is accepted in ms.
Konstantin Khorenko May 12, 2017, 12:36 p.m.
No answer from ms guys?
To ping once again?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 04/03/2017 12:52 PM, Konstantin Khorenko wrote:
> On 03/31/2017 06:34 PM, Andrey Ryabinin wrote:
>>
>>
>> On 03/29/2017 03:32 PM, Konstantin Khorenko wrote:
>>> Andrey, did you send it to mainstream as well?
>>>
>>> Please post a link here.
>>>
>>
>> https://lkml.kernel.org/r/20170331153009.11397-1-aryabinin@virtuozzo.com
>>
>> Note it slightly different, after looking at this code again I decided to made small cleanup.
>> I could sync our code if/when it will be accepted.
>
> Good! Yes, please, sync the code once the patch is accepted in ms.
> _______________________________________________
> Devel mailing list
> Devel@openvz.org
> https://lists.openvz.org/mailman/listinfo/devel
> .
>
Andrey Ryabinin May 12, 2017, 12:42 p.m.
On 05/12/2017 03:36 PM, Konstantin Khorenko wrote:
> No answer from ms guys?
> To ping once again?
> 

There were answers.
It was figured out that there is no bug here, as PF_MEMALLOC protects us from recursion.
So, the patch is not needed.

> -- 
> Best regards,
> 
> Konstantin Khorenko,
> Virtuozzo Linux Kernel Team
> 
> On 04/03/2017 12:52 PM, Konstantin Khorenko wrote:
>> On 03/31/2017 06:34 PM, Andrey Ryabinin wrote:
>>>
>>>
>>> On 03/29/2017 03:32 PM, Konstantin Khorenko wrote:
>>>> Andrey, did you send it to mainstream as well?
>>>>
>>>> Please post a link here.
>>>>
>>>
>>> https://lkml.kernel.org/r/20170331153009.11397-1-aryabinin@virtuozzo.com
>>>
>>> Note it slightly different, after looking at this code again I decided to made small cleanup.
>>> I could sync our code if/when it will be accepted.
>>
>> Good! Yes, please, sync the code once the patch is accepted in ms.
>> _______________________________________________
>> Devel mailing list
>> Devel@openvz.org
>> https://lists.openvz.org/mailman/listinfo/devel
>> .
>>