[9/7] Added --mix-pre-dump option for zdtm suite

Submitted by Radostin Stoyanov on Aug. 31, 2019, 7:37 p.m.

Details

Message ID 7f31d78a-9af8-55b2-54ca-e8095b12b440@gmail.com
State New
Headers show

Patch hide | download patch | download mbox

diff --git a/criu/mem.c b/criu/mem.c
index e19688d..740992d 100644
--- a/criu/mem.c
+++ b/criu/mem.c
@@ -351,7 +351,7 @@  static int generate_vma_iovs(struct pstree_item 
*item, struct vma_area *vma,
                   struct page_pipe *pp, struct page_xfer *xfer,
                   struct parasite_dump_pages_args *args,
                   struct parasite_ctl *ctl, pmc_t *pmc,
-                 bool has_parent, bool pre_dump)
+                 bool has_parent, struct mem_dump_ctl *mdc)



Comments

Abhishek Dubey Sept. 2, 2019, 6:19 a.m.
On 01/09/19 1:07 AM, Radostin Stoyanov wrote:
>
> On 31/08/2019 13:12, Abhishek Dubey wrote:
>> READ and SPLICE mode pre-dumps can happen
>> in any order. New option --mix-pre-dump is
>> added for zdtm suite only, to test integrity
>> of random order of pre-dump modes.
> It is important when a test fails to be able to reproduce the error 
> consistently. The use of random order for testing might make debugging 
> difficult.
Sure. I will remove random to preserve consistency.
>
>> Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
>> ---
>>   criu/cr-dump.c       |  4 ++--
>>   criu/cr-restore.c    |  4 ++--
>>   criu/include/stats.h |  4 ++--
>>   criu/mem.c           | 65 
>> ++++++++++++++++++++++++++++++++++++++++------------
>>   criu/stats.c         | 42 ++++++++++++++++++++++++++++++++-
>>   images/stats.proto   |  1 +
>>   test/zdtm.py         | 13 ++++++++++-
>>   7 files changed, 110 insertions(+), 23 deletions(-)
>>
>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>> index b916e0a..f62362f 100644
>> --- a/criu/cr-dump.c
>> +++ b/criu/cr-dump.c
>> @@ -1554,7 +1554,7 @@ err:
>>       if (ret)
>>           pr_err("Pre-dumping FAILED.\n");
>>       else {
>> -        write_stats(DUMP_STATS);
>> +        write_stats(DUMP_STATS, 1);
>>           pr_info("Pre-dumping finished successfully\n");
>>       }
>>       return ret;
>> @@ -1760,7 +1760,7 @@ static int cr_dump_finish(int ret)
>>       if (ret) {
>>           pr_err("Dumping FAILED.\n");
>>       } else {
>> -        write_stats(DUMP_STATS);
>> +        write_stats(DUMP_STATS, 0);
>>           pr_info("Dumping finished successfully\n");
>>       }
>>       return post_dump_ret ? : (ret != 0);
>> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
>> index de0b2cb..8ca46e4 100644
>> --- a/criu/cr-restore.c
>> +++ b/criu/cr-restore.c
>> @@ -2226,7 +2226,7 @@ skip_ns_bouncing:
>>       if (ret != 0) {
>>           pr_err("Aborting restore due to post-restore script ret 
>> code %d\n", ret);
>>           timing_stop(TIME_RESTORE);
>> -        write_stats(RESTORE_STATS);
>> +        write_stats(RESTORE_STATS, 0);
>>           goto out_kill;
>>       }
>>   @@ -2292,7 +2292,7 @@ skip_ns_bouncing:
>>       /* Detaches from processes and they continue run through 
>> sigreturn. */
>>       finalize_restore_detach(ret);
>>   -    write_stats(RESTORE_STATS);
>> +    write_stats(RESTORE_STATS, 0);
>>         ret = run_scripts(ACT_POST_RESUME);
>>       if (ret != 0)
>> diff --git a/criu/include/stats.h b/criu/include/stats.h
>> index 5d408b7..e6f86cb 100644
>> --- a/criu/include/stats.h
>> +++ b/criu/include/stats.h
>> @@ -51,6 +51,6 @@ extern void cnt_sub(int c, unsigned long val);
>>   #define RESTORE_STATS    2
>>     extern int init_stats(int what);
>> -extern void write_stats(int what);
>> -
>> +extern void write_stats(int what, int pre_dump);
>> +extern int get_parent_pre_dump_type();
>>   #endif /* __CR_STATS_H__ */
>> diff --git a/criu/mem.c b/criu/mem.c
>> index 15094f7..0cfa4c2 100644
>> --- a/criu/mem.c
>> +++ b/criu/mem.c
>> @@ -351,7 +351,8 @@ static int generate_vma_iovs(struct pstree_item 
>> *item, struct vma_area *vma,
>>                    struct page_pipe *pp, struct page_xfer *xfer,
>>                    struct parasite_dump_pages_args *args,
>>                    struct parasite_ctl *ctl, pmc_t *pmc,
>> -                 bool has_parent, bool pre_dump)
>> +                 bool has_parent, bool pre_dump,
>> +                 int parent_predump_mode)
> Should the patch be applied after "[PATCH 2/7] Skip generating iov for 
> non PROT_READ region" ?
>
> diff --git a/criu/mem.c b/criu/mem.c
> index e19688d..740992d 100644
> --- a/criu/mem.c
> +++ b/criu/mem.c
> @@ -351,7 +351,7 @@ static int generate_vma_iovs(struct pstree_item 
> *item, struct vma_area *vma,
>                   struct page_pipe *pp, struct page_xfer *xfer,
>                   struct parasite_dump_pages_args *args,
>                   struct parasite_ctl *ctl, pmc_t *pmc,
> -                 bool has_parent, bool pre_dump)
> +                 bool has_parent, struct mem_dump_ctl *mdc)
>
>
>>   {
>>       u64 off = 0;
>>       u64 *map;
>> @@ -362,17 +363,47 @@ static int generate_vma_iovs(struct pstree_item 
>> *item, struct vma_area *vma,
>>           return 0;
>>         /*
>> -     * process_vm_readv syscall can't copy memory regions lacking
>> -     * PROT_READ flag. Therefore, avoid generating iovs for such
>> -     * regions in "read" mode pre-dump. Regions skipped by pre-dumps
>> -     * can't be referred as parent by following dump stage. So, mark
>> -     * "has_parent=false" for such regions.
>> +     * To facilitate any combination of pre-dump modes to run after
>> +     * one another, we need to take extra care as discussed below.
>> +     *
>> +     * The SPLICE mode pre-dump, processes all type of memory regions,
>> +     * whereas READ mode pre-dump skips processing those memory regions
>> +     * which lacks PROT_READ flag.
>> +     *
>> +     * Now on mixing pre-dump modes:
>> +     *     If SPLICE mode follows SPLICE mode    : no issue
>> +     *        -> everything dumped both the times
>> +     *
>> +     *     If READ mode follows READ mode        : no issue
>> +     *        -> non-PROT_READ skipped both the time
>> +     *
>> +     *     If READ mode follows SPLICE mode       : no issue
>> +     *        -> everything dumped at first,
>> +     *           the non-PROT_READ skipped later
>> +     *
>> +     *     If SPLICE mode follows READ mode       : Need special care
>> +     *
> It would be good to cover these 4 cases with a test.
Random order will be replaced by these 4 cases.
>
>> +     * If READ pre-dump happens first then it has skipped processing
>> +     * non-PROT_READ regions. Following SPLICE pre-dump expects pagemap
>> +     * entries for all mappings in parent pagemap, but last READ mode
>> +     * pre-dump has skipped processing and pagemap generation for
>> +     * non-PROT_READ regions. So SPLICE mode throws error of missing
>> +     * pagemap entry for non-PROT_READ mapping.
>> +     *
>> +     * To resolve this, the mode of pre-dump is stored in pre-dump's
>> +     * stats file. This mode is read back from stats file during next
>> +     * pre-dump.
> The purpose of the stats file is to contain statistics about how 
> dump/restore went. IMHO the data in this file should not be required 
> for pre-dump.

If we don't mix splice and read modes, then we can completely avoid this 
whole stat-file writing/reading. But, if we mix pre-dump-modes in single 
run then we need

to tell splice-pre-dump, about what has been skipped in parent images 
through read-pre-dumps.

>
>> +     * If parent-pre-dump and next pre-dump turns out in READ-mode -->
>> +     * SPLICE-mode order, then SPLICE mode doesn't expect mappings for
>> +     * non-PROT_READ regions in parent-image and marks 
>> "has_parent=false".
>>        */
>> -    if (opts.pre_dump_mode == PRE_DUMP_READ &&
>> -                              !(vma->e->prot & PROT_READ)) {
>> -        if (pre_dump)
>> +
>> +    if (!(vma->e->prot & PROT_READ)) {
>> +        if (opts.pre_dump_mode == PRE_DUMP_READ && pre_dump)
>>               return 0;
>> -        has_parent = false;
>> +        if ((parent_predump_mode == PRE_DUMP_READ &&
>> +            opts.pre_dump_mode == PRE_DUMP_SPLICE) || !pre_dump)
>> +            has_parent = false;
>>       }
> This becomes complicated because of the fix mentioned in 
> https://lists.openvz.org/pipermail/criu/2019-August/044404.html
>
> Would it be possible to send version 2 for the whole series?
>
> You can use git rebase to apply necessary changes in previous commits.
Yes, I will post version 2 of complete series, once final changes are 
approved.
>
>
> Radostin
>
>
-Abhishek
Abhishek Dubey Sept. 2, 2019, 5:39 p.m.
Hi Radostin,

Please find github link implementing check for 4 pre-dump combinations

https://github.com/dubeyabhishek/ultimateCRIU/commit/27391ad41e7d63af98ba89f84b0216cab4ac1af7


This check will be invoked from travis script.

If --pre=5 for maps007 and --mix-pre-dump is supplied, then 5 pre-dumps 
for all 4 cases are performed,

followed by single dump. The order of each of the 4 case is reproducible 
and non-altered with each run.

Please confirm if it looks good.

The Output of patch looks like:


# python test/zdtm.py run --pre 5 -t zdtm/transition/maps007 
--mix-pre-dump -f h
=== Run 1/1 ================ zdtm/transition/maps007
======================= Run zdtm/transition/maps007 in h 
=======================
Start test
Test is SUID
./maps007 --pidfile=maps007.pid --outfile=maps007.out
===== mix-pre-dump case 1 =====
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
===== mix-pre-dump case 2 =====
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
===== mix-pre-dump case 3 =====
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
===== mix-pre-dump case 4 =====
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Mix-mode selected: splice
Run criu pre-dump
Mix-mode selected: read
Run criu pre-dump
Run criu dump
Run criu restore
Send the 15 signal to  38
Wait for zdtm/transition/maps007(38) to die for 0.100000
Wait for zdtm/transition/maps007(38) to die for 0.200000
Removing dump/zdtm/transition/maps007/38
====================== Test zdtm/transition/maps007 PASS 
=======================


On 01/09/19 1:07 AM, Radostin Stoyanov wrote:
>
> On 31/08/2019 13:12, Abhishek Dubey wrote:
>> READ and SPLICE mode pre-dumps can happen
>> in any order. New option --mix-pre-dump is
>> added for zdtm suite only, to test integrity
>> of random order of pre-dump modes.
> It is important when a test fails to be able to reproduce the error 
> consistently. The use of random order for testing might make debugging 
> difficult.
>
>> Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
>> ---
>>   criu/cr-dump.c       |  4 ++--
>>   criu/cr-restore.c    |  4 ++--
>>   criu/include/stats.h |  4 ++--
>>   criu/mem.c           | 65 
>> ++++++++++++++++++++++++++++++++++++++++------------
>>   criu/stats.c         | 42 ++++++++++++++++++++++++++++++++-
>>   images/stats.proto   |  1 +
>>   test/zdtm.py         | 13 ++++++++++-
>>   7 files changed, 110 insertions(+), 23 deletions(-)
>>
>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>> index b916e0a..f62362f 100644
>> --- a/criu/cr-dump.c
>> +++ b/criu/cr-dump.c
>> @@ -1554,7 +1554,7 @@ err:
>>       if (ret)
>>           pr_err("Pre-dumping FAILED.\n");
>>       else {
>> -        write_stats(DUMP_STATS);
>> +        write_stats(DUMP_STATS, 1);
>>           pr_info("Pre-dumping finished successfully\n");
>>       }
>>       return ret;
>> @@ -1760,7 +1760,7 @@ static int cr_dump_finish(int ret)
>>       if (ret) {
>>           pr_err("Dumping FAILED.\n");
>>       } else {
>> -        write_stats(DUMP_STATS);
>> +        write_stats(DUMP_STATS, 0);
>>           pr_info("Dumping finished successfully\n");
>>       }
>>       return post_dump_ret ? : (ret != 0);
>> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
>> index de0b2cb..8ca46e4 100644
>> --- a/criu/cr-restore.c
>> +++ b/criu/cr-restore.c
>> @@ -2226,7 +2226,7 @@ skip_ns_bouncing:
>>       if (ret != 0) {
>>           pr_err("Aborting restore due to post-restore script ret 
>> code %d\n", ret);
>>           timing_stop(TIME_RESTORE);
>> -        write_stats(RESTORE_STATS);
>> +        write_stats(RESTORE_STATS, 0);
>>           goto out_kill;
>>       }
>>   @@ -2292,7 +2292,7 @@ skip_ns_bouncing:
>>       /* Detaches from processes and they continue run through 
>> sigreturn. */
>>       finalize_restore_detach(ret);
>>   -    write_stats(RESTORE_STATS);
>> +    write_stats(RESTORE_STATS, 0);
>>         ret = run_scripts(ACT_POST_RESUME);
>>       if (ret != 0)
>> diff --git a/criu/include/stats.h b/criu/include/stats.h
>> index 5d408b7..e6f86cb 100644
>> --- a/criu/include/stats.h
>> +++ b/criu/include/stats.h
>> @@ -51,6 +51,6 @@ extern void cnt_sub(int c, unsigned long val);
>>   #define RESTORE_STATS    2
>>     extern int init_stats(int what);
>> -extern void write_stats(int what);
>> -
>> +extern void write_stats(int what, int pre_dump);
>> +extern int get_parent_pre_dump_type();
>>   #endif /* __CR_STATS_H__ */
>> diff --git a/criu/mem.c b/criu/mem.c
>> index 15094f7..0cfa4c2 100644
>> --- a/criu/mem.c
>> +++ b/criu/mem.c
>> @@ -351,7 +351,8 @@ static int generate_vma_iovs(struct pstree_item 
>> *item, struct vma_area *vma,
>>                    struct page_pipe *pp, struct page_xfer *xfer,
>>                    struct parasite_dump_pages_args *args,
>>                    struct parasite_ctl *ctl, pmc_t *pmc,
>> -                 bool has_parent, bool pre_dump)
>> +                 bool has_parent, bool pre_dump,
>> +                 int parent_predump_mode)
> Should the patch be applied after "[PATCH 2/7] Skip generating iov for 
> non PROT_READ region" ?
>
> diff --git a/criu/mem.c b/criu/mem.c
> index e19688d..740992d 100644
> --- a/criu/mem.c
> +++ b/criu/mem.c
> @@ -351,7 +351,7 @@ static int generate_vma_iovs(struct pstree_item 
> *item, struct vma_area *vma,
>                   struct page_pipe *pp, struct page_xfer *xfer,
>                   struct parasite_dump_pages_args *args,
>                   struct parasite_ctl *ctl, pmc_t *pmc,
> -                 bool has_parent, bool pre_dump)
> +                 bool has_parent, struct mem_dump_ctl *mdc)
>
>
>>   {
>>       u64 off = 0;
>>       u64 *map;
>> @@ -362,17 +363,47 @@ static int generate_vma_iovs(struct pstree_item 
>> *item, struct vma_area *vma,
>>           return 0;
>>         /*
>> -     * process_vm_readv syscall can't copy memory regions lacking
>> -     * PROT_READ flag. Therefore, avoid generating iovs for such
>> -     * regions in "read" mode pre-dump. Regions skipped by pre-dumps
>> -     * can't be referred as parent by following dump stage. So, mark
>> -     * "has_parent=false" for such regions.
>> +     * To facilitate any combination of pre-dump modes to run after
>> +     * one another, we need to take extra care as discussed below.
>> +     *
>> +     * The SPLICE mode pre-dump, processes all type of memory regions,
>> +     * whereas READ mode pre-dump skips processing those memory regions
>> +     * which lacks PROT_READ flag.
>> +     *
>> +     * Now on mixing pre-dump modes:
>> +     *     If SPLICE mode follows SPLICE mode    : no issue
>> +     *        -> everything dumped both the times
>> +     *
>> +     *     If READ mode follows READ mode        : no issue
>> +     *        -> non-PROT_READ skipped both the time
>> +     *
>> +     *     If READ mode follows SPLICE mode       : no issue
>> +     *        -> everything dumped at first,
>> +     *           the non-PROT_READ skipped later
>> +     *
>> +     *     If SPLICE mode follows READ mode       : Need special care
>> +     *
> It would be good to cover these 4 cases with a test.
>
>> +     * If READ pre-dump happens first then it has skipped processing
>> +     * non-PROT_READ regions. Following SPLICE pre-dump expects pagemap
>> +     * entries for all mappings in parent pagemap, but last READ mode
>> +     * pre-dump has skipped processing and pagemap generation for
>> +     * non-PROT_READ regions. So SPLICE mode throws error of missing
>> +     * pagemap entry for non-PROT_READ mapping.
>> +     *
>> +     * To resolve this, the mode of pre-dump is stored in pre-dump's
>> +     * stats file. This mode is read back from stats file during next
>> +     * pre-dump.
> The purpose of the stats file is to contain statistics about how 
> dump/restore went. IMHO the data in this file should not be required 
> for pre-dump.
>
>> +     * If parent-pre-dump and next pre-dump turns out in READ-mode -->
>> +     * SPLICE-mode order, then SPLICE mode doesn't expect mappings for
>> +     * non-PROT_READ regions in parent-image and marks 
>> "has_parent=false".
>>        */
>> -    if (opts.pre_dump_mode == PRE_DUMP_READ &&
>> -                              !(vma->e->prot & PROT_READ)) {
>> -        if (pre_dump)
>> +
>> +    if (!(vma->e->prot & PROT_READ)) {
>> +        if (opts.pre_dump_mode == PRE_DUMP_READ && pre_dump)
>>               return 0;
>> -        has_parent = false;
>> +        if ((parent_predump_mode == PRE_DUMP_READ &&
>> +            opts.pre_dump_mode == PRE_DUMP_SPLICE) || !pre_dump)
>> +            has_parent = false;
>>       }
> This becomes complicated because of the fix mentioned in 
> https://lists.openvz.org/pipermail/criu/2019-August/044404.html
>
> Would it be possible to send version 2 for the whole series?
>
> You can use git rebase to apply necessary changes in previous commits.
>
> Radostin
>
>
--Abhishek
Radostin Stoyanov Sept. 5, 2019, 3:45 a.m.
On 02/09/2019 07:19, abhishek dubey wrote:
> On 01/09/19 1:07 AM, Radostin Stoyanov wrote:
>>
>> On 31/08/2019 13:12, Abhishek Dubey wrote:
>>> READ and SPLICE mode pre-dumps can happen
>>> in any order. New option --mix-pre-dump is
>>> added for zdtm suite only, to test integrity
>>> of random order of pre-dump modes.
>> It is important when a test fails to be able to reproduce the error 
>> consistently. The use of random order for testing might make 
>> debugging difficult.
> Sure. I will remove random to preserve consistency.
>>
>>> Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
>>> ---
>>>   criu/cr-dump.c       |  4 ++--
>>>   criu/cr-restore.c    |  4 ++--
>>>   criu/include/stats.h |  4 ++--
>>>   criu/mem.c           | 65 
>>> ++++++++++++++++++++++++++++++++++++++++------------
>>>   criu/stats.c         | 42 ++++++++++++++++++++++++++++++++-
>>>   images/stats.proto   |  1 +
>>>   test/zdtm.py         | 13 ++++++++++-
>>>   7 files changed, 110 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>>> index b916e0a..f62362f 100644
>>> --- a/criu/cr-dump.c
>>> +++ b/criu/cr-dump.c
>>> @@ -1554,7 +1554,7 @@ err:
>>>       if (ret)
>>>           pr_err("Pre-dumping FAILED.\n");
>>>       else {
>>> -        write_stats(DUMP_STATS);
>>> +        write_stats(DUMP_STATS, 1);
>>>           pr_info("Pre-dumping finished successfully\n");
>>>       }
>>>       return ret;
>>> @@ -1760,7 +1760,7 @@ static int cr_dump_finish(int ret)
>>>       if (ret) {
>>>           pr_err("Dumping FAILED.\n");
>>>       } else {
>>> -        write_stats(DUMP_STATS);
>>> +        write_stats(DUMP_STATS, 0);
>>>           pr_info("Dumping finished successfully\n");
>>>       }
>>>       return post_dump_ret ? : (ret != 0);
>>> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
>>> index de0b2cb..8ca46e4 100644
>>> --- a/criu/cr-restore.c
>>> +++ b/criu/cr-restore.c
>>> @@ -2226,7 +2226,7 @@ skip_ns_bouncing:
>>>       if (ret != 0) {
>>>           pr_err("Aborting restore due to post-restore script ret 
>>> code %d\n", ret);
>>>           timing_stop(TIME_RESTORE);
>>> -        write_stats(RESTORE_STATS);
>>> +        write_stats(RESTORE_STATS, 0);
>>>           goto out_kill;
>>>       }
>>>   @@ -2292,7 +2292,7 @@ skip_ns_bouncing:
>>>       /* Detaches from processes and they continue run through 
>>> sigreturn. */
>>>       finalize_restore_detach(ret);
>>>   -    write_stats(RESTORE_STATS);
>>> +    write_stats(RESTORE_STATS, 0);
>>>         ret = run_scripts(ACT_POST_RESUME);
>>>       if (ret != 0)
>>> diff --git a/criu/include/stats.h b/criu/include/stats.h
>>> index 5d408b7..e6f86cb 100644
>>> --- a/criu/include/stats.h
>>> +++ b/criu/include/stats.h
>>> @@ -51,6 +51,6 @@ extern void cnt_sub(int c, unsigned long val);
>>>   #define RESTORE_STATS    2
>>>     extern int init_stats(int what);
>>> -extern void write_stats(int what);
>>> -
>>> +extern void write_stats(int what, int pre_dump);
>>> +extern int get_parent_pre_dump_type();
>>>   #endif /* __CR_STATS_H__ */
>>> diff --git a/criu/mem.c b/criu/mem.c
>>> index 15094f7..0cfa4c2 100644
>>> --- a/criu/mem.c
>>> +++ b/criu/mem.c
>>> @@ -351,7 +351,8 @@ static int generate_vma_iovs(struct pstree_item 
>>> *item, struct vma_area *vma,
>>>                    struct page_pipe *pp, struct page_xfer *xfer,
>>>                    struct parasite_dump_pages_args *args,
>>>                    struct parasite_ctl *ctl, pmc_t *pmc,
>>> -                 bool has_parent, bool pre_dump)
>>> +                 bool has_parent, bool pre_dump,
>>> +                 int parent_predump_mode)
>> Should the patch be applied after "[PATCH 2/7] Skip generating iov 
>> for non PROT_READ region" ?
>>
>> diff --git a/criu/mem.c b/criu/mem.c
>> index e19688d..740992d 100644
>> --- a/criu/mem.c
>> +++ b/criu/mem.c
>> @@ -351,7 +351,7 @@ static int generate_vma_iovs(struct pstree_item 
>> *item, struct vma_area *vma,
>>                   struct page_pipe *pp, struct page_xfer *xfer,
>>                   struct parasite_dump_pages_args *args,
>>                   struct parasite_ctl *ctl, pmc_t *pmc,
>> -                 bool has_parent, bool pre_dump)
>> +                 bool has_parent, struct mem_dump_ctl *mdc)
>>
>>
>>>   {
>>>       u64 off = 0;
>>>       u64 *map;
>>> @@ -362,17 +363,47 @@ static int generate_vma_iovs(struct 
>>> pstree_item *item, struct vma_area *vma,
>>>           return 0;
>>>         /*
>>> -     * process_vm_readv syscall can't copy memory regions lacking
>>> -     * PROT_READ flag. Therefore, avoid generating iovs for such
>>> -     * regions in "read" mode pre-dump. Regions skipped by pre-dumps
>>> -     * can't be referred as parent by following dump stage. So, mark
>>> -     * "has_parent=false" for such regions.
>>> +     * To facilitate any combination of pre-dump modes to run after
>>> +     * one another, we need to take extra care as discussed below.
>>> +     *
>>> +     * The SPLICE mode pre-dump, processes all type of memory regions,
>>> +     * whereas READ mode pre-dump skips processing those memory 
>>> regions
>>> +     * which lacks PROT_READ flag.
>>> +     *
>>> +     * Now on mixing pre-dump modes:
>>> +     *     If SPLICE mode follows SPLICE mode    : no issue
>>> +     *        -> everything dumped both the times
>>> +     *
>>> +     *     If READ mode follows READ mode        : no issue
>>> +     *        -> non-PROT_READ skipped both the time
>>> +     *
>>> +     *     If READ mode follows SPLICE mode       : no issue
>>> +     *        -> everything dumped at first,
>>> +     *           the non-PROT_READ skipped later
>>> +     *
>>> +     *     If SPLICE mode follows READ mode       : Need special care
>>> +     *
>> It would be good to cover these 4 cases with a test.
> Random order will be replaced by these 4 cases.
>>
>>> +     * If READ pre-dump happens first then it has skipped processing
>>> +     * non-PROT_READ regions. Following SPLICE pre-dump expects 
>>> pagemap
>>> +     * entries for all mappings in parent pagemap, but last READ mode
>>> +     * pre-dump has skipped processing and pagemap generation for
>>> +     * non-PROT_READ regions. So SPLICE mode throws error of missing
>>> +     * pagemap entry for non-PROT_READ mapping.
>>> +     *
>>> +     * To resolve this, the mode of pre-dump is stored in pre-dump's
>>> +     * stats file. This mode is read back from stats file during next
>>> +     * pre-dump.
>> The purpose of the stats file is to contain statistics about how 
>> dump/restore went. IMHO the data in this file should not be required 
>> for pre-dump.
>
> If we don't mix splice and read modes, then we can completely avoid 
> this whole stat-file writing/reading. But, if we mix pre-dump-modes in 
> single run then we need
>
> to tell splice-pre-dump, about what has been skipped in parent images 
> through read-pre-dumps.
>
I understand that we need to include information about what pre-dump 
mode was used but I think that stats-dump is not the right image for this.

Can we use inventory.img instead?

Radostin
Radostin Stoyanov Sept. 5, 2019, 4:16 a.m.
On 02/09/2019 18:39, abhishek dubey wrote:
> Hi Radostin,
>
> Please find github link implementing check for 4 pre-dump combinations
>
> https://github.com/dubeyabhishek/ultimateCRIU/commit/27391ad41e7d63af98ba89f84b0216cab4ac1af7 
>
>
>
> This check will be invoked from travis script.
>
> If --pre=5 for maps007 and --mix-pre-dump is supplied, then 5 
> pre-dumps for all 4 cases are performed,
>
> followed by single dump. The order of each of the 4 case is 
> reproducible and non-altered with each run.
>
> Please confirm if it looks good.
>
> The Output of patch looks like:
>
>
> # python test/zdtm.py run --pre 5 -t zdtm/transition/maps007 
> --mix-pre-dump -f h
> === Run 1/1 ================ zdtm/transition/maps007
> ======================= Run zdtm/transition/maps007 in h 
> =======================
> Start test
> Test is SUID
> ./maps007 --pidfile=maps007.pid --outfile=maps007.out
> ===== mix-pre-dump case 1 =====
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> ===== mix-pre-dump case 2 =====
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> ===== mix-pre-dump case 3 =====
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> ===== mix-pre-dump case 4 =====
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Mix-mode selected: splice
> Run criu pre-dump
> Mix-mode selected: read
> Run criu pre-dump
> Run criu dump
> Run criu restore
> Send the 15 signal to  38
> Wait for zdtm/transition/maps007(38) to die for 0.100000
> Wait for zdtm/transition/maps007(38) to die for 0.200000
> Removing dump/zdtm/transition/maps007/38
> ====================== Test zdtm/transition/maps007 PASS 
> =======================
>
>
The output looks good.

> On 01/09/19 1:07 AM, Radostin Stoyanov wrote:
>>
>> On 31/08/2019 13:12, Abhishek Dubey wrote:
>>> READ and SPLICE mode pre-dumps can happen
>>> in any order. New option --mix-pre-dump is
>>> added for zdtm suite only, to test integrity
>>> of random order of pre-dump modes.
>> It is important when a test fails to be able to reproduce the error 
>> consistently. The use of random order for testing might make 
>> debugging difficult.
>>
>>> Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
>>> ---
>>>   criu/cr-dump.c       |  4 ++--
>>>   criu/cr-restore.c    |  4 ++--
>>>   criu/include/stats.h |  4 ++--
>>>   criu/mem.c           | 65 
>>> ++++++++++++++++++++++++++++++++++++++++------------
>>>   criu/stats.c         | 42 ++++++++++++++++++++++++++++++++-
>>>   images/stats.proto   |  1 +
>>>   test/zdtm.py         | 13 ++++++++++-
>>>   7 files changed, 110 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>>> index b916e0a..f62362f 100644
>>> --- a/criu/cr-dump.c
>>> +++ b/criu/cr-dump.c
>>> @@ -1554,7 +1554,7 @@ err:
>>>       if (ret)
>>>           pr_err("Pre-dumping FAILED.\n");
>>>       else {
>>> -        write_stats(DUMP_STATS);
>>> +        write_stats(DUMP_STATS, 1);
>>>           pr_info("Pre-dumping finished successfully\n");
>>>       }
>>>       return ret;
>>> @@ -1760,7 +1760,7 @@ static int cr_dump_finish(int ret)
>>>       if (ret) {
>>>           pr_err("Dumping FAILED.\n");
>>>       } else {
>>> -        write_stats(DUMP_STATS);
>>> +        write_stats(DUMP_STATS, 0);
>>>           pr_info("Dumping finished successfully\n");
>>>       }
>>>       return post_dump_ret ? : (ret != 0);
>>> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
>>> index de0b2cb..8ca46e4 100644
>>> --- a/criu/cr-restore.c
>>> +++ b/criu/cr-restore.c
>>> @@ -2226,7 +2226,7 @@ skip_ns_bouncing:
>>>       if (ret != 0) {
>>>           pr_err("Aborting restore due to post-restore script ret 
>>> code %d\n", ret);
>>>           timing_stop(TIME_RESTORE);
>>> -        write_stats(RESTORE_STATS);
>>> +        write_stats(RESTORE_STATS, 0);
>>>           goto out_kill;
>>>       }
>>>   @@ -2292,7 +2292,7 @@ skip_ns_bouncing:
>>>       /* Detaches from processes and they continue run through 
>>> sigreturn. */
>>>       finalize_restore_detach(ret);
>>>   -    write_stats(RESTORE_STATS);
>>> +    write_stats(RESTORE_STATS, 0);
>>>         ret = run_scripts(ACT_POST_RESUME);
>>>       if (ret != 0)
>>> diff --git a/criu/include/stats.h b/criu/include/stats.h
>>> index 5d408b7..e6f86cb 100644
>>> --- a/criu/include/stats.h
>>> +++ b/criu/include/stats.h
>>> @@ -51,6 +51,6 @@ extern void cnt_sub(int c, unsigned long val);
>>>   #define RESTORE_STATS    2
>>>     extern int init_stats(int what);
>>> -extern void write_stats(int what);
>>> -
>>> +extern void write_stats(int what, int pre_dump);
>>> +extern int get_parent_pre_dump_type();
>>>   #endif /* __CR_STATS_H__ */
>>> diff --git a/criu/mem.c b/criu/mem.c
>>> index 15094f7..0cfa4c2 100644
>>> --- a/criu/mem.c
>>> +++ b/criu/mem.c
>>> @@ -351,7 +351,8 @@ static int generate_vma_iovs(struct pstree_item 
>>> *item, struct vma_area *vma,
>>>                    struct page_pipe *pp, struct page_xfer *xfer,
>>>                    struct parasite_dump_pages_args *args,
>>>                    struct parasite_ctl *ctl, pmc_t *pmc,
>>> -                 bool has_parent, bool pre_dump)
>>> +                 bool has_parent, bool pre_dump,
>>> +                 int parent_predump_mode)
>> Should the patch be applied after "[PATCH 2/7] Skip generating iov 
>> for non PROT_READ region" ?
>>
>> diff --git a/criu/mem.c b/criu/mem.c
>> index e19688d..740992d 100644
>> --- a/criu/mem.c
>> +++ b/criu/mem.c
>> @@ -351,7 +351,7 @@ static int generate_vma_iovs(struct pstree_item 
>> *item, struct vma_area *vma,
>>                   struct page_pipe *pp, struct page_xfer *xfer,
>>                   struct parasite_dump_pages_args *args,
>>                   struct parasite_ctl *ctl, pmc_t *pmc,
>> -                 bool has_parent, bool pre_dump)
>> +                 bool has_parent, struct mem_dump_ctl *mdc)
>>
>>
>>>   {
>>>       u64 off = 0;
>>>       u64 *map;
>>> @@ -362,17 +363,47 @@ static int generate_vma_iovs(struct 
>>> pstree_item *item, struct vma_area *vma,
>>>           return 0;
>>>         /*
>>> -     * process_vm_readv syscall can't copy memory regions lacking
>>> -     * PROT_READ flag. Therefore, avoid generating iovs for such
>>> -     * regions in "read" mode pre-dump. Regions skipped by pre-dumps
>>> -     * can't be referred as parent by following dump stage. So, mark
>>> -     * "has_parent=false" for such regions.
>>> +     * To facilitate any combination of pre-dump modes to run after
>>> +     * one another, we need to take extra care as discussed below.
>>> +     *
>>> +     * The SPLICE mode pre-dump, processes all type of memory regions,
>>> +     * whereas READ mode pre-dump skips processing those memory 
>>> regions
>>> +     * which lacks PROT_READ flag.
>>> +     *
>>> +     * Now on mixing pre-dump modes:
>>> +     *     If SPLICE mode follows SPLICE mode    : no issue
>>> +     *        -> everything dumped both the times
>>> +     *
>>> +     *     If READ mode follows READ mode        : no issue
>>> +     *        -> non-PROT_READ skipped both the time
>>> +     *
>>> +     *     If READ mode follows SPLICE mode       : no issue
>>> +     *        -> everything dumped at first,
>>> +     *           the non-PROT_READ skipped later
>>> +     *
>>> +     *     If SPLICE mode follows READ mode       : Need special care
>>> +     *
>> It would be good to cover these 4 cases with a test.
>>
>>> +     * If READ pre-dump happens first then it has skipped processing
>>> +     * non-PROT_READ regions. Following SPLICE pre-dump expects 
>>> pagemap
>>> +     * entries for all mappings in parent pagemap, but last READ mode
>>> +     * pre-dump has skipped processing and pagemap generation for
>>> +     * non-PROT_READ regions. So SPLICE mode throws error of missing
>>> +     * pagemap entry for non-PROT_READ mapping.
>>> +     *
>>> +     * To resolve this, the mode of pre-dump is stored in pre-dump's
>>> +     * stats file. This mode is read back from stats file during next
>>> +     * pre-dump.
>> The purpose of the stats file is to contain statistics about how 
>> dump/restore went. IMHO the data in this file should not be required 
>> for pre-dump.
>>
>>> +     * If parent-pre-dump and next pre-dump turns out in READ-mode -->
>>> +     * SPLICE-mode order, then SPLICE mode doesn't expect mappings for
>>> +     * non-PROT_READ regions in parent-image and marks 
>>> "has_parent=false".
>>>        */
>>> -    if (opts.pre_dump_mode == PRE_DUMP_READ &&
>>> -                              !(vma->e->prot & PROT_READ)) {
>>> -        if (pre_dump)
>>> +
>>> +    if (!(vma->e->prot & PROT_READ)) {
>>> +        if (opts.pre_dump_mode == PRE_DUMP_READ && pre_dump)
>>>               return 0;
>>> -        has_parent = false;
>>> +        if ((parent_predump_mode == PRE_DUMP_READ &&
>>> +            opts.pre_dump_mode == PRE_DUMP_SPLICE) || !pre_dump)
>>> +            has_parent = false;
>>>       }
>> This becomes complicated because of the fix mentioned in 
>> https://lists.openvz.org/pipermail/criu/2019-August/044404.html
>>
>> Would it be possible to send version 2 for the whole series?
>>
>> You can use git rebase to apply necessary changes in previous commits.
>>
>> Radostin
>>
>>
> --Abhishek
Abhishek Dubey Sept. 7, 2019, 12:17 p.m.
Hi Radostin,

With new changes, pre-dump-mode is dumped in inventory file.

https://github.com/dubeyabhishek/ultimateCRIU/commit/961c08a5cfa4ac59e64677c5e1c14c515ef844da

If the changes are good, I will prepare rev2 of patch-series.


On 05/09/19 9:15 AM, Radostin Stoyanov wrote:
> On 02/09/2019 07:19, abhishek dubey wrote:
>> On 01/09/19 1:07 AM, Radostin Stoyanov wrote:
>>>
>>> On 31/08/2019 13:12, Abhishek Dubey wrote:
>>>> READ and SPLICE mode pre-dumps can happen
>>>> in any order. New option --mix-pre-dump is
>>>> added for zdtm suite only, to test integrity
>>>> of random order of pre-dump modes.
>>> It is important when a test fails to be able to reproduce the error 
>>> consistently. The use of random order for testing might make 
>>> debugging difficult.
>> Sure. I will remove random to preserve consistency.
>>>
>>>> Signed-off-by: Abhishek Dubey <dubeyabhishek777@gmail.com>
>>>> ---
>>>>   criu/cr-dump.c       |  4 ++--
>>>>   criu/cr-restore.c    |  4 ++--
>>>>   criu/include/stats.h |  4 ++--
>>>>   criu/mem.c           | 65 
>>>> ++++++++++++++++++++++++++++++++++++++++------------
>>>>   criu/stats.c         | 42 ++++++++++++++++++++++++++++++++-
>>>>   images/stats.proto   |  1 +
>>>>   test/zdtm.py         | 13 ++++++++++-
>>>>   7 files changed, 110 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/criu/cr-dump.c b/criu/cr-dump.c
>>>> index b916e0a..f62362f 100644
>>>> --- a/criu/cr-dump.c
>>>> +++ b/criu/cr-dump.c
>>>> @@ -1554,7 +1554,7 @@ err:
>>>>       if (ret)
>>>>           pr_err("Pre-dumping FAILED.\n");
>>>>       else {
>>>> -        write_stats(DUMP_STATS);
>>>> +        write_stats(DUMP_STATS, 1);
>>>>           pr_info("Pre-dumping finished successfully\n");
>>>>       }
>>>>       return ret;
>>>> @@ -1760,7 +1760,7 @@ static int cr_dump_finish(int ret)
>>>>       if (ret) {
>>>>           pr_err("Dumping FAILED.\n");
>>>>       } else {
>>>> -        write_stats(DUMP_STATS);
>>>> +        write_stats(DUMP_STATS, 0);
>>>>           pr_info("Dumping finished successfully\n");
>>>>       }
>>>>       return post_dump_ret ? : (ret != 0);
>>>> diff --git a/criu/cr-restore.c b/criu/cr-restore.c
>>>> index de0b2cb..8ca46e4 100644
>>>> --- a/criu/cr-restore.c
>>>> +++ b/criu/cr-restore.c
>>>> @@ -2226,7 +2226,7 @@ skip_ns_bouncing:
>>>>       if (ret != 0) {
>>>>           pr_err("Aborting restore due to post-restore script ret 
>>>> code %d\n", ret);
>>>>           timing_stop(TIME_RESTORE);
>>>> -        write_stats(RESTORE_STATS);
>>>> +        write_stats(RESTORE_STATS, 0);
>>>>           goto out_kill;
>>>>       }
>>>>   @@ -2292,7 +2292,7 @@ skip_ns_bouncing:
>>>>       /* Detaches from processes and they continue run through 
>>>> sigreturn. */
>>>>       finalize_restore_detach(ret);
>>>>   -    write_stats(RESTORE_STATS);
>>>> +    write_stats(RESTORE_STATS, 0);
>>>>         ret = run_scripts(ACT_POST_RESUME);
>>>>       if (ret != 0)
>>>> diff --git a/criu/include/stats.h b/criu/include/stats.h
>>>> index 5d408b7..e6f86cb 100644
>>>> --- a/criu/include/stats.h
>>>> +++ b/criu/include/stats.h
>>>> @@ -51,6 +51,6 @@ extern void cnt_sub(int c, unsigned long val);
>>>>   #define RESTORE_STATS    2
>>>>     extern int init_stats(int what);
>>>> -extern void write_stats(int what);
>>>> -
>>>> +extern void write_stats(int what, int pre_dump);
>>>> +extern int get_parent_pre_dump_type();
>>>>   #endif /* __CR_STATS_H__ */
>>>> diff --git a/criu/mem.c b/criu/mem.c
>>>> index 15094f7..0cfa4c2 100644
>>>> --- a/criu/mem.c
>>>> +++ b/criu/mem.c
>>>> @@ -351,7 +351,8 @@ static int generate_vma_iovs(struct pstree_item 
>>>> *item, struct vma_area *vma,
>>>>                    struct page_pipe *pp, struct page_xfer *xfer,
>>>>                    struct parasite_dump_pages_args *args,
>>>>                    struct parasite_ctl *ctl, pmc_t *pmc,
>>>> -                 bool has_parent, bool pre_dump)
>>>> +                 bool has_parent, bool pre_dump,
>>>> +                 int parent_predump_mode)
>>> Should the patch be applied after "[PATCH 2/7] Skip generating iov 
>>> for non PROT_READ region" ?
>>>
>>> diff --git a/criu/mem.c b/criu/mem.c
>>> index e19688d..740992d 100644
>>> --- a/criu/mem.c
>>> +++ b/criu/mem.c
>>> @@ -351,7 +351,7 @@ static int generate_vma_iovs(struct pstree_item 
>>> *item, struct vma_area *vma,
>>>                   struct page_pipe *pp, struct page_xfer *xfer,
>>>                   struct parasite_dump_pages_args *args,
>>>                   struct parasite_ctl *ctl, pmc_t *pmc,
>>> -                 bool has_parent, bool pre_dump)
>>> +                 bool has_parent, struct mem_dump_ctl *mdc)
>>>
>>>
>>>>   {
>>>>       u64 off = 0;
>>>>       u64 *map;
>>>> @@ -362,17 +363,47 @@ static int generate_vma_iovs(struct 
>>>> pstree_item *item, struct vma_area *vma,
>>>>           return 0;
>>>>         /*
>>>> -     * process_vm_readv syscall can't copy memory regions lacking
>>>> -     * PROT_READ flag. Therefore, avoid generating iovs for such
>>>> -     * regions in "read" mode pre-dump. Regions skipped by pre-dumps
>>>> -     * can't be referred as parent by following dump stage. So, mark
>>>> -     * "has_parent=false" for such regions.
>>>> +     * To facilitate any combination of pre-dump modes to run after
>>>> +     * one another, we need to take extra care as discussed below.
>>>> +     *
>>>> +     * The SPLICE mode pre-dump, processes all type of memory 
>>>> regions,
>>>> +     * whereas READ mode 
>>>> pre-dhttps://github.com/dubeyabhishek/ultimateCRIU/commit/961c08a5cfa4ac59e64677c5e1c14c515ef844daump 
>>>> skips processing those memory regions
>>>> +     * which lacks PROT_READ flag.
>>>> +     *
>>>> +     * Now on mixing pre-dump modes:
>>>> +     *     If SPLICE mode follows SPLICE mode    : no issue
>>>> +     *        -> everything dumped both the times
>>>> +     *
>>>> +     *     If READ mode follows READ mode        : no issue
>>>> +     *        -> non-PROT_READ skipped both the time
>>>> +     *
>>>> +     *     If READ mode follows SPLICE mode       : no issue
>>>> +     *        -> everything dumped at first,
>>>> +     *           the non-PROT_READ skipped later
>>>> +     *
>>>> +     *     If SPLICE mode follows READ mode       : Need special care
>>>> +     *
>>> It would be good to cover these 4 cases with a test.
>> Random order will be replaced by these 4 cases.
>>>
>>>> +     * If READ pre-dump happens first then it has skipped processing
>>>> +     * non-PROT_READ regions. Following SPLICE pre-dump expects 
>>>> pagemap
>>>> +     * entries for all mappings in parent pagemap, but last READ mode
>>>> +     * pre-dump has skipped processing and pagemap generation for
>>>> +     * non-PROT_READ regions. So SPLICE mode throws error of missing
>>>> +     * pagemap entry for non-PROT_READ mapping.
>>>> +     *
>>>> +     * To resolve this, the mode of pre-dump is stored in pre-dump's
>>>> +     * stats file. This mode is read back from stats file during next
>>>> +     * pre-dump.
>>> The purpose of the stats file is to contain statistics about how 
>>> dump/restore went. IMHO the data in this file should not be required 
>>> for pre-dump.
>>
>> If we don't mix splice and read modes, then we can completely avoid 
>> this whole stat-file writing/reading. But, if we mix pre-dump-modes 
>> in single run then we need
>>
>> to tell splice-pre-dump, about what has been skipped in parent images 
>> through read-pre-dumps.
>>
> I understand that we need to include information about what pre-dump 
> mode was used but I think that stats-dump is not the right image for 
> this.
>
> Can we use inventory.img instead?
>
> Radostin
-Abhishek