p.haul page statistics

Submitted by Mike Rapoport on Sept. 12, 2016, 12:08 p.m.

Details

Message ID 20160912120854.GA359@rapoport-lnx
State Rejected
Series "p.haul page statistics"
Headers show

Commit Message

Mike Rapoport Sept. 12, 2016, 12:08 p.m.
On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
> > On Tue, Sep 06, 2016 at 11:35:23AM +0200, Adrian Reber wrote:
> > > Using the latest criu with p.haul the information about the pages dumped
> > > looks a bit wrong. From p.haul I get:

[ snip ]
 
> > > So p.haul is kind of right as the stats file says that only 13 pages
> > > have been written. But all the lazy pages have also been written.
> > > 
> > > Should pages_written include the lazy pages or should p.haul add
> > > pages_written and pages_lazy to get the actual number of pages written?
> > 
> > I think p.haul should add pages_written and pages_lazy.
> 
> I thought some more about this and I am not convinced. 'pages_written'
> sounds like all pages written. Unrelated if the pages are lazy or not.
 
Agree. How about the patch below?

> Are there any advantages I do not see if the lazy pages are not included
> in 'pages_written'? In the future when p.haul might know how to combine
> pre-copy and post-copy the variable 'pages_lazy' will become important
> but right now it feels wrong to read it to decide if additional pre-copy
> runs should be performed.
> 
> 		Adrian
> 

From 898ab269a6b4a47ed8019445b688dfd3d41d0581 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.vnet.ibm.com>
Date: Mon, 12 Sep 2016 15:05:13 +0300
Subject: [CRIU][PATCH] criu: mem: count all pages actually written to image as
 "pages_written"

When potentially lazy pages are written to image add their count to
"pages_written" stats counter.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 criu/mem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/criu/mem.c b/criu/mem.c
index 597bf29..b9b0bf4 100644
--- a/criu/mem.c
+++ b/criu/mem.c
@@ -192,7 +192,8 @@  static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u
 	cnt_add(CNT_PAGES_ZERO, pages[0]);
 	cnt_add(CNT_PAGES_SKIPPED_PARENT, pages[1]);
 	cnt_add(CNT_PAGES_LAZY, pages[2]);
-	cnt_add(CNT_PAGES_WRITTEN, pages[3]);
+	cnt_add(CNT_PAGES_WRITTEN,
+		opts.lazy_pages ? pages[3] : pages[2] + pages[3]);
 
 	pr_info("Pagemap generated: %lu pages (%lu lazy) %lu holes %lu zeros\n",
 		pages[3] + pages[2], pages[2], pages[1], pages[0]);

Comments

Adrian Reber Sept. 12, 2016, 2:51 p.m.
On Mon, Sep 12, 2016 at 03:08:55PM +0300, Mike Rapoport wrote:
> On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
> > On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
> > > On Tue, Sep 06, 2016 at 11:35:23AM +0200, Adrian Reber wrote:
> > > > Using the latest criu with p.haul the information about the pages dumped
> > > > looks a bit wrong. From p.haul I get:
> 
> [ snip ]
>  
> > > > So p.haul is kind of right as the stats file says that only 13 pages
> > > > have been written. But all the lazy pages have also been written.
> > > > 
> > > > Should pages_written include the lazy pages or should p.haul add
> > > > pages_written and pages_lazy to get the actual number of pages written?
> > > 
> > > I think p.haul should add pages_written and pages_lazy.
> > 
> > I thought some more about this and I am not convinced. 'pages_written'
> > sounds like all pages written. Unrelated if the pages are lazy or not.
>  
> Agree. How about the patch below?

p.haul is happy again. Acked-by

		Adrian

> > Are there any advantages I do not see if the lazy pages are not included
> > in 'pages_written'? In the future when p.haul might know how to combine
> > pre-copy and post-copy the variable 'pages_lazy' will become important
> > but right now it feels wrong to read it to decide if additional pre-copy
> > runs should be performed.
> > 
> > 		Adrian
> > 
> 
> >From 898ab269a6b4a47ed8019445b688dfd3d41d0581 Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <rppt@linux.vnet.ibm.com>
> Date: Mon, 12 Sep 2016 15:05:13 +0300
> Subject: [CRIU][PATCH] criu: mem: count all pages actually written to image as
>  "pages_written"
> 
> When potentially lazy pages are written to image add their count to
> "pages_written" stats counter.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
> ---
>  criu/mem.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/criu/mem.c b/criu/mem.c
> index 597bf29..b9b0bf4 100644
> --- a/criu/mem.c
> +++ b/criu/mem.c
> @@ -192,7 +192,8 @@ static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u
>  	cnt_add(CNT_PAGES_ZERO, pages[0]);
>  	cnt_add(CNT_PAGES_SKIPPED_PARENT, pages[1]);
>  	cnt_add(CNT_PAGES_LAZY, pages[2]);
> -	cnt_add(CNT_PAGES_WRITTEN, pages[3]);
> +	cnt_add(CNT_PAGES_WRITTEN,
> +		opts.lazy_pages ? pages[3] : pages[2] + pages[3]);
>  
>  	pr_info("Pagemap generated: %lu pages (%lu lazy) %lu holes %lu zeros\n",
>  		pages[3] + pages[2], pages[2], pages[1], pages[0]);
> -- 
> 1.9.1
Pavel Emelianov Sept. 13, 2016, 3:12 p.m.
On 09/12/2016 03:08 PM, Mike Rapoport wrote:
> On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
>> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
>>> On Tue, Sep 06, 2016 at 11:35:23AM +0200, Adrian Reber wrote:
>>>> Using the latest criu with p.haul the information about the pages dumped
>>>> looks a bit wrong. From p.haul I get:
> 
> [ snip ]
>  
>>>> So p.haul is kind of right as the stats file says that only 13 pages
>>>> have been written. But all the lazy pages have also been written.
>>>>
>>>> Should pages_written include the lazy pages or should p.haul add
>>>> pages_written and pages_lazy to get the actual number of pages written?
>>>
>>> I think p.haul should add pages_written and pages_lazy.
>>
>> I thought some more about this and I am not convinced. 'pages_written'
>> sounds like all pages written. Unrelated if the pages are lazy or not.
>  
> Agree. How about the patch below?
> 
>> Are there any advantages I do not see if the lazy pages are not included
>> in 'pages_written'? In the future when p.haul might know how to combine
>> pre-copy and post-copy the variable 'pages_lazy' will become important
>> but right now it feels wrong to read it to decide if additional pre-copy
>> runs should be performed.
>>
>> 		Adrian
>>
> 
>>From 898ab269a6b4a47ed8019445b688dfd3d41d0581 Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <rppt@linux.vnet.ibm.com>
> Date: Mon, 12 Sep 2016 15:05:13 +0300
> Subject: [CRIU][PATCH] criu: mem: count all pages actually written to image as
>  "pages_written"
> 
> When potentially lazy pages are written to image add their count to
> "pages_written" stats counter.

Erm... I saw Adrian's thoughts on counting lazy pages as written, but now I
disagree with that. If we treat pages_written as "pages that went into the
image file", then this definition is clean and understandable. But how to
define pages_written if we count lazy pages there as well?

Can we better leave pages_written as "pages that are in the images" and, if
we need it, introduce pages_dumped to count all the pages that are to be
taken with us regardless whether they are in the image or lazy?

-- Pavel

> Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
> ---
>  criu/mem.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/criu/mem.c b/criu/mem.c
> index 597bf29..b9b0bf4 100644
> --- a/criu/mem.c
> +++ b/criu/mem.c
> @@ -192,7 +192,8 @@ static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u
>  	cnt_add(CNT_PAGES_ZERO, pages[0]);
>  	cnt_add(CNT_PAGES_SKIPPED_PARENT, pages[1]);
>  	cnt_add(CNT_PAGES_LAZY, pages[2]);
> -	cnt_add(CNT_PAGES_WRITTEN, pages[3]);
> +	cnt_add(CNT_PAGES_WRITTEN,
> +		opts.lazy_pages ? pages[3] : pages[2] + pages[3]);
>  
>  	pr_info("Pagemap generated: %lu pages (%lu lazy) %lu holes %lu zeros\n",
>  		pages[3] + pages[2], pages[2], pages[1], pages[0]);
>
Mike Rapoport Sept. 13, 2016, 3:18 p.m.
On Tue, Sep 13, 2016 at 06:12:28PM +0300, Pavel Emelyanov wrote:
> On 09/12/2016 03:08 PM, Mike Rapoport wrote:
> > On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
> >> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
> >>> On Tue, Sep 06, 2016 at 11:35:23AM +0200, Adrian Reber wrote:
> >>>> Using the latest criu with p.haul the information about the pages dumped
> >>>> looks a bit wrong. From p.haul I get:
> > 
> > [ snip ]
> >  
> >>>> So p.haul is kind of right as the stats file says that only 13 pages
> >>>> have been written. But all the lazy pages have also been written.
> >>>>
> >>>> Should pages_written include the lazy pages or should p.haul add
> >>>> pages_written and pages_lazy to get the actual number of pages written?
> >>>
> >>> I think p.haul should add pages_written and pages_lazy.
> >>
> >> I thought some more about this and I am not convinced. 'pages_written'
> >> sounds like all pages written. Unrelated if the pages are lazy or not.
> >  
> > Agree. How about the patch below?
> > 
> >> Are there any advantages I do not see if the lazy pages are not included
> >> in 'pages_written'? In the future when p.haul might know how to combine
> >> pre-copy and post-copy the variable 'pages_lazy' will become important
> >> but right now it feels wrong to read it to decide if additional pre-copy
> >> runs should be performed.
> >>
> >> 		Adrian
> >>
> > 
> >>From 898ab269a6b4a47ed8019445b688dfd3d41d0581 Mon Sep 17 00:00:00 2001
> > From: Mike Rapoport <rppt@linux.vnet.ibm.com>
> > Date: Mon, 12 Sep 2016 15:05:13 +0300
> > Subject: [CRIU][PATCH] criu: mem: count all pages actually written to image as
> >  "pages_written"
> > 
> > When potentially lazy pages are written to image add their count to
> > "pages_written" stats counter.
> 
> Erm... I saw Adrian's thoughts on counting lazy pages as written, but now I
> disagree with that. If we treat pages_written as "pages that went into the
> image file", then this definition is clean and understandable. But how to
> define pages_written if we count lazy pages there as well?
> 
> Can we better leave pages_written as "pages that are in the images" and, if
> we need it, introduce pages_dumped to count all the pages that are to be
> taken with us regardless whether they are in the image or lazy?

Well, PAGES_LAZY have broken notion of pages_written as "pages that
are in the images". With this patch it's restored:
- if dump writes potentially lazy pages to the image, they are counted as
  written
- if dump will go to server mode to transfer pages on demand, the lazy
  pages are not written to images and CNT_PAGES_WRITTEN does not count
them 
 
> -- Pavel
> 
> > Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
> > ---
> >  criu/mem.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/criu/mem.c b/criu/mem.c
> > index 597bf29..b9b0bf4 100644
> > --- a/criu/mem.c
> > +++ b/criu/mem.c
> > @@ -192,7 +192,8 @@ static int generate_iovs(struct vma_area *vma, struct page_pipe *pp, u64 *map, u
> >  	cnt_add(CNT_PAGES_ZERO, pages[0]);
> >  	cnt_add(CNT_PAGES_SKIPPED_PARENT, pages[1]);
> >  	cnt_add(CNT_PAGES_LAZY, pages[2]);
> > -	cnt_add(CNT_PAGES_WRITTEN, pages[3]);
> > +	cnt_add(CNT_PAGES_WRITTEN,
> > +		opts.lazy_pages ? pages[3] : pages[2] + pages[3]);
> >  
> >  	pr_info("Pagemap generated: %lu pages (%lu lazy) %lu holes %lu zeros\n",
> >  		pages[3] + pages[2], pages[2], pages[1], pages[0]);
> > 
>
Adrian Reber Sept. 13, 2016, 5:18 p.m.
On Tue, Sep 13, 2016 at 06:12:28PM +0300, Pavel Emelyanov wrote:
> On 09/12/2016 03:08 PM, Mike Rapoport wrote:
> > On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
> >> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
> >>> On Tue, Sep 06, 2016 at 11:35:23AM +0200, Adrian Reber wrote:
> >>>> Using the latest criu with p.haul the information about the pages dumped
> >>>> looks a bit wrong. From p.haul I get:
> > 
> > [ snip ]
> >  
> >>>> So p.haul is kind of right as the stats file says that only 13 pages
> >>>> have been written. But all the lazy pages have also been written.
> >>>>
> >>>> Should pages_written include the lazy pages or should p.haul add
> >>>> pages_written and pages_lazy to get the actual number of pages written?
> >>>
> >>> I think p.haul should add pages_written and pages_lazy.
> >>
> >> I thought some more about this and I am not convinced. 'pages_written'
> >> sounds like all pages written. Unrelated if the pages are lazy or not.
> >  
> > Agree. How about the patch below?
> > 
> >> Are there any advantages I do not see if the lazy pages are not included
> >> in 'pages_written'? In the future when p.haul might know how to combine
> >> pre-copy and post-copy the variable 'pages_lazy' will become important
> >> but right now it feels wrong to read it to decide if additional pre-copy
> >> runs should be performed.
> >>
> >> 		Adrian
> >>
> > 
> >>From 898ab269a6b4a47ed8019445b688dfd3d41d0581 Mon Sep 17 00:00:00 2001
> > From: Mike Rapoport <rppt@linux.vnet.ibm.com>
> > Date: Mon, 12 Sep 2016 15:05:13 +0300
> > Subject: [CRIU][PATCH] criu: mem: count all pages actually written to image as
> >  "pages_written"
> > 
> > When potentially lazy pages are written to image add their count to
> > "pages_written" stats counter.
> 
> Erm... I saw Adrian's thoughts on counting lazy pages as written, but now I
> disagree with that. If we treat pages_written as "pages that went into the
> image file", then this definition is clean and understandable. But how to
> define pages_written if we count lazy pages there as well?
> 
> Can we better leave pages_written as "pages that are in the images" and, if
> we need it, introduce pages_dumped to count all the pages that are to be
> taken with us regardless whether they are in the image or lazy?

The problem I see is with p.haul during the pre-dump. Maybe pages should
not be counted as lazy in the pre-dump case. There is no use to count
lazy pages during pre-dump.

		Adrian
Pavel Emelianov Sept. 14, 2016, 9:05 a.m.
On 09/13/2016 08:18 PM, Adrian Reber wrote:
> On Tue, Sep 13, 2016 at 06:12:28PM +0300, Pavel Emelyanov wrote:
>> On 09/12/2016 03:08 PM, Mike Rapoport wrote:
>>> On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
>>>> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
>>>>> On Tue, Sep 06, 2016 at 11:35:23AM +0200, Adrian Reber wrote:
>>>>>> Using the latest criu with p.haul the information about the pages dumped
>>>>>> looks a bit wrong. From p.haul I get:
>>>
>>> [ snip ]
>>>  
>>>>>> So p.haul is kind of right as the stats file says that only 13 pages
>>>>>> have been written. But all the lazy pages have also been written.
>>>>>>
>>>>>> Should pages_written include the lazy pages or should p.haul add
>>>>>> pages_written and pages_lazy to get the actual number of pages written?
>>>>>
>>>>> I think p.haul should add pages_written and pages_lazy.
>>>>
>>>> I thought some more about this and I am not convinced. 'pages_written'
>>>> sounds like all pages written. Unrelated if the pages are lazy or not.
>>>  
>>> Agree. How about the patch below?
>>>
>>>> Are there any advantages I do not see if the lazy pages are not included
>>>> in 'pages_written'? In the future when p.haul might know how to combine
>>>> pre-copy and post-copy the variable 'pages_lazy' will become important
>>>> but right now it feels wrong to read it to decide if additional pre-copy
>>>> runs should be performed.
>>>>
>>>> 		Adrian
>>>>
>>>
>>> >From 898ab269a6b4a47ed8019445b688dfd3d41d0581 Mon Sep 17 00:00:00 2001
>>> From: Mike Rapoport <rppt@linux.vnet.ibm.com>
>>> Date: Mon, 12 Sep 2016 15:05:13 +0300
>>> Subject: [CRIU][PATCH] criu: mem: count all pages actually written to image as
>>>  "pages_written"
>>>
>>> When potentially lazy pages are written to image add their count to
>>> "pages_written" stats counter.
>>
>> Erm... I saw Adrian's thoughts on counting lazy pages as written, but now I
>> disagree with that. If we treat pages_written as "pages that went into the
>> image file", then this definition is clean and understandable. But how to
>> define pages_written if we count lazy pages there as well?
>>
>> Can we better leave pages_written as "pages that are in the images" and, if
>> we need it, introduce pages_dumped to count all the pages that are to be
>> taken with us regardless whether they are in the image or lazy?
> 
> The problem I see is with p.haul during the pre-dump. Maybe pages should
> not be counted as lazy in the pre-dump case. There is no use to count
> lazy pages during pre-dump.

Agreed. On pre-dump lazy pages do not make sense. But AFAIU they are not
counted.

Mike?

> 		Adrian
> .
>
Mike Rapoport Sept. 14, 2016, 12:53 p.m.
On Wed, Sep 14, 2016 at 12:05:49PM +0300, Pavel Emelyanov wrote:
> On 09/13/2016 08:18 PM, Adrian Reber wrote:
> > On Tue, Sep 13, 2016 at 06:12:28PM +0300, Pavel Emelyanov wrote:
> >> On 09/12/2016 03:08 PM, Mike Rapoport wrote:
> >>> On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
> >>>> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
> >>
> >> Erm... I saw Adrian's thoughts on counting lazy pages as written, but now I
> >> disagree with that. If we treat pages_written as "pages that went into the
> >> image file", then this definition is clean and understandable. But how to
> >> define pages_written if we count lazy pages there as well?
> >>
> >> Can we better leave pages_written as "pages that are in the images" and, if
> >> we need it, introduce pages_dumped to count all the pages that are to be
> >> taken with us regardless whether they are in the image or lazy?
> > 
> > The problem I see is with p.haul during the pre-dump. Maybe pages should
> > not be counted as lazy in the pre-dump case. There is no use to count
> > lazy pages during pre-dump.
> 
> Agreed. On pre-dump lazy pages do not make sense. But AFAIU they are not
> counted.
 
The short answer is "the lazy pages are always counted as lazy".

The longer story is that at the point when the pages are collected to pipes
and counted, the "lazy" has notion means "the page belongs to VMA that
userfaultfd can handle". Therefore, the ppb's marked as PPB_LAY may or may
not get written to the images, depending on the options.

We can add sixth parameter to generate_iovs, so that it'll be able to tell
whether it should count lazy pages or not. But, IMHO, it'll make memory
dump path even "nicer" than it is now.

In the current state, which is certainly buggy, PAGES_WRITTEN counts only
pages that cannot be handled by userfaultfd, and PAGES_LAZY counts pages
that can be handled by userfaultfd.

The patch above in this thread makes sure that PAGES_WRITTEN counts all the
pages that are in the images, while keeping PAGES_LAZY intact.

> 
> > 		Adrian
> > .

--
Sincerely yours,
Mike.
Mike Rapoport Sept. 27, 2016, 10:25 a.m.
On Wed, Sep 14, 2016 at 03:53:49PM +0300, Mike Rapoport wrote:
> On Wed, Sep 14, 2016 at 12:05:49PM +0300, Pavel Emelyanov wrote:
> > On 09/13/2016 08:18 PM, Adrian Reber wrote:
> > > On Tue, Sep 13, 2016 at 06:12:28PM +0300, Pavel Emelyanov wrote:
> > >> On 09/12/2016 03:08 PM, Mike Rapoport wrote:
> > >>> On Mon, Sep 12, 2016 at 11:02:34AM +0200, Adrian Reber wrote:
> > >>>> On Thu, Sep 08, 2016 at 08:39:06AM +0300, Mike Rapoport wrote:
> > >>
> > >> Erm... I saw Adrian's thoughts on counting lazy pages as written, but now I
> > >> disagree with that. If we treat pages_written as "pages that went into the
> > >> image file", then this definition is clean and understandable. But how to
> > >> define pages_written if we count lazy pages there as well?
> > >>
> > >> Can we better leave pages_written as "pages that are in the images" and, if
> > >> we need it, introduce pages_dumped to count all the pages that are to be
> > >> taken with us regardless whether they are in the image or lazy?
> > > 
> > > The problem I see is with p.haul during the pre-dump. Maybe pages should
> > > not be counted as lazy in the pre-dump case. There is no use to count
> > > lazy pages during pre-dump.
> > 
> > Agreed. On pre-dump lazy pages do not make sense. But AFAIU they are not
> > counted.
> 
> The short answer is "the lazy pages are always counted as lazy".
> 
> The longer story is that at the point when the pages are collected to pipes
> and counted, the "lazy" has notion means "the page belongs to VMA that
> userfaultfd can handle". Therefore, the ppb's marked as PPB_LAY may or may
> not get written to the images, depending on the options.
> 
> We can add sixth parameter to generate_iovs, so that it'll be able to tell
> whether it should count lazy pages or not. But, IMHO, it'll make memory
> dump path even "nicer" than it is now.
> 
> In the current state, which is certainly buggy, PAGES_WRITTEN counts only
> pages that cannot be handled by userfaultfd, and PAGES_LAZY counts pages
> that can be handled by userfaultfd.
> 
> The patch above in this thread makes sure that PAGES_WRITTEN counts all the
> pages that are in the images, while keeping PAGES_LAZY intact.

So, how do we want to move forward with fixing the stats reporting?
 
> > 
> > > 		Adrian
> > > .
> 
> --
> Sincerely yours,
> Mike.
> 
> _______________________________________________
> CRIU mailing list
> CRIU@openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
>