[5/6] criu: pagemap: add PE_PRESENT flag

Submitted by Mike Rapoport on Sept. 8, 2016, 7:39 a.m.

Details

Message ID 1473320352-28669-6-git-send-email-rppt@linux.vnet.ibm.com
State Rejected
Series "criu: pagemap: improve accuracy of lazy pages"
Headers show

Commit Message

Mike Rapoport Sept. 8, 2016, 7:39 a.m.
The PE_PRESENT flags is always set for pagemap entries that have
corresponding pages in the pages*img. Pagemap entries describing a hole
either with zero page or with pages in the parent snapshot will no have
PE_PRESENT flag set.
Pagemap entry that may be lazily restored is a special case. For the lazy
restore from disk case, both PE_LAZY and PE_PRESENT will be set in the
pagemap, but for the remote lazy pages case only PE_LAZY will be set.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 criu/include/pagemap.h |  6 ++++++
 criu/page-xfer.c       | 16 +++++++++++-----
 criu/pagemap.c         | 10 +++++++++-
 3 files changed, 26 insertions(+), 6 deletions(-)

Patch hide | download patch | download mbox

diff --git a/criu/include/pagemap.h b/criu/include/pagemap.h
index f4a9285..705a9af 100644
--- a/criu/include/pagemap.h
+++ b/criu/include/pagemap.h
@@ -97,6 +97,7 @@  extern int dedup_one_iovec(struct page_read *pr, struct iovec *iov);
 #define PE_PARENT	(1 << 0)	/* pages are in parent snapshot */
 #define PE_ZERO		(1 << 1)	/* pages can be lazily restored */
 #define PE_LAZY		(1 << 2)	/* pages are mapped to zero pfn */
+#define PE_PRESENT	(1 << 3)	/* pages are present in pages*img */
 
 static inline bool pagemap_in_parent(PagemapEntry *pe)
 {
@@ -113,4 +114,9 @@  static inline bool pagemap_lazy(PagemapEntry *pe)
 	return !!(pe->flags & PE_LAZY);
 }
 
+static inline bool pagemap_present(PagemapEntry *pe)
+{
+	return !!(pe->flags & PE_PRESENT);
+}
+
 #endif /* __CR_PAGE_READ_H__ */
diff --git a/criu/page-xfer.c b/criu/page-xfer.c
index 68dba86..798ea0b 100644
--- a/criu/page-xfer.c
+++ b/criu/page-xfer.c
@@ -440,6 +440,7 @@  int page_xfer_dump_pages(struct page_xfer *xfer, struct page_pipe *pp,
 
 		for (i = 0; i < ppb->nr_segs; i++) {
 			struct iovec iov = get_iov(ppb->iov, i, pp->flags & PP_COMPAT);
+			u32 flags = PE_PRESENT;
 
 			ret = dump_holes(xfer, pp, &cur_hole, iov.iov_base, off);
 			if (ret)
@@ -450,13 +451,18 @@  int page_xfer_dump_pages(struct page_xfer *xfer, struct page_pipe *pp,
 			pr_debug("\tp %p [%u]\n", iov.iov_base,
 					(unsigned int)(iov.iov_len / PAGE_SIZE));
 
-			if (!dump_lazy && ppb->flags & PPB_LAZY) {
-				if (xfer->write_hole(xfer, &iov, PS_IOV_LAZY))
-					return -1;
-				continue;
+			if (ppb->flags & PPB_LAZY) {
+				if (!dump_lazy) {
+					if (xfer->write_hole(xfer, &iov,
+							     PS_IOV_LAZY))
+						return -1;
+					continue;
+				} else {
+					flags |= PE_LAZY;
+				}
 			}
 
-			if (xfer->write_pagemap(xfer, &iov, 0))
+			if (xfer->write_pagemap(xfer, &iov, flags))
 				return -1;
 			if (xfer->write_pages(xfer, ppb->p[0], iov.iov_len))
 				return -1;
diff --git a/criu/pagemap.c b/criu/pagemap.c
index aaef5a4..71ae11e 100644
--- a/criu/pagemap.c
+++ b/criu/pagemap.c
@@ -158,7 +158,7 @@  static void skip_pagemap_pages(struct page_read *pr, unsigned long len)
 		return;
 
 	pr_debug("\tpr%u Skip %lu bytes from page-dump\n", pr->id, len);
-	if (!pagemap_in_parent(pr->pe) && !pagemap_zero(pr->pe) && !pagemap_lazy(pr->pe))
+	if (pagemap_present(pr->pe))
 		pr->pi_off += len;
 	pr->cvaddr += len;
 }
@@ -360,8 +360,16 @@  err_cl:
 
 static void init_compat_pagemap_entry(PagemapEntry *pe)
 {
+	/*
+	 * pagemap image generated with older version will either
+	 * contain a hole because the pages are in the parent
+	 * shanpshot or a pagemap that should be marked with
+	 * PE_PRESENT
+	 */
 	if (pe->has_in_parent && pe->in_parent)
 		pe->flags |= PE_PARENT;
+	else if (!pe->has_flags)
+		pe->flags = PE_PRESENT;
 }
 
 /*

Comments

Pavel Emelianov Sept. 13, 2016, 2:59 p.m.
On 09/08/2016 10:39 AM, Mike Rapoport wrote:
> The PE_PRESENT flags is always set for pagemap entries that have
> corresponding pages in the pages*img. Pagemap entries describing a hole
> either with zero page or with pages in the parent snapshot will no have
> PE_PRESENT flag set.

But why do we need the separate flag then? If entry.flag & (PE_ZERO | PE_LAZY)
is equivalent to it...

> Pagemap entry that may be lazily restored is a special case. For the lazy
> restore from disk case, both PE_LAZY and PE_PRESENT will be set in the
> pagemap, but for the remote lazy pages case only PE_LAZY will be set.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
> ---
>  criu/include/pagemap.h |  6 ++++++
>  criu/page-xfer.c       | 16 +++++++++++-----
>  criu/pagemap.c         | 10 +++++++++-
>  3 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/criu/include/pagemap.h b/criu/include/pagemap.h
> index f4a9285..705a9af 100644
> --- a/criu/include/pagemap.h
> +++ b/criu/include/pagemap.h
> @@ -97,6 +97,7 @@ extern int dedup_one_iovec(struct page_read *pr, struct iovec *iov);
>  #define PE_PARENT	(1 << 0)	/* pages are in parent snapshot */
>  #define PE_ZERO		(1 << 1)	/* pages can be lazily restored */
>  #define PE_LAZY		(1 << 2)	/* pages are mapped to zero pfn */
> +#define PE_PRESENT	(1 << 3)	/* pages are present in pages*img */
>  
>  static inline bool pagemap_in_parent(PagemapEntry *pe)
>  {
> @@ -113,4 +114,9 @@ static inline bool pagemap_lazy(PagemapEntry *pe)
>  	return !!(pe->flags & PE_LAZY);
>  }
>  
> +static inline bool pagemap_present(PagemapEntry *pe)
> +{
> +	return !!(pe->flags & PE_PRESENT);
> +}
> +
>  #endif /* __CR_PAGE_READ_H__ */
> diff --git a/criu/page-xfer.c b/criu/page-xfer.c
> index 68dba86..798ea0b 100644
> --- a/criu/page-xfer.c
> +++ b/criu/page-xfer.c
> @@ -440,6 +440,7 @@ int page_xfer_dump_pages(struct page_xfer *xfer, struct page_pipe *pp,
>  
>  		for (i = 0; i < ppb->nr_segs; i++) {
>  			struct iovec iov = get_iov(ppb->iov, i, pp->flags & PP_COMPAT);
> +			u32 flags = PE_PRESENT;
>  
>  			ret = dump_holes(xfer, pp, &cur_hole, iov.iov_base, off);
>  			if (ret)
> @@ -450,13 +451,18 @@ int page_xfer_dump_pages(struct page_xfer *xfer, struct page_pipe *pp,
>  			pr_debug("\tp %p [%u]\n", iov.iov_base,
>  					(unsigned int)(iov.iov_len / PAGE_SIZE));
>  
> -			if (!dump_lazy && ppb->flags & PPB_LAZY) {
> -				if (xfer->write_hole(xfer, &iov, PS_IOV_LAZY))
> -					return -1;
> -				continue;
> +			if (ppb->flags & PPB_LAZY) {
> +				if (!dump_lazy) {
> +					if (xfer->write_hole(xfer, &iov,
> +							     PS_IOV_LAZY))
> +						return -1;
> +					continue;
> +				} else {
> +					flags |= PE_LAZY;
> +				}
>  			}
>  
> -			if (xfer->write_pagemap(xfer, &iov, 0))
> +			if (xfer->write_pagemap(xfer, &iov, flags))
>  				return -1;
>  			if (xfer->write_pages(xfer, ppb->p[0], iov.iov_len))
>  				return -1;
> diff --git a/criu/pagemap.c b/criu/pagemap.c
> index aaef5a4..71ae11e 100644
> --- a/criu/pagemap.c
> +++ b/criu/pagemap.c
> @@ -158,7 +158,7 @@ static void skip_pagemap_pages(struct page_read *pr, unsigned long len)
>  		return;
>  
>  	pr_debug("\tpr%u Skip %lu bytes from page-dump\n", pr->id, len);
> -	if (!pagemap_in_parent(pr->pe) && !pagemap_zero(pr->pe) && !pagemap_lazy(pr->pe))
> +	if (pagemap_present(pr->pe))
>  		pr->pi_off += len;
>  	pr->cvaddr += len;
>  }
> @@ -360,8 +360,16 @@ err_cl:
>  
>  static void init_compat_pagemap_entry(PagemapEntry *pe)
>  {
> +	/*
> +	 * pagemap image generated with older version will either
> +	 * contain a hole because the pages are in the parent
> +	 * shanpshot or a pagemap that should be marked with
> +	 * PE_PRESENT
> +	 */
>  	if (pe->has_in_parent && pe->in_parent)
>  		pe->flags |= PE_PARENT;
> +	else if (!pe->has_flags)
> +		pe->flags = PE_PRESENT;
>  }
>  
>  /*
>
Mike Rapoport Sept. 13, 2016, 3:13 p.m.
On Tue, Sep 13, 2016 at 05:59:42PM +0300, Pavel Emelyanov wrote:
> On 09/08/2016 10:39 AM, Mike Rapoport wrote:
> > The PE_PRESENT flags is always set for pagemap entries that have
> > corresponding pages in the pages*img. Pagemap entries describing a hole
> > either with zero page or with pages in the parent snapshot will no have
> > PE_PRESENT flag set.
> 
> But why do we need the separate flag then? If entry.flag & (PE_ZERO | PE_LAZY)
> is equivalent to it...

If we dump lazy pages for e.g. lazy restore from images, the pagemaps with
lazy pages will have PE_PRESENT | PE_LAZY.
For post-copy, only PE_LAZY will be set and there won't be pages in
pages*img.

The idea is to make pagemap contain all the information to decide whether
uffd can/should be used during restore. Then we don't need to check each
time against VMA if certain page may be lazy.

 
> > Pagemap entry that may be lazily restored is a special case. For the lazy
> > restore from disk case, both PE_LAZY and PE_PRESENT will be set in the
> > pagemap, but for the remote lazy pages case only PE_LAZY will be set.
> > 
> > Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
> > ---
> >  criu/include/pagemap.h |  6 ++++++
> >  criu/page-xfer.c       | 16 +++++++++++-----
> >  criu/pagemap.c         | 10 +++++++++-
> >  3 files changed, 26 insertions(+), 6 deletions(-)
> > 
> > diff --git a/criu/include/pagemap.h b/criu/include/pagemap.h
> > index f4a9285..705a9af 100644
> > --- a/criu/include/pagemap.h
> > +++ b/criu/include/pagemap.h
> > @@ -97,6 +97,7 @@ extern int dedup_one_iovec(struct page_read *pr, struct iovec *iov);
> >  #define PE_PARENT	(1 << 0)	/* pages are in parent snapshot */
> >  #define PE_ZERO		(1 << 1)	/* pages can be lazily restored */
> >  #define PE_LAZY		(1 << 2)	/* pages are mapped to zero pfn */
> > +#define PE_PRESENT	(1 << 3)	/* pages are present in pages*img */
> >  
> >  static inline bool pagemap_in_parent(PagemapEntry *pe)
> >  {
> > @@ -113,4 +114,9 @@ static inline bool pagemap_lazy(PagemapEntry *pe)
> >  	return !!(pe->flags & PE_LAZY);
> >  }
> >  
> > +static inline bool pagemap_present(PagemapEntry *pe)
> > +{
> > +	return !!(pe->flags & PE_PRESENT);
> > +}
> > +
> >  #endif /* __CR_PAGE_READ_H__ */
> > diff --git a/criu/page-xfer.c b/criu/page-xfer.c
> > index 68dba86..798ea0b 100644
> > --- a/criu/page-xfer.c
> > +++ b/criu/page-xfer.c
> > @@ -440,6 +440,7 @@ int page_xfer_dump_pages(struct page_xfer *xfer, struct page_pipe *pp,
> >  
> >  		for (i = 0; i < ppb->nr_segs; i++) {
> >  			struct iovec iov = get_iov(ppb->iov, i, pp->flags & PP_COMPAT);
> > +			u32 flags = PE_PRESENT;
> >  
> >  			ret = dump_holes(xfer, pp, &cur_hole, iov.iov_base, off);
> >  			if (ret)
> > @@ -450,13 +451,18 @@ int page_xfer_dump_pages(struct page_xfer *xfer, struct page_pipe *pp,
> >  			pr_debug("\tp %p [%u]\n", iov.iov_base,
> >  					(unsigned int)(iov.iov_len / PAGE_SIZE));
> >  
> > -			if (!dump_lazy && ppb->flags & PPB_LAZY) {
> > -				if (xfer->write_hole(xfer, &iov, PS_IOV_LAZY))
> > -					return -1;
> > -				continue;
> > +			if (ppb->flags & PPB_LAZY) {
> > +				if (!dump_lazy) {
> > +					if (xfer->write_hole(xfer, &iov,
> > +							     PS_IOV_LAZY))
> > +						return -1;
> > +					continue;
> > +				} else {
> > +					flags |= PE_LAZY;
> > +				}
> >  			}
> >  
> > -			if (xfer->write_pagemap(xfer, &iov, 0))
> > +			if (xfer->write_pagemap(xfer, &iov, flags))
> >  				return -1;
> >  			if (xfer->write_pages(xfer, ppb->p[0], iov.iov_len))
> >  				return -1;
> > diff --git a/criu/pagemap.c b/criu/pagemap.c
> > index aaef5a4..71ae11e 100644
> > --- a/criu/pagemap.c
> > +++ b/criu/pagemap.c
> > @@ -158,7 +158,7 @@ static void skip_pagemap_pages(struct page_read *pr, unsigned long len)
> >  		return;
> >  
> >  	pr_debug("\tpr%u Skip %lu bytes from page-dump\n", pr->id, len);
> > -	if (!pagemap_in_parent(pr->pe) && !pagemap_zero(pr->pe) && !pagemap_lazy(pr->pe))
> > +	if (pagemap_present(pr->pe))
> >  		pr->pi_off += len;
> >  	pr->cvaddr += len;
> >  }
> > @@ -360,8 +360,16 @@ err_cl:
> >  
> >  static void init_compat_pagemap_entry(PagemapEntry *pe)
> >  {
> > +	/*
> > +	 * pagemap image generated with older version will either
> > +	 * contain a hole because the pages are in the parent
> > +	 * shanpshot or a pagemap that should be marked with
> > +	 * PE_PRESENT
> > +	 */
> >  	if (pe->has_in_parent && pe->in_parent)
> >  		pe->flags |= PE_PARENT;
> > +	else if (!pe->has_flags)
> > +		pe->flags = PE_PRESENT;
> >  }
> >  
> >  /*
> > 
>
Pavel Emelianov Sept. 13, 2016, 3:51 p.m.
On 09/13/2016 06:13 PM, Mike Rapoport wrote:
> On Tue, Sep 13, 2016 at 05:59:42PM +0300, Pavel Emelyanov wrote:
>> On 09/08/2016 10:39 AM, Mike Rapoport wrote:
>>> The PE_PRESENT flags is always set for pagemap entries that have
>>> corresponding pages in the pages*img. Pagemap entries describing a hole
>>> either with zero page or with pages in the parent snapshot will no have
>>> PE_PRESENT flag set.
>>
>> But why do we need the separate flag then? If entry.flag & (PE_ZERO | PE_LAZY)
>> is equivalent to it...
> 
> If we dump lazy pages for e.g. lazy restore from images, the pagemaps with
> lazy pages will have PE_PRESENT | PE_LAZY.

But how? I'm reading the page_xfer_dump_pages and see that flags will
be on pe only if it calls ->write_hole() and in this case no pages
go to the image file :(

> For post-copy, only PE_LAZY will be set and there won't be pages in
> pages*img.
> 
> The idea is to make pagemap contain all the information to decide whether
> uffd can/should be used during restore. Then we don't need to check each
> time against VMA if certain page may be lazy.
Mike Rapoport Sept. 13, 2016, 4:08 p.m.
On Tue, Sep 13, 2016 at 06:51:45PM +0300, Pavel Emelyanov wrote:
> On 09/13/2016 06:13 PM, Mike Rapoport wrote:
> > On Tue, Sep 13, 2016 at 05:59:42PM +0300, Pavel Emelyanov wrote:
> >> On 09/08/2016 10:39 AM, Mike Rapoport wrote:
> >>> The PE_PRESENT flags is always set for pagemap entries that have
> >>> corresponding pages in the pages*img. Pagemap entries describing a hole
> >>> either with zero page or with pages in the parent snapshot will no have
> >>> PE_PRESENT flag set.
> >>
> >> But why do we need the separate flag then? If entry.flag & (PE_ZERO | PE_LAZY)
> >> is equivalent to it...
> > 
> > If we dump lazy pages for e.g. lazy restore from images, the pagemaps with
> > lazy pages will have PE_PRESENT | PE_LAZY.
> 
> But how? I'm reading the page_xfer_dump_pages and see that flags will
> be on pe only if it calls ->write_hole() and in this case no pages
> go to the image file :(

The hunk at @@ -450,13 +451,18 @@ criu/page-xfer.c in this patch takes care
of that.
When page_xfer_dump_pages encounters buffer marked as lazy, it checks
whether it should dump lazy pages or not. In case it shouldn't, it creates
a hole using ->write_hole. Otherwise, it uses ->write_pagemap with
PE_PRESENT | PE_LAZY.
 
> > For post-copy, only PE_LAZY will be set and there won't be pages in
> > pages*img.
> > 
> > The idea is to make pagemap contain all the information to decide whether
> > uffd can/should be used during restore. Then we don't need to check each
> > time against VMA if certain page may be lazy.
>