criu: add make lazy support to crit

Submitted by Adrian Reber on April 29, 2016, 8:39 a.m.

Details

Message ID 1461919168-6947-1-git-send-email-adrian@lisas.de
State Rejected
Series "criu: add make lazy support to crit"

Commit Message

Adrian Reber April 29, 2016, 8:39 a.m.
From: Adrian Reber <areber@redhat.com>

This enables crit to remove all memory pages from a checkpoint
directory which can be lazily restored using userfaultfd. This
changes the pagemap.img and pages.img to no longer contain pages
which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).

Usage:

 $ crit/crit make-lazy /tmp/4/ /tmp/5
 $ du -hs /tmp/4 /tmp/5
 201M	/tmp/4
 116K	/tmp/5

The checkpoint in /tmp/5 can be used by the actual restore process
and the checkpoint in /tmp/4 (with all memory pages) can be used
by the uffd daemon, which then transfers the pages into the restored
process on demand.

Signed-off-by: Adrian Reber <areber@redhat.com>
---
 crit/crit | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

Patch

diff --git a/crit/crit b/crit/crit
index 93cbc98..0e97fe5 100755
--- a/crit/crit
+++ b/crit/crit
@@ -217,6 +217,75 @@  explorers = { 'ps': explore_ps, 'fds': explore_fds, 'mems': explore_mems }
 def explore(opts):
 	explorers[opts['what']](opts)
 
+def make_lazy(opts):
+	""" This function takes the pages from the input directory
+	and removes all pages which can be restored lazily.
+	MAP_PRIVATE && MAP_ANON and not VDSO and not VSYSCALL. """
+	# page size is hardcoded to 0x1000; probably a bad idea
+	ps = 0x1000
+	ps_img = pycriu.images.load(dinf(opts, 'pstree.img'))
+	vids = vma_id()
+	lazy_pages = []
+	for p in ps_img['entries']:
+		pid = p['pid']
+		mmi = pycriu.images.load(dinf(opts, 'mm-%d.img' % pid))['entries'][0]
+
+		print "%d" % pid
+
+		for vma in mmi['vmas']:
+			st = vma['status']
+			# 'MAP_PRIVATE', 0x2
+			# 'MAP_ANON',    0x20
+			if (vma['flags'] & 0x2) and (vma['flags'] & 0x20):
+				# (1 << 2) vsyscall
+				# (1 << 3) vdso
+				if not (st & (1 << 2)) and not (st & (1 << 3)):
+					vaddr = vma['start']
+					while vaddr < vma['end']:
+						lazy_pages.append(vaddr)
+						vaddr += ps
+
+		pms = pycriu.images.load(dinf(opts, 'pagemap-%d.img' % pid))
+		new = []
+
+		# find first pages_id
+		pages_id = -1
+		for pm in pms['entries']:
+			if pm.has_key('pages_id'):
+				pages_id = pm['pages_id']
+
+		if pages_id == -1:
+			# something went wrong
+			raise Exception('No pages_id found in pagemap!')
+
+		# open the original pages.img to remove the lazy pages
+		pages_in = os.path.join(opts['dir'], 'pages-%d.img' % pages_id)
+		pages_in = open(pages_in, 'rb')
+
+		pages_out = os.path.join(opts['outdir'], 'pages-%d.img' % pages_id)
+
+		pages_out = open(pages_out, 'wb')
+
+		for pm in pms['entries']:
+			if pm.has_key('pages_id'):
+				new.append(pm)
+				continue
+			vaddr = pm['vaddr']
+			i = 0
+			start = 0
+			while vaddr < pm['vaddr'] + (pm['nr_pages'] * 0x1000):
+				page_buffer = pages_in.read(ps)
+				if vaddr not in lazy_pages:
+					if start == 0:
+						start = vaddr
+					i +=1
+					pages_out.write(page_buffer)
+				vaddr += ps
+			if i != 0:
+				new.append({'nr_pages': i, 'vaddr': start})
+		pms['entries'] = new
+		pycriu.images.dump(pms, open(os.path.join(opts['outdir'], 'pagemap-%d.img' % pid), 'w+'))
+
 def main():
 	desc = 'CRiu Image Tool'
 	parser = argparse.ArgumentParser(description=desc,
@@ -267,6 +336,15 @@  def main():
 	show_parser.add_argument("in")
 	show_parser.set_defaults(func=decode, pretty=True, out=None)
 
+	# Make Lazy
+	lazy_parser = subparsers.add_parser('make-lazy',
+			help = "remove memory pages from image which can be restored lazily")
+	lazy_parser.add_argument('dir',
+			help = "criu checkpoint directory used as input")
+	lazy_parser.add_argument('outdir',
+			help = "output directory for new pages.img and pagemap.img")
+	lazy_parser.set_defaults(func=make_lazy)
+
 	opts = vars(parser.parse_args())
 
 	opts["func"](opts)

Comments

Pavel Emelianov May 4, 2016, 4:49 p.m.
On 04/29/2016 11:39 AM, Adrian Reber wrote:
> From: Adrian Reber <areber@redhat.com>
> 
> This enables crit to remove all memory pages from a checkpoint
> directory which can be lazily restored using userfaultfd. This
> changes the pagemap.img and pages.img to no longer contain pages
> which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
> 
> Usage:
> 
>  $ crit/crit make-lazy /tmp/4/ /tmp/5
>  $ du -hs /tmp/4 /tmp/5
>  201M	/tmp/4
>  116K	/tmp/5
> 
> The checkpoint in /tmp/5 can be used by the actual restore process
> and the checkpoint in /tmp/4 (with all memory pages) can be used
> by the uffd daemon which then transfers the pages into the restored
> on demand.

OK, but what's the use case you see for this?

Adrian Reber May 4, 2016, 5:03 p.m.
On Wed, May 04, 2016 at 07:49:14PM +0300, Pavel Emelyanov wrote:
> On 04/29/2016 11:39 AM, Adrian Reber wrote:
> > From: Adrian Reber <areber@redhat.com>
> > 
> > This enables crit to remove all memory pages from a checkpoint
> > directory which can be lazily restored using userfaultfd. This
> > changes the pagemap.img and pages.img to no longer contain pages
> > which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
> > 
> > Usage:
> > 
> >  $ crit/crit make-lazy /tmp/4/ /tmp/5
> >  $ du -hs /tmp/4 /tmp/5
> >  201M	/tmp/4
> >  116K	/tmp/5
> > 
> > The checkpoint in /tmp/5 can be used by the actual restore process
> > and the checkpoint in /tmp/4 (with all memory pages) can be used
> > by the uffd daemon which then transfers the pages into the restored
> > on demand.
> 
> OK, but what's the use case you see for this?

To be able to remove the lazy pages from a checkpoint after the dump, so
that it can be restored lazily. Also for testing, as a confirmation that
the restore actually works with all lazy pages removed.

		Adrian

Pavel Emelianov May 4, 2016, 5:08 p.m.
On 05/04/2016 08:03 PM, Adrian Reber wrote:
> On Wed, May 04, 2016 at 07:49:14PM +0300, Pavel Emelyanov wrote:
>> On 04/29/2016 11:39 AM, Adrian Reber wrote:
>>> From: Adrian Reber <areber@redhat.com>
>>>
>>> This enables crit to remove all memory pages from a checkpoint
>>> directory which can be lazily restored using userfaultfd. This
>>> changes the pagemap.img and pages.img to no longer contain pages
>>> which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
>>>
>>> Usage:
>>>
>>>  $ crit/crit make-lazy /tmp/4/ /tmp/5
>>>  $ du -hs /tmp/4 /tmp/5
>>>  201M	/tmp/4
>>>  116K	/tmp/5
>>>
>>> The checkpoint in /tmp/5 can be used by the actual restore process
>>> and the checkpoint in /tmp/4 (with all memory pages) can be used
>>> by the uffd daemon which then transfers the pages into the restored
>>> on demand.
>>
>> OK, but what's the use case you see for this?
> 
> To be able to remove the lazy pages from a checkpoint after the dump to
> be able to restore it lazily.

But if you did a full dump (and killed the tasks) and then removed the lazy
pages from the images, where would you take the pages from?

> Also for testing, as a confirmation, that
> the restore actually works with all lazy pages removed.
> 
> 		Adrian
> 
Adrian Reber May 4, 2016, 5:11 p.m.
On Wed, May 04, 2016 at 08:08:20PM +0300, Pavel Emelyanov wrote:
> On 05/04/2016 08:03 PM, Adrian Reber wrote:
> > On Wed, May 04, 2016 at 07:49:14PM +0300, Pavel Emelyanov wrote:
> >> On 04/29/2016 11:39 AM, Adrian Reber wrote:
> >>> From: Adrian Reber <areber@redhat.com>
> >>>
> >>> This enables crit to remove all memory pages from a checkpoint
> >>> directory which can be lazily restored using userfaultfd. This
> >>> changes the pagemap.img and pages.img to no longer contain pages
> >>> which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
> >>>
> >>> Usage:
> >>>
> >>>  $ crit/crit make-lazy /tmp/4/ /tmp/5
> >>>  $ du -hs /tmp/4 /tmp/5
> >>>  201M	/tmp/4
> >>>  116K	/tmp/5
> >>>
> >>> The checkpoint in /tmp/5 can be used by the actual restore process
> >>> and the checkpoint in /tmp/4 (with all memory pages) can be used
> >>> by the uffd daemon which then transfers the pages into the restored
> >>> on demand.
> >>
> >> OK, but what's the use case you see for this?
> > 
> > To be able to remove the lazy pages from a checkpoint after the dump to
> > be able to restore it lazily.
> 
> But if you did full dump (and killed tasks) then removed lazy pages from
> images, where would you take the pages from?

I have the original checkpoint directory which is not changed and which
can be used by the uffd daemon. The second directory is the checkpoint
directory with the lazy pages removed. The stripped down directory (the
second, /tmp/5 in my example) is then copied to the destination system
without the need to copy the pages, which will be restored lazily
anyway.

		Adrian

> > Also for testing, as a confirmation, that
> > the restore actually works with all lazy pages removed.
> > 
> > 		Adrian
> > 
Pavel Emelianov May 6, 2016, 11:57 a.m.
Sorry for such a long delay in review. Here are my comments, inline.

On 04/29/2016 11:39 AM, Adrian Reber wrote:
> From: Adrian Reber <areber@redhat.com>
> 
> This enables crit to remove all memory pages from a checkpoint
> directory which can be lazily restored using userfaultfd. This
> changes the pagemap.img and pages.img to no longer contain pages
> which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
> 
> Usage:
> 
>  $ crit/crit make-lazy /tmp/4/ /tmp/5

We already have pagemap manipulation code called 'dedup', and it sits in
the criu binary :) I don't insist on implementing make-lazy in criu too;
crit looks like a better place. But do you have any ideas on how to make
dedup and make-lazy better aligned with each other?

>  $ du -hs /tmp/4 /tmp/5
>  201M	/tmp/4
>  116K	/tmp/5
> 
> The checkpoint in /tmp/5 can be used by the actual restore process
> and the checkpoint in /tmp/4 (with all memory pages) can be used
> by the uffd daemon which then transfers the pages into the restored
> on demand.
> 
> Signed-off-by: Adrian Reber <areber@redhat.com>
> ---
>  crit/crit | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 78 insertions(+)
> 
> diff --git a/crit/crit b/crit/crit
> index 93cbc98..0e97fe5 100755
> --- a/crit/crit
> +++ b/crit/crit
> @@ -217,6 +217,75 @@ explorers = { 'ps': explore_ps, 'fds': explore_fds, 'mems': explore_mems }
>  def explore(opts):
>  	explorers[opts['what']](opts)
>  
> +def make_lazy(opts):
> +	""" This function takes the pages from the input directory
> +	and removes all pages which can be restored lazily.
> +	MAP_PRIVATE && MAP_ANON and not VDSO and not VSYSCALL. """
> +	# page size is hardcoded to 0x1000; probably a bad idea
> +	ps = 0x1000
> +	ps_img = pycriu.images.load(dinf(opts, 'pstree.img'))
> +	vids = vma_id()
> +	lazy_pages = []
> +	for p in ps_img['entries']:
> +		pid = p['pid']
> +		mmi = pycriu.images.load(dinf(opts, 'mm-%d.img' % pid))['entries'][0]

There may be no mm-%d.img for zombies, so this will die with an exception. The
same problem exists for 'crit x'; I have a fix in my queue and will send it shortly.
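
One possible way to handle the zombie case is to skip tasks that have no
mm image at all. The following is only a minimal sketch under that
assumption; the helper names mm_image_path() and pids_with_mm() are
hypothetical and are not part of crit or pycriu.

```python
import os


def mm_image_path(image_dir, pid):
    """Return the path of mm-<pid>.img, or None if the file does not exist."""
    path = os.path.join(image_dir, 'mm-%d.img' % pid)
    return path if os.path.exists(path) else None


def pids_with_mm(image_dir, pids):
    """Keep only the pids that have an mm image; zombies are skipped."""
    return [pid for pid in pids if mm_image_path(image_dir, pid) is not None]
```

make_lazy() could then iterate over the filtered pid list instead of
loading mm-%d.img unconditionally.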

> +
> +		print "%d" % pid
> +
> +		for vma in mmi['vmas']:
> +			st = vma['status']
> +			# 'MAP_PRIVATE', 0x2
> +			# 'MAP_ANON',    0x20

These constants are declared in crit/pycriu/images/pb2dict.py.
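
As an illustration of what named constants would buy here, the sketch
below spells out the check from the patch with symbolic names. The values
are the ones from the comments in the patch; the names are local to this
sketch and only assumed to correspond to the tables in pb2dict.py, and
is_lazy_vma() is a hypothetical helper.

```python
# Values taken from the comments in the patch. The names are defined
# locally for this sketch (assumed to mirror the tables in pb2dict.py).
MAP_PRIVATE = 0x2
MAP_ANON = 0x20
VMA_AREA_VSYSCALL = 1 << 2
VMA_AREA_VDSO = 1 << 3


def is_lazy_vma(flags, status):
    """True for MAP_PRIVATE && MAP_ANON areas that are not vsyscall or vdso."""
    private_anon = (flags & MAP_PRIVATE) and (flags & MAP_ANON)
    special = status & (VMA_AREA_VSYSCALL | VMA_AREA_VDSO)
    return bool(private_anon) and not special
```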

> +			if (vma['flags'] & 0x2) and (vma['flags'] & 0x20):
> +				# (1 << 2) vsyscall
> +				# (1 << 3) vdso
> +				if not (st & (1 << 2)) and not (st & (1 << 3)):
> +					vaddr = vma['start']
> +					while vaddr < vma['end']:
> +						lazy_pages.append(vaddr)

Can we keep intervals rather than individual vaddrs in this list? For huge
tasks this list can grow enormous.
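
A rough sketch of the interval idea, assuming half-open [start, end)
ranges built from vma['start'] and vma['end']. The Intervals class is
hypothetical; it only shows that "vaddr in lazy_pages" can stay as it is
while memory use depends on the number of VMAs rather than on the number
of pages.

```python
import bisect


class Intervals(object):
    """Sorted, non-overlapping [start, end) address ranges."""

    def __init__(self, ranges):
        merged = []
        for start, end in sorted(ranges):
            # Merge ranges that overlap or touch the previous one.
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        self.starts = [r[0] for r in merged]
        self.ends = [r[1] for r in merged]

    def __contains__(self, addr):
        # Find the last range starting at or before addr via binary search.
        i = bisect.bisect_right(self.starts, addr) - 1
        return i >= 0 and addr < self.ends[i]
```

With this, lazy_pages would be built once per task from the (start, end)
pairs of the matching VMAs, and each lookup is a binary search instead of
a linear scan over a list of page addresses.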

> +						vaddr += ps
> +
> +		pms = pycriu.images.load(dinf(opts, 'pagemap-%d.img' % pid))
> +		new = []
> +
> +		# find first pages_id
> +		pages_id = -1
> +		for pm in pms['entries']:
> +			if pm.has_key('pages_id'):
> +				pages_id = pm['pages_id']

This id is always in the first entry, pms['entries'][0].

> +
> +		if pages_id == -1:
> +			# something went wrong
> +			raise Exception('No pages_id found in pagemap!')
> +
> +		# open the original pages.img to remove the lazy pages
> +		pages_in = os.path.join(opts['dir'], 'pages-%d.img' % pages_id)
> +		pages_in = open(pages_in, 'rb')
> +
> +		pages_out = os.path.join(opts['outdir'], 'pages-%d.img' % pages_id)
> +
> +		pages_out = open(pages_out, 'wb')
> +
> +		for pm in pms['entries']:
> +			if pm.has_key('pages_id'):
> +				new.append(pm)
> +				continue
> +			vaddr = pm['vaddr']
> +			i = 0
> +			start = 0
> +			while vaddr < pm['vaddr'] + (pm['nr_pages'] * 0x1000):
> +				page_buffer = pages_in.read(ps)

Can we seek in the image if the data in question will be skipped?
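
A minimal sketch of the seek variant, assuming the same hardcoded page
size as the patch. copy_kept_pages() and its is_lazy callback are
hypothetical names. Unlike the patch, this sketch emits one pagemap entry
per contiguous run of kept pages; that is a choice of the sketch, made so
that an entry never spans a removed page.

```python
import io

PAGE_SIZE = 0x1000  # same hardcoded page size as in the patch


def copy_kept_pages(pages_in, pages_out, vaddr, nr_pages, is_lazy):
    """Copy the non-lazy pages of one pagemap entry, seeking past lazy ones.

    Returns the new pagemap entries, one per contiguous run of kept pages.
    """
    runs = []
    for n in range(nr_pages):
        addr = vaddr + n * PAGE_SIZE
        if is_lazy(addr):
            # Skip the page data without reading it.
            pages_in.seek(PAGE_SIZE, io.SEEK_CUR)
            continue
        pages_out.write(pages_in.read(PAGE_SIZE))
        last = runs[-1] if runs else None
        if last and last['vaddr'] + last['nr_pages'] * PAGE_SIZE == addr:
            last['nr_pages'] += 1  # extends the current run
        else:
            runs.append({'vaddr': addr, 'nr_pages': 1})
    return runs
```

Both file arguments only need read(), write() and seek(), so the function
can be tried on io.BytesIO objects before pointing it at real pages-%d.img
files.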

> +				if vaddr not in lazy_pages:
> +					if start == 0:
> +						start = vaddr
> +					i +=1
> +					pages_out.write(page_buffer)
> +				vaddr += ps
> +			if i != 0:
> +				new.append({'nr_pages': i, 'vaddr': start})
> +		pms['entries'] = new
> +		pycriu.images.dump(pms, open(os.path.join(opts['outdir'], 'pagemap-%d.img' % pid), 'w+'))
> +
>  def main():
>  	desc = 'CRiu Image Tool'
>  	parser = argparse.ArgumentParser(description=desc,
> @@ -267,6 +336,15 @@ def main():
>  	show_parser.add_argument("in")
>  	show_parser.set_defaults(func=decode, pretty=True, out=None)
>  
> +	# Make Lazy
> +	lazy_parser = subparsers.add_parser('make-lazy',
> +			help = "remove memory pages from image which can be restored lazily")
> +	lazy_parser.add_argument('dir',
> +			help = "criu checkpoint directory used as input")
> +	lazy_parser.add_argument('outdir',
> +			help = "output directory for new pages.img and pagemap.img")
> +	lazy_parser.set_defaults(func=make_lazy)
> +
>  	opts = vars(parser.parse_args())
>  
>  	opts["func"](opts)
>
Adrian Reber May 11, 2016, 7:19 a.m.
On Fri, May 06, 2016 at 02:57:13PM +0300, Pavel Emelyanov wrote:
> Sorry for such a long delay in review. Here are my comments, inline.
> 
> On 04/29/2016 11:39 AM, Adrian Reber wrote:
> > From: Adrian Reber <areber@redhat.com>
> > 
> > This enables crit to remove all memory pages from a checkpoint
> > directory which can be lazily restored using userfaultfd. This
> > changes the pagemap.img and pages.img to no longer contain pages
> > which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
> > 
> > Usage:
> > 
> >  $ crit/crit make-lazy /tmp/4/ /tmp/5
> 
> We already have the pagemaps manipulation code called 'dedup'. And it
> sits in criu binary :) I don't insist in implementing make-lazy in criu
> too, crit looks better place, but any ideas how to make dedup and
> make-lazy be better aligned with each other?

Ah, I always thought dedup was only used during restore to remove
already restored pages from the checkpoint. I was not aware that it can
also run standalone to clean up checkpoints.

If I understand the code correctly the dedup option removes pages from
parent checkpoints (pre-dump) which are also present in the child
checkpoints, right?

This sounds indeed just like what I tried to do with crit with
make-lazy. Checkpoint manipulation sounds much easier in python but if
the code already exists for dedup I am also undecided. I think python
makes more sense as I also want to provide the possibility to undo the
'make-lazy' operation: Create a 'normal' checkpoint from the two parts
which result from 'make-lazy'. I will try to address your code comments
and send a second version of the patch.

		Adrian
Pavel Emelianov May 11, 2016, 1:06 p.m.
On 05/11/2016 10:19 AM, Adrian Reber wrote:
> On Fri, May 06, 2016 at 02:57:13PM +0300, Pavel Emelyanov wrote:
>> Sorry for such a long delay in review. Here are my comments, inline.
>>
>> On 04/29/2016 11:39 AM, Adrian Reber wrote:
>>> From: Adrian Reber <areber@redhat.com>
>>>
>>> This enables crit to remove all memory pages from a checkpoint
>>> directory which can be lazily restored using userfaultfd. This
>>> changes the pagemap.img and pages.img to no longer contain pages
>>> which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
>>>
>>> Usage:
>>>
>>>  $ crit/crit make-lazy /tmp/4/ /tmp/5
>>
>> We already have the pagemaps manipulation code called 'dedup'. And it
>> sits in criu binary :) I don't insist in implementing make-lazy in criu
>> too, crit looks better place, but any ideas how to make dedup and
>> make-lazy be better aligned with each other?
> 
> Ah, I always thought dedup was only used during restore to remove
> already restored pages from the checkpoint. I was not aware that it can
> also run standalone to clean up checkpoints.
> 
> If I understand the code correctly the dedup option removes pages from
> parent checkpoints (pre-dump) which are also present in the child
> checkpoints, right?

Yes.

> This sounds indeed just like what I tried to do with crit with
> make-lazy. Checkpoint manipulation sounds much easier in python but if
> the code already exists for dedup I am also undecided. I think python
> makes more sense as I also want to provide the possibility to undo the
> 'make-lazy' operation: Create a 'normal' checkpoint from the two parts
> which result from 'make-lazy'. I will try to address your code comments
> and send a second version of the patch.

So you have an idea how to clean up all this mess we've made with dedup :D
Thanks heaps!!!

-- Pavel
Adrian Reber May 11, 2016, 5:31 p.m.
On Wed, May 11, 2016 at 04:06:48PM +0300, Pavel Emelyanov wrote:
> On 05/11/2016 10:19 AM, Adrian Reber wrote:
> > On Fri, May 06, 2016 at 02:57:13PM +0300, Pavel Emelyanov wrote:
> >> Sorry for such a long delay in review. Here are my comments, inline.
> >>
> >> On 04/29/2016 11:39 AM, Adrian Reber wrote:
> >>> From: Adrian Reber <areber@redhat.com>
> >>>
> >>> This enables crit to remove all memory pages from a checkpoint
> >>> directory which can be lazily restored using userfaultfd. This
> >>> changes the pagemap.img and pages.img to no longer contain pages
> >>> which can be handled by userfaultfd (MAP_PRIVATE && MAP_ANON).
> >>>
> >>> Usage:
> >>>
> >>>  $ crit/crit make-lazy /tmp/4/ /tmp/5
> >>
> >> We already have the pagemaps manipulation code called 'dedup'. And it
> >> sits in criu binary :) I don't insist in implementing make-lazy in criu
> >> too, crit looks better place, but any ideas how to make dedup and
> >> make-lazy be better aligned with each other?
> > 
> > Ah, I always thought dedup was only used during restore to remove
> > already restored pages from the checkpoint. I was not aware that it can
> > also run standalone to clean up checkpoints.
> > 
> > If I understand the code correctly the dedup option removes pages from
> > parent checkpoints (pre-dump) which are also present in the child
> > checkpoints, right?
> 
> Yes.
> 
> > This sounds indeed just like what I tried to do with crit with
> > make-lazy. Checkpoint manipulation sounds much easier in python but if
> > the code already exists for dedup I am also undecided. I think python
> > makes more sense as I also want to provide the possibility to undo the
> > 'make-lazy' operation: Create a 'normal' checkpoint from the two parts
> > which result from 'make-lazy'. I will try to address your code comments
> > and send a second version of the patch.
> 
> So you have an idea how to clean up all this mess we've made with dedup :D
> Thanks heaps!!!

No, no, no ;-) I cannot clean up _all_ that mess, I try to start with my
mess and let's see if this leads to more cleanups.

		Adrian