cr-service: require non-cooperative userfaultfd for lazy-pages

Submitted by Mike Rapoport on Aug. 9, 2017, 5:41 a.m.

Details

Message ID 1502257312-7959-1-git-send-email-rppt@linux.vnet.ibm.com
State New
Series "cr-service: require non-cooperative userfaultfd for lazy-pages"
Headers show

Commit Message

Mike Rapoport Aug. 9, 2017, 5:41 a.m.
Without non-cooperative userfaultfd some programs may fail during lazy
restore because they perform operations that cannot be handled by the
lazy-pages daemon.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---

I've noticed that we don't really have RPC support for lazy-pages. I've
rechecked Adrian's pull request to runc and there is a requirement to run
lazy-pages daemon manually.
Do we want to add an RPC interface for lazy-pages?
Another possibility is to fork() lazy-pages from restore...

 criu/cr-check.c     |  2 +-
 criu/cr-service.c   | 19 ++++---------------
 criu/include/uffd.h |  2 ++
 3 files changed, 7 insertions(+), 16 deletions(-)

Patch hide | download patch | download mbox

diff --git a/criu/cr-check.c b/criu/cr-check.c
index b1de144..24a94c1 100644
--- a/criu/cr-check.c
+++ b/criu/cr-check.c
@@ -1059,7 +1059,7 @@  static int check_uffd(void)
 	return 0;
 }
 
-static int check_uffd_noncoop(void)
+int check_uffd_noncoop(void)
 {
 	unsigned long features = UFFD_FEATURE_EVENT_FORK |
 		UFFD_FEATURE_EVENT_REMAP |
diff --git a/criu/cr-service.c b/criu/cr-service.c
index 18ba2fd..1e83716 100644
--- a/criu/cr-service.c
+++ b/criu/cr-service.c
@@ -40,6 +40,7 @@ 
 #include <sys/un.h>
 #include <sys/socket.h>
 #include "common/scm.h"
+#include "uffd.h"
 
 #include "setproctitle.h"
 
@@ -895,22 +896,10 @@  static int handle_feature_check(int sk, CriuReq * msg)
 				feat.mem_track = false;
 		}
 
+		/* For lazy-pages we need non-coopetative userfaultfd */
 		if ((msg->features->has_lazy_pages == 1) &&
-		    (msg->features->lazy_pages == true)) {
-			ret = kerndat_uffd();
-
-			/*
-			 * Not checking for specific UFFD features yet.
-			 * If no error is returned it is probably
-			 * enough for basic UFFD functionality. This can
-			 * be extended in the future for a more detailed
-			 * UFFD feature check.
-			 */
-			if (ret || !kdat.has_uffd)
-				feat.lazy_pages = false;
-			else
-				feat.lazy_pages = true;
-		}
+		    (msg->features->lazy_pages == true))
+			feat.lazy_pages = !check_uffd_noncoop();
 
 		resp.features = &feat;
 		resp.type = msg->type;
diff --git a/criu/include/uffd.h b/criu/include/uffd.h
index 4d790ce..653041f 100644
--- a/criu/include/uffd.h
+++ b/criu/include/uffd.h
@@ -6,4 +6,6 @@  extern int setup_uffd(int pid, struct task_restore_args *task_args);
 extern int lazy_pages_setup_zombie(int pid);
 extern int prepare_lazy_pages_socket(void);
 
+extern int check_uffd_noncoop(void);
+
 #endif /* __CR_UFFD_H_ */

Comments

Adrian Reber Aug. 9, 2017, 7 a.m.
Acked-by: Adrian Reber <areber@redhat.com>

On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> Without non-cooperative userfaultfd some programs may fail during lazy
> restore because they perform operations that cannot be handled by the
> lazy-pages daemon.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
> ---
> 
> I've noticed that we don't really have RPC support for lazy-pages. I've
> rechecked Adrian's pull request to runc and there is a requirement to run
> lazy-pages daemon manually.
> Do we want to add an RPC interface for lazy-pages?
> Another possibility is to fork() lazy-pages from restore...
> 
>  criu/cr-check.c     |  2 +-
>  criu/cr-service.c   | 19 ++++---------------
>  criu/include/uffd.h |  2 ++
>  3 files changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/criu/cr-check.c b/criu/cr-check.c
> index b1de144..24a94c1 100644
> --- a/criu/cr-check.c
> +++ b/criu/cr-check.c
> @@ -1059,7 +1059,7 @@ static int check_uffd(void)
>  	return 0;
>  }
>  
> -static int check_uffd_noncoop(void)
> +int check_uffd_noncoop(void)
>  {
>  	unsigned long features = UFFD_FEATURE_EVENT_FORK |
>  		UFFD_FEATURE_EVENT_REMAP |
> diff --git a/criu/cr-service.c b/criu/cr-service.c
> index 18ba2fd..1e83716 100644
> --- a/criu/cr-service.c
> +++ b/criu/cr-service.c
> @@ -40,6 +40,7 @@
>  #include <sys/un.h>
>  #include <sys/socket.h>
>  #include "common/scm.h"
> +#include "uffd.h"
>  
>  #include "setproctitle.h"
>  
> @@ -895,22 +896,10 @@ static int handle_feature_check(int sk, CriuReq * msg)
>  				feat.mem_track = false;
>  		}
>  
> +		/* For lazy-pages we need non-coopetative userfaultfd */
>  		if ((msg->features->has_lazy_pages == 1) &&
> -		    (msg->features->lazy_pages == true)) {
> -			ret = kerndat_uffd();
> -
> -			/*
> -			 * Not checking for specific UFFD features yet.
> -			 * If no error is returned it is probably
> -			 * enough for basic UFFD functionality. This can
> -			 * be extended in the future for a more detailed
> -			 * UFFD feature check.
> -			 */
> -			if (ret || !kdat.has_uffd)
> -				feat.lazy_pages = false;
> -			else
> -				feat.lazy_pages = true;
> -		}
> +		    (msg->features->lazy_pages == true))
> +			feat.lazy_pages = !check_uffd_noncoop();
>  
>  		resp.features = &feat;
>  		resp.type = msg->type;
> diff --git a/criu/include/uffd.h b/criu/include/uffd.h
> index 4d790ce..653041f 100644
> --- a/criu/include/uffd.h
> +++ b/criu/include/uffd.h
> @@ -6,4 +6,6 @@ extern int setup_uffd(int pid, struct task_restore_args *task_args);
>  extern int lazy_pages_setup_zombie(int pid);
>  extern int prepare_lazy_pages_socket(void);
>  
> +extern int check_uffd_noncoop(void);
> +
>  #endif /* __CR_UFFD_H_ */
> -- 
> 2.7.4
>
Adrian Reber Aug. 9, 2017, 7:21 a.m.
On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> I've noticed that we don't really have RPC support for lazy-pages. I've
> rechecked Adrian's pull request to runc and there is a requirement to run
> lazy-pages daemon manually.
> Do we want to add an RPC interface for lazy-pages?
> Another possibility is to fork() lazy-pages from restore...

I think fork()-ing makes sense. Except for debug situations there is no
real reason to start the lazy-pages daemon manually.

It would still be nice to have a way to start it manually, though. But
the default should be that it fork()s.

		Adrian
Mike Rapoport Aug. 10, 2017, 9:59 a.m.
Pavel,

Any comments?

On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> > I've noticed that we don't really have RPC support for lazy-pages. I've
> > rechecked Adrian's pull request to runc and there is a requirement to run
> > lazy-pages daemon manually.
> > Do we want to add an RPC interface for lazy-pages?
> > Another possibility is to fork() lazy-pages from restore...
> 
> I think fork()-ing makes sense. Except for debug situations there is no
> real reason to start the lazy-pages daemon manually.
> 
> It would still be nice to have a way to start it manually, though. But
> the default should be that it fork()s.
> 
> 		Adrian
>
Pavel Emelyanov Aug. 10, 2017, 11:40 a.m.
On 08/10/2017 12:59 PM, Mike Rapoport wrote:
> Pavel,
> 
> Any comments?

Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
only thing that I'd like to see as incremental fix is saving the uffd and
uffd-non-coop on kdat cache and using these values everywhere.

Said that

Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>

-- Pavel

> On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
>> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
>>> I've noticed that we don't really have RPC support for lazy-pages. I've
>>> rechecked Adrian's pull request to runc and there is a requirement to run
>>> lazy-pages daemon manually.
>>> Do we want to add an RPC interface for lazy-pages?
>>> Another possibility is to fork() lazy-pages from restore...
>>
>> I think fork()-ing makes sense. Except for debug situations there is no
>> real reason to start the lazy-pages daemon manually.
>>
>> It would still be nice to have a way to start it manually, though. But
>> the default should be that it fork()s.
>>
>> 		Adrian
>>
>
Mike Rapoport Aug. 10, 2017, 12:05 p.m.
On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
> > Pavel,
> > 
> > Any comments?
> 
> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
> only thing that I'd like to see as incremental fix is saving the uffd and
> uffd-non-coop on kdat cache and using these values everywhere.

The kdat fields haven't changed. The only change is how cr-check and
cr-service treat them.
 
> Said that
> 
> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
> 
> -- Pavel
> 
> > On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
> >> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> >>> I've noticed that we don't really have RPC support for lazy-pages. I've
> >>> rechecked Adrian's pull request to runc and there is a requirement to run
> >>> lazy-pages daemon manually.
> >>> Do we want to add an RPC interface for lazy-pages?
> >>> Another possibility is to fork() lazy-pages from restore...
> >>
> >> I think fork()-ing makes sense. Except for debug situations there is no
> >> real reason to start the lazy-pages daemon manually.
> >>
> >> It would still be nice to have a way to start it manually, though. But
> >> the default should be that it fork()s.

And what about fork()ing lazy-pages when restore is run with --lazy-pages?

> >> 		Adrian
> >>
> > 
>
Pavel Emelyanov Aug. 10, 2017, 1:23 p.m.
On 08/10/2017 03:05 PM, Mike Rapoport wrote:
> On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
>> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
>>> Pavel,
>>>
>>> Any comments?
>>
>> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
>> only thing that I'd like to see as incremental fix is saving the uffd and
>> uffd-non-coop on kdat cache and using these values everywhere.
> 
> The kdat fields haven't changed. The only change is how cr-check and
> cr-service treat them.

I mean -- handle_feature_check calls check_uffd_noncoop which does testing
of the feature, while instead it should just read the kdat.something field :)

>> Said that
>>
>> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
>>
>> -- Pavel
>>
>>> On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
>>>> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
>>>>> I've noticed that we don't really have RPC support for lazy-pages. I've
>>>>> rechecked Adrian's pull request to runc and there is a requirement to run
>>>>> lazy-pages daemon manually.
>>>>> Do we want to add an RPC interface for lazy-pages?
>>>>> Another possibility is to fork() lazy-pages from restore...
>>>>
>>>> I think fork()-ing makes sense. Except for debug situations there is no
>>>> real reason to start the lazy-pages daemon manually.
>>>>
>>>> It would still be nice to have a way to start it manually, though. But
>>>> the default should be that it fork()s.
> 
> And what about fork()ing lazy-pages when restore is run with --lazy-pages?

Sorry, missed that :) I think fork-ing is OK, but we should somehow decide
on when restore fork()-s the lazy daemon and when connects to whatever is
there in the work dir.

Say, if there's a socket -- connect. Otherwise -- fork. If this is the way to
go what to do if there's a stale socket left from previous lazy daemon run?
Bail out or still fork()?

-- Pavel
Mike Rapoport Aug. 10, 2017, 2:02 p.m.
On Thu, Aug 10, 2017 at 04:23:57PM +0300, Pavel Emelyanov wrote:
> On 08/10/2017 03:05 PM, Mike Rapoport wrote:
> > On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
> >> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
> >>> Pavel,
> >>>
> >>> Any comments?
> >>
> >> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
> >> only thing that I'd like to see as incremental fix is saving the uffd and
> >> uffd-non-coop on kdat cache and using these values everywhere.
> > 
> > The kdat fields haven't changed. The only change is how cr-check and
> > cr-service treat them.
> 
> I mean -- handle_feature_check calls check_uffd_noncoop which does testing
> of the feature, while instead it should just read the kdat.something field :)

Maybe that's because cr-check was not updated when kdat cache got merged? ;-)

> >> Said that
> >>
> >> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
> >>
> >> -- Pavel
> >>
> >>> On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
> >>>> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> >>>>> I've noticed that we don't really have RPC support for lazy-pages. I've
> >>>>> rechecked Adrian's pull request to runc and there is a requirement to run
> >>>>> lazy-pages daemon manually.
> >>>>> Do we want to add an RPC interface for lazy-pages?
> >>>>> Another possibility is to fork() lazy-pages from restore...
> >>>>
> >>>> I think fork()-ing makes sense. Except for debug situations there is no
> >>>> real reason to start the lazy-pages daemon manually.
> >>>>
> >>>> It would still be nice to have a way to start it manually, though. But
> >>>> the default should be that it fork()s.
> > 
> > And what about fork()ing lazy-pages when restore is run with --lazy-pages?
> 
> Sorry, missed that :) I think fork-ing is OK, but we should somehow decide
> on when restore fork()-s the lazy daemon and when connects to whatever is
> there in the work dir.
> 
> Say, if there's a socket -- connect. Otherwise -- fork. If this is the way to
> go what to do if there's a stale socket left from previous lazy daemon run?
> Bail out or still fork()?

What about:

* If there's a socket -- connect
* If Connect failed:
	- re-create the socket
	- fork and let lazy-pages inherit the socket

> -- Pavel
>
Pavel Emelyanov Aug. 10, 2017, 3:35 p.m.
On 08/10/2017 05:02 PM, Mike Rapoport wrote:
> On Thu, Aug 10, 2017 at 04:23:57PM +0300, Pavel Emelyanov wrote:
>> On 08/10/2017 03:05 PM, Mike Rapoport wrote:
>>> On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
>>>> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
>>>>> Pavel,
>>>>>
>>>>> Any comments?
>>>>
>>>> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
>>>> only thing that I'd like to see as incremental fix is saving the uffd and
>>>> uffd-non-coop on kdat cache and using these values everywhere.
>>>
>>> The kdat fields haven't changed. The only change is how cr-check and
>>> cr-service treat them.
>>
>> I mean -- handle_feature_check calls check_uffd_noncoop which does testing
>> of the feature, while instead it should just read the kdat.something field :)
> 
> Maybe that's because cr-check was not updated when kdat cache got merged? ;-)

Maybe, but the handle_feature_check is (should be) called with kdat
cache loaded.

>>>> Said that
>>>>
>>>> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
>>>>
>>>> -- Pavel
>>>>
>>>>> On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
>>>>>> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
>>>>>>> I've noticed that we don't really have RPC support for lazy-pages. I've
>>>>>>> rechecked Adrian's pull request to runc and there is a requirement to run
>>>>>>> lazy-pages daemon manually.
>>>>>>> Do we want to add an RPC interface for lazy-pages?
>>>>>>> Another possibility is to fork() lazy-pages from restore...
>>>>>>
>>>>>> I think fork()-ing makes sense. Except for debug situations there is no
>>>>>> real reason to start the lazy-pages daemon manually.
>>>>>>
>>>>>> It would still be nice to have a way to start it manually, though. But
>>>>>> the default should be that it fork()s.
>>>
>>> And what about fork()ing lazy-pages when restore is run with --lazy-pages?
>>
>> Sorry, missed that :) I think fork-ing is OK, but we should somehow decide
>> on when restore fork()-s the lazy daemon and when connects to whatever is
>> there in the work dir.
>>
>> Say, if there's a socket -- connect. Otherwise -- fork. If this is the way to
>> go what to do if there's a stale socket left from previous lazy daemon run?
>> Bail out or still fork()?
> 
> What about:
> 
> * If there's a socket -- connect
> * If Connect failed:
> 	- re-create the socket
> 	- fork and let lazy-pages inherit the socket

O_o  If connect failed ... for any reason? I don't know, this sounds dangerous.
I'd make it simple -- socket -> connect and fail if fail, no socket -> fork.

-- Pavel
Mike Rapoport Aug. 13, 2017, 7:14 a.m.
On Thu, Aug 10, 2017 at 06:35:36PM +0300, Pavel Emelyanov wrote:
> On 08/10/2017 05:02 PM, Mike Rapoport wrote:
> > On Thu, Aug 10, 2017 at 04:23:57PM +0300, Pavel Emelyanov wrote:
> >> On 08/10/2017 03:05 PM, Mike Rapoport wrote:
> >>> On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
> >>>> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
> >>>>> Pavel,
> >>>>>
> >>>>> Any comments?
> >>>>
> >>>> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
> >>>> only thing that I'd like to see as incremental fix is saving the uffd and
> >>>> uffd-non-coop on kdat cache and using these values everywhere.
> >>>
> >>> The kdat fields haven't changed. The only change is how cr-check and
> >>> cr-service treat them.
> >>
> >> I mean -- handle_feature_check calls check_uffd_noncoop which does testing
> >> of the feature, while instead it should just read the kdat.something field :)
> > 
> > Maybe that's because cr-check was not updated when kdat cache got merged? ;-)
> 
> Maybe, but the handle_feature_check is (should be) called with kdat
> cache loaded.

If I'm not much mistaken, cr-service does not init kdat at all:

~/git/criu$ git grep kerndat_init
criu/cr-dump.c: if (kerndat_init())
criu/cr-dump.c: if (kerndat_init())
criu/cr-restore.c:      if (kerndat_init())
criu/include/kerndat.h:extern int kerndat_init(void);
criu/kerndat.c:int kerndat_init(void)

 
> >>>> Said that
> >>>>
> >>>> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
> >>>>
> >>>> -- Pavel
> >>>>
> >>>>> On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
> >>>>>> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> >>>>>>> I've noticed that we don't really have RPC support for lazy-pages. I've
> >>>>>>> rechecked Adrian's pull request to runc and there is a requirement to run
> >>>>>>> lazy-pages daemon manually.
> >>>>>>> Do we want to add an RPC interface for lazy-pages?
> >>>>>>> Another possibility is to fork() lazy-pages from restore...
> >>>>>>
> >>>>>> I think fork()-ing makes sense. Except for debug situations there is no
> >>>>>> real reason to start the lazy-pages daemon manually.
> >>>>>>
> >>>>>> It would still be nice to have a way to start it manually, though. But
> >>>>>> the default should be that it fork()s.
> >>>
> >>> And what about fork()ing lazy-pages when restore is run with --lazy-pages?
> >>
> >> Sorry, missed that :) I think fork-ing is OK, but we should somehow decide
> >> on when restore fork()-s the lazy daemon and when connects to whatever is
> >> there in the work dir.
> >>
> >> Say, if there's a socket -- connect. Otherwise -- fork. If this is the way to
> >> go what to do if there's a stale socket left from previous lazy daemon run?
> >> Bail out or still fork()?
> > 
> > What about:
> > 
> > * If there's a socket -- connect
> > * If Connect failed:
> > 	- re-create the socket
> > 	- fork and let lazy-pages inherit the socket
> 
> O_o  If connect failed ... for any reason? I don't know, this sounds dangerous.
> I'd make it simple -- socket -> connect and fail if fail, no socket -> fork.
> 
> -- Pavel
>
Pavel Emelyanov Aug. 14, 2017, 10:30 a.m.
On 08/13/2017 10:14 AM, Mike Rapoport wrote:
> On Thu, Aug 10, 2017 at 06:35:36PM +0300, Pavel Emelyanov wrote:
>> On 08/10/2017 05:02 PM, Mike Rapoport wrote:
>>> On Thu, Aug 10, 2017 at 04:23:57PM +0300, Pavel Emelyanov wrote:
>>>> On 08/10/2017 03:05 PM, Mike Rapoport wrote:
>>>>> On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
>>>>>> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
>>>>>>> Pavel,
>>>>>>>
>>>>>>> Any comments?
>>>>>>
>>>>>> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
>>>>>> only thing that I'd like to see as incremental fix is saving the uffd and
>>>>>> uffd-non-coop on kdat cache and using these values everywhere.
>>>>>
>>>>> The kdat fields haven't changed. The only change is how cr-check and
>>>>> cr-service treat them.
>>>>
>>>> I mean -- handle_feature_check calls check_uffd_noncoop which does testing
>>>> of the feature, while instead it should just read the kdat.something field :)
>>>
>>> Maybe that's because cr-check was not updated when kdat cache got merged? ;-)
>>
>> Maybe, but the handle_feature_check is (should be) called with kdat
>> cache loaded.
> 
> If I'm not much mistaken, cr-service does not init kdat at all:
> 
> ~/git/criu$ git grep kerndat_init
> criu/cr-dump.c: if (kerndat_init())
> criu/cr-dump.c: if (kerndat_init())
> criu/cr-restore.c:      if (kerndat_init())
> criu/include/kerndat.h:extern int kerndat_init(void);
> criu/kerndat.c:int kerndat_init(void)

With kdat cache it now can ;) Actually, with kdat cache we can load kdat right
in crtools.c before doing anything else.

-- Pavel
Andrei Vagin Sept. 5, 2017, 5:16 a.m.
On Thu, Aug 10, 2017 at 06:35:36PM +0300, Pavel Emelyanov wrote:
> On 08/10/2017 05:02 PM, Mike Rapoport wrote:
> > On Thu, Aug 10, 2017 at 04:23:57PM +0300, Pavel Emelyanov wrote:
> >> On 08/10/2017 03:05 PM, Mike Rapoport wrote:
> >>> On Thu, Aug 10, 2017 at 02:40:23PM +0300, Pavel Emelyanov wrote:
> >>>> On 08/10/2017 12:59 PM, Mike Rapoport wrote:
> >>>>> Pavel,
> >>>>>
> >>>>> Any comments?
> >>>>
> >>>> Oops, sorry :) Too many e-mails after vacation. Yes, I agree with that. The
> >>>> only thing that I'd like to see as incremental fix is saving the uffd and
> >>>> uffd-non-coop on kdat cache and using these values everywhere.
> >>>
> >>> The kdat fields haven't changed. The only change is how cr-check and
> >>> cr-service treat them.
> >>
> >> I mean -- handle_feature_check calls check_uffd_noncoop which does testing
> >> of the feature, while instead it should just read the kdat.something field :)
> > 
> > Maybe that's because cr-check was not updated when kdat cache got merged? ;-)
> 
> Maybe, but the handle_feature_check is (should be) called with kdat
> cache loaded.
> 
> >>>> Said that
> >>>>
> >>>> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
> >>>>
> >>>> -- Pavel
> >>>>
> >>>>> On Wed, Aug 09, 2017 at 09:21:31AM +0200, Adrian Reber wrote:
> >>>>>> On Wed, Aug 09, 2017 at 08:41:52AM +0300, Mike Rapoport wrote:
> >>>>>>> I've noticed that we don't really have RPC support for lazy-pages. I've
> >>>>>>> rechecked Adrian's pull request to runc and there is a requirement to run
> >>>>>>> lazy-pages daemon manually.
> >>>>>>> Do we want to add an RPC interface for lazy-pages?
> >>>>>>> Another possibility is to fork() lazy-pages from restore...

We can do this in case of rpc. Because in this case we can send
notifications when restore is completed  and when laze-pages daemon exits.

But I'm not sure about forking a lazy pages daemon from "criu restore".

> >>>>>>
> >>>>>> I think fork()-ing makes sense. Except for debug situations there is no
> >>>>>> real reason to start the lazy-pages daemon manually.
> >>>>>>
> >>>>>> It would still be nice to have a way to start it manually, though. But
> >>>>>> the default should be that it fork()s.
> >>>
> >>> And what about fork()ing lazy-pages when restore is run with --lazy-pages?
> >>
> >> Sorry, missed that :) I think fork-ing is OK, but we should somehow decide
> >> on when restore fork()-s the lazy daemon and when connects to whatever is
> >> there in the work dir.
> >>
> >> Say, if there's a socket -- connect. Otherwise -- fork. If this is the way to
> >> go what to do if there's a stale socket left from previous lazy daemon run?
> >> Bail out or still fork()?
> > 
> > What about:
> > 
> > * If there's a socket -- connect
> > * If Connect failed:
> > 	- re-create the socket
> > 	- fork and let lazy-pages inherit the socket
> 
> O_o  If connect failed ... for any reason? I don't know, this sounds dangerous.
> I'd make it simple -- socket -> connect and fail if fail, no socket -> fork.
> 
> -- Pavel