[Devel,criu,2/6] action-scripts: Add "pre-resume" stage

Submitted by Cyrill Gorcunov on Feb. 7, 2017, 9:05 p.m.

Details

Message ID 1486501541-20795-3-git-send-email-gorcunov@virtuozzo.com
State New
Series "Rework memory limits restore"

Commit Message

Cyrill Gorcunov Feb. 7, 2017, 9:05 p.m.
The main idea is to be able to operate on the container
at the moment when its processes and resources are
already restored but the processes are not yet in the
running state, i.e. just before we kick them.

Besides the need to tune up beancounters (which is a
vz7-specific feature), this might be useful for running
additional debug tests from the script.

We can't reuse the ACT_POST_RESUME action or move it, because
we can kill the restored processes in the post-resume stage
and resume them on the source side, as avagin@ explained.

https://jira.sw.ru/browse/PSBM-58742

Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
---
 criu/action-scripts.c         | 1 +
 criu/cr-restore.c             | 4 ++++
 criu/include/action-scripts.h | 1 +
 3 files changed, 6 insertions(+)

Patch

diff --git a/criu/action-scripts.c b/criu/action-scripts.c
index 380b05a..1dd6bb2 100644
--- a/criu/action-scripts.c
+++ b/criu/action-scripts.c
@@ -24,6 +24,7 @@  static const char *action_names[ACT_MAX] = {
 	[ ACT_NET_UNLOCK ]	= "network-unlock",
 	[ ACT_SETUP_NS ]	= "setup-namespaces",
 	[ ACT_POST_SETUP_NS ]	= "post-setup-namespaces",
+	[ ACT_PRE_RESUME ]	= "pre-resume",
 	[ ACT_POST_RESUME ]	= "post-resume",
 	[ ACT_POST_NET_LOCK ]	= "post-network-lock",
 };
diff --git a/criu/cr-restore.c b/criu/cr-restore.c
index 2e84ed3..c3db8bd 100644
--- a/criu/cr-restore.c
+++ b/criu/cr-restore.c
@@ -1948,6 +1948,10 @@  static int restore_root_task(struct pstree_item *init)
 	if (ret == 0)
 		finalize_restore();
 
+	ret = run_scripts(ACT_PRE_RESUME);
+	if (ret)
+		pr_err("Pre-resume script ret code %d\n", ret);
+
 	if (restore_freezer_state())
 		pr_err("Unable to restore freezer state\n");
 
diff --git a/criu/include/action-scripts.h b/criu/include/action-scripts.h
index e49f57c..13fd011 100644
--- a/criu/include/action-scripts.h
+++ b/criu/include/action-scripts.h
@@ -11,6 +11,7 @@  enum script_actions {
 	ACT_SETUP_NS,
 	ACT_POST_SETUP_NS,
 	ACT_POST_RESUME,
+	ACT_PRE_RESUME,
 	ACT_POST_NET_LOCK,
 
 	ACT_MAX
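
A note on the ordering in the hunks above: `ACT_PRE_RESUME` is listed before `ACT_POST_RESUME` in `action_names[]` but after it in `enum script_actions`. Since the table uses designated initializers, the name lookup should not be affected by that. A reduced sketch, assuming an enum trimmed to the values visible in the hunks; `action_name()` is a hypothetical helper written for this illustration, not a CRIU function:

```c
#include <stddef.h>

/* Trimmed copy of the enum from the hunk; values outside the hunk are omitted. */
enum script_actions {
	ACT_SETUP_NS,
	ACT_POST_SETUP_NS,
	ACT_POST_RESUME,
	ACT_PRE_RESUME,		/* numerically after ACT_POST_RESUME */
	ACT_POST_NET_LOCK,

	ACT_MAX
};

/*
 * Designated initializers bind each string to its enum value,
 * so the textual order of the entries does not matter.
 */
static const char *action_names[ACT_MAX] = {
	[ ACT_SETUP_NS ]	= "setup-namespaces",
	[ ACT_POST_SETUP_NS ]	= "post-setup-namespaces",
	[ ACT_PRE_RESUME ]	= "pre-resume",
	[ ACT_POST_RESUME ]	= "post-resume",
	[ ACT_POST_NET_LOCK ]	= "post-network-lock",
};

/* Hypothetical helper: returns NULL for out-of-range values. */
const char *action_name(enum script_actions act)
{
	if ((int)act < 0 || act >= ACT_MAX)
		return NULL;
	return action_names[act];
}
```

The only thing the enum position changes is the numeric value of the constants, which matters only if something depends on those values staying stable.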

Comments

Andrey Vagin Feb. 8, 2017, 8:37 p.m.
On Wed, Feb 08, 2017 at 12:05:37AM +0300, Cyrill Gorcunov wrote:
> The main idea is to be able to operate on the container
> at the moment when its processes and resources are
> already restored but the processes are not yet in the
> running state, i.e. just before we kick them.
> 
> Besides the need to tune up beancounters (which is a
> vz7-specific feature), this might be useful for running
> additional debug tests from the script.

What do we do if we fail to tune up the beancounters?
At this point we can't resume the container on the source host,
because its network has already been unlocked...
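
To make the concern concrete: as far as the posted hunk shows, a non-zero return from the pre-resume scripts is logged and the code then continues on to `restore_freezer_state()`. A reduced model of that control flow follows. It is a sketch only: `run_scripts_stub()`, `resume_tasks()` and the `thawed` flag are hypothetical stand-ins, and nothing here is CRIU code.

```c
#include <stdio.h>

int thawed;	/* set once the model has "resumed" the tasks */

/* Stub for run_scripts(ACT_PRE_RESUME): returns the script's exit code. */
static int run_scripts_stub(int script_exit_code)
{
	return script_exit_code;
}

/* Mirrors the shape of the hunk: log the failure, then carry on. */
void resume_tasks(int script_exit_code)
{
	int ret = run_scripts_stub(script_exit_code);

	if (ret)
		fprintf(stderr, "Pre-resume script ret code %d\n", ret);

	/* Stands in for restore_freezer_state() and kicking the tasks. */
	thawed = 1;
}
```

In this model the tasks are resumed whether or not the script succeeded, which is why the question of what to do on failure matters.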

Cyrill Gorcunov Feb. 8, 2017, 9:01 p.m.
On Wed, Feb 08, 2017 at 12:37:18PM -0800, Andrey Vagin wrote:
> On Wed, Feb 08, 2017 at 12:05:37AM +0300, Cyrill Gorcunov wrote:
> > The main idea is to be able to operate on the container
> > at the moment when its processes and resources are
> > already restored but the processes are not yet in the
> > running state, i.e. just before we kick them.
> > 
> > Besides the need to tune up beancounters (which is a
> > vz7-specific feature), this might be useful for running
> > additional debug tests from the script.
> 
> What do we do if we fail to tune up the beancounters?
> At this point we can't resume the container on the source host,
> because its network has already been unlocked...

I think this action must never fail. If it does -- we need
to investigate carefully why :/ I simply don't see another
option at the moment, but I'm open to ideas.
Andrey Vagin Feb. 9, 2017, 11:29 p.m.
On Thu, Feb 09, 2017 at 12:01:20AM +0300, Cyrill Gorcunov wrote:
> On Wed, Feb 08, 2017 at 12:37:18PM -0800, Andrey Vagin wrote:
> > On Wed, Feb 08, 2017 at 12:05:37AM +0300, Cyrill Gorcunov wrote:
> > > The main idea is to be able to operate on the container
> > > at the moment when its processes and resources are
> > > already restored but the processes are not yet in the
> > > running state, i.e. just before we kick them.
> > > 
> > > Besides the need to tune up beancounters (which is a
> > > vz7-specific feature), this might be useful for running
> > > additional debug tests from the script.
> > 
> > What do we do if we fail to tune up the beancounters?
> > At this point we can't resume the container on the source host,
> > because its network has already been unlocked...
> 
> I think this action must never fail. If it does -- we need
> to investigate carefully why :/ I simply don't see another
> option at the moment, but I'm open to ideas.

Why can we not use POST_RESTORE?
Cyrill Gorcunov Feb. 10, 2017, 7:14 a.m.
On Thu, Feb 09, 2017 at 03:29:15PM -0800, Andrey Vagin wrote:
> > I think this action must never fail. If it does -- we need
> > to investigate carefully why :/ I simply don't see another
> > option at the moment, but I'm open to ideas.
> 
> Why can we not use POST_RESTORE?

post-restore is too early, we're still operating in the container
after this script. What we need is a stage where everything
is ready to go and we don't need any other resources allocated
inside the container.

	Cyrill
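
The ordering discussed in this thread can be summarised as a small table. This is a reduced model, not CRIU code: entries that are not action-script names are descriptive labels made up for this sketch, `stage_index()` is a hypothetical helper, and the placement of `post-restore` and `network-unlock` ahead of `pre-resume` is inferred from the comments above rather than taken from the source tree.

```c
#include <string.h>

/* Order of events near the end of restore, as described in the thread. */
static const char *const restore_tail[] = {
	"post-restore",		/* too early: CRIU still works inside the container */
	"network-unlock",	/* after this, falling back to the source host is a problem */
	"finalize-restore",	/* label for finalize_restore() in the hunk */
	"pre-resume",		/* new hook: everything restored, tasks not yet running */
	"restore-freezer-state",	/* label for restore_freezer_state() in the hunk */
	"post-resume",		/* existing hook, after the tasks have been kicked */
};

/* Returns the position of a stage in the sequence, or -1 if unknown. */
int stage_index(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(restore_tail) / sizeof(restore_tail[0]); i++)
		if (strcmp(restore_tail[i], name) == 0)
			return (int)i;
	return -1;
}
```

Read this way, `pre-resume` sits in the window after the network is unlocked and before the tasks run, which is both why it suits the beancounter tuning and why a failure there is hard to roll back.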