ve/mount: allow pseudosuper to temporarily exceed the container limit

Submitted by Pavel Tikhomirov on July 13, 2018, 11:48 a.m.

Details

Message ID 20180713114831.16981-1-ptikhomirov@virtuozzo.com
State New
Series "ve/mount: allow pseudosuper to temporarily exceed the container limit"

Commit Message

Pavel Tikhomirov July 13, 2018, 11:48 a.m.
The CRIU algorithm (prepare_mnt_ns) is:
1) Restore all mounts of the CT (from all of its mntnses) in a single
temporary mount namespace.
2) For each mount namespace of the container, recreate its mounts:
 a) Unshare the temporary mntns (the mounts are doubled)
 b) Remove all excess mounts with pivot_root

So at some point we have many mntnses of the CT already created with
their mounts, plus two temporary mount namespaces holding copies of the
mounts. That is ~3x the mounts (and possibly some additional temporary
mounts as well).

When we restore a CT with more than sysctl_ve_mount_nr / 3 mounts, we hit
the limit and fail. Fix it by ignoring the limit at the restore stage.
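
The ~3x estimate above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: peak_mounts_during_restore() and restore_hits_limit() are hypothetical helpers, not CRIU or kernel code, and the model ignores the additional temporary mounts mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not CRIU code: estimate the worst-case number of
 * mounts charged to the container while its mount namespaces are restored.
 *
 * ct_mounts: total mounts of the container across all of its mntnses.
 */
static long peak_mounts_during_restore(long ct_mounts)
{
	assert(ct_mounts >= 0);

	long restored = ct_mounts;   /* final mntnses, nearly all already created */
	long temp_tree = ct_mounts;  /* step 1: every mount in one temporary mntns */
	long temp_copy = ct_mounts;  /* step 2a: unshare doubles the temporary tree */

	/* ~3x the container's mounts; extra temporary mounts are ignored */
	return restored + temp_tree + temp_copy;
}

/* True if the modeled peak exceeds the limit, i.e. some mount would be refused */
static bool restore_hits_limit(long ct_mounts, long mount_limit)
{
	return peak_mounts_during_restore(ct_mounts) > mount_limit;
}
```

For example, with an illustrative limit of 4096 (an assumed value, not taken from the patch), a container with 1400 mounts peaks at about 4200 mounts in this model and would fail to restore, even though its own 1400 mounts are well under the limit.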

https://jira.sw.ru/browse/PSBM-86511
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
---
 fs/namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/fs/namespace.c b/fs/namespace.c
index 83624d7a0def..7ffb6398f1da 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2504,7 +2504,7 @@  static inline int ve_mount_allowed(void)
 {
 	struct ve_struct *ve = get_exec_env();
 
-	return ve_is_super(ve) ||
+	return ve_is_super(ve) || ve->is_pseudosuper ||
 		atomic_read(&ve->mnt_nr) < (int)sysctl_ve_mount_nr;
 }
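
The effect of the one-line change can be sketched as a standalone userspace model. Everything here is an assumption made for illustration: struct ve_model and ve_mount_allowed_model() are hypothetical stand-ins for struct ve_struct and ve_mount_allowed(), the atomic counter is replaced by a plain int, and the limit is passed as a parameter instead of being read from sysctl_ve_mount_nr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of struct ve_struct used by the check */
struct ve_model {
	bool is_super;        /* models ve_is_super(ve) */
	bool is_pseudosuper;  /* models ve->is_pseudosuper */
	int mnt_nr;           /* models atomic_read(&ve->mnt_nr) */
};

/* Userspace model of ve_mount_allowed() with the patch applied */
static bool ve_mount_allowed_model(const struct ve_model *ve, int mount_limit)
{
	assert(ve != NULL);

	/* super and pseudosuper bypass the limit; others must stay below it */
	return ve->is_super || ve->is_pseudosuper ||
		ve->mnt_nr < mount_limit;
}
```

In this model, a container that has already reached the limit is refused a new mount unless is_pseudosuper is set, which matches the commit message's intent of ignoring the limit at the restore stage.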
 

Comments

Cyrill Gorcunov July 13, 2018, 12:05 p.m.
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>