pipes: Restore pipe size via userns call

Submitted by Cyrill Gorcunov on Feb. 3, 2017, 6:28 p.m.

Details

Message ID 1486146503-23949-1-git-send-email-gorcunov@openvz.org
State Rejected
Series "pipes: Restore pipe size via userns call"
Headers show

Commit Message

Cyrill Gorcunov Feb. 3, 2017, 6:28 p.m.
From: Cyrill Gorcunov <gorcunov@virtuozzo.com>

In vanilla kernels starting from v4.4-6172-g759c011
pipes allocation is limited per user-namespace, and
what is more interesting is that if hard limit is
zero but softlimit is not, then it is possible to
create new pipes with one buffer but setting its
size if not allowed until we're having CAP_SYS_RESOURCE
and CAP_SYS_ADMIN caps granted.

So to be able to restore this resource we need
to use userns daemon.

Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
---
 criu/pipes.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

Patch hide | download patch | download mbox

diff --git a/criu/pipes.c b/criu/pipes.c
index 7be1d5f047a1..26aa2b4c359f 100644
--- a/criu/pipes.c
+++ b/criu/pipes.c
@@ -11,6 +11,7 @@ 
 #include "pipes.h"
 #include "util-pie.h"
 #include "autofs.h"
+#include "namespaces.h"
 
 #include "protobuf.h"
 #include "util.h"
@@ -142,6 +143,25 @@  static int mark_pipe_master(void *unused)
 
 static struct pipe_data_rst *pd_hash_pipes[PIPE_DATA_HASH_SIZE];
 
+typedef struct {
+	unsigned int	pipe_id;
+	size_t		size;
+} pipe_set_size_arg_t;
+
+static int pipe_set_size(void *arg, int fd, int pid)
+{
+	pipe_set_size_arg_t *p = arg;
+
+	pr_info("Restoring size %#zx for %#x\n", p->size, p->pipe_id);
+
+	if (fcntl(fd, F_SETPIPE_SZ, p->size) < 0) {
+		pr_perror("Can't restore pipe size");
+		return -1;
+	}
+
+	return 0;
+}
+
 int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash)
 {
 	int ret;
@@ -201,13 +221,11 @@  int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash
 out:
 	ret = 0;
 	if (pd->pde->has_size) {
-		pr_info("Restoring size %#x for %#x\n",
-				pd->pde->size, pd->pde->pipe_id);
-		ret = fcntl(pfd, F_SETPIPE_SZ, pd->pde->size);
-		if (ret < 0)
-			pr_perror("Can't restore pipe size");
-		else
-			ret = 0;
+		pipe_set_size_arg_t args = {
+			.pipe_id	= pd->pde->pipe_id,
+			.size		= (size_t)pd->pde->size,
+		};
+		ret = userns_call(pipe_set_size, UNS_ASYNC, &args, sizeof(args), pfd);
 	}
 err:
 	return ret;

Comments

Pavel Emelianov Feb. 6, 2017, 10:22 a.m.
On 02/03/2017 09:28 PM, Cyrill Gorcunov wrote:
> From: Cyrill Gorcunov <gorcunov@virtuozzo.com>
> 
> In vanilla kernels starting from v4.4-6172-g759c011
> pipes allocation is limited per user-namespace, and
> what is more interesting is that if hard limit is
> zero but softlimit is not, then it is possible to
> create new pipes with one buffer but setting its
> size if not allowed until we're having CAP_SYS_RESOURCE
> and CAP_SYS_ADMIN caps granted.
> 
> So to be able to restore this resource we need
> to use userns daemon.
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
> ---
>  criu/pipes.c | 32 +++++++++++++++++++++++++-------
>  1 file changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/criu/pipes.c b/criu/pipes.c
> index 7be1d5f047a1..26aa2b4c359f 100644
> --- a/criu/pipes.c
> +++ b/criu/pipes.c
> @@ -11,6 +11,7 @@
>  #include "pipes.h"
>  #include "util-pie.h"
>  #include "autofs.h"
> +#include "namespaces.h"
>  
>  #include "protobuf.h"
>  #include "util.h"
> @@ -142,6 +143,25 @@ static int mark_pipe_master(void *unused)
>  
>  static struct pipe_data_rst *pd_hash_pipes[PIPE_DATA_HASH_SIZE];
>  
> +typedef struct {
> +	unsigned int	pipe_id;
> +	size_t		size;
> +} pipe_set_size_arg_t;
> +
> +static int pipe_set_size(void *arg, int fd, int pid)
> +{
> +	pipe_set_size_arg_t *p = arg;
> +
> +	pr_info("Restoring size %#zx for %#x\n", p->size, p->pipe_id);

You don't need pipe_id here, just print it before calling userns_call and
remove this ugly pipe_set_size_arg_t.

> +
> +	if (fcntl(fd, F_SETPIPE_SZ, p->size) < 0) {
> +		pr_perror("Can't restore pipe size");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash)
>  {
>  	int ret;
> @@ -201,13 +221,11 @@ int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash
>  out:
>  	ret = 0;
>  	if (pd->pde->has_size) {
> -		pr_info("Restoring size %#x for %#x\n",
> -				pd->pde->size, pd->pde->pipe_id);
> -		ret = fcntl(pfd, F_SETPIPE_SZ, pd->pde->size);
> -		if (ret < 0)
> -			pr_perror("Can't restore pipe size");
> -		else
> -			ret = 0;
> +		pipe_set_size_arg_t args = {
> +			.pipe_id	= pd->pde->pipe_id,
> +			.size		= (size_t)pd->pde->size,
> +		};
> +		ret = userns_call(pipe_set_size, UNS_ASYNC, &args, sizeof(args), pfd);
>  	}
>  err:
>  	return ret;
>
Kirill Gorkunov Feb. 6, 2017, 10:39 a.m.
On Mon, Feb 06, 2017 at 01:22:39PM +0300, Pavel Emelyanov wrote:
> >  
> >  static struct pipe_data_rst *pd_hash_pipes[PIPE_DATA_HASH_SIZE];
> >  
> > +typedef struct {
> > +	unsigned int	pipe_id;
> > +	size_t		size;
> > +} pipe_set_size_arg_t;
> > +
> > +static int pipe_set_size(void *arg, int fd, int pid)
> > +{
> > +	pipe_set_size_arg_t *p = arg;
> > +
> > +	pr_info("Restoring size %#zx for %#x\n", p->size, p->pipe_id);
> 
> You don't need pipe_id here, just print it before calling userns_call and
> remove this ugly pipe_set_size_arg_t.

It is *ASYNC* call, thus it's printed when it executes, which is a
way more suitable for log I think, no?
Cyrill Gorcunov Feb. 28, 2017, 7:43 a.m.
On Mon, Feb 6, 2017 at 1:39 PM, Cyrill Gorcunov <gorcunov@virtuozzo.com> wrote:
> On Mon, Feb 06, 2017 at 01:22:39PM +0300, Pavel Emelyanov wrote:
>> >
>> >  static struct pipe_data_rst *pd_hash_pipes[PIPE_DATA_HASH_SIZE];
>> >
>> > +typedef struct {
>> > +   unsigned int    pipe_id;
>> > +   size_t          size;
>> > +} pipe_set_size_arg_t;
>> > +
>> > +static int pipe_set_size(void *arg, int fd, int pid)
>> > +{
>> > +   pipe_set_size_arg_t *p = arg;
>> > +
>> > +   pr_info("Restoring size %#zx for %#x\n", p->size, p->pipe_id);
>>
>> You don't need pipe_id here, just print it before calling userns_call and
>> remove this ugly pipe_set_size_arg_t.
>
> It is *ASYNC* call, thus it's printed when it executes, which is a
> way more suitable for log I think, no?

Pavel, have I convince you? If so please ack so we merge it upstream.
Pavel Emelianov Feb. 28, 2017, 3:18 p.m.
On 02/28/2017 10:43 AM, Cyrill Gorcunov wrote:
> On Mon, Feb 6, 2017 at 1:39 PM, Cyrill Gorcunov <gorcunov@virtuozzo.com> wrote:
>> On Mon, Feb 06, 2017 at 01:22:39PM +0300, Pavel Emelyanov wrote:
>>>>
>>>>  static struct pipe_data_rst *pd_hash_pipes[PIPE_DATA_HASH_SIZE];
>>>>
>>>> +typedef struct {
>>>> +   unsigned int    pipe_id;
>>>> +   size_t          size;
>>>> +} pipe_set_size_arg_t;
>>>> +
>>>> +static int pipe_set_size(void *arg, int fd, int pid)
>>>> +{
>>>> +   pipe_set_size_arg_t *p = arg;
>>>> +
>>>> +   pr_info("Restoring size %#zx for %#x\n", p->size, p->pipe_id);
>>>
>>> You don't need pipe_id here, just print it before calling userns_call and
>>> remove this ugly pipe_set_size_arg_t.
>>
>> It is *ASYNC* call, thus it's printed when it executes, which is a
>> way more suitable for log I think, no?
> 
> Pavel, have I convince you? 

Nope :(

> If so please ack so we merge it upstream.

I'll leave it up to Andrey.

-- Pavel
Cyrill Gorcunov Feb. 28, 2017, 3:29 p.m.
On Tue, Feb 28, 2017 at 06:18:02PM +0300, Pavel Emelyanov wrote:
> >>>
> >>> You don't need pipe_id here, just print it before calling userns_call and
> >>> remove this ugly pipe_set_size_arg_t.
> >>
> >> It is *ASYNC* call, thus it's printed when it executes, which is a
> >> way more suitable for log I think, no?
> > 
> > Pavel, have I convince you? 
> 
> Nope :(

Look, if this pipe_set_size failed we will have the pipe id in the log,
so we will be able to detect which exactly pipe is failed.
Andrey Vagin March 13, 2017, 6:44 p.m.
On Fri, Feb 03, 2017 at 09:28:23PM +0300, Cyrill Gorcunov wrote:
> From: Cyrill Gorcunov <gorcunov@virtuozzo.com>
> 
> In vanilla kernels starting from v4.4-6172-g759c011
> pipes allocation is limited per user-namespace, and
> what is more interesting is that if hard limit is
> zero but softlimit is not, then it is possible to
> create new pipes with one buffer but setting its
> size if not allowed until we're having CAP_SYS_RESOURCE
> and CAP_SYS_ADMIN caps granted.
> 
> So to be able to restore this resource we need
> to use userns daemon.
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
> ---
>  criu/pipes.c | 32 +++++++++++++++++++++++++-------
>  1 file changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/criu/pipes.c b/criu/pipes.c
> index 7be1d5f047a1..26aa2b4c359f 100644
> --- a/criu/pipes.c
> +++ b/criu/pipes.c
> @@ -11,6 +11,7 @@
>  #include "pipes.h"
>  #include "util-pie.h"
>  #include "autofs.h"
> +#include "namespaces.h"
>  
>  #include "protobuf.h"
>  #include "util.h"
> @@ -142,6 +143,25 @@ static int mark_pipe_master(void *unused)
>  
>  static struct pipe_data_rst *pd_hash_pipes[PIPE_DATA_HASH_SIZE];
>  
> +typedef struct {
> +	unsigned int	pipe_id;
> +	size_t		size;
> +} pipe_set_size_arg_t;
> +
> +static int pipe_set_size(void *arg, int fd, int pid)
> +{
> +	pipe_set_size_arg_t *p = arg;
> +
> +	pr_info("Restoring size %#zx for %#x\n", p->size, p->pipe_id);
> +
> +	if (fcntl(fd, F_SETPIPE_SZ, p->size) < 0) {
> +		pr_perror("Can't restore pipe size");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash)
>  {
>  	int ret;
> @@ -201,13 +221,11 @@ int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash
>  out:
>  	ret = 0;
>  	if (pd->pde->has_size) {
> -		pr_info("Restoring size %#x for %#x\n",
> -				pd->pde->size, pd->pde->pipe_id);
> -		ret = fcntl(pfd, F_SETPIPE_SZ, pd->pde->size);
> -		if (ret < 0)
> -			pr_perror("Can't restore pipe size");
> -		else
> -			ret = 0;
> +		pipe_set_size_arg_t args = {
> +			.pipe_id	= pd->pde->pipe_id,
> +			.size		= (size_t)pd->pde->size,
> +		};
> +		ret = userns_call(pipe_set_size, UNS_ASYNC, &args, sizeof(args), pfd);

UNS_ASYNC is used to improve perfomance, isn't it? If it is, I would
suggest to try to set a pipe size here and make userns_call only if it is failed
with EPERM. In this case we can do this synchroniously, because the
number of these calls probably will be small. And you will be able to
apply Pavel's comment.

>  	}
>  err:
>  	return ret;
> -- 
> 2.7.4
> 
> _______________________________________________
> CRIU mailing list
> CRIU@openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
Cyrill Gorcunov March 13, 2017, 7:16 p.m.
On Mon, Mar 13, 2017 at 11:44:39AM -0700, Andrei Vagin wrote:
...
> >  int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash)
> >  {
> >  	int ret;
> > @@ -201,13 +221,11 @@ int restore_pipe_data(int img_type, int pfd, u32 id, struct pipe_data_rst **hash
> >  out:
> >  	ret = 0;
> >  	if (pd->pde->has_size) {
> > -		pr_info("Restoring size %#x for %#x\n",
> > -				pd->pde->size, pd->pde->pipe_id);
> > -		ret = fcntl(pfd, F_SETPIPE_SZ, pd->pde->size);
> > -		if (ret < 0)
> > -			pr_perror("Can't restore pipe size");
> > -		else
> > -			ret = 0;
> > +		pipe_set_size_arg_t args = {
> > +			.pipe_id	= pd->pde->pipe_id,
> > +			.size		= (size_t)pd->pde->size,
> > +		};
> > +		ret = userns_call(pipe_set_size, UNS_ASYNC, &args, sizeof(args), pfd);
> 
> UNS_ASYNC is used to improve perfomance, isn't it? If it is, I would
> suggest to try to set a pipe size here and make userns_call only if it is failed
> with EPERM. In this case we can do this synchroniously, because the
> number of these calls probably will be small. And you will be able to
> apply Pavel's comment.

Which would bloat code size :/ Drop it. Hopefully I'll code with
some better idea.