page-server: Allow blocking on pipe

Submitted by Pavel Emelianov on Dec. 19, 2016, 10:13 a.m.

Details

Message ID 5857B2DF.70607@virtuozzo.com
State Accepted
Series "page-server: Allow blocking on pipe"
Commit 00e25daaf2bddf8ff998b7ef0e0ef70b988cd660
Headers show

Commit Message

Pavel Emelianov Dec. 19, 2016, 10:13 a.m.
This splice tries to get pages from socket into local pipe to
splice them into images later. The data on the socket may not
be there by the time we get to this splice, so there's no reason
to force non-blocking IO here.

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
---
 criu/page-xfer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/criu/page-xfer.c b/criu/page-xfer.c
index 41e6c8a..39c6977 100644
--- a/criu/page-xfer.c
+++ b/criu/page-xfer.c
@@ -596,7 +596,7 @@  static int page_server_add(int sk, struct page_server_iov *pi, u32 flags)
 		if (chunk > cxfer.pipe_size)
 			chunk = cxfer.pipe_size;
 
-		chunk = splice(sk, NULL, cxfer.p[1], NULL, chunk, SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
+		chunk = splice(sk, NULL, cxfer.p[1], NULL, chunk, SPLICE_F_MOVE);
 		if (chunk < 0) {
 			pr_perror("Can't read from socket");
 			return -1;

Comments

Andrey Vagin Jan. 2, 2017, 7:30 p.m.
On Mon, Dec 19, 2016 at 01:13:51PM +0300, Pavel Emelyanov wrote:
> This splice tries to get pages from socket into local pipe to
> splice them into images later. The data on the socket may not
> be there by the time we get to this splice, so there's no reason
> to force non-blocking IO here.

This SPLICE_F_NONBLOCK isn't about data on the socket. We don't set
SOCK_NONBLOCK, so this splice waits data on the socket even with
SPLICE_F_NONBLOCK.

ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
...
        timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK);

but if we execute splice() without SPLICE_F_NONBLOCK, it can be block on
a pipe and it's a problem for us, because we are going to read from it
after this slice().

[root@zdtm ~]# cat /proc/6191/stack 
[<ffffffff8e259450>] pipe_wait+0x70/0xc0
[<ffffffff8e282402>] splice_to_pipe+0x1b2/0x240
[<ffffffff8e6cad97>] skb_socket_splice+0x27/0x40
[<ffffffff8e6c9045>] skb_splice_bits+0xb5/0xe0
[<ffffffff8e72daec>] tcp_splice_data_recv+0x3c/0x50
[<ffffffff8e72e083>] tcp_read_sock+0xa3/0x1e0
[<ffffffff8e72e2ad>] tcp_splice_read+0xed/0x260
[<ffffffff8e6b92b3>] sock_splice_read+0x23/0x30
[<ffffffff8e2825d6>] do_splice_to+0x76/0x90
[<ffffffff8e284da4>] SyS_splice+0x754/0x790
[<ffffffff8e802932>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[<ffffffffffffffff>] 0xffffffffffffffff

https://github.com/xemul/criu/issues/265


> 
> Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
> ---
>  criu/page-xfer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/criu/page-xfer.c b/criu/page-xfer.c
> index 41e6c8a..39c6977 100644
> --- a/criu/page-xfer.c
> +++ b/criu/page-xfer.c
> @@ -596,7 +596,7 @@ static int page_server_add(int sk, struct page_server_iov *pi, u32 flags)
>  		if (chunk > cxfer.pipe_size)
>  			chunk = cxfer.pipe_size;
>  
> -		chunk = splice(sk, NULL, cxfer.p[1], NULL, chunk, SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
> +		chunk = splice(sk, NULL, cxfer.p[1], NULL, chunk, SPLICE_F_MOVE);
>  		if (chunk < 0) {
>  			pr_perror("Can't read from socket");
>  			return -1;
> -- 
> 2.5.0
> _______________________________________________
> CRIU mailing list
> CRIU@openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
Pavel Emelianov Jan. 9, 2017, 8:41 a.m.
On 01/02/2017 10:30 PM, Andrei Vagin wrote:
> On Mon, Dec 19, 2016 at 01:13:51PM +0300, Pavel Emelyanov wrote:
>> This splice tries to get pages from socket into local pipe to
>> splice them into images later. The data on the socket may not
>> be there by the time we get to this splice, so there's no reason
>> to force non-blocking IO here.
> 
> This SPLICE_F_NONBLOCK isn't about data on the socket. We don't set
> SOCK_NONBLOCK, so this splice waits data on the socket even with
> SPLICE_F_NONBLOCK.
> 
> ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
> ...
>         timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK);

True, but when the socket is AF_UNIX one, the issues the other way around:

        if (sock->file->f_flags & O_NONBLOCK ||
            flags & SPLICE_F_NONBLOCK)
                state.flags = MSG_DONTWAIT;

so getting data from empty unix socket (it can be empty simply because no
data other than header has arrived yet) results in EAGAIN.

-- Pavel

> but if we execute splice() without SPLICE_F_NONBLOCK, it can be block on
> a pipe and it's a problem for us, because we are going to read from it
> after this slice().
> 
> [root@zdtm ~]# cat /proc/6191/stack 
> [<ffffffff8e259450>] pipe_wait+0x70/0xc0
> [<ffffffff8e282402>] splice_to_pipe+0x1b2/0x240
> [<ffffffff8e6cad97>] skb_socket_splice+0x27/0x40
> [<ffffffff8e6c9045>] skb_splice_bits+0xb5/0xe0
> [<ffffffff8e72daec>] tcp_splice_data_recv+0x3c/0x50
> [<ffffffff8e72e083>] tcp_read_sock+0xa3/0x1e0
> [<ffffffff8e72e2ad>] tcp_splice_read+0xed/0x260
> [<ffffffff8e6b92b3>] sock_splice_read+0x23/0x30
> [<ffffffff8e2825d6>] do_splice_to+0x76/0x90
> [<ffffffff8e284da4>] SyS_splice+0x754/0x790
> [<ffffffff8e802932>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> https://github.com/xemul/criu/issues/265
> 
> 
>>
>> Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
>> ---
>>  criu/page-xfer.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/criu/page-xfer.c b/criu/page-xfer.c
>> index 41e6c8a..39c6977 100644
>> --- a/criu/page-xfer.c
>> +++ b/criu/page-xfer.c
>> @@ -596,7 +596,7 @@ static int page_server_add(int sk, struct page_server_iov *pi, u32 flags)
>>  		if (chunk > cxfer.pipe_size)
>>  			chunk = cxfer.pipe_size;
>>  
>> -		chunk = splice(sk, NULL, cxfer.p[1], NULL, chunk, SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
>> +		chunk = splice(sk, NULL, cxfer.p[1], NULL, chunk, SPLICE_F_MOVE);
>>  		if (chunk < 0) {
>>  			pr_perror("Can't read from socket");
>>  			return -1;
>> -- 
>> 2.5.0
>> _______________________________________________
>> CRIU mailing list
>> CRIU@openvz.org
>> https://lists.openvz.org/mailman/listinfo/criu
> .
>