[Devel,rh7,v2] vzprivnet: vzprivnet_hook: fix crash if skb->dev == NULL

Submitted by Vladimir Davydov on Aug. 23, 2016, 8:46 a.m.

Details

Message ID 1471941991-21200-1-git-send-email-vdavydov@virtuozzo.com
State New
Series "vzprivnet: vzprivnet_hook: fix crash if skb->dev == NULL"
Headers show

Commit Message

Vladimir Davydov Aug. 23, 2016, 8:46 a.m.
For the sake of Docker, we only call vzprivnet rules if skb comes from
the host [1]. To check that, we look at skb->dev->nd_net->owner_ve. This
works fine when skb is retransmitted by a device (as it is the case in
case of a bridged network), but this results in KP when skb is sent
directly to a veth or venet device (via sendto) provided sysctl
net.vzpriv_filter_host is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 00000000000003e8
  IP: [<ffffffffa03058f9>] vzprivnet_hook+0x9/0xb0 [ip_vzprivnet]
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 3669 Comm: sendmail ve: 2ee1e66b-d1d4-4cb9-b65e-56af4cdd60b7 Not tainted 3.10.0-327.28.2.vz7.17.1 #1 17.1
  task: ffff880039e94c20 ti: ffff880036428000 task.ti: ffff880036428000
  RIP: 0010:[<ffffffffa03058f9>]  [<ffffffffa03058f9>] vzprivnet_hook+0x9/0xb0 [ip_vzprivnet]
  RSP: 0018:ffff88003642ba78  EFLAGS: 00010202
  RAX: 0000000000000000 RBX: ffff88003642bb08 RCX: ffff88003a451000
  RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88000509d000
  RBP: ffff88003642ba80 R08: ffff88003642bb08 R09: 0000000000000024
  R10: ffff880035f78360 R11: 0000000000000006 R12: ffff88003642bad0
  R13: ffff88000509d000 R14: ffffffff81a6c1f0 R15: 0000000000000000
  FS:  00007fcc702ca840(0000) GS:ffff88003de00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000003e8 CR3: 0000000039c1d000 CR4: 00000000000006f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Stack:
   ffffffffa03059f2 ffff88003642bac0 ffffffff81562950 ffffffffa0308680
   0000000043838e8f ffff88000509d000 ffff88003642bb08 ffff88000509d000
   0000000000000024 ffff88003642baf8 ffffffff81562a38 ffffffffa0308680
  Call Trace:
   [<ffffffffa03059f2>] ? vzprivnet_out_hook+0x32/0x40 [ip_vzprivnet]
   [<ffffffff81562950>] nf_iterate+0x70/0xb0
   [<ffffffff81562a38>] nf_hook_slow+0xa8/0x110
   [<ffffffff8156ff7e>] __ip_local_out_sk+0xee/0x100
   [<ffffffff81572e02>] ? ip_make_skb+0x22/0x120
   [<ffffffff8156f660>] ? ip_forward_options+0x1c0/0x1c0
   [<ffffffff8156ffab>] ip_local_out_sk+0x1b/0x40
   [<ffffffff81572d46>] ip_send_skb+0x16/0x50
   [<ffffffff81599df0>] udp_send_skb+0x170/0x380
   [<ffffffff8156fd00>] ? ip_copy_metadata+0x170/0x170
   [<ffffffff8159ae87>] udp_sendmsg+0x2f7/0x9d0
   [<ffffffff812088a1>] ? link_path_walk+0x81/0x860
   [<ffffffff815a77e4>] inet_sendmsg+0x64/0xb0
   [<ffffffff812f6442>] ? radix_tree_lookup_slot+0x22/0x50
   [<ffffffff81514e87>] sock_sendmsg+0x87/0xc0
   [<ffffffff8117a80b>] ? unlock_page+0x2b/0x30
   [<ffffffff81516f41>] SYSC_sendto+0x121/0x1c0
   [<ffffffff8163eef4>] ? __do_page_fault+0x164/0x450
   [<ffffffff8163f203>] ? do_page_fault+0x23/0x80
   [<ffffffff815177ee>] SyS_sendto+0xe/0x10
   [<ffffffff81643a09>] system_call_fastpath+0x16/0x1b

This happens, because in this case skb->dev is NULL (there's no device
this skb is arrived to). To avoid crash there, let's take owner_ve from
the net namespace which the socket is assigned to.

https://jira.sw.ru/browse/PSBM-51041

Fixes: 32efdd408fad ("vzprivnet: Do not execute vzprivnet_hook inside CT") [1]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
---
Changes in v2:
 - do not crash if both skb->dev and skb->sk turn out to be NULL
   for some reason - just print a warning and accept the packet

 net/ipv4/netfilter/ip_vzprivnet.c  | 7 ++++++-
 net/ipv6/netfilter/ip6_vzprivnet.c | 7 ++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

Patch hide | download patch | download mbox

diff --git a/net/ipv4/netfilter/ip_vzprivnet.c b/net/ipv4/netfilter/ip_vzprivnet.c
index 293b2802b4c2..6e2bbe2d42ef 100644
--- a/net/ipv4/netfilter/ip_vzprivnet.c
+++ b/net/ipv4/netfilter/ip_vzprivnet.c
@@ -250,8 +250,13 @@  static unsigned int vzprivnet_hook(struct sk_buff *skb, int can_be_bridge)
 {
 	struct dst_entry *dst;
 	unsigned int pmark = VZPRIV_MARK_UNKNOWN;
+	struct net *src_net;
 
-	if (!ve_is_super(skb->dev->nd_net->owner_ve))
+	if (WARN_ON_ONCE(!skb->dev && !skb->sk))
+		return NF_ACCEPT;
+
+	src_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
+	if (!ve_is_super(src_net->owner_ve))
 		return NF_ACCEPT;
 
 	dst = skb_dst(skb);
diff --git a/net/ipv6/netfilter/ip6_vzprivnet.c b/net/ipv6/netfilter/ip6_vzprivnet.c
index ee9c3c972637..0a491afd81b6 100644
--- a/net/ipv6/netfilter/ip6_vzprivnet.c
+++ b/net/ipv6/netfilter/ip6_vzprivnet.c
@@ -484,8 +484,13 @@  static unsigned int vzprivnet6_hook(struct sk_buff *skb, int can_be_bridge)
 	int verdict = NF_DROP;
 	struct vzprivnet *dst, *src;
 	struct ipv6hdr *hdr;
+	struct net *src_net;
 
-	if (!ve_is_super(skb->dev->nd_net->owner_ve))
+	if (WARN_ON_ONCE(!skb->dev && !skb->sk))
+		return NF_ACCEPT;
+
+	src_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
+	if (!ve_is_super(src_net->owner_ve))
 		return NF_ACCEPT;
 
 	hdr = ipv6_hdr(skb);

Comments

Pavel Tikhomirov Aug. 23, 2016, 8:53 a.m.
Acked-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

On 08/23/2016 11:46 AM, Vladimir Davydov wrote:
> For the sake of Docker, we only call vzprivnet rules if skb comes from
> the host [1]. To check that, we look at skb->dev->nd_net->owner_ve. This
> works fine when skb is retransmitted by a device (as it is the case in
> case of a bridged network), but this results in KP when skb is sent
> directly to a veth or venet device (via sendto) provided sysctl
> net.vzpriv_filter_host is enabled:
>
>   BUG: unable to handle kernel NULL pointer dereference at 00000000000003e8
>   IP: [<ffffffffa03058f9>] vzprivnet_hook+0x9/0xb0 [ip_vzprivnet]
>   Oops: 0000 [#1] SMP
>   CPU: 0 PID: 3669 Comm: sendmail ve: 2ee1e66b-d1d4-4cb9-b65e-56af4cdd60b7 Not tainted 3.10.0-327.28.2.vz7.17.1 #1 17.1
>   task: ffff880039e94c20 ti: ffff880036428000 task.ti: ffff880036428000
>   RIP: 0010:[<ffffffffa03058f9>]  [<ffffffffa03058f9>] vzprivnet_hook+0x9/0xb0 [ip_vzprivnet]
>   RSP: 0018:ffff88003642ba78  EFLAGS: 00010202
>   RAX: 0000000000000000 RBX: ffff88003642bb08 RCX: ffff88003a451000
>   RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88000509d000
>   RBP: ffff88003642ba80 R08: ffff88003642bb08 R09: 0000000000000024
>   R10: ffff880035f78360 R11: 0000000000000006 R12: ffff88003642bad0
>   R13: ffff88000509d000 R14: ffffffff81a6c1f0 R15: 0000000000000000
>   FS:  00007fcc702ca840(0000) GS:ffff88003de00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00000000000003e8 CR3: 0000000039c1d000 CR4: 00000000000006f0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>   Stack:
>    ffffffffa03059f2 ffff88003642bac0 ffffffff81562950 ffffffffa0308680
>    0000000043838e8f ffff88000509d000 ffff88003642bb08 ffff88000509d000
>    0000000000000024 ffff88003642baf8 ffffffff81562a38 ffffffffa0308680
>   Call Trace:
>    [<ffffffffa03059f2>] ? vzprivnet_out_hook+0x32/0x40 [ip_vzprivnet]
>    [<ffffffff81562950>] nf_iterate+0x70/0xb0
>    [<ffffffff81562a38>] nf_hook_slow+0xa8/0x110
>    [<ffffffff8156ff7e>] __ip_local_out_sk+0xee/0x100
>    [<ffffffff81572e02>] ? ip_make_skb+0x22/0x120
>    [<ffffffff8156f660>] ? ip_forward_options+0x1c0/0x1c0
>    [<ffffffff8156ffab>] ip_local_out_sk+0x1b/0x40
>    [<ffffffff81572d46>] ip_send_skb+0x16/0x50
>    [<ffffffff81599df0>] udp_send_skb+0x170/0x380
>    [<ffffffff8156fd00>] ? ip_copy_metadata+0x170/0x170
>    [<ffffffff8159ae87>] udp_sendmsg+0x2f7/0x9d0
>    [<ffffffff812088a1>] ? link_path_walk+0x81/0x860
>    [<ffffffff815a77e4>] inet_sendmsg+0x64/0xb0
>    [<ffffffff812f6442>] ? radix_tree_lookup_slot+0x22/0x50
>    [<ffffffff81514e87>] sock_sendmsg+0x87/0xc0
>    [<ffffffff8117a80b>] ? unlock_page+0x2b/0x30
>    [<ffffffff81516f41>] SYSC_sendto+0x121/0x1c0
>    [<ffffffff8163eef4>] ? __do_page_fault+0x164/0x450
>    [<ffffffff8163f203>] ? do_page_fault+0x23/0x80
>    [<ffffffff815177ee>] SyS_sendto+0xe/0x10
>    [<ffffffff81643a09>] system_call_fastpath+0x16/0x1b
>
> This happens, because in this case skb->dev is NULL (there's no device
> this skb is arrived to). To avoid crash there, let's take owner_ve from
> the net namespace which the socket is assigned to.
>
> https://jira.sw.ru/browse/PSBM-51041
>
> Fixes: 32efdd408fad ("vzprivnet: Do not execute vzprivnet_hook inside CT") [1]
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
> ---
> Changes in v2:
>  - do not crash if both skb->dev and skb->sk turn out to be NULL
>    for some reason - just print a warning and accept the packet
>
>  net/ipv4/netfilter/ip_vzprivnet.c  | 7 ++++++-
>  net/ipv6/netfilter/ip6_vzprivnet.c | 7 ++++++-
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/netfilter/ip_vzprivnet.c b/net/ipv4/netfilter/ip_vzprivnet.c
> index 293b2802b4c2..6e2bbe2d42ef 100644
> --- a/net/ipv4/netfilter/ip_vzprivnet.c
> +++ b/net/ipv4/netfilter/ip_vzprivnet.c
> @@ -250,8 +250,13 @@ static unsigned int vzprivnet_hook(struct sk_buff *skb, int can_be_bridge)
>  {
>  	struct dst_entry *dst;
>  	unsigned int pmark = VZPRIV_MARK_UNKNOWN;
> +	struct net *src_net;
>
> -	if (!ve_is_super(skb->dev->nd_net->owner_ve))
> +	if (WARN_ON_ONCE(!skb->dev && !skb->sk))
> +		return NF_ACCEPT;
> +
> +	src_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
> +	if (!ve_is_super(src_net->owner_ve))
>  		return NF_ACCEPT;
>
>  	dst = skb_dst(skb);
> diff --git a/net/ipv6/netfilter/ip6_vzprivnet.c b/net/ipv6/netfilter/ip6_vzprivnet.c
> index ee9c3c972637..0a491afd81b6 100644
> --- a/net/ipv6/netfilter/ip6_vzprivnet.c
> +++ b/net/ipv6/netfilter/ip6_vzprivnet.c
> @@ -484,8 +484,13 @@ static unsigned int vzprivnet6_hook(struct sk_buff *skb, int can_be_bridge)
>  	int verdict = NF_DROP;
>  	struct vzprivnet *dst, *src;
>  	struct ipv6hdr *hdr;
> +	struct net *src_net;
>
> -	if (!ve_is_super(skb->dev->nd_net->owner_ve))
> +	if (WARN_ON_ONCE(!skb->dev && !skb->sk))
> +		return NF_ACCEPT;
> +
> +	src_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
> +	if (!ve_is_super(src_net->owner_ve))
>  		return NF_ACCEPT;
>
>  	hdr = ipv6_hdr(skb);
>