[rh7] ve/net: hide handler for netlink NETLINK_REPAIR command unless CRIU restore

Submitted by Konstantin Khorenko on May 7, 2018, 4:20 p.m.

Details

Message ID 20180507162058.2510-1-khorenko@virtuozzo.com
State New
Series "ve/net: hide handler for netlink NETLINK_REPAIR command unless CRIU restore"

Commit Message

Konstantin Khorenko May 7, 2018, 4:20 p.m.
The following patch is to be applied to old kernels using ReadyKernel.
It makes the updated "ip" work even if the Node has not been rebooted.

Idea of the patch is taken from:
08dc16449a39 ("net: Change number of netlink repair")

   Mainstream has NETLINK_EXT_ACK 11, which is used by fresh
   iproute utils. We don't want these utils switch the socket
   in repair mode.

   https://jira.sw.ru/browse/PSBM-83415

   Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

Distributions (for example Ubuntu 18.04, RHEL7) now include those
"fresh" versions of the "ip" utility, which hangs on an unpatched kernel.

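The idea of the patch: we handle netlink command number 11
(NETLINK_REPAIR in VZ kernel / NETLINK_EXT_ACK in mainstream)
only when we detect the CRIU restore stage, that is, when the caller
is pseudosuper and its process name is "criu". Otherwise we return
-ENOPROTOOPT, claiming the kernel does not support the command, and
"ip" is happy with that. CRIU can still reach the repair handler
unconditionally through the new number, NETLINK_REPAIR2 (127).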
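
The decision described above can be modelled in userspace roughly as
follows. This is only an illustrative sketch, not kernel code: struct
caller_ctx and repair_optname_gate() are hypothetical names, and the two
fields merely stand in for get_exec_env()->is_pseudosuper and
current->comm, which the real handler reads.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Option numbers as defined by the patch. */
#define NETLINK_REPAIR   11   /* same number as mainstream NETLINK_EXT_ACK */
#define NETLINK_REPAIR2  127  /* new, unambiguous number */

/* Hypothetical stand-in for the caller state the kernel inspects. */
struct caller_ctx {
	bool is_pseudosuper;  /* models get_exec_env()->is_pseudosuper */
	const char *comm;     /* models current->comm */
};

/*
 * Returns 0 if the option may reach the repair-mode handler,
 * or -ENOPROTOOPT if it must look unsupported to the caller.
 * Only the two repair options are modelled; the real
 * netlink_setsockopt() handles other options in other cases.
 */
static int repair_optname_gate(int optname, const struct caller_ctx *ctx)
{
	switch (optname) {
	case NETLINK_REPAIR:
		/* Number 11 is honoured only for "criu" during restore. */
		if (!ctx->is_pseudosuper || strcmp(ctx->comm, "criu") != 0)
			return -ENOPROTOOPT;
		return 0;
	case NETLINK_REPAIR2:
		/* The new number is not gated. */
		return 0;
	default:
		return -ENOPROTOOPT;
	}
}
```

With this model, a plain "ip" process asking for option 11 gets
-ENOPROTOOPT, while a pseudosuper "criu" process, or any caller using
option 127, gets through to the repair handler.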

https://jira.sw.ru/browse/PSBM-84191

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
---
 include/uapi/linux/netlink.h | 3 +++
 net/netlink/af_netlink.c     | 8 ++++++++
 2 files changed, 11 insertions(+)


diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index 56ddadf14e0e..a5e6e5c4c238 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -111,7 +111,10 @@  struct nlmsgerr {
 #define NETLINK_LISTEN_ALL_NSID		8
 #define NETLINK_LIST_MEMBERSHIPS	9
 #define NETLINK_CAP_ACK			10
+
+/* intersects with mainstream NETLINK_EXT_ACK */
 #define NETLINK_REPAIR			11
+#define NETLINK_REPAIR2			127
 
 struct nl_pktinfo {
 	__u32	group;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index d5afa0322990..5560a2736ba4 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2196,6 +2196,14 @@  static int netlink_setsockopt(struct socket *sock, int level, int optname,
 
 	switch (optname) {
 	case NETLINK_REPAIR:
+		/* Hide the command handler unless "criu" process
+		 * resumes a Container
+		 */
+		if (likely(!get_exec_env()->is_pseudosuper ||
+			   strcmp(current->comm, "criu")))
+			return -ENOPROTOOPT;
+		/* fall through */
+	case NETLINK_REPAIR2:
 		if (val)
 			nlk->flags |= NETLINK_F_REPAIR;
 		else

Comments

Kirill Tkhai May 8, 2018, 8:32 a.m.
On 07.05.2018 19:20, Konstantin Khorenko wrote:
> The following patch to be applied to old kernels using ReadyKernel.
> It makes updated "ip" working even if a Node was not rebooted.
> 
> Idea of the patch is taken from:
> 08dc16449a39 ("net: Change number of netlink repair")
> 
>    Mainstream has NETLINK_EXT_ACK 11, which is used by fresh
>    iproute utils. We don't want these utils switch the socket
>    in repair mode.
> 
>    https://jira.sw.ru/browse/PSBM-83415
> 
>    Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
> 
> Distributives (for example Ubuntu 18.04, RHEL7) include now those
> "fresh" version of "ip" utility which hangs on unpatched kernel.
> 
> Idea of the patch: we handle netlink command number 11
> (NETLINK_REPAIR in VZ kernel / NETLINK_EXT_ACK in mainstream)
> only in case we detect CRIU restore stage, otherwise we claim
> kernel does not support it and "ip" is happy with that.
> 
> https://jira.sw.ru/browse/PSBM-84191
> 
> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>

Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
