[Devel,6/6] venet: VEIP "SHUTDOWN" hook introduced

Submitted by Stanislav Kinsburskiy on April 21, 2017, 3:04 p.m.

Details

Message ID 20170421150447.31147.67268.stgit@localhost.localdomain
State New
Series "Fix VEIP object handling on container destroy"
Headers show

Commit Message

Stanislav Kinsburskiy April 21, 2017, 3:04 p.m.
Previously, there was SS hook to destroy the whole venet device (with veip
object).
Well, this is more or less ok, but SS hook was called before put_nsproxy(),
leading to network disable, while there might be some files still opened on
NFS, which is a deadlock.
This was the reason, why the original SS hook was removed in commit
a9002f5a714f130e624f42d8e3a57bb276213596 to fix this big issue.

So, why another one is introduced again? Here is the rationale.

First of all, I have to say, that the whole intention in ve_exit_ns() is to
remove container ve_struct from the global list thus allowing to create a new
one with the same VEID while the old one is under destruction.
Second, destruction may take a while, because ve_struct is being hold by it's
net, which is in turn destroyed asynchroniously.
Third, this destruction means not only VE struct + net, but also "veip"
object, which holds IP address. This means that it's unpredictable, when
container IP address is released, while container is considered as destroyed
already (when it's removed from the list, another one with the same veid can
be created). This can lead to EEXIST error on IP assigning attempt to another
interface.
Thus, there is a problem.
Look like the right solution should be to remove ve struct from the list on
final put (when network is destroyed), but it changes current logic, which
assumes, that container is "destroyed" when final task has exited, making
container stop unpredictably long especially when node is overloaded.

So, this patch bring back a piece of logic, dropped in commit
a9002f5a714f130e624f42d8e3a57bb276213596, but in different circumstances.
First of all, "SHUTDOWN" hook chain is used instead of "SS".
Second, only VEIP object is destroyed in this hook to allow to assign the
address to the new interface (if any). Venet destruction kept as it was: it
will be destroyed asynchroniously with the network.

Note: venet_dellink will skip VEIP destruction, iff ve->ve_netns is NULL (and
it is set to NULL in ve_exit_ns() _before_ final net put).

https://jira.sw.ru/browse/PSBM-64869

Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
---
 drivers/net/venetdev.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/drivers/net/venetdev.c b/drivers/net/venetdev.c
index b7d7596..5e832d3 100644
--- a/drivers/net/venetdev.c
+++ b/drivers/net/venetdev.c
@@ -761,7 +761,11 @@  static void venet_dellink(struct net_device *dev, struct list_head *head)
 {
 	struct ve_struct *env = dev->nd_net->owner_ve;
 
-	veip_shutdown(env);
+	/* We check ve_netns to avid races with veip SHUTDOWN hook, called from
+	 * ve_exit_ns()
+	 */
+	if (env->ve_netns)
+		veip_shutdown(env);
 
 	env->_venet_dev = NULL;
 	unregister_netdevice_queue(dev, head);
@@ -1176,6 +1180,12 @@  static struct rtnl_link_ops venet_link_ops = {
 	.maxtype	= VENET_INFO_MAX,
 };
 
+static struct ve_hook veip_shutdown_hook = {
+	.fini		= veip_shutdown,
+	.priority	= HOOK_PRIO_FINISHING,
+	.owner		= THIS_MODULE,
+};
+
 __init int venet_init(void)
 {
 	struct proc_dir_entry *de;
@@ -1198,6 +1208,7 @@  __init int venet_init(void)
 
 	vzioctl_register(&venetcalls);
 	vzmon_register_veaddr_print_cb(veaddr_seq_print);
+	ve_hook_register(VE_SHUTDOWN_CHAIN, &veip_shutdown_hook);
 
 	return rtnl_link_register(&venet_link_ops);