[v3,2/2] spfs: Main process wakes and kills its children on exit

Submitted by Kirill Tkhai on Jan. 23, 2018, 10:43 a.m.

Details

Message ID 151670421136.20168.18247584929541574460.stgit@localhost.localdomain
State New
Series "Series without cover letter"

Commit Message

Kirill Tkhai Jan. 23, 2018, 10:43 a.m.
Stanislav Kinsburskiy says:

"SPFS manager has a special "--exit-with-spfs" option, which is used by CRIU.
 The idea of the option is simple: force SPFS manager to exit when it has had
 SPFS processes among its children (i.e. spfs was mounted at least once),
 but all these processes have exited for whatever reason (which usually happens
 when restore has failed and spfs mounts were unmounted).
 Although it works overall (the main SPFS manager process exits), its children
 (responsible for SPFS replacement) may wait on a futex for the "release" command
 for the corresponding SPFS mount and thus never stop until they are killed."

1 spfs-manager
2   \_ spfs
3   \_ spfs-manager
4   \_ spfs
5   \_ spfs-manager

2 and 3 are the pair for one mount, and 4 and 5 are the pair for another mount.
The patch makes spfs-manager 1 kill 3 when 2 exits with a failure (killed by a
signal or non-zero exit status).

https://jira.sw.ru/browse/PSBM-80055

v2: Kill only a single replacer, not all of them.
v3: Print debug msg about kill().

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
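The behaviour described in the commit message can be illustrated with a small
standalone sketch. It is not spfs code: struct mount_pair, spawn_worker(),
spawn_replacer() and reap_pair() are hypothetical names made up for this note,
and the replacer is modelled as a child blocked in pause() rather than on a
futex. Unlike the patch, which resets info->replacer in the SIGCHLD handler,
the sketch reaps the replacer synchronously to keep the example short.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-mount info kept by the main manager. */
struct mount_pair {
	pid_t worker;   /* plays the role of "spfs" (2 or 4 in the tree) */
	pid_t replacer; /* plays the role of the child manager (3 or 5), -1 if none */
};

/* Fork a child that exits immediately with the given status. */
static pid_t spawn_worker(int exit_status)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(exit_status);
	return pid;
}

/* Fork a child that blocks forever, like a replacer waiting for "release". */
static pid_t spawn_replacer(void)
{
	pid_t pid = fork();

	if (pid == 0)
		for (;;)
			pause();
	return pid;
}

/*
 * Reap the worker. If it failed, SIGKILL the paired replacer and reap it too.
 * Returns 1 if the replacer was killed, 0 if it was left alone, -1 on error.
 */
static int reap_pair(struct mount_pair *p)
{
	int status, failed;

	if (waitpid(p->worker, &status, 0) < 0)
		return -1;

	failed = WIFSIGNALED(status) ||
		 (WIFEXITED(status) && WEXITSTATUS(status) != 0);
	if (!failed || p->replacer <= 0)
		return 0;

	if (kill(p->replacer, SIGKILL)) {
		perror("Failed to kill replacer");
		return -1;
	}
	if (waitpid(p->replacer, &status, 0) < 0)
		return -1;

	p->replacer = -1; /* same "no replacer" sentinel the patch uses */
	return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 1 : -1;
}
```

A caller would pair spawn_worker(1) with spawn_replacer() in a struct
mount_pair and pass it to reap_pair(). With a worker that exits with status 0
the replacer is left running, which mirrors the patch only killing on failure.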
 manager/context.c |    9 +++++++--
 manager/spfs.c    |    1 +
 2 files changed, 8 insertions(+), 2 deletions(-)

Patch

diff --git a/manager/context.c b/manager/context.c
index 73d5ada..54667bb 100644
--- a/manager/context.c
+++ b/manager/context.c
@@ -47,11 +47,15 @@  static void cleanup_spfs_mount(struct spfs_manager_context_s *ctx,
 {
 	bool failed = WIFSIGNALED(status) || !!WEXITSTATUS(status);
 
-	pr_debug("removing info %s from the list\n", info->mnt.id);
+	pr_debug("removing info %s from the list (replacer pid %d)\n",
+		  info->mnt.id, info->replacer);
 
-	if (failed)
+	if (failed) {
 		/* SPFS master was failed. We need to release the reference */
 		spfs_release_mnt(info);
+		if (info->replacer > 0 && kill(info->replacer, SIGKILL))
+			pr_perror("Failed to kill replacer");
+	}
 
 	info->dead = true;
 	del_spfs_info(ctx->spfs_mounts, info);
@@ -88,6 +92,7 @@  static void sigchld_handler(int signal, siginfo_t *siginfo, void *data)
 				 * corresponding fd.
 				 */
 				spfs_release_mnt(info);
+			info->replacer = -1;
 		} else {
 			info = find_spfs_by_pid(ctx->spfs_mounts, pid);
 			if (info) {
diff --git a/manager/spfs.c b/manager/spfs.c
index 99845b1..7ee582f 100644
--- a/manager/spfs.c
+++ b/manager/spfs.c
@@ -107,6 +107,7 @@  int create_spfs_info(const char *id,
 	INIT_LIST_HEAD(&info->processes);
 
 	info->mode = SPFS_REPLACE_MODE_HOLD;
+	info->replacer = -1;
 
 	*i = info;