[v7,1/9] restore: Use pstree_pid_cmp helper in pid_rst_prio

Submitted by Cyrill Gorcunov on May 23, 2018, 4:06 p.m.

Details

Message ID 20180523160615.31428-2-gorcunov@gmail.com
State New
Series "Add support of deleted unix sockets"
Headers show

Commit Message

Cyrill Gorcunov May 23, 2018, 4:06 p.m.
The pid number is absolutely not a guarantee to
introduce order relationship into pid set because
they might be reused. Instead we should consider
parent->child relationship.

Otherwise we might get

 |  31964  31964  31964       epoll
 |    585  31964  31964           epoll
 |    586  31964  31964           epoll
 |...
 | (04.797121)    585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
 criu/include/pid.h | 12 +++++++++++-
 criu/pstree.c      | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletion(-)

Patch hide | download patch | download mbox

diff --git a/criu/include/pid.h b/criu/include/pid.h
index 37e39428e0d6..d8869fde0a09 100644
--- a/criu/include/pid.h
+++ b/criu/include/pid.h
@@ -69,13 +69,23 @@  static inline pid_t last_level_pid(struct pid *pid)
 	return pid->ns[pid->level-1].virt;
 }
 
+extern int pstree_pid_cmp(pid_t a, pid_t b);
+
 /*
  * When we have to restore a shared resource, we mush select which
  * task should do it, and make other(s) wait for it. In order to
  * avoid deadlocks, always make task with lower pid be the restorer.
  */
-static inline bool pid_rst_prio(unsigned pid_a, unsigned pid_b)
+static inline bool pid_rst_prio(pid_t pid_a, pid_t pid_b)
 {
+	int ret = pstree_pid_cmp(pid_a, pid_b);
+
+	if (ret == 1)
+		return true;
+	else if (ret == 2)
+		return false;
+
+	/* Fallback into plain pid comparision */
 	return pid_a < pid_b;
 }
 
diff --git a/criu/pstree.c b/criu/pstree.c
index 6047d29ccc97..f180b9b221ed 100644
--- a/criu/pstree.c
+++ b/criu/pstree.c
@@ -655,6 +655,42 @@  int fixup_pid_for_children_ns(TaskKobjIdsEntry *ids)
 	return 0;
 }
 
+/*
+ *  0 -- pids are the same
+ *  1 -- @a is a parent of @b
+ *  2 -- @b is a parent of @a
+ *  3 -- pids are not connected
+ *  -1 -- pid not found
+ */
+int pstree_pid_cmp(pid_t a, pid_t b)
+{
+	struct pstree_item *pstree_a, *pstree_b, *t;
+	struct pid *pid_a, *pid_b;
+
+	if (a == b)
+		return 0;
+
+	pid_a = pstree_pid_by_virt(a);
+	pid_b = pstree_pid_by_virt(b);
+	if (!pid_a || !pid_b)
+		return -1;
+
+	pstree_a = pid_a->item;
+	pstree_b = pid_b->item;
+
+	for (t = pstree_b; t->parent; t = t->parent) {
+		if (t == pstree_a)
+			return 1;
+	}
+
+	for (t = pstree_a; t->parent; t = t->parent) {
+		if (t == pstree_b)
+			return 2;
+	}
+
+	return 3;
+}
+
 static int read_pstree_ids(pid_t pid, TaskKobjIdsEntry **ids)
 {
 	struct cr_img *img;

Comments

Andrei Vagin June 4, 2018, 6:45 p.m.
On Wed, May 23, 2018 at 07:06:07PM +0300, Cyrill Gorcunov wrote:
> The pid number is absolutely not a guarantee to
> introduce order relationship into pid set because
> they might be reused. Instead we should consider
> parent->child relationship.

I don't understand this. Could you elaborate with more details?

> 
> Otherwise we might get
> 
>  |  31964  31964  31964       epoll
>  |    585  31964  31964           epoll
>  |    586  31964  31964           epoll
>  |...
>  | (04.797121)    585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> ---
>  criu/include/pid.h | 12 +++++++++++-
>  criu/pstree.c      | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/criu/include/pid.h b/criu/include/pid.h
> index 37e39428e0d6..d8869fde0a09 100644
> --- a/criu/include/pid.h
> +++ b/criu/include/pid.h
> @@ -69,13 +69,23 @@ static inline pid_t last_level_pid(struct pid *pid)
>  	return pid->ns[pid->level-1].virt;
>  }
>  
> +extern int pstree_pid_cmp(pid_t a, pid_t b);
> +
>  /*
>   * When we have to restore a shared resource, we mush select which
>   * task should do it, and make other(s) wait for it. In order to
>   * avoid deadlocks, always make task with lower pid be the restorer.
>   */
> -static inline bool pid_rst_prio(unsigned pid_a, unsigned pid_b)
> +static inline bool pid_rst_prio(pid_t pid_a, pid_t pid_b)
>  {
> +	int ret = pstree_pid_cmp(pid_a, pid_b);
> +
> +	if (ret == 1)
> +		return true;
> +	else if (ret == 2)
> +		return false;
> +
> +	/* Fallback into plain pid comparision */
>  	return pid_a < pid_b;
>  }
>  
> diff --git a/criu/pstree.c b/criu/pstree.c
> index 6047d29ccc97..f180b9b221ed 100644
> --- a/criu/pstree.c
> +++ b/criu/pstree.c
> @@ -655,6 +655,42 @@ int fixup_pid_for_children_ns(TaskKobjIdsEntry *ids)
>  	return 0;
>  }
>  
> +/*
> + *  0 -- pids are the same
> + *  1 -- @a is a parent of @b
> + *  2 -- @b is a parent of @a
> + *  3 -- pids are not connected
> + *  -1 -- pid not found
> + */
> +int pstree_pid_cmp(pid_t a, pid_t b)
> +{
> +	struct pstree_item *pstree_a, *pstree_b, *t;
> +	struct pid *pid_a, *pid_b;
> +
> +	if (a == b)
> +		return 0;
> +
> +	pid_a = pstree_pid_by_virt(a);
> +	pid_b = pstree_pid_by_virt(b);
> +	if (!pid_a || !pid_b)
> +		return -1;
> +
> +	pstree_a = pid_a->item;
> +	pstree_b = pid_b->item;
> +
> +	for (t = pstree_b; t->parent; t = t->parent) {
> +		if (t == pstree_a)
> +			return 1;
> +	}
> +
> +	for (t = pstree_a; t->parent; t = t->parent) {
> +		if (t == pstree_b)
> +			return 2;
> +	}
> +
> +	return 3;
> +}
> +
>  static int read_pstree_ids(pid_t pid, TaskKobjIdsEntry **ids)
>  {
>  	struct cr_img *img;
> -- 
> 2.14.3
>
Cyrill Gorcunov June 4, 2018, 7:28 p.m.
On Mon, Jun 04, 2018 at 11:45:51AM -0700, Andrey Vagin wrote:
> On Wed, May 23, 2018 at 07:06:07PM +0300, Cyrill Gorcunov wrote:
> > The pid number is absolutely not a guarantee to
> > introduce order relationship into pid set because
> > they might be reused. Instead we should consider
> > parent->child relationship.
> 
> I don't understand this. Could you elaborate with more details?
> 
> > 
> > Otherwise we might get
> > 
> >  |  31964  31964  31964       epoll
> >  |    585  31964  31964           epoll
> >  |    586  31964  31964           epoll
> >  |...
> >  | (04.797121)    585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)

An example tree is above: comparing only pid numbers we will
choose 586 as a master and 31964 as a slave, which is wrong
of course. Instead we should walk over the pid tree and
find which pid is a master (parent) and which is slave (children)
Andrei Vagin June 4, 2018, 10:26 p.m.
On Mon, Jun 04, 2018 at 10:28:04PM +0300, Cyrill Gorcunov wrote:
> On Mon, Jun 04, 2018 at 11:45:51AM -0700, Andrey Vagin wrote:
> > On Wed, May 23, 2018 at 07:06:07PM +0300, Cyrill Gorcunov wrote:
> > > The pid number is absolutely not a guarantee to
> > > introduce order relationship into pid set because
> > > they might be reused. Instead we should consider
> > > parent->child relationship.
> > 
> > I don't understand this. Could you elaborate with more details?
> > 
> > > 
> > > Otherwise we might get
> > > 
> > >  |  31964  31964  31964       epoll
> > >  |    585  31964  31964           epoll
> > >  |    586  31964  31964           epoll
> > >  |...
> > >  | (04.797121)    585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)
> 
> An example tree is above: comparing only pid numbers we will
> choose 586 as a master and 31964 as a slave, which is wrong
> of course. Instead we should walk over the pid tree and
> find which pid is a master (parent) and which is slave (children)

Why is it wrong? Why is the parent better than a child?
Cyrill Gorcunov June 6, 2018, 7:23 p.m.
On Mon, Jun 04, 2018 at 03:26:30PM -0700, Andrey Vagin wrote:
> On Mon, Jun 04, 2018 at 10:28:04PM +0300, Cyrill Gorcunov wrote:
> > On Mon, Jun 04, 2018 at 11:45:51AM -0700, Andrey Vagin wrote:
> > > On Wed, May 23, 2018 at 07:06:07PM +0300, Cyrill Gorcunov wrote:
> > > > The pid number is absolutely not a guarantee to
> > > > introduce order relationship into pid set because
> > > > they might be reused. Instead we should consider
> > > > parent->child relationship.
> > > 
> > > I don't understand this. Could you elaborate with more details?
> > > 
> > > > 
> > > > Otherwise we might get
> > > > 
> > > >  |  31964  31964  31964       epoll
> > > >  |    585  31964  31964           epoll
> > > >  |    586  31964  31964           epoll
> > > >  |...
> > > >  | (04.797121)    585: Error (criu/eventpoll.c:180): epoll: Unexpected state for tfd (id 0 fd 8)
> > 
> > An example tree is above: comparing only pid numbers we will
> > choose 586 as a master and 31964 as a slave, which is wrong
> > of course. Instead we should walk over the pid tree and
> > find which pid is a master (parent) and which is slave (children)
> 
> Why is it wrong? Why is the parent better than a child?

Please take a look on https://jira.sw.ru/browse/PSBM-63355, the
former fix I made for criu-2.10 series where we alread had new
files engine. The parent should have opened eventpoll descriptor
so we would add target descriptors then.

I think what I really should do instead is to create a test
case for it since we have an ability to generate pid tree
with pids needed. Thus for now please drop the patch.
Once I create a test and it get failed without a patch
then I'll come with some proves.