[Devel,vz7] ploop: push_backup: ploop_pb_get_pending should wait again instead of ENOENT

Submitted by Maxim Patlasov on July 20, 2017, 10:06 p.m.

Details

Message ID 150058834007.15458.577893455423236810.stgit@maxim-thinkpad
State New
Series "ploop: push_backup: ploop_pb_get_pending should wait again instead of ENOENT"

Commit Message

Maxim Patlasov July 20, 2017, 10:06 p.m.
The patch fixes a race in which ploop_pb_get_pending is rightly woken up
to pass an extent to userspace, but before it re-acquires pbd->ppb_lock,
another thread of vz_backup_agent reports exactly this extent as processed.

This effectively steals the extent from ploop_pb_get_pending, so it fails
to get a preq from ploop_pb_get_first_reqs_from_pending(). Before the patch,
the kernel returned ENOENT to userspace, which confused vz_backup_agent.
Since the race happens in the kernel and userspace cannot control it, let's
retry in the kernel.
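
For illustration only, the retry can be modelled with a small
single-threaded sketch. This is not the ploop code: struct pending_queue,
wait_for_wakeup(), get_pending() and the steals_left counter are all
hypothetical names invented here, and the "race" is simulated
deterministically instead of using real threads and pbd->ppb_lock.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8

/* Hypothetical, simplified stand-in for the list of pending extents. */
struct pending_queue {
	int extents[QUEUE_CAP];
	size_t count;
	int steals_left;	/* how many times the simulated racer wins */
	int waits;		/* how many times the getter had to wait */
};

static void queue_push(struct pending_queue *q, int extent)
{
	if (q->count < QUEUE_CAP)
		q->extents[q->count++] = extent;
}

static bool queue_pop(struct pending_queue *q, int *extent)
{
	if (q->count == 0)
		return false;
	*extent = q->extents[--q->count];
	return true;
}

/*
 * Simulated sleep with the lock dropped: an extent is queued and the
 * getter is woken up, but a racing thread may report that extent as
 * processed (removing it) before the getter retakes the lock.
 */
static void wait_for_wakeup(struct pending_queue *q, int next_extent)
{
	int stolen;

	q->waits++;
	queue_push(q, next_extent);
	if (q->steals_left > 0) {
		q->steals_left--;
		queue_pop(q, &stolen);
	}
}

/*
 * retry == false models the behaviour before the patch (-ENOENT after a
 * stolen wakeup); retry == true models waiting again instead.
 */
static int get_pending(struct pending_queue *q, int next_extent,
		       bool retry, int *out)
{
	for (;;) {
		if (queue_pop(q, out))
			return 0;

		wait_for_wakeup(q, next_extent);

		if (queue_pop(q, out))
			return 0;
		if (!retry)
			return -ENOENT;
		/* The wakeup was stolen: go back and wait again. */
	}
}
```

With steals_left set to 1, get_pending(..., false, ...) returns -ENOENT
after the first wakeup, while get_pending(..., true, ...) waits a second
time and hands back the extent.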

https://jira.sw.ru/browse/PSBM-68608
Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
---
 drivers/block/ploop/push_backup.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/block/ploop/push_backup.c b/drivers/block/ploop/push_backup.c
index 032706e..d92b93c 100644
--- a/drivers/block/ploop/push_backup.c
+++ b/drivers/block/ploop/push_backup.c
@@ -803,6 +803,7 @@  int ploop_pb_get_pending(struct ploop_pushbackup_desc *pbd,
 			err = -EBUSY;
 			goto get_pending_unlock;
 		}
+wait_again:
 		pbd->ppb_waiting = true;
 		spin_unlock_irq(&pbd->ppb_lock);
 
@@ -825,7 +826,8 @@  int ploop_pb_get_pending(struct ploop_pushbackup_desc *pbd,
 				err =  -ESTALE;
 			else if (signal_pending(current))
 				err = -ERESTARTSYS;
-			else err = -ENOENT;
+			else
+				goto wait_again;
 
 			goto get_pending_unlock;
 		}