[RHEL7,COMMIT] nfs: protect callback execution against per-net callback thread shutdown

Submitted by Konstantin Khorenko on Nov. 9, 2017, 11:12 a.m.


Message ID 201711091112.vA9BCgfs009818@finist_ce7.work
State New
Series "nfs: fix race between callback shutdown and execution"
Headers show

Commit Message

Konstantin Khorenko Nov. 9, 2017, 11:12 a.m.
The commit is pushed to "branch-rh7-3.10.0-693.1.1.vz7.37.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-693.1.1.vz7.37.25
commit 2149800a70af636b2b22289fc5aa977420b392c2
Author: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Date:   Thu Nov 9 14:12:42 2017 +0300

    nfs: protect callback execution against per-net callback thread shutdown
    Patchset description:
    nfs: fix race between callback shutdown and execution
    The idea is to use mutex for protecting callback execution agains per-net
    callback shutdown and destroying all the net-related backchannel requests
    before transports destruction.
    Stanislav Kinsburskiy (2):
        sunrpc: bc_svc_flush_queue_net() helper introduced
        nfs: protect callback execution against per-net callback thread shutdown
    This patch description:
    Here is the race:
    CPU #0				CPU#1
    cleanup_mnt			nfs41_callback_svc (get xprt from the list)
    nfs_callback_down		...
    ...				...
    svc_close_net			...
    ...				...
    svc_xprt_free			...
    svc_bc_sock_free		bc_svc_process
    kfree(xprt)			svc_process_common
    				rqstp->rq_xprt->xpt_ops (use after free)
    The problem is that per-net SUNRPC transports shutdown is done regardless
    current callback execution. This is a race leading to transport use-after-free
    in callback handler.
    This patch fixes it in stright-forward way. I.e. it protects callback
    execution with the same mutex used for per-net data creation and destruction.
    Hopefully, it won't slow down NFS client significantly.
    v5: destroy all per-net backchannel requests before transports on in
    v4: use another mutex to protect callback execution agains per-net transports
    This guarantees, that transports won't be destroyed by shutdown callback while
    execution is in progress and vice versa.
    v3: Fix mutex deadlock, when shutdown callback waits for thread to exit (with
    mutex taken), while thread wait for the mutex to take.
    The idea is to simply check if thread has to exit, if mutex lock has failed.
    This is a busy loop, but it shouldn't happend often and for long.
    Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
    Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
 fs/nfs/callback.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Patch hide | download patch | download mbox

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 0beb275..e18d774 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -99,6 +99,8 @@  nfs4_callback_up(struct svc_serv *serv)
 #if defined(CONFIG_NFS_V4_1)
+static DEFINE_MUTEX(nfs41_callback_mutex);
  * The callback service for NFSv4.1 callbacks
@@ -117,6 +119,12 @@  nfs41_callback_svc(void *vrqstp)
 		if (try_to_freeze())
+		mutex_lock(&nfs41_callback_mutex);
+		if (kthread_should_stop()) {
+			mutex_unlock(&nfs41_callback_mutex);
+			return 0;
+		}
 		prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE);
 		if (!list_empty(&serv->sv_cb_list)) {
@@ -129,8 +137,10 @@  nfs41_callback_svc(void *vrqstp)
 			error = bc_svc_process(serv, req, rqstp);
 			dprintk("bc_svc_process() returned w/ error code= %d\n",
+			mutex_unlock(&nfs41_callback_mutex);
 		} else {
+			mutex_unlock(&nfs41_callback_mutex);
 			finish_wait(&serv->sv_cb_waitq, &wq);
@@ -242,6 +252,7 @@  static void nfs_callback_down_net(u32 minorversion, struct svc_serv *serv, struc
 	dprintk("NFS: destroy per-net callback data; net=%p\n", net);
+	bc_svc_flush_queue_net(serv, net);
 	svc_shutdown_net(serv, net);
@@ -377,7 +388,13 @@  void nfs_callback_down(int minorversion, struct net *net)
 	struct nfs_callback_data *cb_info = &nfs_callback_info[minorversion];
+#if defined(CONFIG_NFS_V4_1)
+	mutex_lock(&nfs41_callback_mutex);
+	nfs_callback_down_net(minorversion, cb_info->serv, net);
+	mutex_unlock(&nfs41_callback_mutex);
 	nfs_callback_down_net(minorversion, cb_info->serv, net);
 	if (cb_info->users == 0 && cb_info->task != NULL) {