fix crash on module reference leak

Submitted by Pavel Tikhomirov on Feb. 26, 2018, 10:01 a.m.

Details

Reviewer None
Submitted Feb. 26, 2018, 10:01 a.m.
Last Updated Feb. 26, 2018, 9:04 p.m.
Revision 1

Cover Letter

The race happens as follows (#1..#4 mark the order of events across the
two code paths):
========
mutex_lock(&module_mutex);
try_stop_module -> stop_machine -> __stop_machine -> stop_cpus ->
__stop_cpus->[for_each_cpu]cpu_stop_queue_work ... -> __try_stop_module
{
	if (module_refcount
	    {
		count decs;
		smp_rmb;
		count incs;
		return incs - decs;
	})
		return -EWOULDBLOCK;

#2 decs == incs, no reference on module
	
	state = MODULE_STATE_GOING;
#3 change state to GOING away
}
mutex_unlock(&module_mutex);

========
uio_open
{
	try_module_get
	{
		preempt_disable(); // only compiler barrier
		if (module_is_live {state != MODULE_STATE_GOING})
#1 load and check state is not GOING

		{
			increment incs;
#4 increment while already GOING
		}
		preempt_enable();
	};
}

Commit 24a2b6e22b38 fixes this by using atomic_inc_not_zero in
try_module_get. As a result, either #4 cannot happen because
try_stop_module has already released the module reference, or
try_stop_module cannot release the reference because try_module_get has
already taken its reference on the module.
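
The mutual exclusion described above can be modelled in userspace with
C11 atomics. This is only a minimal sketch, not the kernel code: the
fake_* names and REF_BASE are hypothetical, and the stop path is
simplified to a single compare-and-swap. It assumes the counter starts
at a base value of 1 while the module is loaded and reads 0 only once
unloading has begun.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
#define REF_BASE 1

struct fake_module {
	atomic_int refcnt;	/* REF_BASE when loaded and idle, 0 when going away */
};

static void fake_module_init(struct fake_module *mod)
{
	atomic_init(&mod->refcnt, REF_BASE);
}

/* Models an "increment unless zero" get: fails once unloading has started. */
static bool fake_try_module_get(struct fake_module *mod)
{
	int old = atomic_load(&mod->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&mod->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

static void fake_module_put(struct fake_module *mod)
{
	atomic_fetch_sub(&mod->refcnt, 1);
}

/* Drops the base reference only if nobody else holds a reference. */
static bool fake_try_stop_module(struct fake_module *mod)
{
	int expected = REF_BASE;

	return atomic_compare_exchange_strong(&mod->refcnt, &expected, 0);
}
```

Because the check and the update are a single atomic operation on one
counter, the get and the stop cannot both succeed: whichever one wins
the compare-and-swap forces the other to fail.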

Note: These patches are cherry-picked from MS with two small conflicts:
1) In the second patch, a RHEL-introduced structure changes the hunk's
surroundings, but there is no meaningful change here.
2) In the third patch, the stop_machine header is left in place, as it
is still used elsewhere.

https://jira.sw.ru/browse/PSBM-80508

Christoph Lameter (1):
  modules: use raw_cpu_write for initialization of per cpu refcount.

Masami Hiramatsu (2):
  module: Replace module_ref with atomic_t refcnt
  module: Remove stop_machine from module unloading

 include/linux/module.h        | 16 +-------
 include/trace/events/module.h |  2 +-
 kernel/module.c               | 93 ++++++++++++++++++-------------------------
 3 files changed, 40 insertions(+), 71 deletions(-)