On Mon, Aug 19, 2019 at 09:38:41AM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 07:24:09PM +1000, Dave Chinner wrote:
> > So that leaves just the normal close() syscall exit case, where the
> > application has full control of the order in which resources are
> > released. We've already established that we can block in this
> > context. Blocking in an interruptible state will allow fatal signal
> > delivery to wake us, and then we fall into the
> > fatal_signal_pending() case if we get a SIGKILL while blocking.
>
> The major problem with RDMA is that it doesn't always wait on close() for the
> MR holding the page pins to be destroyed. This is done to avoid a
> deadlock of the form:
>
>   uverbs_destroy_ufile_hw()
>     mutex_lock()
>       [..]
>         mmput()
>           exit_mmap()
>             remove_vma()
>               fput();
>                 file_operations->release()

I think this is wrong, and I'm pretty sure it's an example of why
the final __fput() call is moved out of line.
  fput()
    fput_many()
      task_work_add(f, __fput())
and the call chain ends there.
Before the syscall returns to userspace, it then runs the __fput()
call through the task_work_run() interfaces, and hence the call
chain is just:
  task_work_run
    __fput
      file_operations->release()
        ib_uverbs_close()
          uverbs_destroy_ufile_hw()
            mutex_lock()   <-- Deadlock
And there is no deadlock at that mutex_lock(), because nothing holds
the mutex at this point.
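
To illustrate the ordering, here is a toy, single-threaded userspace
model of that deferral. It is only a sketch, not the kernel code: every
toy_* name is made up for illustration, the "mutex" is just a flag that
asserts where a real one would self-deadlock, and the work list stands
in for the per-task work that task_work_run() processes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex: asserts where a real one would deadlock. */
struct toy_mutex { bool held; };

static void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);	/* recursive acquisition == deadlock */
	m->held = true;
}

static void toy_unlock(struct toy_mutex *m)
{
	m->held = false;
}

struct toy_file {
	int refcount;
	void (*release)(struct toy_file *f);
	struct toy_file *next_work;	/* link in the deferred-work list */
};

static struct toy_file *pending_work;	/* models the task's queued work */
static struct toy_mutex ufile_mutex;
static int releases_run;
static bool mutex_held_in_release;

/* Dropping the last reference only queues the release; it does not run it. */
static void toy_fput(struct toy_file *f)
{
	if (--f->refcount == 0) {
		f->next_work = pending_work;
		pending_work = f;
	}
}

/* Models the point just before return to userspace: no locks are held. */
static void toy_task_work_run(void)
{
	while (pending_work) {
		struct toy_file *f = pending_work;

		pending_work = f->next_work;
		f->release(f);
	}
}

/* Models ->release() taking the same mutex as the destroy path. */
static void toy_release(struct toy_file *f)
{
	(void)f;
	mutex_held_in_release = ufile_mutex.held;
	toy_lock(&ufile_mutex);
	releases_run++;
	toy_unlock(&ufile_mutex);
}

/* Models the destroy path dropping the last file reference under the mutex. */
static void toy_destroy_path(struct toy_file *f)
{
	toy_lock(&ufile_mutex);
	toy_fput(f);		/* queued, not run: no recursion into release */
	toy_unlock(&ufile_mutex);
}

/* Returns 0 if release ran only after the mutex was dropped. */
int toy_demo(void)
{
	struct toy_file f = { .refcount = 1, .release = toy_release };
	int start = releases_run;

	toy_destroy_path(&f);
	if (releases_run != start)
		return 1;	/* release ran inline, under the mutex */
	toy_task_work_run();
	if (releases_run != start + 1 || mutex_held_in_release)
		return 1;
	return 0;
}
```

Calling toy_demo() from a main() returns 0: the release callback only
runs from toy_task_work_run(), after toy_destroy_path() has already
dropped the mutex. If toy_fput() called f->release() directly instead
of queueing it, the assert in toy_lock() would fire, which is the
deadlock shape quoted above.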
Cheers,
Dave.
--
Dave Chinner
david(a)fromorbit.com