On Mon, Oct 16, 2017 at 10:43 AM, Dan Williams <dan.j.williams(a)intel.com> wrote:
On Mon, Oct 16, 2017 at 12:26 AM, Christoph Hellwig <hch(a)lst.de> wrote:
> On Fri, Oct 13, 2017 at 11:31:45AM -0600, Jason Gunthorpe wrote:
>> I don't think that really represents how lots of apps actually use
>> RDMA.
>>
>> RDMA is often buried down in the software stack (e.g. in an MPI), and by
>> the time a mapping gets used for RDMA transfer the link between the
>> FD, mmap and the MR is totally opaque.
>>
>> Having an MR-specific notification means the low-level RDMA libraries
>> have a chance to deal with everything for the app.
>>
>> E.g. consider an HPC app using MPI that uses some DAX-aware library to
>> get DAX-backed mmaps. It then passes memory in those mmaps to the
>> MPI library to do transfers. The MPI creates the MR on demand.
>>
>
> I suspect one of the more interesting use cases might be a file server,
> for which that's not the case. But otherwise I agree with the above,
> and also think that notifying the MR handle is the only way to go for
> another very important reason: fencing. What if the application/library
> does not react to the notification? With a per-MR notification we
> can unregister the MR in kernel space and have a rock solid fencing
> mechanism. And that is the most important bit here.

While I agree with the need for a per-MR notification mechanism, one
thing we lose by walking away from MAP_DIRECT is a way for a
hypervisor to coordinate passthrough of a DAX mapping to an RDMA
device in a guest. That will remain a case where we still need to
use device-dax. I'm fine if that's the answer, but I just want to be
clear about all the places we need to protect a DAX mapping against
RDMA from a non-ODP device.

For this specific issue perhaps we promote FL_LAYOUT as a lease type
that can be set by fcntl().
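
For reference, a rough sketch of how the existing lease types are taken
from userspace today, assuming Linux with glibc. Only F_RDLCK, F_WRLCK
and F_UNLCK are accepted by F_SETLEASE at present; a layout lease type
is just the proposal above, so the place where it would be passed is
marked in a comment rather than invented as a constant. The helper names
(take_lease, drop_lease, lease_name) are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* F_SETLEASE and F_GETLEASE are Linux extensions */
#endif
#include <fcntl.h>

/*
 * Ask the kernel for a lease of the given type on fd.
 * Today 'type' must be F_RDLCK, F_WRLCK or F_UNLCK. A layout lease,
 * if FL_LAYOUT were promoted as suggested, would hypothetically be
 * requested here with a new type value that does not exist yet.
 * Returns 0 on success, -1 on failure with errno set by fcntl().
 */
int take_lease(int fd, int type)
{
    return fcntl(fd, F_SETLEASE, type) == -1 ? -1 : 0;
}

/* Release whatever lease the caller holds on fd. */
int drop_lease(int fd)
{
    return take_lease(fd, F_UNLCK);
}

/* Human-readable name for a value returned by fcntl(fd, F_GETLEASE). */
const char *lease_name(int type)
{
    switch (type) {
    case F_RDLCK: return "read";
    case F_WRLCK: return "write";
    case F_UNLCK: return "none";
    default:      return "unknown";
    }
}
```

With the existing lease types, the holder is signalled (SIGIO by
default) when a conflicting open or truncate arrives, and the kernel
breaks the lease itself if the holder does not release or downgrade it
within the lease-break time. That forced break is the property that
lines up with the fencing argument above.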