* Dan Williams <dan.j.williams(a)intel.com> wrote:
> What is the primary thing that is driving this need? Do we have a
> very concrete example?
My pet concrete example is covered by __pfn_t. Referencing
persistent memory in an md/dm hierarchical storage configuration.
Setting aside the thrash to get existing block users to do
"bvec_set_page(page)" instead of "bvec->page = page", the onus is
on that md/dm implementation and backing storage device driver to
operate on __pfn_t. That use case is simple because there is no use
of page locking or refcounting in that path, just dma_map_page() and
kmap_atomic(). The more difficult use case is precisely what Al
picked up on, O_DIRECT and RDMA. This patchset does nothing to
address those use cases outside of not needing a struct page when
they eventually craft a bio.
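A minimal userspace sketch of the accessor idea described above, where callers go through "bvec_set_page()" instead of assigning a page pointer directly. Everything with a "_sketch" suffix (the types, the flat page array, and the field names) is a hypothetical stand-in and not the kernel's actual definitions; only the name bvec_set_page() comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page { unsigned long flags; };

/* Hypothetical flat page array, used only to derive a frame number. */
#define NR_PAGES_SKETCH 16
static struct page mem_map_sketch[NR_PAGES_SKETCH];

/* Assumed shape of a __pfn_t-like wrapper: just a frame number. */
typedef struct { uint64_t pfn; } pfn_sketch_t;

/* A bio_vec-like element that stores a frame number, not a page pointer. */
struct bio_vec_sketch {
    pfn_sketch_t bv_pfn;
    unsigned int bv_len;
    unsigned int bv_offset;
};

/* Convert a page pointer to its index in the hypothetical page array. */
static inline pfn_sketch_t page_to_pfn_sketch(const struct page *page)
{
    ptrdiff_t idx = page - mem_map_sketch;

    assert(idx >= 0 && idx < NR_PAGES_SKETCH);
    return (pfn_sketch_t){ .pfn = (uint64_t)idx };
}

/* Existing users keep passing a page; the accessor hides the storage. */
static inline void bvec_set_page(struct bio_vec_sketch *bvec,
                                 const struct page *page)
{
    bvec->bv_pfn = page_to_pfn_sketch(page);
}

/* Users with no struct page at all store a raw frame number. */
static inline void bvec_set_pfn_sketch(struct bio_vec_sketch *bvec,
                                       uint64_t pfn)
{
    bvec->bv_pfn.pfn = pfn;
}
```

With an accessor of this kind, a driver that only needs the frame number never has to know whether a struct page exists behind it.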
So why not do a dual approach?
There are code paths where the 'pfn' of a persistent device is mostly
used as a sector_t equivalent of terabytes of storage, not as an index
of a memory object.
It's not an address to a cache, it's an index into a huge storage
space - which happens to be (flash) RAM. For them using pfn_t seems
natural and using struct page * is a strained (not to mention
For more complex facilities, where persistent memory is used as a
memory object, especially where the underlying device is true,
infinitely writable RAM (not flash), treating it as a memory zone, or
setting up dynamic struct page would be the natural approach. (with
the inevitable cost of setup/teardown in the latter case)
I'd say that for anything where the dynamic struct page is torn down
unconditionally after completion of only a single use, the natural API
is probably pfn_t, not struct page. Any synchronization is already
handled at the block request layer, and it's storage op
synchronization, not memory access synchronization really.
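To illustrate the "pfn as a sector_t equivalent" view, here is a rough sketch of the arithmetic that treats a frame number as an index into storage. It assumes 4 KiB pages and 512-byte sectors, and the "_sketch" names are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH   12  /* assumption: 4 KiB pages */
#define SECTOR_SHIFT_SKETCH 9   /* assumption: 512-byte sectors */
#define PFN_SECTOR_SHIFT    (PAGE_SHIFT_SKETCH - SECTOR_SHIFT_SKETCH)

typedef uint64_t sector_sketch_t;

/* First sector covered by a given page frame. */
static inline sector_sketch_t pfn_to_sector_sketch(uint64_t pfn)
{
    /* Guard against shifting significant bits out of the 64-bit value. */
    assert(pfn <= (UINT64_MAX >> PFN_SECTOR_SHIFT));
    return pfn << PFN_SECTOR_SHIFT;
}

/* Page frame that contains a given sector (rounds down). */
static inline uint64_t sector_to_pfn_sketch(sector_sketch_t sector)
{
    return sector >> PFN_SECTOR_SHIFT;
}
```

Under these assumptions each frame spans eight sectors, so the frame number behaves like a coarser-grained sector index with no reference counting or locking involved.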
For anything more complex, that maps any of this storage to
user-space, or exposes it to higher level struct page based APIs,
etc., where references matter and it's more of a cache with
potentially multiple users, not an IO space, the natural API is struct
page.
I'd say that this particular series mostly addresses the 'pfn as
sector_t' side of the equation, where persistent memory is IO space,
not memory space, and as such it is the more natural and thus also the
Linus probably disagrees? :-)