On Sun, May 10, 2015 at 2:46 AM, Ingo Molnar <mingo(a)kernel.org> wrote:
* Dan Williams <dan.j.williams(a)intel.com> wrote:
> > "Directly mapped pmem integrated into the page cache":
> > ------------------------------------------------------
> Nice, I think it makes sense as an area that gets reserved at file
> system creation time. You are not proposing that this gets
> automatically reserved at the device level, right? [...]
Well, it's most practical if the device does it automatically (the
layout is determined prior to filesystem creation), and the filesystem
does not necessarily have to be aware of it - but obviously as a user
opt-in.
Hmm, my only hesitation is that the raw size of a pmem device is
visible outside of Linux (UEFI/BIOS, other OSes, etc.). What about a
simple layered block-device that fronts a raw pmem device? It can
store a small superblock signature and then reserve / init the struct
page space. This is where we can use the __pfn_t flags that Linus
suggested. Whereas the raw device ->direct_access() returns __pfn_t
values with a 'PFN_DEV' flag indicating that they originate from
device memory, a ->direct_access() on this struct-page-provider-device
yields __pfn_t's with 'PFN_DEV | PFN_MAPPED' indicating that
__pfn_t_to_page() can attempt to look up the page in a device-specific
manner (similar to how kmap_atomic_pfn_t is implemented in the
'evacuate struct page from the block layer' patch set).
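
To make the flag idea concrete, here is a rough user-space sketch of how
the two kinds of ->direct_access() results might be told apart. It is
only a model under stated assumptions, not the kernel implementation:
the bit positions, the pfn_model / page_model types, the fixed device
range, and every helper name below are hypothetical stand-ins for
__pfn_t, struct page, and __pfn_t_to_page().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: flags live in the low bits of the value. */
#define PFN_FLAG_BITS  2
#define PFN_FLAGS_MASK ((UINT64_C(1) << PFN_FLAG_BITS) - 1)
#define PFN_DEV        (UINT64_C(1) << 0) /* pfn comes from device memory */
#define PFN_MAPPED     (UINT64_C(1) << 1) /* device pfn has a struct page */

/* Stand-ins for __pfn_t and struct page (assumed, simplified). */
typedef struct { uint64_t val; } pfn_model;
struct page_model { uint64_t pfn; };

/* Hypothetical device range plus the page array the provider reserves. */
#define DEV_BASE_PFN UINT64_C(0x100000)
#define DEV_NR_PAGES 16
static struct page_model dev_pages[DEV_NR_PAGES];

static pfn_model pfn_model_make(uint64_t pfn, uint64_t flags)
{
    pfn_model p;

    assert((flags & ~PFN_FLAGS_MASK) == 0);
    assert(pfn <= (UINT64_MAX >> PFN_FLAG_BITS));
    p.val = (pfn << PFN_FLAG_BITS) | flags;
    return p;
}

static uint64_t pfn_model_flags(pfn_model p)
{
    return p.val & PFN_FLAGS_MASK;
}

static uint64_t pfn_model_pfn(pfn_model p)
{
    return p.val >> PFN_FLAG_BITS;
}

/* Raw pmem device: device memory with no struct page behind it. */
static pfn_model raw_direct_access(uint64_t idx)
{
    return pfn_model_make(DEV_BASE_PFN + idx, PFN_DEV);
}

/* Layered provider device: same memory, but pages are available. */
static pfn_model provider_direct_access(uint64_t idx)
{
    return pfn_model_make(DEV_BASE_PFN + idx, PFN_DEV | PFN_MAPPED);
}

/* Model of __pfn_t_to_page(): device-specific lookup only if PFN_MAPPED. */
static struct page_model *pfn_model_to_page(pfn_model p)
{
    uint64_t flags = pfn_model_flags(p);
    uint64_t idx;

    if (!(flags & PFN_DEV))
        return NULL; /* ordinary RAM is not modeled in this sketch */
    if (!(flags & PFN_MAPPED))
        return NULL; /* raw device pfn: no struct page to return */

    /* Unsigned wrap makes a pfn below the base fail the range check. */
    idx = pfn_model_pfn(p) - DEV_BASE_PFN;
    if (idx >= DEV_NR_PAGES)
        return NULL;

    /* Record the pfn so a caller can check which page came back. */
    dev_pages[idx].pfn = pfn_model_pfn(p);
    return &dev_pages[idx];
}
```

With this model, pfn_model_to_page(raw_direct_access(3)) returns NULL,
while the same offset obtained through provider_direct_access(3)
resolves to an entry in the reserved page array. Callers that can cope
with a missing page only need to test the flags, not know which device
they are talking to.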