On 11/01/2017 12:25 PM, Dan Williams wrote:
[..]
> It's not persistent memory if it requires a hypercall to make it
> persistent. Unless memory writes can be made durable purely with cpu
> instructions it's dangerous for it to be treated as a PMEM range.
> Consider a guest that tried to map it with device-dax which has no
> facility to route requests to a special flushing interface.
>
Can we separate the concept of flush interface from persistent memory?
Say there are two APIs, one is used to indicate the memory type (i.e,
/proc/iomem) and another one indicates the flush interface.
So for existing nvdimm hardware:
1: Persist-memory + CLFLUSH
2: Persist-memory + flush-hint-table (I know Intel does not use it)
and for the virtual nvdimm which is backed by normal storage:
Persist-memory + virtual flush interface
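
As a rough illustration of the split proposed above, the memory type and the flush mechanism could be described as two independent properties. This is only a sketch: every name below (mem_type, flush_method, range_desc, flush_needs_host) is hypothetical and does not correspond to an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: what /proc/iomem-style reporting would say about the range. */
enum mem_type { MEM_VOLATILE, MEM_PERSISTENT };

/* Hypothetical: how writes to the range are made durable. */
enum flush_method {
    FLUSH_CPU_INSN,   /* CLFLUSH-style cpu instructions */
    FLUSH_HINT_TABLE, /* flush hint table */
    FLUSH_VIRTUAL     /* virtual flush interface serviced by the host */
};

/* The two properties are reported separately for each range. */
struct range_desc {
    enum mem_type type;
    enum flush_method flush;
};

/* True when making writes durable requires talking to the host. */
static bool flush_needs_host(const struct range_desc *r)
{
    assert(r != NULL);
    return r->flush == FLUSH_VIRTUAL;
}
```

With this layout the virtual nvdimm would be { MEM_PERSISTENT, FLUSH_VIRTUAL }, while existing hardware would be { MEM_PERSISTENT, FLUSH_CPU_INSN }.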
I see the flush interface as fundamental to identifying the media
properties. It's not byte-addressable persistent memory if the
application needs to call a sideband interface to manage writes. This
is why we have pushed for something like the MAP_SYNC interface to
make filesystem-dax actually behave in a way that applications can
safely treat it as persistent memory, and this is also the guarantee
that device-dax provides. Changing the flush interface makes it
distinct and unusable for applications that want to manage data
persistence in userspace.
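
A minimal sketch of what that guarantee looks like from the application side, assuming the MAP_SHARED_VALIDATE | MAP_SYNC flag pair of mmap(2). The helper name map_sync_or_fail is hypothetical, and the fallback flag values are the asm-generic ones, included only as an assumption for older headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers (asm-generic values; an assumption). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/*
 * Hypothetical helper: map a file so that the mmap() call fails outright
 * when the kernel or filesystem cannot honour MAP_SYNC, rather than
 * silently handing back a mapping without the guarantee.
 * Returns MAP_FAILED on error, like mmap() itself.
 */
static void *map_sync_or_fail(int fd, size_t len)
{
    assert(len > 0);
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
}
```

An application that gets MAP_FAILED here has to fall back to fsync()/msync() instead of managing persistence with cpu instructions.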
>>
>>> In what way is this "more complicated"? It was trivial to add support
>>> for the "volatile" NFIT range, this will not be any more complicated
>>> than that.
>>>
>>
>> Introducing memory type is easy indeed, however, a new flush interface
>> definition is inevitable, i.e, we need a standard way to discover the
>> MMIOs to communicate with host.
>
>
> Right, the proposed way to do that for x86 platforms is a new SPA
> Range GUID type in the NFIT.
>
So this SPA is used for both the persistent memory region and the flush
interface? Maybe I missed it in previous mails, could you please detail
how to do it?
Yes, the GUID will specifically identify this range as "Virtio Shared
Memory" (or whatever name survives after a bikeshed debate). The
libnvdimm core then needs to grow a new region type that mostly
behaves the same as a "pmem" region, but drivers/nvdimm/pmem.c grows a
new flush interface to perform the host communication. Device-dax
would be disallowed from attaching to this region type, or we could
grow a new device-dax type that does not allow the raw device to be
mapped, but allows a filesystem mounted on top to manage the flush
interface.
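
Roughly, and only as a sketch of the shape of the idea, the region type carries an optional flush callback and device-dax checks the type before attaching. None of these names (region_type, region, host_flush, device_dax_may_attach, region_flush) are existing libnvdimm symbols, and the host communication is stubbed out with a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region types; the last one is the proposed new type. */
enum region_type { REGION_PMEM, REGION_VOLATILE, REGION_VIRTIO_SHMEM };

struct region {
    enum region_type type;
    /* NULL means cpu instructions alone make writes durable. */
    int (*flush)(struct region *r);
    unsigned long host_flushes; /* stub: counts requests sent to the host */
};

/* Stand-in for the host communication path (MMIO/PIO in a real driver). */
static int host_flush(struct region *r)
{
    r->host_flushes++;
    return 0;
}

/* Raw device-dax mappings cannot route flushes, so refuse the new type. */
static bool device_dax_may_attach(const struct region *r)
{
    return r->type != REGION_VIRTIO_SHMEM;
}

/* What the pmem driver would call on a flush request. */
static int region_flush(struct region *r)
{
    assert(r != NULL);
    return r->flush ? r->flush(r) : 0;
}
```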
BTW, please note that a hypercall is not acceptable for a standard;
MMIO/PIO regions are. (Oh, yes, it depends on Paolo. :))
MMIO/PIO regions work for me, that's not the part of the proposal I'm
concerned about.