Dan Williams <dan.j.williams@intel.com> writes:

> In general, MAP_SYNC makes more semantic sense in that the filesystem
> knows that the application is not going to be calling *sync and so it
> makes sure its metadata is consistent after a write fault.

What you wrote is true for both MAP_SYNC and MAP_PMEM_AWARE. :)
I assume you meant that MAP_SYNC is semantically cleaner from the file
system developer's point of view, yes? Boaz, it might be helpful for
you to write down how an application might be structured to make use of
MAP_PMEM_AWARE. Up to this point, I've been assuming you'd call it
whenever an application would call pcommit (or whatever the incantation
is on current CPUs).
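
To make that concrete, here's roughly the application structure I've
been picturing.  Nothing below is an existing API: the MAP_PMEM_AWARE
value is invented for illustration, and clflush+sfence is just one
possible flush incantation (clwb and pcommit would slot into the same
place on CPUs that have them, and 64-byte cache lines are assumed):

#include <sys/mman.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <emmintrin.h>          /* _mm_clflush(), _mm_sfence() */

#ifndef MAP_PMEM_AWARE
#define MAP_PMEM_AWARE  0x80000 /* invented value, flag is not upstream */
#endif

/* Flush every cache line covering [addr, addr + len). */
static void flush_range(const void *addr, size_t len)
{
        uintptr_t p = (uintptr_t)addr & ~(uintptr_t)63;

        for (; p < (uintptr_t)addr + len; p += 64)
                _mm_clflush((const void *)p);
        _mm_sfence();           /* order the flushes; pcommit (where
                                   required) would follow here */
}

int main(void)
{
        const char msg[] = "persistent update";
        char *buf;
        int fd;

        fd = open("/mnt/dax/file", O_RDWR);
        if (fd < 0)
                return 1;

        buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_PMEM_AWARE, fd, 0);
        if (buf == MAP_FAILED)
                return 1;

        /* store, then make it durable ourselves -- no msync()/fsync() */
        memcpy(buf, msg, sizeof(msg));
        flush_range(buf, sizeof(msg));

        munmap(buf, 4096);
        close(fd);
        return 0;
}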

> Although if we had MAP_SYNC today we'd still be in the situation that
> an app that fails to do its own cache flushes / bypass correctly gets
> to keep the broken pieces.

Dan, we already have this problem with existing storage and existing
interfaces. Nothing changes with dax.

> The crux of the problem, in my opinion, is that we're asking for an "I
> know what I'm doing" flag, and I expect that's an impossible statement
> for a filesystem to trust generically.

The file system already trusts that. If an application doesn't use
fsync properly, guess what, it will break. This line of reasoning
doesn't make any sense to me.
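
That existing contract, for reference, is nothing more than this, and
we already trust every application using a shared mapping to get it
right (an ordinary page-cache-backed file, nothing dax-specific):

#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char msg[] = "ordinary page-cache write";
        char *buf;
        int fd;

        fd = open("/mnt/data/file", O_RDWR);
        if (fd < 0)
                return 1;

        buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (buf == MAP_FAILED)
                return 1;

        memcpy(buf, msg, sizeof(msg));

        /*
         * Forget this msync() (or an fsync() on the write() path) and
         * the data may never reach stable storage -- the same broken
         * pieces an application gets to keep today, with or without dax.
         */
        msync(buf, 4096, MS_SYNC);

        munmap(buf, 4096);
        close(fd);
        return 0;
}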

> If you can get MAP_PMEM_AWARE in, great, but I'm more and more of the
> opinion that the "I know what I'm doing" interface should be something
> separate from today's trusted filesystems.

Just so I understand you, MAP_PMEM_AWARE isn't the "I know what I'm
doing" interface, right?

It sounds like we're a long way off from anything like MAP_SYNC going
in.  What I think would be useful at this stage is to come up with a
programming model we can all agree on. ;-)  Crucially, I want to avoid
the O_DIRECT quagmire of different file systems behaving differently,
and having no way to actually query what behavior you're going to get.
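
To be clear about what I mean by "query": I'd like the application to
find out at mmap() time whether it's getting the semantics it asked
for, rather than discovering it after a crash.  A sketch of that shape
follows; the flag names and values are invented for illustration, not
something applications can code to today:

#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Illustrative values only. */
#ifndef MAP_VALIDATE
#define MAP_VALIDATE    0x03    /* reject flags the kernel/fs can't honor */
#endif
#ifndef MAP_SYNC
#define MAP_SYNC        0x80000 /* metadata made durable at fault time */
#endif

int main(void)
{
        const char msg[] = "update";
        int need_msync = 0;
        char *buf;
        int fd;

        fd = open("/mnt/dax/file", O_RDWR);
        if (fd < 0)
                return 1;

        buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_VALIDATE | MAP_SYNC, fd, 0);
        if (buf == MAP_FAILED) {
                /* We find out right away that the fs can't do it. */
                need_msync = 1;
                buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        }
        if (buf == MAP_FAILED)
                return 1;

        memcpy(buf, msg, sizeof(msg));
        if (need_msync)
                msync(buf, 4096, MS_SYNC);
        /* else: the application does its own cache flushes here */

        munmap(buf, 4096);
        close(fd);
        return 0;
}
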
Cheers,
Jeff