Boaz Harrosh <boaz@plexistor.com> writes:
> An application creates a file and writes to every single block in the
> thing, syncs it, closes it. It then opens it back up, calls mmap with
> this new MAP_DAX flag or on a file system mounted with -o dax, and
> proceeds to access the file using loads and stores. It persists its
> data by using non-temporal stores, plus cache flush and fence CPU
> instructions.
>
> If I understand you correctly, you're saying that that application is
> not written correctly, because it needs to call fsync to persist
> metadata (that it presumably did not modify). Is that right?
>
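For concreteness, the sequence described above looks roughly like the
sketch below. This is just my sketch: it assumes x86 intrinsics and a
file system mounted -o dax; the path and offsets are made up, and error
handling is omitted.

#include <emmintrin.h>	/* _mm_stream_si64, _mm_clflush */
#include <xmmintrin.h>	/* _mm_sfence */
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 4096;
	/* File previously fully written and synced, per the scenario. */
	int fd = open("/mnt/dax/data", O_RDWR);
	long long *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);

	/* Non-temporal store: bypasses the CPU cache on its way to
	 * the persistent memory. */
	_mm_stream_si64(&p[0], 0x1234);

	/* A regular cached store needs an explicit cache line flush. */
	p[8] = 0x5678;		/* byte offset 64, a separate line */
	_mm_clflush(&p[8]);

	/* Fence so the stores and flushes are complete before the
	 * application considers the data persistent. */
	_mm_sfence();

	munmap(p, len);
	close(fd);
	return 0;
}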
> Hi Jeff
> I do not understand why you chose to drop my email address from your
> reply. What am I supposed to feel when this happens?
Hi Boaz,

Sorry you were dropped; that was not my intention. I blame my mailer,
as I did hit reply-all. No hard feelings?
> And to your questions above, as I answered to Dave: this is the
> novelty of my approach, and the big difference between what you guys
> had in mind with MAP_DAX and my patches as submitted:
> 1. The application will need to call msync/fsync, to give the FS the
>    freedom it needs.
> 2. The msync/fsync, as well as the page faults, will be very
>    lightweight and fast; all that is required from the pmem-aware
>    app is movnt stores and clflushes.
I like the approach for these existing file systems.
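If I follow points 1 and 2, the pmem-aware write path under your model
would look roughly like this. Again, just a sketch on my part; the
path, size, and flags are my own guesses, and error handling is
omitted.

#include <emmintrin.h>	/* _mm_stream_si64 */
#include <xmmintrin.h>	/* _mm_sfence */
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	size_t len = 4096;
	int fd = open("/mnt/dax/file", O_CREAT | O_RDWR, 0600);

	ftruncate(fd, len);	/* note: no fallocate up front */
	long long *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);

	/* The data itself is persisted by the app: movnt plus fence. */
	_mm_stream_si64(&p[0], 42);
	_mm_sfence();

	/* The one extra step vs. pure MAP_DAX: an msync so the FS can
	 * persist whatever allocation metadata the faults created.
	 * Point 2 is the claim that this call is very cheap. */
	msync(p, len, MS_SYNC);

	munmap(p, len);
	close(fd);
	return 0;
}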
> So we enjoy both worlds. And actually more: with your approach of
> fallocat(ing) all the space in advance, you might as well just
> partition the storage and use the DAX(ed) block device. But with my
> approach you need not pre-allocate; you enjoy the over-provisioning
> model and the space allocation management of a modern FS, and even
> with all that you still enjoy very fast direct-mapped stores, by not
> requiring the current slow msync/fsync().
Well, that remains to be seen. Certainly for O_DIRECT appends or hole
filling, there is extra overhead involved when compared to writes to
already-existing blocks. Apply that to DAX and the overhead will be
much more prominent. I'm not saying that this is definitely the case,
but I think it's something we'll have to measure going forward.
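The sort of measurement I have in mind is roughly the skeleton below,
timing first-touch stores into preallocated blocks against stores that
have to fill holes. The paths are hypothetical and error handling is
omitted.

#define _GNU_SOURCE	/* fallocate */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

static double touch_pages(const char *path, int prealloc, size_t len)
{
	int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
	struct timespec t0, t1;

	if (prealloc)
		fallocate(fd, 0, 0, len);	/* blocks exist up front */
	else
		ftruncate(fd, len);		/* all holes */

	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (size_t off = 0; off < len; off += 4096)
		p[off] = 1;	/* first touch: fault, maybe allocation */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(p, len);
	close(fd);
	return (t1.tv_sec - t0.tv_sec) +
	       (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
	size_t len = 256UL << 20;	/* 256MB */

	printf("preallocated: %f s\n", touch_pages("/mnt/dax/pre", 1, len));
	printf("hole-filling: %f s\n", touch_pages("/mnt/dax/hole", 0, len));
	return 0;
}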
> I hope you guys stand behind me in my effort to accelerate userspace
> pmem apps while still not breaking any built-in assumptions.
I do like the idea of reducing the msync/fsync overhead, though I admit
I haven't yet looked at the patches in any detail. My mail in this
thread was primarily an attempt to wrap my head around why the fs needs
the fsync/msync at all. I've got that cleared up now.
Cheers,
Jeff