Christoph Hellwig <hch@infradead.org> writes:
> On Mon, Feb 22, 2016 at 12:58:18PM -0500, Jeff Moyer wrote:
> > Sorry for being dense, but why, exactly? If the file system is making
> > changes without the application's involvement, then the file system
> > should be responsible for ensuring its own consistency, irrespective of
> > whether the application issues an fsync. Clearly I'm missing some key
> > point here.
> The simplest example is a copy on write file system (or simply a copy on
> write file, which can exist with ocfs2 and will with xfs very soon),
> where each write will allocate a new block, which will require metadata
> updates.
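
For illustration, a minimal userspace sketch of the consequence described above (the file name and the 4 KiB size are arbitrary, and the file is assumed to already exist on a copy-on-write or DAX filesystem): the store through the mapping may reach the medium directly, but the metadata for the newly allocated block only becomes durable at the msync()/fsync().

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	/* Assumed to exist and be at least one page long on a COW/DAX fs. */
	int fd = open("cow-file", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* The write fault may make the filesystem allocate a new block. */
	memcpy(p, "new data", 8);

	/*
	 * Without this, a crash could leave the new block unreachable even
	 * though the bytes themselves were written: the extent-map update
	 * is not yet durable.
	 */
	if (msync(p, 4096, MS_SYNC) < 0)
		perror("msync");

	munmap(p, 4096);
	close(fd);
	return 0;
}
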
> We've built the whole I/O model around the concept that by default our
> I/O will require fsync/msync. For read/write-style I/O you can opt out
> using O_DSYNC. There currently is no way to opt out for memory mapped
> I/O, mostly because it's
> a) useless without something like DAX, and
> b) much harder to implement
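
As context for the read/write-style opt-out mentioned above, a minimal sketch (the file name is arbitrary): with O_DSYNC each write() only returns once the data, and the metadata needed to retrieve it, is on stable storage, so no separate fsync()/fdatasync() is needed.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/*
	 * With O_DSYNC every write() behaves as if followed by fdatasync():
	 * it returns only once the data, plus whatever metadata is needed
	 * to retrieve it (e.g. newly allocated blocks), is durable.
	 */
	int fd = open("data.bin", O_WRONLY | O_CREAT | O_DSYNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	const char buf[] = "durable without an explicit fsync";
	if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
		perror("write");
		return 1;
	}

	/* No fsync()/fdatasync() call is required before relying on the data. */
	close(fd);
	return 0;
}
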
> So a MAP_SYNC option might not be entirely off the table, but I think
> it would be a lot of hard work and I'm not even sure it's possible
> to handle it in the general case.
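
No MAP_SYNC flag existed at the time of this exchange; purely as a sketch of what such an opt-out could look like from userspace, the following borrows the spelling the flag eventually took in mainline Linux (MAP_SYNC together with MAP_SHARED_VALIDATE). The path is a hypothetical DAX mount.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
#if defined(MAP_SHARED_VALIDATE) && defined(MAP_SYNC)
	/* Hypothetical path on a DAX-capable filesystem. */
	int fd = open("/mnt/pmem/file", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/*
	 * MAP_SHARED_VALIDATE makes the kernel reject flags it does not
	 * understand, so a successful mmap() really does guarantee the
	 * synchronous-fault semantics instead of silently ignoring them.
	 */
	char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap(MAP_SYNC)");
		return 1;
	}

	memcpy(p, "persistent", 10);	/* store goes straight to the medium */
	/*
	 * With MAP_SYNC semantics the block-allocation metadata was made
	 * durable at fault time, so flushing CPU caches for the written
	 * range is sufficient; no msync()/fsync() is required.
	 */

	munmap(p, 4096);
	close(fd);
	return 0;
#else
	fputs("MAP_SYNC not available in these headers\n", stderr);
	return 1;
#endif
}
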
I see. So, at write fault time, you're saying that new blocks may be
allocated, and that in order to make that persistent, we need a sync
operation. Presumably this MAP_SYNC option could sync out the necessary
metadata updates to the log before returning from the write fault
handler. The arguments against making this work are that it isn't
generally useful, and that we don't want more DAX special cases in the
code. Did I get that right?
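
To make that ordering concrete, here is a purely conceptual, userspace stand-in (every function name is hypothetical, and this is not kernel code): the point is simply that the metadata committed for the newly allocated block must be durable before the fault handler lets the application's store proceed.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for filesystem operations; not real kernel APIs. */
static bool allocate_block_for_offset(long offset)
{
	printf("allocate new block for offset %ld\n", offset);
	return true;
}

static bool commit_metadata_to_log(long offset)
{
	printf("force extent-map update for offset %ld to the journal\n", offset);
	return true;
}

/*
 * Conceptual model of a synchronous write fault: the metadata describing
 * the freshly allocated block must be durable *before* the fault returns,
 * because the application will assume its subsequent stores persist
 * without any msync()/fsync().
 */
static bool handle_sync_write_fault(long offset)
{
	if (!allocate_block_for_offset(offset))
		return false;	/* would be a SIGBUS to the application */

	if (!commit_metadata_to_log(offset))
		return false;

	/* Only now is it safe to map the page writable and resume the store. */
	return true;
}

int main(void)
{
	return handle_sync_write_fault(0) ? 0 : 1;
}
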
Thanks,
Jeff