On Tue, Sep 26, 2017 at 2:06 PM, Ross Zwisler
On Tue, Sep 26, 2017 at 12:19:21PM -0700, Dan Williams wrote:
> On Tue, Sep 26, 2017 at 11:57 AM, Ross Zwisler
> > This decision can only be made (in this
> > proposed scheme) *after* the inode->i_mapping->i_mmap tree has been
> > populated, which means we need another call into the filesystem after this
> > insertion has happened.
> I get that, but it seems over-engineered and something that can also
> be safely cleaned up after the fact by the code path that is disabling
I don't think you can safely clean it up after the fact because some thread
might have already called ->mmap() to set up the vma->vm_flags for their new
mapping, but they haven't added it to inode->i_mapping->i_mmap.
If madvise(MADV_NOHUGEPAGE) can dynamically change vm_flags, then the
DAX disable path can as well. VM_MIXEDMAP looks to be a nop for normal
The inode->i_mapping->i_mmap tree is the only way (that I know of at least)
that the filesystem has any idea about the mapping. This is the method
by which we would try to clean up mapping flags, if we were to do so, and
it's the only way that the filesystem can know whether or not mappings exist.
The only way that I could think of to make this safely work is to have the
insertion into the inode->i_mapping->i_mmap tree be our sync point. After
that the filesystem and the mapping code can communicate on the state of DAX,
but before that I think it's basically indeterminate.
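
To make the ordering argument concrete, here is a minimal userspace model of
the scheme described above. It is only a sketch: every name (model_inode,
model_mmap(), model_insert(), model_disable_dax(), MODEL_VM_HUGEPAGE) is
hypothetical, the i_mmap tree is reduced to a small array, and a pthread mutex
stands in for whatever lock protects the tree. It is not kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define MODEL_VM_HUGEPAGE 0x1u
#define MODEL_MAX_VMAS    8

struct model_vma {
	unsigned int vm_flags;
};

struct model_inode {
	pthread_mutex_t i_mmap_lock;              /* models the lock on the tree */
	bool dax_enabled;                         /* models the per-inode DAX state */
	struct model_vma *i_mmap[MODEL_MAX_VMAS]; /* models the i_mmap tree */
	size_t nr_vmas;
};

/* Like ->mmap(): pick flags from the DAX state, before the vma is visible. */
static void model_mmap(const struct model_inode *inode, struct model_vma *vma)
{
	assert(inode != NULL && vma != NULL);
	vma->vm_flags = inode->dax_enabled ? MODEL_VM_HUGEPAGE : 0;
}

/* Insertion is the sync point: recheck the DAX state under the lock. */
static int model_insert(struct model_inode *inode, struct model_vma *vma)
{
	int ret = -1;

	pthread_mutex_lock(&inode->i_mmap_lock);
	if (inode->nr_vmas < MODEL_MAX_VMAS) {
		if (!inode->dax_enabled)
			vma->vm_flags &= ~MODEL_VM_HUGEPAGE;
		inode->i_mmap[inode->nr_vmas++] = vma;
		ret = 0;
	}
	pthread_mutex_unlock(&inode->i_mmap_lock);
	return ret;
}

/* Disabling DAX can only fix up the mappings it finds in the tree. */
static void model_disable_dax(struct model_inode *inode)
{
	pthread_mutex_lock(&inode->i_mmap_lock);
	inode->dax_enabled = false;
	for (size_t i = 0; i < inode->nr_vmas; i++)
		inode->i_mmap[i]->vm_flags &= ~MODEL_VM_HUGEPAGE;
	pthread_mutex_unlock(&inode->i_mmap_lock);
}
```

If model_disable_dax() runs between model_mmap() and model_insert(), the
cleanup loop never sees the new vma, so its stale flag survives until the
recheck at insertion time clears it. That window is the one I mean by
"indeterminate".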
If we lose the race and leak VM_HUGEPAGE to a non-DAX mapping, what
breaks? I'd rather be in favor of not setting VM_HUGEPAGE at all in
the ->mmap() handler and letting the default THP policy take over. In
fact, see transparent_hugepage_enabled(): we already auto-enable huge
page support for DAX mappings regardless of VM_HUGEPAGE.
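
As a rough illustration of that policy, here is a simplified userspace model
of the decision. It is a sketch under stated assumptions, not the kernel's
transparent_hugepage_enabled(): the names (thp_vma, thp_policy, thp_enabled(),
THP_VM_*) are hypothetical, the three-way policy mirrors the always / madvise /
never setting, and the exact checks and their order in the kernel may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; not the kernel's. */
#define THP_VM_HUGEPAGE   0x1u
#define THP_VM_NOHUGEPAGE 0x2u

enum thp_policy { THP_NEVER, THP_MADVISE, THP_ALWAYS };

struct thp_vma {
	unsigned int vm_flags;
	bool is_dax;            /* models what vma_is_dax() would report */
};

/* Decide whether huge pages may be used for this mapping. */
static bool thp_enabled(const struct thp_vma *vma, enum thp_policy policy)
{
	assert(vma != NULL);

	/* Assumption: an explicit opt-out always wins. */
	if (vma->vm_flags & THP_VM_NOHUGEPAGE)
		return false;

	if (policy == THP_ALWAYS)
		return true;

	/* DAX mappings qualify whether or not VM_HUGEPAGE is set. */
	if (vma->is_dax)
		return true;

	/* Otherwise VM_HUGEPAGE only matters under the madvise policy. */
	if (policy == THP_MADVISE)
		return (vma->vm_flags & THP_VM_HUGEPAGE) != 0;

	return false;
}
```

In this model a DAX vma with vm_flags == 0 is already eligible, so setting
VM_HUGEPAGE in ->mmap() adds nothing for it. A non-DAX vma that carries a
leaked VM_HUGEPAGE simply becomes eligible for THP under the madvise policy.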