On Mon, Sep 28, 2015 at 03:57:29PM -0700, Dan Williams wrote:
On Mon, Sep 28, 2015 at 2:35 PM, Dave Chinner wrote:
> On Mon, Sep 28, 2015 at 05:13:50AM -0700, Dan Williams wrote:
>> On Sun, Sep 27, 2015 at 5:59 PM, Dave Chinner <david(a)fromorbit.com> wrote:
>> > On Fri, Sep 25, 2015 at 09:17:45PM -0600, Ross Zwisler wrote:
>> >> On Fri, Sep 25, 2015 at 12:53:57PM +1000, Dave Chinner wrote:
>> >> Does this sound like a reasonable path forward for v4.3? Dave, and
>> >> you guys can provide guidance and code reviews for the XFS and ext4
>> > IMO, it's way too much to get into 4.3. I'd much prefer we revert
>> > the bad changes in 4.3, and then work towards fixing this for the
>> > 4.4 merge window. If someone needs this for 4.3, then they can
>> > backport the 4.4 code to 4.3-stable.
>> If the proposal is to step back and get a running start at these fixes
>> for 4.4, then it is worth considering what the state of allocating
>> pages for DAX mappings will be in 4.4.
> Oh, do tell. I haven't seen any published design, code, etc,
This is via the devm_memremap_pages() api that went into 4.2 and
my v1 (RFC quality) series using it for dax get_user_pages().
I'll have a look at some point when I'm not trying to put out fires.
> And, quite frankly, I'm not enabling any new DAX
> in XFS until I've had time to review, test and fix it so it works
> without deadlocking or corrupting data.
I'm in violent agreement, to the point where I'm pondering whether
CONFIG_FS_DAX should just depend on CONFIG_BROKEN in 4.3 until we've
convinced ourselves of all the fixes in 4.4. It's not clear to me
that we have a stable baseline to which we can revert this "still in
development" implementation, did you have one in mind?
XFS warns that DAX is experimental when you mount with that option,
so there is no need to do that:
[ 686.055780] XFS (ram0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
[ 686.058464] XFS (ram0): Mounting V5 Filesystem
[ 686.062857] XFS (ram0): Ending clean mount
>> It's already the case that
>> allocating struct page for DAX mappings is the only solution on the
>> horizon for enabling a get_user_pages() solution for persistent
>> memory. We of course need to get the page-less DAX path fixed up, but
>> the near-term path to full functionality and safety is when struct
>> page is available to enable the typical synchronization mechanics.
> And we do so at the expense of medium to long term complexity and
> maintenance. I'm no fan of using struct pages to track terabytes to
> petabytes of persistent memory, and I'm even less of a fan of having
> to simultaneously support both struct page and pfn based DAX
I'm no fan of tracking petabytes of persistent memory with struct
page, but we're in the near term space (hardware technology-wise) of
how to enable DMA/RDMA to 100s of gigabytes to a few terabytes of
Don't think I don't know that - as I said to someone a few hours
ago on IRC:
[29/09/15 07:41] <dchinner> I'm sure they do, but they have a hard requirement
to support RDMA from persistent memory
[29/09/15 07:41] <dchinner> and that's what seems to be driving the "we
need to use struct pages" design
A page-less solution to that problem is not on the
horizon as far as I can tell. In short, I am concerned we are
spending time working around the lack of struct page to get to a
stable page-less solution that is still missing support for the use
cases that are expected to "just work".
I'm concerned with making what we have work before we go and change
everything. You might want to move really quickly, but without sane
filesystem support you can't ship anything worth a damn. There's all
sorts of issues here, and introducing struct pages doesn't solve all
Let's concentrate on ensuring the basic operation of DAX is robust
first - get the page fault vs extent manipulations serialised, sane
and scalable before we start changing anything else. If we don't
solve these problems, then nothing else we do will be reliable, and
the problems exist regardless of whether we are using struct pages
or not. Hence these are the critical problems we need to fix before
Once we have these issues sorted out, switching between struct page
and pfn should be much simpler because we don't have to worry about
different locking strategies to protect against truncate, racing
page faults, etc.
I do not think introducing page-backed persistent memory sets us back
to square 1. Instead, given the functionality that is enabled when pages
are present I think it is safe to assume most platforms will arrange
for page backed persistent memory.
Sure, but it will take a little time to get there. Moving fast
doesn't help us here - it only results in stuff we have to revert or
redo in the near future and that means progress is much slower than
it should be. Let's solve the DAX problems in the right order - it
will make things simpler and faster down the road.