On Mon, Mar 7, 2016 at 10:58 AM, Toshi Kani <toshi.kani(a)hpe.com> wrote:
On Mon, 2016-03-07 at 09:18 -0800, Dan Williams wrote:
> On Mon, Mar 7, 2016 at 9:56 AM, Toshi Kani <toshi.kani(a)hpe.com> wrote:
> > On Fri, 2016-03-04 at 18:23 -0800, Dan Williams wrote:
> > > On Fri, Mar 4, 2016 at 6:48 PM, Toshi Kani <toshi.kani(a)hpe.com>
> > > wrote:
> > > As far as I can see
> > > all we do is ask firmware implementations to respect Linux section
> > > boundaries and otherwise not change alignments.
> > In addition to the requirement that pmem range alignment may not
> > change, the code also requires a regular memory range does not change
> > to intersect with a pmem section later. This seems fragile to me since
> > guest config may vary / change as I mentioned above.
> > So, shouldn't the driver fails to attach when the range is not aligned
> > by the section size? Since we need to place a requirement to firmware
> > anyway, we can simply state that it must be aligned by 128MiB (at
> > least) on x86. Then, memory and pmem physical layouts can be changed
> > as long as this requirement is met.
> We can state that it must be aligned, but without a hard specification
> I don't see how we can guarantee it. We will fail the driver load
> with a warning if our alignment fixups end up getting invalidated by a
> later configuration change, but in the meantime we cover the gap of a
> BIOS that has generated a problematic configuration.
I do not think it has to be stated in the spec (although it may be a good
idea to state it as an implementation note :-).
This is an OS-unique requirement (and the size is x86-specific) that if it
wants to support Linux pmem pfn, then the alignment needs to be at least
128MiB. Regular pmem does not have this restriction, but it needs to be
aligned by 2MiB or 1GiB for using huge page mapping, which does not have to
be stated in the spec, either.
We can check that the alignment is correct when the namespace is first
instantiated, but we're still stuck if the configuration ever changes.
For KVM to support the pmem pfn feature on x86, it needs to guarantee
128MiB alignment. Otherwise, this feature is not supported. (I do not
worry about NVDIMM-N since it is naturally aligned by its size.)
If we allow unaligned cases, then the driver needs to detect change from
the initial condition and fail to attach for protecting data. I did not
see such check in the code, but I may have overlooked. We cannot check if
KVM has any guarantee to keep the alignment at the initial setup, though.
devm_memremap_pages() will fail if the driver tries to pass in an
unaligned address  ...and now that I look again, that patch
mishandles the aligning 'size', will fix.