On Wed, May 06, 2015 at 04:04:53PM -0400, Dan Williams wrote:
Changes since v1 [1]:
1/ added include/asm-generic/pfn.h for the __pfn_t definition and helpers.
2/ added kmap_atomic_pfn_t()
3/ rebased on v4.1-rc2
[1]: http://marc.info/?l=linux-kernel&m=142653770511970&w=2
---
A lead-in note: this looks scarier than it is. Most of the code thrash
is automated via Coccinelle. Also, the subtle differences between an
'unsigned long pfn' and a '__pfn_t' are mitigated by type-safety and a
Kconfig option (CONFIG_PMEM_IO, disabled by default) that globally
controls whether a pfn and a __pfn_t are equivalent.
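
For illustration only, here is a minimal userspace sketch of how a
struct-wrapped pfn type can provide the type-safety mentioned above, and
how a build-time switch can make it equivalent to a plain pfn. Every
sketch_* / SKETCH_* name is hypothetical and is not taken from the
patchset; SKETCH_PMEM_IO merely stands in for the Kconfig option, and the
4K page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the Kconfig switch; not the kernel's macro. */
#ifndef SKETCH_PMEM_IO
#define SKETCH_PMEM_IO 1
#endif

/* Assumed page size of 4K for the example. */
#define SKETCH_PAGE_SHIFT 12

#if SKETCH_PMEM_IO
/*
 * Distinct type: passing a bare 'unsigned long' where a sketch_pfn_t is
 * expected (or the reverse) fails to compile, so unconverted call sites
 * are caught by the compiler.
 */
typedef struct { unsigned long val; } sketch_pfn_t;

static inline sketch_pfn_t sketch_pfn_wrap(unsigned long pfn)
{
	return (sketch_pfn_t){ .val = pfn };
}

static inline unsigned long sketch_pfn_raw(sketch_pfn_t p)
{
	return p.val;
}
#else
/* Switch off: the two types are the same and the helpers are no-ops. */
typedef unsigned long sketch_pfn_t;

static inline sketch_pfn_t sketch_pfn_wrap(unsigned long pfn)
{
	return pfn;
}

static inline unsigned long sketch_pfn_raw(sketch_pfn_t p)
{
	return p;
}
#endif

/* The wrapper costs no extra storage in either configuration. */
_Static_assert(sizeof(sketch_pfn_t) == sizeof(unsigned long),
	       "sketch_pfn_t must be the size of an unsigned long");

/* Convert a frame number to a physical address without a struct page. */
static inline uint64_t sketch_pfn_to_phys(sketch_pfn_t p)
{
	return (uint64_t)sketch_pfn_raw(p) << SKETCH_PAGE_SHIFT;
}
```

With the switch on, a caller has to go through sketch_pfn_wrap() and
sketch_pfn_raw() explicitly, which is the property that makes a
mechanical (e.g. Coccinelle-driven) conversion checkable by the compiler.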
The motivation for this change is persistent memory and the desire to
use it not only via the pmem driver, but also as a memory target for I/O
(DAX, O_DIRECT, DMA, RDMA, etc.) in other parts of the kernel. Aside
from the pmem driver and DAX, persistent memory cannot be used in these
I/O scenarios because it lacks a backing struct page, i.e. persistent
memory is not part of the memmap. This patchset takes the
position that the solution is to teach I/O paths that want to operate on
persistent memory to do so by referencing a __pfn_t. The alternatives
are discussed in the changelog for "[PATCH v2 01/10] arch: introduce
__pfn_t for persistent memory i/o", copied here:
Alternatives:
1/ Provide struct page coverage for persistent memory in
DRAM. The expectation is that persistent memory capacities make
this untenable in the long term.
2/ Provide struct page coverage for persistent memory with
persistent memory. While persistent memory may have near-DRAM
performance characteristics, it may not have the same
write-endurance as DRAM. Given the update frequency of struct
page objects, it may not be suitable for persistent memory.
3/ Dynamically allocate struct page. This appears to be on
the order of the complexity of converting code paths to use
__pfn_t references instead of struct page, and the amount of
setup required to establish a valid struct page reference is
mostly wasted when the only usage in the block stack is to
perform a page_to_pfn() conversion for dma-mapping. Instances
of kmap() / kmap_atomic() usage appear to be the only occasions
in the block stack where struct page is non-trivially used. A
new kmap_atomic_pfn_t() is proposed to handle those cases.
*grumble*
What are you going to do with things like iov_iter_get_pages()? Long-term,
that is, after you go for "this pfn has no struct page for it"...