On Wed, Jun 7, 2017 at 1:48 PM, Ross Zwisler
<ross.zwisler(a)linux.intel.com> wrote:
To be able to use the common 4k zero page in DAX we need to have our PTE
fault path look more like our PMD fault path where a PTE entry can be
marked as dirty and writeable as it is first inserted, rather than waiting
for a follow-up dax_pfn_mkwrite() => finish_mkwrite_fault() call.
Right now we can rely on having a dax_pfn_mkwrite() call because we can
distinguish between these two cases in do_wp_page():
case 1: 4k zero page => writeable DAX storage
case 2: read-only DAX storage => writeable DAX storage
This distinction is made via vm_normal_page(). vm_normal_page() returns
false for the common 4k zero page, though, just as it does for DAX ptes.
Instead of special casing the DAX + 4k zero page case, we will simplify our
DAX PTE page fault sequence so that it matches our DAX PMD sequence, and
get rid of dax_pfn_mkwrite() completely.
This means that insert_pfn() needs to follow the lead of insert_pfn_pmd()
and allow us to pass in a 'mkwrite' flag. If 'mkwrite' is set insert_pfn()
will do the work that was previously done by wp_page_reuse() as part of the
dax_pfn_mkwrite() call path.
Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
---
include/linux/mm.h | 9 +++++++--
mm/memory.c | 21 ++++++++++++++-------
2 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b892e95..11e323a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2294,10 +2294,15 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot);
-int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
- pfn_t pfn);
+int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr,
+ pfn_t pfn, bool mkwrite);
Are there any other planned public users of vm_insert_mixed_mkwrite()
that would pass false? I think not.
int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start,
unsigned long len);
+static inline int vm_insert_mixed(struct vm_area_struct *vma,
+ unsigned long addr, pfn_t pfn)
+{
+ return vm_insert_mixed_mkwrite(vma, addr, pfn, false);
+}
...in other words, instead of distinguishing vm_insert_mixed_mkwrite()
from vm_insert_mixed() with an extra flag argument in the public
prototype, just move the distinction into mm/memory.c directly.
So, the prototype remains the same as vm_insert_mixed():
int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr,
                            pfn_t pfn);
...and only static insert_pfn(...) needs to change.
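
To make the suggestion concrete, here is a rough userspace sketch of the shape I have in mind. It is not kernel code: struct vm_area_struct and pfn_t are stand-ins, the single-PTE bookkeeping fields (pte_present, pte_write, pte_dirty, pte_pfn) are hypothetical, and the body of insert_pfn() is only an assumption about what "mkwrite" should do, not the actual mm/memory.c implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel definitions (assumption). */
typedef struct { unsigned long val; } pfn_t;

struct vm_area_struct {
	/* Hypothetical single-PTE model so the effect is observable. */
	bool pte_present, pte_write, pte_dirty;
	unsigned long pte_pfn;
};

/* Only this static helper knows about the flag. */
static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
		      pfn_t pfn, bool mkwrite)
{
	assert(vma != NULL);
	(void)addr;	/* the mock has a single slot, so addr is unused */

	/* Without mkwrite, an existing entry is left alone. */
	if (vma->pte_present && !mkwrite)
		return -EBUSY;

	vma->pte_present = true;
	vma->pte_pfn = pfn.val;
	if (mkwrite) {
		/* Mark writeable and dirty at insertion time. */
		vma->pte_write = true;
		vma->pte_dirty = true;
	}
	return 0;
}

/* Both public entry points keep the same three-argument prototype. */
int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
		    pfn_t pfn)
{
	return insert_pfn(vma, addr, pfn, false);
}

int vm_insert_mixed_mkwrite(struct vm_area_struct *vma, unsigned long addr,
			    pfn_t pfn)
{
	return insert_pfn(vma, addr, pfn, true);
}
```

With this layout callers never see a bool in the public API; they pick the behaviour by name, and the flag stays an implementation detail of mm/memory.c.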