* Ingo Molnar <mingo(a)kernel.org> wrote:
> Is handling kernel pagefault on the vmemmap completely out of the
> picture ? So we would carve out a chunk of kernel address space
> for those pfn and use it for vmemmap and handle pagefault on it.

That's pretty clever. The page fault doesn't even have to do remote
TLB shootdown, because it only establishes mappings - so it's pretty
atomic, a bit like the minor vmalloc() area faults we are doing.

Some sort of LRA (least recently allocated) scheme could unmap the
area in chunks if it's beyond a certain size, to keep a limit on
size. This would be done from the same context and would need a
remote TLB shootdown.

The only limitation I can see is that such faults would have to be
able to sleep, to do the allocation. So pfn_to_page() could not be
used in arbitrary contexts.
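
To make the idea concrete, here is a minimal userspace sketch in C of a
lazily populated vmemmap with an LRA limit. It is a simulation, not
kernel code: every name in it (sim_pfn_to_page(), sim_fault_in(),
sim_unmap_lra(), the sizes) is hypothetical, calloc() stands in for the
sleeping allocation in the fault path, and a counter stands in for the
remote TLB shootdown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen only to keep the example small. */
#define PAGES_PER_CHUNK 4	/* struct pages covered by one chunk */
#define NR_CHUNKS	8	/* chunks in the carved-out range */
#define MAX_MAPPED	2	/* limit before LRA unmapping kicks in */

struct sim_page { unsigned long pfn; };

static struct sim_page *chunk_map[NR_CHUNKS];	/* NULL means not mapped */
static unsigned long chunk_seq[NR_CHUNKS];	/* allocation order stamp */
static unsigned long next_seq = 1;
static unsigned int nr_mapped, nr_faults, nr_shootdowns;

/* Unmap the least recently allocated chunk. In a kernel this is the
 * step that would need a remote TLB shootdown; here we only count it. */
static void sim_unmap_lra(void)
{
	int victim = -1;

	for (int i = 0; i < NR_CHUNKS; i++)
		if (chunk_map[i] &&
		    (victim < 0 || chunk_seq[i] < chunk_seq[victim]))
			victim = i;
	assert(victim >= 0);
	free(chunk_map[victim]);
	chunk_map[victim] = NULL;
	nr_mapped--;
	nr_shootdowns++;
}

/* Simulated fault handler: it only establishes a mapping, but it has
 * to allocate, which is why the real fault would have to sleep. */
static int sim_fault_in(unsigned long chunk)
{
	struct sim_page *pages;

	if (nr_mapped >= MAX_MAPPED)
		sim_unmap_lra();	/* done from the same context */
	pages = calloc(PAGES_PER_CHUNK, sizeof(*pages));
	if (!pages)
		return -1;
	for (unsigned long i = 0; i < PAGES_PER_CHUNK; i++)
		pages[i].pfn = chunk * PAGES_PER_CHUNK + i;
	chunk_map[chunk] = pages;
	chunk_seq[chunk] = next_seq++;
	nr_mapped++;
	nr_faults++;
	return 0;
}

/* Look up the struct page for a pfn, faulting the chunk in on demand. */
static struct sim_page *sim_pfn_to_page(unsigned long pfn)
{
	unsigned long chunk = pfn / PAGES_PER_CHUNK;

	if (chunk >= NR_CHUNKS)
		return NULL;
	if (!chunk_map[chunk] && sim_fault_in(chunk) != 0)
		return NULL;
	return &chunk_map[chunk][pfn % PAGES_PER_CHUNK];
}
```

With MAX_MAPPED set to 2, looking up pfns 0, 5 and 9 in that order
faults in three chunks and unmaps the chunk that covered pfn 0.
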
So another complication would be that we cannot just unmap such pages
when we want to recycle them, because the struct page in them might be
in use - so all struct page uses would have to refcount the underlying
page. We don't really do that today: code just looks up struct pages
and assumes they never go away.
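
A rough userspace sketch of what that refcounting would mean, again in
C and again purely illustrative: rc_page_get(), rc_page_put() and
rc_chunk_try_unmap() are made-up names, and the plain counter ignores
the atomicity and locking a real implementation would need. The point
is only that a chunk cannot be recycled while any struct page in it is
still referenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RC_PAGES_PER_CHUNK 4	/* hypothetical chunk size */

struct rc_page { unsigned long pfn; };

struct rc_chunk {
	bool mapped;			/* is the backing page present? */
	unsigned int refs;		/* users of struct pages in here */
	struct rc_page pages[RC_PAGES_PER_CHUNK];
};

/* Take a reference on the chunk before handing out a struct page. */
static struct rc_page *rc_page_get(struct rc_chunk *c, unsigned int idx)
{
	if (!c->mapped || idx >= RC_PAGES_PER_CHUNK)
		return NULL;
	c->refs++;
	return &c->pages[idx];
}

/* Drop the reference once the caller is done with the struct page. */
static void rc_page_put(struct rc_chunk *c)
{
	assert(c->refs > 0);
	c->refs--;
}

/* Recycling is refused while any struct page in the chunk is in use. */
static bool rc_chunk_try_unmap(struct rc_chunk *c)
{
	if (!c->mapped || c->refs > 0)
		return false;
	c->mapped = false;
	return true;
}
```

Every lookup would have to be paired with a put, which is exactly the
discipline existing struct page users do not follow.
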
Thanks,
Ingo