Hi All
We encountered a strange issue using pmem emulation.
We configure emulated pmem devices in our system by adding these params to
the boot command:
memmap=32G!224G memmap=32G!480G
PMEM devices look OK.
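
For reference, the memmap=<size>!<offset> form asks the kernel to treat <size> bytes of RAM starting at physical <offset> as persistent memory. A minimal sketch that decodes the two parameters above into physical address ranges follows; the helper names to_bytes and memmap_range are made up for illustration, and the arithmetic assumes a shell with 64-bit integers.

```shell
# Hypothetical helper: convert a size such as 32G, 512M or 4K to bytes.
to_bytes() {
    num=${1%[KMG]}
    case "$1" in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *)  echo "$1" ;;
    esac
}

# Hypothetical helper: print the inclusive physical range described by
# a memmap=<size>!<offset> parameter.
memmap_range() {
    spec=${1#memmap=}                  # drop the "memmap=" prefix if present
    size=$(to_bytes "${spec%%!*}")     # part before '!' is the size
    start=$(to_bytes "${spec#*!}")     # part after '!' is the start offset
    printf '0x%x-0x%x\n' "$start" $((start + size - 1))
}

memmap_range 'memmap=32G!224G'
memmap_range 'memmap=32G!480G'
```

The printed ranges can be compared against the "Persistent Memory" entries in /proc/iomem to confirm the emulated regions landed where expected.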
This issue is specific to the 4.14 kernel combined with CentOS 7.5; other
combinations follow exactly the same flow but don't crash.
It happens when ndctl converts an emulated /dev/pmem0 to /dev/dax0.0 with
the command: ndctl create-namespace -f -e namespace0.0 --type=pmem --mode=dax
This crashes the kernel in udev while it is trying to free a page in
do_munmap(), and then the system hangs.
echo namespace0.0 > /sys/devices/platform/e820_pmem/ndbus0/region0/namespace0.0/driver/unbind
Calltrace():
ndctl_unbind()
ndctl_namespace_disable()
ndctl_namespace_disable_invalidate()
ndctl_namespace_disable_safe()
namespace_destroy()
namespace_reconfig()
do_xaction_namespace()
When we use, for example, Ubuntu 16.04 with this kernel version, it does not
happen.
When we use CentOS 7.5 with kernel version 4.9, it also does not happen.
When working with actual NVDIMMs, it does not happen either.
Have you encountered such an issue?
Why does the kernel think that this area is mapped, and who might be mapping
it?
If no one maps it, why does the kernel have an indication that these pages
are mapped?
Any help would be highly appreciated.
Thanks
Oren Berman
On 10 January 2018 at 09:41, Oren Berman <oren(a)lightbitslabs.com> wrote:
Hi
Thanks for your answer
If we call memremap() on the physical address of the NVRAM from within the
kernel to get a new virtual address mapping, will it lock the mapping?
Could this also be a workaround?
Oren
Sent from my iPhone
On 10 Jan 2018, at 18:38, Dan Williams <dan.j.williams(a)intel.com>
wrote:
On Wed, Jan 10, 2018 at 7:23 AM, Oren Berman <oren(a)lightbitslabs.com>
wrote:
Resending, now to the whole list.
Hi Dan
Thanks, we are going to try this.
Can you explain why this can cause this issue - is the NVDIMM memory space
also being randomized?
Is it done during runtime?
Yes, KASLR randomizes the direct map. We have seen problems with it in
the past relative to setting up pmem mappings. We fixed one such bug
with this commit:
fc5f9d5f151c x86/mm: Fix boot crash caused by incorrect loop count
calculation in sync_global_pgds()
...but it appears we may have another bug in this area.
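
The suggestion being tried is not quoted in this part of the thread. Assuming the test is to boot with KASLR disabled through the standard nokaslr kernel parameter, a rough sketch for checking whether the running kernel was booted that way could look like this (has_nokaslr is a hypothetical helper name):

```shell
# Hypothetical helper: succeed if the given kernel command line
# contains the standalone word "nokaslr".
has_nokaslr() {
    for word in $1; do               # rely on word splitting of the cmdline
        [ "$word" = "nokaslr" ] && return 0
    done
    return 1
}

# Inspect the command line of the running kernel, if readable.
cmdline=$(cat /proc/cmdline 2>/dev/null)
if has_nokaslr "$cmdline"; then
    echo "nokaslr is set: KASLR disabled on the command line"
else
    echo "nokaslr is not set: KASLR may be active if the kernel was built with it"
fi
```

If the crash disappears with nokaslr and returns without it, that would point at the direct-map randomization rather than at ndctl or udev.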