On 09.05.2020 at 01:53, Andrew Morton <akpm(a)linux-foundation.org> wrote:
On Fri, 8 May 2020 10:42:14 +0200 David Hildenbrand <david(a)redhat.com> wrote:
> Assume we have kmem configured and loaded:
> [root@localhost ~]# cat /proc/iomem
> ...
> 140000000-33fffffff : Persistent Memory
> 140000000-1481fffff : namespace0.0
> 150000000-33fffffff : dax0.0
> 150000000-33fffffff : System RAM
>
> Assume we try to unload kmem. This force-unloading will work, even if
> memory cannot get removed from the system.
> [root@localhost ~]# rmmod kmem
> [ 86.380228] removing memory fails, because memory [0x0000000150000000-0x0000000157ffffff] is onlined
> ...
> [ 86.431225] kmem dax0.0: DAX region [mem 0x150000000-0x33fffffff] cannot be hotremoved until the next reboot
>
> Now, we can reconfigure the namespace:
> [root@localhost ~]# ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax
> [ 131.409351] nd_pmem namespace0.0: could not reserve region [mem 0x140000000-0x33fffffff]
> [ 131.410147] nd_pmem: probe of namespace0.0 failed with error -16
> ...
>
> This fails as expected due to the busy memory resource, and the memory
> cannot be used. However, the dax0.0 device is removed, and its name along with it.
>
> The name of the memory resource now points at freed memory (name of the
> device).
> [root@localhost ~]# cat /proc/iomem
> ...
> 140000000-33fffffff : Persistent Memory
> 140000000-1481fffff : namespace0.0
> 150000000-33fffffff : �_�^7_��/_��wR��WQ���^��� ...
> 150000000-33fffffff : System RAM
>
> We have to make sure to duplicate the string. While at it, remove the
> superfluous setting of the name and fix up a stale comment.
>
> Fixes: 9f960da72b25 ("device-dax: "Hotremove" persistent memory that is used like normal RAM")
> Cc: stable(a)vger.kernel.org # v5.3
hm.
Is this really -stable material? These are all privileged operations,
I expect?

Yes, my thought was rather that an admin could bring the system into such a state (by
mistake?). Let's see if somebody has a suggestion.
I guess if we were really unlucky, we could access invalid memory and trigger a BUG (e.g.,
if the stale name sits in a page at the end of memory and that page contains no 0 byte).

Assuming "yes", I've queued this separately, staged for 5.7-rcX. I'll
redo patches 2-4 as a three-patch series for 5.8-rc1.

Makes sense, let's wait for review feedback, thanks!
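
For readers following along, the "duplicate the string" fix from the quoted patch can be illustrated with a small userspace sketch. This is not the kernel code: `fake_device`, `fake_resource`, `dup_name()` and the other helpers are hypothetical stand-ins, and `dup_name()` plays the role that a helper such as `kstrdup()` would presumably play in the kernel (that mapping is an assumption). The point is only that a resource which borrows the device's name pointer is left dangling once the device goes away, while a resource that owns a private copy is not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a device and a memory resource. */
struct fake_device {
    char *name;             /* owned by the device, freed together with it */
};

struct fake_resource {
    const char *name;       /* what a /proc/iomem-like listing would print */
    int owns_name;          /* nonzero if the resource must free the name */
};

/* Userspace stand-in for a string-duplication helper (assumption). */
char *dup_name(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Buggy variant: the resource merely borrows the device's name pointer. */
void resource_init_borrowed(struct fake_resource *res,
                            const struct fake_device *dev)
{
    assert(dev->name != NULL);
    res->name = dev->name;  /* dangles as soon as the device is destroyed */
    res->owns_name = 0;
}

/* Fixed variant: the resource keeps its own copy of the name. */
int resource_init_copied(struct fake_resource *res,
                         const struct fake_device *dev)
{
    assert(dev->name != NULL);
    char *copy = dup_name(dev->name);

    if (!copy)
        return -1;          /* allocation failed */
    res->name = copy;
    res->owns_name = 1;
    return 0;
}

/* Release the resource, freeing the name only if the resource owns it. */
void resource_release(struct fake_resource *res)
{
    if (res->owns_name)
        free((void *)res->name);
    res->name = NULL;
    res->owns_name = 0;
}

/* Destroy the device; any borrowed name pointer is invalid afterwards. */
void device_destroy(struct fake_device *dev)
{
    free(dev->name);
    dev->name = NULL;
}
```

With `resource_init_copied()`, the resource name stays valid after `device_destroy()`; with `resource_init_borrowed()`, reading `res.name` afterwards would be a use-after-free, which is the kind of garbage the second `/proc/iomem` listing shows.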