On 09.02.21 09:59, Michal Hocko wrote:
On Mon 08-02-21 22:38:03, David Hildenbrand wrote:
>> On 08.02.2021 at 22:13, Mike Rapoport <rppt(a)kernel.org> wrote:
>> On Mon, Feb 08, 2021 at 10:27:18AM +0100, David Hildenbrand wrote:
>>> On 08.02.21 09:49, Mike Rapoport wrote:
>>> Some questions (and a request to document the answers), as we now allow
>>> unmovable allocations all over the place and I don't see a single word
>>> regarding that in the cover letter:
>>> 1. How will the issue of plenty of unmovable allocations for user space be
>>> tackled in the future?
>>> 2. How has this issue been documented? E.g., interaction with ZONE_MOVABLE
>>> and CMA, alloc_contig_range()/alloc_contig_pages()?
>> Secretmem sets the mapping's gfp mask to GFP_HIGHUSER, so it does not
>> allocate movable pages in the first place.
> That is not the point. Secretmem cannot go on CMA / ZONE_MOVABLE
> memory and behaves like long-term pinnings in that sense. This is a
> real issue when using a lot of secretmem.
A lot of unevictable memory is a concern regardless of CMA/ZONE_MOVABLE.
As I've said, it is quite easy to land in a similar situation even with
tmpfs/MAP_ANON|MAP_SHARED on a swapless system. Neither of the two is
really uncommon. It would be even worse that those would be allowed to
consume both CMA/ZONE_MOVABLE.
IIRC, tmpfs/MAP_ANON|MAP_SHARED memory
a) Is movable, can land in ZONE_MOVABLE/CMA
b) Can be limited by sizing tmpfs appropriately
AFAIK, what you describe is a problem with memory overcommit, not with
zone imbalances (below). Or what am I missing?
One has to be very careful when relying on CMA or movable zones. This is
definitely worth a comment in the kernel command line parameter
documentation. But this is not a new problem.
I see the following thing worth documenting:
Assume you have a system with 2GB of ZONE_NORMAL/ZONE_DMA and 4GB of
ZONE_MOVABLE.
Assume you make use of 1.5GB of secretmem. Your system might run into
OOM any time although you still have plenty of memory on ZONE_MOVABLE
(and even swap!), simply because you are making excessive use of
unmovable allocations (for user space!) in an environment where you
should not make excessive use of unmovable allocations (e.g., where
should page tables go?).
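
To put rough numbers on the scenario above, here is a minimal sketch of the
arithmetic in Python. It is only an illustration: the function name
unmovable_headroom() is made up, the 4GB is taken to be movable-only memory,
and the model ignores everything else that already lives in the unmovable
zones (kernel text, slab, page tables, ...), so the real headroom would be
smaller still.

```python
GIB = 1024 ** 3

def unmovable_headroom(unmovable_zone_bytes, secretmem_bytes):
    """Bytes left for *all other* unmovable allocations (hypothetical helper).

    Simplified model: secretmem pages are unmovable, so they can only be
    placed in zones that accept unmovable allocations.
    """
    return unmovable_zone_bytes - secretmem_bytes

zone_normal_dma = 2 * GIB      # accepts unmovable allocations
zone_movable = 4 * GIB         # assumed movable-only, unusable for secretmem
secretmem = 3 * GIB // 2       # 1.5GB of secretmem

headroom = unmovable_headroom(zone_normal_dma, secretmem)
total = zone_normal_dma + zone_movable
free_fraction = (headroom + zone_movable) / total

print(f"headroom for unmovable allocations: {headroom / GIB} GiB")
print(f"fraction of total memory still free: {free_fraction}")
```

Under these assumptions only 0.5 GiB remains for every other unmovable
allocation, even though three quarters of the total memory is still free.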
The existing controls (mlock limit) don't really match the current
semantics of that memory. I repeat it once again: secretmem *currently*
resembles long-term pinned memory, not mlocked memory. Things will
change when implementing migration support for secretmem pages. Until
then, the semantics are different and this should be spelled out.
For long-term pinnings this is kind of obvious, still we're now
documenting it because it's dangerous to not be aware of. Secretmem
behaves exactly the same and I think this is worth spelling out:
secretmem has the potential of being used much more often than fairly
special vfio/rdma/ ...
Looking at a cover letter that doesn't even mention the issue of
unmovable allocations makes me think that we are either trying to ignore
the problem or are not aware of the problem.
David / dhildenb