On Sun 14-02-21 11:21:02, James Bottomley wrote:
> On Sun, 2021-02-14 at 10:58 +0100, David Hildenbrand wrote:
> [...]
> > > And here we come to the question "what are the differences that
> > > justify a new system call?" and the answer to this is very
> > > subjective. And as such we can continue bikeshedding forever.
> >
> > I think this fits into the existing memfd_create() syscall just fine,
> > and I heard no compelling argument why it shouldn't. That's all I can
> > say.
>
> OK, so let's review history. In the first two incarnations of the
> patch, it was an extension of memfd_create(). The specific objection
> by Kirill Shutemov was that it doesn't share any code in common with
> memfd and so should be a separate system call:
>
> https://lore.kernel.org/linux-api/20200713105812.dnwtdhsuyj3xbh4f@box/

Thanks for the pointer. But this argument hasn't been challenged at all.
It hasn't been brought up that the overlap would be considerably higher
with hugetlb/sealing support. And so far nobody has claimed those
combinations are unviable.

> The other objection raised offlist is that if we do use memfd_create,
> then we have to add all the secret memory flags as an additional ioctl,
> whereas they can be specified on open if we do a separate system call.
> The container people violently objected to the ioctl because it can't
> be properly analysed by seccomp and much preferred the syscall version.
>
> Since we're dumping the uncached variant, the ioctl problem disappears
> but so does the possibility of ever adding it back if we take on the
> container people's objection. This argues for a separate syscall
> because we can add additional features and extend the API with flags
> without causing anti-ioctl riots.

I am sorry but I do not understand this argument. What kind of flags are
we talking about and why would that be a problem with the memfd_create
interface? Could you be more specific please?
--
Michal Hocko
SUSE Labs