On Fri, 1 May 2015, Simmons, James A. wrote:
>From: Julia Lawall <Julia.Lawall(a)lip6.fr>
>
>Replace OBD_ALLOC, OBD_ALLOC_WAIT, OBD_ALLOC_PTR, and OBD_ALLOC_PTR_WAIT by
>kzalloc/kcalloc, and OBD_FREE and OBD_FREE_PTR by kfree.
Nak: James Simmons <jsimmons(a)infradead.org>
A simple replace will not work. The OBD_ALLOC and OBD_FREE functions allocate memory
anywhere from one page to 4MB in size. You can't use kmalloc for the 4MB
allocations.
Currently lustre uses a 4 page water mark to determine if we allocate using vmalloc.
Even using kmalloc for 4 pages has shown high failure rates on some systems. It gets
even more messy with 64K page systems like ppc64 boxes. Now I'm not suggesting to port
the larger allocations to vmalloc either, since issues have been found with using
vmalloc. For example, when using large stripe count files the MDS rpc generated crosses
the 4 page line and vmalloc is used. Using vmalloc causes a global spinlock to be taken,
which causes metadata operations to be serialized on the MDS servers.
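For illustration only, the size-based switch described above can be modeled in
userspace C. This is a rough sketch, not the Lustre code: the names
pick_alloc_path, alloc_path and WATERMARK_PAGES are hypothetical, the page size
is passed in as a parameter, and the strict greater-than comparison at the 4 page
line is an assumption. It only models which path would be chosen and allocates
nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the decision only; no memory is allocated. */
enum alloc_path { PATH_KMALLOC, PATH_VMALLOC };

/* The "4 page water mark" from the text. */
#define WATERMARK_PAGES 4

/*
 * Return which allocator a request of `size` bytes would be routed to
 * on a system with the given page size. Assumption: requests strictly
 * larger than the water mark take the vmalloc path.
 */
static enum alloc_path pick_alloc_path(size_t size, size_t page_size)
{
	assert(page_size > 0);
	return size > WATERMARK_PAGES * page_size ? PATH_VMALLOC : PATH_KMALLOC;
}
```

Under these assumptions, a request of 16K plus one byte takes the vmalloc path with
4K pages but stays on the kmalloc path with 64K pages, where the water mark is 256K.
That is one way the 64K page case differs. A 4MB request takes the vmalloc path in
both cases.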
Isn't it the LARGE functions that do the switching? For example, OBD_ALLOC
ends up at __OBD_MALLOC_VERBOSE, which as far as I can see calls kmalloc
(with __GFP_ZERO, and hence the use of kzalloc).
julia