On May 10, 2018, at 07:25, Vimal Patnaik <VPatnaik(a)tylercapital.co.uk> wrote:
Hey folks,
Server version details – Lustre 2.7.19.8
Client version details – Lustre 2.11.0-1 and 2.7.19 (old version) tested on both
Question is...
We have a Lustre client running a primarily compute-intensive operation that works over a
medium-sized data set (~80 GB) made up of files of roughly 300-500 MB each. We read those
files sequentially, over and over.
We'd like to keep the whole data set in the Lustre client cache, and have raised the cache
limit using lctl set_param llite.*.max_cached_mb. However, cache usage appears to top out at
~93% of the limit, i.e. when configured at the default of 70304 MB for this host (140 GB
of memory), used_mb tops out at 65920:
llite.omegad01-ffff88116b38a800.max_cached_mb=
users: 6
max_cached_mb: 70304
used_mb: 65920
unused_mb: 4384
reclaim_count: 0
Likewise, when we raise the max to 100000 MB, usage seems to top out at ~93 GB (used_mb stops
growing, and network traffic on the Lustre port back to the Lustre server increases).
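To make the ~93% figure concrete, here is a small Python sketch that parses the output pasted above and computes the fill ratio. The function name parse_max_cached_mb is made up for illustration and is not part of any Lustre tooling; the sample text is simply the output quoted in this message.

```python
# Sketch only: parse the "key: value" lines of the max_cached_mb output
# quoted above and compute how full the cache is relative to its limit.
# parse_max_cached_mb is a hypothetical helper, not a Lustre API.

SAMPLE = """\
llite.omegad01-ffff88116b38a800.max_cached_mb=
users: 6
max_cached_mb: 70304
used_mb: 65920
unused_mb: 4384
reclaim_count: 0
"""


def parse_max_cached_mb(text: str) -> dict[str, int]:
    """Return the integer fields from the pasted output as a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip the header line, which has no "key: value" pair.
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


stats = parse_max_cached_mb(SAMPLE)
fill_ratio = stats["used_mb"] / stats["max_cached_mb"]
print(f"cache fill: {fill_ratio:.1%}")  # cache fill: 93.8%
```

The unused_mb field is consistent with the other two (70304 - 65920 = 4384), so the ~6% gap is the whole difference between the limit and what is actually cached.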
I'm wondering whether that is expected, and whether there's any configuration
option that would let us make use of that remaining portion of the cache?
This isn't something we've noticed before - typically people will complain if
we exceed the maximum used memory, but I don't think anyone has complained that
we don't cache as much as requested...
My short-term suggestion would be to increase max_cached_mb by 7% so you can cache the
desired amount. Alternatively, 93 GiB = 99857989632 bytes, so it almost reaches
100000 MB as requested?
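The arithmetic behind both suggestions can be checked with a short Python sketch. It assumes two things that are not confirmed in this thread: that "MB" is read as 10^6 bytes and "GiB" as 2^30 bytes for the comparison, and that the ~93.8% fill ratio stays the same at larger limits. The helper scaled_limit_mb and the 80000 MB target (taken from the "~80 GB" data set size) are illustrative only.

```python
import math

# Sketch only. Assumptions: "MB" = 10**6 bytes, "GiB" = 2**30 bytes,
# and the observed fill ratio is constant across limits.
GIB = 2**30
MB = 10**6

observed_bytes = 93 * GIB          # 93 GiB expressed in bytes
requested_bytes = 100000 * MB      # 100000 MB under the decimal reading
print(observed_bytes)              # 99857989632
print(f"{observed_bytes / requested_bytes:.2%}")  # 99.86%


def scaled_limit_mb(target_mb: int, fill_ratio: float) -> int:
    """Limit to configure so that target_mb fits, given the fill ratio.

    Hypothetical helper: it just divides the target by the ratio and
    rounds up to a whole number of MB.
    """
    return math.ceil(target_mb / fill_ratio)


fill_ratio = 65920 / 70304         # ratio seen at the default limit
limit = scaled_limit_mb(80000, fill_ratio)
print(f"increase over target: {limit / 80000 - 1:.1%}")
```

Under these assumptions the required increase comes out close to the 7% suggested above; whether the plateau really scales linearly with the limit would need to be confirmed on the client.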
You may find more responses on the lustre-discuss(a)lists.lustre.org list for your
question.
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation