On Jun 1, 2015, at 12:22 PM, Kumar, Amit <ahkumar(a)mail.smu.edu>
wrote:
Our default stripe count is 1 and the applications that are running out of space are
probably writing to an OST that has reached 100% of its space usage. But I assumed that,
this being the most common case, it is something Lustre would handle on its own?
Lustre has two methods for assigning OSTs to a file: Round Robin (RR) and Quality Of
Service (QOS). Without getting too deep into how they work, in your case I suspect the
spread of OST usages is enough that Lustre is trying to use the QOS allocation mechanism.
QOS tries to even out usage using a weighted distribution to choose OSTs (so that less
full OSTs are more likely to be selected). However, this doesn’t necessarily prevent a
full OST from being selected. Also, if the user specifically chooses an OST for a file
(using lfs setstripe), then that would override the QOS mechanism.
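
To illustrate why a weighted choice does not rule out a nearly full OST, here is a
minimal Python sketch. It is a toy model, not Lustre's actual QOS formula: it assumes the
weight of each OST is simply its free space, and the free-space figures are made up.

```python
import random

def pick_ost(free_kb, rng):
    """Pick an OST index with probability proportional to its free space.

    Toy weighting only; this is an assumption, not Lustre's real QOS algorithm.
    """
    osts = sorted(free_kb)
    weights = [free_kb[i] for i in osts]
    return rng.choices(osts, weights=weights, k=1)[0]

# Hypothetical free space (KB) on four OSTs; OST 3 is almost full.
free_kb = {0: 900_000, 1: 800_000, 2: 850_000, 3: 20_000}

rng = random.Random(42)  # fixed seed so the run is repeatable
counts = {i: 0 for i in free_kb}
for _ in range(10_000):
    counts[pick_ost(free_kb, rng)] += 1

print(counts)  # OST 3 is picked rarely, but not never
```

In this model the nearly full OST still receives a small share of new files, and any
one of those files may be larger than the space it has left.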
In general, I would suggest monitoring the OSTs and looking for ones whose usage is much
higher than normal. Sometimes you can identify a large file which is only using one
stripe when it should be using multiple stripes. Restriping the file can redistribute the
usage. Your goal should be to prevent any OST from filling up. (Personally, I start to
get worried when any OST goes above 80%. If an OST goes above 90%, I treat it like it is
“full” and immediately try to redistribute data.)
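
A small Python sketch of that kind of check follows. The 80% and 90% thresholds are the
personal rules of thumb mentioned above; the function name, the OST names, and the usage
figures are hypothetical, and the percentages are assumed to have been collected
separately (for example, from the output of lfs df).

```python
def classify_osts(usage_pct, warn=80.0, full=90.0):
    """Group OST names by usage: 'ok', 'warn' (> warn), or 'full' (> full)."""
    groups = {"ok": [], "warn": [], "full": []}
    for name, pct in sorted(usage_pct.items()):
        if pct > full:
            groups["full"].append(name)
        elif pct > warn:
            groups["warn"].append(name)
        else:
            groups["ok"].append(name)
    return groups

# Hypothetical usage percentages for four OSTs.
usage = {"OST0000": 62.0, "OST0001": 84.5, "OST0002": 93.1, "OST0003": 58.7}
report = classify_osts(usage)
print(report)
```

Anything in the "full" group is a candidate for immediate redistribution; anything in
"warn" is worth watching.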
Also, if you are using ldiskfs for the backend storage, then you may be running into a
case where you are hitting the reserved block limit for root. The default behavior of the
ext file system is to reserve some blocks so that root will always have some space
available. Unless something has changed, I think this value is 5% of the disk space.
Since ldiskfs is a modified ext file system, it has the same space reservation. So an OST
can run out of space even though it looks like there is some space left.
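
The arithmetic can be sketched in Python as below. It assumes the 5% figure mentioned
above (the real value depends on how the backend file system was formatted), and the
OST size and usage are made-up numbers.

```python
def usable_kb(total_kb, used_kb, reserved_pct=5):
    """Space left for ordinary writes once the reserve is set aside.

    reserved_pct=5 is the assumed default from the text, not a measured value.
    """
    reserved_kb = total_kb * reserved_pct // 100
    raw_free_kb = total_kb - used_kb
    return max(0, raw_free_kb - reserved_kb)

total = 10_000_000  # hypothetical OST size in KB
used = 9_600_000    # 96% used, so 4% still looks free

raw_free = total - used
left_for_writes = usable_kb(total, used)
print(raw_free, left_for_writes)
```

Here the OST reports 400000 KB free, but with a 5% reserve (500000 KB) nothing is left
for ordinary writes.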
What puzzles me is why a copy command would fail at times, because
during a copy I am sure the file system would have checked the available space on an OST
and then obtained an object index to write to where there was enough space.
The copy command doesn’t know about Lustre, and Lustre doesn’t know about the size of the
file you are copying. The copy command asks for a new file, and Lustre gives it one.
There is no communication between the two to ensure that the assigned OSTs have enough
space to accommodate the new file.
Do I get this right? Or am I missing something? Should I change my
default stripe count on the file system to higher than 1?
That might help, but it won’t guarantee that you won’t run into this problem again.
You’ll probably want to put some effort into user education so that the users can select
appropriate stripe counts for their files.
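
As a starting point for that kind of guidance, here is a minimal Python sketch. The rule
of thumb (one stripe per 10 GB, capped at the number of OSTs), the function names, and
the directory path are all hypothetical; only lfs setstripe -c itself is the real
command. The sketch builds the command but does not run it.

```python
import math

def suggest_stripe_count(size_gb, num_osts, gb_per_stripe=10):
    """Hypothetical rule of thumb: one stripe per gb_per_stripe, capped at num_osts."""
    wanted = max(1, math.ceil(size_gb / gb_per_stripe))
    return min(wanted, num_osts)

def setstripe_command(directory, stripe_count):
    """Build (but do not run) an lfs setstripe command for a directory."""
    return ["lfs", "setstripe", "-c", str(stripe_count), directory]

# Hypothetical case: files of about 45 GB on a file system with 16 OSTs.
count = suggest_stripe_count(size_gb=45, num_osts=16)
cmd = setstripe_command("/lustre/scratch/myproject", count)
print(count, " ".join(cmd))
```

Setting the stripe count on a directory means new files created in it inherit that
layout, so users only have to do this once per output directory.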
--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu