Correction: If a file already exists then I can't restripe it unless I redistribute
it. So I guess I cannot salvage any running jobs. I hope I got this right?
Hi Rick,
I just experimented with writing to a nearly full OST with stripe count 1, and I
ran into the same error: No space left. This now makes sense given what you just
said. I checked how much space was left at the point where it complains, and it
is around 0.15% (35GB free on a 22TB OST). Yes, I use ldiskfs; it is good to know
that there are reserved blocks, and hence that an OST can run out of space even
though a tiny amount of space is left.
I am also surprised that only a handful of OSTs have filled up. My guess is that
our chemistry applications, which dump large files, could be the reason, as you
point out.
I do have a question here, though: would it hurt if I restripe a file while it
is currently in use? If not, I will try to salvage some jobs that I see are
dumping to OSTs that are nearly full.
Thank you for the clarification on the copy command and for helping me
understand this much better.
Amit
>wrote:
>> Our default stripe count is 1 and the applications that are running
>> out of space are probably writing to an OST that has reached 100% of its
>> space usage. But I assumed that, this being the most common case, it is
>> something Lustre would handle on its own?
>
>Lustre has two methods for assigning OSTs to a file: Round Robin (RR)
>and Quality Of Service (QOS). Without getting too deep into how they
>work, in your case I suspect the spread of OST usages is enough that
>Lustre is trying to use the QOS allocation mechanism. QOS tries to
>even out usage using a weighted distribution to choose OSTs (so that
>less full OSTs are more likely to be selected). However, this doesn’t
>necessarily prevent a full OST from being selected. Also, if the user
>specifically chooses an OST for a file (using lfs setstripe), then that would
>override the QOS mechanism.
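
To illustrate the point above that a weighted distribution makes a nearly full OST unlikely but not impossible to select, here is a minimal sketch. It is a toy model and not Lustre's actual QOS allocator: the OST names, the free-space figures, and the helper pick_ost are all hypothetical, and the weight is simply the free space.

```python
import random

def pick_ost(free_gb, rng):
    """Pick one OST name with probability proportional to its free space.

    Toy model only: Lustre's real QOS weighting is more involved.
    """
    osts = list(free_gb)
    weights = [free_gb[name] for name in osts]
    return rng.choices(osts, weights=weights, k=1)[0]

# Hypothetical free space per OST, in GB. OST0002 is nearly full.
free_gb = {"OST0000": 9000, "OST0001": 8000, "OST0002": 35}

rng = random.Random(0)  # fixed seed so repeated runs give the same counts
counts = {name: 0 for name in free_gb}
for _ in range(100_000):
    counts[pick_ost(free_gb, rng)] += 1

for name in sorted(counts):
    print(name, counts[name])
```

In this model the nearly full OST still receives a small share of new files, and any one of those files may be large enough to push it over the edge.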
>
>In general, I would suggest monitoring the OSTs and looking for ones which
>have a usage that is much higher than normal. Sometimes you can
>identify a large file which is only using one stripe when it should be using
>multiple stripes. Restriping the file can redistribute the usage.
>Your goal should be to
>prevent any OST from filling up. (Personally, I start to get worried
>when any OST goes above 80%. If an OST goes above 90%, I treat it like
>it is “full” and immediately try to redistribute data.)
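
The 80% and 90% rules of thumb above could be checked with a small script along these lines. This is a sketch under assumptions: it expects text in the column layout that lfs df usually prints (UUID, 1K-blocks, Used, Available, Use%, Mounted on), the sample rows and the file system name "demo" are made up, and flag_full_osts is a hypothetical helper rather than part of any Lustre tool.

```python
def flag_full_osts(lfs_df_text, warn=80, full=90):
    """Return {ost_uuid: (use_percent, label)} for OSTs above `warn` percent.

    Assumes each OST row looks like:
    <fsname>-OSTxxxx_UUID <1K-blocks> <Used> <Available> <Use%> <mount>
    """
    flagged = {}
    for line in lfs_df_text.splitlines():
        fields = line.split()
        # Skip the header, MDT rows, summary rows, and anything malformed.
        if len(fields) < 5 or "-OST" not in fields[0]:
            continue
        if not fields[4].endswith("%"):
            continue
        pct = int(fields[4].rstrip("%"))
        if pct > full:
            flagged[fields[0]] = (pct, "full")
        elif pct > warn:
            flagged[fields[0]] = (pct, "warn")
    return flagged

# Made-up sample in the assumed layout; real output comes from `lfs df`.
SAMPLE = """\
UUID                   1K-blocks        Used   Available Use% Mounted on
demo-MDT0000_UUID     1000000000    50000000   950000000   5% /mnt/demo[MDT:0]
demo-OST0000_UUID    23000000000 11500000000 11500000000  50% /mnt/demo[OST:0]
demo-OST0001_UUID    23000000000 19550000000  3450000000  85% /mnt/demo[OST:1]
demo-OST0002_UUID    23000000000 22770000000   230000000  99% /mnt/demo[OST:2]
"""

result = flag_full_osts(SAMPLE)
for uuid, (pct, label) in sorted(result.items()):
    print(uuid, pct, label)
```

On a real system the text would come from running lfs df on a client, and the column layout should be checked against that output before relying on the parser.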
>
>Also, if you are using ldiskfs for the backend storage, then you may be
>running into a case where you are hitting the reserved block limit for
>root. The default behavior of the ext file system is to reserve some
>blocks so that root will always have some space available. Unless
>something has changed, I think this value is 5% of the disk space.
>Since ldiskfs is a modified ext file system, it has the same space
>reservation. So an OST can run out of space even though it looks like there is
>some space left.
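
As a quick check of the numbers in this thread, the arithmetic can be worked through as below. The 35GB and 22TB figures are from Amit's message and the 5% figure is the default Rick recalls. Decimal units (1TB = 1000GB) are an assumption. The thread does not say how these OSTs were formatted, so the actual reserve on them may differ from 5%.

```python
GB_PER_TB = 1000  # assumption: decimal units

ost_size_gb = 22 * GB_PER_TB   # 22TB OST, from the thread
free_at_enospc_gb = 35         # free space when "No space left" appeared

# Fraction of the OST still free when writes started failing.
free_pct = 100.0 * free_at_enospc_gb / ost_size_gb

# Size of a 5% reserve on the same OST, for comparison.
reserve_gb_at_5pct = ost_size_gb * 5 / 100

print(f"free at ENOSPC: {free_pct:.2f}% of the OST")
print(f"5% reserve would be: {reserve_gb_at_5pct:.0f} GB")
```

The free space observed at failure is far smaller than a full 5% reserve would be, so the two figures should not be assumed to describe the same limit.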
>
>> What puzzles me is why a copy command would fail at times, because
>> during a copy I am sure the file system would have checked the available
>> space on an OST and then obtained an object index to write to where there
>> was enough space.
>
>The copy command doesn’t know about Lustre, and Lustre doesn’t know
>about the size of the file you are copying. The copy command asks for
>a new file, and Lustre gives it one. There is no communication between
>the two to ensure that the assigned OSTs have enough space to
>accommodate the new file.
>
>> Do I get this right? Or am I missing something? Should I change my
>> default stripe count on the file system to higher than 1?
>
>That might help, but it won’t guarantee that you won’t run into this
>problem again. You’ll probably want to put some effort into user education so
>that the users can select appropriate stripe counts for their files.
>
>—
>Rick Mohr
>Senior HPC System Administrator
>National Institute for Computational Sciences
>http://www.nics.tennessee.edu
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss(a)lists.01.org
https://lists.01.org/mailman/listinfo/hpdd-discuss