Hi Amit and Patrick,
On 09/04/2015 11:09 PM, Patrick Farrell wrote:
Sent this once already, but in reply to the wrong message...
Martin might know about that short read thing, since his site has a nice wiki page on
it:
https://wickie.hlrs.de/platforms/index.php/Lustre_short_read

Sure. At first I didn't relate this question to short reads, but
"Erroneous read. Read 1 instead of 8" sounds a bit like short reads.
On the other hand, are these numbers in bytes? If that's the case I
would doubt it's a short read of a single byte. I would expect that at
least the block size is read in one IOP (unless the application really
reads each byte separately).
At biowulf.nih.gov/apps/gaussian/#errors I found the "translation to English":
Disk quota of disk size exceeded. Could also be disk failure or NFS
timeout.
If you are really close to the quota limit, it might happen that the client thinks it
could write the data, and by the time the next read happens, the quota mechanism has wiped
away a few bytes. At least I have seen such behavior on NFS. Full OSTs could perhaps cause
similar behavior.
Another issue comes to mind: which kind of locking is used? It can be specified in the
mount options. Locking might slow down Lustre a bit (also depending on the version you are
using). No mount option at all means there is no locking, and thread A does not notice that
thread B still has the file open.
But the error message you wrote suggests that some locking is active. localflock
handles locking on each client locally, so the threads notice that the file is still in
use as long as the owner is on the same node, while flock does global locking. Maybe you
have to switch to global flock if you use several nodes within one job.
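To illustrate the difference: the sketch below (my own, with a made-up file path, not
anything from Gaussian) takes an advisory lock with flock(2). With "-o localflock" such a
lock is only visible to processes on the same client node; with "-o flock" it is enforced
across all clients.

```c
/* Sketch: advisory locking via flock(2) on a Lustre-mounted file.
 * Whether the lock is visible to other client nodes depends on the
 * mount option: "-o localflock" keeps it node-local, "-o flock"
 * makes it cluster-wide. The path below is made up for illustration. */
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive, non-blocking lock on path.
 * Returns the locked fd on success, -1 on failure
 * (e.g. EWOULDBLOCK if someone else holds the lock). */
int try_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Running this from two jobs on different nodes is one way to check whether locks are really
global: with localflock the second node's try_lock would still succeed.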
Technically Lustre is allowed to return fewer bytes than requested, as it says on that
page. But it normally doesn't; LU-6389 is a bug where that can happen fairly often.
(Again, it's technically allowed, as that page says, but it shouldn't really
happen in practice, which is why LU-6389 is a bug.)
So perhaps Gaussian does not retry short reads? If memory serves, it's closed
source, so you can't check, but perhaps you could ask the vendor?
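For what it's worth, a read loop that tolerates short reads looks roughly like this. This
is a generic POSIX sketch (the name read_full is my own), not Gaussian's actual I/O code:

```c
/* Sketch: retry read(2) until the full count arrives, EOF, or an error.
 * A short read (n < count - done) is simply continued, which is what
 * POSIX semantics require of a robust caller. */
#include <errno.h>
#include <unistd.h>

ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t done = 0;
    while (done < count) {
        ssize_t n = read(fd, (char *)buf + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* real error */
        }
        if (n == 0)
            break;          /* EOF: return what we have */
        done += (size_t)n;  /* short read: keep going */
    }
    return (ssize_t)done;
}
```

An application that instead treats a single short read() as fatal would report exactly the
kind of "Read 1 instead of 8" error seen here.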
Maybe you can see something with strace? But I'm not sure whether you are
allowed to do that kind of debugging.
Martin
________________________________________
From: HPDD-discuss [hpdd-discuss-bounces(a)lists.01.org] on behalf of Patrick Farrell
[paf(a)cray.com]
Sent: Friday, September 04, 2015 4:03 PM
To: Kumar, Amit; Martin Hecht; hpdd-discuss(a)lists.01.org
Subject: Re: [HPDD-discuss] Stale File is it possible?
What's your Lustre version?
If the file looks OK when you look at it yourself (ie, no gaps), you might be running in
to this bug:
https://jira.hpdd.intel.com/browse/LU-6389
Lustre 2.5 and newer will sometimes return fewer than expected bytes on a read or write,
without giving an error.
- Patrick
________________________________________
From: Kumar, Amit [ahkumar(a)mail.smu.edu]
Sent: Friday, September 04, 2015 2:51 PM
To: Martin Hecht; Patrick Farrell; hpdd-discuss(a)lists.01.org
Subject: RE: [HPDD-discuss] Stale File is it possible?
Patrick & Martin,
Thank you for your responses. The files are Gau-*.rwf files; they are data files created by
Gaussian, along with a bunch of text or log files. Gaussian does what they call "logically
random access".
Error that we run into often is: Erroneous read. Read 1 instead of 8
I don't seem to find any Lustre/LNet errors in dmesg or syslogs, although I have
seen "Error -2 syncing data on lock cancel" here and there. Those tend to appear
on other clients as well, so I don't think that is of significance.
We are mounting Lustre over IB, not sure if that has any significance.
I am looking at running another test with additional debugging enabled; if you have any
hints here, that would be helpful.
Best Regards,
Amit