Hi Patrick,
yes, I was talking about the flock mount option, and I believe this is
an interface to allow the application to request locks from ldlm.
Without the flock mount option applications that rely on POSIX locking
are in trouble.
I'm not sure if the applications believe that they have acquired a lock,
or if it's the decision of some application to behave as if they had a
lock (ignoring that there may be race conditions with other threads or
processes). I'm also not sure if Gaussian uses flock at all but I
thought so. However, I can't find any document about this.
But you are right, the syslog message about syncing data on lock cancel
is no indication of whether the mount option is used. It could just as
well be a Lustre-internal lock.
Martin
On 09/07/2015 03:25 PM, Patrick Farrell wrote:
Martin,
Interesting find on the quota stuff.
About file locking...
Flock (the mount option I think you're talking about) refers specifically to POSIX
file locks. It does not change the normal Lustre locking, it only controls POSIX file
locking, which is voluntary in any case (in the sense that an application can request and
wait for such a lock if it is held elsewhere, but it can also read or write without asking
for the lock).
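
To illustrate the "voluntary" part, here is a minimal sketch. It uses
flock(2) rather than fcntl() locks only because two descriptors in one
process can then conflict with each other; that choice, the function name
and the file path are assumptions for the demo, not anything taken from
Gaussian or Lustre.

```python
import fcntl
import os


def advisory_lock_demo(path):
    """Show that a held lock does not stop another descriptor from writing."""
    holder = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    other = os.open(path, os.O_RDWR)
    try:
        # The first descriptor takes an exclusive advisory lock.
        fcntl.flock(holder, fcntl.LOCK_EX)

        # A cooperating writer asks for the lock and is refused.
        try:
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            lock_refused = False
        except BlockingIOError:
            lock_refused = True

        # A writer that never asks is not stopped by the lock at all.
        written = os.write(other, b"unlocked write")
        return lock_refused, written
    finally:
        os.close(other)
        os.close(holder)
```

On a local Linux file system I would expect the lock request to be refused
and the write to go through anyway, which is exactly the race an
application accepts when it skips the lock.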
- Patrick
________________________________________
From: Martin Hecht [hecht(a)hlrs.de]
Sent: Monday, September 07, 2015 4:46 AM
To: Patrick Farrell; Kumar, Amit; hpdd-discuss(a)lists.01.org
Subject: Re: [HPDD-discuss] Stale File is it possible?
Hi Amit and Patrick,
On 09/04/2015 11:09 PM, Patrick Farrell wrote:
> Sent this once already, but in reply to the wrong message...
>
> Martin might know about that short read thing, since his site has a nice wiki page on it:
>
> https://wickie.hlrs.de/platforms/index.php/Lustre_short_read
sure. At first I didn't relate this question to short reads, but
"Erroneous read. Read 1 instead of 8" sounds a bit like short reads.
On the other hand, are these numbers in bytes? If that's the case I
would doubt it's a short read of a single byte. I would expect that at
least the block size is read in one IOP (unless the application really
reads each byte separately).
At biowulf.nih.gov/apps/gaussian/#errors I have found the "translation
to English":
Disk quota or disk size exceeded. Could also be disk failure or NFS
timeout.
If you are really close to the quota limit, it might happen that the client thinks that
it could write the data, and by the time the next read happens, the quota mechanism has
wiped away a few bytes. At least I have seen such behavior on NFS. Full OSTs could perhaps
cause similar behavior.
Another issue comes to my mind: Which kind of locking is used? It can be specified in the
mount options. Locking might slow down lustre a bit (also depending on the version you are
using). No mount option at all means there is no locking and thread A does not notice that
thread B has the file still open.
But the error message you wrote suggests that some locking is active. localflock
handles locking on each client locally, so the threads notice that the file is still in
use as long as the owner is on the same node, and flock does global locking. Maybe you
have to switch to global flock if you use several nodes within one job.
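
One way to check from user space whether a mount accepts lock requests at
all is a small probe like the sketch below. The function name is made up,
and the set of error codes is an assumption: as far as I know a Lustre
client mounted without flock or localflock answers lock requests with
ENOSYS, but please verify that on your version. Note that the probe cannot
tell localflock from flock, since both accept the request on a single node.

```python
import errno
import fcntl
import os
import tempfile


def probe_locking(directory):
    """Return True if a POSIX lock request succeeds on a file in directory."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            # Non-blocking request for an exclusive lock on the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # Assumed error codes for "locking not available on this mount".
            if exc.errno in (errno.ENOSYS, errno.ENOLCK, errno.EOPNOTSUPP):
                return False
            raise
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling probe_locking() on a directory inside the Lustre mount, from one of
the compute nodes, should show which of the cases above applies there.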
> Technically Lustre is allowed to return fewer bytes than requested, as it says on
> that page. But it doesn't normally - LU-6389 is a bug where that can happen kind of
> often. (Again, it's technically allowed as that page says... But it shouldn't
> really happen in practice, which is why LU-6389 is a bug.)
>
> So perhaps Gaussian does not retry short reads? If memory serves, it's closed
> source, so you can't check - but perhaps you could ask the vendor?
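
For reference, retrying a short read usually looks like the loop sketched
below. This is only an illustration of the general technique; the function
name is made up, and nothing here is known about how Gaussian actually
reads its files.

```python
import os


def read_fully(fd, count):
    """Read up to count bytes, retrying after short reads until EOF."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # An empty result means end of file, so stop retrying.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller that needs exactly 8 bytes would then compare the length of the
result with 8 and treat a shorter result as a real end of file, instead of
failing on the first short read.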
Maybe you can see something with strace? But I'm not sure whether you are
allowed to do that kind of debugging.
Martin
> ________________________________________
> From: HPDD-discuss [hpdd-discuss-bounces(a)lists.01.org] on behalf of Patrick Farrell [paf(a)cray.com]
> Sent: Friday, September 04, 2015 4:03 PM
> To: Kumar, Amit; Martin Hecht; hpdd-discuss(a)lists.01.org
> Subject: Re: [HPDD-discuss] Stale File is it possible?
>
> What's your Lustre version?
>
> If the file looks OK when you look at it yourself (i.e., no gaps), you might be running
> into this bug:
>
> https://jira.hpdd.intel.com/browse/LU-6389
>
> Lustre 2.5 and newer will sometimes return fewer than expected bytes on a read or
> write, without giving an error.
>
> - Patrick
> ________________________________________
> From: Kumar, Amit [ahkumar(a)mail.smu.edu]
> Sent: Friday, September 04, 2015 2:51 PM
> To: Martin Hecht; Patrick Farrell; hpdd-discuss(a)lists.01.org
> Subject: RE: [HPDD-discuss] Stale File is it possible?
>
> Patrick & Martin,
>
> Thank you for the responses. The files are Gau-*.rwf files (data files created by
> Gaussian) and a bunch of text or log files. Gaussian does what they call "logically
> random access".
>
> Error that we run into often is: Erroneous read. Read 1 instead of 8
>
> I don't seem to find any lustre/lnet errors in dmesg or syslogs. Although I have
> seen "Error -2 syncing data on lock cancel" here and there but they tend to be
> also on other clients so I don't think that is of significance.
>
> We are mounting Lustre over IB, not sure if that has any significance.
>
> I am looking at running another test with additional debugging enabled, if you have
> any hints here that would be helpful.
>
> Best Regards,
> Amit