On 09/04/2015 11:09 PM, Patrick Farrell wrote:
> Sent this once already, but in reply to the wrong message...
> Martin might know about that short read thing, since his site has a nice wiki page on it:
> https://wickie.hlrs.de/platforms/index.php/Lustre_short_read
> Technically Lustre is allowed to return fewer bytes than requested, as it says on that
> page. But it doesn't normally - LU-6389 is a bug where that can happen kind of often.
> (Again, it's technically allowed as that page says... But it shouldn't really
> happen in practice, which is why LU-6389 is a bug.)
> So perhaps Gaussian does not retry short reads? If memory serves, it's closed
> source, so you can't check - but perhaps you could ask the vendor?
Depends on the license. At least in the past it was common to get
the Gaussian source code, so you could compile it yourself or modify it
(I did that myself a long time ago).
Short reads should be solvable using LD_PRELOAD, although one might
argue that Gaussian as well as Lustre should handle this better on
their own.
Attached are rather untested read-preload files. Compile them and then run
the Gaussian binary with something like
LD_PRELOAD=<path/to/file>/read-preload.so <binary>
Btw, the wiki code is not ideal: if the very first read already returns
-1, it is going to use ptr - 1, which might be outside of the valid
address space. The same applies if one successful (short) read is
followed by several reads that return -1.
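For illustration, a retry loop that avoids this pointer problem could look roughly like the sketch below. This is not the attached preload code and not the wiki code; read_fully is a hypothetical helper name, and the choice to return the bytes collected so far when an error follows a partial read is an assumption. The point is only that the buffer offset advances when read() returns a positive count and never on -1. In a preload library the inner call would have to be the real read() (for example resolved via dlsym with RTLD_NEXT), and looping until the full count arrives is probably only appropriate for regular files, since it would block on pipes, sockets, or terminals.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is illustrative): call read() repeatedly until
 * `count` bytes have arrived, EOF is reached, or a real error occurs.
 * The offset `done` only grows on a positive return value, so a -1 result
 * can never move the destination pointer backwards.
 */
ssize_t read_fully(int fd, void *buf, size_t count)
{
    char *ptr = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, ptr + done, count - done);

        if (n > 0) {
            done += (size_t)n;      /* short or full read: advance */
        } else if (n == 0) {
            break;                  /* EOF: return what we have */
        } else if (errno == EINTR) {
            continue;               /* interrupted by a signal: retry */
        } else {
            /* Assumed policy: report partial progress if there is any,
             * otherwise pass the error through with errno intact. */
            return done > 0 ? (ssize_t)done : -1;
        }
    }
    return (ssize_t)done;
}
```

If the error persists, the caller sees it on the next call, which then fails with no bytes read and returns -1.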
Cheers,
Bernd
--
DataDirect Networks