Dear All,

 

We are running a lot of Gaussian jobs (I am sure everyone knows this, but for clarity: Gaussian is a computational chemistry package) that run for a month or two on our diskless cluster compute nodes. We have noticed that these jobs write to an open file and then may not write to that file again for one to three weeks.

In theory this should not be a problem. However, Gaussian is failing with the complaint "Erroneous read on file", so I am wondering: could the Lustre client or server evict any of the state it holds for these files after some period of inactivity? Is there a timeout setting on the client that I could tweak so that the application can still access its files after a long idle period, rather than having them treated as stale? I am just trying to understand what is happening; any help is greatly appreciated.

 

Best regards,

Amit