I've noticed one issue with NFS that may or may not be related.  Some NFS systems are "clusters" of storage nodes, with the data striped across the set of machines.  In these systems, a client connects to one node, and a write made through that node may only become visible on the other nodes after some time lag.

If you're using Condor, this means that the NFS mounts on the submitter and execution nodes may be out of sync, since each mount may point to a different physical storage node in the cluster.

Adding small delays after writing files that both the execution and submitter nodes need to see fixed the problem for us.
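
A fixed sleep worked for us, but the same idea can be written as a polling loop that waits until the file is actually visible on the reading side. This is only a rough sketch, not what we run in production: the function name wait_for_file, the timeout values, and the "at least min_size bytes" check are all assumptions made for illustration.

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll_interval=1.0, min_size=1):
    """Poll until `path` is visible and holds at least `min_size` bytes.

    Returns True if the file showed up before `timeout` seconds elapsed,
    False otherwise. All parameter defaults are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # stat() goes through the local NFS mount, so it reflects what
            # this particular client can see right now.
            if os.stat(path).st_size >= min_size:
                return True
        except FileNotFoundError:
            pass  # not visible on this mount yet; keep waiting
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

On the node that reads the file, you would call something like wait_for_file("/home/user/job/output.dat", timeout=120) before opening it (the path here is made up). Note that a size check does not prove the writer has finished; if that matters, have the writer create a separate marker file last and wait for that instead.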

On Sun, Jun 12, 2011 at 6:15 AM, Mag Gam wrote:
At our university we are heavy NFS users. When we run long jobs
with condor and there is a performance problem with our home
directories (which are on NFS), it seems the job gets requeued.

I was wondering if anyone else out there has a similar problem and
what they did to fix it :-)
