
Re: [Condor-users] nfs and condor



I've noticed one issue with NFS that may or may not be related. Some NFS systems are "clusters" of storage nodes, with data striped across the set of machines. In these systems, a client connects to one node, and a change written there may reach the other nodes only after some time lag.

If you're using Condor, this means that the NFS mounts on the submitter and execution nodes may be out of sync, since each mount may be connected to a different physical node of the storage cluster.

Adding small delays after writing files that both the execution and submitter nodes need to see fixed the problem for us.
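
As a rough illustration of the idea (not the exact code we used), a fixed sleep can be replaced by a short polling loop that waits until the file is visible from the reading side. The helper name wait_for_file and its timeout, interval, and min_size parameters are made up for this sketch, and NFS attribute caching on the client may still need tuning separately:

```python
import os
import time


def wait_for_file(path, timeout=60.0, interval=1.0, min_size=1):
    """Poll until `path` is visible with at least `min_size` bytes.

    Hypothetical helper: returns True once the file is seen,
    or False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # stat() raises OSError while the file is not yet visible
            # on this mount (this also covers stale-handle errors).
            if os.stat(path).st_size >= min_size:
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A job wrapper on the reading node could call something like wait_for_file("results.dat", timeout=120) before opening the file, and treat a False return as a real failure rather than a sync delay. The file name here is only an example.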

On Sun, Jun 12, 2011 at 6:15 AM, Mag Gam <magawake@xxxxxxxxx> wrote:
At our university we are heavy NFS users. When we run long jobs
with Condor and there is a performance problem with our home
directories (which are on NFS), the job seems to get requeued.

I was wondering if anyone else out there has had a similar problem and
what they did to fix it :-)