
Re: [Condor-users] nfs and condor



On Sun, 12 Jun 2011, Mag Gam wrote:

How are you adding the writes after each job?

I was wondering if there are any tricks I can do such as writing log
files to local filesystems instead of NFS. Can I place, "log",
"output", and "error" to local directories?


Yes, you can and should.  Some very busy sites even get an SSD
device to put them on.
There are also tricks you can do with condor_transfer_files
to transfer your executable to the worker nodes and
to bring back the stdout/stderr files.

Steve Timm
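
A submit description file along the lines Steve describes might look
like the sketch below. The submit commands (log, output, error,
should_transfer_files, when_to_transfer_output, transfer_executable)
are standard Condor ones; the executable name and the
/var/tmp/condor-logs directory are placeholders assumed for
illustration, not taken from this thread. The directory must already
exist on a local disk of the submit machine and be writable by the
submitting user.

```
# Sketch only: my_job.sh and /var/tmp/condor-logs are hypothetical.
universe                = vanilla
executable              = my_job.sh

# Use Condor's file transfer instead of relying on a shared NFS mount:
# ship the executable out, bring stdout/stderr back when the job exits.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_executable     = True

# Keep the user log, stdout and stderr on local disk, not the NFS home.
log    = /var/tmp/condor-logs/job.$(Cluster).$(Process).log
output = /var/tmp/condor-logs/job.$(Cluster).$(Process).out
error  = /var/tmp/condor-logs/job.$(Cluster).$(Process).err

queue
```

With this layout, a slow NFS home directory should affect only the
files the job itself reads or writes there, not the log and
stdout/stderr files that Condor manages.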



On Sun, Jun 12, 2011 at 6:27 AM, Erik Aronesty <erik@xxxxxxx> wrote:
I've noticed one issue with NFS that may or may not be related.  Some NFS
systems are "clusters" of storage nodes that are striped across the set of
machines.   In these NFS systems, a client connects to one node, and the
information may propagate to other nodes after some time lag.

If you're using Condor, this means that the NFS mounts on the submit and
execute nodes may be out of sync, since each mount may point to a different
physical storage cluster node.

Adding small delays after writing files that both the execution and
submitter node need to see fixed the problem for us.
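
The delay workaround could be sketched roughly as follows in Python.
This is an illustration under assumptions, not the code Erik's site
used: the function names, the 2-second default delay, and the polling
helper are all hypothetical. The writer flushes the file and then
pauses; the polling helper is an optional variation for the reading
side that waits until the file shows up with the expected size
instead of trusting a fixed delay.

```python
import os
import time


def write_then_settle(path, data, settle_seconds=2.0):
    """Write data to path, flush it to storage, then pause briefly.

    settle_seconds is an assumed value; tune it to the lag observed
    on your own NFS cluster.
    """
    with open(path, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the server
    time.sleep(settle_seconds)  # give other storage nodes time to catch up


def wait_until_visible(path, expected_size, timeout=30.0, poll=0.5):
    """Return True once path exists with at least expected_size bytes.

    Hypothetical helper for the reading side; returns False if the
    file has not appeared in full before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if os.stat(path).st_size >= expected_size:
                return True
        except FileNotFoundError:
            pass  # not visible on this mount yet
        time.sleep(poll)
    return False
```

A job wrapper might call write_then_settle() for each result file it
produces, while a script on the submit side calls
wait_until_visible() before reading that file.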

On Sun, Jun 12, 2011 at 6:15 AM, Mag Gam <magawake@xxxxxxxxx> wrote:

At our university we are heavy NFS users. When we run long jobs
with Condor and there is a performance problem with our home
directories (which are on NFS), the jobs seem to get requeued.

I was wondering if anyone else out there has a similar problem and
what they did to fix it :-)
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/



--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Group Leader.
Lead of FermiCloud project.