Re: [Condor-users] NFS errors with log file
- Date: Thu, 05 Jun 2008 21:11:38 -0500 (CDT)
- From: Steven Timm <timm@xxxxxxxx>
- Subject: Re: [Condor-users] NFS errors with log file
We are successfully running O(100k)-node DAGs in LIGO using the existing
7.0.1 schedd and DAGMan scalability enhancements with just a single schedd.
I am curious what limitation you are running into with your large DAGs
on a single schedd. Are you using an older 6.8.x version?
We have not applied any of the DAGMan scalability enhancements, as far
as I know. The multiple-schedd configuration dates from the Condor 6.7
days. There is one schedd to deal with running the DAGs, six to submit
glideins to remote grid sites, and four to match jobs to slots in
the glidein pool. These four sub-schedds use a large fraction of the
CPU, and as I said, right now they are all on the same node.
Having a dual quad-core node will help a lot and will probably get
us around the problem for now.
Stuart Anderson anderson@xxxxxxxxxxxxxxxx