
Re: [Condor-users] faster condor_submits with dagman



Hi Steve,

I've thought about multiple schedds. I've been hesitant to go that route, but I like the idea of one for dagman and one for the jobs.
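
Something like the following is roughly what I had in mind for the second schedd. This is an untested sketch based on the multiple-schedd recipes I've seen; the local name, log, and spool paths are placeholders:

# second schedd, dedicated to the node jobs (dagman would stay on the default one)
SCHEDD_JOBS      = $(SCHEDD)
SCHEDD_JOBS_ARGS = -local-name schedd_jobs
SCHEDD.SCHEDD_JOBS.SCHEDD_NAME = schedd_jobs
SCHEDD.SCHEDD_JOBS.SCHEDD_LOG  = $(LOG)/SchedLog.schedd_jobs
SCHEDD.SCHEDD_JOBS.SPOOL       = $(SPOOL).schedd_jobs
DAEMON_LIST = $(DAEMON_LIST), SCHEDD_JOBS

Jobs could then be pointed at it with condor_submit -name schedd_jobs@<host>, though I haven't worked out yet how to make dagman use it for the node jobs.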

When you ask about spool files, do you mean the input and output files, or something else?
For input files, there are three per job: ~4K, ~100K, and ~400K.
Output is one file, under 5K.

The variables you asked about are these:
JOB_START_COUNT = 50
JOB_START_DELAY = 2

But don't those just control the rate at which jobs start once they are already in the queue? I feel like my problem is the inability to get jobs into the queue quickly enough.
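
(If I'm reading those two correctly, they only throttle how fast the schedd spawns shadows: at most JOB_START_COUNT starts every JOB_START_DELAY seconds, so with 50 and 2 that would still allow roughly 25 job starts per second.)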

Peter


On Jun 30, 2010, at 13:47, Steven Timm wrote:

Peter--have you considered using more than one schedd on your
submitter? That is what some of the big virtual organizations do.
For example, CDF has one schedd to manage the dagmans and another
one to manage the jobs that they spawn.  At one time in the past
they used as many as four schedds for the jobs.  Basically
the dagman processing and the submission of the jobs that make up
the dag stages are competing for the condor_schedd's time.

Also, how many spool files do you have for each submitted job, and
how big are they? That could be a factor.

Also, what are the values of JOB_START_COUNT and JOB_START_DELAY?

Steve



On Wed, 30 Jun 2010, Peter Doherty wrote:

Hello,

I'm running a large number of short-running jobs (2 minutes, maybe?) on a large Condor pool. I know, I know, this isn't ideal and not Condor's design, and I should figure out a way to make the jobs longer running. But I want to work on this a little more.
It's a large Condor DAG managing the jobs.
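
For reference, the setup looks roughly like this (file and macro names here are placeholders, not the real ones):

# job.sub -- each DAG node runs one short task
universe                = vanilla
executable              = run_task.sh
arguments               = $(task_id)
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = small.dat, medium.dat, large.dat
output                  = task_$(task_id).out
error                   = task_$(task_id).err
log                     = tasks.log
queue

# tasks.dag -- thousands of independent nodes, no dependencies between them
JOB  task0001 job.sub
VARS task0001 task_id="0001"
JOB  task0002 job.sub
VARS task0002 task_id="0002"
# ... and so on, one JOB/VARS pair per task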

The jobs finish as fast as dagman can submit new ones into the queue, so eventually I go from 1000 idle jobs and 2000 running to 10 idle jobs and 2000 running, and I can't keep the queue full of pending jobs. I've moved the schedd's spool onto a RAM disk to try to improve throughput, and this helped somewhat but not enough. Any other suggestions for tuning the system for a higher rate of job throughput, before I give up and take a different approach?

Here are some of the variables I've been playing with, with limited success. The machine (schedd and collector/negotiator on the same host) is a 2.4 GHz 4-core AMD system with 8 GB of RAM.


SCHEDD_INTERVAL    = 30
DAGMAN_MAX_JOBS_IDLE = 1000
DAGMAN_SUBMIT_DELAY = 0
DAGMAN_MAX_SUBMITS_PER_INTERVAL = 1000
DAGMAN_USER_LOG_SCAN_INTERVAL = 1
SCHEDD_INTERVAL_TIMESLICE = 0.10
SUBMIT_SKIP_FILECHECKS = True
HISTORY =
NEGOTIATOR_INTERVAL = 30
NEGOTIATOR_MAX_TIME_PER_SUBMITTER=20
NEGOTIATOR_MAX_TIME_PER_PIESPIN=20


Thanks,
Peter

--
------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/