Hello Condor folks,
This subject has come up before (see subject: 'DAGMAN slow startup'), but I was hoping somebody might have some more insight. I submit dependent jobs via condor_submit_dag, and I'm finding that there is a delay between when condor_dagman starts running and submits the first job in my DAG and when that job actually gets farmed out to one of the machines in my network. The delay is significant: anywhere from 2 to 5 minutes. On the odd occasion the job starts almost immediately, so I'm assuming it's related to waiting for a reschedule event or something similar, and is kind of luck of the draw.
When I submit any of these jobs with a plain ol' condor_submit, it finds a dance partner pretty quickly and starts running. The delay seems to happen only when dagman submits a job. I don't know the underlying logic behind these calls, so I don't know if that makes any sense to those of you who are developing for Condor.
The solution offered in the previous thread was 'doing a condor_reschedule will normally get the job running in less than a minute'. If anybody else is experiencing this issue, is that the common solution, i.e. performing a condor_reschedule once condor_dagman has submitted its jobs for a run?
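To make the question concrete, here is a rough sketch of the workaround I have in mind, not something I know to be the recommended approach. It only uses the condor_submit_dag and condor_reschedule commands; the function name, the DAG file name and the 10 second wait are placeholders I made up for illustration.

```python
import subprocess
import time


def submit_dag_then_reschedule(dag_file, delay_seconds=10,
                               run=subprocess.run, sleep=time.sleep):
    """Submit a DAG, wait briefly, then request a reschedule.

    `run` and `sleep` are injectable so the function can be exercised
    without a Condor installation. The default delay is an assumed
    value, not a tuned or recommended one.
    """
    # Hand the DAG to condor_submit_dag, which starts condor_dagman.
    run(["condor_submit_dag", dag_file], check=True)

    # Give condor_dagman a moment to submit its first node job
    # (assumption: a fixed wait is good enough for this sketch).
    sleep(delay_seconds)

    # Ask for a new scheduling cycle instead of waiting for the next one.
    run(["condor_reschedule"], check=True)
```

I would call it as submit_dag_then_reschedule("my_workflow.dag"), where my_workflow.dag is a hypothetical file name. A fixed wait is crude, since dagman may not have submitted the first job by then, which is part of why I'm asking whether there is a better way.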
I'm using Condor 7.0.4 if that helps. If there's any configuration setting or anything else I could do to shorten the startup time for DAG-submitted jobs, I'd greatly appreciate it :)