Re: [HTCondor-users] change to condor_submit - user feedback desired! (was Re: multiple condor_submit's - one cluster)



> On Feb 10, 2015, at 2:59 PM, Dimitri Maziuk <dmaziuk@xxxxxxxxxxxxx> wrote:
> 
> On 02/10/2015 02:45 PM, Todd Tannenbaum wrote:
> 
>>    input = in.$(process)
>>    queue 1000
>> The above requires the system to perform just one fork/exec, create just
>> one network connection, perform authentication (schedd authenticate the
>> condor_submit user) once, perform just one fsync to disk, etc.  On the
>> other hand, invoking condor_submit 1000 times with "queue 1" results in
>> 1000 fsyncs, 1000 authentications, etc.
> 
> So what does condor_submit_dag do for a dag w/ 1000 jobs (and no
> parent-child deps)?

It invokes condor_submit 1,000 times:
- 1,000 fork/exec of condor_submit.
- 1,000 TCP connections.
- 1,000 new authorizations to the schedd.
- 2,000 fsyncs (once per job for the schedd, once per job in the DAG log).
- 1,000 transactions, meaning that after a failure you could have anywhere between 0 and 1,000 jobs in the queue.  By contrast, a single condor_submit with "queue 1000" is one transaction: either 0 jobs (failure) or all 1,000 jobs will be in the queue.
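
As a rough illustration, the tally above can be written as a small Python sketch. The class and function names are hypothetical, made up for this example; the numbers are only the per-job counts quoted in this thread, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmitCost:
    """Counts of the expensive operations listed in this thread."""
    fork_execs: int
    tcp_connections: int
    schedd_auths: int
    fsyncs: int
    transactions: int


def dag_submit_cost(n_jobs: int) -> SubmitCost:
    # Assumption from the thread: one condor_submit per DAG node, and
    # two fsyncs per job (one for the schedd, one for the DAG log).
    return SubmitCost(
        fork_execs=n_jobs,
        tcp_connections=n_jobs,
        schedd_auths=n_jobs,
        fsyncs=2 * n_jobs,
        transactions=n_jobs,
    )


def single_submit_cost(n_jobs: int) -> SubmitCost:
    # Assumption from the thread: one condor_submit with "queue n_jobs"
    # pays each cost once, regardless of how many jobs it queues.
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return SubmitCost(
        fork_execs=1,
        tcp_connections=1,
        schedd_auths=1,
        fsyncs=1,
        transactions=1,
    )


if __name__ == "__main__":
    print("DAG of 1000 nodes:", dag_submit_cost(1000))
    print("One 'queue 1000': ", single_submit_cost(1000))
```

The transaction count is the one that matters for failure behaviour: with 1,000 transactions a failure can leave a partial set of jobs in the queue, while with one transaction it is all or nothing.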

DAGMan isn't particularly efficient here.  On the other hand, it has its own internal layer of consistency and queueing, so it doesn't have to be!

Anyhow, a bit of a tangent, but might be useful to know...

Brian