Re: [HTCondor-users] Prefetch all-to-one inter-dependencies in DAGMan
- Date: Tue, 18 Dec 2012 14:54:06 +0100
- From: Walid Saad <walid.saadd@xxxxxxxxx>
- Subject: Re: [HTCondor-users] Prefetch all-to-one inter-dependencies in DAGMan
Thank you Matthew,
The use case is (1).
I want to send the output file of each completed job (A, B, or C) to the host on which job D will run. (I can specify that host in the requirements section of job D's submit file; it may be the same host that ran A, B, or C.) I need the transfer to start before DAGMan schedules job D, as soon as the output file is present in the Condor cache.
In particular, while A, B, and C are still computing, any output file that is already present should be transferred at the same time to the host on which job D will run.
2012/12/18 Matthew Farrellee <matt@xxxxxxxxxx>
On 12/17/2012 03:15 PM, R. Kent Wenger wrote:
On Mon, 17 Dec 2012, Walid Saad wrote:
I want to submit the following DAG (attached) to the Condor scheduler.
JOB A A.condor
JOB B B.condor
JOB C C.condor
JOB D D.condor
PARENT A B C CHILD D
The output files of jobs A, B, and C are the input files for job D.
According to DAGMan, job D will be submitted only once all of jobs A, B,
and C have completed.
My question is as follows:
Is there a way in Condor to submit job D as soon as one of jobs A, B,
or C has finished?
Right now DAGMan doesn't support a way to do this. (There's a ticket
requesting this kind of feature; for some reason, I can't access the
gittrac tickets right now, so I can't give you a link.)
The only way I can think of to implement this would be a lot of work:
you'd have to have POST scripts for jobs A, B, and C that would
condor_rm the other jobs in that "set", and also mark the node as
successful if the job was removed.
Is the use case:
(0) completion of A, B or C is sufficient for D to complete, i.e. D only needs the output from one of A, B or C; or
(1) completion of A, B or C is sufficient for D to start, i.e. D itself will wait for completion of A, B and C, but can start processing output as soon as any of them completes?
There's a variation on use case (1), where D can only process A then B then C instead of some random ordering, e.g. B C A or C A B.
For (0), you can use Kent's trick with POST. For (1), if you have shared storage, you can make A B C and D into siblings.
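For the sibling approach in (1), D has to discover on its own when each output shows up, since DAGMan no longer orders it after A, B and C. A minimal Python sketch of that polling loop follows. It is only an illustration, not anything HTCondor provides: the function name outputs_as_ready and the file names are hypothetical, and it assumes A, B and C each publish their output on the shared storage atomically (for example by writing to a temporary name and renaming), so that a file that exists is complete.

```python
import os
import time


def outputs_as_ready(paths, poll_seconds=5.0, timeout_seconds=None):
    """Yield each path in `paths` once it exists, in order of appearance.

    Assumption: producers create the final file name atomically, so a path
    that exists is safe to read.
    """
    pending = list(paths)
    deadline = None
    if timeout_seconds is not None:
        deadline = time.monotonic() + timeout_seconds

    while pending:
        # Collect every output that has appeared since the last check.
        ready = [p for p in pending if os.path.exists(p)]
        for p in ready:
            pending.remove(p)
            yield p

        if not pending:
            break
        if deadline is not None and time.monotonic() > deadline:
            raise TimeoutError("still waiting for: " + ", ".join(pending))
        if not ready:
            # Nothing new this round; wait before polling again.
            time.sleep(poll_seconds)
```

D's executable could then loop with something like "for path in outputs_as_ready(['A.out', 'B.out', 'C.out']): process(path)", where the file names and process() are placeholders. For the ordered variation (A then B then C), the same idea applies with D waiting on one path at a time instead of taking whichever appears first.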