
Re: [HTCondor-users] On-the-fly DAGs?



Hi Siarhei,

There are several different ways to do what you're asking for.

If Michael's suggestion using condor_wait does what you need, that's great! I think you would need to run it manually each time, though, so it's a bit error-prone.

Another option would be to use POST scripts. If you put your original job into a single-node DAG, you could write a POST script which checks a certain condition. If the condition passes, your script would write out a new DAG file and then run condor_submit_dag on it. If the condition fails, your script exits and the on-the-fly DAG is done.
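
A minimal sketch of such a POST script, under some assumptions: the condition checked is simply the job's exit code (passed in through DAGMan's $RETURN macro), and the names post_check.sh, step1.sub, step2.sub and next.dag are placeholders rather than anything from your setup.

```shell
#!/bin/bash
# post_check.sh: hypothetical POST script for a single-node DAG.
# Assumed wiring in the original DAG file (all names are placeholders):
#   JOB step1 step1.sub
#   SCRIPT POST step1 post_check.sh $RETURN

# Write a one-node follow-up DAG to the file named in $1.
write_next_dag() {
    cat > "$1" <<'EOF'
JOB step2 step2.sub
EOF
}

# $1: exit code of the node's job; $2: follow-up DAG file (default next.dag).
post_check() {
    job_exit="$1"
    next_dag="${2:-next.dag}"
    if [ "$job_exit" -ne 0 ]; then
        # Condition failed: stop here, nothing more gets submitted.
        echo "job exited with $job_exit; not extending the workflow" >&2
        return 1
    fi
    write_next_dag "$next_dag"
    condor_submit_dag "$next_dag"
}

# DAGMan passes $RETURN as the first argument when it runs the script.
if [ "$#" -gt 0 ]; then post_check "$@"; fi
```

A non-zero exit from the POST script marks the node as failed, so the original DAG also records that the chain stopped.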

A third option (depending on your needs) would be to use a SUBDAG EXTERNAL node. When you define one, you have to name a .dag file for it to run, but that file doesn't need to exist until the moment DAGMan reaches that node. So your earlier jobs, PRE scripts and POST scripts can look at their output and write the .dag file. There are more details in the manual:

http://research.cs.wisc.edu/htcondor/manual/current/2_10DAGMan_Applications.html#SECTION0031091200000000000000
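
A rough sketch of how that could look, assuming a POST script on the first node generates the inner DAG from whatever result files the first job left behind. The names first.sub, process.sub, write_inner.sh, inner.dag and the *.out pattern are all made up for illustration.

```shell
#!/bin/bash
# Write the outer DAG. Its second node is a SUBDAG EXTERNAL whose file
# (inner.dag, a placeholder name) does not have to exist at submit time.
write_outer_dag() {
    cat > "$1" <<'EOF'
JOB first first.sub
SCRIPT POST first write_inner.sh
SUBDAG EXTERNAL rest inner.dag
PARENT first CHILD rest
EOF
}

# Body of the hypothetical write_inner.sh POST script: emit one node per
# result file found in directory $1, writing the inner DAG to file $2.
write_inner_dag() {
    results_dir="$1"
    inner_dag="$2"
    : > "$inner_dag"
    n=0
    for f in "$results_dir"/*.out; do
        [ -e "$f" ] || continue   # glob matched nothing
        n=$((n + 1))
        printf 'JOB part%d process.sub\nVARS part%d input="%s"\n' \
            "$n" "$n" "$f" >> "$inner_dag"
    done
}
```

Here process.sub would pick up the file name through $(input). If no result files are found the inner DAG ends up empty, so a real script would probably want to handle that case explicitly.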

Mark



On Wed, May 9, 2018 at 10:36 AM, Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx> wrote:
My upcoming HTCondor Week presentation goes over a few useful tricks with the newer submit description features which reduce the need for script-generated submit descriptions. Keep an eye out for it in the proceedings, or if you're attending the conference I'll see you there! You might also find the HTCondor Python bindings to be useful for defining and submitting jobs.

Just to be sure I'm clear, we're not talking about the Error or Output parameters, but the Log parameter in the submit description - generally you only ever want one log per cluster, since it's logging the management of the entire cluster or group of clusters from a single submission. It doesn't contain much information that's particularly useful in the context of a single job within a 1000-job cluster.

As for using a cluster number instead of a log file, you could do a condor_wait wrapper like so:

#!/bin/bash
# Wait on the user log of the cluster given as $1 (e.g. 1000).
condor_wait "$(condor_q "$1" -af UserLog | head -n 1)"

You'd give this script the job ID as the argument, and it would wait until all the jobs in the specified cluster are done, assuming the cluster defines a UserLog.

    -Michael Pelletier.

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Vaurynovich, Siarhei
Sent: Wednesday, May 9, 2018 11:17 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] Re: [HTCondor-users] On-the-fly DAGs?


Thank you for your reply, Michael!

That sounds like what I want. I would just prefer to not give a log file as input but instead a cluster number only, and let Condor figure out which log file to watch.

Currently, my submit files are generated programmatically and each job in a cluster gets its own log file. It seems I need to reconsider it.

Thank you,
Siarhei.


-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Michael Pelletier
Sent: Wednesday, May 09, 2018 10:20 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] On-the-fly DAGs?

Sounds like a good use for condor_wait.

When you give condor_wait a job's log file (log = htcondor-$(Cluster).log) it watches the file and will only exit when all the jobs in that log have completed.

So what you'll want to do is write a little script which runs a condor_wait on the pending job cluster and then submits your next job after condor_wait exits.

You could submit it as a "local" universe job so that the condor_wait that's sitting around doing nothing wouldn't be using a CPU slot.
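
A minimal sketch of that idea, with a few assumptions: the script is saved as wait_then_submit.sh, and the log and submit file names are whatever you pass in. Note that condor_wait only reports that the jobs in the log have finished; as far as I know it does not look at their exit codes, so a check for "completed successfully" would have to be added separately.

```shell
#!/bin/bash
# Wait for every job recorded in a user log, then submit the next step.
# $1: user log of the pending cluster; $2: submit file for the next step.
wait_then_submit() {
    log_file="$1"
    next_submit="$2"
    if condor_wait "$log_file"; then
        condor_submit "$next_submit"
    else
        echo "condor_wait failed on $log_file; not submitting" >&2
        return 1
    fi
}

# Write a local-universe submit description (to file $1) that runs this
# script with log $2 and submit file $3, so the waiter takes no CPU slot.
write_waiter_submit() {
    cat > "$1" <<EOF
universe   = local
executable = wait_then_submit.sh
arguments  = $2 $3
log        = waiter.log
queue
EOF
}

# Run directly when called with both arguments.
if [ "$#" -eq 2 ]; then wait_then_submit "$@"; fi
```

For example, write_waiter_submit waiter.sub htcondor-1000.log step2.sub followed by condor_submit waiter.sub would queue the waiter for cluster 1000.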

    -Michael Pelletier.

-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Vaurynovich, Siarhei
Sent: Tuesday, May 8, 2018 9:35 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [External] [HTCondor-users] On-the-fly DAGs?


Hello,

Could you please let me know if it is possible to create on-the-fly DAGs in HTCondor?

Here is an example: I work on some code, and when it is ready I submit a number of jobs as job cluster 1000. After that I work on the next processing step and finish the needed code before the jobs in cluster 1000 have completed. I want to be able to say: start this next set of jobs when, and only if, all the jobs in cluster 1000 have completed successfully. In other words, I want to create an "on-the-fly" DAG. The goal is to have computing done on the early steps of the workflow before the code for the whole workflow is ready, and to keep adding to the workflow on the fly.

Thank you,
Siarhei.

............................................................................



_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@cs.wisc.edu with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
............................................................................





--
Mark Coatsworth
Systems Programmer
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin-Madison
+1 608 206 4703