Re: [HTCondor-users] monitor for dagman job to finish



On Tue, 29 Apr 2014, Jiande Wang wrote:

I want to know if there is any tool that can be used to monitor the status of dagman job. Say I submit a dagman job which contains several condor jobs, I want to monitor this dagman job. Seems to me condor_wait is for individual condor job only.

As Ben mentioned, you can set things up so that HTCondor sends you an email when your DAG completes.

If you want finer-grained status, there are several things you can do.

First of all, there is status information in the DAGMan job's ClassAd (see
http://research.cs.wisc.edu/htcondor/manual/v8.1/2_10DAGMan_Applications.html#SECTION0031013000000000000000).
This is updated as the DAG runs.

You can also use a DAG node status file (see
http://research.cs.wisc.edu/htcondor/manual/v8.1/2_10DAGMan_Applications.html#SECTION0031011000000000000000).
Note that the format of the node status file will change in version 8.1.6, so if you're going to implement something based on it, you should probably wait until you can use 8.1.6.

Neither of these is really a tool per se, but you could certainly build something on top of them. Just as an example, you could easily write a script that uses condor_q (which queries job ClassAds; condor_status reports on machines and daemons) to get information from the DAGMan job's ClassAd and emails you that information every hour.
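
To make that last idea a bit more concrete, here is a rough Python sketch of the script's core. It assumes the DAGMan job's ClassAd carries attributes named DAG_NodesTotal, DAG_NodesDone, DAG_NodesFailed, DAG_NodesQueued, DAG_NodesReady and DAG_NodesUnready (check the manual section linked above, or the `condor_q -long` output for your own DAGMan job, since names may differ by version). The function names are made up for illustration, and the parsing only handles simple "Name = value" lines.

```python
import subprocess

# Attribute names are an assumption based on the DAGMan section of the manual;
# verify them against `condor_q -long` output for your own DAGMan job.
DAG_ATTRS = (
    "DAG_NodesTotal",
    "DAG_NodesDone",
    "DAG_NodesFailed",
    "DAG_NodesQueued",
    "DAG_NodesReady",
    "DAG_NodesUnready",
)


def parse_classad(text):
    """Parse 'Name = value' lines (condor_q -long style) into a dict of strings."""
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep:  # skip blank lines and anything that is not an assignment
            ad[name.strip()] = value.strip()
    return ad


def summarize(ad):
    """Build a one-line summary from whichever DAG_* attributes are present."""
    parts = ["%s=%s" % (name, ad[name]) for name in DAG_ATTRS if name in ad]
    return ", ".join(parts) if parts else "no DAG status attributes found"


def query_dag(job_id):
    """Run `condor_q -long <job_id>` and return the parsed ClassAd.

    job_id is the cluster id of the DAGMan job itself, not of a node job.
    """
    out = subprocess.check_output(["condor_q", "-long", str(job_id)])
    return parse_classad(out.decode())
```

Calling `print(summarize(query_dag("1234")))` (with your DAGMan job's cluster id in place of the hypothetical 1234) from an hourly cron job would get you most of the way there, since cron can mail a job's output to you. Note that once the DAG finishes and leaves the queue, condor_q will no longer return anything for it.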

Kent Wenger
CHTC Team