
Re: [Condor-users] Questions about DAGMAN jobs & How to setup notification to other users in DAGMAN jobs?



On Wed, 22 Jul 2009, Alas, Alex [FEDI] wrote:

I am here again requesting expert help. I've been away for a while, which
is a good thing because my Condor environment has been stable and
working fine, and I've been on top of all the needs so far. Thank
you all for the help and support you have given me all
this time.

I am starting to play with DAGMan jobs and so far I have been able to
successfully submit a few test jobs, but I don't know how to make a
DAGMan submit file issue a notification to other users (different
from the one who is submitting the job).

If you just want to notify a single user, you can do this:

  condor_submit_dag -append 'notify_user = <email>' <whatever>.dag

If you want to notify multiple users, I think you'd have to do something like add a final node to the DAG that would send the notification.
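
As a minimal sketch of the final-node idea, assuming a `mail` command is available where the node runs: the script below could be what the last node executes. The names notify.sh, notify.sub, Notify, A and B, and the example.com addresses, are all hypothetical placeholders. Note that a node declared as a child of every other node only runs if all of its parents succeed, so this sketch reports success only.

```shell
#!/bin/sh
# notify.sh: hypothetical script run by the last node of the DAG.
# In the DAG file the node would be made a child of every other node, e.g.:
#   JOB Notify notify.sub
#   PARENT A B CHILD Notify
# (A, B, Notify and notify.sub are illustrative names, not from the original post.)

# Space-separated recipient list; the addresses are placeholders.
RECIPIENTS="${RECIPIENTS:-user1@example.com user2@example.com}"
# Mail program to invoke; assumed to accept "-s subject addr..." like mail(1).
MAILER="${MAILER:-mail}"

# Build the subject line for a given DAG name.
build_subject() {
    printf 'DAG %s completed' "$1"
}

# Send one message to every recipient.
send_notification() {
    dag_name="$1"
    # $RECIPIENTS is intentionally unquoted so each address becomes its own argument.
    printf 'All nodes of %s finished successfully.\n' "$dag_name" |
        "$MAILER" -s "$(build_subject "$dag_name")" $RECIPIENTS
}

# Only send when a DAG name is given on the command line.
if [ "$#" -ge 1 ]; then
    send_notification "$1"
fi
```

The node would then invoke it as something like `notify.sh foo.dag`; how the node's submit file is written is left out here.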

One more thing: I have read in some DAGMan tutorials that you can
configure a Rescue part within DAGMan jobs, but after searching all
over the manual and those same tutorials I can't find the correct
way to do it. Maybe I am looking at the wrong documentation. If anyone
here can point me to a link or document where the rescue part is well
explained, please do so.

Well, you don't really "configure" a rescue DAG -- you get a rescue DAG
if your DAG fails because a node or nodes fail, or because the DAGMan
job itself is condor_rm'ed.  With a fairly recent version
of DAGMan (7.1.0 or later), an existing rescue DAG is run automatically
when you run the "base" DAG.  With older versions you need to
run the rescue DAG manually (e.g., 'condor_submit_dag foo.dag.rescue').
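
To check which rescue DAG file is present for a given base DAG, a small helper along these lines may be useful. It is only a sketch: latest_rescue is a hypothetical name, and it assumes that newer versions write numbered files such as foo.dag.rescue001 while older versions write a single foo.dag.rescue.

```shell
# latest_rescue BASE_DAG
# Print the newest rescue DAG file for BASE_DAG, or nothing if none exists.
# Assumes numbered files (BASE_DAG.rescue001, .rescue002, ...) for newer
# DAGMan versions and a single BASE_DAG.rescue file for older ones.
latest_rescue() {
    base="$1"
    # Zero-padded numbers sort correctly as text, so the last one is the newest.
    newest=$(ls "$base".rescue[0-9][0-9][0-9] 2>/dev/null | sort | tail -n 1)
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    elif [ -f "$base.rescue" ]; then
        printf '%s\n' "$base.rescue"
    fi
}
```

With an older version, the result could be passed to condor_submit_dag by hand, e.g. `condor_submit_dag "$(latest_rescue foo.dag)"`; with 7.1.0 or later, submitting foo.dag itself is enough.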

Kent Wenger
Condor Team