
Re: [HTCondor-users] Suggestion: transfer on error



On Fri, 6 Dec 2013, Brian Candler wrote:

> (2) When I submit a DAG full of jobs, in log files I cannot see any record of which host a particular job ran on.
>
> If I see it while it's running (condor_q -run -dag) then I can see the host. But this is not recorded in *.dagman.out as far as I can see.

You're right, this isn't in dagman.out.

> Is there any way to log this information? It would be really helpful, for example, if a job fails when it is matched with one particular host because an NFS mount is missing.

But it is in the user log file for the job. If you're running a fairly recent DAGMan (7.9 something or later), you'll get a file called *.dag.nodes.log (unless you've changed the default settings). That file will have events for all of your node jobs, including execute events, which contain the IP address of the execute host.

If you specify 'log = ...' in your submit files, you can also see the same events in those files.

If you're running an older DAGMan, you most likely won't get the .dag.nodes.log file, but you'll still have the individual logs specified in the submit files.
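
To pull the execute hosts out of one of these logs, something like the rough Python sketch below may help. It assumes that each execute event begins with a header line of the form "001 (cluster.proc.subproc) <timestamp> Job executing on host: <address>"; check a real log from your own pool first, since the details can differ between versions. The function name execute_hosts is just an illustrative choice, not part of HTCondor.

```python
import re
from collections import defaultdict

# Assumed header line of an execute event (event code 001), for example:
#   001 (123.000.000) 12/06 10:15:32 Job executing on host: <192.168.1.10:9618>
# The timestamp is matched loosely because its format can vary.
EXECUTE_RE = re.compile(
    r"^001 \((\d+)\.(\d+)\.\d+\) .*Job executing on host: <([^>]*)>"
)


def execute_hosts(lines):
    """Map 'cluster.proc' job IDs to the host addresses they started on.

    A job that was rescheduled can have more than one execute event,
    so each job ID maps to a list of addresses in log order.
    """
    hosts = defaultdict(list)
    for line in lines:
        match = EXECUTE_RE.match(line)
        if match:
            cluster, proc, address = match.groups()
            # Strip the zero padding: "123" and "000" become "123.0".
            job_id = f"{int(cluster)}.{int(proc)}"
            hosts[job_id].append(address)
    return dict(hosts)
```

You could call it on an open log file, e.g. execute_hosts(open("mydag.dag.nodes.log")), where the file name is only a placeholder, and then look up the job that failed to see which address it was matched with.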

Kent Wenger
CHTC Team