
Re: [Condor-users] Results of aborted condor job




Unfortunately, there is no way to tell Condor to remove the job and still bring back the output. If you run condor_vacate_job <jobid> instead, the job should stop running, its output should be brought back, and the job should return to the idle state, ready to run again. That's as close to what you are trying to do as anything I can think of.
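
As a rough sketch of the suggestion above: the log in the question shows the job as 040.000.000, which I am reading as cluster 40, proc 0, so the job ID would be 40.0. The helper name vacate_and_check is made up for illustration, and this assumes the Condor command-line tools are on your PATH.

```shell
# Hypothetical helper (the name is not part of Condor): vacate a running job,
# then show its queue entry so you can see it has gone back to idle.
vacate_and_check() {
    job_id="$1"   # cluster.proc, e.g. 40.0

    # Ask Condor to stop the job where it is running and return it to the queue.
    # With when_to_transfer_output = ON_EXIT_OR_EVICT, the output should be
    # transferred back at this point.
    condor_vacate_job "$job_id" || return $?

    # List the job; it should be idle again rather than removed.
    condor_q "$job_id"
}
```

You would call it as: vacate_and_check 40.0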

--Dan

Robin Harrington wrote:

Hi,

I'm new to condor so I'm sorry if the answer is obvious. I'm trying some
experiments with long jobs and want to put a limit on the time they use.

But when the job aborts, an email message is sent, and none of the
generated files are transferred back.

Here is my Condor submit file:
#
should_transfer_files = YES
when_to_transfer_output = ON_EXIT_OR_EVICT
Executable = frames.exe
Universe = vanilla
output = frames.out
error = frames.err
log = frames.log
notify_user = rha50@xxxxxxxxxxxxxxxx
notification = Always
periodic_remove = (RemoteWallClockTime - CumulativeSuspensionTime) > 1000
transfer_input_files = tiles.txt
queue

frames.log file:

000 (040.000.000) 01/28 11:01:48 Job submitted from host: <132.181.5.14:1072>
...
001 (040.000.000) 01/28 11:01:53 Job executing on host: <132.181.5.14:1074>
...
009 (040.000.000) 01/28 11:18:54 Job was aborted by the user.
	The job attribute PeriodicRemove expression '(RemoteWallClockTime - CumulativeSuspensionTime) > 1000' evaluated to TRUE

frames.err and frames.out are both empty.

What do I need to do to retrieve the output file contents from
processing done before the abort?

Thanks
Robin
