
Re: [Condor-users] Completed jobs never removed from queue?



On Tue, Aug 12, 2008 at 03:02:52PM -0500, Matthijs van der Meer wrote:
> Another question -- when my server submits jobs, they run just fine. 
> When any other machine in the pool submits a job (using condor_submit 
> -r) everything appears to be well (shadow, starter, executable run as 
> normal) but then the job gets stuck in the 'Completed' state. The 
> submitting machine does not see anything about this in the job log, and 
> does not get the results back. Any ideas what might cause this?

this is expected behavior.  when a job is submitted remotely, all of its input
files are spooled to the remote schedd.  after the job runs, the schedd keeps
it in the completed state.  the user then fetches the output files (including
the updated log file with the "Completed" event) using condor_transfer_data.

the reason for all this is that there is not necessarily any agent running on
the submit machine when you do a remote submit.  so you must poll the schedd
to see if your job has completed, fetch the data, and (i think) manually remove
the job from the queue.
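
the poll / fetch / remove cycle could be scripted roughly like this.  this is
only a sketch, not a tested recipe: the schedd name "schedd@host" and job id
"12.0" are made-up placeholders, the helper names are hypothetical, and it
assumes the usual -name option on condor_q, condor_transfer_data and condor_rm
plus JobStatus == 4 meaning "Completed".  the final condor_rm step reflects the
"(i think)" above, so drop it if your schedd cleans up by itself.

```python
import subprocess
import time

COMPLETED = 4  # assumed JobStatus value for a job in the Completed state


def status_cmd(schedd, job_id):
    # Ask the remote schedd for the numeric JobStatus of one job.
    return ["condor_q", "-name", schedd, job_id, "-format", "%d\n", "JobStatus"]


def fetch_cmd(schedd, job_id):
    # Pull the spooled output files (and updated job log) back from the schedd.
    return ["condor_transfer_data", "-name", schedd, job_id]


def remove_cmd(schedd, job_id):
    # Remove the completed job from the remote queue.
    return ["condor_rm", "-name", schedd, job_id]


def run(cmd):
    # Run a command and return its stdout; raises if the command fails.
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def wait_and_fetch(schedd, job_id, runner=run, interval=60, max_polls=1000,
                   sleep=time.sleep):
    """Poll until the job is Completed, then fetch its data and remove it.

    Returns True if the job completed and was fetched, False if max_polls
    was reached first.
    """
    for _ in range(max_polls):
        out = runner(status_cmd(schedd, job_id)).strip()
        if out and int(out.splitlines()[0]) == COMPLETED:
            runner(fetch_cmd(schedd, job_id))
            runner(remove_cmd(schedd, job_id))
            return True
        sleep(interval)
    return False


if __name__ == "__main__":
    # Placeholder schedd name and job id; substitute your own.
    wait_and_fetch("schedd@host", "12.0")
```

the runner and sleep arguments exist only so the loop can be tried out with a
fake command runner before pointing it at a real pool.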


cheers,
-zach