
Re: [Condor-users] Jobs that won't condor_rm



On Wed, Dec 28, 2005 at 02:38:51PM -0600, Steven Timm wrote:
> 
> I have some jobs that users submitted to globus universe
> in the queue of my schedd right now and I cannot, as root, remove
> them with condor_rm.
> When I execute the condor_rm command, they show "X" for a few
> seconds and then revert to status "H"
> 
<...>
> 
> I have made sure that there are no condor processes running
> on the "remote server" which in this case is the same as the submit 
> machine.  It appears that on the condor_rm of the globus universe job
> it tries to contact the remote server to kill the job, but can't do
> so because the proxy has obviously expired long ago.
> 
> Any idea how to get rid of such a job?  Output of condor_q -long
> is below for one of them.
> 

condor_rm -force

It is meant as a last resort and should not be the common case, but it
will immediately remove a job that is stuck in the 'X' state. In the
case of Globus universe jobs, it will not contact the Globus resource to
try to remove the remote job, so you will leak jobs on the Globus side.
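
For example, against one of the stuck jobs (the 123.0 below is a
made-up ID; substitute the cluster.proc you see in condor_q):

    condor_rm -force 123.0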

It'd be best if you could get a current proxy, but if you can't, -force
will clean up your queue.
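
If refreshing the credential is an option, the usual Globus commands are
one way to do it (a sketch, assuming grid-proxy-init/grid-proxy-info are
on your path and the fresh proxy lands where the job's x509userproxy
attribute points):

    grid-proxy-info -timeleft    # check whether the proxy really has expired
    grid-proxy-init              # create a fresh proxy
    condor_rm 123.0              # a plain condor_rm can then reach the resource

With a valid proxy, a plain condor_rm should in principle be able to
contact the remote server and clean up the Globus side as well.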

-Erik