[HTCondor-users] jobs stuck; cannot get rid of them.



Hi all,

There is a discrepancy between what condor_q thinks is running and what condor_status thinks is running. I run this set of commands to see the difference:

# condor_status -af JobId -af Cpus | grep -v undef | sort | sed -e "s/\.0//" > s
# condor_q -af ClusterId -af RequestCpus -constraint "JobStatus=?=2" | sort > q
# diff s q
1,4d0
< 1079641 8
< 1080031 8
< 1080045 8
< 1080321 1

See: condor_status lists 4 jobs that don't actually exist in condor_q!

They've been there for days, ever since I had some Linux problems that needed a reboot (not really related to HTCondor).

So I'm losing 25 slots because of this. How can I purge this stale information from the HTCondor system, good and proper?
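
The 25 is the sum of the Cpus column of the four stale entries (8 + 8 + 8 + 1). For anyone wanting to repeat the check, here is a minimal sketch that does the comparison and the sum in one step. It assumes the two files hold "id cpus" pairs, one per line, as produced by the commands above; the function name stale_claims is only illustrative and is not part of HTCondor.

```shell
# Print every "id cpus" pair that appears in the condor_status file ($1)
# but not in the condor_q file ($2), then the total CPUs those pairs hold.
# Hypothetical helper; it only reads the two files and changes nothing.
stale_claims() {
    awk '
        # First file argument (condor_q side): remember each "id cpus" pair.
        FILENAME == ARGV[1] { seen[$1 " " $2] = 1; next }
        # Second file argument (condor_status side): report pairs not seen above.
        !(($1 " " $2) in seen) { print $1, $2; cpus += $2 }
        END { print "total cpus held:", cpus + 0 }
    ' "$2" "$1"
}

# Usage, with the files written by the commands above:
#   stale_claims s q
```

Unlike diff, this does not depend on both files being sorted the same way, and it only reports entries that are on the condor_status side.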

Cheers,

Ste


--
Steve Jones                             sjones@xxxxxxxxxxxxxxxx
Grid System Administrator               office: 220
High Energy Physics Division            tel (int): 43396
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 3396
University of Liverpool                 http://www.liv.ac.uk/physics/hep/