
[HTCondor-users] Stuck dagman jobs after restart



After rebooting a "personal condor" machine, I see a couple of dagman processes stuck idle in the queue (see below), and they never seem to start. Is there anything that can be done to get them moving again automatically, or should I just make a mental note to rm and resubmit jobs like this after a reboot?

This is HTCondor 8.2.4 on Ubuntu 14.04.

Thanks,

Brian Candler.

$ condor_q


-- Submitter: ardb.int.example.net : <192.168.5.192:42373> : ardb.int.example.net
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
4580.0   brian          12/12 14:49   0+00:00:38 I  0   0.3  condor_dagman -f -
4581.0   brian          12/12 14:49   0+00:00:33 I  0   0.3  condor_dagman -f -

2 jobs; 0 completed, 0 removed, 2 idle, 0 running, 0 held, 0 suspended


$ condor_q -analyze 4580


-- Submitter: ardb.int.example.net : <192.168.5.192:42373> : ardb.int.example.net
---
4580.000:  Request has not yet been considered by the matchmaker.

User priority for brian@xxxxxxxxxxxxxxxxxxxx is not available, attempting to analyze without it.
---
4580.000:  Run analysis summary.  Of 1 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
      1 are available to run your job

WARNING: Analysis is meaningless for Scheduler universe jobs.
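
If removing and resubmitting turns out to be the answer, the first step could be scripted. The following is only a rough sketch, not a tested procedure: it assumes the default condor_q layout shown above (one row per job, with the ID, OWNER, SUBMITTED, RUN_TIME, ST, PRI, SIZE and CMD columns), and the function name idle_dagman_ids is made up for illustration. It picks out the IDs of jobs whose state is I and whose command starts with condor_dagman.

```python
import re

# Matches one job row of default condor_q output, for example:
# 4580.0   brian   12/12 14:49   0+00:00:38 I  0   0.3  condor_dagman -f -
# Assumption: the column order is exactly the one shown in the listing above.
ROW = re.compile(
    r"^\s*(?P<id>\d+\.\d+)\s+(?P<owner>\S+)\s+\d+/\d+\s+\d+:\d+\s+"
    r"\S+\s+(?P<st>\S)\s+\S+\s+\S+\s+(?P<cmd>.*)$"
)


def idle_dagman_ids(condor_q_text):
    """Return the IDs of idle condor_dagman jobs found in condor_q output.

    Hypothetical helper: header, summary and blank lines are skipped
    because they do not start with a cluster.proc job ID.
    """
    ids = []
    for line in condor_q_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        if m.group("st") == "I" and m.group("cmd").startswith("condor_dagman"):
            ids.append(m.group("id"))
    return ids
```

The returned IDs could then be passed to condor_rm by hand or from a wrapper script before resubmitting the DAGs. Whether that is the right recovery step is exactly the open question here.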