On Mon, 15 Dec 2014, Brian Bockelman wrote:
Hi Brian,
> It might be worth it to look at the UserLog of these jobs - it's
> possible they are switching quickly between R and I?
Hmm, you could look, but I'd be really surprised if that were happening.
Could you send us your SchedLog? I think that's the most likely log
to give us some useful information.
We actually have a test for DAGs getting correctly restarted across a
Condor restart, so I'm a little surprised this is happening.
Something else I just thought of -- you might want to try doing
condor_hold and then condor_release on one of the DAGs, to see if that
gets it to run (just a wild guess).
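
If you do want to rule out the quick R/I switching idea from the UserLog, here is a rough Python sketch that counts execute and evict events per job. It assumes the classic text user log layout, where each event starts with a header line such as "001 (1234.000.000) 12/15 10:22:33 Job executing on host: ...", and that event code 001 means "executing" and 004 means "evicted". The function name count_run_idle_flips is made up for illustration and is not part of HTCondor, so adjust the pattern if your log looks different.

```python
import re
from collections import Counter

# Assumed event header layout (classic text user log):
#   "001 (1234.000.000) 12/15 10:22:33 Job executing on host: <...>"
# Only the event code and the cluster.proc part are parsed here.
EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.\d+\)")

EXECUTE = "001"  # assumed code for "job executing"
EVICTED = "004"  # assumed code for "job was evicted"


def count_run_idle_flips(log_text):
    """Return {"cluster.proc": (executes, evictions)} for each job seen."""
    executes = Counter()
    evictions = Counter()
    for line in log_text.splitlines():
        match = EVENT_HEADER.match(line)
        if not match:
            continue  # event body lines and "..." separators are skipped
        code, cluster, proc = match.groups()
        job = f"{int(cluster)}.{int(proc)}"
        if code == EXECUTE:
            executes[job] += 1
        elif code == EVICTED:
            evictions[job] += 1
    jobs = set(executes) | set(evictions)
    return {job: (executes[job], evictions[job]) for job in jobs}
```

A job that shows many executes paired with many evictions over a short period would be consistent with flipping between R and I. A job with no execute events at all points back at the schedd, which is why the SchedLog is still the more useful thing to send.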