
Re: [HTCondor-users] How to find out what jobs are held?



Hi,

Try something like:

condor_q -better-analyze -global <JOBID>

and look for the hold reason in the output.
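If you want the reason for all held jobs at once, a minimal sketch along these lines may also help. It assumes your condor_q supports the -hold and -autoformat (-af) options and that the jobs carry the usual HoldReason and HoldReasonCode attributes; the function name held_job_reasons is made up for illustration.

```shell
#!/bin/sh
# Sketch: list every held job with its full hold reason.
# Assumption: condor_q is on PATH and understands -hold and -af.

held_job_reasons() {
    # -hold limits the query to held jobs; -af prints the named
    # ClassAd attributes, one job per line, without the column
    # truncation of the default -hold table.
    condor_q -hold -af ClusterId ProcId HoldReasonCode HoldReason
}

# Only run the query on a machine where HTCondor is installed.
if command -v condor_q >/dev/null 2>&1; then
    held_job_reasons
fi
```

For a single job, condor_q -long 261.0 prints the whole job ad, including HoldReason.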

cheers
chris



--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


From: "Justin Fisher" <justin0419@xxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Wednesday, 14 June 2017 14:22:01
Subject: [HTCondor-users] How to find out what jobs are held?

I recently ran a batch of jobs, just shy of 4000 in total. When it was done I got this:
condor_q
-- Schedd: jfisher.ingenazure.com : <192.168.1.206:9618?... @ 06/14/17 14:15:14
OWNER    BATCH_NAME    SUBMITTED    DONE  RUN  IDLE  HOLD  TOTAL  JOB_IDS
jfisher  CMD: ngspice  6/7  22:30   1787  _    _     9     1800   261.0 ... 262.4

9 jobs; 0 completed, 0 removed, 0 idle, 0 running, 9 held, 0 suspended

Running condor_release restarts the jobs, but then something crashes and they go back to being held.

then:

condor_q -hold
-- Schedd: jfisher.myserver : <192.168.1.206:9618?... @ 06/14/17 14:05:55
 ID      OWNER          HELD_SINCE  HOLD_REASON
 261.0   jfisher         6/14 14:03          Error from slot1_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 261.1   jfisher         6/14 14:03          Error from slot2_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 261.2   jfisher         6/14 14:03          Error from slot3_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 261.3   jfisher         6/14 14:03          Error from slot4_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 262.0   jfisher         6/14 14:03          Error from slot5_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 262.1   jfisher         6/14 14:03          Error from slot6_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 262.2   jfisher         6/14 14:03          Error from slot1_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 262.3   jfisher         6/14 14:03          Error from slot2_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi
 262.4   jfisher         6/14 14:03          Error from slot3_1@xxxxxxxxxxxxx: SHADOW at 192.168.1.206 failed to send fi

Alas, the truncation is right where I suspect the information I need is going to be.

Any ideas as to how to find out what those jobs are?


--
Kind regards,
Justin Fisher.
