Re: [HTCondor-users] Jobs in idle state on a submitter blocks other jobs



Hi Xavier,

Without additional information, it's hard to say what was
happening.  One execute node being down shouldn't cause jobs
to idle in the queue -- they would just match to one of the
other execute nodes (if they fit the job's requirements).

Can you post the job log from one of the stuck jobs somewhere?
Perhaps that will give us more information.

Thanks,
-Mat

On 4/7/21 5:00 AM, Xavier OUVRARD wrote:
Dear all,

Since yesterday I had 6 jobs that were idle on a scheduler; one
computation node was faulty and I kept seeing "attempt to connect to ..."
in the SchedLog. It seems that this was blocking all the remaining
jobs, which stayed idle in condor_q. Rebooting the faulty node (not the
scheduler) allowed all the remaining idle jobs to run
again without any additional intervention.

Is this normal behaviour?

The condor version is 8.8.13 on all machines.

Best regards,

Xavier