
Re: [Condor-users] Jobs fetched with a hook being killed after 20 minutes



>> So the problem I'm seeing is once the job is evicted, new ones aren't
>> getting run.
>
> Ahh, well I didn't have another job queued to be fetched. I'll test that
> though to make sure that another job is being fetched. My "sleep 1h" job
> has passed the 20 minute mark though.
>
>> Basically, job "sleep 30" with a JobLeaseDuration=5 will get the job
>> evicted after 5 seconds. I'd expect the next call to the fetch hook to
>> retrieve another "sleep 30" job to be run. Seems the slot is stuck
>> "Preempting/Killing".
>
> Don't read too much into the Preempting stuff in that output below. After
> the job got picked up I did: condor_off -peaceful so the startd would stop
> as soon as the job finished and the logs wouldn't be full of fetch work
> hook output. They fill up fast otherwise.
>
> So if the fetch work hook script runs after my current 1 hour sleep job
> completes 40 minutes from now I'm out of the woods. I'll let you know how
> it goes...

So with:

JobLeaseDuration = 62899200

set properly in the ClassAd output from my fetch work script, my hour-long
sleep job ran to completion.

And immediately upon completion the fetch work script was fired again
and another job was picked up.

So no problem here, other than that I need to pay closer attention to the
line endings in the output of my fetch work scripts.
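
For anyone who hits the same thing, here is a rough sketch of a fetch work
script that is careful about line endings. It is not my actual hook. The
format_classad helper and the job values are made up for illustration, and
it assumes the problem is stray carriage returns or a missing trailing
newline, so that every attribute needs to sit on its own line ending in a
single "\n". Check the attribute set against what your startd expects.

```python
import sys


def format_classad(attrs):
    # Render each attribute as "Name = value" on its own line, always
    # terminated by a single newline character and never by CR LF.
    lines = []
    for name, value in attrs.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, (int, float)):
            rendered = str(value)
        else:
            # Drop stray carriage returns and embedded newlines, then quote.
            # Values containing double quotes are not handled in this sketch.
            text = str(value).replace("\r", "").replace("\n", " ").strip()
            rendered = '"' + text + '"'
        lines.append(name + " = " + rendered + "\n")
    return "".join(lines)


# Hypothetical job description; only JobLeaseDuration comes from this thread.
job = {
    "Cmd": "/bin/sleep",
    "Arguments": "3600",
    "Owner": "nobody",
    "JobUniverse": 5,
    "JobLeaseDuration": 62899200,
}

if __name__ == "__main__":
    # Write bytes so the platform cannot translate "\n" into "\r\n".
    sys.stdout.buffer.write(format_classad(job).encode("ascii"))
```

Writing through sys.stdout.buffer rather than print() is deliberate: it
keeps the newline bytes exactly as generated regardless of platform.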

Thanks again for the help.

- Ian
