I have containers that suffer from out-of-memory errors, and I don't want to configure swap on my executors. Currently, to handle this situation, I submit my jobs with a very large RequestMemory value.
Is there a mechanism in HTCondor that allows me to "catch" jobs that enter the HOLD state, edit their RequestMemory (to a larger value), and resubmit them automatically?
Multiple iterations of this process would be great, but even changing the amount once is enough for me.
I tried the hooks mechanism but couldn't find a way to have a hook invoked when my job enters the HOLD state.
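To make the request concrete, here is a rough sketch of the retry step I would like to have automated. It is only an illustration of the logic, not a working solution: the function names, the doubling factor, and the 65536 MB cap are my own assumptions, and the code only builds the `condor_qedit` and `condor_release` command lines without running them or detecting held jobs.

```python
from typing import List, Optional


def next_request_memory(current_mb: int, factor: float = 2.0,
                        cap_mb: int = 65536) -> Optional[int]:
    """Return an enlarged RequestMemory in MB, or None once the cap is reached.

    The factor and cap are illustrative assumptions, not HTCondor defaults.
    """
    if current_mb >= cap_mb:
        return None  # give up: the job already asked for the maximum
    return min(int(current_mb * factor), cap_mb)


def retry_commands(job_id: str, current_mb: int) -> List[List[str]]:
    """Build the commands that would bump RequestMemory and release a held job.

    Returns an empty list when the memory cap has been reached.
    """
    new_mb = next_request_memory(current_mb)
    if new_mb is None:
        return []
    return [
        # Edit the job ClassAd attribute in the queue.
        ["condor_qedit", job_id, "RequestMemory", str(new_mb)],
        # Release the job from the HOLD state so it can be matched again.
        ["condor_release", job_id],
    ]
```

For example, `retry_commands("123.0", 2048)` yields a `condor_qedit` call setting RequestMemory to 4096 followed by a `condor_release` call. What I am missing is the trigger: something that runs this kind of step whenever a job goes on hold because of memory.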
HTCondor-users mailing list