
[HTCondor-users] negotiator memory usage keeps climbing after enabling preemption



Hello.

Memory usage of our condor_negotiator process started a slow, continuous climb after I enabled preemption. It takes about 48 hours to go from 0 to 25 GB at a fairly constant rate, at which point our central manager runs out of memory. Before I enabled preemption, condor_negotiator used at most 1 GB of memory.

Is that normal? Our pool has about 6000 cores and about 20k jobs in the queue. Upgrading the central manager from 8.3.8 to 8.5.1 didn't help (all other machines in our pool run 8.3.8). I didn't see anything obviously wrong in the logs.
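
For anyone who wants to compare numbers: 25 GB over 48 hours works out to roughly 0.5 GB per hour. A minimal sketch of how that rate could be logged on Linux follows. It assumes the negotiator's PID is already known and that /proc/<pid>/status is readable; the function names are illustrative only, and this is not the method used to obtain the figures above.

```python
import re
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def growth_gb_per_hour(samples):
    """Average growth between the first and last (unix_seconds, rss_kb) sample."""
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    gb = (r1 - r0) / (1024 ** 2)      # kB -> GB
    hours = (t1 - t0) / 3600.0
    return gb / hours


def sample_rss(pid, count, interval_s):
    """Read VmRSS for `pid` `count` times, sleeping `interval_s` between reads."""
    samples = []
    for i in range(count):
        with open(f"/proc/{pid}/status") as f:
            samples.append((time.time(), parse_vmrss_kb(f.read())))
        if i < count - 1:
            time.sleep(interval_s)
    return samples
```

Calling growth_gb_per_hour(sample_rss(pid, 12, 300)) would give an hourly rate from one hour of 5-minute samples; the PID has to be looked up separately (for example with pgrep).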

This behavior started when I replaced
NEGOTIATOR_CONSIDER_PREEMPTION = False
with
NEGOTIATOR_CONSIDER_PREEMPTION = True
ALLOW_PSLOT_PREEMPTION = True
PREEMPTION_REQUIREMENTS = False
(we only do rank-based preemption)

Does anybody know what might be going on?


Thanks,

Vlad