[HTCondor-users] All jobs running at once

Dear Condor users,

I am attempting to get condor configured on my server (1 machine, 6 CPUs). I am running Ubuntu 14.04.3 with condor version 8.4.2 (Debian package) as distributed via NeuroDebian.

I'm able to run jobs, but condor ignores every limit I set. For instance, I set RESERVED_MEMORY = 4000 and MAX_JOBS_RUNNING = 1 in condor_config.local, and in my submit files request_cpus = 1 and request_memory = 4GB. I then submitted about 200 jobs, and every single one of them started at once, quickly exhausting all available memory.
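For concreteness, here is a minimal version of what I have (paraphrased from memory; note the submit-file commands are the standard request_cpus / request_memory ones, and the 4 GB request is per job):

```
# /etc/condor/condor_config.local (relevant lines)
RESERVED_MEMORY  = 4000
MAX_JOBS_RUNNING = 1

# job.submit (relevant lines)
request_cpus   = 1
request_memory = 4GB
```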

I checked condor_status during a recent run and noticed that all the slots were shown as Unclaimed / Idle, even while jobs were running. Looking at the log files, I see "Number of Active Workers 0" in the collector log; the collector also receives 12 ads on each negotiation cycle (6 cores with hyperthreading). This leads me to believe there is a problem with the collector.
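In case it's useful, this is roughly how I've been inspecting the pool (the log path assumes the default Debian layout under /var/log/condor):

```
condor_status                          # all slots show Unclaimed / Idle
condor_q                               # yet ~200 jobs appear to be running
condor_config_val MAX_JOBS_RUNNING    # check the daemons actually see my setting
grep "Number of Active Workers" /var/log/condor/CollectorLog
```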

I tried condor_restart to no effect, but I don't know where to go from here. Can you please help?


Jordan Poppenk, Ph.D.
Canada Research Chair in Cognitive Neuroimaging
Department of Psychology and Centre for Neuroscience Studies
Queen's University