
Re: [HTCondor-users] All jobs running at once

Dear Condor users, 

This issue turned out to be self-inflicted. Because I am running on a single machine, I was submitting jobs to the "local" universe, as that seemed like the intuitively correct choice. Jobs submitted to the "local" universe run immediately on the submit machine, bypassing matchmaking and the limits that come with it; when I submit the same jobs to the "vanilla" universe instead, queuing behaves perfectly.
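For anyone hitting the same behaviour, the fix is a one-line change in the submit description file. A minimal sketch follows; the executable name and the queue count are placeholders, not from my actual setup:

```
# Minimal HTCondor submit description file (hypothetical executable)
universe       = vanilla        # jobs go through matchmaking, so pool limits apply
executable     = my_analysis.sh # placeholder script
request_cpus   = 1
request_memory = 4GB
queue 200
```

With `universe = local`, jobs are started directly under the schedd on the submit host and skip this matchmaking step, which is why all 200 launched at once.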

Special thanks to Francisco Pereira for helping me work through this off-list.


Jordan Poppenk, Ph.D.
Canada Research Chair in Cognitive Neuroimaging
Department of Psychology and Centre for Neuroscience Studies
Queen's University

On 2016-01-31, 4:49 AM, "Jordan Poppenk" <jpoppenk@xxxxxxxxxx> wrote:

>Dear Condor users,
>I am attempting to get condor configured on my server (n machines = 1, n_cpus = 6). I am running Ubuntu 14.0.3 with condor version Debian-8.4.2 as distributed via NeuroDebian.
>I'm able to run jobs, but Condor ignores all of the limits I set. For instance, I set RESERVED_MEMORY=4000 and MAX_JOBS_RUNNING=1 in condor_config.local, and in my submit files, reserve_cpu=1 and reserve_memory=4Gb. I then submitted about 200 jobs, and every single one of them started, quickly depleting all available memory.
>I checked condor_status during a recent run and noticed that all of the slots were shown as unclaimed/idle. In the log files, I see "Number of Active Workers 0" in the collector log, which receives 12 ads on each negotiation cycle (6 cores with hyperthreading). This leads me to believe there is a problem with the collector.
>I tried condor_restart to no effect, but I don't know where to go from here. Can you please help?
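For completeness, the pool-wide limits from the question belong in condor_config.local, while per-job resources are requested with the request_* submit commands; if I recall correctly, unrecognized submit commands such as reserve_cpu are treated as plain macro definitions rather than rejected, so they are silently ignored. A sketch of the local configuration, using the values from this thread:

```
# condor_config.local -- sketch using the values discussed above
RESERVED_MEMORY         = 4000    # MB withheld from the memory the startd advertises
MAX_JOBS_RUNNING        = 1       # cap on concurrently running (vanilla) jobs
COUNT_HYPERTHREAD_CPUS  = False   # advertise 6 physical cores instead of 12 hyperthreads
```

The COUNT_HYPERTHREAD_CPUS line is optional and only addresses the "12 ads" observation; a condor_reconfig (or restart) is needed for the changes to take effect.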