
Re: [Condor-users] condor_status and condor_q disagree about state of vm's


> I've spent the last couple of days looking for an answer to this
> issue and searched the archives, but came up empty-handed.  If this
> has been addressed before, please excuse the rehash.
> I've got a small pool of two SMP machines, both with dual dual-core
> Opteron processors.  In the default configuration that's 8 vm's.  I
> would expect that this would mean that I should never be able to
> have more than 8 jobs running in this pool at any given time, but
> I have been able to do just that.
> For (as yet) undetermined reasons, the schedd will not recognize
> that a startd is running a job on some vms.  See below the (trimmed)
> results of a condor_status:
> Name          OpSys       Arch   State      Activity
> vm1@server-1  LINUX       X86_64 Unclaimed  Idle
> vm2@server-1  LINUX       X86_64 Unclaimed  Idle
> vm3@server-1  LINUX       X86_64 Claimed    Busy
> vm4@server-1  LINUX       X86_64 Unclaimed  Idle
> vm1@server-2  LINUX       X86_64 Unclaimed  Idle
> vm2@server-2  LINUX       X86_64 Unclaimed  Idle
> vm3@server-2  LINUX       X86_64 Claimed    Busy
> vm4@server-2  LINUX       X86_64 Claimed    Busy
> Now look at the (trimmed) results of a condor_q -running:
> ID      HOST(S)
> 68.0   vm4@server-1
> 69.0   vm4@server-2
> 70.0   vm3@server-1
> 71.0   vm3@server-2
> Notice that vm4 on server-1 is running a job, but shows up as
> Unclaimed/Idle.  Does anyone have an explanation of why this might
> happen or what I can do to further debug the issue?
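
For further debugging, the comparison done by eye above can be automated.
The following is only a rough sketch, not a Condor tool: it assumes the
trimmed column layouts quoted above (Name/OpSys/Arch/State/Activity for
condor_status, ID/HOST(S) for condor_q -running), and the function names
are made up for illustration.  It lists every vm that condor_q reports
as running a job while condor_status does not show it as Claimed.

```python
# Sample text copied from the trimmed output quoted in this thread.
STATUS = """\
Name          OpSys       Arch   State      Activity
vm1@server-1  LINUX       X86_64 Unclaimed  Idle
vm2@server-1  LINUX       X86_64 Unclaimed  Idle
vm3@server-1  LINUX       X86_64 Claimed    Busy
vm4@server-1  LINUX       X86_64 Unclaimed  Idle
vm1@server-2  LINUX       X86_64 Unclaimed  Idle
vm2@server-2  LINUX       X86_64 Unclaimed  Idle
vm3@server-2  LINUX       X86_64 Claimed    Busy
vm4@server-2  LINUX       X86_64 Claimed    Busy
"""

RUNNING = """\
ID      HOST(S)
68.0   vm4@server-1
69.0   vm4@server-2
70.0   vm3@server-1
71.0   vm3@server-2
"""


def parse_status(text):
    """Map vm name -> State, assuming the five-column layout above."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have a name like vm1@server-1; the header row does not.
        if len(fields) >= 5 and "@" in fields[0]:
            states[fields[0]] = fields[3]
    return states


def parse_running(text):
    """Map vm name -> job ID, assuming the two-column layout above."""
    jobs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "@" in fields[1]:
            jobs[fields[1]] = fields[0]
    return jobs


def mismatched_vms(status_text, running_text):
    """Return (vm, job ID, reported state) for jobs on non-Claimed vms."""
    states = parse_status(status_text)
    jobs = parse_running(running_text)
    return sorted(
        (vm, job, states.get(vm, "Missing"))
        for vm, job in jobs.items()
        if states.get(vm) != "Claimed"
    )


for vm, job, state in mismatched_vms(STATUS, RUNNING):
    print(f"job {job} runs on {vm}, but condor_status reports {state}")
```

With the output quoted above, only vm4@server-1 (job 68.0) is flagged.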

I have seen this type of behavior before.  Check to be sure that there
is only one condor_startd process running on server-1.  I have seen
cases where there are two condor_masters, each with a condor_startd,
and what you see in condor_status is the status of the condor_startd
that has most recently sent an update to your condor_collector.
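
One way to make that check is to look at the process list on server-1.
Below is a minimal sketch, assuming a Linux host where "ps -eo pid,comm"
is available; the helper names are hypothetical and not part of Condor.
It groups the PIDs of condor_master and condor_startd and reports any
daemon that appears more than once.

```python
import subprocess
from collections import defaultdict

# Daemon names to look for in the process list.
DAEMONS = ("condor_master", "condor_startd")


def count_daemons(ps_text):
    """Group PIDs by daemon name from `ps -eo pid,comm` style output."""
    pids = defaultdict(list)
    for line in ps_text.splitlines():
        fields = line.split(None, 1)
        # Skip the header row and anything that is not "<pid> <command>".
        if len(fields) != 2 or not fields[0].isdigit():
            continue
        name = fields[1].strip()
        if name in DAEMONS:
            pids[name].append(int(fields[0]))
    return dict(pids)


def duplicates(pids_by_name):
    """Keep only the daemons that have more than one running process."""
    return {name: p for name, p in pids_by_name.items() if len(p) > 1}


def check_local_host():
    """Run ps on this machine and return any duplicated Condor daemons."""
    result = subprocess.run(
        ["ps", "-eo", "pid,comm"],
        capture_output=True, text=True, check=True,
    )
    return duplicates(count_daemons(result.stdout))
```

Calling check_local_host() on server-1 returns an empty dict when there
is a single condor_master and a single condor_startd; any entry in the
result lists the PIDs of the extra processes to investigate.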

Daniel K. Forrest	Laboratory for Molecular and
forrest@xxxxxxxxxxxxx	Computational Genomics
(608) 262 - 9479	University of Wisconsin, Madison