
Re: [Condor-users] flocking problem



On Thu, 2005-01-06 at 15:58 -0600, David A. Kotz wrote:
> On Thu, 2005-01-06 at 15:46 -0600, David A. Kotz wrote:
> > In my department, I have 3 pools of Condor machines -- 2 dedicated
> > computing clusters and the desktop machines.  All 3 pools are configured
> > for 2-way flocking with one another.
> > 
> > The machines in the Mastodon cluster advertise the attribute
> > "InMastodon", and the machines in the Scout cluster advertise the
> > attribute "InScout".  They have lines such as "InMastodon = TRUE" in the
> > local config files.
> > 
> > Prior to splitting the machines into 3 separate pools (for the purpose
> > of keeping separate logs for reporting to funding agencies), users could
> > specify in their submit descriptions that they wanted jobs to run on one
> > of the cluster nodes by including something like this, "Requirements =
> > InScout".
> > 
> > Now if a user submits a job from one pool, requesting an execute machine
> > in a different pool, the job sticks in the queue because Condor can't
> > find a match.
> > 
> > Is this a configuration issue, or a fundamental limitation?  I'm
> > currently running Condor 6.6.6 on Linux.
> > 
> 
> Mere seconds after I submitted my question, the jobs inexplicably
> decided to flock.  Prior to sending, I had watched at least three
> consecutive negotiator cycles say that there were no matches for my test
> jobs.  Prior to that, I left a job cluster overnight trying to flock in
> the opposite direction, to no avail.
> 
> I'm guessing that I have a configuration issue or that Condor hates me,
> or both.


Following up on my own post again, I left 110 "hello, world" programs in
the queue over the weekend, and they were still waiting for me today.
The requirement keeping them from running, according to condor_analyze,
is "InMastodon".
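
For reference, here is a minimal sketch of the setup described above, as I
understand it. The attribute name and the "Requirements" expression come from
the description earlier in this thread. The STARTD_EXPRS line, the executable
name "hello", and the vanilla universe are assumptions added for illustration,
not copied from the real files.

```
# --- Local config file on each Mastodon node ---
# Custom attribute, as described above.
InMastodon = TRUE
# Assumption: the attribute is also listed in STARTD_EXPRS so that the
# startd publishes it in the machine ClassAd.
STARTD_EXPRS = $(STARTD_EXPRS), InMastodon

# --- Submit description file for the test jobs ---
# "hello" is a hypothetical name for the "hello, world" program.
universe     = vanilla
executable   = hello
requirements = InMastodon
queue 110
```

If a machine's ClassAd does not define InMastodon, the requirement does not
evaluate to TRUE for that machine, so it cannot match. That would apply to
every machine outside the Mastodon cluster.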


-- 
David A. Kotz <dkotz@xxxxxxxxxxxxx>