
RE: [Condor-users] Condor View and flocking



> I have two internal Condor pools configured for two-way flocking, as
> well as two pools external to our organization also configured for
> two-way flocking with us.  I'm running Condor View on both of my
> pools.

> The users submit jobs from desktops in one of the pools, with jobs
> running on both.  Condor View indicates that jobs are running on the
> pool from which the jobs are submitted, and that there are only idle
> jobs on the second pool.

Just to be sure, are you getting this from the "Machine" statistics or
the "Job" statistics?  I think you're only looking at the 'job' side,
right?  (One could infer job behavior based on machine usage...)

> Is this the expected behavior?  Are there any other reporting quirks
> of which I should be aware in regards to flocking?  Should jobs flocked
> from other pools show up in my Condor View stats?

This does seem strange.  You're saying that pool 1 shows some number of
running and idle jobs, and pool 2 shows just the idle jobs from pool 1?
I would expect pool 2 to show either both idle and running jobs, or
nothing.

More details...this explanation may be a little too in-depth, but here
goes.  During normal operation, a schedd sends "submittor" ads to the
collector.  There is one submittor ad per schedd per
job-submitting-user.  You can see these ads via 'condor_status -sub'.
If the submittor ad shows that a user has jobs to run, then the
negotiator will contact the schedd and negotiate.  If the jobs can't be
matched with the normal collector, the schedd will then send submittor
ads to other collectors in $(FLOCK_TO).  
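
To make the sequence above concrete, here is a toy Python sketch of
which collectors end up receiving submittor ads.  It is only a model of
the description in this message, not Condor code; the function name and
the host names in the usage note are made up for illustration.

```python
def collectors_to_advertise(local_collector, flock_to, unmatched_locally):
    """Return the collectors a schedd would send submittor ads to.

    Toy model only: local_collector stands for the pool's own collector,
    flock_to for the hosts listed in $(FLOCK_TO), and unmatched_locally
    for "the local negotiator could not match all of the idle jobs".
    """
    targets = [local_collector]
    if unmatched_locally:
        # Jobs were left unmatched at home, so advertise to the
        # flocking targets as well.
        targets.extend(flock_to)
    return targets
```

With a hypothetical local collector "cm.pool1.example" and a FLOCK_TO
list of ["cm.pool2.example"], the model returns only the local collector
while jobs are being matched at home, and both collectors once some jobs
go unmatched.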

For our purposes, the interesting parts of the submittor ad are
RunningJobs, IdleJobs, and FlockedJobs.  It looks to me like the
submittor ads are always filled in with the number of idle jobs a user
has.  If this gets sent to both collectors, both pools will show idle
jobs from that submittor.  For the running jobs...I suspect that the ads
to the pool 1 collector are given:

RunningJobs = the number of running jobs for that submittor (for pool 1)
FlockedJobs = 0 (?)

and the pool 2 collector gets:

RunningJobs = 0
FlockedJobs = the number of running jobs for that submittor (for pool 2)

And the "FlockedJobs" value is totally ignored by the collector,
negotiator, and Condor View.  :-)  That's just a guess, but you could
run 'condor_status -sub -l' on both pools to confirm or deny it.
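
If comparing the long-format output by eye gets tedious, a small Python
helper along these lines might do it.  This is a rough sketch: it
assumes the '-l' output is "Attribute = value" lines with a blank line
between ads, that each ad carries a Name attribute, and that
'condor_status -pool <host>' reaches the other pool's collector.  The
function names are my own.

```python
import subprocess

ATTRS = ("RunningJobs", "IdleJobs", "FlockedJobs")


def parse_long_ads(text):
    """Split long-format ClassAd output into a list of dicts."""
    ads, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current ad.
            if current:
                ads.append(current)
                current = {}
            continue
        name, sep, value = line.partition("=")
        if sep:
            current[name.strip()] = value.strip().strip('"')
    if current:
        ads.append(current)
    return ads


def summarize(ads):
    """Map each ad's Name to its job counters (None if absent)."""
    summary = {}
    for ad in ads:
        name = ad.get("Name", "<unknown>")
        summary[name] = {a: int(ad[a]) if a in ad else None for a in ATTRS}
    return summary


def flocked_only(counts):
    """True if an ad matches the guess: no running jobs, some flocked."""
    return counts["RunningJobs"] == 0 and bool(counts["FlockedJobs"])


def query_pool(pool):
    """Run condor_status against one pool's collector and summarize."""
    out = subprocess.run(
        ["condor_status", "-pool", pool, "-sub", "-l"],
        capture_output=True, text=True, check=True,
    ).stdout
    return summarize(parse_long_ads(out))
```

Calling query_pool() once per collector host and comparing the two
results per submittor should show whether pool 2 really gets
RunningJobs = 0 with a nonzero FlockedJobs.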

Good luck,
Mike Yoder
Principal Member of Technical Staff
Direct : +1.408.321.9000
Fax    : +1.408.321.9030
Mobile : +1.408.497.7597
yoderm@xxxxxxxxxx

Optena Corporation
2860 Zanker Road, Suite 201
San Jose, CA 95134
http://www.optena.com