Re: [Condor-users] condor-status and its options


> I noticed there was no
> condor_status -collector
> which I thought a bit strange since I'd have thought this as useful
> as -negotiator

I suppose it doesn't make much sense since you have to know where the
collector is before you can run condor_status.

> I did however try -master and -any
> condor_status -master
> I thought would give me a list of machines running a master daemon
> (which should be one per machine, as opposed to one per proc, I
> think), however only a subset of the nodes appeared.
> condor_status -any
> should have been a superset of all entries above, and indeed it
> covered all the nodes, but not all had a DaemonMaster entry.
> Looking at these last two, there seems to be a vague pattern. My
> Windows 2000 machine didn't appear to have a Master entry, neither
> did my multi-proc Windows XP ones, but my multi-proc Linux one did.
> In fact, my multi-proc Linux one was listed without the vmN prefix
> whereas the multi-proc Windows XP ones (in condor_status -any)
> displayed the vmN prefix.
> Is this expected behaviour?

As far as the Windows behavior goes, this seems to be a problem with
how Windows handles UDP traffic.  It seems to consider a packet "sent"
as soon as the system call returns, but before the packet has actually
hit the wire.  This leads to many master and some startd updates being
lost: the update is bigger than one UDP packet, and the second packet
can bump the first from the send queue (blame Windows).  What we do
here is add D_NETWORK to the MASTER_DEBUG and STARTD_DEBUG flags on
our Windows machines; the extra logging slows things down enough that
this doesn't happen.  I suppose you could also just set
UPDATE_COLLECTOR_WITH_TCP to True to avoid UDP entirely.
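
For reference, here is a rough sketch of how those two workarounds
might look in the local config file of a Windows node.  The macro
names are the ones mentioned above; the layout and comments are only
illustrative, and you would normally pick one workaround rather than
both:

    # Workaround 1: turn on network debugging for the master and
    # startd.  If you already list other debug flags, keep them and
    # add D_NETWORK to the list.
    MASTER_DEBUG = D_NETWORK
    STARTD_DEBUG = D_NETWORK

    # Workaround 2: send collector updates over TCP instead of UDP.
    UPDATE_COLLECTOR_WITH_TCP = True

The daemons have to re-read their configuration (condor_reconfig, or
a restart) before either change takes effect.  It is also worth
checking the manual for your Condor version for any collector-side
settings that TCP updates require.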

Daniel K. Forrest	Laboratory for Molecular and
forrest@xxxxxxxxxxxxx	Computational Genomics
(608) 262 - 9479	University of Wisconsin, Madison