
Re: [HTCondor-users] ProcD issues



Thank you, Dr. Tannenbaum, that does the trick. Besides feeling a bit silly that I went straight to the wisdom page instead of the manual, I still have no idea what could be causing this issue. We only experience it on Windows 8, and I'm not familiar enough with the intricacies of either ProcD or NT to even hazard a guess.

Thank you so much,

John Lambert


On Wed, Apr 24, 2013 at 6:47 PM, Todd Tannenbaum <tannenba@xxxxxxxxxxx> wrote:
On 4/24/2013 4:26 PM, John Lambert wrote:
We've been having some trouble with ProcD. Whenever a machine changes
states (usually going from backfill to unclaimed) ProcD has the
following error on Windows 8 machines (not entirely sure if it's
exclusive to them...)

04/23/13 17:39:41 CreateFile error: 2
04/23/13 17:39:41 ProcFamilyClient: failed to start connection with ProcD
04/23/13 17:39:41 unregister_subfamily: ProcD communication error
04/23/13 17:39:41 ERROR "ProcD has failed" at line 621 in file
c:\condor\execute\dir_1756\userdir\src\condor_utils\proc_family_proxy.cpp

The ProcD.STARTD logs don't mention any errors, so I'm not sure what
happened.

Anyway, I was wondering what alternatives to ProcD existed. The ProcD
wisdom page mentions a killfamily class method. Is this a UNIX exclusive
thing? How can I configure the rest of my daemons to not use ProcD? Any
light you guys can shine on this would be greatly appreciated.


Hi John -

Thanks for the heads-up re potential issues w/ the PROCD on Windows 8. We do not yet do automated regression testing on Windows 8 but will be starting soon (our build/test pool has Windows 7, and we just added in a Windows 8 machine but haven't fully configured it yet....soon...).

To disable the procd you can set
  USE_PROCD = False
in your condor_config and then restart HTCondor (I don't think a reconfig will do it here; I think you'll need a full stop and start of the service). This setting works the same on all platforms.  In general, use of a PROCD is advisable for scalability reasons (see below), esp. if you have an execute machine with a lot of slots, although the scalability impact of disabling the procd may actually be less on Windows than on Unix-type platforms.  But everything should still work with the PROCD disabled.  Info on the USE_PROCD knob from the manual is cut-n-pasted below.

p.s. look forward to meeting you next week in Madison @ HTCondor Week 2013!

regards,
Todd


USE_PROCD
    This boolean variable determines whether the condor_procd will be used for managing process families. If the condor_procd is not used, each daemon will run the process family tracking logic on its own. Use of the condor_procd results in improved scalability because only one instance of this logic is required. The condor_procd is required when using privilege separation (see Section 3.6.14) or group ID-based process tracking (see Section 3.12.11). In either of these cases, the USE_PROCD setting will be ignored and a condor_procd will always be used. By default, the condor_master will start a condor_procd that all other daemons that need process family tracking will use. A daemon that uses the condor_procd will start a condor_procd for use by itself and all of its child daemons.
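
For anyone finding this thread later, here is a minimal sketch of the change described above as it might appear in a configuration file. Only the USE_PROCD line comes from the thread; putting it in a local config file rather than the main condor_config is an assumption, and the comments are illustrative.

```
# Local HTCondor configuration file (which file you edit is an assumption;
# the thread only says "your condor_config").

# Do not use the condor_procd; each daemon then runs the process family
# tracking logic on its own. Per the manual text above, this setting is
# ignored when privilege separation or group ID-based process tracking
# is in use.
USE_PROCD = False
```

As noted above, a reconfig is probably not enough. On Windows, assuming the service is registered under the name "condor", a full restart would be `net stop condor` followed by `net start condor`, and `condor_config_val USE_PROCD` should then show the value the daemons will read.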