Re: [Condor-users] [Filter Test: P272621] Startd's crashing with fatal error getting process info for starter and descendants



> 7/5 14:14:45 ERROR "Starter::percentCpuUsage(): Fatal error getting
> process info for the starter and decendents" at line 859 in file
> ..\src\condor_startd.V6\Starter.C

I've figured out the problem. One of my users' jobs is making the
Windows API call:

SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS);

from one of its spawned threads. As a result, the condor_startd gets
starved for CPU on the machine (it's a single-processor machine).

The user was trying to reduce the variance in run time of his job from
run to run by preventing this critical thread from being interrupted.

First thing I've noticed:

All the condor daemons on Windows run at 'Normal' priority. Would it be
possible to add a config setting that would let me change this? I'd like
to see all the daemons run at 'High' priority.
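
To make the request concrete, here is a minimal sketch of what such a
setting might do. The setting values ("HIGH", "NORMAL", ...) and both
function names are hypothetical and are not existing Condor
configuration or code. Only SetPriorityClass, GetCurrentProcess and the
*_PRIORITY_CLASS constants are real Windows API. The non-Windows branch
exists only so the name lookup compiles elsewhere.

```c
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Not on Windows: mirror the winbase.h values so the lookup below
   still compiles. Nothing is applied in this case. */
typedef unsigned long DWORD;
#define IDLE_PRIORITY_CLASS         0x00000040
#define BELOW_NORMAL_PRIORITY_CLASS 0x00004000
#define NORMAL_PRIORITY_CLASS       0x00000020
#define ABOVE_NORMAL_PRIORITY_CLASS 0x00008000
#define HIGH_PRIORITY_CLASS         0x00000080
#endif

/* Hypothetical helper: translate a config string to a priority class.
   The match is case-sensitive. Returns 0 for NULL or unknown names. */
DWORD priority_class_from_name(const char *name)
{
    static const struct { const char *name; DWORD cls; } table[] = {
        { "IDLE",         IDLE_PRIORITY_CLASS },
        { "BELOW_NORMAL", BELOW_NORMAL_PRIORITY_CLASS },
        { "NORMAL",       NORMAL_PRIORITY_CLASS },
        { "ABOVE_NORMAL", ABOVE_NORMAL_PRIORITY_CLASS },
        { "HIGH",         HIGH_PRIORITY_CLASS }
    };
    size_t i;

    if (name == NULL)
        return 0;
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(name, table[i].name) == 0)
            return table[i].cls;
    }
    return 0;
}

/* Hypothetical helper: apply the named class to the calling process.
   Returns 1 on success, 0 on an unknown name, an API failure, or when
   built for a non-Windows platform. */
int apply_daemon_priority(const char *name)
{
    DWORD cls = priority_class_from_name(name);

    if (cls == 0)
        return 0;
#ifdef _WIN32
    return SetPriorityClass(GetCurrentProcess(), cls) ? 1 : 0;
#else
    return 0;
#endif
}
```

A daemon would call something like apply_daemon_priority("HIGH") once
at startup, after reading its configuration.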

Second thing:

I haven't tried Matt Hope's suggestion of JOB_RENICE_INCREMENT=0 yet,
but with the daemons running at 'High' priority it would be really nice
to have an explicit way to spawn the job thread at 'Normal' priority.
This would ensure far fewer interruptions from system processes on the
machine.
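
As a rough illustration of the second request, a parent process can
name the child's priority class explicitly in the creation flags it
passes to CreateProcess. This is only a sketch under that assumption,
not how the condor_starter actually launches jobs. The names
flags_with_priority, spawn_job_at_normal_priority and
JOB_PRIORITY_CLASS_MASK are hypothetical.

```c
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Not on Windows: mirror the winbase.h values so the flag helper
   still compiles. No process is spawned in this case. */
typedef unsigned long DWORD;
#define IDLE_PRIORITY_CLASS         0x00000040
#define BELOW_NORMAL_PRIORITY_CLASS 0x00004000
#define NORMAL_PRIORITY_CLASS       0x00000020
#define ABOVE_NORMAL_PRIORITY_CLASS 0x00008000
#define HIGH_PRIORITY_CLASS         0x00000080
#define REALTIME_PRIORITY_CLASS     0x00000100
#endif

/* Hypothetical mask covering every priority-class creation flag. */
#define JOB_PRIORITY_CLASS_MASK \
    (IDLE_PRIORITY_CLASS | BELOW_NORMAL_PRIORITY_CLASS | \
     NORMAL_PRIORITY_CLASS | ABOVE_NORMAL_PRIORITY_CLASS | \
     HIGH_PRIORITY_CLASS | REALTIME_PRIORITY_CLASS)

/* Drop any priority class already present in flags and set cls. */
DWORD flags_with_priority(DWORD flags, DWORD cls)
{
    return (flags & ~(DWORD)JOB_PRIORITY_CLASS_MASK) | cls;
}

/* Start cmdline as a child at 'Normal' priority, whatever class the
   caller runs at. cmdline must be writable because CreateProcess may
   modify it. Returns 1 if the child was created, 0 otherwise. */
int spawn_job_at_normal_priority(char *cmdline)
{
#ifdef _WIN32
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    DWORD flags = flags_with_priority(0, NORMAL_PRIORITY_CLASS);

    memset(&si, 0, sizeof si);
    si.cb = sizeof si;
    memset(&pi, 0, sizeof pi);

    if (!CreateProcessA(NULL, cmdline, NULL, NULL, FALSE, flags,
                        NULL, NULL, &si, &pi))
        return 0;

    /* A real launcher would keep pi.hProcess to monitor the job;
       this sketch releases both handles and lets the child run. */
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return 1;
#else
    (void)cmdline;
    return 0;
#endif
}
```

Note that a job can still raise its own priority class afterwards, as
the one described above does, so this only sets the starting point.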

- Ian