
Re: [Condor-users] no CPU time for a finished job!!



By default, the Starter is supposed to poll 8 seconds into the job run, and then every STARTER_UPDATE_INTERVAL seconds (300 by default).

interval.setDefaultInterval( param_integer( "STARTER_UPDATE_INTERVAL", 300, 0 ) );
interval.setTimeslice( param_double( "STARTER_UPDATE_INTERVAL_TIMESLICE", 0.1, 0, 1 ) );
interval.setInitialInterval( param_integer( "STARTER_INITIAL_UPDATE_INTERVAL", 8 ) );
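
To make the timing concrete, here is a rough sketch of when periodic updates would fire for a job of a given length. It assumes a simple fixed-interval timer and ignores the timeslice setting, so the real Starter may behave differently. The helper update_times is hypothetical and is not part of the Condor source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not Condor code): returns the times, in seconds since
// job start, at which a periodic update would fire for a job that runs for
// `runtime` seconds. Assumes a plain fixed-interval timer; the timeslice
// setting, which may stretch the interval, is ignored.
std::vector<int> update_times(int initial, int interval, int runtime) {
    assert(initial >= 0 && runtime >= 0);
    std::vector<int> times;
    if (interval <= 0) {
        // Guard against a non-advancing loop: report only the initial update.
        if (initial <= runtime) {
            times.push_back(initial);
        }
        return times;
    }
    for (int t = initial; t <= runtime; t += interval) {
        times.push_back(t);
    }
    return times;
}
```

Under these assumptions, update_times(8, 300, 15) contains only the update at 8 seconds, and update_times(8, 300, 5) is empty, so a job that finishes in under 8 seconds would depend entirely on the final update.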

I'm not entirely sure why you'd be getting no data. If ImageSize or ResidentSetSize are getting updated, maybe there just was no CPU time registered.

It looks like the code tries to get a final update too. I wonder if it actually works.

Best,


matt

On 03/29/2011 07:48 AM, Santanu Das wrote:
Hi Matt,

Those jobs should run for 10 to 15 seconds or so, but most of the time the
CPU time comes out as 0.
Do you think a smaller STARTER_UPDATE_INTERVAL value will improve this
situation?

-Santanu



On 03/27/2011 04:00 PM, Santanu Das wrote:
Hi there,

I see there are a number of jobs with JobStatus 4, but the 'RemoteUserCpu
+ RemoteSysCPU' is still 0 - is that common?

[testac1@serv07 /]$ condor_history -l 488692 | egrep '^JobStatus|RemoteSysCPU|RemoteUserCpu'
RemoteUserCpu = 0.000000
JobStatus = 4


Does anyone know the reason?

Cheers,
Santanu

Could be that the job ran for such a short period that the runtime
stats were never collected. IIRC, shrinking STARTER_UPDATE_INTERVAL
can get stats polled more frequently. It will increase the network
load starter->shadow, but not shadow->schedd.

Best,


matt
