Re: [Condor-users] job eviction and image size

The problem is not normally that the job is evicted due to its image
size*, but that it goes beyond the limit for the machine and then is
preempted for another reason.

At this point the ImageSize has been updated to report the real usage,
but when compared against the startd's reported memory it no longer
fits, so the job will not match that machine again.
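
To make the comparison concrete, here is a rough sketch in Python. It
assumes the memory check is the usual default requirement,
(Memory * 1024) >= ImageSize, with Memory in MB and ImageSize in KB.
The function name and the numbers are made up for illustration.

```python
# Sketch of the memory check, assuming the requirement
# (Memory * 1024) >= ImageSize, with Memory in MB and ImageSize in KB.

def job_fits(startd_memory_mb, image_size_kb):
    """Return True if the job's ImageSize fits the startd's advertised Memory."""
    return startd_memory_mb * 1024 >= image_size_kb

# Hypothetical machine advertising 512 MB (possibly less than it really has).
advertised_mb = 512

# Estimate at submit time: small enough, so the job matches and starts.
print(job_fits(advertised_mb, 400_000))

# After running, ImageSize is updated to the real usage; once the job is
# preempted for any reason it no longer matches this machine.
print(job_fits(advertised_mb, 600_000))
```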

Normally this is not an issue if the startd is correctly reporting the
right amount of memory, but there is a bug which causes it to report
too little for SMP machines, on Windows at least. I believe this is
fixed in the dev release but can't remember if it is in the 6.6 series.

The only easy ways to work around this are:

1) lie in the startd (artificially report more memory than exists)
2) lie in the job ClassAd
(when a job cannot run, use condor_qedit to set its ImageSize low again)
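
For option 2, something like the following Python sketch could be used
to reset a stuck job. It assumes the usual "condor_qedit <job-id>
<attribute> <value>" form; check the man page for your version. The
job id 123.0 and the value 100000 (KB) are hypothetical, as are the
helper names.

```python
import subprocess

def build_qedit_command(job_id, image_size_kb):
    """Build the argument list for resetting a job's ImageSize (in KB)."""
    return ["condor_qedit", job_id, "ImageSize", str(image_size_kb)]

def reset_image_size(job_id, image_size_kb):
    """Run condor_qedit; raises CalledProcessError if the command fails."""
    subprocess.run(build_qedit_command(job_id, image_size_kb), check=True)

if __name__ == "__main__":
    # Hypothetical job 123.0: show the command that would be run.
    print(" ".join(build_qedit_command("123.0", 100000)))
    # To actually apply it on a submit machine:
    # reset_image_size("123.0", 100000)
```

Bear in mind that the schedd will raise ImageSize again once the job
runs and its real usage is measured, so this may need repeating.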

I would suggest that option 2 is easier in the short term, to deal with
it when it happens.


*If someone has altered their preempt settings to kick a job which is
now hitting disk rather than staying in memory, you're SOOL and need to
obey the memory constraints...

On Thu, 02 Dec 2004 09:51:32 -0600, Michael Remijan
<remijan@xxxxxxxxxxxxx> wrote:
> How do I prevent jobs from being evicted because of image size?
> Michael J. Remijan
> Research Programmer
> http://www.ncsa.uiuc.edu
> (217) 244-7069
> remijan@xxxxxxxxxxxxx
> mikeremijan (AIM)
> mjremijan (Yahoo)