
Re: [Condor-users] condor's calculated memory vs image size of jobs in queue



On Wed, May 16, 2007 at 12:13:53PM -0700, Stuart Anderson wrote:
> Paul,
> 	This looks like it might be a problem with the automatic job clustering.
> As a test you might try enabling NEGOTIATE_ALL_JOBS_IN_CLUSTER to see if that
> solves the problem before digging deeper.

In addition to this, you might consider setting RESERVED_MEMORY (or the like;
I don't have the manual at hand) to -256M (yes, negative) so that the job can
run into swap but still gets a chance to complete.
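
To illustrate why a negative value helps, here is a rough sketch of the
arithmetic. It assumes that the startd subtracts the reserved amount from the
detected memory and splits the rest evenly among the vms; the function name
and the node size (2 vms, 4096 MB) are made up for the example and are not
taken from Paul's setup.

```python
def memory_per_vm(total_mb: int, reserved_mb: int, num_vms: int) -> int:
    """Memory (in MB) each vm would advertise under an even split.

    Assumed model: (detected memory - reserved memory) / number of vms.
    A negative reserved_mb therefore raises the advertised memory.
    """
    if num_vms <= 0:
        raise ValueError("num_vms must be positive")
    return (total_mb - reserved_mb) // num_vms


# Hypothetical dual-CPU node with 4096 MB of RAM.
print(memory_per_vm(4096, 0, 2))     # nothing reserved
print(memory_per_vm(4096, -256, 2))  # 256 MB "negative" reservation
```

Under this assumption each vm advertises 128 MB more than before, so a job
whose image size is slightly above the per-vm memory can still be matched.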

This leads me to repeat my request for "dynamic resource repartitioning" on
SMP machines: if there are idle boxen which together would amount to enough
memory etc., it should be possible to temporarily close one of the vms and
use its resources for the other one(s).

Steffen

-- 
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html