Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab

Re: [Condor-users] condor's calculated memory vs image size of jobs in queue

Date: Thu, 17 May 2007 08:14:25 +0200
From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
Subject: Re: [Condor-users] condor's calculated memory vs image size of jobs in queue

On Wed, May 16, 2007 at 12:13:53PM -0700, Stuart Anderson wrote:
> Paul,
> 	This looks like it might be a problem with the automatic job clustering.
> As a test you might try enabling NEGOTIATE_ALL_JOBS_IN_CLUSTER to see if that
> solves the problem before digging deeper.

In addition to this, you might consider setting RESERVED_MEMORY (or the like,
I don't have the manual at hand) to -256M (yes, negative), so that the job may
run into swap but still gets a chance to complete.
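
For illustration, a minimal sketch of how the two suggestions above might look
in a local Condor configuration file. The file name is an assumption, and the
value assumes RESERVED_MEMORY is given in megabytes; whether a negative value
is honoured should be checked against the manual for the version in use.

```
# condor_config.local (hypothetical file name; adjust to your installation)

# On the submit machine (read by the schedd): negotiate for every job in a
# cluster rather than assuming that jobs in one cluster are interchangeable.
NEGOTIATE_ALL_JOBS_IN_CLUSTER = True

# On the execute machine (read by the startd): a negative reservation makes
# the machine advertise more memory than it physically has, so a job that
# is slightly too large can still match and spill into swap.
# Assumption: the unit is megabytes.
RESERVED_MEMORY = -256
```

After changing the files, running condor_reconfig on the affected machines
should make the daemons pick up the new values.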

This leads me to repeat my request for "dynamic resource repartitioning" on
SMP machines: if there are idle boxen which together would amount to enough
memory etc., it should be possible to temporarily close one of the vms and
use its resources for the other one(s).

Steffen

-- 
Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
Cluster Admin * http://pandora.aei.mpg.de/merlin/ * http://www.aei.mpg.de/
* e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}
No Word/PPT mails - http://www.gnu.org/philosophy/no-word-attachments.html
