
Re: [Condor-users] Job remains in idle (worked until I increased pool size)



Hi Michael,

On Wed, Feb 9, 2011 at 2:15 PM, Michael O'Donnell <odonnellm@xxxxxxxx> wrote:

I have a pool of 200 cores with various OS for Windows machines. Yesterday, I expanded the pool from 100 cores to 200 cores and since then any jobs that I submit remain in the idle state.

How did you expand the cores? By doubling the slot counts on your machines or by adding new machines?

I ask because...
 
1   ( ( ( 1024 * target.Memory ) >= 4500 ) && ( ( 1024 * ceiling(ifThenElse(JobVMMemory isnt undefined,JobVMMemory,4.394531250000000E+000)) ) >= 4500 ) )
                                      0                   REMOVE

The statement above says there are no machines with enough memory to run your jobs. If you doubled the slot count by doubling the slots advertised by your pool's machines, then you've halved the memory Condor allocates per slot, and you may have constrained yourself out of slots because of this.
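
To make the arithmetic concrete, here is a rough sketch in Python of how an even split of a machine's memory shrinks as the slot count grows. The machine size (8192 MB), the slot counts, and the job's memory need (1500 MB) are made-up numbers for illustration, not values from your pool or from the expression above, and the helper names are hypothetical.

```python
def memory_per_slot(total_memory_mb: int, num_slots: int) -> int:
    """Memory (in MB) each slot gets when a machine's RAM is split evenly."""
    return total_memory_mb // num_slots


def slot_fits(slot_memory_mb: int, job_memory_mb: int) -> bool:
    """True if a slot advertises at least as much memory as the job needs."""
    return slot_memory_mb >= job_memory_mb


# Hypothetical machine with 8192 MB of RAM and a job that needs 1500 MB.
TOTAL_MB = 8192
JOB_MB = 1500

before = memory_per_slot(TOTAL_MB, 4)  # slot count before the change
after = memory_per_slot(TOTAL_MB, 8)   # slot count doubled on the same machine

print(before, slot_fits(before, JOB_MB))
print(after, slot_fits(after, JOB_MB))
```

With these assumed numbers, the job matches a slot before the doubling and no longer matches afterwards, even though the machine itself has not changed.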

 
2   ( target.HasWindowsRunAsOwner && ( target.LocalCredd is "CM" ) )
                                      0                   REMOVE

This one I cannot explain. Did you change submissions to use run_as_owner = true?

Regards,
- Ian