Re: [Condor-users] Dynamic Partitioning Updating Memory Usage

My previous configuration may have been wrong, because I believe PREEMPTION_REQUIREMENTS is only evaluated when RemoteUserPrio > SubmittorPrio. I've updated the preemption configuration as shown below. Unfortunately, jobs are still not preempted when ImageSize grows. condor_q reflects the updated image size, but I've discovered that the ImageSize in the local job ad (in the file pointed to by $(_CONDOR_JOB_AD)) is only set when a job starts and is not updated while the job is running. I suspect this may be where the problem lies. Is this functioning as intended? Is there a way to force a job to vacate when ImageSize grows larger than the memory allocated to a dynamically partitioned slot? Is there a more direct way to grow the memory allocated to a slot? Is there a problem with my configuration?

New schedd configuration:
RAN_FOR_A_BIT = ($(ActivationTimer) > (10 * $(MINUTE)))
KILL = ($(ActivityTimer) > $(MaxVacateTime))
PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (10 * $(MINUTE)) && RemoteUserPrio > SubmitterUserPrio * 1.2 )

On 01/10/12, Jake Adriaens wrote:
> I've configured condor to use dynamic slot partitioning; however, the memory size of the dynamic slots is never updated. When my jobs run, each dynamic slot always has a memory size of 1 and never grows with the image size of the job. It is my understanding that once a dynamic slot is allocated, the resources assigned to it cannot change. To work around this, I've tried to configure condor to preempt a job if its image size grows larger than the memory allocated to the slot. However, my jobs are never preempted, even though the image size shown by condor_q is considerably larger than the amount of memory available in the dynamic slot as shown by condor_status. Below are my preemption settings from the condor configuration. Any suggestions would be greatly appreciated.
> Jake
> PREEMPT = True
> RAN_FOR_A_BIT = $(StateTimer) > (10 * $(MINUTE))
> PRIORITY_EXCEEDED = RemoteUserPrio > SubmittorPrio * 1.2
> MEMORY_EXCEEDED = (TARGET.ImageSize/1024*0.7) > (Memory*1.0)
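
As a rough sanity check of the MEMORY_EXCEEDED expression above, the following Python sketch mirrors its arithmetic outside of condor. It is only an illustration under stated assumptions: ImageSize is taken to be in KiB and Memory in MiB, and ImageSize/1024 is assumed to truncate like integer division because both operands are integers. The function name memory_exceeded is hypothetical and not part of condor. The sketch says nothing about whether the startd actually sees an updated ImageSize while the job runs.

```python
# Hypothetical helper mirroring the quoted macro:
#   MEMORY_EXCEEDED = (TARGET.ImageSize/1024*0.7) > (Memory*1.0)
# Assumptions: ImageSize is in KiB, Memory is in MiB, and dividing two
# integers truncates (modelled here with // for non-negative values).

def memory_exceeded(image_size_kib: int, slot_memory_mib: int) -> bool:
    """Return True when 70% of the image size (in MiB) exceeds slot memory."""
    image_size_mib = image_size_kib // 1024      # integer KiB -> whole MiB
    return image_size_mib * 0.7 > slot_memory_mib * 1.0


if __name__ == "__main__":
    # A dynamic slot reporting a Memory of 1, as described in the message.
    for image_size_kib in (1024, 2048, 512000):
        print(image_size_kib, memory_exceeded(image_size_kib, 1))
```

Under these assumptions, a slot with a Memory of 1 would satisfy the expression once ImageSize reaches 2048, so almost any real job should trigger it if the expression is evaluated against a current ImageSize.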