
Re: [HTCondor-users] Limiting size of tmp files in /var/condor/execute



Hi Paul,

You can try adding a SUSPEND / PREEMPT expression that evicts jobs once they go over a certain amount of disk space.  The expression is evaluated periodically rather than continuously, so the only jobs that could still be problematic are those with enough I/O to fill the disk between two evaluations.
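
As a rough sketch of such a policy, and not a tested recipe: the fragment below assumes the execute node already defines PREEMPT and WANT_SUSPEND, and that the job's DiskUsage and the slot's Disk attributes (both in KiB) are visible to the startd policy. DISK_EXCEEDED is a macro name made up here for illustration.

```
# Hypothetical helper macro: true once the job's measured scratch usage
# (DiskUsage, KiB) exceeds the disk assigned to its slot (Disk, KiB).
DISK_EXCEEDED = (DiskUsage =!= UNDEFINED && DiskUsage > Disk)

# Preempt the offending job, keeping whatever PREEMPT policy already exists.
PREEMPT = ($(PREEMPT)) || ($(DISK_EXCEEDED))

# Do not merely suspend a job that is over its disk allocation,
# since a suspended job keeps its files on disk.
WANT_SUSPEND = ($(WANT_SUSPEND)) && ($(DISK_EXCEEDED)) =!= TRUE
```

Exact attribute scoping and how often DiskUsage is refreshed can differ between HTCondor versions, so this is worth checking against the manual for the release you run.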

If you are using partitionable slots, the implementation is rather clean.  You can provide a certain amount of default space (5GB, for example), and allow users to request a different amount by adding:

request_disk = 6291456

to their submit files (the value is in KiB, so this requests 6 GiB).  HTCondor will track the disk space it has handed out to jobs to make sure the total doesn't go over the amount available in the execute directory.
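
For illustration, a minimal sketch of the partitionable-slot setup might look like the following. It assumes a single slot owning the whole machine and a 5 GiB default; the values are examples only, and JOB_DEFAULT_REQUESTDISK may not be available in older releases.

```
# Execute node: one partitionable slot that owns all of the machine's
# resources, including the disk under the execute directory.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = True

# Submit node: default disk request, in KiB, for jobs that do not set
# request_disk themselves (5 * 1024 * 1024 KiB = 5 GiB).
JOB_DEFAULT_REQUESTDISK = 5242880
```

Note that this only governs how much disk is handed out at match time. It does not stop a running job from writing more than it requested.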

Would this approach suffice?  On Linux, this is a fairly difficult problem: anything closer to a true quota would require HTCondor development.

Brian

On Mar 30, 2013, at 4:04 PM, Paul Brenner <paul.r.brenner@xxxxxx> wrote:

> We are expanding the use of HTCondor across cluster systems of more diverse hardware and ownership requirements on our campus.  The default Condor configuration allows users to fill /var/condor/execute until machines crash or nearly crash.  I did some "googling" on the Condor support pages to fix this and the majority of fixes recommended a separate disk partition.  Because of the highly diverse nature of the contributed systems, setting up separate partitions is not an option.
> 
> Is there a recommended alternative fix somewhere in the user manual or listserv archives that someone can point me to?  I thought about changing default local machine config values such as "TotalDisk = something_smaller" but I'm not sure this would actually set a maximal limit for space /var/condor/execute can utilize.
> 
> Thanks for any insight,
> Paul
> 
> -- 
> Paul R Brenner, PhD, P.E.
> Center for Research Computing
> The University of Notre Dame
