
Re: [Condor-users] Clipped Mac OSX



The config on our classroom G5s is similar to what was described. Namely, it's the UWCS example config with the following tweaks:
StartIdleTime           =  1 * $(HOUR)
ContinueIdleTime        =  1 * $(HOUR)
MaxSuspendTime          = 24 * $(HOUR)
MaxVacateTime           = 10 * $(MINUTE)

and

PREEMPTION_REQUIREMENTS = $(StateTimer) > (2 * $(HOUR)) && \
RemoteUserPrio > SubmittorPrio * 1000

The dedicated cluster nodes look closer to the TESTINGMODE examples in the config. Unfortunately, we're still having problems with jobs getting evicted when they shouldn't be. We can't just use 'False' because we have a number of nice jobs which *should* be evicted, while all other jobs should not. I was thinking of changing the config to the following:
PREEMPTION_REQUIREMENTS = $(StateTimer) > (2 * $(HOUR)) && \
RemoteUserPrio > SubmittorPrio * 1000 && \
RemoteUserPrio > SubmittorPrio + 1000
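
To see what the extra additive clause changes, here is a rough Python sketch that compares the two priority clauses on their own. It leaves out the StateTimer test, the function names are hypothetical, and the sample priority values are made up for illustration rather than taken from a real pool. The two thresholds cross at SubmittorPrio = 1000/999, so the "+ 1000" clause only tightens the expression for submitters whose priority value is below roughly 1.001; above that, the "* 1000" clause already dominates.

```python
# Sketch only: compares the two priority clauses from the proposed
# PREEMPTION_REQUIREMENTS. The StateTimer clause is ignored, and every
# priority value used below is a made-up sample.

def multiplicative(remote_prio, submittor_prio):
    """RemoteUserPrio > SubmittorPrio * 1000"""
    return remote_prio > submittor_prio * 1000


def additive(remote_prio, submittor_prio):
    """RemoteUserPrio > SubmittorPrio + 1000"""
    return remote_prio > submittor_prio + 1000


def both(remote_prio, submittor_prio):
    """The two clauses joined with &&, as in the proposed expression."""
    return (multiplicative(remote_prio, submittor_prio)
            and additive(remote_prio, submittor_prio))


def tighter_clause(submittor_prio):
    """Name the clause with the higher threshold for this submitter."""
    mult_threshold = submittor_prio * 1000
    add_threshold = submittor_prio + 1000
    if mult_threshold > add_threshold:
        return "multiplicative"
    if add_threshold > mult_threshold:
        return "additive"
    return "equal"


if __name__ == "__main__":
    for submittor_prio in (0.5, 1.0, 2.0, 10.0):
        print(submittor_prio, tighter_clause(submittor_prio))
```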

But maybe it would be better to use something that detected NiceUser? Does anyone have any better suggestions for that?
~Seth


On Jun 12, 2007, at 9:53 AM, Kewley, J (John) wrote:

This sounds a useful setup. All this is explained in the manuals, but sometimes
seeing the settings and what they are used for is useful for others in this
group.

If you can submit them in a few lines, could you do so?

Cheers

JK

-----Original Message-----
[mailto:condor-users-bounces@xxxxxxxxxxx]On Behalf Of Finch, Ralph
Sent: Tuesday, June 12, 2007 3:24 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Clipped Mac OSX


We have a somewhat similar situation: Windows SMP machines with jobs
that can run up to several days. We want minimal impact on interactive
use, so if both CPUs on the SMP machines are in use and the keyboard is
in use, we suspend one of the Condor jobs for up to a few hours total
time. It seems to work well: the suspension occurs immediately on
keyboard use, then the job resumes after a period of no keyboard use (5
minutes). Condor keeps track of total suspend time, and if a job
accumulates beyond a max we no longer suspend it (so it doesn't delay an
entire batch of runs).
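
The behaviour described above can be sketched as a small Python simulation. This is not Condor configuration and not the actual policy expressions in use; it only models the logic as described: suspend on keyboard activity, resume after 5 minutes of keyboard idle time, and stop suspending once a cumulative budget is spent. The 3-hour budget, the 60-second step, and all names are assumptions for illustration, and the sketch assumes the machine is already fully loaded (the "both CPUs in use" condition is not modelled).

```python
# Sketch only: a toy model of the suspend/resume behaviour described in the
# post. It is not Condor configuration; names and defaults are hypothetical.

RESUME_AFTER_IDLE = 5 * 60      # seconds of keyboard idle before resuming (from the post)
MAX_TOTAL_SUSPEND = 3 * 3600    # assumed budget; the post only says "a few hours"


def simulate(keyboard_busy, step=60,
             resume_after=RESUME_AFTER_IDLE, max_total=MAX_TOTAL_SUSPEND):
    """Walk through one keyboard-activity sample per time step.

    keyboard_busy: iterable of bools, True when the keyboard was used
    during that step. Returns (list of job states, total suspended seconds).
    """
    suspended = False
    idle_for = 0
    total_suspended = 0
    states = []
    for busy in keyboard_busy:
        idle_for = 0 if busy else idle_for + step
        if total_suspended >= max_total:
            # Budget spent: this job is never suspended again.
            suspended = False
        elif busy:
            # Suspend immediately on keyboard use.
            suspended = True
        elif suspended and idle_for >= resume_after:
            # Keyboard has been idle long enough: resume the job.
            suspended = False
        if suspended:
            total_suspended += step
        states.append("suspended" if suspended else "running")
    return states, total_suspended


if __name__ == "__main__":
    trace = [False, True, True] + [False] * 6
    states, total = simulate(trace)
    for minute, state in enumerate(states):
        print(minute, state)
    print("total suspended seconds:", total)
```

Capping the cumulative suspend time is what keeps one heavily used desktop from stalling a whole batch: once the budget is gone, the job simply keeps running alongside the interactive user.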

If your jobs are evicted for a similar reason (interactive use) you
might want to consider suspending rather than evicting.  Evicting just
doesn't make much sense with jobs that run more than an hour or two.


-----Original Message-----
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Seth Price
Sent: Tuesday, June 05, 2007 12:40 PM
Subject: [Condor-users] Clipped Mac OSX

I'm with the SAGE group on the wisc campus (www.sage.wisc.edu). We
have a cluster of 40 Mac G5 processors, and we are considering adding
a number of 8 core Mac Pros to the mix.

Are there plans for un-clipping the Mac version of Condor? We would
really like checkpointing to work.

The largest complaint with the condor setup is when a job gets
evicted for one reason or another, and the run needs to start from
scratch on another machine. The scientists I'm working with often
have week-long jobs, so this is a serious problem.

We might abandon using condor in favor of stand alone machines to put
fewer possible interruptions between the scientists and the hardware.
(I hope not, though. I'm sure I have things configured correctly this
time...)

Thanks,
Seth



PS: I realize I can configure condor to *never* evict a job, but my  
intent is to only evict my 'nice' jobs whenever any other jobs are  
submitted.


_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting

The archives can be found at: