Re: [Condor-users] windows xp log off kills jobs



> What are the values of SUSPEND and PREEMPT on these machines?

WANT_SUSPEND 		= TRUE
PREEMPT			= FALSE
PREEMPTION_REQUIREMENTS	= FALSE
KILL 				= FALSE 
# suspend job on VM1 if keyboard is touched 
# and VM2 has a Condor job or high load;
# but don't suspend if job suspension time exceeds limit
SUSPEND	 = (VirtualMachineID == 1) \
		&& ($(KeyboardBusy)) \
		&& ( (vm2_Activity == "Busy") || (vm2_LoadAvg > $(HighLoad)) ) \
		&& ( ((TotalJobSuspendTime =!= UNDEFINED) && (TotalJobSuspendTime <= $(MaxSuspendTime))) \
		|| (TotalJobSuspendTime =?= UNDEFINED))
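
Read as plain logic, the SUSPEND expression above amounts to roughly the
following.  This is only a Python sketch of the boolean structure, not
anything Condor runs: the function name and parameters are made up for
illustration, None stands in for UNDEFINED, and it assumes the vm2
attributes are always defined.  The values of $(HighLoad) and
$(MaxSuspendTime) are not shown here, so they are passed in as parameters.

```python
from typing import Optional


def should_suspend(
    vm_id: int,
    keyboard_busy: bool,
    vm2_activity: str,
    vm2_load_avg: float,
    total_suspend_time: Optional[float],  # None models UNDEFINED
    high_load: float,                     # stands in for $(HighLoad)
    max_suspend_time: float,              # stands in for $(MaxSuspendTime)
) -> bool:
    """Hypothetical model of the SUSPEND policy expression."""
    # Only the job on VM1 is ever a candidate for suspension.
    if vm_id != 1:
        return False
    # Nothing happens unless the keyboard counts as busy.
    if not keyboard_busy:
        return False
    # VM2 must be running a job or showing high load.
    # ClassAd "==" compares strings case-insensitively, hence casefold().
    vm2_in_use = vm2_activity.casefold() == "busy" or vm2_load_avg > high_load
    # Suspend only if the job has never been suspended (UNDEFINED)
    # or its total suspension time is still within the limit.
    within_limit = (
        total_suspend_time is None or total_suspend_time <= max_suspend_time
    )
    return vm2_in_use and within_limit
```

Every term in this sketch depends on keyboard activity, VM2 state, or
accumulated suspension time; none of them refers to a logon or logoff
event.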

> It is possible the standard 'kick a job off this machine if 
> the owner wants to use it' routines are kicking in.
> You may wish to change that behaviour...

With the above settings we suspend jobs in our pool when someone wants to
use a machine interactively.  This has worked properly for a couple of
years and still works now: when keyboard activity happens, the job on VM1
is suspended.  Anyway, why would logging OFF a machine result in killing
jobs, even if SUSPEND and PREEMPT were set incorrectly? :-(

Ralph Finch
916-653-7552


> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx 
> [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Matt Hope
> Sent: Friday, December 28, 2007 7:09 AM
> To: Condor-Users Mail List
> Subject: Re: [Condor-users] windows xp log off kills jobs
> 
> On Dec 27, 2007 10:00 PM, Finch, Ralph <rfinch@xxxxxxxxxxxx> wrote:
> > condor -version
> > $CondorVersion: 6.8.3 Jan  5 2007 $
> > $CondorPlatform: INTEL-WINNT50 $
> >
> > I am submitting jobs from machine1 to a pool, all windows xp.  If I
> > then remote login to a machine running my jobs--say machine2--then
> > logoff, the jobs on machine2 are killed and new jobs restart a few
> > minutes later from the idle jobs in the pool.  Damn annoying as you
> > can guess.
> >
> > In this thread
> >
> > https://lists.cs.wisc.edu/archive/condor-users/2004-November/msg00076.shtml
> >
> > the poster had the same problem but seemed to think it was only Java
> > jobs.  Mine are not Java, my executable is a windows .bat file which
> > then runs a compiled exe.  He had a klugy solution to his Java jobs
> > which I doubt would work with mine, plus it seems a serious deficiency
> > and should have a better solution.  I'm believing I'm not the first
> > person to hit on this problem so is there a good solution?
> 
> What are the values of SUSPEND and PREEMPT on these machines?
> 
> It is possible the standard 'kick a job off this machine if 
> the owner wants to use it' routines are kicking in.
> You may wish to change that behaviour...