[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Condor-users] Jobs interruption in the middle of running cause end results to failed



Alex,

At minimum, to simply disable preemption, set PREEMPT = FALSE and RANK = 0 in the configuration for your startds, and PREEMPTION_REQUIREMENTS = FALSE for your negotiator.  That will stop preempting based on user activity, better-ranked jobs, and user prio, respectively.

There is a helpful section explaining preemption in section 3.5 of the Condor manual at

http://www.cs.wisc.edu/condor/manual/v7.2/3_5Policy_Configuration.html#sec:Disabling_Preemption

Regards,
Doug

On Feb 19, 2009, at 10:33 AM, Alas, Alex [FEDI] wrote:

Hello to all of you,
I have a situation here. I have a Condor Pool of 22 CPU’s running version 7.05, distributed in 4 Windows 2003 boxes (each win2k3 box has 4 cpu’s) and 3 Windows XP boxes (each winxp box has 2 cpu’s) . Jobs runs fine while there is only one user running his jobs at a time. I have to clarify that we haven’t set any kind of priorities (job or user) when the jobs are been sent. The issues presents when two users send their jobs at the same time or one after the other, being more specific if one user sends a jobs first and minutes later the other one send his jobs while the first ones are running. The Central manager puts the first  jobs on hold  to take the jobs of the second user to run with higher priority but once those jobs of the first users were placed on idle the integrity of the data is corrupted and from there the rest of jobs once they come back to running mode is wrong. The same happen with the jobs of the second user, during that process of putting them on idle and running the data is mishandle it and it will degenerate the results.
Is there any setting I need to configure at the condor_config level of the central manager to stipulate that once a jobs is running on one node it should not be interrupted and need to finish to its completion?
Thanks for your answer\input in advance!!!
 
 
Respectfully,
Alex Alas
Systems Administrator
Fugro EarthData Inc.
Tel. 301-948-8550 x219 Fax 301-963-2064 E-mail: aalas@xxxxxxxxxxxxx
7320 Executive Way, Frederick, MD  21704
 
_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at: 
https://lists.cs.wisc.edu/archive/condor-users/

-- 
===================================
Douglas Clayton
phone: 919.647.9648

Cycle Computing, LLC
Leader in Condor Grid Solutions
Enterprise Condor Support and Management Tools

http://www.cyclecomputing.com