Re: [Condor-users] about differences between vanilla and standard universe

On Wed, 9 Mar 2005 18:36:10 +0800 (HKT), Carson Hung
<carson@xxxxxxxxxxxxxx> wrote:
> Hi,
> I am a little bit confused about the differences between standard and
> vanilla universe.
> I think their main differences come from the I/O handling, is that true?
> Are there any other differences?

In general you can think of standard vs. vanilla like this:

Standard only works if you can relink your program against the Condor
I/O libraries and it still works (which rules out scripts, and
anything that has to run on a clipped port such as Windows).
If you can, great: you get transparent checkpointing support. Lucky you.

If not, you are in vanilla land. You are basically responsible for
everything yourself, and your only interaction with Condor is via:

1) control signal indicating 
  a) vacate please 
  b) vacate NOW.
2) the files copied to your starting directory if you responded in time to 1a
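
To make that concrete, here is a rough sketch of what "responsible for
everything yourself" can look like for a vanilla job. It is not anything
Condor ships. It assumes the "vacate please" arrives as SIGTERM (as far
as I know that is the default soft kill signal unless the submit file
says otherwise), that "vacate NOW" is an uncatchable SIGKILL, and that
state.json is one of the files that gets preserved. The class name, file
name and "step" counter are all made up for illustration.

```python
import json
import os
import signal


class VacateAwareJob:
    """Toy vanilla-universe job that saves its own progress on a soft vacate."""

    def __init__(self, state_path="state.json"):
        self.state_path = state_path      # hypothetical checkpoint file name
        self.vacate_requested = False
        self.step = 0

    def _on_vacate(self, signum, frame):
        # Only set a flag here; the main loop does the actual saving.
        self.vacate_requested = True

    def install(self, signum=signal.SIGTERM):
        # Assumption: the soft "vacate please" is delivered as SIGTERM.
        signal.signal(signum, self._on_vacate)

    def load(self):
        # Resume from an earlier run if a state file was left behind.
        if os.path.exists(self.state_path):
            with open(self.state_path) as fh:
                self.step = json.load(fh)["step"]

    def save(self):
        # Write to a temp file and rename, so a hard kill in the middle
        # of the write cannot leave a truncated state file.
        tmp = self.state_path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump({"step": self.step}, fh)
        os.replace(tmp, self.state_path)

    def run(self, total_steps):
        """Work until done or asked to vacate; return True if finished."""
        self.load()
        while self.step < total_steps and not self.vacate_requested:
            self.step += 1                # placeholder for one unit of real work
        self.save()
        return self.step >= total_steps
```

A job script would create one VacateAwareJob, call install() once, then
call run(n). If the hard kill arrives before save() finishes, the work
done since the last saved state is gone, which is exactly the "if you
responded in time" caveat in 2).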

For the most part the rest of the Condor machinery (schedd,
negotiator, etc.) behaves the same irrespective of which universe you
use, except:

1) The PREEMPT_VANILLA expression may optionally be used instead of
PREEMPT (and likewise for SUSPEND etc.).
This happens automatically if you define the _VANILLA versions; if you
don't, they default to being the same as PREEMPT and friends.

2) The likely default values for some of the preemption-specific
behaviour are badly skewed for predominantly vanilla jobs, since they
attempt to preempt the longest-running jobs first (Aaaargh), and a
vanilla job with no checkpoint of its own loses all of its work when
that happens. Users basing their pool settings on the UWCS ones would
do well to re-evaluate this if their load is predominantly vanilla.
This last one is not so much a difference in behaviour as a case of
the provided defaults being non-optimal.
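
A back-of-the-envelope illustration of why the preemption order matters
so much more for vanilla. This is only a toy model, not how the
negotiator or startd actually decide anything: the runtimes are made up,
and it assumes a checkpointed job loses nothing while a vanilla job
without its own checkpointing loses everything it has done so far.

```python
def work_lost(runtimes, n_preempt, longest_first, checkpointed):
    """Minutes of computation thrown away when n_preempt jobs are evicted."""
    if checkpointed:
        # Idealised assumption: a checkpointed job resumes where it stopped.
        return 0
    # Without a checkpoint, each evicted job restarts from scratch,
    # so its whole runtime so far is wasted.
    order = sorted(runtimes, reverse=longest_first)
    return sum(order[:n_preempt])


runtimes = [5, 60, 600]  # made-up minutes of runtime for three running jobs

print(work_lost(runtimes, 1, longest_first=True, checkpointed=False))   # 600
print(work_lost(runtimes, 1, longest_first=False, checkpointed=False))  # 5
print(work_lost(runtimes, 1, longest_first=True, checkpointed=True))    # 0
```

Evicting the longest-running job costs next to nothing when it can
checkpoint, but for vanilla it is the most expensive choice available.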

Off the top of my head I can't think of any others. Really the key one
is that vanilla universe jobs have almost no constraints on their
behaviour relative to standard universe ones, but have considerably
coarser interaction with the startd/starter. If you can live with the
standard universe relinking and restrictions, it really seems to be the
way Condor is designed to be used (well, you can probably tell that
from the 'standard' name).

If you know up front that you will be using vanilla exclusively, I
strongly recommend checking beforehand that what you think will happen
to your jobs under heavy load and multiple users is what you actually
want...
If you are planning on using the machine RANK, then be prepared to
sacrifice throughput for latency. Until RANK preemption can be
controlled the way the other means of preemption can (any plans for
this, by the way?), it is always going to be a non-optimal way of
prioritizing resources without checkpointing.