
[HTCondor-users] Rounding of ImageSize (etc.) - why 25%?



Good morning/afternoon everyone,

last week I paid a visit to one of the darker corners of Condor
configuration... I had found that ImageSize values for jobs which
got held after overcommitting their requested memory would come
in multiples of 2500000 kB. (Note that this isn't close to a multiple
of 512 MB.)
At first I suspected some "quantize" issue, but the settings that use
quantize work with powers of two as well.
I then remembered that "raw" values would be rounded. I grepped
through the whole manual, to find exactly two lines which mention
'_RAW' - and there (in the SCHEDD_ROUND_ATTR_* paragraph) I found
that ImageSizes (and other values) would be rounded up by up to
25% of their current value.
This seems rather dangerous in a setup that allows a 2 GB margin
beyond the specified RequestMemory, which may go up to 16 GB:
2/16 = 12.5%, only half of the 25% rounding effect.
In other words, just the rounding may make a job appear to overstep
its declared requirements by more than the allowed margin!
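
To make the arithmetic concrete, here is a minimal Python sketch. It
assumes that a 25% setting rounds a raw value up to the next multiple of
25% of its order of magnitude, which would explain the observed multiples
of 2500000 kB; that rule is an assumption, not taken from the Condor
sources. The function name round_up_percent and the sample raw value are
hypothetical.

```python
def round_up_percent(value_kb: int, percent: int = 25) -> int:
    """Round value_kb up to the next multiple of percent% of its
    order of magnitude (assumed rule, see the text above)."""
    if value_kb <= 0:
        return value_kb
    # Order of magnitude: 10,000,000 for any 8-digit value, etc.
    magnitude = 10 ** (len(str(value_kb)) - 1)
    step = magnitude * percent // 100
    if step == 0:
        return value_kb  # value too small for this percentage to matter
    # Integer ceiling division, then scale back up to a multiple of step.
    return -(-value_kb // step) * step


KB_PER_GB = 1024 * 1024
request_kb = 16 * KB_PER_GB            # RequestMemory of 16 GB
limit_kb = request_kb + 2 * KB_PER_GB  # plus the 2 GB margin

raw_kb = 17_600_000                    # hypothetical raw size, inside the margin
rounded_kb = round_up_percent(raw_kb)

print("raw within limit:    ", raw_kb <= limit_kb)
print("rounded within limit:", rounded_kb <= limit_kb)
```

Under these assumptions the raw value stays inside the 18 GB limit, while
the rounded value lands on 20000000 kB and exceeds it.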

I think it would be safe to reduce this percentage to 10% or even
less - but before I do that I'd like to know why such a large value
was chosen in the first place.
What do other Condor pools use?
What would I lose or risk if I set the value to 0%? Is the rounding
there just to reduce the update traffic that tells Condor an image
size has grown again?

Cheers,
 Steffen

-- 
Steffen Grunewald * Cluster Admin * steffen.grunewald(*)aei.mpg.de
MPI f. Gravitationsphysik (AEI) * Am Mühlenberg 1, D-14476 Potsdam
http://www.aei.mpg.de/ * ------- * +49-331-567-{fon:7274,fax:7298}