
[HTCondor-users] ASSIGN_CPU_AFFINITY and Parallel Universe?



Everyone,

A while ago I set ASSIGN_CPU_AFFINITY = True, as recommended on the
HowToLimitCpuUsage wiki page.

Today, a user complained that his Parallel Universe jobs are "sometimes"
5 times slower than others.

A closer look showed that multiple nodes of the same job were indeed
sharing one CPU, causing a slowdown by a factor of up to 4.
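
A check along these lines can make such overlaps visible on Linux. It is
only a sketch: the function names are made up for illustration, and the
PIDs of the job's processes on one machine have to be supplied by hand.
It relies on os.sched_getaffinity from the Python standard library.

```python
import os
import sys
from collections import defaultdict


def find_shared_cpus(affinities):
    """Return {cpu: [pids]} for every CPU that more than one PID is allowed on.

    `affinities` maps a PID to the set of CPU ids in its affinity mask.
    """
    cpu_to_pids = defaultdict(list)
    for pid, cpus in affinities.items():
        for cpu in cpus:
            cpu_to_pids[cpu].append(pid)
    return {cpu: sorted(pids) for cpu, pids in cpu_to_pids.items() if len(pids) > 1}


def read_affinities(pids):
    """Read the affinity mask of each PID (Linux only); skip PIDs we cannot query."""
    result = {}
    for pid in pids:
        try:
            result[pid] = os.sched_getaffinity(pid)
        except (ProcessLookupError, PermissionError):
            continue
    return result


if __name__ == "__main__":
    # Hypothetical usage: pass the PIDs of the job's processes as arguments.
    job_pids = [int(arg) for arg in sys.argv[1:]]
    shared = find_shared_cpus(read_affinities(job_pids))
    for cpu, sharing in sorted(shared.items()):
        print(f"CPU {cpu} is shared by PIDs {sharing}")
```

If two processes of the same job are each pinned to a single CPU and
both show up under that CPU, they are competing for the same core.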

This didn't happen (or wasn't noticed) without that setting in place.

Is this a known side effect? I still have the feeling that ParUniv
jobs are not the most-loved children of Condor...

I'm also still looking for a way to match ParUniv jobs to as few nodes
as possible. Currently, the matchmaker selects the "most complete"
partitionable slots in the first round, but then looks for others.
How can I override this behaviour, for MPI jobs only, of course?

Thanks,
 Steffen

-- 
Steffen Grunewald * Cluster Admin * steffen.grunewald(*)aei.mpg.de
MPI f. Gravitationsphysik (AEI) * Am Mühlenberg 1, D-14476 Potsdam
http://www.aei.mpg.de/ * ------- * +49-331-567-{fon:7274,fax:7298}