
Re: [HTCondor-users] Condor to reboot a machine



Hi,
The 3.5% compute penalty for running inside a VM vs. on a bare-metal OS comes from my own study.
There are many qualifiers attached to this number.

Here are the most relevant details:
Hardware: Dell 1950, 4 dual CPUs, 32 GB RAM 
bare metal OS: Ubuntu 12.04
VM controller: OpenStack/Grizzly, installed using open source from devstack.org
VM guest system: Scientific Linux 6.4, 8 vCores, 6GB RAM

Work definition: a compute-oriented task. A loop calling trig functions, pow(), and random(), with NO disk I/O.
Each job allocates 300 MB of RAM and makes random reads/writes, which prevents any optimization or memory swapping.
Each job times itself to run for 8 minutes of wall-clock time and then quits.
The 'work output' is defined as the # of compute loop cycles per minute.
The number of such embarrassingly parallel jobs was gradually increased from 1 up to a few more than the # of cores on the blade (8).
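
For illustration only, a job of this kind could be sketched in Python roughly as below. This is not the code used in the study; the function name run_job, its parameters, and the exact arithmetic inside the loop are assumptions made for the sketch.

```python
import math
import random
import time


def run_job(duration_s=480.0, mem_bytes=300 * 1024 * 1024, seed=None):
    """Run a CPU-bound loop for duration_s seconds of wall-clock time.

    Returns the 'work output': compute loop cycles per minute.
    Hypothetical sketch, not the original benchmark.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    rng = random.Random(seed)
    buf = bytearray(mem_bytes)  # working memory, allocated up front
    cycles = 0
    acc = 0.0
    start = time.monotonic()
    while time.monotonic() - start < duration_s:
        x = rng.random()
        # trig functions and pow(), no disk I/O
        acc += math.sin(x) * math.cos(x) + math.pow(x, 1.5)
        # random read/write into the allocated buffer
        i = rng.randrange(mem_bytes)
        buf[i] = (buf[i] + 1) & 0xFF
        cycles += 1
    elapsed = time.monotonic() - start
    return cycles * 60.0 / elapsed
```

Calling run_job() with the defaults corresponds to one 8-minute job with 300 MB of memory; N parallel jobs would simply be N such processes started at the same time.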

The same test was run on 4 different blades and the aggregated work per wall time minute was the ultimate quantity of interest.

For jobs run on the bare-metal OS (Ubuntu), this work per minute increases linearly with the # of parallel jobs up to 8 jobs, then plateaus.
The 8-jobs-per-blade plateau is about 8% lower than the value extrapolated from the work delivered by a single job.
I speculate that this 8% loss is caused by competition between jobs for some common resource; in the end, pairs of cores are served by a single CPU.
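
To make the definition of this loss explicit, here is a small sketch of the arithmetic. The function name plateau_loss is hypothetical, and the rates in the example are made-up numbers chosen only so that the result reproduces the ~8% quoted above; they are not measured values.

```python
def plateau_loss(single_job_rate, n_jobs, aggregate_rate):
    """Fractional shortfall of the measured aggregate rate relative to
    ideal linear scaling extrapolated from a single job."""
    ideal = single_job_rate * n_jobs
    return 1.0 - aggregate_rate / ideal


# Illustrative numbers only (not measurements): one job delivers 1000
# cycles/min, 8 parallel jobs together deliver 7360 cycles/min.
loss = plateau_loss(single_job_rate=1000.0, n_jobs=8, aggregate_rate=7360.0)
print(f"loss vs. linear extrapolation: {loss:.1%}")
```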

Now come the losses due to virtualization.
The 8-vCore VMs were launched on the same 4 blades, exactly one VM per blade, with no other activity on the blades.

The pattern of aggregated compute work obtained from the whole cluster of four 8-vCore VMs is exactly the same, except that it plateaus 3.5% lower than when run on the bare hardware.
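
As a rough worked example, the two losses can be combined, assuming the 8% is relative to the linear extrapolation from a single bare-metal job and the 3.5% is relative to the bare-metal plateau. The helper name combined_fraction is hypothetical.

```python
def combined_fraction(*losses):
    """Fraction of ideal throughput left after applying each fractional
    loss in turn (losses compound multiplicatively)."""
    frac = 1.0
    for loss in losses:
        frac *= 1.0 - loss
    return frac


# 8% contention loss on bare metal, then 3.5% virtualization penalty.
vm_vs_ideal = combined_fraction(0.08, 0.035)
print(f"VM plateau relative to ideal linear scaling: {vm_vs_ideal:.2%}")
```

Under these assumptions the VM plateau sits at about 89% of the ideal linear extrapolation, i.e. roughly 11% below it.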
Thanks
Jan Balewski




On Jan 9, 2014, at 2:27 PM, Tiago Macarios <tiagomacarios@xxxxxxxxx> wrote:

> What are you using for the VM management? VMware, VirtualBox...?