Re: [HTCondor-users] Condor to reboot a machine



Tiago Macarios wrote:
> Yeah we know that. Problem is that we run intensive simulations that may
> take days/weeks to finish. The extra overhead of running a VM is really
> not desirable.

The overhead is about 2-3% in actual operation, maybe a little more,
maybe a little less depending on the nature of the jobs.

By comparison, a tiny error in a dual-boot configuration will render a
node unbootable. That reduces its processing capability by 100% until
an administrator fixes the problem.

Those are the two worst-case situations. Take your pick.
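
To put the two cases on the same scale, here is a rough sketch of the
break-even point: how many hours an unbootable node has to sit idle
before it has cost as much as the VM overhead over one job. The 2-3%
figure is from above; the 7- and 14-day job lengths and the function
name are assumptions chosen only for illustration.

```python
def break_even_downtime_hours(job_days, vm_overhead):
    """Hours of node downtime that cost as much compute time as
    running the whole job under a VM with the given overhead fraction."""
    return job_days * 24 * vm_overhead


# Hypothetical job lengths (days) against the 2-3% overhead estimate.
for days in (7, 14):
    for overhead in (0.02, 0.03):
        hours = break_even_downtime_hours(days, overhead)
        print(f"{days}-day job at {overhead:.0%} overhead: "
              f"break-even downtime {hours:.2f} h")
```

Under these assumptions, a node that stays down longer than the printed
figure before someone fixes it has lost more than the VM would have.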

I know how I'd go about implementing this dual-boot strategy but, as I
wrote before, I strongly recommend using the VM universe instead. Linux
boot-time device enumeration can be inconsistent and this inconsistency
will bite you.

-- 
Rich Pieri <ratinox@xxxxxxx>
MIT Laboratory for Nuclear Science