
Re: [HTCondor-users] Condor Win!



Hi Todd,

They were actually attempting to "hand schedule" 40 VMs (without the aid of Condor) when I started talking to them!  It seems there was a lot of contention (she reported exponentially increasing runtimes as she added more concurrent jobs), and the person setting up the VMs may not have understood the requirements of the workload.

So, to avert any issues outside of my direct control, I ran these jobs in the vanilla universe on physical desktop machines, all running 32-bit Windows 7.
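
In case it helps anyone picture it, here is a minimal sketch of a submit
description for a run like this (the wrapper name, input files, and R
invocation are illustrative assumptions, not the actual setup):

    # one cluster of 8,000 jobs; $(Process) runs 0..7999
    universe    = vanilla
    # thin wrapper that invokes Rscript (name illustrative)
    executable  = run_job.bat
    arguments   = $(Process)
    transfer_input_files    = analysis.R
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    # match the 32-bit Windows desktops
    requirements = (OpSys == "WINDOWS") && (Arch == "INTEL")
    request_cpus = 1
    log     = jobs.log
    output  = out.$(Process).txt
    error   = err.$(Process).txt
    queue 8000

Back-of-envelope math on why this mattered: 8,000 jobs x 12 hours is about
96,000 core-hours, or roughly 11 years on a single core; spread across
~1,100 concurrent slots it works out to about 87 hours of wall clock.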

Thanks,
Eddie

-----Original Message-----
From: htcondor-users-bounces@xxxxxxxxxxx [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum
Sent: Wednesday, June 26, 2013 1:15 PM
To: HTCondor-Users Mail List
Subject: Re: [HTCondor-users] Condor Win!

On 6/25/2013 9:57 AM, Dunn, George Jr wrote:
> Hi All,
>
> Just wanted to share a little success story.
>

Hi Eddie -

Thank you for sharing!  The folks who work on HTCondor (like me) really appreciate the kind words and your taking the time to post.

Curious... were you submitting VM universe jobs?  Or vanilla universe jobs to HTCondor execute nodes running inside VMs (i.e., running the condor_startd inside a VM)?
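
For anyone reading along, the distinction shows up right in the submit
description: a vm universe job declares the virtual machine image itself
as the job, something like the sketch below (values are illustrative, and
the exact vm_disk syntax varies by HTCondor version), whereas in the
second setup the condor_startd runs inside the guest and the jobs are
ordinary vanilla universe jobs.

    # vm universe: the VM image itself is the job (illustrative values)
    universe  = vm
    vm_type   = kvm
    vm_memory = 1024
    vm_disk   = disk_image.qcow2:vda:w
    queue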

Thanks and welcome to the HTCondor community,
Todd


> We are a small public university that has been hit hard in recent years 
> in both human and budget resources (as many schools have). We do have a 
> research computing group, but overall our IT is stretched so thin that 
> almost all of those folks are on fire duty. A group in our Statistics 
> department was up against a runtime wall trying to run some R jobs in 
> VMs. There were a total of 8,000 jobs that each took 12 hours to 
> complete (with unfettered access to a single core). By marshaling a few 
> of our labs we were able to run over 1,100 simultaneous jobs on roughly 
> 175 machines (mostly i7s, a few i5s) and finish IN TIME, even through 
> several electrical storms that rebooted all the machines and caused all 
> manner of networking errors! She is going to add me to the author list 
> for the paper and we will put in a blurb about Condor.
>
> I hope this will be a proof of concept for a larger roll-out. I also 
> plan to replace our Torque scheduler for ROCKS with HTCondor for our 
> MPI jobs.
>
> Thank you for such a polished and stable product. I am a fan for life!!!
>
> Thanks,
>
> Eddie Dunn
>
> Systems Administrator
>
> Department of Computer Science
>
> University of North Carolina Wilmington
>
>


--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing   Department of Computer Sciences
HTCondor Technical Lead                1210 W. Dayton St. Rm #4257

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/