
Re: [HTCondor-users] Cluster Multithreading



John and David,

Thanks for your reply.
I am working on an optimization task.
The solver (IPOPT) can use OpenBLAS and the MA86 sparse linear solver, both of which can multithread and both of which rely on OpenMP to do so.
I was hoping that if I could present a single image of the cluster to the solver, I could make efficient use of the resources available in the cluster at my university.
To be honest, I am not a computer programmer and I don't have enough knowledge to write a script that hooks into the solver's internal process to spawn tasks in the cluster. So I was curious whether such a thing is possible without much difficulty, and whether some existing software already addresses it.


On Mon, May 20, 2013 at 8:08 AM, John Lambert <john.lambert@xxxxxxxxxxx> wrote:
Mostafa-

You might also be interested in OpenMosix. SSI (Single System Image) clustering is something that *can* be successful for certain problems, but you often need a batch scheduler like Condor to get the fine-grained control and parameterization that almost all workflows need. OpenMosix is a Linux kernel module, so it will be limited to that platform. The MPI implementations are good if you have new code to write, or if you have the source and the skill to port existing code to them.

I'd really suggest thinking specifically about what you need to accomplish. It's likely that either Condor or OpenMPI will be able to accomplish it, but sadly, it won't be a turnkey solution.

Thanks,
John Lambert


On Mon, May 20, 2013 at 8:01 AM, David Hentchel <dhentchel@xxxxxxxxx> wrote:
I believe the capability you want is called MPI (Message Passing Interface). This is an open standard API, with official C and Fortran bindings, that lets you split processing across different processes, much the way current languages do across threads. Just search the Condor documentation (or google "condor mpi") and you'll find lots of information.

I'm one of those Linux guys, but I have no reason to think there'd be any problem with Windows. 

By default, Condor schedules and manages down to the CPU core level, each of which is called a "Slot".  Just go through the tutorial to set up Condor and you'll see these in the condor_status output.  So, for example, my machines have 8 cores each and if I don't put specific restrictions in the job class ad, Condor will run up to 8 concurrent programs on each host.
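Given the slot model described above, a multithreaded (OpenMP) program fits on one host only if the job asks for several cores at once. A minimal sketch of how such a request could be written follows. It only builds the text of a submit description file. The executable name `run_ipopt.sh` and the file names `job.out`, `job.err`, and `job.log` are hypothetical, and whether a job asking for more than one core ever matches depends on how the pool's slots are configured, which is an assumption here.

```python
def render_submit_file(executable: str, cpus: int) -> str:
    """Return the text of a submit description asking for `cpus` cores.

    `request_cpus` and `environment` are standard submit commands;
    everything else (file names, executable) is a placeholder.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # Ask the scheduler for several cores on a single machine.
        f"request_cpus = {cpus}",
        # Tell OpenMP-based libraries to use the same number of threads.
        f'environment = "OMP_NUM_THREADS={cpus}"',
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical wrapper script that launches the solver.
submit_text = render_submit_file("run_ipopt.sh", 4)
print(submit_text)
```

The printed text could be saved to a file and passed to condor_submit. Note that this keeps all threads on one machine; it does not make cores on other hosts visible to the solver.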

What I can't say is whether there is any attempt at "isolation", i.e., if a Condor job is assigned to Slot #3, can we be sure it's only using one particular core on the machine, or is that left up to the operating system?  Modern programs typically use threading libraries that spawn lots of threads, and the OS (Windows and Linux) generally tries to use as many of the CPU cores as it can get access to.  The internals to prevent this exist deep in the operating system, but for general-purpose computing it almost always turns out to be a bad idea to try to use them (i.e. the OS scheduler tends to make better decisions than a human programmer would).
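One way to explore the isolation question is to ask, from inside a running job, how many cores the operating system will actually let the process use. The sketch below is a rough probe, not a statement about what Condor does: `os.sched_getaffinity` exists only on some platforms (Linux among them), so the code falls back to the machine's total core count elsewhere, and the helper name `usable_cores` is made up for illustration.

```python
import os


def usable_cores() -> int:
    """Number of CPU cores this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: the set of core IDs the current process may be scheduled on.
        return len(os.sched_getaffinity(0))
    # Other platforms: no affinity query here, report all cores instead.
    return os.cpu_count() or 1


print("cores visible to this process:", usable_cores())
```

If a job started in a single slot reports every core on the host, then nothing is pinning it and the division of cores is left to the OS scheduler.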

My final caution is that MPI programming (actually multi-threaded, multi-host applications in general) can be very difficult to get working correctly, because there are unpredictable timing issues as the threads attempt to work together.  If you haven't done much coding with this paradigm, you need to make sure that the problem you are solving really can be split up into individual pieces that can run independently and then somehow collect together the final result you need.
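The "split into independent pieces, then collect the result" test mentioned above can be tried on a toy problem before any cluster is involved. The following is a minimal sketch, assuming a sum of squares as a stand-in for the real work; all function names are illustrative. It runs the pieces one after another with the built-in `map`, so it shows only the decomposition, not the parallel execution.

```python
from typing import Callable, List, Sequence


def split(items: Sequence[float], n_pieces: int) -> List[Sequence[float]]:
    """Cut items into n_pieces contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), n_pieces)
    chunks, start = [], 0
    for i in range(n_pieces):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk: Sequence[float]) -> float:
    """Work on one chunk; it needs nothing from any other chunk."""
    return sum(x * x for x in chunk)


def sum_of_squares(items: Sequence[float], n_pieces: int,
                   run_all: Callable = map) -> float:
    """Split, run every piece independently, then combine the partials."""
    partials = run_all(partial_sum_of_squares, split(items, n_pieces))
    return sum(partials)


print(sum_of_squares([1, 2, 3, 4, 5], 2))
```

Because each chunk is processed without reference to the others, `run_all` could later be replaced by something that hands the chunks to separate processes or separate jobs. A problem whose pieces need each other's intermediate values does not fit this pattern, and that is where the timing issues described above come in.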


dave


On Sun, May 19, 2013 at 6:04 AM, Mostafa.B <bakhtvar@xxxxxxxxx> wrote:
Hi,

I am interested in running a task on my PC and letting it use the threads available in the cluster. I mean that the CPU cores available in the cluster would appear as cores available on my PC, from the point of view of the task that I am going to run on my PC!
(I don't know whether this is called cluster multithreading or not!)
Is such a thing possible with Condor?
If yes:
1. Is it also available on Windows, or is this another one of those amazing capabilities that are only possible with Linux?
2. How is this accomplished?

Any other suggestions?

Regards

-Mosy

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/



--

David Hentchel

Performance Engineer

www.nuodb.com

(617) 803 - 1193

