Re: [Condor-users] MATLAB Distributed Computing Server and Condor
- Date: Thu, 8 Sep 2011 17:34:01 -0700
- From: Mark Cafaro <cafarm@xxxxxx>
- Subject: Re: [Condor-users] MATLAB Distributed Computing Server and Condor
Thank you for the link, Todd, and for all the information, Jon.
I have not set up MDCS yet, but as I imagine it, the benefit is that the end user does not
have to put much thought into parallelism. Using Condor directly, without MDCS, you have to
consider how to split your M-files into self-contained jobs.
With MDCS you can simply change, for instance, a for-loop into a parfor-loop and MDCS will
handle the parallelism. This is my hope, at least.
Maybe Jon can shed some light on this.
I am sure most of you already know this, but just in case ...
You do not have to use MDCS to enable Matlab programs to harness the CPU power of your Condor pool. You can just run many individual Matlab invocations in parallel by submitting them to the vanilla universe just like any other job. The only real non-obvious hurdle is how to deal with licenses. See
for some specific notes related to using Condor and Matlab together. Jon, if you are interested/willing to improve the above HowTo (to include MDCS wisdom, perhaps?), that'd be awesome.
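As a rough illustration of the vanilla-universe approach described above, the following Python sketch builds a submit description that queues several independent Matlab invocations. The Matlab path, the function name my_task, and the output file names are hypothetical placeholders, not values from this thread. Licensing is not handled here.

```python
def matlab_submit_description(matlab_path, function_name, n_jobs):
    """Build a vanilla-universe submit description for n_jobs Matlab runs.

    Each job calls function_name(<process number>) and then exits.
    matlab_path and function_name are illustrative placeholders.
    """
    # Condor expands $(Process) to 0, 1, ..., n_jobs - 1, one value per job.
    matlab_cmd = f"{function_name}($(Process)); exit"
    lines = [
        "universe = vanilla",
        f"executable = {matlab_path}",
        # Double quotes select Condor's newer argument syntax; the single
        # quotes keep the Matlab statement together as one argument.
        f"arguments = \"-nodisplay -nosplash -r '{matlab_cmd}'\"",
        "output = matlab_$(Process).out",
        "error = matlab_$(Process).err",
        "log = matlab.log",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values; write the result to a file and pass it to condor_submit.
    print(matlab_submit_description("/usr/local/bin/matlab", "my_task", 4))
```

The M-file behind my_task has to be self-contained and take the process number as its only input, which is the manual splitting of work that MDCS is meant to hide.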
Jonathan D. Proulx wrote:
On Thu, Sep 08, 2011 at 01:20:52PM -0700, Mark Cafaro wrote:
:There was a brief thread about this in 2008, but times have changed and so has the MATLAB Distributed Computing Server (MDCS).
:Has anyone had success using MDCS on top of Condor through the generic scheduler option? We are currently evaluating a trial version of MDCS to see if this possibility exists.
:I have noticed a Condor-specific SubmitFcn in the MDCS examples, so it appears some customers have been exploring this option as well.
A few years back Mathworks wrote a SubmitFcn for us, which is probably
the basis for that example. It does work but is fragile compared to
most Condor things.
If any job fails or is interrupted, Matlab will just hang forever
waiting for it.
If any of the jobs can't get a DCE license, it will fail and Matlab
will hang forever waiting for it. This means you need to use resource
limits in Condor and configure your submit function to require that
limit. It also means that if you have systems outside Condor using the
same DCE license pool, it's very likely things will occasionally break
in weird ways.
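One way to express the license constraint described above is Condor's concurrency limits. This is only a sketch, and it assumes that "resource limits" here means concurrency limits. The limit name matlab_dce and the license count are hypothetical. The pool configuration defines a <NAME>_LIMIT entry, and each job that consumes a DCE license names that limit in its submit description.

```python
def license_limit_config(limit_name, license_count):
    """Return the pool configuration line that caps concurrent users of limit_name."""
    return f"{limit_name.upper()}_LIMIT = {license_count}"


def license_limit_submit_line(limit_name, units=1):
    """Return the submit-description line that requests units of limit_name per job."""
    if units == 1:
        return f"concurrency_limits = {limit_name}"
    # The name:units form asks for more than one unit of the limit per job.
    return f"concurrency_limits = {limit_name}:{units}"


if __name__ == "__main__":
    # Hypothetical limit name and count; use the number of DCE licenses you own.
    print(license_limit_config("matlab_dce", 16))
    print(license_limit_submit_line("matlab_dce"))
```

Condor only counts jobs it started itself, so a limit like this does not protect against license use from outside the pool.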
We also have a situation where, on apparently identical systems within
our pool, some systems deterministically segfault when DCE tries to
start a Matlab process, though running Matlab by hand (with the same
command line DCE uses) succeeds. This turned up about a year ago and
Mathworks assigned an engineer to look into it, but has yet to come to
a determination. We work around this with a negative "requirements"
expression in the submit function that excludes, by name, the nodes
that have displayed this bug.
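The workaround above could be sketched as follows. This is not the actual submit function from this thread; the helper name and host names are hypothetical. It builds a requirements expression that rejects each listed machine by its Machine ClassAd attribute.

```python
def exclude_machines_requirement(bad_hosts):
    """Build a submit-description requirements line that rejects the named machines."""
    if not bad_hosts:
        # Nothing to exclude, so accept any machine.
        return "requirements = True"
    # Machine is the ClassAd attribute that holds a node's host name.
    clauses = [f'(Machine != "{host}")' for host in bad_hosts]
    return "requirements = " + " && ".join(clauses)


if __name__ == "__main__":
    # Hypothetical host names standing in for the nodes that show the segfault.
    print(exclude_machines_requirement(["node17.example.edu", "node23.example.edu"]))
```

Keeping the list of bad hosts in one place makes it easy to drop a node from the exclusion once the underlying problem is resolved.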
So yes, you "can".