
Re: [Condor-users] Matlab Compiler and Condor: Shadow Exception




Tobias,

Is it possible that Matlab is installed on your central manager (the machine where the jobs run successfully) but isn't installed on any of the execute nodes?

If this is the case, you need to install the Matlab Component Runtime (MCR) in order to get Matlab jobs running over the Condor grid.  Mathworks say that the Matlab Compiler compiles .m files so they are standalone.  This is _not_ the case.  The jobs actually require the installation of a 120 MB+ runtime in order to execute.  Some would say this is still standalone; I, on the other hand, would not.

Anyway, browse into your toolboxes folder and you'll see a Matlab Compiler folder.  Somewhere in there (I'm not on a Windows box atm) you will see a 120 MB+ executable called MCRInstall or something like that.  Install that on all of your execute nodes.  This will solve half of your problem.  You then need to ensure that the account that Condor executes under can actually see these libraries when it attempts to execute the job.

The easiest way to do this, I have found, is to make a .bat file and use that as the executable, rather than putting executable = your_executable.exe in your submit file.  In this .bat file put

SET PATH=c:/path/to/MCR/runtime;%PATH%
SET PATH=c:/path/to/x64/runtime;%PATH%

your_executable.exe %1 %2 %3

(Note: the second PATH statement is only relevant if you're installing on an x64 machine, where the runtime will of course install into a different default directory, with (x86) appended to the Program Files dir.  The %1 %2 %3 are only relevant if you are passing in arguments from the submit script; as written, this would accept three arguments.)

Your submit script will now have executable = batch.bat, and inside batch.bat will be something like that above.
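
As a rough sketch only, a submit file along these lines might look like the following.  The file names (job.out, job.err, job.log), the three argument values, and the list of transferred files are placeholders I've made up for illustration, so adjust them to whatever your job actually uses:

```
# Sketch of a submit description file; all file names below are placeholders.
universe                = vanilla

# Run the wrapper batch file rather than the compiled Matlab executable.
executable              = batch.bat

# These three values arrive in batch.bat as %1 %2 %3.
arguments               = arg1 arg2 arg3

# The real executable and its input must travel with the job, since
# batch.bat is the only file Condor sends automatically as the executable.
# Add any other files your compiled program needs to this list.
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = your_executable.exe, input.mat

output                  = job.out
error                   = job.err
log                     = job.log

queue
```

The point of listing your_executable.exe under transfer_input_files is that Condor now treats batch.bat as the program, so it has no other way of knowing the compiled executable has to be copied to the execute node as well.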

Please do let me know if this works.  I had serious difficulties solving this problem.

Kind Regards,

Shaun





-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx on behalf of Tobias Pingel
Sent: Tue 12/19/2006 10:01 AM
To: condor-users@xxxxxxxxxxx
Subject: [Condor-users] Matlab Compiler and Condor: Shadow Exception

Hi there,



Actually I'm trying to combine MATLAB with Condor and it's nearly working now. There is only one thing I couldn't solve:



If I distribute the compiled MATLAB *.exe file, none of the nodes except the central manager can compute or return their results. The job log file for a job sent to a remote node contains the following entry (there is no entry in *.err or *.out):



001 (019.000.000) 12/19 10:26:25 Job executing on host: <192.168.0.8:1029>
...
007 (019.000.000) 12/19 10:26:25 Shadow exception!
            Can no longer talk to condor_starter on execute machine (192.168.0.8)
            0  -  Run Bytes Sent By Job
            46163  -  Run Bytes Received By Job



All other, non-MATLAB jobs (e.g. helloWorld.exe) run perfectly! And the MATLAB *.exe itself also runs without any problem on the remote nodes when started manually, without Condor.



How I performed it:



Central Manager: WinXP (WinNT51), shared MATLAB directory (for the Clients)



Remote Nodes: Win2000 (WinNT50), network access to shared MATLAB directory.



  1. compile the files, create input.mat
  2. submit the files
  3. the central manager computes its jobs without any exceptions
  4. on the remote nodes, the job state returns to idle after a few seconds
  5. after that (in the normal case) collect the output.mat files and display them in MATLAB


I don't know whether the problem is that the compiled MATLAB job needs to extract an archive first. This takes a while, and only then does the computation start.



I hope that some of you have solved such problems before and can give me the solution ;-)



All the Best,

Tobias