
Re: [Condor-users] anyone with experience with matlab


Are you referring to the first set of code? Would I just enter condor_wait in the submit function? Where exactly and how would that be used?


Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jones, Torrin A (US SSA)
Sent: Thursday, August 02, 2007 5:16 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab


Sorry, no experience with matlab, but let me ask a question anyway.


In your waitForState function is it actually waiting for the job to finish?  Or is it waiting for the 4 condor_submit commands that you run to finish?  I ask because you need to use the condor_wait command to wait for a job to finish and I'm not sure if that's what's being done here.
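As a minimal sketch of how condor_wait could be called from MATLAB: the helper names waitForCondorLogs and buildCondorWaitCommand below are hypothetical, and the sketch assumes that condor_wait is on the system path and that logFiles is a cell array of Condor user-log paths like the one built in the submit function. The -wait option sets a timeout in seconds; condor_wait exits with a nonzero status if the timeout expires or an error occurs.

```matlab
function ok = waitForCondorLogs(logFiles, timeoutSec)
% Block until the job recorded in each Condor user log has finished.
%   logFiles   : cell array of log file paths (assumed, as in submitfcn)
%   timeoutSec : timeout in seconds passed to condor_wait's -wait option
ok = true;
for i = 1:numel(logFiles)
    cmd = buildCondorWaitCommand(logFiles{i}, timeoutSec);
    [s, w] = system(cmd);
    if s ~= 0
        % Nonzero exit status: condor_wait timed out or reported an error.
        warning('condorwait:notFinished', ...
                'condor_wait did not succeed for %s:\n%s', logFiles{i}, w);
        ok = false;
    end
end

function cmd = buildCondorWaitCommand(logFile, timeoutSec)
% Quote the log path so that paths containing spaces survive the shell.
cmd = sprintf('condor_wait -wait %d "%s"', timeoutSec, logFile);
```

Such a call would probably belong in trial.m, after the job has been submitted and before the results are read, rather than inside the submit function itself, since blocking there would delay the return from submission.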

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Richardson, Joshua
Sent: Thursday, August 02, 2007 13:32
To: Condor-Users Mail List
Subject: [Condor-users] anyone with experience with matlab

I have been trying to set up Condor with MATLAB for quite a while. I have simplified the MATLAB code that I am trying to run and am still having difficulty getting the results. It seems as if Condor is returning results before MATLAB has a chance to finish the tasks. I am using the MATLAB Distributed Computing Engine, but I am asking for help from anyone who is familiar with MATLAB. I am going to paste my code below and was wondering if someone could take a look at it and see whether there are any glaring mistakes that my team and I cannot figure out.


First, I have an M-file named trial.m. This is the code that I want to run and get results from.


function trial()

jm = findResource('scheduler','configuration', 'generic');

job1 = createJob(jm);
createTask(job1, @sum, 1, {[1 1]});
createTask(job1, @sum, 1, {[2 2]});
createTask(job1, @sum, 1, {[3 3]});
createTask(job1, @sum, 1, {[4 4]});

waitForState(job1, 'finished', 60)
results = getAllOutputArguments(job1)



Next is my submit function in MATLAB:


function submitfcn(scheduler, job, props, extraCondorSubmitArgs) %#ok Not using job
%SUBMITFCN Submit a Matlab job to a Condor scheduler

% See also workerDecodeFunc.

% Assign the relevant values to environment variables, starting
% with identifying the decode function to be run by the worker:
decodeFcn = 'workerDecodeFunc';
if nargin < 4
    extraCondorSubmitArgs = '';
end

% Ask the workers to print debug messages by default by setting MDCE_DEBUG to
% true.
jobEnvVars = {'MDCE_DECODE_FUNCTION',decodeFcn, ...
              ...%' MDCE_STORAGE_LOCATION',props.StorageLocation, ...
              ' MDCE_STORAGE_CONSTRUCTOR',props.StorageConstructor, ...
              ' MDCE_JOB_LOCATION',props.JobLocation, ...
              ' MDCE_DEBUG','true'};
taskEnvVars = cell(1, numel(props.TaskLocations));
for i = 1:numel(props.TaskLocations)
    taskEnvVars{i} = {'MDCE_TASK_LOCATION', props.TaskLocations{i}};
end

if isempty(scheduler.ClusterMatlabRoot)
    warning('distcomp:condor:NoClusterMatlabRoot', ...
            ['The scheduler''s ClusterMatlabRoot property is empty.\n', ...
             'Using  matlabroot  instead.']);
    clusterMatlabRoot = matlabroot;
else
    clusterMatlabRoot = scheduler.ClusterMatlabRoot;
end

matlabScript = fullfile(clusterMatlabRoot, 'bin', 'matlab');
% ... Do we need the following ??? ...
if ispc
    matlabScript = [matlabScript, '.bat'];
end

matlabArgs = strrep(scheduler.matlabCommandToRun, 'matlab ', '');

% Determine where to save the standard output, standard error and the
% Condor log.
logFiles = cell(1, props.NumberOfTasks);
outFiles = cell(1, props.NumberOfTasks);
errFiles = cell(1, props.NumberOfTasks);
for i = 1:props.NumberOfTasks
    taskLoc = fullfile(scheduler.DataLocation, props.TaskLocations{i});
    logFiles{i} = [taskLoc, '.log'];
    outFiles{i} = [taskLoc, '.out'];
    errFiles{i} = [taskLoc, '.err'];
end

% Create one condor submit file for all the tasks.
script = createCondorSubmitScript(matlabScript, matlabArgs, ...
                                  jobEnvVars, taskEnvVars, ...
                                  errFiles, outFiles, logFiles);

% Submit a Condor job that executes all the tasks:
% [pathstr, name, ext, versn] = fileparts(script);
% script2 = name;
% Execute the submit command on the remote host.
% copyfile(script, '.')
condorSubmitCommand = ['condor_submit ', script, ' ', extraCondorSubmitArgs];
[s, w] = system(condorSubmitCommand);

% Leave behind the necessary debugging information if the submission failed.
if s ~= 0
    warning('distcomp:condor:SubmitFailed', ...
            ['Call to condor_submit failed with the following message:\n\n', ...
             '    %s\n\n', ...
             'The submit command used was:\n\n    %s\n\n', ...
             'Not deleting the submission file %s.'], ...
             w, condorSubmitCommand, script);
else
    % Display the Condor job number:

    % Clean up:
    % delete(script2);
end



function filename = createCondorSubmitScript(matlabScript, matlabArgs, jobEnvVars, taskEnvVars, errFiles, outFiles, logFiles)
%Create a Condor submit script that forwards the correct environment variables
%and executes Matlab.

% We assume that the decode function has been put on the path of the MATLAB
% workers, e.g. by putting it into $MATLABROOT/toolbox/local.

% Double all backslashes so fprintf prints out a single backslash.
matlabScript = strrep(matlabScript, '\', '\\');
matlabArgs = strrep(matlabArgs, '\', '\\');
jobEnvVars = strrep(jobEnvVars, '\', '\\');
for i = 1:numel(taskEnvVars)
    taskEnvVars{i} = strrep(taskEnvVars{i}, '\', '\\');
end
outFiles = strrep(outFiles, '\', '\\');
errFiles = strrep(errFiles, '\', '\\');
logFiles = strrep(logFiles, '\', '\\');

condorHeader = [ 'Universe            = vanilla\n', ...
                 'Executable          = condor_exec.bat \n',... %s\n', ...
                 'Transfer_Executable = true\n', ...
                 'Requirements        = (machine == "condor01.integrity-apps.com")\n'];


taskString   = ['Arguments            = %s\n', ...
                'Environment          = "%s"\n', ...
                ...%'input                = matlab_metadata.mat \n', ...
                'Error                = %s\n', ...
                'Output               = %s\n', ...
                'Log                  = %s\n', ...
                'should_transfer_files = YES \n',...
                'transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat, job1.state.mat, job1.out.mat \n', ...
                'when_to_transfer_output = ON_EXIT \n',...
                'notify_user          = jrichardson@xxxxxxxxxxxxxxxxxx \n',...
                'Queue\n'];


filename = tempname;
fid = fopen(filename, 'wt');
fprintf(fid, condorHeader, matlabScript);

for i = 1:numel(taskEnvVars)
    % Create a cell-array of all the environment variables we want to set
    % for the current task, and transform it into a string for the Condor
    % script.
    envString = createCondorEnvString({jobEnvVars{:}, taskEnvVars{i}{:}});
    % Append a clause to the Condor script to queue the current task.
    fprintf(fid, taskString, matlabArgs, envString, errFiles{i}, outFiles{i}, logFiles{i});
end
% Close the file so that condor_submit reads the complete script.
fclose(fid);




function envString = createCondorEnvString(envVars)
%envStr = createCondorEnvString(envVars)
%  envVars should be a cell array of even length.  The odd entries are
%  the environment variable names, the even entries are their values.

% In Condor, environment variables are specified in UNIX as
%  Environment = var1=val1;var2=val2;...varn=valn
% and on Windows, the separator is '|' instead of ';', i.e. the format is
%  Environment = var1=val1|var2=val2|...varn=valn
% (The ispc branch below uses spaces instead, which matches the
% double-quoted Environment = "%s" line in createCondorSubmitScript.)

if ispc
    envSep = '  ';
else
    envSep = ';';
end

envString = '';
for i = 1:2:numel(envVars)
    envString = [envString, envVars{i}, '=', envVars{i + 1}, envSep];
end
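
To make the pairing of names and values concrete, here is a small standalone sketch of the same joining loop. The function name joinEnvPairs and the explicit envSep argument are hypothetical, introduced only for illustration; the loop body mirrors the one in createCondorEnvString.

```matlab
function envString = joinEnvPairs(envVars, envSep)
% Join a flat cell array {name1, value1, name2, value2, ...} into
% name1=value1<sep>name2=value2<sep>...
%   envVars : cell array of even length, names at odd indices
%   envSep  : separator appended after every name=value pair
envString = '';
for i = 1:2:numel(envVars)
    % envVars{i} is the variable name, envVars{i + 1} is its value.
    envString = [envString, envVars{i}, '=', envVars{i + 1}, envSep]; %#ok<AGROW>
end
```

For example, joinEnvPairs({'MDCE_DEBUG', 'true', 'MDCE_DECODE_FUNCTION', 'workerDecodeFunc'}, ';') returns 'MDCE_DEBUG=true;MDCE_DECODE_FUNCTION=workerDecodeFunc;'. Note that a separator also follows the last pair, as in the function above.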





Next is the executable that is passed to Condor. There might be a problem here: it calls worker.bat, which starts matlab.bat, which in turn starts MATLAB. My argument is worker.bat.


@echo off








echo %DL%












Any suggestions will be greatly appreciated. Currently, I receive a message saying that the job is completed while tasks are still running, and the result I get is an empty 4 x 0 array.

Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632