Re: [Condor-users] anyone with experience with matlab

Yes, you need to call condor_wait for every log file.
Or you can make every log point to the same file.  Then you only have to call condor_wait once.  I'm not sure how safe this is, but it seems to work for me.
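
Both options can be sketched in MATLAB. The helper below is hypothetical: waitForCondorLogs and its runCommand hook are not part of Condor or MATLAB. It assumes condor_wait is on the system path and that logFiles holds the paths from the Log = lines of the submit file. It waits once per distinct log, so if every task shares one log, condor_wait runs only once.

```matlab
function statuses = waitForCondorLogs(logFiles, runCommand)
%WAITFORCONDORLOGS Run condor_wait once for each distinct Condor log file.
%   Hypothetical helper, not part of Condor or MATLAB.
%   logFiles   - cell array of paths taken from the Log = lines
%   runCommand - optional handle that runs a command string and returns
%                its exit status; defaults to @system
if nargin < 2
    runCommand = @system;
end

% Tasks that share a log file need only one condor_wait call.
logFiles = unique(logFiles);

statuses = zeros(1, numel(logFiles));
for i = 1:numel(logFiles)
    % Quote the path so that directory names containing spaces still work.
    cmd = sprintf('condor_wait "%s"', logFiles{i});
    % condor_wait blocks until every job recorded in this log has finished.
    statuses(i) = runCommand(cmd);
end
```

A call might look like statuses = waitForCondorLogs({'Task1.log', 'Task2.log'}); a nonzero entry means condor_wait reported a problem for that log.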
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Richardson, Joshua
Sent: Friday, August 03, 2007 08:11
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab

What I was saying though was that my output was returned to me and it was still frozen. Maybe I just did not wait long enough?


My submit file looks like this:



Universe            = vanilla
Executable          = condor_exec.bat
Transfer_Executable = true
Requirements        = (machine == "condor01.integrity-apps.com")

Arguments            = worker.bat
Environment          = "MDCE_DECODE_FUNCTION=workerDecodeFunc   MDCE_STORAGE_CONSTRUCTOR=makeFileStorageObject   MDCE_JOB_LOCATION=Job1   MDCE_DEBUG=true  MDCE_TASK_LOCATION=Job1/Task1  "
Error                = H:\\MatLab_from_Wade\\MTI\\Job1\\Task1.err
Output               = H:\\MatLab_from_Wade\\MTI\\Job1\\Task1.out
Log                  = H:\\MatLab_from_Wade\\MTI\\Job1\\Task1.log
should_transfer_files = YES
transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat, job1.state.mat, job1.out.mat
when_to_transfer_output = ON_EXIT
notify_user          = jrichardson@xxxxxxxxxxxxxxxxxx
Queue

Arguments            = worker.bat
Environment          = "MDCE_DECODE_FUNCTION=workerDecodeFunc   MDCE_STORAGE_CONSTRUCTOR=makeFileStorageObject   MDCE_JOB_LOCATION=Job1   MDCE_DEBUG=true  MDCE_TASK_LOCATION=Job1/Task2  "
Error                = H:\\MatLab_from_Wade\\MTI\\Job1\\Task2.err
Output               = H:\\MatLab_from_Wade\\MTI\\Job1\\Task2.out
Log                  = H:\\MatLab_from_Wade\\MTI\\Job1\\Task2.log
should_transfer_files = YES
transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat, job1.state.mat, job1.out.mat
when_to_transfer_output = ON_EXIT
notify_user          = jrichardson@xxxxxxxxxxxxxxxxxx
Queue

Arguments            = worker.bat
Environment          = "MDCE_DECODE_FUNCTION=workerDecodeFunc   MDCE_STORAGE_CONSTRUCTOR=makeFileStorageObject   MDCE_JOB_LOCATION=Job1   MDCE_DEBUG=true  MDCE_TASK_LOCATION=Job1/Task3  "
Error                = H:\\MatLab_from_Wade\\MTI\\Job1\\Task3.err
Output               = H:\\MatLab_from_Wade\\MTI\\Job1\\Task3.out
Log                  = H:\\MatLab_from_Wade\\MTI\\Job1\\Task3.log
should_transfer_files = YES
transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat, job1.state.mat, job1.out.mat
when_to_transfer_output = ON_EXIT
notify_user          = jrichardson@xxxxxxxxxxxxxxxxxx
Queue

Arguments            = worker.bat
Environment          = "MDCE_DECODE_FUNCTION=workerDecodeFunc   MDCE_STORAGE_CONSTRUCTOR=makeFileStorageObject   MDCE_JOB_LOCATION=Job1   MDCE_DEBUG=true  MDCE_TASK_LOCATION=Job1/Task4  "
Error                = H:\\MatLab_from_Wade\\MTI\\Job1\\Task4.err
Output               = H:\\MatLab_from_Wade\\MTI\\Job1\\Task4.out
Log                  = H:\\MatLab_from_Wade\\MTI\\Job1\\Task4.log
should_transfer_files = YES
transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat, job1.state.mat, job1.out.mat
when_to_transfer_output = ON_EXIT
notify_user          = jrichardson@xxxxxxxxxxxxxxxxxx
Queue






So how would I handle this? Would I call condor_wait for each submission?


Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jones, Torrin A (US SSA)
Sent: Friday, August 03, 2007 10:53 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab


That's what condor_wait does.  It "freezes" until the job is done.


Take a look at the help for condor_wait.
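
If blocking indefinitely is a concern, condor_wait accepts a -wait option that limits how long it blocks, and it exits with status 0 only when the jobs in the log have completed or been aborted. A minimal MATLAB sketch, assuming condor_wait is on the system path; the function name condorJobsDone and its runCommand hook are made up for illustration:

```matlab
function done = condorJobsDone(logFile, timeoutSeconds, runCommand)
%CONDORJOBSDONE Wait at most timeoutSeconds for the jobs in a Condor log.
%   Hypothetical helper, not part of Condor or MATLAB.
%   runCommand is an optional handle that runs a command string and
%   returns its exit status; it defaults to @system.
if nargin < 3
    runCommand = @system;
end

% -wait bounds the blocking time; without it condor_wait blocks until
% every job in the log has finished.
cmd = sprintf('condor_wait -wait %d "%s"', timeoutSeconds, logFile);
status = runCommand(cmd);

% A nonzero status covers a timeout as well as errors such as a missing
% log file, so treat only 0 as "all jobs done".
done = (status == 0);
```

Called as condorJobsDone('dir.log', 60), this returns false if the jobs are still running after a minute, which makes the "frozen" prompt distinguishable from a finished job.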



-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Richardson, Joshua
Sent: Friday, August 03, 2007 07:27
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab

I am trying to run it with other code that I had previously written, and it does not seem to be working for me. My command window input is below:


C:\Documents and Settings\jrichardson\Desktop\build-looks>condor_wait build-log.



C:\Documents and Settings\jrichardson\Desktop\build-looks>condor_wait build-log.


Couldn't open build-log.txt`: No such file or directory


C:\Documents and Settings\jrichardson\Desktop\build-looks>condor_wait build-log

Couldn't open build-log: No such file or directory


C:\Documents and Settings\jrichardson\Desktop\build-looks>condor_wait build-log.



C:\Documents and Settings\jrichardson\Desktop\build-looks>


The first time I ran it, the results were returned but the command window still seemed to be busy. After I entered condor_wait build-log.txt, the cursor moved to the next line and seemed to freeze, as if it were busy. Is there something you can see that I did wrong?


Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jones, Torrin A (US SSA)
Sent: Friday, August 03, 2007 9:59 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab


OK, I've included a sample submit description file.  You'll notice that condor_wait doesn't go in the submit description file.  It's a separate command.  When you submit the job you'll get something like this.


V:\shared\condor\sample_jobs\bludaa931717>condor_submit.exe example.job
Submitting job(s).....
Logging submit event(s).....
5 job(s) submitted to cluster 491.


In order to wait for the job to complete, you need to run the condor_wait command on the log file listed in the submit description file.  In my sample it points to v:\temp\condor\dir.log.  Below is a transcript of what I did.  The only step you need is condor_wait.




 Volume in drive V is DATA
 Volume Serial Number is FC55-1736


 Directory of V:\temp\condor


[.]                [..]               dir.491.0.error    dir.491.0.output
dir.491.1.error    dir.491.1.output   dir.491.2.error    dir.491.2.output
dir.491.3.error    dir.491.3.output   dir.491.4.error    dir.491.4.output
              11 File(s)      3,969,255 bytes
               2 Dir(s)  17,617,494,016 bytes free


V:\temp\condor>condor_wait.exe dir.log
All jobs done.




-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Richardson, Joshua
Sent: Friday, August 03, 2007 06:46
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab

So, just to make sure I understand you correctly: when I set log = in the submit file, instead of just naming the log file I should give its full path. Then, as the last line (before the queue, or after?), I add condor_wait with the path of the log file?


Could you show me an example of this being used in regular Condor format? That would be very helpful. Just a skeleton of what it would look like. Thanks.


Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jones, Torrin A (US SSA)
Sent: Thursday, August 02, 2007 5:44 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab


Yes, I was looking at the first set of code.  In order to use condor_wait, you first call condor_submit with a submit file.  The submit file should have a line that says log = [somelogfile].  Replace [somelogfile] with the actual path to the log file.  After the submit is done, you call condor_wait with the full path to the log file.  When the job is done, condor_wait will finish.


As for where this would go in your code, I don't know.  I don't know matlab that well.
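
One way the submit-then-wait sequence could be driven from MATLAB is sketched below. This is an illustration under assumptions, not a tested integration: submitAndWait and its runCommand hook are hypothetical names, and condor_submit and condor_wait are assumed to be on the system path.

```matlab
function ok = submitAndWait(submitFile, logFile, runCommand)
%SUBMITANDWAIT Submit a Condor job, then block until its log says it is done.
%   Hypothetical helper, not part of Condor or MATLAB.
%   submitFile - path to the submit description file
%   logFile    - full path given on the log = line of that file
%   runCommand - optional handle that runs a command string and returns
%                its exit status; defaults to @system
if nargin < 3
    runCommand = @system;
end
ok = false;

% Step 1: hand the submit description file to condor_submit.
if runCommand(sprintf('condor_submit "%s"', submitFile)) ~= 0
    return;   % submission failed, so there is nothing to wait for
end

% Step 2: condor_wait is a separate command, run after the submit
% returns, and it takes the log file rather than the submit file.
ok = (runCommand(sprintf('condor_wait "%s"', logFile)) == 0);
```

The point of the two steps is that condor_submit returns as soon as the job is queued; only the condor_wait call tells you the job has actually finished.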


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Richardson, Joshua
Sent: Thursday, August 02, 2007 14:22
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab

Are you referring to the first set of code? Would I just enter condor_wait in the submit function? Where exactly and how would that be used?


Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632


From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Jones, Torrin A (US SSA)
Sent: Thursday, August 02, 2007 5:16 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] anyone with experience with matlab


Sorry, no experience with matlab, but let me ask a question anyway.


In your waitForState function is it actually waiting for the job to finish?  Or is it waiting for the 4 condor_submit commands that you run to finish?  I ask because you need to use the condor_wait command to wait for a job to finish and I'm not sure if that's what's being done here.

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Richardson, Joshua
Sent: Thursday, August 02, 2007 13:32
To: Condor-Users Mail List
Subject: [Condor-users] anyone with experience with matlab

I have been trying to set up Condor with MATLAB for quite a while. I have simplified the MATLAB code that I am trying to run and am still having difficulty getting the results. It seems as if Condor is returning results before MATLAB has a chance to finish the tasks. I am using the MATLAB Distributed Computing Engine, but I am asking for help from anyone who is familiar with MATLAB. I am going to paste my code below and was wondering if someone could take a look at it and see whether there are any glaring mistakes that my team and I have not been able to find.


First, I have an M-file named trial.m. This is the code that I want to run and get results from.


function trial()

jm = findResource('scheduler','configuration', 'generic');

job1 = createJob(jm);
createTask(job1, @sum, 1, {[1 1]});
createTask(job1, @sum, 1, {[2 2]});
createTask(job1, @sum, 1, {[3 3]});
createTask(job1, @sum, 1, {[4 4]});

submit(job1);

% Block until the job reaches the 'finished' state or 60 seconds have
% elapsed, whichever comes first.
waitForState(job1, 'finished', 60)
results = getAllOutputArguments(job1)



Next is my MATLAB submit function:


function submitfcn(scheduler, job, props, extraCondorSubmitArgs) %#ok Not using job
%SUBMITFCN Submit a Matlab job to a Condor scheduler

% See also workerDecodeFunc.

% Assign the relevant values to environment variables, starting
% with identifying the decode function to be run by the worker:
decodeFcn = 'workerDecodeFunc';
if nargin < 4
    extraCondorSubmitArgs = '';
end

% Ask the workers to print debug messages by default by setting MDCE_DEBUG to
% true.
jobEnvVars = {'MDCE_DECODE_FUNCTION',decodeFcn, ...
              ...%' MDCE_STORAGE_LOCATION',props.StorageLocation, ...
              ' MDCE_STORAGE_CONSTRUCTOR',props.StorageConstructor, ...
              ' MDCE_JOB_LOCATION',props.JobLocation, ...
              ' MDCE_DEBUG','true'};
taskEnvVars = cell(1, numel(props.TaskLocations));
for i = 1:numel(props.TaskLocations)
    taskEnvVars{i} = {'MDCE_TASK_LOCATION', props.TaskLocations{i}};
end

if isempty(scheduler.ClusterMatlabRoot)
    warning('distcomp:condor:NoClusterMatlabRoot', ...
            ['The scheduler''s ClusterMatlabRoot property is empty.\n', ...
             'Using  matlabroot  instead.']);
    clusterMatlabRoot = matlabroot;
else
    clusterMatlabRoot = scheduler.ClusterMatlabRoot;
end

matlabScript = fullfile(clusterMatlabRoot, 'bin', 'matlab');
% ... Do we need the following ??? ...
if ispc
    matlabScript = [matlabScript, '.bat'];
end

matlabArgs = strrep(scheduler.matlabCommandToRun, 'matlab ', '');

% Determine where to save the standard output, standard error and the
% Condor log.
logFiles = cell(1, props.NumberOfTasks);
outFiles = cell(1, props.NumberOfTasks);
errFiles = cell(1, props.NumberOfTasks);
for i = 1:props.NumberOfTasks
    taskLoc = fullfile(scheduler.DataLocation, props.TaskLocations{i});
    logFiles{i} = [taskLoc, '.log'];
    outFiles{i} = [taskLoc, '.out'];
    errFiles{i} = [taskLoc, '.err'];
end

% Create one condor submit file for all the tasks.
script = createCondorSubmitScript(matlabScript, matlabArgs, ...
                                  jobEnvVars, taskEnvVars, ...
                                  errFiles, outFiles, logFiles);

% Submit a Condor job that executes all the tasks:
%[pathstr, name, ext, versn] = fileparts(script);
%   script2 = name;
% Execute the submit command on the remote host.
%copyfile(script, '.')
condorSubmitCommand = ['condor_submit ', script, ' ', extraCondorSubmitArgs];
[s, w] = system(condorSubmitCommand);

% Leave behind the necessary debugging information if the submission failed.
if s ~= 0
    % script2 is only assigned in the commented-out lines above, so report
    % the submit file through script instead.
    warning('distcomp:condor:SubmitFailed', ...
            ['Call to condor_submit failed with the following message:\n\n', ...
             '    %s\n\n', ...
             'The submit command used was:\n\n    %s\n\n', ...
             'Not deleting the submission file %s.'], ...
             w, condorSubmitCommand, script);
else
    % Display the Condor job number:

    % Clean up:
    % delete(script);
end



function filename = createCondorSubmitScript(matlabScript, matlabArgs, jobEnvVars, taskEnvVars, errFiles, outFiles, logFiles)
%Create a Condor submit script that forwards the correct environment variables
%and executes Matlab.

% We assume that the decode function has been put on the path of the MATLAB
% workers, e.g. by putting it into $MATLABROOT/toolbox/local.

% Double all backslashes so fprintf prints out a single backslash.
% Note: fprintf expands escape sequences only in its format string. These
% values are passed as %s arguments, so the doubled backslashes are written
% to the submit file unchanged.
matlabScript = strrep(matlabScript, '\', '\\');
matlabArgs = strrep(matlabArgs, '\', '\\');
jobEnvVars = strrep(jobEnvVars, '\', '\\');
for i = 1:numel(taskEnvVars)
    taskEnvVars{i} = strrep(taskEnvVars{i}, '\', '\\');
end
outFiles = strrep(outFiles, '\', '\\');
errFiles = strrep(errFiles, '\', '\\');
logFiles = strrep(logFiles, '\', '\\');

condorHeader = [ 'Universe            = vanilla\n', ...
                 'Executable          = condor_exec.bat \n',... %s\n', ...
                 'Transfer_Executable = true\n', ...
                 'Requirements        = (machine == "condor01.integrity-apps.com")\n'];

taskString   = ['Arguments            = %s\n', ...
                'Environment          = "%s"\n', ...
                ...%'input                = matlab_metadata.mat \n', ...
                'Error                = %s\n', ...
                'Output               = %s\n', ...
                'Log                  = %s\n', ...
                'should_transfer_files = YES \n',...
                'transfer_input_files = matlab_metadata.mat, job1.in.mat, job1.common.mat, job1.state.mat, job1.out.mat \n', ...
                'when_to_transfer_output = ON_EXIT \n',...
                'notify_user          = jrichardson@xxxxxxxxxxxxxxxxxx \n',...
                'Queue\n'];

filename = tempname;
fid = fopen(filename, 'wt');
fprintf(fid, condorHeader, matlabScript);

for i = 1:numel(taskEnvVars)
    % Create a cell-array of all the environment variables we want to set
    % for the current task, and transform it into a string for the Condor
    % script.
    envString = createCondorEnvString({jobEnvVars{:}, taskEnvVars{i}{:}});
    % Append a clause to the Condor script to queue the current task.
    fprintf(fid, taskString, matlabArgs, envString, errFiles{i}, outFiles{i}, logFiles{i});
end

% Close the file so that condor_submit sees the complete script.
fclose(fid);




function envString = createCondorEnvString(envVars)
%envStr = createCondorEnvString(envVars)
%  envVars should be a cell array of even length.  The odd entries are
%  the environment variables, the even entries are their values.

% In Condor, environment variables are specified in UNIX as
%  Environment = var1=val1;var2=val2;...varn=valn
% and on Windows, the separator is '|' instead of ';', i.e. the format is
%  Environment = var1=val1|var2=val2|...varn=valn
% The Windows branch below uses spaces instead, because the submit script
% wraps the value in double quotes, where entries are separated by
% whitespace.

if ispc
    envSep = '  ';
else
    envSep = ';';
end

envString = '';
for i = 1:2:numel(envVars)
    envString = [envString, envVars{i}, '=', envVars{i + 1}, envSep];
end





Next is the executable that is passed to Condor. There might be a problem here: it calls worker.bat, which starts matlab.bat, which starts MATLAB. My argument is worker.bat.


@echo off








echo %DL%












Any suggestions will be greatly appreciated. Currently, I receive a message saying that the job is completed while tasks are still running, and I get an empty 4 x 0 array as the result.

Josh Richardson

Integrity Applications Incorporated Intern (IAI)

703-378-8672 ext 632