Re: [Condor-users] Condor problem with MPI Jobs

oops, I meant condor_status -l of course...

On May 19, 2006, at 1:15 PM, Rok Roskar wrote:

The negotiator log file may give you an idea why DedicatedScheduler can't get a hold of resources...

Make sure you are submitting your jobs from the machine configured to be the DedicatedScheduler.

If you do condor_submit -l , do the resources, which you intend to use for MPI, advertise the correct DedicatedScheduler name?
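
For reference, here is a rough sketch of what each dedicated execute node's local configuration would typically contain for the DedicatedScheduler name to be advertised at all. It is modeled on the dedicated-scheduling setup described in the Condor manual, not taken from this pool. The host name submit.example.org is a placeholder for the submit machine and is an assumption, as is the choice to run dedicated jobs only.

```
# Sketch only: submit.example.org is a hypothetical placeholder for the
# full host name of the machine the MPI jobs are submitted from.
DedicatedScheduler = "DedicatedScheduler@submit.example.org"

# Publish the attribute in the machine ClassAd so the negotiator and
# the dedicated scheduler can see it.
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

# Assumed policy: this node runs only jobs from its dedicated scheduler
# and never suspends or preempts them.
START        = Scheduler =?= $(DedicatedScheduler)
RANK         = Scheduler =?= $(DedicatedScheduler)
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False
```

With something like this in place, and after the startd has re-read its configuration, the machine's ad should include a DedicatedScheduler line whose value matches the submit machine. If the line is missing, or names a different host, the dedicated scheduler will not claim that node.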

On May 19, 2006, at 9:18 AM, Natarajan, Senthil wrote:

No, just those two input files, infile.0 and infile.1.
Thanks for the tips, I appreciate them. I will try those and let you know.
BTW does your job have any parameters or input files other than those in INPUT.$(NODE)?
I have posted this a couple of times but got no response; hopefully this time I will get some.
I am trying to run an MPI job using Condor 6.6.10 on Windows. I am using the Condor-supported MPI (MPICH 1.2.4).
The MPICH 1.2.4 libraries are installed properly on the Windows machines, and the path to the libraries is set properly in the system environment variables. And of course I configured the condor_config files on the execution node as dedicated resources suitable for running MPI jobs, following the Condor documentation.
When I submit the job, it stays idle; it does not report any error, and it does not even try to contact the execution nodes. I have no clue what is going on.
Could someone please point out what the problem might be? I was also wondering whether the Condor MPI universe is a fully developed feature, and whether it can be used in a real production environment.
universe = MPI
executable = simplempi.exe
#executable = cpi.exe
requirements = Arch == "INTEL" && OpSys == "WINNT51"
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 2
should_transfer_files = yes
when_to_transfer_output = on_exit
