Re: [Condor-users] Problem MPI jobs idle!



Hi,

Here is one very useful diagnostic suggestion. Add
SCHEDD_DEBUG = D_FULLDEBUG to the main condor_config file in the
appropriate location (check the spelling for typos if necessary). Then,
in the schedd log file, you will at some point see the list of all the
machines which are configured for this dedicated scheduler.
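
As a rough sketch, the setting would look something like this. Only the
SCHEDD_DEBUG line comes from the suggestion above; the comment is just
for orientation.

```
# In the main condor_config on the machine running the dedicated scheduler
SCHEDD_DEBUG = D_FULLDEBUG
```

As far as I know the schedd has to re-read its configuration (for
example via condor_reconfig) before the extra output shows up, and the
log file is wherever SCHEDD_LOG points on your installation.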

If you do not see any machines, it means that the machines are not
configured properly. If you do see the machines, I think there may be a
problem in the condor_config file, and the policy to run jobs may be
too restrictive. I would strongly advise you to switch to Condor's
built-in testing configuration, just for testing purposes, by replacing
the UWCS_ prefix with TESTINGMODE_ (again, check the spelling for typos
if necessary).
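
To illustrate the UWCS_ to TESTINGMODE_ swap, here is a sketch of what
the policy section might look like afterwards. I am quoting the macro
names from memory of the default condor_config, so treat the exact list
as an assumption and compare it against your own file.

```
# Policy section of condor_config, testing only.
# Each line originally referenced $(UWCS_...); names are from memory.
WANT_SUSPEND = $(TESTINGMODE_WANT_SUSPEND)
WANT_VACATE  = $(TESTINGMODE_WANT_VACATE)
START        = $(TESTINGMODE_START)
SUSPEND      = $(TESTINGMODE_SUSPEND)
CONTINUE     = $(TESTINGMODE_CONTINUE)
PREEMPT      = $(TESTINGMODE_PREEMPT)
KILL         = $(TESTINGMODE_KILL)
```

Remember to switch back to the UWCS_ settings once the MPI jobs start
matching, since the testing policy is not meant for normal use.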

This works for me for sure. Next, check condor_config.local for typos.
In the slaves' configuration file, in the place of the dedicated
scheduler setting, you must put the name of the concerned Condor
master, not the local machine's name.
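
A minimal sketch of that setting on a slave, assuming the master is
called master.example.org (a made-up hostname; substitute your own) and
following the form I remember from the Condor manual's section on
dedicated resources:

```
# condor_config.local on each slave (execute) machine.
# master.example.org is a placeholder for the master's full hostname,
# NOT the name of the slave itself.
DedicatedScheduler = "DedicatedScheduler@master.example.org"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
```

The second line is what advertises the attribute in the machine's
ClassAd. I believe newer Condor versions call it STARTD_ATTRS instead
of STARTD_EXPRS, so check the manual for your version.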


I think this will at least get you as far as seeing the list of
resources on your master schedd.

Hope this solves it, or at least gives you some errors which you can
comprehend and solve. Worked for me.


Regards,


Chaitanya V. Hazarey

School of Technology and Computer Science,
Tata Institute of Fundamental Research,
Colaba, Homi Bhabha Road,
Mumbai, Maharashtra, 400005