
[Condor-users] MPI Jobs strange problem



Hi all,

These MPI problems never seem to end. I had everything working
previously, but now, out of the blue, I get these messages in the
SCHEDD log file:


9/4 00:18:13 Cluster 1862 is in idle_clusters, but no longer in job queue, removing
9/4 00:18:13 Cluster 1862 is in idle_clusters, but no longer in job queue, removing
9/4 00:18:13 Found no idle dedicated job(s)
9/4 00:18:13 No idle dedicated jobs, handleDedicatedJobs() returning
9/4 00:18:13 Entering DedicatedScheduler::checkSanity()


and



9/4 00:17:41 Started timer (19363) to call handleDedicatedJobs() in 2 secs
9/4 00:17:41 JobsRunning = 0
9/4 00:17:41 JobsIdle = 0
9/4 00:17:41 JobsHeld = 0
9/4 00:17:41 JobsRemoved = 0
9/4 00:17:41 SchedUniverseJobsRunning = 0
9/4 00:17:41 SchedUniverseJobsIdle = 0
9/4 00:17:41 N_Owners = 1
9/4 00:17:41 MaxJobsRunning = 200





When I submit 100 jobs, only 1 seems to run and the rest sit idle.


If anyone could at least guess at the cause, that would be great. I have put all the machines in the TESTING_MODE setting so that they always run jobs. Strangely, when I googled for this error, I could not find a single matching result.




Thanks,


Chaitanya V. Hazarey


School of Technology and Computer Science, Tata Institute of Fundamental Research, Colaba, Homi Bhabha Road, Mumbai, Maharashtra, 400005