
[Condor-users] analyzing problems in the MPI universe



I am running condor-6.7.1 and compiled an executable with mpich-1.2.4. My MPI jobs never run; they simply sit idle in the queue. When I ran condor_q -analyze, I got the following:

9009.000:  Run analysis summary.  Of 443 machines,
   125 are rejected by your job's requirements
    75 reject your job because of their own requirements
     0 match, but are serving users with a better priority in the pool
    23 match, match, but reject the job for unknown reasons
   220 match, but will not currently preempt their existing job
     0 are available to run your job

WARNING: Analysis is meaningless for MPI universe jobs.

Given that warning, though, I guess the above output is irrelevant.

I've looked in my central manager log files, the job error/output files, the StartLog on the execution nodes, and the scheduler logs, but I don't see anything relevant. Where else should I look?

-Danny