
Re: [Condor-users] my parallel universe error , 4 match but reject the job for unknown reasons



Hi Arash--
condor_q -better-analyze
may give you more information,
as will
condor_q -l -ana

You say that you set up mpi0 as the dedicated scheduler and
mpi0 and mpi1 as dedicated resources. What is the value of the START
macro on those two machines, and what is the value of
Requirements for your job? Are you sure that the DedicatedScheduler
attribute is in your job classad?
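
A minimal sketch of what the local configuration on each dedicated
resource (mpi0 and mpi1) might look like, following the dedicated-resource
example in the Condor manual for the 6.8 series. The host name
mpi0.example.org is a hypothetical placeholder, not the real one, and the
exact entries should be checked against the manual for the installed version.

```
# ASSUMPTION: mpi0.example.org is a placeholder; use the full host name
# of the machine running the dedicated scheduler.
DedicatedScheduler = "DedicatedScheduler@mpi0.example.org"

# Publish the attribute in the machine classad so the dedicated
# scheduler can claim this resource.
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

# Policy: only run jobs that come from the dedicated scheduler,
# and never suspend, preempt, or kill them.
START        = Scheduler =?= $(DedicatedScheduler)
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False
RANK         = Scheduler =?= $(DedicatedScheduler)
```

With START written this way the machines accept only jobs from the
dedicated scheduler; a START of True combined with the same RANK would
also let ordinary jobs run while preferring dedicated ones.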
Steve Timm

------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525
timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.

On Mon, 14 Jan 2008, Arash noorghorbani wrote:

Dear All,

I am a new user of Condor. I have two quad-core computers
running Ubuntu Linux 7.10 and Condor 6.8.8, and I am trying to add them to a
Condor pool to run parallel jobs. I set both of them up as dedicated
resources (called mpi0 and mpi1), and mpi0 is, in addition, the dedicated
scheduler. (Our last Condor pool has no dedicated scheduler.)

But parallel jobs only run on the dedicated scheduler (mpi0),
and on mpi1 I get the error:

"4 match but reject the job for unknown reasons"

I think this problem may appear because my scheduler
is a quad-core machine, but I don't know how to fix it.

Below are some details from one of my attempts:

submitted file:
_________________________________________________________
universe = parallel
executable = /bin/sleep
arguments = 30
machine_count = 3
log    = logfile
error  = err
queue
________________________________________________________


mpi1@.....$ condor_q -analyze

-- Submitter: mpi1.x.x.x : <x.x.x.x:46536> : mpi1.x.x.x
ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
008.000:  Run analysis summary.  Of 25 machines,
    21 are rejected by your job's requirements
     0 reject your job because of their own requirements
     0 match but are serving users with a better priority in the pool
     4 match but reject the job for unknown reasons
     0 match but will not currently preempt their existing job
     0 are available to run your job

1 jobs; 1 idle, 0 running, 0 held