
Re: [Condor-users] Jobs blocked as Idle in Multi-CPU machine



I hate it when you get an answer like that from condor_q.  Since we don't know what the "unknown reasons" are, the best bet is probably to look at the log files and see if you can figure it out.  You probably need to look at the CollectorLog and NegotiatorLog on the central manager, and maybe the StartLog on the execute machine.
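
When a queue holds more than one job, it can help to boil the analysis text down to just the non-zero lines before going to the logs. Below is a rough sketch in Python that does this for the summary format quoted later in this message. The function name summarize_analysis is made up for illustration, the sample text is copied from the output below, and the parsing assumes the summary lines always look like the ones shown here.

```python
import re

# Sample text copied from the condor_q -analyze output quoted in this thread.
SAMPLE = """\
002.000:  Run analysis summary.  Of 2 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      2 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
"""


def summarize_analysis(text):
    """Return {job_id: [(count, reason), ...]}, keeping only non-zero lines.

    Hypothetical helper: it assumes each job starts with a line such as
    '002.000:  Run analysis summary.' followed by indented 'N reason' lines.
    """
    results = {}
    job_id = None
    for line in text.splitlines():
        header = re.match(r"\s*(\d+\.\d+):\s+Run analysis summary", line)
        if header:
            job_id = header.group(1)
            results[job_id] = []
            continue
        # Summary entries are indented; totals like '1 jobs; ...' are not.
        entry = re.match(r"\s+(\d+)\s+(.+)", line)
        if entry and job_id is not None and int(entry.group(1)) > 0:
            results[job_id].append((int(entry.group(1)), entry.group(2).strip()))
    return results


if __name__ == "__main__":
    for job, reasons in summarize_analysis(SAMPLE).items():
        for count, reason in reasons:
            print(f"{job}: {count} {reason}")
```

This only condenses what condor_q already printed; it does not explain the "unknown reasons", so the log files are still the place to look.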
 
Just a side note: I've seen this happen to my own jobs right after submitting them.  However, on the next negotiation cycle (300 seconds later, I think), the job runs.  I can usually get it to run sooner if I submit a dummy job (one that just prints "Hello World") to every computer in the pool.  I haven't found a better way around this yet.
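
For anyone who wants to try the dummy-job trick, here is a minimal sketch in Python that writes out a submit description with one "Hello World" job pinned to each machine. The function name, the dummy.* file names, and the use of /bin/echo as the executable are my own assumptions for illustration; the two host names are the ones from the output quoted below.

```python
def dummy_submit_description(machines, executable="/bin/echo"):
    """Build a submit description that queues one dummy job per machine.

    Hypothetical helper: /bin/echo and the dummy.* file names are assumptions,
    not something taken from the original poster's setup.
    """
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "arguments = Hello World",
        "log = dummy.log",
    ]
    for i, machine in enumerate(machines):
        # Pin this job to a single machine through its Machine attribute.
        lines.append(f'requirements = (Machine == "{machine}")')
        lines.append(f"output = dummy.{i}.out")
        lines.append(f"error = dummy.{i}.err")
        lines.append("queue")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    hosts = ["nodea.gridgroup.eif.ch", "nodeb.gridgroup.eif.ch"]
    with open("dummy.sub", "w") as handle:
        handle.write(dummy_submit_description(hosts))
```

The resulting dummy.sub file would then be handed to condor_submit. Whether this speeds things up on your pool is something you would have to test; I only know that it tends to help on mine.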
 
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of ye huang
Sent: Wednesday, August 22, 2007 14:26
To: Condor-Users Mail List
Subject: Re: [Condor-users] Jobs blocked as Idle in Multi-CPU machine

condor_q -analyze and -better-analyze on NODEA say:
------
ye@nodea:~$ condor_q -analyze


-- Submitter: nodea.gridgroup.eif.ch : < 160.98.20.75:40855> : nodea.gridgroup.eif.ch
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
---
002.000:  Run analysis summary.  Of 2 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      2 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
1 jobs; 1 idle, 0 running, 0 held

ye@nodea:~$ condor_q -better-analyze

-- Submitter: nodea.gridgroup.eif.ch : < 160.98.20.75:40855> : nodea.gridgroup.eif.ch
---
002.000:  Run analysis summary.  Of 2 machines,
      0 are rejected by your job's requirements
      0 reject your job because of their own requirements
      0 match but are serving users with a better priority in the pool
      2 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
------

condor_q -analyze and -better-analyze on NODEB say:

------
ye@nodeb:~$ condor_q -analyze


-- Submitter: nodeb.gridgroup.eif.ch : < 160.98.20.76:57419> : nodeb.gridgroup.eif.ch
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
0 jobs; 0 idle, 0 running, 0 held

ye@nodeb:~$ condor_q -better-analyze

-- Submitter: nodeb.gridgroup.eif.ch : <160.98.20.76:57419> : nodeb.gridgroup.eif.ch

------