
Re: [Condor-users] problem with condor_q -analyze



You would need to send some log files, and also your submit file, before anyone can say more.

The analysis says the machines don't match, so have a look at the requirements in the submit file.

Are the OpSys and Arch the same on the machine the jobs run on and on all the others?
 
JK
 
 
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]On Behalf Of Partha sarathi
Sent: Wednesday, June 06, 2007 12:27 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] problem with condor_q -analyze

When I run condor_q -run, I see only one job being processed, on one machine. Even though more jobs are queued, they are not going to the other machines in the pool. I can see the Condor processes running on all the machines, but I have no clue why those machines are not processing the jobs. I also sent the condor_q -analyze output in my previous mail. Please help me out.
 
 
[condor@Perfcoelnx3 bin]$ ./condor_q -run


-- Submitter: Perfcoelnx3 : <10.237.226.83:21193> : Perfcoelnx3
 ID      OWNER            SUBMITTED     RUN_TIME HOST(S)
  66.0   condor          6/5  07:03   0+04:50:28 Perfcoelnx3

[condor@Perfcoelnx3 bin]$ ./condor_q

-- Submitter: Perfcoelnx3 : <10.237.226.83:21193> : Perfcoelnx3
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
  66.0   condor          6/5  07:03   0+04:41:52 R  0   9.8  partha2.out
  67.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha3.out
  68.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha4.out
  69.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha5.out
  70.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha6.out
  71.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha7.out
  72.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha8.out
  73.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha9.out
  74.0   condor          6/5  07:03   0+00:00:00 I  0   9.8  partha10.out

9 jobs; 8 idle, 1 running, 0 held



On 6/6/07, Partha sarathi <jinka.partha@xxxxxxxxx> wrote:
My jobs are processed on the same machine from which they are submitted. I have no idea why they are not going to other machines. Can somebody give me a clue as to what is going wrong?
 
 
I ran condor_q -analyze after submitting the jobs, and my output is:
 
 
069.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
---
070.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
---
071.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
---
072.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
---
073.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
---
074.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
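
Editorial sketch (not part of the original messages): each summary above has the same shape, so a small script can pull out the non-zero categories and show at a glance what is blocking a job. The function name `parse_analysis` and the regular expressions are assumptions made for this illustration, and the parser only handles the summary layout quoted in this thread.

```python
import re

# One summary row: leading count, then the reason text.
SUMMARY_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")
# Block header such as "069.000:  Run analysis summary.  Of 3 machines,"
HEADER_LINE = re.compile(r"^\s*(\d+\.\d+):\s+Run analysis summary")


def parse_analysis(text):
    """Map each job ID to a list of (count, reason) pairs from its summary."""
    jobs = {}
    current = None
    for line in text.splitlines():
        header = HEADER_LINE.match(line)
        if header:
            current = header.group(1)
            jobs[current] = []
            continue
        row = SUMMARY_LINE.match(line)
        if row and current is not None:
            jobs[current].append((int(row.group(1)), row.group(2)))
    return jobs


# Sample copied from the output quoted in this thread.
SAMPLE = """\
069.000:  Run analysis summary.  Of 3 machines,
      2 are rejected by your job's requirements
      0 reject your job because of their own requirements
      1 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
"""

# Keep only the categories that actually apply to at least one machine.
blocking = [(n, reason) for n, reason in parse_analysis(SAMPLE)["069.000"] if n > 0]
for n, reason in blocking:
    print(n, reason)
```

For the jobs in this thread, the two non-zero rows are the requirements rejection (2 machines) and the priority line (1 machine), which points back at the job's requirements expression as the first thing to inspect.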


 
On 5/31/07, Ian Chesal <ICHESAL@xxxxxxxxxx > wrote:
> initially it was like
>
> 127.0.0.1 localhost.localdomainperfcoelnx3 localhost

>
> but i changed it with all the machines in the pool like
>
> 127.0.0.1 perfcoelnx3
> 10.237.234.... second m/c
> 10.237.234.... third m/c

This is wrong. It should be:

127.0.0.1       localhost.localdomain localhost
10.237.234....  perfcoelnx3

Right now you've got perfcoelnx3 resolving to the loopback address on the machine. Kind of a circular route.

This also explains your condor_status issues in the other email thread BTW.

- Ian
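
Editorial sketch (not part of the original messages): the misconfiguration Ian describes, a real hostname listed on the 127.0.0.1 line, can be detected mechanically. The helper name `loopback_hostnames` is hypothetical, and the address 192.0.2.10 in the second sample is a documentation placeholder, not the machine's real address, which is elided in the thread.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return non-localhost names that hosts-file text maps to a loopback address."""
    flagged = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with an IP address
        if not addr.is_loopback:
            continue
        for name in fields[1:]:
            if not name.lower().startswith("localhost"):
                flagged.append(name)
    return flagged


# The entry from the quoted message: the machine name resolves to loopback.
BROKEN = "127.0.0.1 perfcoelnx3\n"

# The corrected layout; 192.0.2.10 is a placeholder address (assumption).
FIXED = (
    "127.0.0.1       localhost.localdomain localhost\n"
    "192.0.2.10      perfcoelnx3\n"
)

print(loopback_hostnames(BROKEN))
print(loopback_hostnames(FIXED))
```

Running the same function over the contents of /etc/hosts on each pool member would show whether any machine still advertises itself through the loopback address.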


