
Re: [Condor-users] Problems with jobs



Hi, thanks for the response.
 
SUBMIT_SEND_RESCHEDULE is not specified in any of my config files, which
means that it is automatically set to True, does it not?
 
condor_q -ana says the jobs are being serviced.
 
It seems a lot of machines go into the claimed state but stay idle.
 
tux.neuralgri LINUX       INTEL  Unclaimed  Idle       0.250   512  0+00:34:48
vm1@xxxxxxxxx LINUX       X86_64 Owner      Idle       0.750  2048  0+00:00:02
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.120  2048  0+00:01:20
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:01:21
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.270  2048  0+00:00:05
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.180  2048  0+00:00:07
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:11
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.050  2048  0+00:00:07
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:10
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.100  2048  0+00:00:07
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:11
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.100  2048  0+00:00:07
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:11
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:07
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.210  2048  0+00:00:11
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.080  2048  0+00:00:08
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:03
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.050  2048  0+00:00:02
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:03
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.160  2048  0+00:00:04
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
vm1@xxxxxxxxx LINUX       X86_64 Owner      Idle       1.000  2048  0+20:15:59
vm2@xxxxxxxxx LINUX       X86_64 Owner      Idle       0.310  2048  0+00:00:02
vm1@xxxxxxxxx LINUX       X86_64 Owner      Idle       1.000  2048  0+20:16:02
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.220  2048  0+00:00:09
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.130  2048  0+00:00:05
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:06
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:04
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.460  2048  0+00:00:05
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.200  2048  0+00:00:04
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
vm1@xxxxxxxxx LINUX       X86_64 Unclaimed  Idle       0.130  2048  0+00:00:05
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.170  2048  0+00:00:04
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Busy       0.000  2048  0+00:00:06
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Busy       0.020  2048  0+00:00:04
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Busy       0.000  2048  0+00:00:06
vm1@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.110  2048  0+00:00:05
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
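
To put numbers on "claimed but idle", a listing like the one above can be tallied by State and Activity. This is only a minimal Python sketch: it assumes the usual condor_status column order (Name, OpSys, Arch, State, Activity, LoadAv, Mem, ActvtyTime), and the function name tally_states and the four-row sample are illustrative, not part of Condor.

```python
from collections import Counter

def tally_states(status_text):
    """Count (State, Activity) pairs in condor_status-style output.

    Assumption: whitespace-separated columns in the order
    Name OpSys Arch State Activity LoadAv Mem ActvtyTime.
    """
    counts = Counter()
    for line in status_text.splitlines():
        fields = line.split()
        # Skip blank lines, summary lines, and anything that is not a full row.
        if len(fields) != 8:
            continue
        # Skip the column header if it was pasted along with the rows.
        if fields[0] == "Name":
            continue
        state, activity = fields[3], fields[4]
        counts[(state, activity)] += 1
    return counts

# A few rows copied from the listing above, used here as sample input.
sample = """\
tux.neuralgri LINUX       INTEL  Unclaimed  Idle       0.250   512  0+00:34:48
vm1@xxxxxxxxx LINUX       X86_64 Owner      Idle       0.750  2048  0+00:00:02
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Idle       0.000  2048  0+00:00:05
vm2@xxxxxxxxx LINUX       X86_64 Claimed    Busy       0.000  2048  0+00:00:06
"""

for (state, activity), n in sorted(tally_states(sample).items()):
    print(f"{state:10s} {activity:6s} {n}")
```

Feeding it the full listing instead of the sample would show how many slots are Claimed/Idle versus Claimed/Busy at that moment.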
 
Chris
----- Original Message -----
From: Ian Chesal
Sent: Monday, December 05, 2005 2:39 PM
Subject: Re: [Condor-users] Problems with jobs

 

 

Hi.

 

When I'm submitting jobs into my pool, it seems to take ages to start running the jobs unless I run condor_reschedule. Is there a way to speed the process up without me running this command?

 

[Ian Chesal] See: http://www.cs.wisc.edu/condor/manual/v6.7/3_3Configuration.html#11494 -- make sure you have that set to True in the config file on the machine you're calling condor_submit from. It will automatically issue a reschedule after submission.

 

My second problem is that job results are not returning to me any quicker than if I ran my jobs on a one-machine pool. I.e. I'm checking condor_q and the queue is only going down one at a time, at roughly the same speed as if there was only one machine in the pool. It is also slower than if I actually ran my jobs sequentially on one machine using a batch file or shell script.

 

[Ian Chesal] What does condor_q -ana say? Are you setting your job requirements such that only one VM in the system is able to match with all your jobs in your cluster? What about the MAX_JOBS_RUNNING setting on your schedd? Make sure that isn't set to 1.
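
As a rough way to check the two settings mentioned in this thread, the Python sketch below scans config text for SUBMIT_SEND_RESCHEDULE and MAX_JOBS_RUNNING and reports the last value assigned to each. It is a simplification under stated assumptions: it only understands plain NAME = value lines and # comments, and ignores macro expansion, line continuations, and multiple config files. The function name effective_settings and the sample config are hypothetical; running condor_config_val on the submit machine remains the more reliable check.

```python
def effective_settings(config_text, names):
    """Return the last value assigned to each name, or None if never set.

    Simplification: only plain `NAME = value` lines and `#` comments are
    handled; macro expansion and line continuations are ignored.
    """
    settings = {name.upper(): None for name in names}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip().upper()  # config names are case-insensitive
        if name in settings:
            settings[name] = value.strip()  # a later assignment overrides
    return settings

# Hypothetical config text, for illustration only.
sample_config = """\
# hypothetical local config
MAX_JOBS_RUNNING = 1
SUBMIT_SEND_RESCHEDULE = False
SUBMIT_SEND_RESCHEDULE = True
"""

settings = effective_settings(
    sample_config, ["SUBMIT_SEND_RESCHEDULE", "MAX_JOBS_RUNNING"]
)
for name, value in settings.items():
    print(name, "=", value if value is not None else "(not set in this text)")
if settings["MAX_JOBS_RUNNING"] == "1":
    print("MAX_JOBS_RUNNING is 1: the schedd would run one job at a time")
```

A value of None here only means the name was not found in the text that was scanned, not that Condor has no default for it.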

