
Re: [Condor-users] Priority issue



Details (I hope this helps):

For the next 74 job in the list, condor_q -better-analyze gives the following (a 72 job gives approximately the same result):

root@rhea:~# condor_q -better-analyze 74.25


-- Submitter: rhea.my.domain : <172.XX.XX.XX:32772> : rhea.my.domain
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
AddConstraint: Condition value not literal
---
074.025:  Run analysis summary.  Of 29 machines,
      2 are rejected by your job's requirements
      8 reject your job because of their own requirements
     19 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
        No successful match recorded.
        Last failed match: Tue Dec 12 11:16:42 2006
        Reason for last match failure: no match found

The Requirements expression for your job is:

( target.Arch == "INTEL" ) && ( target.OpSys == "LINUX" ) &&
( target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize ) &&
( target.HasFileTransfer )

    Condition                            Machines Matched    Suggestion
    ---------                            ----------------    ----------
1   ( target.Arch == "INTEL" )           27
2   ( target.OpSys == "LINUX" )          29
3   ( target.Disk >= 686 )               29
4   ( ( 1024 * target.Memory ) >= 571 )  29
5   ( target.HasFileTransfer )           29

The following attributes are missing from the job ClassAd:

CheckpointPlatform


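The summary counts above can be cross-checked mechanically. The following is a minimal sketch, not part of Condor: it assumes the summary keeps the line layout shown in this message, and the names `SUMMARY` and `parse_summary` are hypothetical helpers introduced only for illustration.

```python
import re

# Summary block copied from the condor_q -better-analyze output above.
SUMMARY = """\
074.025:  Run analysis summary.  Of 29 machines,
      2 are rejected by your job's requirements
      8 reject your job because of their own requirements
     19 match but are serving users with a better priority in the pool
      0 match but reject the job for unknown reasons
      0 match but will not currently preempt their existing job
      0 are available to run your job
"""

def parse_summary(text):
    """Return (total_machines, {reason: count}) for one summary block.

    Assumes the layout shown above: one header line containing
    "Of N machines", followed by lines of the form "<count> <reason>".
    """
    total = None
    counts = {}
    for line in text.splitlines():
        header = re.search(r"Of (\d+) machines", line)
        if header:
            total = int(header.group(1))
            continue
        row = re.match(r"\s*(\d+)\s+(.+)", line)
        if row:
            counts[row.group(2).strip()] = int(row.group(1))
    return total, counts

total, counts = parse_summary(SUMMARY)
# Every machine should fall into exactly one category.
print(total, sum(counts.values()))
for reason, count in counts.items():
    if count:
        print(f"{count:3d}  {reason}")
```

For this job, the non-zero categories account for all 29 machines, and the largest one (19) is "match but are serving users with a better priority in the pool".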
----------------
On Tue, 12 Dec 2006 11:07:57 +0100
Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:

> Hi,
> 
> I submitted a first job (72), which consists of about 150 queued jobs.
> I later submitted a second one (74), whose results I need first.
> So, once it was submitted, I changed 74's priority with:
> condor_prio -p 500 74
> I also changed 72's priority to -15.
> 
> My problem is that only one of the 74 jobs runs, and the other CPUs are used by 72 jobs. Even when a 72 job finishes while a 74 job is running, no new 74 job is started.
> 
> Here is the submit script (both are similar):
> 
> Universe = vanilla
> 
> Executable      = /nfs/rhea/attract
> arguments       = T27_R_M-mutate.pdb T27_L.red $(Process)
> output          = /nfs/MC2/output.$(Process).txt
> error           = /nfs/MC2/ERROR.$(Process)
> Log             = /nfs/MC2/LOG.$(Process)
> 
> 
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
> transfer_input_files = T27_R_M-mutate.pdb, T27_L.red, translat.dat, attract.inp, aminon.par, rotation.dat, standard.pdb
> notify_user     = user@xxxxxxxxx
> notification    = error
> 
> queue 147
> 
> 
> Here is the (truncated) condor_q output:
> 
> -- Submitter: rhea.my.domain : <172.XX.XX.XX:32772> : rhea.my.domain
>  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
>   72.37  saladin        12/9  17:49   0+01:29:47 R  -15 701.1 attract T27_R_M-mu
>   72.38  saladin        12/9  17:49   0+00:13:23 R  -15 0.6  attract T27_R_M-mu
>   72.41  saladin        12/9  17:49   0+00:12:25 R  -15 0.6  attract T27_R_M-mu
>   72.42  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
>   72.72  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
>   72.73  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
>   72.74  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
>   72.145 saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
>   72.146 saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
>   74.24  saladin        12/11 12:43   0+00:06:35 R  500 0.6  attract T27_R_M-mu
>   74.25  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.26  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.27  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.28  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.29  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.30  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.31  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.32  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.33  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
>   74.34  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> 
> 190 jobs; 171 idle, 19 running, 0 held
> 
> Thanks for any help.
> Nicolas
> 
> ----------------------------------------------------
> CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
> Institut de Biologie Physico-Chimique
> 13 rue Pierre et Marie Curie
> 75005 PARIS - FRANCE
> 
> Tel : +33 158 41 51 70
> Fax : +33 158 41 50 26
> ----------------------------------------------------
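The ordering expected in the quoted message (idle 74 jobs ahead of idle 72 jobs) can be written down as a small sketch. It only sorts one submitter's idle jobs by job priority, higher value first, and does not model negotiation or anything else the scheduler does. The `QUEUE` sample is copied from a few rows of the truncated condor_q listing, and `expected_idle_order` is a hypothetical helper, not a Condor function.

```python
# (job id, state, job priority) taken from a few rows of the condor_q listing.
QUEUE = [
    ("72.42", "I", -15),
    ("72.72", "I", -15),
    ("74.24", "R", 500),
    ("74.25", "I", 500),
    ("74.26", "I", 500),
]

def expected_idle_order(queue):
    """Idle jobs sorted by job priority, highest first.

    sorted() is stable, so jobs with equal priority keep their queue order.
    """
    idle = [job for job in queue if job[1] == "I"]
    return sorted(idle, key=lambda job: job[2], reverse=True)

for job_id, state, prio in expected_idle_order(QUEUE):
    print(job_id, prio)
```

Under this assumption the idle 74 jobs come before any idle 72 job, which is the behaviour the original message expected but did not observe.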
