Re: [Condor-users] Priority issue



Nicolas,

I think you assigned the priorities the wrong way round, i.e. swap the
values, so that the job that currently has the high number gets the low
number and vice versa.

Also note that, as far as I know, only the numbers between 0 and 20 are
valid (although I may be wrong and it is 1-20 or 0-19).
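
To see what the queue actually holds before and after changing the
numbers, a rough sketch like the one below may help. It tallies
condor_q-style rows by cluster, priority and state. The function name
tally_queue and the SAMPLE text are only illustrative (the rows are
copied from the listing quoted further down), and it assumes the
ID/OWNER/SUBMITTED/RUN_TIME/ST/PRI/SIZE column order shown there.

```python
import re
from collections import Counter

# A job ID looks like "72.37" (cluster.process).
JOB_ID = re.compile(r"^\d+\.\d+$")

def tally_queue(text):
    """Count jobs per (cluster, priority, state) from condor_q-style rows."""
    counts = Counter()
    for line in text.splitlines():
        # Drop mail quoting ("> > ") and split on whitespace.
        fields = line.lstrip("> ").split()
        # Expected: ID OWNER DATE TIME RUN_TIME ST PRI SIZE CMD...
        # Headers, summaries and wrapped CMD fragments are skipped.
        if len(fields) < 8 or not JOB_ID.match(fields[0]):
            continue
        cluster = int(fields[0].split(".")[0])
        state, prio = fields[5], int(fields[6])
        counts[(cluster, prio, state)] += 1
    return counts

# Illustrative input only: a few rows taken from the quoted condor_q output.
SAMPLE = """\
  72.37  saladin        12/9  17:49   0+01:29:47 R  -15 701.1 attract T27_R_M-mu
  72.42  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
  74.24  saladin        12/11 12:43   0+00:06:35 R  500 0.6  attract T27_R_M-mu
  74.25  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
  74.26  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
"""

for (cluster, prio, state), n in sorted(tally_queue(SAMPLE).items()):
    print(f"cluster {cluster}  prio {prio:>4}  state {state}  jobs {n}")
```

Feeding it the saved output of condor_q before and after a condor_prio
change should show whether the PRI column changed for every job in the
cluster and which cluster is picking up the free CPUs.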

Peter

Dr Peter Myerscough-Jackopson  -  Engineer
MULTIPLE ACCESS COMMUNICATIONS LIMITED
Delta House, The University of Southampton Science Park, Southampton,
SO16 7NS,
United Kingdom.
Tel: +44 (0)23 8076 7808 Fax: +44 (0)23 8076 0602
Web: http://www.macltd.com/  Email:
peter.myerscough-jackopson@xxxxxxxxxx

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Nicolas GUIOT
Sent: 12 December 2006 12:53
To: condor-users@xxxxxxxxxxx
Subject: Re: [Condor-users] Priority issue

More details (sorry I didn't wait before writing, I'm in quite a hurry
to get these results...):

Some new CPUs became available for Condor (they were in Owner state
before): some 74 jobs took them.
But on the CPUs where 72 jobs were running and have finished, it doesn't
want to start 74 jobs and keeps running new 72 jobs: how can I change
this?

Thanks in advance
Nicolas
----------------
On Tue, 12 Dec 2006 11:31:48 +0100
Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:

> Details (hope this can help):
> 
> On the next 74 job in the list, I get the following condor_q
> -better-analyze output (a 72 job gives approximately the same):
> 
> root@rhea:~# condor_q -better-analyze 74.25
> 
> 
> -- Submitter: rhea.my.domain : <172.XX.XX.XX:32772> : rhea.my.domain
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> AddConstraint: Condition value not literal
> ---
> 074.025:  Run analysis summary.  Of 29 machines,
>       2 are rejected by your job's requirements
>       8 reject your job because of their own requirements
>      19 match but are serving users with a better priority in the pool
>       0 match but reject the job for unknown reasons
>       0 match but will not currently preempt their existing job
>       0 are available to run your job
>         No successful match recorded.
>         Last failed match: Tue Dec 12 11:16:42 2006
>         Reason for last match failure: no match found
> 
> The Requirements expression for your job is:
> 
> ( target.Arch == "INTEL" ) && ( target.OpSys == "LINUX" ) && ( 
> target.Disk >= DiskUsage ) && ( ( target.Memory * 1024 ) >= ImageSize 
> ) && ( target.HasFileTransfer )
> 
>     Condition                             Machines Matched    Suggestion
>     ---------                             ----------------    ----------
> 1   ( target.Arch == "INTEL" )            27
> 2   ( target.OpSys == "LINUX" )           29
> 3   ( target.Disk >= 686 )                29
> 4   ( ( 1024 * target.Memory ) >= 571 )   29
> 5   ( target.HasFileTransfer )            29
> 
> The following attributes are missing from the job ClassAd:
> 
> CheckpointPlatform
> 
> 
> ----------------
> On Tue, 12 Dec 2006 11:07:57 +0100
> Nicolas GUIOT <nicolas.guiot@xxxxxxx> wrote:
> 
> > Hi,
> > 
> > I started a first job (72), which is made of about 150 queued jobs.
> > Then I later started a second one (74), which I need first. 
> > So, once started, I modified the 74's priority with : 
> > condor_prio -p 500 74
> > I also modified the 72's priority to -15.
> > 
> > Now my problem is that only one of the 74 jobs runs and the other
> > CPUs are used by 72. Even when a 72 job finishes, if a 74 is
> > running, it doesn't launch any new 74.
> > 
> > Here is the submit script (both are similar):
> > 
> > Universe = vanilla
> > 
> > Executable      = /nfs/rhea/attract
> > arguments       = T27_R_M-mutate.pdb T27_L.red $(Process)
> > output          = /nfs/MC2/output.$(Process).txt
> > error           = /nfs/MC2/ERROR.$(Process)
> > Log             = /nfs/MC2/LOG.$(Process)
> > 
> > 
> > should_transfer_files = YES
> > when_to_transfer_output = ON_EXIT
> > transfer_input_files = T27_R_M-mutate.pdb, T27_L.red, translat.dat, attract.inp, aminon.par, rotation.dat, standard.pdb
> > notify_user     = user@xxxxxxxxx
> > notification    = error
> > 
> > queue 147
> > 
> > 
> > Here is the (truncated) condor_q result  : 
> > 
> > -- Submitter: rhea.my.domain : <172.XX.XX.XX:32772> : rhea.my.domain
> >  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
> >   72.37  saladin        12/9  17:49   0+01:29:47 R  -15 701.1 attract T27_R_M-mu
> >   72.38  saladin        12/9  17:49   0+00:13:23 R  -15 0.6  attract T27_R_M-mu
> >   72.41  saladin        12/9  17:49   0+00:12:25 R  -15 0.6  attract T27_R_M-mu
> >   72.42  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
> >   72.72  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
> >   72.73  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
> >   72.74  saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
> >   72.145 saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
> >   72.146 saladin        12/9  17:49   0+00:00:00 I  -15 0.6  attract T27_R_M-mu
> >   74.24  saladin        12/11 12:43   0+00:06:35 R  500 0.6  attract T27_R_M-mu
> >   74.25  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.26  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.27  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.28  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.29  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.30  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.31  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.32  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.33  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> >   74.34  saladin        12/11 12:43   0+00:00:00 I  500 0.6  attract T27_R_M-mu
> > 
> > 190 jobs; 171 idle, 19 running, 0 held
> > root@rhea:~#

> > 
> > Thanks for any help.
> > Nicolas
> > 
> > ----------------------------------------------------
> > CNRS - UPR 9080 : Laboratoire de Biochimie Theorique Institut de 
> > Biologie Physico-Chimique
> > 13 rue Pierre et Marie Curie
> > 75005 PARIS - FRANCE
> > 
> > Tel : +33 158 41 51 70
> > Fax : +33 158 41 50 26
> > ----------------------------------------------------
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx 
> > with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting 
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > 
> > The archives can be found at either
> > https://lists.cs.wisc.edu/archive/condor-users/
> > http://www.opencondor.org/spaces/viewmailarchive.action?key=CONDOR
> > 
> 
