
[Condor-users] Preemption problem -- with attachment



Sorry for the double post.

 

I'm sure this has been covered before, but I couldn't find an answer in the archives or in the manual.

 

I never want jobs to be preempted.  I have PREEMPTION_REQUIREMENTS = False and PREEMPT = False on all my machines.  However, I recently ran into a problem.

 

One user submitted a job to the queue while another user had several jobs running.  The best match according to RANK was one of the machines the other user had claimed, and that user's UserPriority wasn't as good, so the negotiator decided to preempt him.  But since I have MaxJobRetirementTime set to 2 days, nothing happened except that the newly submitted job remained idle while many other VMs were available.  What I'd like is for the job to go to the highest-ranked machine that isn't already in use.  I'd like UserPriority to control the order in which jobs are matched to machines, but I never want jobs preempted.
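To make the behavior I'm after concrete, here is a minimal sketch in Python. It is not how Condor's negotiator works; the `Machine`, `Job`, `pick_machine`, and `match_all` names, the rank values, and the extra machine names are all made up for illustration. The only things taken from my pool are the idea that a lower UserPriority value is better and that a claimed machine should never be touched.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Machine:
    name: str      # hypothetical slot name
    rank: float    # how well the machine ranks for the job (higher is better)
    claimed: bool  # True if another user's job is already running there


@dataclass
class Job:
    owner: str
    user_priority: float  # lower value means better priority


def pick_machine(machines: List[Machine]) -> Optional[Machine]:
    """Return the highest-ranked machine that is not in use, or None."""
    free = [m for m in machines if not m.claimed]
    return max(free, key=lambda m: m.rank, default=None)


def match_all(jobs: List[Job], machines: List[Machine]) -> List[Tuple[str, str]]:
    """Match jobs in UserPriority order without ever preempting anyone."""
    matches = []
    for job in sorted(jobs, key=lambda j: j.user_priority):
        machine = pick_machine(machines)
        if machine is None:
            break  # nothing free: the job waits, nobody is preempted
        machine.claimed = True
        matches.append((job.owner, machine.name))
    return matches


if __name__ == "__main__":
    machines = [
        Machine("vm1", rank=10.0, claimed=True),   # best rank, but busy
        Machine("vm2", rank=5.0, claimed=False),
        Machine("vm3", rank=7.0, claimed=False),
    ]
    jobs = [Job("aap", 0.500216), Job("hip", 0.50)]
    print(match_all(jobs, machines))
```

In this sketch the busy vm1 is skipped even though it ranks best, and the user with the better priority simply gets first pick of the free machines.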

 

I've attached my NegotiatorLog for this event.  I couldn't find anything else useful in my logs, but I'd be happy to post more if anyone thinks that would help.
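In case it helps anyone looking through a similar log, below is a small Python sketch for pulling out the preemption decisions. It assumes the "Preempting ... (prio=...) on ... for ... (prio=...)" line format that appears in my attached log; the function name `find_preemptions` and the dictionary keys are my own invention, and other Condor versions may word the line differently.

```python
import re

# Pattern based on the "Preempting" line in the attached NegotiatorLog.
PREEMPT_RE = re.compile(
    r"Preempting (\S+) \(prio=([\d.]+)\) on (\S+) for (\S+) \(prio=([\d.]+)\)"
)


def find_preemptions(log_text):
    """Return one dict per preemption decision found in the log text."""
    # Long lines are sometimes wrapped with a trailing backslash; rejoin them.
    unwrapped = log_text.replace("\\\n", "")
    events = []
    for line in unwrapped.splitlines():
        m = PREEMPT_RE.search(line)
        if m:
            victim, victim_prio, machine, winner, winner_prio = m.groups()
            events.append({
                "victim": victim,
                "victim_prio": float(victim_prio),
                "machine": machine,
                "winner": winner,
                "winner_prio": float(winner_prio),
            })
    return events


if __name__ == "__main__":
    import sys
    # Usage: python find_preemptions.py NegotiatorLog
    with open(sys.argv[1]) as f:
        for event in find_preemptions(f.read()):
            print(event)
```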

 

Thanks a lot!

-Colin

 


7/13 11:48:38 ---------- Started Negotiation Cycle ----------
7/13 11:48:39 Phase 1:  Obtaining ads from collector ...
7/13 11:48:39   Getting all public ads ...
7/13 11:48:39 Trying to query collector <xxx:9618>
7/13 11:48:39 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
7/13 11:48:39 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
7/13 11:48:39   Sorting 120 ads ...
7/13 11:48:39   Getting startd private ads ...
7/13 11:48:39 Trying to query collector <xxx:9618>
7/13 11:48:39 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
7/13 11:48:39 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
7/13 11:48:39 Got ads: 120 public and 52 private
7/13 11:48:39 Public ads include 19 submitter, 52 startd
7/13 11:48:39 Phase 2:  Performing accounting ...
7/13 11:48:49 Phase 3:  Sorting submitter ads by priority ...
7/13 11:48:49 Phase 4.1:  Negotiating with schedds ...
7/13 11:48:49     NumStartdAds = 52
7/13 11:48:49     NormalFactor = 51.729545
7/13 11:48:49     MaxPrioValue = 5.200722
7/13 11:48:49     NumScheddAds = 19
7/13 11:48:49   Negotiating with hip@xxxxxxxxxxxxx at <xxx:32774>
7/13 11:48:49 0 seconds so far
7/13 11:48:49 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:49   Calculating schedd limit with the following parameters
7/13 11:48:49     ScheddPrio       = 0.500000
7/13 11:48:49     ScheddPrioFactor = 1.000000
7/13 11:48:49     scheddShare      = 0.201074
7/13 11:48:49     scheddAbsShare   = 0.166667
7/13 11:48:49     ScheddUsage      = 0
7/13 11:48:49     scheddLimit      = 10
7/13 11:48:49     MaxscheddLimit   = 10
7/13 11:48:49 Socket to <xxx:32774> already in cache, reusing
7/13 11:48:49     Sending SEND_JOB_INFO/eom
7/13 11:48:49     Getting reply from schedd ...
7/13 11:48:49     Got JOB_INFO command; getting classad/eom
7/13 11:48:49     Request 00242.00000:
7/13 11:48:49       Preempting dxn@xxxxxxxxxxxxx (prio=5.20) on vm1@xxxxxxxxxxxxxxxxx for hip@xxxxxxxxxxxxx (prio=0.50)
7/13 11:48:50       Connecting to startd vm1@xxxxxxxxxxxxxxxxx at <xxx:32774>
7/13 11:48:50 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
7/13 11:48:50 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
7/13 11:48:50       Sending MATCH_INFO/capability
7/13 11:48:50       (Capability is "<xxx:32774>#1152738936#12" )
7/13 11:48:50       Sending PERMISSION, capability, startdAd to schedd
7/13 11:48:50       Notifying the accountant
7/13 11:48:50       Successfully matched with vm1@xxxxxxxxxxxxxxxxx
7/13 11:48:50     Sending SEND_JOB_INFO/eom
7/13 11:48:50     Getting reply from schedd ...
7/13 11:48:50     Got NO_MORE_JOBS;  done negotiating
7/13 11:48:50   Schedd hip@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with mhc@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd mhc@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50   Negotiating with aap@xxxxxxxxxxxxx at <xxx:32811>
7/13 11:48:50 0 seconds so far
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Calculating schedd limit with the following parameters
7/13 11:48:50     ScheddPrio       = 0.500216
7/13 11:48:50     ScheddPrioFactor = 1.000000
7/13 11:48:50     scheddShare      = 0.200987
7/13 11:48:50     scheddAbsShare   = 0.166667
7/13 11:48:50     ScheddUsage      = 0
7/13 11:48:50     scheddLimit      = 10
7/13 11:48:50     MaxscheddLimit   = 10
7/13 11:48:50 Socket to <xxx:32811> already in cache, reusing
7/13 11:48:50     Sending SEND_JOB_INFO/eom
7/13 11:48:50     Getting reply from schedd ...
7/13 11:48:50     Got JOB_INFO command; getting classad/eom
7/13 11:48:50     Request 05321.00000:
7/13 11:48:50       Connecting to startd vm2@xxxxxxxxxxxxxxxxx at <xxx:32773>
7/13 11:48:50 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
7/13 11:48:50 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
7/13 11:48:50       Sending MATCH_INFO/capability
7/13 11:48:50       (Capability is "<xxx:32773>#1149947559#1780" )
7/13 11:48:50       Sending PERMISSION, capability, startdAd to schedd
7/13 11:48:50       Notifying the accountant
7/13 11:48:50       Successfully matched with vm2@xxxxxxxxxxxxxxxxx
7/13 11:48:50     Sending SEND_JOB_INFO/eom
7/13 11:48:50     Getting reply from schedd ...
7/13 11:48:50     Got NO_MORE_JOBS;  done negotiating
7/13 11:48:50   Schedd aap@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with mrt@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd mrt@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with jha@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd jha@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with jha@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd jha@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:48:50   Negotiating with dxn@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:48:50   Schedd dxn@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:48:50 ---------- Finished Negotiation Cycle ----------
7/13 11:49:10 NEGOTIATOR_CYCLE_DELAY is undefined, using default value of 20
7/13 11:49:10 ---------- Started Negotiation Cycle ----------
7/13 11:49:10 Phase 1:  Obtaining ads from collector ...
7/13 11:49:10   Getting all public ads ...
7/13 11:49:10 Trying to query collector <xxx:9618>
7/13 11:49:10 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
7/13 11:49:10 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
7/13 11:49:10   Sorting 120 ads ...
7/13 11:49:10   Getting startd private ads ...
7/13 11:49:10 Trying to query collector <xxx:9618>
7/13 11:49:10 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
7/13 11:49:10 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
7/13 11:49:10 Got ads: 120 public and 52 private
7/13 11:49:10 Public ads include 19 submitter, 52 startd
7/13 11:49:10 Phase 2:  Performing accounting ...
7/13 11:49:13 Phase 3:  Sorting submitter ads by priority ...
7/13 11:49:13 Phase 4.1:  Negotiating with schedds ...
7/13 11:49:13     NumStartdAds = 52
7/13 11:49:13     NormalFactor = 51.741117
7/13 11:49:13     MaxPrioValue = 5.202501
7/13 11:49:13     NumScheddAds = 19
7/13 11:49:13 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:49:13   Negotiating with mhc@xxxxxxxxxxxxx skipped because no idle jobs
7/13 11:49:13   Schedd mhc@xxxxxxxxxxxxx got all it wants; removing it.
7/13 11:49:13   Negotiating with hip@xxxxxxxxxxxxx at <xxx:32774>
7/13 11:49:13 0 seconds so far
7/13 11:49:13 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
7/13 11:49:13   Calculating schedd limit with the following parameters
7/13 11:49:13     ScheddPrio       = 0.500036
7/13 11:49:13     ScheddPrioFactor = 1.000000
7/13 11:49:13     scheddShare      = 0.201083
7/13 11:49:13     scheddAbsShare   = 0.166667
7/13 11:49:13     ScheddUsage      = 0
7/13 11:49:13     scheddLimit      = 10
7/13 11:49:13     MaxscheddLimit   = 10
7/13 11:49:13 Socket to <xxx:32774> already in cache, reusing
7/13 11:49:13     Sending SEND_JOB_INFO/eom
7/13 11:49:13     Getting reply from schedd ...
7/13 11:49:13     Got NO_MORE_JOBS;  done negotiating
7/13 11:49:13   Schedd hip@xxxxxxxxxxxxx got all it wants; removing it.