
Re: [Condor-users] User jobs stuck on idle



Hi,

On Tue, Jul 20, 2010 at 3:43 PM, Sassy Natan <sassyn@xxxxxxxxx> wrote:
> Did you try condor_rm "job_id_number"?

Not yet. I was hoping I could diagnose why the jobs were not being resubmitted.
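
For anyone reading the analysis quoted below, here is a rough sketch of the arithmetic behind condition 1, the only condition flagged REMOVE. It assumes ImageSize is in KB and Memory/RequestMemory are in MB, and it takes the 1000000 figure from the quoted output; the function names are made up for illustration. It is not Condor's matchmaking code, only a check of the numbers.

```python
import math

KB_PER_MB = 1024  # assumption: ImageSize is in KB, Memory in MB


def min_memory_mb(image_size_kb):
    """Smallest whole-MB Memory value with 1024 * Memory >= ImageSize."""
    return math.ceil(image_size_kb / KB_PER_MB)


def memory_clause_ok(machine_memory_mb, image_size_kb, request_memory_mb=None):
    """Evaluate both halves of the memory clause from the Requirements expression.

    If request_memory_mb is not given, fall back to ceiling(ImageSize / 1024),
    which mirrors the 9.765625E+02 default shown in the quoted analysis.
    """
    if request_memory_mb is None:
        request_memory_mb = math.ceil(image_size_kb / KB_PER_MB)
    machine_side = machine_memory_mb * KB_PER_MB >= image_size_kb
    job_side = request_memory_mb * KB_PER_MB >= image_size_kb
    return machine_side and job_side


if __name__ == "__main__":
    image_size_kb = 1000000  # value shown in condition 1 of the analysis
    print(image_size_kb / KB_PER_MB)        # 976.5625
    print(min_memory_mb(image_size_kb))     # 977
    print(memory_clause_ok(976, image_size_kb))  # False
    print(memory_clause_ok(977, image_size_kb))  # True
```

Under these assumptions a slot would have to advertise at least 977 MB of Memory to satisfy the first half of the condition.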

Thanks,
David

> On Tue, Jul 20, 2010 at 5:40 PM, David McKechan
> <david.mckechan@xxxxxxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> I have a user whose jobs are stuck on idle after restarting
>> condor_master. The output of condor_q -better_analyze includes:
>> ==================================================
>> 229280.000:  Run analysis summary.  Of 166 machines,
>>     16 are rejected by your job's requirements
>>      0 reject your job because of their own requirements
>>      0 match but are serving users with a better priority in the pool
>>      0 match but reject the job for unknown reasons
>>    150 match but will not currently preempt their existing job
>>      0 match but are currently offline
>>      0 are available to run your job
>>        Last successful match: Mon Jul 12 21:53:17 2010
>>
>> The Requirements expression for your job is:
>>
>> ( target.Arch == "X86_64" ) && ( target.OpSys == "LINUX" ) &&
>> ( ( CkptArch == target.Arch ) || ( CkptArch is undefined ) ) &&
>> ( ( CkptOpSys == target.OpSys ) || ( CkptOpSys is undefined ) ) &&
>> ( target.Disk >= DiskUsage ) &&
>> ( ( ( target.Memory * 1024 ) >= ImageSize ) &&
>> ( ( RequestMemory * 1024 ) >= ImageSize ) )
>>
>>    Condition                         Machines Matched    Suggestion
>>    ---------                         ----------------    ----------
>> 1   ( ( ( 1024 * target.Memory ) >= 1000000 ) && ( ( 1024 * ceiling(ifThenElse(JobVMMemory isnt undefined,JobVMMemory,9.765625000000000E+02)) ) >= 1000000 ) )
>>                                      0                   REMOVE
>> 2   ( target.Arch == "X86_64" )       166
>> 3   ( target.OpSys == "LINUX" )       166
>> 4   ( ( "X86_64" == target.Arch ) )   166
>> 5   ( ( "LINUX" == target.OpSys ) )   166
>> 6   ( target.Disk >= 22500 )          166
>> ==================================================
>>
>> How can I fix this?
>>
>> Thanks,
>> David
>> --
>> Help me raise money for Alzheimer Scotland - http://www.waitup.org.uk
>> _______________________________________________
>> Condor-users mailing list
>> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/condor-users/
>
>



-- 
Help me raise money for Alzheimer Scotland - http://www.waitup.org.uk