
Re: [HTCondor-users] scheduler signal



Hi Don,

You are running with glideins at Comet from OSG? Are you sure? I don't think we have an entry in the glidein factories submitting to Comet at this time. But maybe I can help. 

Marty Kandes
UCSD Glidein Factory Operations 



From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Krieger, Donald N. [kriegerd@xxxxxxxx]
Sent: Tuesday, June 21, 2016 12:32 PM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] scheduler signal

I am running on the OSG.

I have a limited allocation on Comet, which I use to run glideins, each of which provides 24 slots on which only my jobs run.

I set parameter MINS_UNTIL_RETIREMENT to be 15 minutes less than the time the glidein runs.

I had been using 10 minutes, as that is a more efficient number given the relatively short run time of my jobs.
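
To make the arithmetic concrete, here is a minimal sketch of how such a retirement value could be derived. It assumes that the 15 and 10 above are safety margins in minutes subtracted from the glidein's total run time. The function name and the 120-minute run time are hypothetical, chosen only for illustration, and are not taken from the actual setup.

```python
def mins_until_retirement(glidein_runtime_mins: int, margin_mins: int = 15) -> int:
    """Return the retirement time as (glidein run time - safety margin), in minutes.

    Hypothetical helper: it only illustrates the subtraction described in the
    message and does not read or write any HTCondor configuration.
    """
    if margin_mins < 0:
        raise ValueError("margin must be non-negative")
    if margin_mins > glidein_runtime_mins:
        raise ValueError("margin cannot exceed the glidein run time")
    return glidein_runtime_mins - margin_mins


# Assumed example: a glidein that runs for 120 minutes.
print(mins_until_retirement(120))      # 15-minute margin
print(mins_until_retirement(120, 10))  # 10-minute margin
```

A smaller margin leaves more of the glidein's lifetime available for accepting new jobs, which matters most when individual jobs are short.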

 

For the 3rd time, I am seeing my glideins dying because they do not get assigned jobs within 15 minutes of startup. 

Is there a way to signal the scheduler that there are idle cores available which are reserved for a particular user?

Or is there a reasonable way to implement it?

And is there some other parameter I'm missing that will keep my glideins alive until they are assigned jobs?

 

 

Best - Don