
Re: [HTCondor-users] Batch slurm gahp is checking job status too frequently. Can be configured?



> On Dec 7, 2016, at 2:34 PM, Marco Mambelli <marcom@xxxxxxxx> wrote:
> 
> Thanks Brian,
> 
>> On Dec 7, 2016, at 2:17 PM, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:
>> 
>> 
>>> On Dec 7, 2016, at 2:06 PM, Marco Mambelli <marcom@xxxxxxxx> wrote:
>>> 
>>> In a slurm cluster accessed via BOSCO (grid universe, batch slurm) the system administrators complained that scontrol is invoked too frequently.
>>> 
>>> Is there a way to cache the results in the batch gahp and reduce the frequency of these requests (e.g. wait at least X seconds before issuing another request)?
>> 
>> I know this is done in the latest developer series.  It might be worth looking through the release notes to see whether it was backported to the stable series (and when exactly it was released).
> 
> Is this controlled on the submit host where the schedd resides (and the remote gahp is called) or on the remote machine where blahp is installed?
> If you could point me to the right documentation or variable name, that would be great.
> 

Caching is implemented by the scripts invoked by the blahp, i.e., on the remote machine.
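
To illustrate the general idea (wait at least X seconds before issuing another request), here is a minimal sketch of a time-based status cache. It is not the actual blahp code: the class name StatusCache, the min_interval parameter, and the scontrol_query helper are all hypothetical, and the real status scripts may cache differently.

```python
import subprocess
import time


class StatusCache:
    """Reuse a job-status lookup for at least `min_interval` seconds."""

    def __init__(self, query, min_interval=60.0, clock=time.monotonic):
        self._query = query                # callable: job_id -> status text
        self._min_interval = min_interval  # hypothetical tuning knob, in seconds
        self._clock = clock                # injectable so the cache can be tested
        self._entries = {}                 # job_id -> (timestamp, status)

    def status(self, job_id):
        now = self._clock()
        cached = self._entries.get(job_id)
        if cached is not None and now - cached[0] < self._min_interval:
            return cached[1]               # still fresh: no new request issued
        result = self._query(job_id)       # stale or missing: ask the scheduler
        self._entries[job_id] = (now, result)
        return result


def scontrol_query(job_id):
    # Assumption: a plain `scontrol show job <id>` call; the real scripts
    # may use different commands or options.
    completed = subprocess.run(
        ["scontrol", "show", "job", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout
```

With something like cache = StatusCache(scontrol_query, min_interval=60), repeated cache.status(job_id) calls within 60 seconds would reuse the stored answer instead of running scontrol again.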

> 
>> 
>>> 
>>> Thank you,
>>> Marco
>>> 
>>> PS What is the difference between grid types "batch LRM" and "LRM" (where LRM is pbs, slurm, ...)?
>> 
>> These should be aliases.... I believe the preferred mechanism is "LRM"?
> 
> So both grid types end up invoking the remote gahp with the same parameters
> 

I believe additional parameters need to be added for the remote_gahp to be involved.

Brian