
Re: [HTCondor-users] Outage timekeeping?



You can use the following program:

https://dist.epipe.com/downtimed/
https://github.com/snabb/downtimed

Regards



On Thu, 2021-06-03 at 20:45 +0000, Michael Pelletier via HTCondor-users wrote:

Has anyone cooked up a good way to keep statistics on exec node outages? I'm looking for something comparable to the SLURM stat from sreport.

 

I've got a couple of ideas, but I'm not really sure how they'd work or if they'd be efficient and reliable. One idea is a startd cron or schedd cron job to report the current time into a state file, and then update a "downtime" value when a gap larger than the query interval appears there.
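
The state-file idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an established HTCondor recipe: the state file path, the 60-second interval, the 30-second slack, and the attribute names OutageSeconds and OutageCount are all hypothetical. It assumes the script runs as a periodic startd cron job whose "Attribute = value" lines on stdout end up in the machine ad.

```python
import json
import os
import time

STATE_PATH = "/var/lib/condor/outage_state.json"  # hypothetical location
INTERVAL = 60  # assumed cron period, in seconds
SLACK = 30     # assumed tolerance for a late run, in seconds


def update_state(state, now, interval=INTERVAL, slack=SLACK):
    """Return a new state dict with the heartbeat advanced to `now`.

    If the gap since the previous heartbeat exceeds interval + slack,
    the part of the gap beyond one normal interval counts as downtime.
    """
    last = state.get("last_seen")
    downtime = state.get("downtime", 0)
    outages = state.get("outages", 0)
    if last is not None and now - last > interval + slack:
        downtime += now - last - interval
        outages += 1
    return {"last_seen": now, "downtime": downtime, "outages": outages}


def load_state(path):
    """Read the previous state; start fresh if it is missing or corrupt."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_state(path, state):
    """Write to a temp file and rename, so a crash cannot truncate the state."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def main(path=STATE_PATH):
    state = update_state(load_state(path), int(time.time()))
    save_state(path, state)
    # Hypothetical attribute names, printed in "Attribute = value" form.
    print(f"OutageSeconds = {state['downtime']}")
    print(f"OutageCount = {state['outages']}")


if __name__ == "__main__":
    main()
```

Note that a gap only shows that the job did not run. It cannot tell a node outage apart from a stopped startd or a stalled cron job, so the totals are an upper bound on real node downtime.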

 

However, I'm wondering if there is an established way to create persistent machine ClassAds without involving state files.

 

Thanks for any ideas you might have.

 

Michael V Pelletier

Principal Engineer


C: +1 339.293.9149
michael.v.pelletier@xxxxxxx


Raytheon Technologies

Information Technology

50 Apple Hill Drive

Tewksbury, MA 01876-1198

 


 

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users


The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/