Re: [HTCondor-users] check which job is running on a wn at specific time
- Date: Wed, 08 Jul 2020 15:19:37 +0530
- From: ervikrant06@xxxxxxxxx
- Subject: Re: [HTCondor-users] check which job is running on a wn at specific time
This may not directly answer your question, but I thought it might be of some help:
I recently came across the following command, which I found very useful for getting the history of jobs that ran on an executor node. You need to run it on the executor node itself. It is handy for troubleshooting, because it shows the jobs that ran on the node during the time of an issue, including jobs submitted from different schedulers. It covers history only, not currently running jobs.
condor_history -file `condor_config_val LOG`/startd_history -limit 2 -af remotehost globaljobid
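To narrow that history down to a specific moment, the output can be post-filtered. The following is only a minimal sketch: `running_at` is a hypothetical helper, and it assumes each input line starts with a start time and an end time in Unix epoch seconds. Which job attributes supply those two values in the startd history is not stated here, so the names in the usage comment are placeholders to check against `condor_history -l` output on your own node.

```shell
#!/bin/sh
# running_at EPOCH
# Hypothetical helper: reads records "START END REST..." on stdin, with
# START and END in Unix epoch seconds, and prints REST for every record
# whose [START, END] interval contains EPOCH.
# Records with a non-numeric START or END (e.g. "undefined") are skipped.
running_at() {
    awk -v t="$1" '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $1 <= t && t <= $2 {
            # Drop the two timestamp columns and print what is left.
            $1 = ""; $2 = ""; sub(/^ +/, ""); print
        }'
}

# Possible usage on the executor node. START_ATTR and END_ATTR are
# placeholders (assumptions), not confirmed attribute names:
#   condor_history -file `condor_config_val LOG`/startd_history \
#       -af START_ATTR END_ATTR remotehost globaljobid | running_at 1594201777
```

Filtering after the fact keeps the condor_history call itself unchanged from the one shown above.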
We have clusters of 400+ nodes. We capture condor_who output at 1-minute intervals, and it does not seem to cause any issues for us.
Thanks & Regards,
This seems like an everyday HTC admin problem to me, so lateral brain in gear, everyone :)
When certain effects show up on a worker node, I often want to know whether they are job related, so I would like to quickly check which jobs were running on a host at a certain point in time.
I know this does not sound spectacular, but since you need to check the active queue and the history at the same time and get the timestamps right, maybe someone has already scripted something to get a quick result?
As an option, I thought about running 'condor_who >> /var/log/condor/who.log' every couple of minutes or so, but I am uncertain whether this would put too much load on the schedd or collector, as the condor_who command seems to run around quite a bit to gather its statistics ...
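A plain append like the one above leaves the snapshots without timestamps, which makes it hard to match an entry to an incident time. A minimal sketch of a timestamped variant follows; `log_snapshot` is a hypothetical helper name, and the UTC header format is just one possible choice.

```shell
#!/bin/sh
# log_snapshot LOGFILE [COMMAND ...]
# Hypothetical helper: appends one timestamped snapshot of COMMAND's
# output (stdout and stderr) to LOGFILE. With no COMMAND given it runs
# condor_who, as in the one-liner from the message.
log_snapshot() {
    logfile=$1
    shift
    # Default to condor_who when no command is supplied.
    [ "$#" -eq 0 ] && set -- condor_who
    {
        # UTC header so each snapshot can be matched to an incident time.
        printf '=== %s ===\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
        "$@" 2>&1
    } >> "$logfile"
}
```

Calling `log_snapshot /var/log/condor/who.log` from a cron job every minute or every couple of minutes would reproduce the approach described here. The log grows without bound, so some form of rotation would be needed as well.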
Building 02b, Room 009