
Re: [HTCondor-users] slow creation of condor_shadow processes



Hello Experts,

Any thoughts on the following query?

On Tue, 23 Feb, 2021, 17:19 Vikrant Aggarwal, <ervikrant06@xxxxxxxxx> wrote:
Hello Experts,

On the bad condor submit boxes I am seeing a very slow creation rate for condor_shadow processes, 10-15/s, while on the good condor submit boxes the rate is 100-150/s.

When I submit a large batch of more than 5k jobs, the jobs are in R state but I see no slot allocated to them. I am using the following command to count the jobs in R state without any slot allocated. As each shadow process gets created, a slot starts showing for the running job. I noticed a delay of approx. 20 min between slot allocation to the first and the last job. Slow shadow process creation is definitely causing this issue.

while true; do condor_q -run -nobatch | grep -v 'slot' | wc -l; sleep 10; done

Command used to calculate the number of shadow processes created per second for a batch:

grep 'Starting add_shadow_birthdate(616' /var/log/condor/SchedLog | awk '{print $2}' | sort | uniq -c | less
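The per-second counts from the command above can also be reduced to a single summary line, which makes it easier to compare a good and a bad submit box. This is only a sketch: shadow_rate is a hypothetical helper name, and it assumes that each SchedLog line starts with a date field followed by an HH:MM:SS time field (the field that awk '{print $2}' picks out above).

```shell
# shadow_rate [CLUSTER_ID]   (hypothetical helper, not part of HTCondor)
# Reads SchedLog-style lines on stdin and prints how many seconds saw at
# least one shadow start, the total number of starts, and the average and
# peak starts per second.
# Assumption: field 1 is the date and field 2 is the time (HH:MM:SS).
shadow_rate() {
    # -F treats the pattern as a fixed string, so "(" needs no escaping.
    # With no argument, shadow starts for all clusters are counted.
    grep -F "Starting add_shadow_birthdate($1" |
    awk 'BEGIN { n = 0; total = 0; max = 0 }
         { count[$1 " " $2]++ }
         END {
             for (t in count) {
                 n++
                 total += count[t]
                 if (count[t] > max) max = count[t]
             }
             if (n > 0)
                 printf "seconds=%d total=%d avg=%.1f max=%d\n", n, total, total / n, max
         }'
}
```

For cluster 616 it would be run as: shadow_rate 616 < /var/log/condor/SchedLog. Note that seconds with no shadow start at all are not counted, so the average describes only the seconds in which the schedd was actually starting shadows.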

Troubleshooting Done:

- The condor configuration is managed through the same configuration management tool.
- The condor version in use is 8.5.8 (dev) on both boxes.
- Tried applying the HTCondor kernel tuning script on the bad submit box, but no luck.

I am seeing the same issue on an 8.8.5 (stable) condor submit box as well.

Any input that helps pinpoint the issue is highly appreciated.

Thanks & Regards,
Vikrant Aggarwal