
[HTCondor-users] condor_submit how to avoid bottleneck



Hi All,

We have an HTCondor cluster of 200 machines. We need to push a large number of jobs (~50k) through the cluster on a daily basis. Currently all job submissions are done from a single machine. There may be thousands of jobs running concurrently at any given time.

It seems like having a single submit machine is not the best choice. Thousands of condor_shadow processes run on the submit machine at the same time, and it's becoming a bottleneck. In a recent incident, the condor_shadow processes running on the submit machine were consuming a high percentage of CPU.

Given that there is one condor_shadow on the submit machine per running job, I would like to know whether there is a way for HTCondor to automatically distribute job submission across the cluster, e.g. by using a random condor_schedd for every job?
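
To illustrate the kind of thing I mean, here is a rough sketch of a wrapper that picks a schedd at random for each submission. The host names and the build_submit_command helper are hypothetical, and it assumes several machines in the pool already run a condor_schedd that accepts submissions from this host. It only builds the command line; the -name option of condor_submit selects which schedd receives the job.

```python
import random

# Hypothetical schedd names; replace with the real submit hosts in the pool.
SCHEDD_NAMES = [
    "submit1.example.com",
    "submit2.example.com",
    "submit3.example.com",
]


def build_submit_command(submit_file, schedds=SCHEDD_NAMES, rng=random):
    """Return a condor_submit argument list targeting one randomly chosen schedd."""
    schedd = rng.choice(schedds)
    # -name tells condor_submit to submit to the named condor_schedd
    # instead of the default local one.
    return ["condor_submit", "-name", schedd, submit_file]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch is safe to try.
    print(" ".join(build_submit_command("job.sub")))
```

The resulting list could be passed to subprocess.run, but I would prefer something built into HTCondor over a wrapper like this.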



Thanks
Jason
