
[Condor-users] Number of Condor clients dropping over time




We have about 30 PCs (~50 slots) running Windows XP. About 6 months ago, roughly 25 of the machines were running Condor. We then got a couple more machines and added Condor to a few others. However, about 3 months ago jobs started running on only about 10 machines. (About 15 of the machines had been using Condor successfully for more than a year.)

When we run condor_q -ana, we find the same machines running jobs, while the other jobs are rejected because of the host machines' requirements. I checked the ClassAds by running the "condor_submit -verbose" and "condor_status -long" commands, and I couldn't find any requirements that were not being met. We also compared the ClassAds of several machines that were running jobs with those of machines that were not, and could find no differences (except, of course, the extra attributes added by the job running on the machine).
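
For anyone repeating this comparison, here is a minimal Python sketch of one way to diff two machine ClassAds, assuming each ad has been saved as the "Attr = Value" text that "condor_status -long" prints. The function names and the two sample ads are made up for illustration; they are not real output from our pool, and this is not a Condor tool.

```python
def parse_classad(text):
    """Parse 'Attr = Value' lines into a dict of attribute name -> raw value string."""
    ad = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if sep and name.strip():
            ad[name.strip()] = value.strip()
    return ad


def diff_classads(ad_a, ad_b, ignore=()):
    """Return {attr: (value_in_a, value_in_b)} for attributes whose values differ.

    An attribute missing from one ad is reported with None on that side.
    """
    diffs = {}
    for name in sorted(set(ad_a) | set(ad_b)):
        if name in ignore:
            continue
        a, b = ad_a.get(name), ad_b.get(name)
        if a != b:
            diffs[name] = (a, b)
    return diffs


# Hypothetical sample ads (illustrative values only).
RUNNING_AD = """Name = "slot1@pc01"
OpSys = "WINNT51"
Arch = "INTEL"
KeyboardIdle = 1200
Start = KeyboardIdle > 900
"""

IDLE_AD = """Name = "slot1@pc17"
OpSys = "WINNT51"
Arch = "INTEL"
KeyboardIdle = 30
Start = KeyboardIdle > 900
"""

if __name__ == "__main__":
    # Name always differs between machines, so it is ignored here.
    result = diff_classads(parse_classad(RUNNING_AD), parse_classad(IDLE_AD),
                           ignore={"Name"})
    for attr, (a, b) in result.items():
        print(f"{attr}: {a!r} vs {b!r}")
```

The values are compared as raw strings, so an expression that is written differently but evaluates the same would still show up as a difference.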

If any of you have had a similar problem and found a fix, please respond. Also, if you can think of something else that I can check that could be causing this problem, let me know.

Our START expression on all of the machines is set to the default values: keyboard idle > 15 minutes, etc.
We are using a credd_host located on the master. This was working well for a few weeks before several machines started refusing to run jobs.

I am the primary user of our Condor pool but do not have admin privileges.

Thanks,

Jason