
Re: [HTCondor-users] change spool file location from the local administration machine to a network one



Hello Chris,
 
I also thought of that, though as a last resort. At the moment I am trying to use more machines to do the administration. I am afraid that even with an SSD in a single machine, the bottleneck might become the processing speed of that machine. More machines will provide more than enough processing power and will ease (but not solve) the hard disk transfer speed issue, though it would have been much easier if they all stored the checkpoint data in the network location.
 
A.
 
Sent: Monday, March 25, 2013 2:10 PM
Subject: Re: [HTCondor-users] change spool file location from the local administration machine to a network one
 
I might recommend an SSD for the submit machine's "spool" area. I have experienced extremely fast I/O access, reduced latency and improved throughput just by installing SSDs. Just one item you might want to consider.
    -C
On 3/25/13 8:59 AM, Antonis Sergis wrote:
Hello,
 
We have Condor installed on campus (around 3500 machines available) and I am trying to submit around 3000 jobs per instance. I have installed Condor on my office machine, which acts as the server that administers the submissions and orchestrates the whole thing. The problem is that my hard disk is not fast enough to keep track of more than 400-500 machines (I have checked the disk queue length while Condor is running and it is rather large). We have a network storage scheme which is extremely fast. I was wondering how I can store the “spool” file that keeps the checkpoints for every job in my network space instead of on my local machine. I have benchmarked the network storage location and it is fast enough to do the job. The problem is that I don’t know how to make my machine use the network for checkpoint storage instead of the local disk in my computer.
 
I have seen the “checkpoint server” option, but I am not sure whether there is a simpler way to do this.
 
Any ideas?
 
Thanks


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/

