
Re: [Condor-users] Submission of large numbers of jobs

On 11/07/2012 09:04, "Brian Candler" <B.Candler@xxxxxxxxx> wrote:

>On Wed, Jul 11, 2012 at 09:30:40AM +0200, Martin Kudlej wrote:
>>> From what I've read, each job will take ~10KB of RAM in the schedd,
>>> so 100K jobs would be about 1G of RAM just for the job queue. If I
>>> can afford that, is there anything else to worry about?
>> You can configure more than one scheduler.
>> Also you should consider doing it on a 64-bit arch. because of
>> memory allocation by one daemon.
> All machines are built as 64-bit anyway (some nodes have 96GB RAM)
>For multiple schedulers, I found
>Thanks for the pointers!

Hi Brian,
A colleague here tried multiple schedds and ran into some complications.
(I can send you his email if you like.)
In the end we (so far) just have one big machine for submitting and a
smaller one for matchmaking. Many of our users submit tens of thousands
of jobs at the same time, and occasionally 100,000.
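For reference, if you do want to experiment with a second schedd on the
same host, the usual approach is HTCondor's local-name mechanism, along
these lines (a sketch only; the name "schedd2" and the spool path are
placeholders, so check the manual before deploying):

    # Define a second schedd as a copy of the standard one
    SCHEDD2      = $(SCHEDD)
    SCHEDD2_ARGS = -local-name schedd2
    # Give it its own name and spool directory so the queues don't collide
    SCHEDD.SCHEDD2.SCHEDD_NAME = schedd2@
    SCHEDD.SCHEDD2.SPOOL       = $(SPOOL)/schedd2
    # Have the master start it alongside the existing daemons
    DAEMON_LIST = $(DAEMON_LIST) SCHEDD2

Users would then target it explicitly, e.g.
"condor_submit -name schedd2@submit.example.org job.sub" (hostname
hypothetical) -- which is itself one of the complications, since jobs no
longer all land in one queue.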

I guess the other thing is how many slots/cores you have, as that is the
maximum number of shadow processes that will run. We only have around
2000, so that too works out OK.
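If the shadow count is a concern, it can also be capped explicitly on
the submit machine rather than relying on the slot count; something like
the following in the schedd's configuration (the value 2000 is just an
example matching our setup):

    # Upper bound on simultaneously running jobs from this schedd,
    # and therefore on the number of condor_shadow processes it spawns
    MAX_JOBS_RUNNING = 2000
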
Ian Cottam
IT Services -- supporting research
Faculty of Engineering and Physical Sciences
The University of Manchester
"The only strategy that is guaranteed to fail is not taking risks." Mark