Re: [Condor-users] job's id: cluster.process


<-----Original Message----->
>From: Todd Tannenbaum
>Sent: 12/20/2007 4:38:58 PM
>To: condor-users@xxxxxxxxxxx
>Subject: Re: [Condor-users] job's id: cluster.process
>waka jawaka wrote:
>> Hi, I was wondering what the cluster part of the id means,
>It is simply a handle to a group of similar jobs. Because all jobs
>within a cluster must (a) share the same executable and (b) be submitted
>within the same transaction, Condor is able to store the job information
>more efficiently. Thus 20,000 jobs submitted in one cluster will use
>less system resources than 20,000 jobs submitted in 20,000 separate
>clusters.
>> and how is it selected (randomly, cyclic or any other computation).
>The cluster id is selected by the condor_schedd; it is just a
>monotonically increasing number that starts out at 1 and subsequently
>increases by 1. Unfortunately, I think it will wrap back to 0 once it
>goes above 4 billion -- hopefully that is not a concern. :)
>> Also, if you submit a job A and then submit another one B, can a
>> situation occur where A and B have the same cluster in their ids and
>> simply successive process numbers? (example A: 124.0, B: 124.1)
>No, because the submission of jobs into a cluster must happen within the
>same transaction. When using the command-line tools, this means that
>each invocation of condor_submit will result in a new cluster number.
>Submission of multiple jobs into one cluster can be done with
>condor_submit via multiple "queue" statements or "queue n" (where n > 1)
>statements within one submit file.
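
To make the numbering concrete, here is a small toy model in Python. It
is not Condor code and says nothing about how the condor_schedd is
actually implemented; the ToySchedd class and its submit() method are
made-up names for illustration only. It assumes just what is described
above: the cluster id is a counter that starts at 1, each submit
transaction takes the next cluster id, and the jobs inside it are
numbered 0, 1, 2, ... The wrap-around is not modeled.

```python
class ToySchedd:
    """Toy model of job id assignment; hypothetical, not a Condor API."""

    def __init__(self):
        # Per the description above, cluster ids start at 1.
        self._next_cluster = 1

    def submit(self, n_jobs):
        """Model one condor_submit transaction that queues n_jobs jobs."""
        if n_jobs < 1:
            raise ValueError("a submission must queue at least one job")
        cluster = self._next_cluster
        self._next_cluster += 1  # the next transaction gets a new cluster
        # Process numbers count up from 0 within the cluster.
        return [f"{cluster}.{proc}" for proc in range(n_jobs)]


schedd = ToySchedd()
job_a = schedd.submit(1)  # first condor_submit, a single "queue"
job_b = schedd.submit(1)  # second condor_submit, so a new cluster
batch = schedd.submit(3)  # one submit file with "queue 3"
print(job_a, job_b, batch)
```

In this model A and B can never share a cluster, because each call to
submit() stands for a separate transaction; only the "queue 3" case
produces ids that differ in the process number alone.
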
>> If you know of a link to a detailed explanation about this subject,
>> I'll also appreciate it.
>Hmmm, found this on the condor_submit man page:
>> condor_submit requires a submit description file which contains
>> commands to direct the queuing of jobs. One submit description file
>> may contain specifications for the queuing of many Condor jobs at
>> once. A single invocation of condor_submit may cause one or more
>> clusters. A cluster is a set of jobs specified in the submit
>> description file between queue commands for which the executable is
>> not changed. It is advantageous to submit multiple jobs as a single
>> cluster because:
>> * Only one copy of the checkpoint file is needed to represent all
>>   jobs in a cluster until they begin execution.
>> * There is much less overhead involved for Condor to start the next
>>   job in a cluster than for Condor to start a new cluster. This can
>>   make a big difference when submitting lots of short jobs.
>> Multiple clusters may be specified within a single submit description
>> file. Each cluster must specify a single executable.
>hope this helps,