
Re: [HTCondor-users] About choosing nodes over slots



It can depend on the job, Vikas.

For example, if your job reads a 100-megabyte input data file from an NFS path, it would be better to have jobs fill a single node before moving to the next, since the Linux disk buffer cache can supply all 32 jobs on that node with the file's blocks while only a single copy is pulled across the network to that machine.

If you go breadth-first, placing one job on each machine before adding a second, then all the machines launch jobs at once, and each of those jobs tries to read the input data from the fileserver at the same time. If you have 100 machines, for example, that's not healthy for the fileserver.
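To put rough numbers on it (back-of-the-envelope, ignoring any caching on the fileserver side): depth-first, the 32 jobs on a node pull roughly one 100 MB copy of the file over the wire, because the other 31 reads are served out of that node's buffer cache. Breadth-first, every machine has to pull its own copy, so with 100 machines the fileserver is asked for on the order of 100 x 100 MB = ~10 GB of read traffic at more or less the same moment.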

But of course every job uses scratch space and file transfers, right?  ;)

	-Michael Pelletier.


> -----Original Message-----
> From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf
> Of Bansal, Vikas
> Sent: Monday, August 14, 2017 12:02 PM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] About choosing nodes over slots
> 
> Hi Max,
> 
> Thanks for your detailed reply.
> We may try to play with the knobs.
> 
> Some thoughts on a general note (I assume this can be independent of the
> batch system e.g. HTCondor, LSF, SLURM, PBS).
> 
> Is there an advantage to submitting jobs so that they fill all slots on a
> single node first? Is that perhaps why it is the default choice?
> 
> Or perhaps, from an admin point of view, it is easier to manage jobs if
> they fill one node first and only then move on to the next node?
> 
> Or perhaps in the end it does not matter, since one gets so many jobs in a
> relatively short time that, either way, all slots on all nodes would
> eventually be taken.
> 
> Thanks,
> Vikas
> 
> 
> 
> 
> On 8/14/17, 12:54 AM, "HTCondor-users on behalf of Fischer, Max (SCC)"
> <htcondor-users-bounces@xxxxxxxxxxx on behalf of max.fischer@xxxxxxx>
> wrote:
> 
> >Hi Vikas,
> >
> >rule of thumb: HTCondor has a knob for it. ;)
> >
> >Which node to run on is determined by the RANK settings of the
> >negotiator, each job, and each worker node. For a start, let me just dump
> >a comment from our local configuration:
> >
> ># -- How does HTCondor schedule jobs? --
> ># Jobs are scheduled to StartDs via a sequence of filtering and sorting.
> ># Condor matches *jobs* to *workers*, not the other way around!
> ># For each job, the following sequence is used:
> >#  - Find all startds which match the job's REQUIREMENTS and vice versa
> >#  - Sort startds by NEGOTIATOR_PRE_JOB_RANK
> >#  - Sort startds by the job's RANK
> >#  - Sort startds by NEGOTIATOR_POST_JOB_RANK
> >#  - If preemption is required, sort startds by PREEMPTION_RANK
> >#  - Assign the job to the highest-ranked startd
> ># Sorting preserves the order of the previous step, so later steps only
> ># have an effect if there was a tie.
> >
> >Each ALL_CAPS word is something set via configuration or on submission.
> >Note that the startd (node) Rank never shows up here - if it is not used
> >by the negotiator or job, it is ignored.
> >If you have a fresh HTCondor, the most influential thing is
> >NEGOTIATOR_PRE_JOB_RANK, which defaults to:
> >	NEGOTIATOR_PRE_JOB_RANK = (10000000 * My.Rank) + (1000000 * (RemoteOwner =?= UNDEFINED)) - (100000 * Cpus) - Memory
> >
> >My.Rank : this is the ranking of the startd. By default it has the
> >highest precedence, but your nodes probably all share the same policy, so
> >it rarely differentiates them.
> >RemoteOwner =?= UNDEFINED : this prefers slots that are not currently
> >running a job. As long as your cluster has free capacity, this basically
> >means "do not kick out running jobs".
> >- Cpus : this prefers nodes with fewer *unused* cores. In effect, this
> >means depth-first filling.
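> >To illustrate with made-up numbers: take two machines with the same Rank
> >(say 0), both with a slot where RemoteOwner is UNDEFINED and 64000 MB of
> >free memory, one with 4 free cores and one with 28. The first scores
> >1000000 - (100000 * 4) - 64000 = 536000, the second scores
> >1000000 - (100000 * 28) - 64000 = -1864000, so the nearly-full machine
> >wins and jobs keep packing onto it.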
> >
> >In other words, the default is to fill up a single node first before
> >moving on to the next.
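> >If you want the opposite - spread jobs out breadth-first - the usual trick
> >is to flip the sign on the Cpus (and Memory) terms in the negotiator's
> >config, something along these lines (an untested sketch, adjust to your
> >setup and test before relying on it):
> >
> >	# prefer machines with MORE unused cores, i.e. spread jobs across
> >	# nodes instead of packing them onto one node first
> >	NEGOTIATOR_PRE_JOB_RANK = (10000000 * My.Rank) + (1000000 * (RemoteOwner =?= UNDEFINED)) + (100000 * Cpus) + Memory
> >
> >and then run condor_reconfig on the negotiator host.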
> >
> >There are a *lot* of knobs to tweak, and their combined effect depends
> >heavily on your cluster setup. Most of the time, keeping the policy simple
> >works best.
> >The manual has some pretty decent info on configuration settings
> >
> >http://research.cs.wisc.edu/htcondor/manual/current/3_5Configuration_Macros.html#SECTION004516000000000000000
> >
> >the effects and examples of scheduling policy configuration
> >
> >http://research.cs.wisc.edu/htcondor/manual/current/3_7Policy_Configuration.html
> >
> >and how user/group priorities are used
> >
> >http://research.cs.wisc.edu/htcondor/manual/current/3_6User_Priorities.html
> >
> >Cheers,
> >Max
> >
> >> On 13.08.2017 at 01:13, Bansal, Vikas <Vikas.Bansal@xxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> I have a condor batch system available at my site.
> >>
> >> $ condor_version
> >> $CondorVersion: 8.2.10 Oct 27 2015 $
> >> $CondorPlatform: X86_64-CentOS_6.7 $
> >>
> >> I have 3200 slots on it.
> >> 100 nodes each with 32 slots.
> >>
> >> I am wondering how jobs are scheduled to the nodes and slots.
> >>
> >> E.g. if I submit 100 jobs to the queue within 1-5 minutes, how will
> >> they end up on the nodes/slots?
> >>
> >> Will each job occupy one slot on a NEW node, or will they fill up all 32
> >> slots on one node and then move on to the next node, and so on?
> >>
> >> Is this configurable? I.e. which resource is filled first, a node or a
> >> slot?
> >>
> >> Thanks for any help on this.
> >>
> >> Vikas
> >>
> >>
> >