
Re: [HTCondor-users] negotiator "poor" performance issue



Yep, indeed, it was just an illustration I put together while I was
writing the email, and #!/usr/bin/python is what my fingers type these
days, so that's why :) But of course the actual job script is correct in
this respect. Thanks for pointing it out anyway.
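
For anyone reusing the test job, here is a minimal sketch of how one
might check which bash path a worker actually has before hard-coding a
shebang. The helper name find_bash and the two candidate paths are
assumptions for illustration, not something HTCondor provides:

```shell
#!/bin/sh
# Hypothetical helper: print the first bash path that exists and is
# executable on this host. The candidate list is an assumption.
find_bash() {
    for candidate in /bin/bash /usr/bin/bash; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Report the shebang line that would work on this particular host.
if bash_path=$(find_bash); then
    printf 'use shebang: #!%s\n' "$bash_path"
else
    printf 'no bash found at the usual locations\n'
fi
```

Running this on each worker type (or as a trivial job) should show
whether /usr/bin/bash is safe to use across the whole pool.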

Cheers,
Daniel

2014-03-14 17:15 GMT+01:00 Carl Edquist <edquist@xxxxxxxxxxx>:
> Hi Pek,
>
> This might not be a problem for you at all, but I can't help noticing the
> script starts with "#!/usr/bin/bash" ... The standard location is just
> "/bin/bash" and if any of the workers don't have bash installed under
> /usr/bin/ (EL5/EL6 don't, for instance), I believe the jobs will fail on
> those workers.  Might not matter for you but just an idea / something to
> keep in mind.
>
> Carl
>
>
> On Fri, 14 Mar 2014, Pek Daniel wrote:
>
>> Hi,
>>
>> I have 700 machines running startd, and 400 000 identical jobs
>> submitted to 10 schedds. I have 100 subcollectors on 2 machines, the
>> main collector on one of these machines and the negotiator on the
>> other.
>>
>> All of the jobs are simple scripts:
>>
>> #!/usr/bin/bash
>> exit 0
>>
>> I assigned randomized priorities to the jobs I submitted, because
>> otherwise the negotiator would go through the schedds sequentially
>> (first, it runs all the jobs from schedd1, then from schedd2, etc).
>> I've also set:
>> USE_GLOBAL_JOB_PRIOS = true
>>
>> I don't use job arrays or clusters and I can't consider using them;
>> this is a constraint.
>>
>> So, what I do is:
>> # Turn off dispatching
>> condor_config_val -neg -rset "NEGOTIATOR_SLOT_CONSTRAINT = False";
>> condor_reconfig -neg
>>
>> # Submit jobs, RANDOMPRIO ranges from 1 - 100
>> for i in `seq 1 N`
>> do
>>    /usr/bin/condor_submit -verbose -append "priority = $RANDOMPRIO" submitfile
>> done
>>
>> # Turn back on dispatching with 400 000 queued jobs on 10 schedds
>> condor_config_val -neg -runset NEGOTIATOR_SLOT_CONSTRAINT; condor_reconfig -neg
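
The submit loop quoted above leaves RANDOMPRIO undefined, so here is a
rough sketch of one way it could be filled in. This is a dry run that
only prints the condor_submit commands; the function name random_prio,
the NUM_JOBS variable, and drawing from /dev/urandom are assumptions for
illustration, not necessarily what was used in the original test:

```shell
#!/bin/sh
# Draw an integer in 1..100 from /dev/urandom (works in plain POSIX sh,
# so it does not depend on bash's $RANDOM). The modulo introduces a
# slight bias, which should not matter for spreading priorities.
random_prio() {
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    echo $(( n % 100 + 1 ))
}

# Dry run: print the submit commands instead of executing them.
# NUM_JOBS is a placeholder for the real job count.
NUM_JOBS=3
i=1
while [ "$i" -le "$NUM_JOBS" ]; do
    prio=$(random_prio)
    printf "condor_submit -verbose -append 'priority = %s' submitfile\n" "$prio"
    i=$(( i + 1 ))
done
```

Piping the output to sh (or dropping the printf wrapper) would perform
the actual submissions, one job per condor_submit call as in the
original setup.
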
>>
>> This way, I could achieve a negotiation (dispatching) rate of
>> ~10 jobs / sec (not using priorities doesn't change this).
>>
>> [root@condormaster1 condor]# condor_version
>> $CondorVersion: 8.1.2 Oct 19 2013 BuildID: 189797 $
>> $CondorPlatform: x86_64_RedHat6 $
>>
>> My questions:
>> - has anybody measured a higher dispatch rate before?
>> - is this 10 jobs / sec considered a "normal" or "good enough" value
>> in case of HTCondor?
>> - can I do anything without touching the source to increase the
>> negotiation performance?
>>
>> Thanks,
>> Daniel
>> _______________________________________________
>> HTCondor-users mailing list
>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
>> subject: Unsubscribe
>> You can also unsubscribe by visiting
>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>
>> The archives can be found at:
>> https://lists.cs.wisc.edu/archive/htcondor-users/
>>