
Re: [HTCondor-users] HTCondor - Slurm integration



Hi Steve,

> One other question--what corresponding version of BLAHP is needed on the 
> remote side to take advantage of the parameters below?

From the list below, BatchProject & BatchRuntime were the latest to be
added.  They appeared in the UW condor version of the blahp (that is, the 
blahp scripts bundled with the non-osg builds of condor) in condor 8.7.6.

As far as I can tell they were added to the bosco tarball in version 
1.2.12.

Some items, including these two attributes, were late to get propagated back
to the osg fork of the blahp, and they did not appear in the most recent 
osg blahp release.  They'll be there in the next osg blahp release though 
(v1.18.44).
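
To make the comparison concrete, here is a minimal Python sketch (not part of
condor or the blahp) that checks a version against the thresholds above.  The
dictionary keys and function names are made up for illustration, the bosco
number is only my best reading, and the osg number is the expected next
release rather than a published one.

```python
# Thresholds quoted in this message. The "bosco" and "osg-blahp" entries are
# respectively a best guess and an expected future release, not verified facts.
MIN_VERSIONS = {
    "condor": (8, 7, 6),
    "bosco": (1, 2, 12),
    "osg-blahp": (1, 18, 44),
}


def parse_version(text):
    """Turn '1.2.10' or 'v1.18.44' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def has_batch_project_and_runtime(flavor, version):
    """True if this flavor/version is at or past the quoted threshold."""
    return parse_version(version) >= MIN_VERSIONS[flavor]


if __name__ == "__main__":
    # The client tarball mentioned in the question.
    print(has_batch_project_and_runtime("bosco", "1.2.10"))  # False
```

So the Bosco 1.2.10 client tarball would predate these two attributes, if the
1.2.12 figure is right.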

Hopefully that answers your question, but if not, let me know!

Carl

On Mon, 30 Sep 2019, Steven C Timm wrote:

> One other question--what corresponding version of BLAHP is needed on the 
> remote side to take advantage of the parameters below?  We are using the 
> client tarball from Bosco 1.2.10 right now.  It is my understanding that 
> BLAHP is currently forked between bosco and the htcondor-ce.
> 
> Steve Timm
> 
> ____________________________________________________________________________
> From: Steven C Timm <timm@xxxxxxxx>
> Sent: Monday, September 30, 2019 10:48 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] HTCondor - Slurm integration  
> Thanks for the update, Carl. I know NodeNumber works; I will try the other
> four. That is about half of the custom parameters we set at NERSC right now.
> It would be very helpful to have a utility by which we can just send an
> arbitrary #SBATCH parameter through to the job, much like the globus
> extended RSL attributes used to work back in the day.  At the moment we
> have to have a different entry in the GlideinWMS factory for every
> different combination of these.
> 
> Steve Timm
> 
> 
> ____________________________________________________________________________
> From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Carl
> Edquist <edquist@xxxxxxxxxxx>
> Sent: Monday, September 30, 2019 10:36 AM
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] HTCondor - Slurm integration  
> Hi Asvija,
> 
> Brian asked me to look into this - sorry for the delay getting back to
> you.
> 
> The mappings I find based on the condor 8.8.4 version of slurm_submit.sh
> are:
> 
>          "BatchProject" ->
>          #SBATCH -A $bls_opt_project
> 
>          "BatchRuntime" ->
>          #SBATCH -t $((bls_opt_runtime / 60))
> 
>          "RequestMemory" ->
>          #SBATCH --mem=${bls_opt_req_mem}
> 
>          "Queue" ->
>          #SBATCH -p $bls_opt_queue
> 
>          "NodeNumber" ->
>          #SBATCH -N $bls_opt_mpinodes
> 
> Carl
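
For anyone who wants to sanity-check what a given set of attributes should
produce, the mappings above can be sketched in Python.  This is only an
illustration of the table, not the actual slurm_submit.sh logic: the function
name and the dictionary input are hypothetical, and it assumes BatchRuntime is
given in seconds, which the division by 60 suggests, since Slurm reads a bare
number passed to -t as minutes.

```python
def sbatch_lines(attrs):
    """Return the #SBATCH lines implied by the mappings listed above.

    `attrs` is a hypothetical dict keyed by the attribute names from the
    list; attributes that are absent produce no directive.
    """
    lines = []
    if "BatchProject" in attrs:
        lines.append(f"#SBATCH -A {attrs['BatchProject']}")
    if "BatchRuntime" in attrs:
        # The script uses shell integer arithmetic: $((bls_opt_runtime / 60)).
        # Floor division matches it for non-negative values.
        lines.append(f"#SBATCH -t {int(attrs['BatchRuntime']) // 60}")
    if "RequestMemory" in attrs:
        lines.append(f"#SBATCH --mem={attrs['RequestMemory']}")
    if "Queue" in attrs:
        lines.append(f"#SBATCH -p {attrs['Queue']}")
    if "NodeNumber" in attrs:
        lines.append(f"#SBATCH -N {attrs['NodeNumber']}")
    return lines


if __name__ == "__main__":
    # Example values are invented purely to exercise the mapping.
    example = {"BatchRuntime": 3600, "Queue": "debug", "NodeNumber": 2}
    print("\n".join(sbatch_lines(example)))
```

With the invented values above, a BatchRuntime of 3600 comes out as
"#SBATCH -t 60", followed by the -p and -N lines.
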
> 
> On Thu, 5 Sep 2019, Asvija B wrote:
> 
> > Hi Brian,
> >
> > Condor version is 8.8.4
> >
> >
> > Thanks and regards,
> >
> > Asvija
> >
> > On 9/5/2019 2:33 AM, Brian Lin wrote:
> >> Hi Asvija,
> >>
> >> Unfortunately, there isn't much in the way of documentation, but I could
> >> give you a mapping if you give me the version of HTCondor you're running.
> >>
> >> Thanks,
> >> Brian
> >>
> >> On 8/19/19 12:12 AM, Asvija B wrote:
> >>> Thanks a lot Brian... I am able to see the +remote_NodeNumber getting
> >>> translated properly.
> >>>
> >>> Can you also please indicate the corresponding directives for other
> >>> SLURM-related attributes as well (like --nodes, --ntasks, etc.)?
> >>>
> >>> It would be great if you can point me to some documentation related to
> >>> this info.
> >>>
> >>> Additionally, the slurm_submit.sh file from BLAH's GitHub repository
> >>> (https://github.com/prelz/BLAH/blob/master/src/scripts/slurm_submit.sh)
> >>> has additional capabilities of GPU support and MIC support.  Do we
> >>> have any documentation which points to the corresponding Condor
> >>> directives for these?
> >>>
> >>> Thanks again for the information.
> >>>
> >>> Regards,
> >>>
> >>> Asvija
> >>>
> >>>
> >>> On 8/16/2019 8:53 PM, Brian Lin wrote:
> >>>> Hi Asvija,
> >>>>
> >>>> You'll want to specify '+remote_NodeNumber' in your original grid job
> >>>> submit file. However, you should note that the Slurm directives we set
> >>>> will be changing in future releases of HTCondor 8.9 to the following:
> >>>>
> >>>> "#SBATCH --nodes=1"
> >>>> "#SBATCH --ntasks=1"
> >>>> "#SBATCH --cpus-per-task=$bls_opt_mpinodes"
> >>>>
> >>>> - Brian
> >>>>
> >>>> On 8/13/19 12:32 AM, Asvija B wrote:
> >>>>> Dear Condor users,
> >>>>>
> >>>>> We are planning to use HTCondor for submitting jobs to some of our
> >>>>> SLURM-managed clusters.  As I dug into the documentation, I
> >>>>> understood that HTCondor uses the BLAH GAHP for supporting job
> >>>>> submission to SLURM.
> >>>>>
> >>>>> We are interested in submitting MPI jobs to SLURM through HTCondor.
> >>>>> In this regard, I am unable to find the configuration parameters in
> >>>>> the condor submission script for indicating MPI-related information
> >>>>> (e.g. number of nodes).
> >>>>>
> >>>>> I have seen the script file
> >>>>> $CONDOR_HOME/libexec/glite/bin/slurm_submit.sh.  It does include
> >>>>> statements with $bls_opt_mpinodes which translate to "#SBATCH -N"
> >>>>> directives.  However, I am not clear about the equivalent condor
> >>>>> directives that will result in the proper SLURM directives. Hence it
> >>>>> would be great if any of the SLURM users can comment on this.
> >>>>>
> >>>>>
> >>>>> Thanks and regards,
> >>>>>
> >>>>> Asvija B
> >>>>>
> >>>>>
> >>>>>
> >>>>> ------------------------------------------------------------------------
> >>>>>
> >>>>> [ C-DAC is on Social-Media too. Kindly follow us at:
> >>>>> Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]
> >>>>>
> >>>>> This e-mail is for the sole use of the intended recipient(s) and may
> >>>>> contain confidential and privileged information. If you are not the
> >>>>> intended recipient, please contact the sender by reply e-mail and
> >>>>> destroy all copies and the original message. Any unauthorized review,
> >>>>> use, disclosure, dissemination, forwarding, printing or copying of
> >>>>> this email is strictly prohibited and appropriate legal action will
> >>>>> be taken.
> >>>>>
> >>>>> ------------------------------------------------------------------------
> >>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> HTCondor-users mailing list
> >>>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
> >>>>> with a subject: Unsubscribe
> >>>>> You can also unsubscribe by visiting
> >>>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
> >>>>>
> >>>>> The archives can be found at:
> >>>>> https://lists.cs.wisc.edu/archive/htcondor-users/
> >>>>
> >>>
> >>
> >>
> >