
Re: [HTCondor-users] HTCondor - Slurm integration



One other question: what corresponding version of BLAHP is needed on the remote side to take advantage of the parameters below? We are using the client tarball from Bosco 1.2.10 right now. It is my understanding that BLAHP is currently forked between Bosco and the HTCondor-CE.

Steve Timm


From: Steven C Timm <timm@xxxxxxxx>
Sent: Monday, September 30, 2019 10:48 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] HTCondor - Slurm integration
 
Thanks for the update, Carl. I know NodeNumber works; I will try the other four.
That is about half of the custom parameters we set at NERSC right now.
It would be very helpful to have a utility by which we can just send an arbitrary #SBATCH
parameter through to the job, much like the Globus extended RSL attributes used to
work back in the day. At the moment we have to have a different entry in the GlideinWMS
factory for every different combination of these.

Steve Timm



From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Carl Edquist <edquist@xxxxxxxxxxx>
Sent: Monday, September 30, 2019 10:36 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] HTCondor - Slurm integration
 
Hi Asvija,

Brian asked me to look into this - sorry for the delay getting back to
you.

The mappings I found, based on the HTCondor 8.8.4 version of slurm_submit.sh,
are:

         "BatchProject" ->
         #SBATCH -A $bls_opt_project

         "BatchRuntime" ->
         #SBATCH -t $((bls_opt_runtime / 60))

         "RequestMemory" ->
         #SBATCH --mem=${bls_opt_req_mem}

         "Queue" ->
         #SBATCH -p $bls_opt_queue

         "NodeNumber" ->
         #SBATCH -N $bls_opt_mpinodes
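
As a rough illustration of the mapping above, the sketch below prints the directives that would result from a given set of values. It is not taken from slurm_submit.sh: emit_sbatch_directives is a hypothetical helper, the sample values are made up, and the assumption is that each bls_opt_* variable simply holds the value of the corresponding attribute. The division by 60 suggests BatchRuntime is given in seconds, since a bare integer passed to sbatch -t is read as minutes.

```shell
#!/usr/bin/env bash
# Sketch only: mimics the attribute-to-directive mapping listed above.
# emit_sbatch_directives is a hypothetical helper, not part of BLAH.

emit_sbatch_directives() {
    # Assumption: each bls_opt_* variable holds the value of the matching
    # job attribute; unset or empty variables produce no directive.
    [ -n "${bls_opt_project:-}" ]  && echo "#SBATCH -A $bls_opt_project"
    # BatchRuntime is divided by 60 (integer division), as in the mapping.
    [ -n "${bls_opt_runtime:-}" ]  && echo "#SBATCH -t $((bls_opt_runtime / 60))"
    [ -n "${bls_opt_req_mem:-}" ]  && echo "#SBATCH --mem=${bls_opt_req_mem}"
    [ -n "${bls_opt_queue:-}" ]    && echo "#SBATCH -p $bls_opt_queue"
    [ -n "${bls_opt_mpinodes:-}" ] && echo "#SBATCH -N $bls_opt_mpinodes"
    return 0
}

# Made-up sample values, for illustration only.
bls_opt_project=myalloc   # BatchProject
bls_opt_runtime=7200      # BatchRuntime
bls_opt_req_mem=2048      # RequestMemory
bls_opt_queue=debug       # Queue
bls_opt_mpinodes=4        # NodeNumber
emit_sbatch_directives
```

With these sample values, the BatchRuntime of 7200 becomes "#SBATCH -t 120", and the other four values pass through unchanged.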

Carl

On Thu, 5 Sep 2019, Asvija B wrote:

> Hi Brian,
>
> Condor version is 8.8.4
>
>
> Thanks and regards,
>
> Asvija
>
> On 9/5/2019 2:33 AM, Brian Lin wrote:
>> Hi Asvija,
>>
>> Unfortunately, there isn't much in terms of documentation, but I could
>> give you a mapping if you give me the version of HTCondor you're running.
>>
>> Thanks,
>> Brian
>>
>> On 8/19/19 12:12 AM, Asvija B wrote:
>>> Thanks a lot, Brian. I am able to see the +remote_NodeNumber getting
>>> translated properly.
>>>
>>> Can you also please indicate the corresponding directives for other
>>> SLURM-related attributes as well (like --nodes, --ntasks, etc.)?
>>>
>>> It would be great if you could point me to some documentation related to
>>> this info.
>>>
>>> Additionally, the slurm_submit.sh file from BLAH's GitHub repository (
>>> https://github.com/prelz/BLAH/blob/master/src/scripts/slurm_submit.sh
>>> ) has additional GPU and MIC support. Do we
>>> have any documentation which points to the corresponding Condor
>>> directives for these?
>>>
>>> Thanks again for the information.
>>>
>>> Regards,
>>>
>>> Asvija
>>>
>>>
>>> On 8/16/2019 8:53 PM, Brian Lin wrote:
>>>> Hi Asvija,
>>>>
>>>> You'll want to specify '+remote_NodeNumber' in your original grid job
>>>> submit file. However, you should note that the Slurm directives we set
>>>> will be changing in future releases of HTCondor 8.9 to the following:
>>>>
>>>> "#SBATCH --nodes=1"
>>>> "#SBATCH --ntasks=1"
>>>> "#SBATCH --cpus-per-task=$bls_opt_mpinodes"
>>>>
>>>> - Brian
>>>>
>>>> On 8/13/19 12:32 AM, Asvija B wrote:
>>>>> Dear Condor users,
>>>>>
>>>>> We are planning to use HTCondor for submitting jobs to some of our
>>>>> SLURM-managed clusters. As I dug into the documentation, I
>>>>> understood that HTCondor uses the BLAH GAHP to support job submission
>>>>> to SLURM.
>>>>>
>>>>> We are interested in submitting MPI jobs to SLURM through HTCondor.
>>>>> In this regard, I am unable to find the configuration parameters in
>>>>> the condor submission script for indicating MPI-related information
>>>>> (e.g., number of nodes).
>>>>>
>>>>> I have seen the script file
>>>>> $CONDOR_HOME/libexec/glite/bin/slurm_submit.sh. It does include
>>>>> statements with $bls_opt_mpinodes which translate to "#SBATCH -N"
>>>>> directives. However, I am not clear about the equivalent condor
>>>>> directives that will result in the proper SLURM directives. Hence it
>>>>> would be great if any of the SLURM users can comment on this.
>>>>>
>>>>>
>>>>> Thanks and regards,
>>>>>
>>>>> Asvija B
>>>>>
>>>>>
>>>>>
> ------------------------------------------------------------------------------------------------------------
>>>>>
>>>>>
>>>>> [ C-DAC is on Social-Media too. Kindly follow us at:
>>>>> Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ]
>>>>>
>>>>> This e-mail is for the sole use of the intended recipient(s) and may
>>>>> contain confidential and privileged information. If you are not the
>>>>> intended recipient, please contact the sender by reply e-mail and destroy
>>>>> all copies and the original message. Any unauthorized review, use,
>>>>> disclosure, dissemination, forwarding, printing or copying of this email
>>>>> is strictly prohibited and appropriate legal action will be taken.
>>>>>
> ------------------------------------------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> HTCondor-users mailing list
>>>>> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
>>>>> with a subject: Unsubscribe
>>>>> You can also unsubscribe by visiting
>>>>> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>>>>>
>>>>> The archives can be found at:
>>>>> https://lists.cs.wisc.edu/archive/htcondor-users/
>>>>
>>>
>>>
>>
>>
>