
Re: [HTCondor-users] HTCondor-CE on slurm



Hello Jaime, thank you for the useful hints.

On 17/06/19 19:10, Jaime Frey wrote:
On Jun 17, 2019, at 10:17 AM, Stefano Dal Pra <stefano.dalpra@xxxxxxxxxxxx> wrote:

[SNIP]
I would need some enlightenment on how to troubleshoot this:

- How can I see what slurm submission command is generated?
(I added a cp $bls_tmp_file /tmp/copia_${bls_tmp_file} to see the slurm submit file, but no file is created,
so I doubt this script is actually executed at all).
Adding this in the right place in slurm_submit.sh should work.
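
A minimal sketch of such a debug copy follows. It assumes that $bls_tmp_file may hold a path with directory components, in which case /tmp/copia_${bls_tmp_file} would point into a directory that does not exist; taking the basename avoids that. The helper name debug_copy_submit_file and its second argument are hypothetical and not part of the blahp scripts.

```shell
# Hypothetical helper (not part of the blahp): copy the temporary submit
# file to a directory for later inspection. Prints the destination path.
debug_copy_submit_file() {
    src="$1"
    dest_dir="${2:-/tmp}"
    # Drop any leading directories so the copy lands directly in dest_dir.
    dest="${dest_dir}/copia_$(basename "$src")"
    if [ -f "$src" ]; then
        cp "$src" "$dest" && echo "$dest"
    else
        echo "submit file not found: $src" >&2
        return 1
    fi
}

# Inside the submit script this might be called as:
#   debug_copy_submit_file "$bls_tmp_file"
```

If the copy still does not appear, that points to the script not being reached at all, which is what the rest of the thread goes on to check.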
Indeed. After inspecting the bls_add_job_wrapper function in blah_common_submit_functions.sh, we found a few default settings that had to be defined. We did this by editing

/usr/libexec/condor/glite/bin/slurm_local_submit_attributes.sh

And adding a few entries, such as:

echo "#SBATCH --account=testgrp"
echo "#SBATCH -D ${HOME}"
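
Put together, the additions might look like the sketch below. It assumes that whatever this script prints to stdout ends up as directives in the generated Slurm submit file, which is what the entries above rely on. The function name emit_sbatch_directives and its arguments are illustrative only; the account name testgrp is the one from our setup.

```shell
#!/bin/sh
# Sketch of a slurm_local_submit_attributes.sh-style script.
# Assumption: each line written to stdout becomes a line of the
# generated Slurm submit file.
emit_sbatch_directives() {
    account="${1:-testgrp}"        # default taken from the example above
    workdir="${2:-${HOME:-.}}"     # job working directory; falls back to "." if HOME is unset
    echo "#SBATCH --account=${account}"
    echo "#SBATCH -D ${workdir}"
}

emit_sbatch_directives
```

Wrapping the echo lines in a function is only a convenience for trying the output by hand before installing the script.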

This suggests the machinery isn't getting this far.
The Job Router log shows that job 19.0 was created to do the submission into slurm. Does that job appear in condor_ce_q or condor_ce_history? If so, what's its status?

Is there a /var/log/condor-ce/GridmanagerLog.<user> file? That is the log file for the daemon that invokes the blahp.

Yep, the logfile is there, and after checking it and a few more tries we have our first successful jobs!
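
For anyone repeating the status check suggested above, here is a small sketch that turns the numeric JobStatus attribute into a readable name. The helper job_status_name is hypothetical; the codes are the standard HTCondor JobStatus values. The usage line in the comment assumes condor_ce_q is on the PATH and accepts the same arguments as condor_q.

```shell
# Hypothetical helper: map an HTCondor JobStatus code to its name.
job_status_name() {
    case "$1" in
        1) echo "Idle" ;;
        2) echo "Running" ;;
        3) echo "Removed" ;;
        4) echo "Completed" ;;
        5) echo "Held" ;;
        6) echo "Transferring Output" ;;
        7) echo "Suspended" ;;
        *) echo "Unknown" ;;
    esac
}

# Possible usage for the routed job mentioned in the thread:
#   job_status_name "$(condor_ce_q 19.0 -af JobStatus)"
```

A Held job is usually the interesting case, since its HoldReason attribute tends to say why the submission to the batch system failed.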

There is probably no real need to edit /usr/libexec/condor/glite/bin/slurm_local_submit_attributes.sh, and everything could be done through proper configuration of the Job Routing Table, but I really need some more experience with that.

Cheers
Stefano



- How do I specify the partition name in the submit file? (And a few of the most common slurm options, I would say;
do you have a simple example submit file for slurm?)

To specify the slurm partition, you can add this to your Condor submit file:

batch_queue = mypartition
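
As a rough sketch, a complete submit file for a direct submission to slurm could be generated as below. This assumes the job goes through the grid universe with grid_resource = batch slurm; the executable, the output file names, and the helper write_slurm_submit_file are placeholders for illustration only.

```shell
# Hypothetical helper: write a minimal Condor submit file that targets
# a given slurm partition through the blahp.
write_slurm_submit_file() {
    out="$1"
    partition="$2"
    cat > "$out" <<EOF
universe      = grid
grid_resource = batch slurm
executable    = /bin/hostname
output        = job.out
error         = job.err
log           = job.log
batch_queue   = ${partition}
queue
EOF
}

# Possible usage:
#   write_slurm_submit_file slurm_test.sub mypartition
#   condor_submit slurm_test.sub
```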

Some common slurm options are supported out-of-the-box, and support for additional options can be added by customizing your blahp configuration and using the CERequirements job attribute.

  - Jaime
