
[HTCondor-users] Using BLAH to submit/monitor/handle jobs to different slurm clusters (Cross-Cluster Operations)



Hello,

These days there are execution sites (e.g. NERSC) that run multiple job management clusters to serve specific user needs, such as data transfers, computation, etc.
For example, NERSC's Cori has two Slurm clusters, "cori" and "escori": the first is used for compute and the second for other tasks, such as transferring data.

I'm currently trying to set up a BOSCO submit node that uses ssh to submit/monitor/modify jobs at NERSC and takes advantage of both Slurm clusters.
However, I've stumbled upon an issue with monitoring and modifying the submitted jobs.
Even though I was able to specify the cluster I want to submit the job to with Slurm's #SBATCH -M argument, I couldn't find a way to pass this to the rest of the operations (e.g. status, cancel, etc.).
As a result, I cannot interact correctly with jobs submitted to the "escori" cluster (the non-default one).
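
To illustrate what I mean, here is a rough sketch of the command lines I would expect the status and cancel operations to end up running if the cluster name were passed through. This is not BLAH code: the function names and the job ID are hypothetical. The only assumption about Slurm is that squeue and scancel accept the same -M (--clusters) option as sbatch.

```python
from typing import List, Optional


def slurm_status_cmd(job_id: str, cluster: Optional[str] = None) -> List[str]:
    """Build a squeue command line for one job (hypothetical helper).

    When `cluster` is given, add Slurm's -M option so the query goes to
    that cluster instead of the default one.
    """
    cmd = ["squeue", "--noheader", "-j", job_id]
    if cluster:
        cmd += ["-M", cluster]
    return cmd


def slurm_cancel_cmd(job_id: str, cluster: Optional[str] = None) -> List[str]:
    """Build a scancel command line for one job (hypothetical helper)."""
    cmd = ["scancel"]
    if cluster:
        cmd += ["-M", cluster]
    cmd.append(job_id)
    return cmd


if __name__ == "__main__":
    # "12345" is a made-up job ID used only for illustration.
    print(" ".join(slurm_status_cmd("12345", "escori")))
    print(" ".join(slurm_cancel_cmd("12345", "escori")))
    print(" ".join(slurm_status_cmd("12345")))
```

In other words, the cluster chosen at submit time would need to be remembered alongside the job ID and added to every later command. Without it, the commands go to the default cluster, which does not know about the "escori" jobs.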

Is there a way to handle this?

Thanks,
George