
Re: [HTCondor-users] Submission to remote slurm cluster is Failing consistently



Hi,

I tried running the command you suggested:

sacctmgr show user hbaig withassoc 

but it fails with a "command not found" error:
bash: sacctmgr: command not found...


Where am I supposed to run this command from?

regards
hasan
On Feb 2, 2021, at 12:37 PM, christoph.beyer@xxxxxxx wrote:

Hi,

From my very limited Slurm experience, this means that you as a user are not allowed to use this Slurm resource. Try:

sacctmgr show user <user> withassoc

I think it will show that the QOS you are attempting to use is NOT listed as part of your association.
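
A minimal sketch of that check, assuming it is run on a machine where the Slurm client tools are installed (typically the cluster's login node rather than the Bosco submit host). The function name check_qos is hypothetical; the sacctmgr invocation is the one given above. It first tests whether sacctmgr is on the PATH, since a "command not found" error only means the Slurm tools are not available on the current host.

```shell
#!/bin/sh
# Sketch only: check_qos is a hypothetical helper name, not part of Slurm or Bosco.
# Assumption: this runs on a host that has the Slurm client tools installed.

check_qos() {
    user="$1"

    # Bail out early if sacctmgr is not on the PATH of this host.
    if ! command -v sacctmgr >/dev/null 2>&1; then
        echo "sacctmgr not found here; run this on a host with Slurm installed" >&2
        return 127
    fi

    # Show the user's associations, as suggested in the reply above.
    sacctmgr show user "$user" withassoc
}
```

For example, after logging in to the cluster, `check_qos hbaig` should list the associations for that user, including the QOS values attached to each one.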

Best
christoph

--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx


From: hasanbaigg@xxxxxxxxx
To: bosco-discuss@xxxxxxxxxxxxxxxxxxx, "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Tuesday, 2 February 2021 18:17:49
Subject: [HTCondor-users] Submission to remote slurm cluster is Failing consistently

Hi,

I added a bosco cluster successfully using the slurm queue manager. However, when I try to test it, it fails every time at the step "Checking for submission to remote slurm cluster...". It gives the following error:

Testing bosco submission...Passed!
Submission and log files for this job are in /home/cloudcopasi/bosco/local.bosco/bosco-test/boscotest.26ZihK
Waiting for jobmanager to accept job...Passed
Checking for submission to remote slurm cluster (could take ~30 seconds)...Failed
Showing last 5 lines of logs:
02/02/21 12:01:56 [20703] (11.0) doEvaluateState called: gmState GM_SUBMIT, remoteState 0
02/02/21 12:01:56 [20703] (11.0) blah_job_submit() failed: submission command failed (exit code = 1) (stdout:) (stderr:sbatch: error: Batch job submission failed: Invalid qos specification-Error from sbatch: -)
02/02/21 12:02:00 [20703] No jobs left, shutting down
02/02/21 12:02:00 [20703] Got SIGTERM. Performing graceful shutdown.
02/02/21 12:02:00 [20703] **** condor_gridmanager (condor_GRIDMANAGER) pid 20703 EXITING WITH STATUS 0 


Could anyone please advise what the issue might be?

Thank you.

Regards
HB

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
_______________________________________________