# module load cuda-toolkit/10.2
# module load cuda-toolkit/11.1
getenv can be dangerous, as the environment copied from your submit
node might not work on the executing node.
Are you preparing the environment in your batch job the same way as you
set it up when you run interactively? (Do you source all the same
environment scripts, etc.?)
Maybe you can try printing your batch job's environment into your log
file by running `env`, and then compare it with the interactive environment.
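If comparing the two dumps by eye gets tedious, something like the following might help. It is only a minimal sketch: the file names `interactive_env.txt` and `batch_env.txt` are made up for illustration, and it assumes each file holds plain `env` output with one `NAME=value` pair per line (multi-line values, such as exported shell functions, are not handled).

```python
import sys


def parse_env(text):
    """Parse `env` output into a dict.

    Assumes one NAME=value pair per line; lines without '=' are skipped,
    so multi-line values are not reconstructed correctly.
    """
    env = {}
    for line in text.splitlines():
        # Split on the first '=' only, since values may contain '=' too.
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def diff_env(interactive, batch):
    """Return (only_interactive, only_batch, changed) for two env dicts."""
    only_interactive = {k: interactive[k] for k in interactive.keys() - batch.keys()}
    only_batch = {k: batch[k] for k in batch.keys() - interactive.keys()}
    changed = {
        k: (interactive[k], batch[k])
        for k in interactive.keys() & batch.keys()
        if interactive[k] != batch[k]
    }
    return only_interactive, only_batch, changed


def main(interactive_path, batch_path):
    with open(interactive_path) as f:
        interactive = parse_env(f.read())
    with open(batch_path) as f:
        batch = parse_env(f.read())

    only_interactive, only_batch, changed = diff_env(interactive, batch)
    for name in sorted(only_interactive):
        print(f"only interactive: {name}={only_interactive[name]}")
    for name in sorted(only_batch):
        print(f"only batch:       {name}={only_batch[name]}")
    for name in sorted(changed):
        old, new = changed[name]
        print(f"differs:          {name}")
        print(f"    interactive: {old}")
        print(f"    batch:       {new}")


if __name__ == "__main__":
    # Hypothetical usage: python compare_env.py interactive_env.txt batch_env.txt
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

You would create the first file with `env > interactive_env.txt` in the interactive session and the second by running `env` at the start of the batch job. Variables such as PATH, LD_LIBRARY_PATH and anything CUDA-related are probably the first ones worth checking.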
On 25/03/2021 16.55, brando.science@xxxxxxxxx wrote:
> I am a user of a HTCondor hpc. I noticed that my pytorch jobs that use
> cuda work just fine in the interactive mode (it seems with any version
> of pytorch or cuda even if nvidia-smi says one version of cuda but my
> pytorch says another) but when I try to run them in the condor_submit
> without interactive it doesn't run. It gets into a deadlock because I
> am trying to do parallel training (but note this does not happen in
> interactive mode even with 4 gpus).
> My question seems simple. How do I force my condor_submit job to be
> identical to the environment when I run it from an interactive session?
> I've tried the famous getenv flag and that didn't work for some reason.
> I assume it is because it copies my envs from the login node instead
> of from the interactive session (but I cannot run a submission job from an
> interactive session so I can't do it that way). Is there a way to have
> the submission run the job with exactly the same settings as an interactive
> job? I am not a sys admin; I am only a user, if that helps.
> I've also read these two pages:
> - https://htcondor.readthedocs.io/en/latest/man-pages/condor_submit.html
> and posted this question on SO:
> Thanks for your time, HTCondor users list.
> Sincerely, Brando
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> The archives can be found at: