[HTCondor-users] How do I have my interactive job and my submission job in condor match 100%?
- Date: Thu, 25 Mar 2021 10:55:45 -0500
- From: brando.science@xxxxxxxxx
- Subject: [HTCondor-users] How do I have my interactive job and my submission job in condor match 100%?
I am a user of an HTCondor HPC cluster. I noticed that my PyTorch jobs that use CUDA work just fine in interactive mode (seemingly with any version of PyTorch or CUDA, even when nvidia-smi reports one CUDA version and PyTorch reports another), but when I submit them with condor_submit without the interactive option, they don't run. The job gets into a deadlock because I am trying to do parallel training (note that this does not happen in interactive mode, even with 4 GPUs).
My question seems simple. How do I force my condor_submit job to have an environment identical to the one I get when I run it from an interactive session?
I've tried the famous getenv flag, and that didn't work for some reason. I assume it is because it copies my environment variables from the login node instead of from the interactive session (and I cannot submit a job from within an interactive session, so I can't do it that way). Is there a way to have the submitted job run with exactly the same settings as an interactive job? I am not a sysadmin; I am only a user, if that helps.
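To narrow down what actually differs between the two cases, one option is to dump the environment once from the interactive session and once from the submitted job, then compare the two dumps. Below is a minimal sketch of that idea using only the Python standard library. The script name, the output file names, and the helper names snapshot_env and diff_env are hypothetical and chosen just for illustration. It only compares environment variables, so it would not catch differences in the machine, the GPUs assigned, or resource limits.

```python
import json
import os
import sys


def snapshot_env(path):
    """Write the current process environment to `path` as sorted JSON."""
    with open(path, "w") as fh:
        json.dump(dict(os.environ), fh, indent=2, sort_keys=True)


def diff_env(interactive, batch):
    """Compare two environment dicts and report what differs.

    Returns a dict with the variable names found only in the interactive
    environment, only in the batch environment, and those present in both
    but with different values.
    """
    shared = set(interactive) & set(batch)
    return {
        "only_interactive": sorted(set(interactive) - set(batch)),
        "only_batch": sorted(set(batch) - set(interactive)),
        "changed": sorted(k for k in shared if interactive[k] != batch[k]),
    }


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # One argument: save a snapshot of this process's environment.
        snapshot_env(sys.argv[1])
    elif len(sys.argv) == 3:
        # Two arguments: load two snapshots and print the differences.
        with open(sys.argv[1]) as a, open(sys.argv[2]) as b:
            print(json.dumps(diff_env(json.load(a), json.load(b)), indent=2))
```

The intended use would be to run the script with one argument (for example interactive.json) inside the interactive session, run it again with a different file name from the submitted job, and then run it with both file names to see which variables differ. Variables such as PATH, LD_LIBRARY_PATH, and CUDA_VISIBLE_DEVICES are the ones I would look at first.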
I've also read these two pages:
Thanks for your time, HTCondor users list.