
Re: [HTCondor-users] Docker Universe: Increase Shared Memory



On 11/2/21 4:06 AM, Schock, Justus wrote:
Dear HTCondor users,

I have successfully set up HTCondor and, in particular, its docker universe.
However, I cannot find a way to increase the shared memory within the docker containers. I really need this, since deep learning frameworks (e.g. PyTorch) use shared memory for interprocess communication in Python.
I have already tried to pass flags like --ipc=host or --shm-size=10g via DOCKER_EXTRA_ARGUMENTS, but apparently they are ignored. I also tried using public shared memory (setting MOUNT_PRIVATE_DEV_SHM to false) in combination with the mentioned flags, but that didn't do the trick either.

I know that we have some users successfully setting --shm-size via DOCKER_EXTRA_ARGUMENTS. If you run a test job, can you look in the StarterLog.slotXXX file in the condor log directory? The exact command line passed to docker container create will be in that log file, and we can see exactly what options are getting set.
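If it helps, here is a rough Python sketch for pulling the relevant lines out of the starter log rather than reading it by eye. It assumes only that the log line contains the words "docker" and "create" together with the flags. The exact log line format is not shown here, so the matching is deliberately loose, and the helper names (find_docker_create_lines, extract_flags) are made up for illustration.

```python
import re
import shlex
import sys


def find_docker_create_lines(log_text):
    """Return log lines that look like they contain a docker create command.

    Assumption: such a line mentions both 'docker' and the word 'create'.
    """
    return [
        line
        for line in log_text.splitlines()
        if "docker" in line and re.search(r"\bcreate\b", line)
    ]


def extract_flags(line, wanted=("--shm-size", "--ipc")):
    """Return a dict of the wanted flags found on one log line.

    Handles both '--flag=value' and '--flag value' spellings.
    """
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes in the log line: fall back to whitespace splitting.
        tokens = line.split()
    found = {}
    for i, tok in enumerate(tokens):
        for flag in wanted:
            if tok.startswith(flag + "="):
                found[flag] = tok.split("=", 1)[1]
            elif tok == flag and i + 1 < len(tokens):
                found[flag] = tokens[i + 1]
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to your StarterLog file as the first argument.
    with open(sys.argv[1], errors="replace") as fh:
        for line in find_docker_create_lines(fh.read()):
            print(extract_flags(line) or "no shm/ipc flags", "<-", line)
```

Run it with the path to the StarterLog file for the slot that ran the test job. If the create lines show up without --shm-size or --ipc, the arguments are not reaching docker at all.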


-greg