
[HTCondor-users] Process not running - out of memory



Hello all


I am trying out a third-party application on our cluster. I have prepared a bash script that runs the program, and a submit file that tells Condor to start that script with some arguments.
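
Roughly, the two pieces look like the sketch below (the file names, paths, flags, and resource numbers are placeholders rather than my real setup):

    #!/bin/bash
    # run_app.sh - wrapper script that Condor starts
    set -e
    INPUT="$1"
    # third_party_app and its --input flag stand in for the real program
    ./third_party_app --input "$INPUT"

    # app.sub - submit description file
    executable              = run_app.sh
    arguments               = input.dat
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    transfer_input_files    = input.dat, third_party_app
    request_cpus            = 2
    request_memory          = 4GB
    log                     = app.log
    output                  = app.out
    error                   = app.err
    queue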


However, it does not behave as I would like it to.


The Condor job starts and transfers the input files as I expect it to. The process starts, runs for about 8-10 minutes, and is then killed by the kernel (oom-kill). The strange thing is that when I observe the running job on the startd machine (via ssh), it hardly seems to use any resources at all (top reports %MEM below 0.5).
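
For what it is worth, this is how I have been comparing requested and measured memory from the submit node (the job id is a placeholder; RequestMemory and MemoryUsage should both be in MB):

    condor_q 1234.0 -af RequestMemory MemoryUsage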


I have tried starting the job manually (via ssh) on the startd machine and it runs just fine. When I start it this way it also consumes a lot more resources: top reports a %CPU of about 200 and a %MEM of about 4. The program clearly demands some resources, so I find it strange that the kernel kills it when run through Condor even though I can hardly see it using any resources at all.
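
The oom-kill itself I found in the kernel log on the startd machine, roughly like this (the exact message text varies between kernel versions):

    dmesg | grep -i -B 1 -A 5 'out of memory'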


Thoughts?


P


Peter Ellevseth 

Principal Advisor

+47 93 43 56 01 / +47 73 90 05 00

 peter.ellevseth@xxxxxxxxxx

 safetec.no