I read in the archives that v6.6.x has a 2 GB memory limit and that this will be fixed in 6.7.x. Can anyone confirm or deny? Is there a workaround? This would explain the symptoms I describe below.
BTW - I loaded 6.7.10 on a submit-only box and an execution box, but it didn't help. Does the Master box need 6.7.10 as well?
condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Cox, James A (TRAC)
The subject tells it all: I can run the job from the command line and it will go to completion (about 100 hrs), but when I submit it under Condor, it starts and runs for 40 mins (while it is mostly reading in data). Condor then gets a "SIGQUIT" and thinks it's done.
I suspect it is running out of memory under Condor, but it works from the command line because all the memory is available. I've tried reconfiguring the VMs, the amount of RAM available, etc. We even pumped up one box to 10 GB RAM, so that each VM had 5 GB! No luck.
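One way to test the out-of-memory suspicion might be to compare the per-process resource limits the job sees in each environment. The following is only a minimal sketch, assuming Python is available on the execute box; the names `LIMITS`, `format_limit`, and `report_limits` are made up for illustration, and whether Condor actually changes any of these limits is an assumption to be checked, not something I know.

```python
import resource

# Limits most likely to matter for a memory-hungry job.
LIMITS = {
    "address space (RLIMIT_AS)": resource.RLIMIT_AS,
    "data segment (RLIMIT_DATA)": resource.RLIMIT_DATA,
    "stack (RLIMIT_STACK)": resource.RLIMIT_STACK,
}


def format_limit(value):
    """Render a limit given in bytes as GB, or 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.2f GB" % (value / 1024.0 ** 3)


def report_limits():
    """Return one line per limit with its soft and hard value."""
    lines = []
    for label, which in LIMITS.items():
        soft, hard = resource.getrlimit(which)
        lines.append(
            "%s: soft=%s hard=%s"
            % (label, format_limit(soft), format_limit(hard))
        )
    return lines


if __name__ == "__main__":
    for line in report_limits():
        print(line)
```

Running this once from the shell and once as a Condor job, then comparing the two outputs, should show whether the job starts with a smaller limit under Condor. If the outputs match, the limits are probably not the cause and the problem lies elsewhere.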
The boxes are dual-processor, 64-bit AMDs with 4 GB of RAM, running RHEL 4 and Condor 6.6.10. The job, however, was compiled on a 32-bit box, since the compiler is currently only available in 32-bit.
Is there some reason Condor won't let the executable use the entire advertised space? Submitting 60 jobs from the command line every few days isn't fun, and it keeps us from efficiently using the farm.
Any ideas where else to troubleshoot?
James A. Cox