Hi,

I'm testing a brand new Condor install (my first install ever).

/home is NFS automounted
( rw,nosuid,nfsvers=3,rsize=32768,wsize=32768,soft,addr=xxx.xxx.xxx.xxx )

The main machine runs COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD.

The job file is:

    Universe = vanilla
    Executable = CONDOR_pp_VIS06_75.ksh
    Arguments = 8500
    Requirements = Memory >= 32
    Rank = Memory >= 64
    Image_Size = 28 Meg
    Log = CONDOR_pp_VIS06_75.ksh.log
    Output = CONDOR_pp_VIS06_75.ksh.$(Process).out
    Error = CONDOR_pp_VIS06_75.ksh.$(Process).error
    initialdir = /home/dom
    Queue
    Arguments = 8501
    Queue
    Arguments = 8502
    Queue
    Arguments = 8503
    Queue
    Arguments = 8504
    Queue

Requirements, Rank and Image_Size are not realistic; they were copied and pasted from an example file.

I made 10 tries, and each time 4 jobs out of 5 run fine. The failing job is sometimes the 1st, sometimes the 2nd... I found nothing particular about it in the logs or on Google.

The failing one gives me either

    Fortran runtime error: No such file or directory

or

    Fortran runtime error: End of file

Fortran and its associated libs are of course installed, otherwise the other jobs wouldn't run.

The box is a 12-core Opteron with 32 GB of RAM (more than enough for the 5 jobs), running CentOS 5.5 x86_64 and using the Condor yum repo provided by the official website.

Could someone give me a hand?

Regards,
-- 
Laurent Wandrebeck
HYGEOS, Earth Observation Department / Observation de la Terre
Euratechnologies
165 Avenue de Bretagne
59000 Lille, France
tel: +33 3 20 08 24 98
http://www.hygeos.com
GPG fingerprint/Empreinte GPG: F5CA 37A4 6D03 A90C 7A1D 2A62 54E6 EF2C D17C F64C