
[Condor-users] Parallel universe : lamscript bug ?



Hi,

I finally succeeded in running MPI jobs with LAM in the parallel universe: I had to manually add the $PATH in sshd.sh.

I have a minor problem now, though: if I condor_rm a job while it is running, the "contact" and "machines" files don't get removed. It seems that the lamscript never finishes running (sshd_cleanup and so on).

Is it a feature or a bug? :)

As I said, it's just a minor problem :)
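A possible workaround, sketched here under assumptions rather than tested against Condor: if condor_rm delivers SIGTERM to the lamscript before killing it, a trap could remove the leftover files. The function names, the JOB_FILES_DIR variable, and the location of the two files are all hypothetical; only the file names "contact" and "machines" come from the behaviour described above.

```shell
# Hypothetical helper: remove the two per-job files from a given directory.
# The directory argument is an assumption; adjust it to wherever the
# "contact" and "machines" files actually live for the job.
cleanup_job_files() {
    dir=${1:-.}
    rm -f "$dir/contact" "$dir/machines"
}

# Hypothetical helper: arrange for the cleanup to run on normal exit and
# when the script is signalled.
# Assumption: condor_rm sends SIGTERM and allows the script time to react.
install_cleanup_traps() {
    trap 'cleanup_job_files "${JOB_FILES_DIR:-.}"' EXIT
    trap 'exit 143' TERM INT
}
```

Calling install_cleanup_traps near the top of the lamscript would be the intended use. Note that a shell runs a trap only after the current foreground command returns, so this helps most when the MPI program is started in the background and the script then waits on it.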

Bye
Nicolas

PS: by the way, here are some of the modifications I had to make (with reference to this message: https://www-auth.cs.wisc.edu/lists/condor-users/2006-June/msg00032.shtml):

lamscript : 

68c68,71
< lamboot -ssi boot rsh machines
---
>
> #NG
> #lamboot -ssi boot rsh machines
> lamboot machines

---> my LAM version does not support the -ssi option
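Rather than hard-coding one form, the script could try the -ssi form first and fall back to the plain one. This is only a sketch: boot_lam and the LAMBOOT override are hypothetical names, and it assumes that a lamboot which rejects -ssi exits with a non-zero status.

```shell
# Hypothetical wrapper: try lamboot with "-ssi boot rsh" first, then retry
# without the option if that invocation fails.
# LAMBOOT is an assumed override so a different lamboot binary can be used.
boot_lam() {
    hostfile=$1
    lamboot_cmd=${LAMBOOT:-lamboot}
    "$lamboot_cmd" -ssi boot rsh "$hostfile" 2>/dev/null ||
        "$lamboot_cmd" "$hostfile"
}
```

In the lamscript this would replace the lamboot line with: boot_lam machines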


77c80,81
< mpirun C $EXECUTABLE $@ &
---
> #mpirun C $EXECUTABLE $@ &
> $EXECUTABLE $@ &

--> I think this was a bug, since my submit file already passes the mpirun command as an argument, as described in the doc. I preferred to fix it here.
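A variant that would work with both kinds of submit file is to add the "mpirun C" prefix only when the executable is not already mpirun. The sketch below is an illustration, not the script shipped with Condor: run_mpi_job and the MPIRUN override are hypothetical names.

```shell
# Hypothetical helper: prefix the command with "mpirun C" only when the
# submit file did not already supply mpirun as the executable.
# MPIRUN is an assumed override, defaulting to the mpirun found in $PATH.
run_mpi_job() {
    executable=$1
    shift
    case ${executable##*/} in
        mpirun) "$executable" "$@" ;;
        *)      "${MPIRUN:-mpirun}" C "$executable" "$@" ;;
    esac
}
```

To keep the original background behaviour, the call in the lamscript would be: run_mpi_job "$EXECUTABLE" "$@" &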


sshd.sh: I just added the $PATH and removed the "-oAcceptEnv" option, which my sshd doesn't support (it is only available in 3.9 and above)

> PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:/usr/local/sbin:/usr/local/bin:.:/ibpc/io/condor/bin:/ibpc/io/condor/sbin
>
63c88,89
< $SSHD -p$PORT -oAuthorizedKeysFile=${idkey}.pub -h$hostkey -De -f/dev/null -oStrictModes=no -oPidFile=/dev/null -oAcceptEnv=_CONDOR < /dev/null > sshd.out 2>&1 &
---
> #$SSHD -p$PORT -oAuthorizedKeysFile=${idkey}.pub -h$hostkey -De -f/dev/null -oStrictModes=no -oPidFile=/dev/null -oAcceptEnv=_CONDOR < /dev/null > sshd.out 2>&1 &
> $SSHD -p$PORT -oAuthorizedKeysFile=${idkey}.pub -h$hostkey -De -f/dev/null -oStrictModes=no -oPidFile=/dev/null < /dev/null > sshd.out 2>&1 &






----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE

Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------