
Re: [Condor-users] mpich2 error " '.../condor_exec.exe' witharguments hellow.exe: No such file or directory"



For the record, we use a dedicated user for each VM (pre-6.9) or slot (post-6.9), so for a four-core machine with the default setting of four slots we'd have the following in that execute machine's condor_config.local (using 6.8 notation):

VM1_USER                   = condor_user1
VM2_USER                   = condor_user2
VM3_USER                   = condor_user3
VM4_USER                   = condor_user4
EXECUTE_LOGIN_IS_DEDICATED = TRUE

Each account has a home directory, like an ordinary user account.
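
Giving each slot account a real home directory matters for wrapper scripts that write under ~, as the script in this thread appears to do. A script can also guard against a missing home directory itself. The following is only a minimal sketch, not part of mp2script: the helper name pick_scratch_dir is hypothetical, and falling back to the current working directory (the job's execute directory) is an assumption about what would be acceptable.

```shell
#!/bin/sh
# Hypothetical helper (not from mp2script): print a directory that the
# current account can write to. Prefer $HOME; if it is unset, missing, or
# not writable (as can happen when a job runs as "nobody"), fall back to
# the current working directory.
pick_scratch_dir() {
    if [ -n "${HOME:-}" ] && [ -d "$HOME" ] && [ -w "$HOME" ]; then
        printf '%s\n' "$HOME"
    else
        pwd
    fi
}

scratch_dir=$(pick_scratch_dir)
# Only report the path here; a real script would redirect its output to it.
echo "scratch file would be: $scratch_dir/loclocloc"
```

With dedicated slot users that have home directories, the first branch is taken and the fallback is never needed.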

Hope this helps,
Mark

--
Cambridge eScience Centre, University of Cambridge
Centre for Mathematical Sciences, Wilberforce Road, Cambridge CB3 0WA
Tel. (+44/0) 1223 765317, Fax  (+44/0) 1223 765900
http://www.escience.cam.ac.uk/~mcal00

Ben Burnett wrote:
Hi Arash:

It may be that you are getting an error when the script tries to create the
loclocloc file in the current user's home directory.  If the job is run as
nobody, then there may be no home directory (or the job may not have access
to it).  As for the "bad number" error, it seems that the script is comparing the
string "hellow.exe" to 0 using an arithmetic comparison, which is invalid.
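
The "bad number" behaviour can be reproduced outside Condor. The sketch below is an illustration only: it is not taken from mp2script, the helper is_integer is hypothetical, and the guess that the script meant to test the argument count ($#) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a non-empty string of digits.
is_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# An arithmetic test on a non-numeric string is an error, not just "false":
# the shell prints a diagnostic and test returns a status greater than 1.
status=0
[ "hellow.exe" -eq 0 ] 2>/dev/null || status=$?
echo "arithmetic test on a string returned status $status"

# Simulate the job's arguments, then compare safely.
set -- hellow.exe

# Comparing the argument count is a valid integer comparison.
if [ "$#" -eq 0 ]; then
    echo "no arguments given"
else
    echo "first argument: $1"
fi

# If a value may or may not be numeric, check before using -eq or -ne.
if is_integer "$1" && [ "$1" -eq 0 ]; then
    echo "first argument is zero"
else
    echo "first argument is not the number zero"
fi
```

Which of these the script actually intended can only be settled by reading the lines around the failing tests.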

-B


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of arash
Sent: Tuesday, February 05, 2008 9:48 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] mpich2 error " '.../condor_exec.exe' witharguments
hellow.exe: No such file or directory"


Dear All
I am very sorry that I forgot to attach the related files.
All of the files are attached here.

Best wishes,
Arash
-----Original Message-----
From: arash [mailto:anoorghorbani@xxxxxxxxx]
Sent: Tuesday, February 05, 2008 7:01 PM
To: 'Condor-Users Mail List'
Subject: RE: [Condor-users] mpich2 error " '.../condor_exec.exe'
witharguments hellow.exe: No such file or directory"

Thanks for your consideration.
I added this line, but I get the same result.
There was also another error in my configuration: I was starting Condor twice
during Linux startup. After fixing that, the job seems to run, but I get no
output, and I still receive very similar error files.

Again, I have attached all of the related files.

I think there is an error in Mark Calleja's mp2script, or I am using this file
incorrectly. In particular, at the end of my error files you can see:

___________________________________________________

+ hostname=mpi0
+ pwd
+ currentDir=/home/condor/execute/dir_6717
+ whoami
+ user=condor
+ echo hellow.exe mpi0 4446 condor /home/condor/execute/dir_6717
+ /usr/local/condor/libexec/condor_chirp put -mode cwa - /home/condor/spool/cluster41.proc0.subproc0/contact
+ [ 0 -ne 0 ]
+ [ hellow.exe -eq 0 ]
[: 1: hellow.exe: bad number
+ EXECUTABLE=hellow.exe
+ shift
+ chmod +x hellow.exe
+ MPDIR=/usr/local/mpich2
+ PATH=/usr/local/mpich2/bin:.:/usr/local/condor/bin:/sbin:/bin:/usr/sbin:/usr/bin
+ export PATH
+ export SCRATCH_LOC=loclocloc
/home/condor/execute/dir_6717/condor_exec.exe: 39: cannot create ~/loclocloc: Directory nonexistent
+ echo /home/condor/execute/dir_6717
+ trap finalize TERM
+ [ hellow.exe -ne 0 ]
[: 1: hellow.exe: bad number
+ [ hellow.exe -eq 0 ]
[: 1: hellow.exe: bad number
+ exit 0

___________________________________________________


I don't know what loclocloc is, and I am also confused about the meaning of


[: 1: hellow.exe: bad number

Thanks again for your consideration,
Regards,
Arash


