
Re: [Condor-users] New to Condor, Need to RUN MPI

Hi Todd,
As per your suggestion, I just changed MPDIR:

# Set this to the bin directory of MPICH installation
export PATH



# The second field in the contact file is the machine name
# that condor_ssh knows how to use
sort -n +0 < $CONDOR_CONTACT_FILE | awk '{print $2}' > machines

## run the actual mpijob
mpirun -v -np $_CONDOR_NPROCS -machinefile machines $EXECUTABLE $@


That strange message seems to have gone away, but I still get the following:

running /var/opt/condor/execute/dir_6084/bones on 2 LINUX ch_p4 processors
Cannot read machines.
Looked for files with extension LINUX in
directory /opt/mpich/gnu/share .
I checked, and there is a file called machines.LINUX in that directory.
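
One way to narrow this down (a rough sketch, not part of the sample script): the "Cannot read machines." line may mean that the `machines` file built from the contact file was missing or unreadable by the time mpirun ran, so the script could check it first. The helper name `build_machines` is hypothetical, and the sketch assumes each contact-file line has the form "<rank> <hostname> ...", as the comment in the script implies.

```shell
#!/bin/sh
# Hypothetical helper: build a machine file from a contact file and
# verify it before handing it to mpirun.
# Assumption: each contact-file line is "<rank> <hostname> ...".
build_machines() {
    contact_file=$1
    out_file=$2

    # Fail early if the contact file is missing or unreadable.
    if [ ! -r "$contact_file" ]; then
        echo "contact file '$contact_file' is missing or unreadable" >&2
        return 1
    fi

    # Sort numerically on the first field (rank), keep the second (host).
    sort -n -k1,1 "$contact_file" | awk '{print $2}' > "$out_file"

    # An empty machine file gives mpirun nothing to work with.
    if [ ! -s "$out_file" ]; then
        echo "machine file '$out_file' is empty" >&2
        return 1
    fi
}
```

In the script this would be called as `build_machines "$CONDOR_CONTACT_FILE" machines || exit 1` just before the mpirun line, so that a missing contact file shows up as a clear message rather than as an mpirun error.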


Samir Khanal
CS Grad Student
Hayes 226
Bowling Green State University
Bowling Green, OH 43402

From: condor-users-bounces@xxxxxxxxxxx [condor-users-bounces@xxxxxxxxxxx] On Behalf Of Todd Tannenbaum [tannenba@xxxxxxxxxxx]
Sent: Friday, January 30, 2009 3:03 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] New to Condor, Need to RUN MPI

Samir Khanal wrote:
> I tried the parallel universe too; here is what I get:
> running /home/skhanal/condor/bones on 2 LINUX ch_p4 processors
> Created /var/opt/condor/execute/dir_5352/PILxVizf5531
> Host compute-0-0 is not in contact file /var/opt/condor/execute/dir_5352/contact
> p0_5556:  p4_error: Child process exited while making connection to remote process on compute-0-0: 0
> p0_5556: (2.003906) net_send: could not write to fd=4, errno = 32
> The job does not complete successfully and prints the above messages.
> Help ! Help!

Why did you feel compelled to hack the sample mp1script included with
Condor?  Are you trying to use mpich?  If so, just set the path
correctly (to MPDIR) in the sample script where the comment says so; no
other changes should be needed.

Your customizations to the sample mp1script look very suspect to me.

