
Re: [Condor-users] New to Condor, Need to RUN MPI



You are correct Javier.
I have one more question: how do you use flocking in Condor? I have configured two pools and changed the condor_config files on both for two-way flocking.
Has anyone tried this?
What is the syntax for submitting these kinds of jobs?
If the two pools have 5 CPUs each and I submit a 10-CPU request, will this feature split the job so that 5 CPUs come from one pool and 5 from the other?
I do not want to go through the hassles of Condor-G, at least for now, but I would definitely want to look into it in a couple of weeks.
Please shed some light on the flocking issue.
Thanks,
Samir

-----------------------------------
Distributed Computing Lab
Bowling Green State University
Bowling Green, OH 43402
skhanal@xxxxxxxx

________________________________________
From: condor-users-bounces@xxxxxxxxxxx [condor-users-bounces@xxxxxxxxxxx] On Behalf Of Javier Forment Millet [jforment@xxxxxxxxxxxx]
Sent: Thursday, February 05, 2009 5:03 AM
To: Condor-Users Mail List
Subject: Re: [Condor-users] New to Condor, Need to RUN MPI

Hi...

AFAIK, that is not absolutely true. For example, by using Condor you can take
advantage of a queuing system, which is not the case if you just launch MPI
jobs. For example, if two independent MPI jobs requiring 10 processors each are
launched with Condor on a 10-node cluster, Condor executes one of them, and the
other one waits until the first one has finished. If you launch them without
Condor, they will both run on the 10 nodes, sharing the CPU time between the
two jobs.

Of course, whether this is better depends on what one wants. But, in any case,
it is not the same thing.
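
To make the difference concrete, here is a rough sketch in Python. It is not
Condor code: the function names and the 60-minute runtimes are made up for
illustration, and it assumes ideal time-sharing (no context-switch or memory
overhead, which in practice would make the shared case worse).

```python
def fifo_finish_times(runtimes):
    """Queued execution: each job gets the whole cluster, one after another.

    runtimes[i] is how long job i needs when it has all nodes to itself.
    Returns the wall-clock time at which each job finishes.
    """
    finish = []
    clock = 0.0
    for runtime in runtimes:
        clock += runtime
        finish.append(clock)
    return finish


def timeshared_finish_times(runtimes):
    """Idealized time-sharing: all unfinished jobs split CPU time equally."""
    finish = [0.0] * len(runtimes)
    clock = 0.0
    work_done = 0.0          # work completed so far by every still-active job
    active = len(runtimes)   # number of jobs still running
    # Jobs finish in order of increasing runtime.
    for index, runtime in sorted(enumerate(runtimes), key=lambda p: p[1]):
        # With `active` jobs sharing the CPUs, each progresses at 1/active speed.
        clock += (runtime - work_done) * active
        work_done = runtime
        finish[index] = clock
        active -= 1
    return finish


if __name__ == "__main__":
    jobs = [60.0, 60.0]  # two MPI jobs, 60 minutes each on a dedicated cluster
    print("queued:     ", fifo_finish_times(jobs))
    print("time-shared:", timeshared_finish_times(jobs))
```

Under these assumptions the last job finishes at the same time either way, but
with the queue the first job is done in half the time.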

Please, correct me if I am wrong.

Cheers,

Javier.



Quoting "J.S. van Bethlehem" <j.s.van.bethlehem@xxxxxxxxxxxx>:

> It seems this message was blocked because I changed the alias of my
> email address. My apologies if I'm wrong and this is the second time the
> message has been sent.
> --------------------------------------------------------------------------
>
> Dear Mr. Khanal,
>
> I've been following your struggles with some interest, as I'm trying to
> run MPI jobs on Condor as well, except that I'm using Open MPI.
> Fortunately (for me) all the configuration is done by a system
> administrator at our institute, so I can't help you with that. But
> reading about your problems, I'm wondering about the following:
> It seems that you have a machine pool that is dedicated to running MPI jobs.
> If that is really the case, Condor is completely useless for you. In the
> parallel universe, _all_ that Condor does is create a machinefile and
> then call the mpirun command with that (temporary) machinefile. If you
> have a cluster dedicated to MPI computations, you can just as well
> run mpirun directly and the result will be exactly the same! (without all
> the Condor headaches)
>
> Greetings, Jakob
>
> Samir Khanal wrote:
> > Hi Zach
> > I tried and MPI worked too.
> > Thank you so much,
> > You made my day!
> > Samir
> >
> > -----Original Message-----
> > From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Zachary Miller
> > Sent: Tuesday, February 03, 2009 7:03 PM
> > To: Condor-Users Mail List
> > Subject: Re: [Condor-users] New to Condor, Need to RUN MPI
> >
> >> I looked up the CollectorLog and found the following entries. Those IPs are of the compute nodes
> >>
> >> 2/3 17:28:27 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.251:59011> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
> >
> > this is the key problem.
> >
> > you need to tell condor which IP addresses are allowed to do certain
> > operations.  look in the condor_config file for the string "HOSTALLOW"
> > and set them to something like this:
> >
> > HOSTALLOW_READ = 10.1.*
> > HOSTALLOW_WRITE = 10.1.*
> > HOSTALLOW_ADMINISTRATOR = $(CONDOR_HOST)
> >
> > this will allow machines in those networks to join your pool, run and/or
> > submit jobs depending on which daemons they are running.
> >
> > it also makes it so only the central manager can issue administrative
> > commands. you may wish to make the central manager a different machine from
> > your main submit point if you do not trust users on the submit machine not
> > to, say, turn off condor.
> >
> > i hope that helps!
> >
> >
> > cheers,
> > -zach
> >
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> >
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/condor-users/
>


--
Javier Forment Millet
Instituto de Biología Celular y Molecular de Plantas (IBMCP) CSIC-UPV
 Ciudad Politécnica de la Innovación (CPI) Edificio 8 E, Escalera 7 Puerta E
 Calle Ing. Fausto Elio s/n. 46022 Valencia, Spain
Tlf.:+34-96-3877858
FAX: +34-96-3877859
jforment@xxxxxxxxxxxx