
Re: [Condor-users] MPI condor Config




On Oct 20, 2006, at 5:50 PM, Diego Bello wrote:

Hi everyone, I have a Condor pool made of workstations to support MPI,
simple jobs, and DAG jobs, all using Globus. The Condor version is 6.8.0.

What I need is for MPI jobs to be stopped when a machine is in use,
which is normally between 10 am and 9 pm. I have tried some
configurations taken from the Condor manual, but some jobs don't
start. I think there could be a problem with the START configuration.

I'm now trying a DAG job with three jobs that do nothing more than run
/bin/hostname. It gets into the queue and the first job starts running,
but after several hours it still hasn't finished. If I send a Globus job
directly, it works. My proxy is valid for 48 hours.

I have attached the config files for my central manager and my execute nodes.
Can someone tell me if there is something wrong with them?


You'll want to adjust your START policy for the execute nodes as follows:

Add:
IsNighttime = (ClockMin < 600 || ClockMin > 1260)


Replace the START and PREEMPT expressions with:
START = ( (Scheduler =?= $(DedicatedScheduler) && $(IsNighttime) =?= TRUE && $(KeyboardIdleTime) > $(StartIdleTime) ) || $(START) )
PREEMPT = (Scheduler =!= $(DedicatedScheduler) && $(KeyboardBusy))

This policy allows MPI jobs to start only during nighttime hours (before 10 am or after 9 pm, since ClockMin counts minutes since midnight) and only if nobody is actively using the machine. Once you have set up the new policy, make sure you can run a simple Vanilla universe /bin/hostname job. When that works, try the DAG job with /bin/hostname again. Then try an MPI job.
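
For the first of those tests, a minimal Vanilla universe submit file might look like the sketch below. The file names (hostname.sub, hostname.out, hostname.err, hostname.log) are placeholders chosen for illustration, not taken from the attached configs.

```
# hostname.sub: minimal Vanilla universe test job (file names are placeholders)
universe   = vanilla
executable = /bin/hostname
output     = hostname.out
error      = hostname.err
log        = hostname.log
queue
```

Submit it with "condor_submit hostname.sub" and check the log file to see whether the job matched and ran on one of the execute nodes.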

If you are using the MPI Universe for your MPI jobs, I'd recommend switching to the Parallel Universe.
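
As a rough sketch of what a Parallel universe submit file looks like, the example below runs /bin/hostname on two machines instead of a real MPI program. The machine count and file names are assumptions for illustration only; a real MPI job would normally use a wrapper script as the executable to launch the MPI program.

```
# parallel-test.sub: Parallel universe smoke test (values are illustrative)
universe      = parallel
executable    = /bin/hostname
machine_count = 2
output        = parallel.out.$(NODE)
error         = parallel.err.$(NODE)
log           = parallel.log
queue
```

Parallel universe jobs are scheduled by the dedicated scheduler, so this job should only start when the START expression above allows it.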


Thanks,

Becky



Thanks.
--
Diego Bello Carreño
Estudiante Memorista de Ingeniería Civil Informática
UTFSM, Valparaíso, Chile
Usuario #294897 counter.li.org
<condor_config.local-central-manager>
<condor_config.local-exex-nodes>