
Re: [Condor-users] Looking for suggestions for inhomogeneous pool



On Thu, Nov 24, 2011 at 01:03:32PM +0100, Steffen Grunewald wrote:
> On Thu, Nov 24, 2011 at 10:48:33AM +0100, Steffen Grunewald wrote:
> > But I always get confused by the Target/My prefixes, and again this time I
> > suspect I got it wrong - parallel universe jobs, with a
> > 
> > NEGOTIATOR_PRE_JOB_RANK=1000000000 + 1000000000 * (TARGET.JobUniverse == 11) * TotalCpus - 1000 * Memory
> 
> Replacing the "==" with "=?=", plus a series of reconfigs and restarts,
> somehow fixed the issue of "low end" machines being selected.
> Now a different problem shows up.
> 
> 1. Not all output=out.$(NODE) files get written (2 of 8 missing)
> 2. Each multi-core machine (single dynamic slot) gets only one MPI node
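
Regarding the "==" vs "=?=" change: "==" evaluates to UNDEFINED when either
side is undefined, and UNDEFINED poisons the surrounding arithmetic, while
the meta-operator "=?=" always yields TRUE or FALSE. As I understand it, in
the negotiator's context MY refers to the machine ad and TARGET to the job
ad, so if I read your fix right the working line would be something like
this (untested sketch):

NEGOTIATOR_PRE_JOB_RANK = 1000000000 + \
    1000000000 * (TARGET.JobUniverse =?= 11) * TotalCpus - \
    1000 * Memory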

For point 2, does adding

    request_cpus = TARGET.Cpus

to the submit file give you what you need?
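
A minimal submit-file sketch of what I mean (mpi_wrapper.sh and the node
count are placeholders, and I have not tested this on 7.6):

    universe      = parallel
    executable    = mpi_wrapper.sh
    machine_count = 8
    # ask the matched (partitionable) slot for all of its cpus
    request_cpus  = TARGET.Cpus
    output        = out.$(NODE)
    error         = err.$(NODE)
    log           = mpi.log
    queue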

I'll leave the first question for others to answer :)

Regards,
Lukas

> 
> This is Condor 7.6.0, but the release notes don't promise that 7.6.4 would
> behave differently. What's a feasible way to schedule up to TotalCpus MPI
> nodes, without dropping the dynamic provisioning of slots?
> 
> https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToPackParallelJobs
> unfortunately doesn't say anything about the slot setup...
> 
> Cheers,
>  Steffen
> 
> -- 
> Steffen Grunewald * MPI Grav.Phys.(AEI) * Am Mühlenberg 1, D-14476 Potsdam
> Cluster Admin * --------------------------------- * http://www.aei.mpg.de/
> * e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7274,fax:7298}