[Condor-users] Some doubts about using LAM/MPI and about the dedicated scheduler


We are thinking of using Condor to manage a pool of dedicated
multiprocessor machines. One of our goals is to be able to run
parallel jobs using LAM/MPI, with each job running on a single machine
(using its different processors). We have been doing some tests with
only a few machines, and some questions have come up.

1. We tried to use the lamscript script provided, but it didn't work,
probably because the user's login shell is bash. Is it necessary to have
csh as the login shell in order to run lamscript? If so, how can we
work around that, since all users in our pool use bash? If I have
misunderstood, what exactly is meant by this paragraph from the manual:
"For LAM, there is a similar path setting, but it is called LAMDIR in
the lamscript script. In addition, this path must be part of the path
set in the user’s .cshrc script. As of this writing, the LAM
implementation does not work if the user’s login shell is the Bourne or
compatible shell."

2. Is it imperative to define a dedicated scheduler in order to run
parallel jobs, or is this only optional? If it is optional, what are the
advantages of defining one? What happens, for instance, when the
submission script defines a scheduler but is submitted from a different
machine (one that is not the dedicated scheduler)? Finally, how does the
central manager order the jobs from the different submit machines'
queues, and is this related to whether it is worth defining a dedicated
scheduler?

I hope I haven't asked too many boring questions... Thanks in advance.

Sara Campos