
Re: [Condor-users] How to combine SGE and Condor under the same submit system?



Atle,

We've seen folks use Condor as the main scheduler for clusters in several ROCKS v5.0 environments.

For more information about the Condor Roll for ROCKS 5.0 go here:
http://www.rocksclusters.org/wordpress/?p=94

If you decide to go this route, you'd get: 
- One submission system for all of your jobs.
- More flexible policies. You're right that Condor is more flexible with workstations, but it is with servers too. For example, you can give the servers that share a switch the same ParallelSchedulingGroup value and add this line to your jobs' submit files:
+WantParallelSchedulingGroups = True
This makes each MPI job run on servers that share a switch, which is frequently better for MPI.

- With servers and workstations in the same pool, you can run both batch (non-MPI) jobs and dedicated jobs on either kind of machine, where appropriate.
- Condor has a great deal of functionality around parallel jobs:
http://www.cs.wisc.edu/condor/manual/v7.0/2_9Parallel_Applications.html
http://www.cs.wisc.edu/condor/manual/v7.0/3_12Setting_Up.html#sec:Config-Dedicated-Jobs
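
As a rough sketch of the ParallelSchedulingGroup setup mentioned above: the group name "switch-a", the script name, and the machine count below are all placeholders, and the line that publishes the attribute assumes your version uses STARTD_ATTRS (older configurations may use STARTD_EXPRS instead). The local config on each server behind one switch might look like:

```
# condor_config.local on every server attached to the same switch
# "switch-a" is a made-up group name; use one name per switch
ParallelSchedulingGroup = "switch-a"
STARTD_ATTRS = $(STARTD_ATTRS), ParallelSchedulingGroup
```

A parallel universe submit file would then opt in:

```
# placeholder executable and node count
universe = parallel
executable = my_mpi_wrapper.sh
machine_count = 8
+WantParallelSchedulingGroups = True
queue
```

The servers still need the usual dedicated scheduler configuration described in the second manual link above.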

One member of the Cycle team is setting up an environment in which modern multi-core workstations set aside a core or two (out of 4 or 8) to run dedicated jobs that are never preempted, so the workstations can be used effectively. You can also tag your long-running MPI jobs to favor servers, while short-running MPI jobs favor available workstation cores. If you want some configuration help doing this, please feel free to get in touch.
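
For the server-versus-workstation preference, one possible approach is to advertise a custom attribute on the servers and rank on it in the submit file. This is only a sketch: IsServer is a made-up attribute name, not something Condor defines.

```
# server-side local config: advertise a custom, hypothetical attribute
IsServer = True
STARTD_ATTRS = $(STARTD_ATTRS), IsServer

# submit file for a long-running MPI job: prefer machines that advertise it
# (=?= compares without yielding UNDEFINED where IsServer is not advertised)
rank = (IsServer =?= True)
```

A short-running job could use rank = (IsServer =!= True) to lean toward workstation cores instead. How strongly the dedicated scheduler honors rank for parallel universe jobs may depend on your Condor version, so treat this as a starting point and check it against the manual.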

Hope this helps, and good luck!
Doug

On Oct 30, 2008, at 11:58 PM, Matthew Farrellee wrote:

Why not use Condor for both? You can use Condor's Parallel Universe for
your MPI jobs.

Best,


matt

Atle Rudshaug wrote:
Hi!

I want to combine a dedicated cluster and the office workstations into a
grid-like system. I want large mpi jobs to run on a Rocks cluster
(which uses SGE) and smaller (mpi and non-mpi) jobs to run on available
SMP workstations (using some technique for cycle scavenging with
checkpointing and migration).

AFAIK Condor is better for cycle scavenging and checkpointing/migration.
Is there a way to combine SGE (for the cluster mpi jobs) and Condor (for
cycle scavenging on the workstations) under the same submit system
(GridWay? Condor-G? Condor-C?)? Or can SGE be used for the workstation
cycle scavenging, from the already up and running Rocks cluster, as well?

- Atle

_______________________________________________
Condor-users mailing list
To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/condor-users/

-- 
===================================
Douglas Clayton
phone: 919.647.9648

Cycle Computing, LLC
Leader in Condor Grid Solutions
Enterprise Condor Support and Management Tools

http://www.cyclecomputing.com