
Re: [Condor-users] MPI and multiple SMP machines



> This was added in 6.7.17, so you should be good.
>
> -Greg

Unfortunately, it does not work the way I think it should (following 
section 3.13.8.2 of the manual):

I created a small job description file as follows:

	universe = parallel
	executable = /bin/sleep
	arguments = 300
	machine_count = 2
	+WantParallelSchedulingGroups = True
	queue

On the SMP machines I first tried to set

ParallelSchedulingGroup = test

and also

ParallelSchedulingGroup = $(HOSTNAME)

in their local config files. I also tried setting it in the global 
config file.
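
I am not sure whether ParallelSchedulingGroup also has to be advertised 
in the machine ClassAd; if it does, I would guess the local config 
should look roughly like this (the quoted group name and the 
STARTD_EXPRS line are just my assumption):

	ParallelSchedulingGroup = "test"
	STARTD_EXPRS = $(STARTD_EXPRS), ParallelSchedulingGroup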

After restarting Condor on the SMP machines and on the dedicated 
scheduler, the job was submitted but never started. Instead, I see the 
following log entries:

11/20 16:22:15 (pid:20614) Trying to satisfy job with group scheduling
11/20 16:22:15 (pid:20614) Job requested parallel scheduling groups, but 
no groups found
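
One thing I plan to check is whether the attribute actually shows up in 
the machine ClassAds, e.g. with something like (the host name is just a 
placeholder):

	condor_status -long smp-node.example.com | grep ParallelSchedulingGroup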


Do you have any idea what I might have done wrong?

Without setting 

+WantParallelSchedulingGroups = True

the job runs fine on two different machines.

Jens