
[Condor-users] Running parallel job in condor


I am new to Condor. I am trying to run jobs in parallel using this submit description file:

universe = parallel
executable = /bin/date
log = logfile
#input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 1
should_transfer_files = yes
when_to_transfer_output = on_exit

However, the job stays in the idle state even though 100 processors are
free. Can anyone guess what the problem might be?

Also, I found in the manual that I need dedicated resources in order
to run jobs in the parallel universe. So, I made one of the nodes
dedicated by copying the example config file from the
/opt/condor/etc/examples dir to
/opt/condor/hostname/condor_config.local. I also added

Scheduler = DedicatedScheduler@hostname

to the condor_config.local file on the main node running the scheduler.
But the problem still remains. Is there a command to figure out
whether a resource is dedicated or not?
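
For reference, here is a rough sketch of the kind of settings the manual's
dedicated-resource example is understood to put on an execute node. This
is an assumption based on the manual, not a copy of the file on my
cluster: "full.host.name" is a placeholder for the submit machine's full
host name, and older releases may spell STARTD_ATTRS as STARTD_EXPRS.

```
# Sketch only: settings for an execute node that should act as a
# dedicated resource. "full.host.name" is a placeholder, not a real host.

# Name of the dedicated scheduler that is allowed to claim this node.
DedicatedScheduler = "DedicatedScheduler@full.host.name"

# Advertise that attribute in the machine ClassAd
# (assumed name; older releases may use STARTD_EXPRS instead).
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

# Policy: always start jobs, never suspend or preempt them.
START        = True
SUSPEND      = False
CONTINUE     = True
PREEMPT      = False
KILL         = False
WANT_SUSPEND = False
WANT_VACATE  = False

# Prefer jobs that come from the dedicated scheduler.
RANK = Scheduler =?= $(DedicatedScheduler)
```

Note that in this sketch the attribute is called DedicatedScheduler, its
value is quoted, and it is set on the execute node rather than on the
node running the scheduler, which differs from the line I added above.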

Any troubleshooting ideas would be helpful. Also, jobs submitted to
the vanilla universe run perfectly.

Thank you very much.


"I have never taken any exercise except sleeping and resting." - Mark Twain

Sourangshu Bhattacharya
PhD Student,
Dept. of Computer Science & Automation,
Indian Institute of Science,
Bangalore, India.