
[Condor-users] Running parallel job in condor



Hi,

I am new to Condor. I am trying to run jobs in parallel using this submit file:

universe = parallel
executable = /bin/date
log = logfile
#input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 1
should_transfer_files = yes
when_to_transfer_output = on_exit
queue

However, the job stays in the idle state even though 100 processors are
free. Can anyone guess what the problem might be?

Also, I found out from the manual that I need dedicated resources in
order to run jobs in the parallel universe. So, I made one of the nodes
dedicated by copying the example config file from the
/opt/condor/etc/examples dir to
/opt/condor/hostname/condor_config.local. I also added

Scheduler = DedicatedScheduler@hostname

to the condor_config.local file on the main node running the scheduler.
But the problem still remains. Is there a command to figure out
whether a resource is dedicated or not?
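
For reference, this is roughly what I understand the manual's
dedicated-resource example to put in the local config file of the
execute node. It is only a sketch and I may be misreading it:
full.host.name is a placeholder for the machine running the dedicated
scheduler, and older Condor versions may use STARTD_EXPRS in place of
STARTD_ATTRS.

```
# Placeholder host name: the submit machine whose schedd acts as the
# dedicated scheduler for this node.
DedicatedScheduler = "DedicatedScheduler@full.host.name"

# Advertise the attribute in this machine's ClassAd.
STARTD_ATTRS = $(STARTD_ATTRS), DedicatedScheduler

# Prefer jobs that come from the dedicated scheduler.
RANK = Scheduler =?= $(DedicatedScheduler)

# Always run jobs and never suspend or preempt them.
START = True
SUSPEND = False
CONTINUE = True
PREEMPT = False
KILL = False
WANT_SUSPEND = False
WANT_VACATE = False
```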

Any troubleshooting ideas would be helpful. Also, jobs submitted to
the vanilla universe run perfectly.

Thank you very much.

Regards,
Sourangshu

-- 
"I have never taken any exercise except sleeping and resting." - Mark Twain

Sourangshu Bhattacharya
PhD Student,
Dept. of Computer Science & Automation,
Indian Institute of Science,
Bangalore, India.

http://people.csa.iisc.ernet.in/sourangshu