I am using Condor 7.4.4 on two Ubuntu machines running under VMware.
I have set up Globus and can submit and run jobs. I have set up Condor and can submit and run jobs. With only one machine, I can use Globus to run jobs on Condor and vice versa.
When I add a second machine and submit 5 jobs, 2-3 go to one machine and the rest go to the other. On the manager machine the jobs run without a problem. On the second machine the jobs appear to start running and are then put on hold for the following reason (from condor_q -better-analyze):
Hold reason: Error from helium.adiroy.com: Failed to open '/home/globus/.globus/job/hydrogen.adiroy.com/16073795612117631466.2588226358823932351/stdout' as standard output: No such file or directory (errno 2)
The directory does not exist, and creating it by hand makes no difference. I have opened the ports for GLOBUS_TCP_PORT, and that did not help either. I have searched the web quite extensively but cannot find any more information. Can someone help me? Thanks in advance.
The job is defined as:
executable = /bin/hostname
globusscheduler = hydrogen
universe = globus
output = condorg.out.$(cluster).$(Process)
log = condorg.log.$(cluster).$(Process)
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
stream_output = true
stream_error = true