
[Condor-users] How to solve a problem between Condor and Globus?



Hello, all,
I installed Condor on two machines (I do not use NFS or AFS) and tried to submit
some Globus jobs. When I submitted a vanilla job, everything worked fine.
However, when I tried to submit Globus jobs, they stayed in the job queue even
though a machine was available. Does anyone know how to solve this?

The following is what I got from the log files.
==MatchLog==
12/19 07:42:20       Matched 4350.0 sary357@xxxxxxxxxxxxxxxxxx 
<140.109.98.41:34992> preempting none <140.109.98.40:42530>


==NegotiatorLog==
12/19 07:42:20   Getting startd private ads ...
12/19 07:42:20 Got ads: 14 public and 4 private
12/19 07:42:20 Public ads include 5 submitter, 4 startd
12/19 07:42:20 Phase 2:  Performing accounting ...
12/19 07:42:20 Phase 3:  Sorting submitter ads by priority ...
12/19 07:42:20 Phase 4.1:  Negotiating with schedds ...
12/19 07:42:20   Negotiating with sary357@xxxxxxxxxxxxxxxxxx at 
<140.109.98.41:34992>
12/19 07:42:20     Request 04350.00000:
12/19 07:42:20       Matched 4350.0 sary357@xxxxxxxxxxxxxxxxxx 
<140.109.98.41:34992> preempting none <140.109.98.40:42530>
12/19 07:42:20       Successfully matched with vm2@xxxxxxxxxxxxxxxxxxxxxxxxx
12/19 07:42:20     Got NO_MORE_JOBS;  done negotiating


==SchedLog==
12/19 07:42:30 Shadow pid 6088 for job 4350.0 exited with status 4
12/19 07:42:30 ERROR: Shadow exited with job exception code!
12/19 07:42:32 Starting add_shadow_birthdate(4350.0)
12/19 07:42:32 Started shadow for job 4350.0 on "<140.109.98.40:42530>", 
(shadow pid = 6092)
12/19 07:42:32 Shadow pid 6092 for job 4350.0 exited with status 4
12/19 07:42:32 ERROR: Shadow exited with job exception code!
12/19 07:42:34 Starting add_shadow_birthdate(4350.0)
12/19 07:42:34 Started shadow for job 4350.0 on "<140.109.98.40:42530>", 
(shadow pid = 6097)
12/19 07:42:34 Shadow pid 6097 for job 4350.0 exited with status 4
12/19 07:42:34 ERROR: Shadow exited with job exception code!
12/19 07:42:34 Match for cluster 4350 has had 5 shadow exceptions, 
relinquishing.
12/19 07:42:34 Sent RELEASE_CLAIM to startd on <140.109.98.40:42530>
12/19 07:42:34 Match record (<140.109.98.40:42530>, 4350, 0) deleted
12/19 07:42:34 DaemonCore: Command received via TCP from host 
<140.109.98.40:42555>
12/19 07:42:34 DaemonCore: received command 443 (VACATE_SERVICE), calling 
handler (vacate_service)
12/19 07:42:34 Got VACATE_SERVICE from <140.109.98.40:42555>

==ShadowLog==
12/19 07:42:30 ******************************************************
12/19 07:42:30 ** condor_shadow (CONDOR_SHADOW) STARTING UP
12/19 07:42:30 ** /opt/osg/osg_0.2.0/condor/sbin/condor_shadow
12/19 07:42:30 ** $CondorVersion: 6.7.7 Apr 27 2005 $
12/19 07:42:30 ** $CondorPlatform: I386-LINUX_RH9 $
12/19 07:42:30 ** PID = 6088
12/19 07:42:30 ******************************************************
12/19 07:42:30 Using config file: /opt/osg/osg_0.2.0/condor/etc/condor_config
12/19 07:42:30 Using local config 
files: /opt/osg/osg_0.2.0/condor/home/condor_config.local
12/19 07:42:30 DaemonCore: Command Socket at <140.109.98.41:35215>
12/19 07:42:30 Initializing a VANILLA shadow for job 4350.0
12/19 07:42:30 (4350.0) (6088): Request to run on <140.109.98.40:42530> was 
ACCEPTED
12/19 07:42:30 (4350.0) (6088): ERROR "Error from starter on
vm2@xxxxxxxxxxxxxxxxxxxxxxxxx: Failed to open standard output file
'/home/sary357/.globus/job/osgc01.grid.sinica.edu.tw/5973.1134978120/stdout':
No such file or directory (errno 2)" at line 597 in file pseudo_ops.C


And the following is my job description file.

Universe        = globus
globusscheduler = osgc01.grid.sinica.edu.tw/jobmanager-condor
Executable      = job4.sh
Output          = job4.out
Error           = job4.err
Log             = job4.log
Requirements    = (Name=="vm2@xxxxxxxxxxxxxxxxxxxxxxxxx")
should_transer_file =  IF_NEEDED
when_to_transfer_output = ON_EXIT
Queue
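
While re-reading this mail I noticed that the transfer directive in my submit
file may be misspelled; the Condor manual spells it should_transfer_files. I
have not yet retested with the corrected spelling, so I do not know whether it
is related to the error, but for reference the corrected lines would be:

# corrected spelling as documented in the Condor manual (untested on my setup)
should_transfer_files   = IF_NEEDED
when_to_transfer_output = ON_EXIT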



Any help is appreciated.

Best regards

----------------------------------------------------------------------
Fu-Ming Tsai
sary357@xxxxxxxxxxxxxxxxxx
------------------------------------------------------------------------