
Re: [Condor-users] Condor-G and GT4




What reason is given for the job going on hold? You can find out by running 'condor_q -held' or 'condor_q -l'.
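
If the full 'condor_q -l' output is too long to scan, something like the filter below may help. It is only a sketch: the function name hold_reason and the job ID 123.0 are placeholders, and it assumes your Condor version prints the HoldReason, HoldReasonCode and HoldReasonSubCode attributes one per line in the form "Name = value".

```shell
# Keep only the hold-related attributes from a long-format job ad on stdin.
# Assumes attributes appear one per line as: Name = value
hold_reason() {
    grep -E '^HoldReason(Code|SubCode)? = '
}

# Usage (123.0 is a placeholder job ID):
#   condor_q -l 123.0 | hold_reason
```

The HoldReason string is usually the most useful part to post back to the list.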

--Dan

Digvijoy Chatterjee wrote:

Hi List,

I have 51 globus-wsrf-services running on 4 Linux boxes (one is IA-64, the other 3 are i686, i.e. INTEL in Condor parlance) on port 8443. GSI etc. is all configured.

I did this:

$ make gt4-gram-condor
$ make install
$ $GLOBUS_LOCATION/setup/globus/setup-globus-job-manager-condor
$ $GLOBUS_LOCATION/setup/globus/setup-globus-scheduler-provider-condor

On starting, the container spawned two processes:

/usr/local/globus-4.0.1/libexec/globus-scheduler-event-generator -s fork -t 1134561543
/usr/local/globus-4.0.1/libexec/globus-scheduler-event-generator -s condor -t 1134561770

What else is needed to configure Condor-G? We are not able to submit jobs. For example, when we try to submit a simple shell script like this:
------------------------------------------------------------------------------
#!/bin/bash
hostname
------------------------------------------------------------------------------
with the submit file:
Universe        = grid
Grid_Type       = gt4
Jobmanager_Type = Fork
GlobusScheduler = https://advaitha:8443 (this is the IA64 box)
Executable      = myhostname
Output          = job.output
Error           = job.error
Log             = job.log
Queue

the job is held in the "H" state and never runs, even though the IA64 machine is free.

Here is a snippet of the SchedLog file:

12/15 17:37:01 (pid:18784) warning: setting UserUid to 99, was 520 previosly
12/15 17:37:02 (pid:18784) IO: Failed to read packet header
12/15 17:37:02 (pid:18784) IO: Failed to read packet header
12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21180 status=0 owner=condor
12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21181 status=0 owner=null
12/15 17:37:02 (pid:18784) IO: Failed to read packet header
12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21182 status=0 owner=vinodh
12/15 17:37:14 (pid:18784) Received HTTP POST connection from <172.25.243.135:18137>
12/15 17:37:14 (pid:18784) About to serve HTTP request...
12/15 17:37:14 (pid:18784) Completed servicing HTTP request
12/15 17:37:16 (pid:18784) Received HTTP POST connection from <172.25.243.135:18139>
12/15 17:37:16 (pid:18784) About to serve HTTP request...
12/15 17:37:16 (pid:18784) Completed servicing HTTP request
12/15 17:37:19 (pid:18784) Received HTTP POST connection from <172.25.243.135:18144>
12/15 17:37:19 (pid:18784) About to serve HTTP request...
12/15 17:37:19 (pid:18784) Completed servicing HTTP request
12/15 17:41:26 (pid:18784) IO: Failed to read packet header
12/15 17:41:40 (pid:18784) DaemonCore: Command received via TCP from host <172.25.243.135:18371>
12/15 17:41:40 (pid:18784) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)
12/15 17:41:43 (pid:18784) IO: Failed to read packet header
12/15 17:41:53 (pid:18784) Sent ad to central manager for nobody@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for nobody@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to central manager for condor@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for condor@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to central manager for null@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for null@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to central manager for vinodh@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for vinodh@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner nobody pid=21690
12/15 17:41:53 (pid:18784) warning: setting UserUid to 522, was 99 previosly
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner condor pid=21691
12/15 17:41:53 (pid:18784) warning: setting UserUid to 525, was 522 previosly
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner null pid=21692
12/15 17:41:53 (pid:18784) warning: setting UserUid to 520, was 525 previosly
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner vinodh pid=21693
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21690 status=0 owner=nobody
12/15 17:42:01 (pid:18784) warning: setting UserUid to 99, was 520 previosly
12/15 17:42:01 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21691 status=0 owner=condor
12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21692 status=0 owner=null

------------------------------------------------------------------------

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users