Re: [Condor-users] Condor-G and GT4



It sounds like your Condor daemons were started as user condor and not root. Therefore, the gridmanager daemon (which normally runs as the job owner) can only run as condor, hence the name of its log / tmp. Since Condor can't become user digz, it can't read your proxy. Start Condor as root and you should be fine.
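
A quick way to check which user the daemons are running as is to look at the owner of the condor_master process. This is only a rough sketch: the helper name `master_owner` is made up for illustration, and the `ps` options assume a Linux (procps-style) `ps`.

```shell
# Hypothetical helper: read "USER COMMAND" pairs on stdin and print the
# user that owns each condor_master process.
master_owner() {
    awk '$2 == "condor_master" { print $1 }'
}

# Separate -o options with empty headers suppress the header line.
ps -e -o user= -o comm= | master_owner
```

If this prints `condor` rather than `root`, the daemons were not started as root. If it prints nothing, no condor_master is running.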

 -- Jaime

On Dec 15, 2005, at 10:50 AM, Digvijoy Chatterjee wrote:

Thanks Dan, that helped, but I am still not able to solve it.

$ condor_q -held gives:
Submitter: advaitha.ad.infosys.com : <172.25.243.135:33879> : advaitha.ad.infosys.com
 ID      OWNER           HELD_SINCE HOLD_REASON
  17.0   digz           12/15 20:24 Failed to acquire proxy

1 jobs; 0 idle, 0 running, 1 held

I am able to do all normal operations on the grid using this username
digz, so it's probably not a proxy issue. On googling I found that
/tmp/Gridmanager.$(USERNAME) should help, but in my temp directory only
Gridmanager.condor is created; there is no Gridmanager.digz. Based on
whatever I could make out, I started condor_c-gahp, but that didn't help
either. I could not see anything in the config file about PROXY or
GRIDMANAGER.

Any pointers?
Digz


-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
Sent: Thursday, December 15, 2005 10:02 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor-G and GT4


What reason is given for the job going on hold?  You can find out by
running 'condor_q -held' or 'condor_q -l'.
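
As a sketch of the second form: `condor_q -l` prints the full job ClassAd, and the hold reason is stored in its `HoldReason` attribute. The helper name `hold_reason` and the job ID below are illustrative only.

```shell
# Hypothetical helper: pull the HoldReason attribute out of a job ClassAd
# read on stdin (i.e. the output of `condor_q -l`).
hold_reason() {
    sed -n 's/^HoldReason = "\(.*\)"$/\1/p'
}

job_id=17.0   # example cluster.proc ID; substitute your own

# Only query the queue if condor_q is actually on the PATH.
if command -v condor_q >/dev/null 2>&1; then
    condor_q -l "$job_id" | hold_reason
fi
```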

--Dan

Digvijoy Chatterjee wrote:

Hi List,

I have 51 globus-wsrf-services running on 4 Linux boxes (one of them
is IA-64, the other 3 are i686, i.e. INTEL in Condor parlance) at port
8443 (GSI etc. is all configured).

I did this:

$ make gt4-gram-condor
$ make install
$ $GLOBUS_LOCATION/setup/globus/setup-globus-job-manager-condor
$ $GLOBUS_LOCATION/setup/globus/setup-globus-scheduler-provider-condor

On starting, the container spawned two processes like:

/usr/local/globus-4.0.1/libexec/globus-scheduler-event-generator -s fork -t 1134561543
/usr/local/globus-4.0.1/libexec/globus-scheduler-event-generator -s condor -t 1134561770

What else is needed to configure Condor-G? We are not able to submit
jobs. For example, when we try to submit a simple shell script like:

----------------------------------------------------------------------------
#!/bin/bash
hostname
----------------------------------------------------------------------------
with the submit file:
Universe        = grid
Grid_Type       = gt4
Jobmanager_Type = Fork
GlobusScheduler = https://advaitha:8443   (this is the IA64 box)
Executable      = myhostname
Output          = job.output
Error           = job.error
Log             = job.log
Queue

The job is held in the "H" state and never runs, even though the IA64
machine is free.

Here is a snippet of the SchedLog file:

12/15 17:37:01 (pid:18784) warning: setting UserUid to 99, was 520 previosly
12/15 17:37:02 (pid:18784) IO: Failed to read packet header
12/15 17:37:02 (pid:18784) IO: Failed to read packet header
12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21180 status=0 owner=condor
12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21181 status=0 owner=null
12/15 17:37:02 (pid:18784) IO: Failed to read packet header
12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21182 status=0 owner=vinodh
12/15 17:37:14 (pid:18784) Received HTTP POST connection from <172.25.243.135:18137>
12/15 17:37:14 (pid:18784) About to serve HTTP request...
12/15 17:37:14 (pid:18784) Completed servicing HTTP request
12/15 17:37:16 (pid:18784) Received HTTP POST connection from <172.25.243.135:18139>
12/15 17:37:16 (pid:18784) About to serve HTTP request...
12/15 17:37:16 (pid:18784) Completed servicing HTTP request
12/15 17:37:19 (pid:18784) Received HTTP POST connection from <172.25.243.135:18144>
12/15 17:37:19 (pid:18784) About to serve HTTP request...
12/15 17:37:19 (pid:18784) Completed servicing HTTP request
12/15 17:41:26 (pid:18784) IO: Failed to read packet header
12/15 17:41:40 (pid:18784) DaemonCore: Command received via TCP from host <172.25.243.135:18371>
12/15 17:41:40 (pid:18784) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)
12/15 17:41:43 (pid:18784) IO: Failed to read packet header
12/15 17:41:53 (pid:18784) Sent ad to central manager for nobody@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for nobody@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to central manager for condor@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for condor@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to central manager for null@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for null@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to central manager for vinodh@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for vinodh@xxxxxxxxxxxxxxxxxxxxxxx
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner nobody pid=21690
12/15 17:41:53 (pid:18784) warning: setting UserUid to 522, was 99 previosly
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner condor pid=21691
12/15 17:41:53 (pid:18784) warning: setting UserUid to 525, was 522 previosly
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner null pid=21692
12/15 17:41:53 (pid:18784) warning: setting UserUid to 520, was 525 previosly
12/15 17:41:53 (pid:18784) Started condor_gmanager for owner vinodh pid=21693
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:41:56 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21690 status=0 owner=nobody
12/15 17:42:01 (pid:18784) warning: setting UserUid to 99, was 520 previosly
12/15 17:42:01 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) IO: Failed to read packet header
12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21691 status=0 owner=condor
12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21692 status=0 owner=null




_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users



+----------------------------------+---------------------------------+
|            Jaime Frey            |  Public Split on Whether        |
|        jfrey@xxxxxxxxxxx         |  Bush Is a Divider              |
|  http://www.cs.wisc.edu/~jfrey/  |         -- CNN Scrolling Banner |
+----------------------------------+---------------------------------+