
Re: [Condor-users] condor_read(): Socket closed when trying to read 5 bytes in StartLog



Hi Matt,
  It did not actually start the Starter process. There is no entry in
the StarterLog at that time.

 The StarterLog was last touched at 10/17 10:44:30, well before the
 13:09:57 entries in the StartLog.
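
To line the two logs up, the failing starter exits can be pulled out of the StartLog and their timestamps compared with the last entry in the StarterLog. Below is a minimal Python sketch, assuming the line format shown in the StartLog excerpt quoted further down ("MM/DD HH:MM:SS Starter pid N exited with status S"). The function name failed_starters and the sample list are made up for illustration and are not part of Condor.

```python
import re

# Matches StartLog lines such as:
#   10/17 13:09:57 Starter pid 31556 exited with status 1
# The pattern is an assumption based on the excerpt quoted in this thread.
EXIT_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"Starter pid (?P<pid>\d+) exited with status (?P<status>\d+)"
)


def failed_starters(lines):
    """Return (timestamp, pid, status) for each starter that exited non-zero."""
    failures = []
    for line in lines:
        match = EXIT_RE.match(line.strip())
        if match and int(match.group("status")) != 0:
            failures.append(
                (
                    match.group("stamp"),
                    int(match.group("pid")),
                    int(match.group("status")),
                )
            )
    return failures


# Sample lines copied from the StartLog excerpt in this thread.
sample = [
    "10/17 13:09:57 Changing activity: Idle -> Busy",
    "10/17 13:09:57 Starter pid 31556 exited with status 1",
    "10/17 13:09:57 State change: starter exited",
    "10/17 13:09:57 Starter pid 31557 exited with status 1",
]

for stamp, pid, status in failed_starters(sample):
    print(f"{stamp} starter {pid} exited with status {status}")
```

To run it against a real log, pass an open file object instead of the sample list. The path to the StartLog depends on the local configuration and is not given in this thread.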

by
Johnson




On Fri, 2008-10-17 at 07:50 -0500, Matthew Farrellee wrote:
> > 10/17 13:09:57 Starter pid 31556 exited with status 1
>  > 10/17 13:09:57 State change: starter exited
>  > 10/17 13:09:57 Changing activity: Busy -> Idle
> 
> The Starter exiting with status 1 is an error in the Starter. Look in 
> the StarterLog.
> 
> Best,
> 
> 
> matt
> 
> Johnson koil Raj wrote:
> > Hi,
> > 
> >    When I submitted a job, it matched all the machines in the pool.
> > The Negotiator sent a match for a particular machine X, but the StartLog
> > on machine X shows the following, and the job stays idle.
> > 
> > 1) Why does Condor not find another suitable machine in the pool,
> > given that the job does not start on machine X?
> > 
> > 2) Why does it keep trying the same machine X for that job?
> > 
> > 3) What kind of error is "condor_read(): Socket closed"?
> > 
> > -------------Start Log ---------
> > 
> > 10/17 13:09:57 Remote global job ID is
> > scorpio.pesgrid.wipro.com#1224227353#161.0
> > 10/17 13:09:57 JobLeaseDuration not defined: using 1800 (alive_interval
> > [300] * max_missed [6]
> > 10/17 13:09:57 About to Create_Process "condor_starter -f
> > scorpio.pesgrid.wipro.com"
> > 10/17 13:09:57 Create_Process: using fast clone() to create child
> > process.
> > 10/17 13:09:57 Got RemoteUser (idealgrid@xxxxxxxxxxxxxxxxx) from request
> > classad
> > 10/17 13:09:57 Got universe "VM" (13) from request classad
> > 10/17 13:09:57 State change: claim-activation protocol successful
> > 10/17 13:09:57 Changing activity: Idle -> Busy
> > 10/17 13:09:57 condor_read(): Socket closed when trying to read 5 bytes
> > from <127.0.0.1:43202>
> > 10/17 13:09:57 IO: EOF reading packet header
> > 10/17 13:09:57 Closing job ClassAd update socket from starter.
> > 10/17 13:09:57 DaemonCore: No more children processes to reap.
> > 10/17 13:09:57 Starter pid 31556 exited with status 1
> > 10/17 13:09:57 State change: starter exited
> > 10/17 13:09:57 Changing activity: Busy -> Idle
> > 10/17 13:09:57 Got activate_claim request from shadow
> > (<10.201.42.242:9603>)
> > 10/17 13:09:57 Read request ad and starter from shadow.
> > 10/17 13:09:57 Swap space: 1052124
> > 10/17 13:09:57 28786748 kbytes available for "/vm/local.grid7/execute"
> > 10/17 13:09:57 Looking up RESERVED_DISK parameter
> > 10/17 13:09:57 Reserving 5120 kbytes for file system
> > 10/17 13:09:57 Total execute space: 28781628
> > 10/17 13:09:57 Remote job ID is 161.0
> > 10/17 13:09:57 Remote global job ID is
> > scorpio.pesgrid.wipro.com#1224227353#161.0
> > 10/17 13:09:57 JobLeaseDuration not defined: using 1800 (alive_interval
> > [300] * max_missed [6]
> > 10/17 13:09:57 About to Create_Process "condor_starter -f
> > scorpio.pesgrid.wipro.com"
> > 10/17 13:09:57 Create_Process: using fast clone() to create child
> > process.
> > 10/17 13:09:57 Got RemoteUser (idealgrid@xxxxxxxxxxxxxxxxx) from request
> > classad
> > 10/17 13:09:57 Got universe "VM" (13) from request classad
> > 10/17 13:09:57 State change: claim-activation protocol successful
> > 10/17 13:09:57 Changing activity: Idle -> Busy
> > 10/17 13:09:57 condor_read(): Socket closed when trying to read 5 bytes
> > from <127.0.0.1:34016>
> > 10/17 13:09:57 IO: EOF reading packet header
> > 10/17 13:09:57 Closing job ClassAd update socket from starter.
> > 10/17 13:09:57 DaemonCore: No more children processes to reap.
> > 10/17 13:09:57 Starter pid 31557 exited with status 1
> > 10/17 13:09:57 State change: starter exited
> > 10/17 13:09:57 Changing activity: Busy -> Idle
> > 
> > 
> > by
> > Johnson

