
Re: [HTCondor-users] Limit Memory



Hi,

I still have problems. I have been running tests for two days and
everything works, except that one machine crashes.
I looked at the logs but could not find how to solve the problem.
Here are my logs:

On submit machine:

09/10/13 15:51:42 (71.137) (18501): ERROR "Error from slot2@machine02: unable to restart the ProcD after several tries" at line 558 in file /slots/05/dir_28144/userdir/src/condor_shadow.V6.1/pseudo_ops.cpp
Before that, there is nothing special (no warnings or errors).

On execute machine ("machine02"):

09/10/13 15:51:42 error writing to named pipe: watchdog pipe has closed
09/10/13 15:51:42 LocalClient: error sending message to server
09/10/13 15:51:42 ProcFamilyClient: failed to start connection with ProcD
09/10/13 15:51:42 get_usage: ProcD communication error
09/10/13 15:51:42 waiting a second to allow the ProcD to be restarted
09/10/13 15:58:05 ** Log last touched 9/10 15:51:42

On the grid master I see nothing unusual around the same time (15:50 to
15:53).

Thank you in advance if you can help me.

Goodbye, and have a nice day.

--
Romain