
Re: [Condor-users] idle jobs



Thanks,
Yes, I have set the NIS domain.
However, it does not work.
When I set
RESERVED_SWAP = 0, as the SchedLog hints,
it works, but the job is killed every time.
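
For reference, the change amounts to something like the fragment below. This is only a sketch: the file name condor_config.local is an assumption about where this pool keeps its local settings, and as far as I understand the value is in megabytes, with 0 meaning no swap is held back. Running condor_reconfig on the submit machine should make the schedd pick up the new value.

```
# condor_config.local (assumed location of the local config file)
# Reserve no swap, so the schedd does not refuse to start shadows
# once its swap space estimate is reached.
RESERVED_SWAP = 0
```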
 
regards, 
Lizhe 
 
SchedLog: 
9/7 16:44:22 DaemonCore: Command received via UDP from host <194.199.22.87:33110>
9/7 16:44:22 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
9/7 16:44:22 Sent ad to central manager for lwang@xxxxxxxxxxxxxxxxxxx 
9/7 16:44:22 Called reschedule_negotiator() 
9/7 16:44:22 DaemonCore: Command received via TCP from host <194.199.22.87:36201>
9/7 16:44:22 DaemonCore: received command 416 (NEGOTIATE), calling handler (negotiate)
9/7 16:44:22 Negotiating for owner: lwang@xxxxxxxxxxxxxxxxxxx 
9/7 16:44:22 Checking consistency running and runnable jobs 
9/7 16:44:22 Tables are consistent 
9/7 16:44:22 Swap space estimate reached! No more jobs can be run! 
9/7 16:44:22     Solution: get more swap space, or set RESERVED_SWAP = 0 
9/7 16:44:22     0 jobs matched, 1 jobs idle 
9/7 16:44:22 Activity on stashed negotiator socket 
9/7 16:44:22 Negotiating for owner: lwang@xxxxxxxxxxxxxxxxxxx 
9/7 16:44:22 Checking consistency running and runnable jobs 
9/7 16:44:22 Tables are consistent 
9/7 16:44:22 Swap space estimate reached! No more jobs can be run! 
9/7 16:44:22     Solution: get more swap space, or set RESERVED_SWAP = 0 
9/7 16:44:22     0 jobs matched, 1 jobs idle 
 
 
 
the ShadowLog: 
----------------------------- 
9/7 16:25:49 (?.?) (24024):******* Standard Shadow starting up ******* 
9/7 16:25:49 (?.?) (24024):** $CondorVersion: 6.6.10 Jun 13 2005 $ 
9/7 16:25:49 (?.?) (24024):** $CondorPlatform: I386-LINUX_RH9 $ 
9/7 16:25:49 (?.?) (24024):******************************************* 
9/7 16:25:49 (?.?) (24024):uid=501, euid=501, gid=501, egid=501 
9/7 16:25:49 (?.?) (24024):Hostname = "<194.199.22.87:35559>", Job = 12.0 
9/7 16:25:49 (12.0) (24024):Requesting Primary Starter 
9/7 16:25:49 (12.0) (24024):Shadow: Request to run a job was ACCEPTED 
9/7 16:25:49 (12.0) (24024):Shadow: RSC_SOCK connected, fd = 17 
9/7 16:25:49 (12.0) (24024):Shadow: CLIENT_LOG connected, fd = 18 
9/7 16:25:49 (12.0) (24024):My_Filesystem_Domain = "HEAVEN.inrialpes.fr" 
9/7 16:25:49 (12.0) (24024):My_UID_Domain = "HEAVEN.inrialpes.fr" 
9/7 16:25:49 (12.0) (24024):    Entering pseudo_get_file_stream 
9/7 16:25:49 (12.0) (24024):    file = "/home/lwang/condor/install/hosts/HEAVEN/spool/cluster12.ickpt.subproc0"
9/7 16:25:49 (12.0) (24024):     Weird 0xc2c71657 
9/7 16:25:49 (12.0) (24024):     Weird 0xc2c71657 
9/7 16:25:49 (12.0) (24024):Shadow: Job 12.0 exited, termsig = 0, coredump = 128, retcode = 0
9/7 16:25:49 (12.0) (24024):user_time = 1 ticks 
9/7 16:25:49 (12.0) (24024):sys_time = 0 ticks 
9/7 16:25:49 (12.0) (24024):Static Policy: removing job because OnExitRemove has become true
9/7 16:25:49 (12.0) (24024):********** Shadow Exiting(102) **********