Re: [Condor-users] idle jobs
Thanks,
Yes, I have set the NIS domain.
However, it does not work.
When I set RESERVED_SWAP = 0, as the SchedLog hints,
that works; however, the job is killed every time.
Regards,
Lizhe
SchedLog:
9/7 16:44:22 DaemonCore: Command received via UDP from host <194.199.22.87:33110>
9/7 16:44:22 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
9/7 16:44:22 Sent ad to central manager for lwang@xxxxxxxxxxxxxxxxxxx
9/7 16:44:22 Called reschedule_negotiator()
9/7 16:44:22 DaemonCore: Command received via TCP from host <194.199.22.87:36201>
9/7 16:44:22 DaemonCore: received command 416 (NEGOTIATE), calling handler (negotiate)
9/7 16:44:22 Negotiating for owner: lwang@xxxxxxxxxxxxxxxxxxx
9/7 16:44:22 Checking consistency running and runnable jobs
9/7 16:44:22 Tables are consistent
9/7 16:44:22 Swap space estimate reached! No more jobs can be run!
9/7 16:44:22 Solution: get more swap space, or set RESERVED_SWAP = 0
9/7 16:44:22 0 jobs matched, 1 jobs idle
9/7 16:44:22 Activity on stashed negotiator socket
9/7 16:44:22 Negotiating for owner: lwang@xxxxxxxxxxxxxxxxxxx
9/7 16:44:22 Checking consistency running and runnable jobs
9/7 16:44:22 Tables are consistent
9/7 16:44:22 Swap space estimate reached! No more jobs can be run!
9/7 16:44:22 Solution: get more swap space, or set RESERVED_SWAP = 0
9/7 16:44:22 0 jobs matched, 1 jobs idle
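
As a rough sketch (not part of Condor; the function name and the sample lines are illustrative and taken from the excerpt above), the swap warning and the per-cycle match summary could be pulled out of a SchedLog like this, to check whether the swap estimate is what keeps the job idle:

```python
import re

# Per-cycle summary line, e.g. "0 jobs matched, 1 jobs idle".
SUMMARY_RE = re.compile(r"(\d+) jobs matched, (\d+) jobs idle")
# Text of the warning as it appears in the excerpt above.
SWAP_MSG = "Swap space estimate reached"

def summarize_schedlog(lines):
    """Count swap warnings and collect (matched, idle) pairs from log lines."""
    swap_warnings = 0
    cycles = []
    for line in lines:
        if SWAP_MSG in line:
            swap_warnings += 1
        m = SUMMARY_RE.search(line)
        if m:
            cycles.append((int(m.group(1)), int(m.group(2))))
    return swap_warnings, cycles

# Sample lines copied from the SchedLog excerpt (one negotiation cycle).
sample = [
    "9/7 16:44:22 Tables are consistent",
    "9/7 16:44:22 Swap space estimate reached! No more jobs can be run!",
    "9/7 16:44:22 Solution: get more swap space, or set RESERVED_SWAP = 0",
    "9/7 16:44:22 0 jobs matched, 1 jobs idle",
]
print(summarize_schedlog(sample))
```

To run it on a real log, pass the open file instead of the sample list, for example `summarize_schedlog(open(path))`, where `path` is wherever your SchedLog lives.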
ShadowLog:
-----------------------------
9/7 16:25:49 (?.?) (24024):******* Standard Shadow starting up *******
9/7 16:25:49 (?.?) (24024):** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 16:25:49 (?.?) (24024):** $CondorPlatform: I386-LINUX_RH9 $
9/7 16:25:49 (?.?) (24024):*******************************************
9/7 16:25:49 (?.?) (24024):uid=501, euid=501, gid=501, egid=501
9/7 16:25:49 (?.?) (24024):Hostname = "<194.199.22.87:35559>", Job = 12.0
9/7 16:25:49 (12.0) (24024):Requesting Primary Starter
9/7 16:25:49 (12.0) (24024):Shadow: Request to run a job was ACCEPTED
9/7 16:25:49 (12.0) (24024):Shadow: RSC_SOCK connected, fd = 17
9/7 16:25:49 (12.0) (24024):Shadow: CLIENT_LOG connected, fd = 18
9/7 16:25:49 (12.0) (24024):My_Filesystem_Domain = "HEAVEN.inrialpes.fr"
9/7 16:25:49 (12.0) (24024):My_UID_Domain = "HEAVEN.inrialpes.fr"
9/7 16:25:49 (12.0) (24024): Entering pseudo_get_file_stream
9/7 16:25:49 (12.0) (24024): file = "/home/lwang/condor/install/hosts/HEAVEN/spool/cluster12.ickpt.subproc0"
9/7 16:25:49 (12.0) (24024): Weird 0xc2c71657
9/7 16:25:49 (12.0) (24024): Weird 0xc2c71657
9/7 16:25:49 (12.0) (24024):Shadow: Job 12.0 exited, termsig = 0, coredump = 128, retcode = 0
9/7 16:25:49 (12.0) (24024):user_time = 1 ticks
9/7 16:25:49 (12.0) (24024):sys_time = 0 ticks
9/7 16:25:49 (12.0) (24024):Static Policy: removing job because OnExitRemove has become true
9/7 16:25:49 (12.0) (24024):********** Shadow Exiting(102) **********