
Re: [Condor-users] Jobs don't run



Hi!!! Thanks very much!!! I've just cleared the job queue with the condor_rm command and submitted one job to machine1 again. See:

[aryjr@machine1 log]$ ./bin/condor_q

-- Submitter: machine1 : <192.168.1.182:48988> : machine1
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   1.0   aryjr          10/27 15:56   0+00:00:00 I  0   0.0  simple 4 10

1 job; 1 idle, 0 running, 0 held

Here are the logs from /opt/condor/local.machine1/log:

[aryjr@machine1 log]$ tail -n 20 CollectorLog
10/27 15:48:58 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377103:26, failing.
10/27 15:51:19 (Sent 5 ads in response to query)
10/27 15:51:19 Can't connect to <128.105.143.14:9618>:0, errno = 113
10/27 15:51:19 Will keep trying for 10 seconds...
10/27 15:52:10 Connect failed for 10 seconds; returning FALSE
10/27 15:52:10 ERROR: SECMAN:2003:TCP connection to <128.105.143.14:9618> failed
10/27 15:52:10 Can't send UPDATE_COLLECTOR_AD to collector (condor.cs.wisc.edu): Failed to send UDP update command to collector
10/27 15:52:10 Got QUERY_STARTD_PVT_ADS
10/27 15:52:10 (Sent 2 ads in response to query)
10/27 15:52:57 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377087:24, failing.
10/27 15:53:04 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377092:25, failing.
10/27 15:53:57 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377103:26, failing.
10/27 15:53:58 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377103:26, failing.
10/27 15:56:17 (Sent 5 ads in response to query)
10/27 15:56:17 Got QUERY_STARTD_PVT_ADS
10/27 15:56:17 (Sent 2 ads in response to query)
10/27 15:57:57 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377087:24, failing.
10/27 15:58:03 DC_AUTHENTICATE: attempt to open invalid session machine1:990:1130377092:25, failing.

[aryjr@machine1 log]$ tail -n 20 MasterLog
10/26 23:42:58 The NEGOTIATOR (pid 991) exited with status 0
10/26 23:42:58 The SCHEDD (pid 993) exited with status 0
10/26 23:42:58 All daemons are gone.  Exiting.
10/26 23:42:58 **** condor_master (condor_MASTER) EXITING WITH STATUS 0
10/26 23:44:32 ******************************************************
10/26 23:44:32 ** condor_master (CONDOR_MASTER) STARTING UP
10/26 23:44:32 ** /opt/condor/sbin/condor_master
10/26 23:44:32 ** $CondorVersion: 6.6.10 Jun 13 2005 $
10/26 23:44:32 ** $CondorPlatform: I386-LINUX_RH9 $
10/26 23:44:32 ** PID = 1550
10/26 23:44:32 ******************************************************
10/26 23:44:32 Using config file: /opt/condor/etc/condor_config
10/26 23:44:32 Using local config files: /opt/condor/local.machine1/condor_config.local
10/26 23:44:32 DaemonCore: Command Socket at <192.168.1.182:48986>
10/26 23:44:32 Started DaemonCore process "/opt/condor/sbin/condor_collector", pid and pgroup = 1551
10/26 23:44:32 Started DaemonCore process "/opt/condor/sbin/condor_negotiator", pid and pgroup = 1552
10/26 23:44:32 Started DaemonCore process "/opt/condor/sbin/condor_startd", pid and pgroup = 1553
10/26 23:44:32 Started DaemonCore process "/opt/condor/sbin/condor_schedd", pid and pgroup = 1554
10/27 00:44:32 Preen pid is 1669
10/27 00:44:32 Child 1669 died, but not a daemon -- Ignored

[aryjr@machine1 log]$ tail -n 20 NegotiatorLog
10/27 15:56:17 Public ads include 1 submitter, 2 startd
10/27 15:56:17 Phase 2:  Performing accounting ...
10/27 15:56:17 Phase 3:  Sorting submitter ads by priority ...
10/27 15:56:17 Phase 4.1:  Negotiating with schedds ...
10/27 15:56:17   Negotiating with aryjr@xxxxxxxxxxxxxxxxxxxxxxx at <192.168.1.182:48988>
10/27 15:56:17     Got NO_MORE_JOBS;  done negotiating
10/27 15:56:17 ---------- Finished Negotiation Cycle ----------
10/27 16:01:17 ---------- Started Negotiation Cycle ----------
10/27 16:01:17 Phase 1:  Obtaining ads from collector ...
10/27 16:01:17   Getting all public ads ...
10/27 16:01:17   Sorting 5 ads ...
10/27 16:01:17   Getting startd private ads ...
10/27 16:01:17 Got ads: 5 public and 2 private
10/27 16:01:17 Public ads include 1 submitter, 2 startd
10/27 16:01:17 Phase 2:  Performing accounting ...
10/27 16:01:17 Phase 3:  Sorting submitter ads by priority ...
10/27 16:01:17 Phase 4.1:  Negotiating with schedds ...
10/27 16:01:17   Negotiating with aryjr@xxxxxxxxxxxxxxxxxxxxxxx at <192.168.1.182:48988>
10/27 16:01:17     Got NO_MORE_JOBS;  done negotiating
10/27 16:01:17 ---------- Finished Negotiation Cycle ----------

[aryjr@machine1 log]$ tail -n 20 SchedLog
10/27 15:55:41 Sent ad to central manager for aryjr@xxxxxxxxxxxxxxxxxxxxxxx
10/27 15:56:17 DaemonCore: Command received via UDP from host <192.168.1.182:33156>
10/27 15:56:17 DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
10/27 15:56:17 Sent ad to central manager for aryjr@xxxxxxxxxxxxxxxxxxxxxxx
10/27 15:56:17 Called reschedule_negotiator()
10/27 15:56:17 Activity on stashed negotiator socket
10/27 15:56:17 Negotiating for owner: aryjr@xxxxxxxxxxxxxxxxxxxxxxx
10/27 15:56:17 Checking consistency running and runnable jobs
10/27 15:56:17 Tables are consistent
10/27 15:56:17 Swap space estimate reached! No more jobs can be run!
10/27 15:56:17     Solution: get more swap space, or set RESERVED_SWAP = 0
10/27 15:56:17     0 jobs matched, 2 jobs idle
10/27 16:01:17 Activity on stashed negotiator socket
10/27 16:01:17 Negotiating for owner: aryjr@xxxxxxxxxxxxxxxxxxxxxxx
10/27 16:01:17 Checking consistency running and runnable jobs
10/27 16:01:17 Tables are consistent
10/27 16:01:17 Swap space estimate reached! No more jobs can be run!
10/27 16:01:17     Solution: get more swap space, or set RESERVED_SWAP = 0
10/27 16:01:17     0 jobs matched, 2 jobs idle
10/27 16:01:17 Sent ad to central manager for aryjr@xxxxxxxxxxxxxxxxxxxxxxx
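
The SchedLog lines above name their own remedy ("get more swap space, or set RESERVED_SWAP = 0"). A minimal sketch of the second option follows, assuming the setting goes in the local config file that MasterLog reports and that turning off the swap reservation is acceptable on this machine. These logs do not confirm that this alone will let the job run.

```
# In /opt/condor/local.machine1/condor_config.local
# (path taken from the "Using local config files" line in MasterLog)
# Assumption: it is acceptable to stop reserving swap space on this machine.
RESERVED_SWAP = 0
```

After the edit, running condor_reconfig (or restarting Condor) should make the daemons re-read the configuration.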

[aryjr@machine1 log]$ tail -n 20 StartLog
10/27 15:35:24 Failed to obtain keyboard or mouse idle information.
10/27 15:35:24 Assuming the keyboard and mouse to be infinitely idle.
10/27 15:40:24 Failed to obtain keyboard or mouse idle information.
10/27 15:40:24 Assuming the keyboard and mouse to be infinitely idle.
10/27 15:45:24 Failed to obtain keyboard or mouse idle information.
10/27 15:45:24 Assuming the keyboard and mouse to be infinitely idle.
10/27 15:45:24 State change: RunBenchmarks is TRUE
10/27 15:45:24 vm1: Changing activity: Idle -> Benchmarking
10/27 15:45:29 State change: benchmarks completed
10/27 15:45:29 vm1: Changing activity: Benchmarking -> Idle
10/27 15:45:29 State change: RunBenchmarks is TRUE
10/27 15:45:29 vm2: Changing activity: Idle -> Benchmarking
10/27 15:45:33 State change: benchmarks completed
10/27 15:45:33 vm2: Changing activity: Benchmarking -> Idle
10/27 15:50:33 Failed to obtain keyboard or mouse idle information.
10/27 15:50:33 Assuming the keyboard and mouse to be infinitely idle.
10/27 15:55:33 Failed to obtain keyboard or mouse idle information.
10/27 15:55:33 Assuming the keyboard and mouse to be infinitely idle.
10/27 16:00:33 Failed to obtain keyboard or mouse idle information.
10/27 16:00:33 Assuming the keyboard and mouse to be infinitely idle.

Thanks again!!!

On 10/27/05, ulgandhi@xxxxxxxxxxxxxx <ulgandhi@xxxxxxxxxxxxxx > wrote:
Hi Ary,
please paste the log files and the submit file here as well

thanks

> sent to machine1? What's wrong with my configuration?
>
> Thanks very much
>
> Ary Junior
>


regards,
upendra

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users
