RE: [Condor-users] idle jobs
Thanks for your kind reply.
I have set START=TRUE in the local configuration file
and run the condor_reschedule command.
The problem still exists.
To provide more information,
I include the log file contents below.
Thanks,
Lizhe
StartLog:
-----------------
9/7 13:56:51 Connect failed for 10 seconds; returning FALSE
9/7 13:56:51 ERROR:
SECMAN:2004:Failed to start a session with TCP
SECMAN:2003:TCP connection to <194.199.22.87:34884> failed
9/7 13:56:51 condor_write(): Socket closed when trying to write buffer
9/7 13:56:51 Buf::write(): condor_write() failed
9/7 13:56:51 SECMAN: Error sending response classad!
9/7 13:56:51 Our parent process (pid 17196) went away; shutting down
9/7 13:56:51 Can't connect to <194.199.22.87:9618>:0, errno = 111
9/7 13:56:51 Will keep trying for 10 seconds...
9/7 13:57:01 Connect failed for 10 seconds; returning FALSE
9/7 13:57:01 ERROR:
SECMAN:2003:TCP connection to <194.199.22.87:9618> failed
9/7 13:57:01 Error sending update to the collector HEAVEN.inrialpes.fr <194.199.22.87:9618>: Failed to send UDP update command to collector
9/7 13:57:01 Error sending update to collector(s)
9/7 13:57:01 Got SIGTERM. Performing graceful shutdown.
9/7 13:57:01 shutdown graceful
9/7 13:57:01 Deleting Cronmgr
9/7 13:57:01 Can't connect to <194.199.22.87:9618>:0, errno = 111
9/7 13:57:01 Will keep trying for 10 seconds...
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_startd (CONDOR_STARTD) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_startd
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17728
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:35036>
9/7 13:57:20 New machine resource allocated
9/7 13:57:20 About to run initial benchmarks.
9/7 13:57:26 Completed initial benchmarks.
9/7 13:57:26 State change: IS_OWNER is false
9/7 13:57:26 Changing state: Owner -> Unclaimed
---------------------------------------------------------
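Since the StartLog excerpt above mixes shutdown noise with startup messages, a short filter can help pull out only the lines that look like failures. The following is a minimal sketch, assuming the "M/D HH:MM:SS message" layout seen in the excerpt; the function name and the list of failure markers are illustrative assumptions, not part of Condor.

```python
import re

# Timestamp prefix as it appears in the logs above, e.g. "9/7 13:56:51 ".
TIMESTAMP = re.compile(r"^(\d{1,2}/\d{1,2}) (\d{2}:\d{2}:\d{2}) (.*)$")

# Substrings treated as failure indicators. This list is an assumption
# drawn from the excerpt above, not an official Condor error vocabulary.
FAILURE_MARKERS = ("ERROR", "failed", "Can't connect", "errno")


def failure_lines(log_text):
    """Return (date, time, message) for timestamped lines that look like failures."""
    hits = []
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line.strip())
        if match is None:
            # Continuation lines such as "SECMAN:2003:..." carry no timestamp.
            continue
        date, time, message = match.groups()
        if any(marker in message for marker in FAILURE_MARKERS):
            hits.append((date, time, message))
    return hits


sample = """\
9/7 13:56:51 Can't connect to <194.199.22.87:9618>:0, errno = 111
9/7 13:56:51 Will keep trying for 10 seconds...
9/7 13:57:01 Connect failed for 10 seconds; returning FALSE
9/7 13:57:26 Changing state: Owner -> Unclaimed
"""

for date, time, message in failure_lines(sample):
    print(date, time, message)
```

On Linux, errno 111 is ECONNREFUSED (connection refused). In the excerpt above, those connection errors are all stamped before the 13:57:14 restart, so they may simply belong to the previous shutdown. Note that the simple "failed" marker would also catch the getpwnam("condor") lines.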
SchedLog:
-------------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_schedd (CONDOR_SCHEDD) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_schedd
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17729
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:35037>
-------------------------
NegotiatorLog:
----------------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_negotiator (CONDOR_NEGOTIATOR) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_negotiator
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17727
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:9614>
9/7 13:57:14 ACCOUNTANT_HOST = None (local)
9/7 13:57:14 NEGOTIATOR_INTERVAL = 300 sec
9/7 13:57:14 NEGOTIATOR_TIMEOUT = 30 sec
9/7 13:57:14 PREEMPTION_REQUIREMENTS = (CurrentTime - EnteredCurrentState) > (1 * (60 * 60)) && RemoteUserPrio > SubmittorPrio * 1.2
9/7 13:57:14 PREEMPTION_RANK = (RemoteUserPrio * 1000000) - TARGET.ImageSize
9/7 13:57:14 ---------- Started Negotiation Cycle ----------
9/7 13:57:14 Phase 1: Obtaining ads from collector ...
9/7 13:57:14 Getting all public ads ...
9/7 13:57:14 Sorting 0 ads ...
9/7 13:57:14 Getting startd private ads ...
9/7 13:57:14 Got ads: 0 public and 0 private
9/7 13:57:14 Public ads include 0 submitter, 0 startd
9/7 13:57:14 Phase 2: Performing accounting ...
9/7 13:57:14 Phase 3: Sorting submitter ads by priority ...
9/7 13:57:14 Phase 4.1: Negotiating with schedds ...
9/7 13:57:14 ---------- Finished Negotiation Cycle ----------
9/7 14:02:14 ---------- Started Negotiation Cycle ----------
9/7 14:02:14 Phase 1: Obtaining ads from collector ...
9/7 14:02:14 Getting all public ads ...
9/7 14:02:14 Sorting 3 ads ...
9/7 14:02:14 Getting startd private ads ...
9/7 14:02:14 Got ads: 3 public and 1 private
9/7 14:02:14 Public ads include 0 submitter, 1 startd
9/7 14:02:14 Phase 2: Performing accounting ...
9/7 14:02:14 Phase 3: Sorting submitter ads by priority ...
9/7 14:02:14 Phase 4.1: Negotiating with schedds ...
9/7 14:02:14 ---------- Finished Negotiation Cycle ----------
------------------------------------
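The per-cycle ad counts in the NegotiatorLog above are easy to miss among the phase messages. A minimal sketch for extracting them, assuming the "Public ads include N submitter, M startd" wording shown in the excerpt (the function name is illustrative, not a Condor tool):

```python
import re

# Matches the summary line the negotiator writes in Phase 1, as seen above.
ADS_SUMMARY = re.compile(
    r"^(\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"Public ads include (\d+) submitter, (\d+) startd"
)


def cycle_summaries(log_text):
    """Return a list of (timestamp, submitter_count, startd_count) tuples."""
    summaries = []
    for line in log_text.splitlines():
        match = ADS_SUMMARY.match(line.strip())
        if match:
            stamp, submitters, startds = match.groups()
            summaries.append((stamp, int(submitters), int(startds)))
    return summaries


sample = """\
9/7 13:57:14 Public ads include 0 submitter, 0 startd
9/7 14:02:14 Public ads include 0 submitter, 1 startd
"""

for stamp, submitters, startds in cycle_summaries(sample):
    print(f"{stamp}: {submitters} submitter ad(s), {startds} startd ad(s)")
```

In both cycles shown, the submitter count is 0 even though one startd ad is present by 14:02:14. That may mean the schedd was not advertising any idle jobs to the collector at that point, which could be worth checking.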
MasterLog:
-------------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_master (CONDOR_MASTER) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_master
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17725
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:35035>
9/7 13:57:14 Started DaemonCore process "/home/lwang/condor/install/sbin/condor_collector", pid and pgroup = 17726
9/7 13:57:14 Started DaemonCore process "/home/lwang/condor/install/sbin/condor_negotiator", pid and pgroup = 17727
9/7 13:57:14 Started DaemonCore process "/home/lwang/condor/install/sbin/condor_startd", pid and pgroup = 17728
9/7 13:57:14 Started DaemonCore process "/home/lwang/condor/install/sbin/condor_schedd", pid and pgroup = 17729
-------------------------------------------------------
CollectorLog:
------------
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 passwd_cache::cache_uid(): getpwnam("condor") failed: Success
9/7 13:57:14 ******************************************************
9/7 13:57:14 ** condor_collector (CONDOR_COLLECTOR) STARTING UP
9/7 13:57:14 ** /home/lwang/condor/install/sbin/condor_collector
9/7 13:57:14 ** $CondorVersion: 6.6.10 Jun 13 2005 $
9/7 13:57:14 ** $CondorPlatform: I386-LINUX_RH9 $
9/7 13:57:14 ** PID = 17726
9/7 13:57:14 ******************************************************
9/7 13:57:14 Using config file: /home/lwang/condor/install/etc/condor_config
9/7 13:57:14 Using local config files: /home/lwang/condor/install/hosts/HEAVEN/condor_config.local
9/7 13:57:14 DaemonCore: Command Socket at <194.199.22.87:9618>
9/7 13:57:14 In ViewServer::Init()
9/7 13:57:14 In CollectorDaemon::Init()
9/7 13:57:14 In ViewServer::Config()
9/7 13:57:14 In CollectorDaemon::Config()
9/7 13:57:14 enable: Creating stats hash table
9/7 13:57:14 (Sent 0 ads in response to query)
9/7 13:57:14 Got QUERY_STARTD_PVT_ADS
9/7 13:57:14 (Sent 0 ads in response to query)
9/7 13:57:14 WARNING: No master ad for < HEAVEN.inrialpes.fr >
9/7 13:57:14 ScheddAd : Inserting ** "< HEAVEN.inrialpes.fr , 194.199.22.87 >"
9/7 13:57:14 stats: Inserting new hashent for 'Schedd':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:57:19 ** Master < HEAVEN.inrialpes.fr > rejuvenated from recently down
9/7 13:57:19 stats: Inserting new hashent for 'Master':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:57:30 StartdAd : Inserting ** "< HEAVEN.inrialpes.fr , 194.199.22.87 >"
9/7 13:57:30 stats: Inserting new hashent for 'Start':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:57:30 StartdPvtAd : Inserting ** "< HEAVEN.inrialpes.fr , 194.199.22.87 >"
9/7 13:57:30 stats: Inserting new hashent for 'StartdPvt':'HEAVEN.inrialpes.fr':'194.199.22.87'
9/7 13:58:59 Got QUERY_STARTD_ADS
9/7 13:58:59 (Sent 1 ads in response to query)
9/7 14:02:14 (Sent 3 ads in response to query)
9/7 14:02:14 Got QUERY_STARTD_PVT_ADS
9/7 14:02:14 (Sent 1 ads in response to query)
9/7 14:07:14 (Sent 3 ads in response to query)
9/7 14:07:14 Got QUERY_STARTD_PVT_ADS
9/7 14:07:14 (Sent 1 ads in response to query)
------------------------------------
Quoting Prashant Lal <lalp@xxxxxxxxxxx>:
> do condor_reschedule on that machine and see
>
>
> LAL
>
>
> -----Original Message-----
> From: condor-users-bounces@xxxxxxxxxxx on behalf of abhi
> Sent: Wed 9/7/2005 4:19 PM
> To: Condor-Users Mail List
> Subject: RE: [Condor-users] idle jobs
>
> execute the following command
>
> echo "START=TRUE" >> /path of machine's local file/condor_config.local
> condor_reconfig
>
>
> > -----Original Message-----
> > From: lizhe.wang@xxxxxxxxxxxx
> > Sent: Wed, 7 Sep 2005 11:30:39 +0200
> > To: condor-users@xxxxxxxxxxx
> > Subject: [Condor-users] idle jobs
> >
> >
> >
> > Dear all:
> > I'm new to Condor and am testing it on a single machine.
> > This machine takes the roles of submit, execute, and manager.
> >
> >
> > I installed Condor and submitted an example submit file like:
> >
> > Executable = /bin/date
> > Log = /tmp/logr
> > Output = /tmp/logr.out
> > Queue
> >
> > when I run condor_q -analyze
> > the output is :
> > 0 are rejected by your job's requirements
> > 0 reject your job because of their own requirements
> > 0 match, but are serving users with a better priority in the pool
> > 1 match, but reject the job for unknown reasons
> > 0 match, but will not currently preempt their existing job
> > 0 are available to run your job
> >
> >
> > I want any job to be able to run on this machine regardless of the
> > machine's status.
> > I have set START = True in the configuration file, but it does not seem
> > to work.
> >
> > How should I configure it?
> > Any hints?
> >
> > thanks,
> > Lizhe
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Condor-users mailing list
> > Condor-users@xxxxxxxxxxx
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users