
Re: [Condor-users] condor_rooster



Hi Dan,
Thanks for the prompt reply. I have configured everything according to the manual, but condor_power still fails to send the wake packet. Here are the condor_power output, my DAEMON_LIST, the rooster log, and the machine ClassAd:
c:\>condor\bin\condor_power
condor_power: failed to sent wake packet
 
DAEMON_LIST = MASTER  SCHEDD STARTD KBDD ROOSTER
 
Rooster_log
03/02 12:02:02 Using config source: C:\condor\condor_config
03/02 12:02:02 Using local config sources:
03/02 12:02:02    C:\condor/condor_config.local
03/02 12:02:02 DaemonCore: command socket at <xxxxx:1133>
03/02 12:02:02 Will perform unhibernate checks every ROOSTER_INTERVAL=300 seconds.
03/02 12:16:47 Got SIGHUP.  Re-reading config files.
03/02 12:16:47 Locale: English_United States.1252
03/02 12:17:25 Got SIGQUIT.  Performing fast shutdown.
03/02 12:17:25 **** condor_rooster.exe (condor_ROOSTER) pid 3392 EXITING WITH STATUS 0
03/02 12:21:33 Locale: English_United States.1252
03/02 12:21:33 ******************************************************
03/02 12:21:33 ** condor_rooster.exe (CONDOR_ROOSTER) STARTING UP
03/02 12:21:33 ** C:\condor\bin\condor_rooster.exe
03/02 12:21:34 ** SubsystemInfo: name=ROOSTER type=DAEMON(12) class=DAEMON(1)
03/02 12:21:34 ** Configuration: subsystem:ROOSTER local:<NONE> class:DAEMON
03/02 12:21:34 ** $CondorVersion: 7.5.0 Dec 21 2009 BuildID: 205324 $
03/02 12:21:34 ** $CondorPlatform: INTEL-WINNT50 $
03/02 12:21:34 ** PID = 2428
03/02 12:21:34 ** Log last touched 3/2 12:17:25
03/02 12:21:34 ******************************************************
03/02 12:21:34 Using config source: C:\condor\condor_config
03/02 12:21:34 Using local config sources:
03/02 12:21:34    C:\condor/condor_config.local
03/02 12:21:34 DaemonCore: command socket at <xxxxx:1117>
03/02 12:21:34 Will perform unhibernate checks every ROOSTER_INTERVAL=300 seconds.

Classad
Offline=true
MyType = "Machine"
TargetType = "Job"
Name = "slot1@xxxxxxxxxxx"
Rank = 0.000000
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
SlotWeight = Cpus
Unhibernate = MY.MachineLastMatchTime =!= UNDEFINED
MyCurrentTime = 1267196179
Machine = "xxxxx.xxxxx"
MyAddress = "<xxxxx:1027>"
COLLECTOR_HOST_STRING = "window-pc.xxxxx"
CondorVersion = "$CondorVersion: 7.5.0 Dec 21 2009 BuildID: 205324 $"
CondorPlatform = "$CondorPlatform: INTEL-WINNT50 $"
SlotID = 1
VirtualMachineID = 1
VirtualMemory = 1240084
TotalDisk = 53040124
Disk = 26520062
CondorLoadAvg = 0.000000
LoadAvg = 0.200000
KeyboardIdle = 159048
ConsoleIdle = 159048
Memory = 1023
Cpus = 1
StartdIpAddr = "<xxxxx:1027>"
Arch = "INTEL"
OpSys = "WINNT51"
UidDomain = "xxxxx"
FileSystemDomain = "xxxxx.xxxxx"
HasIOProxy = TRUE
CheckpointPlatform = "WINNT51 INTEL Unknown normal"
WindowsMajorVersion = 5
WindowsMinorVersion = 1
WindowsBuildNumber = 2600
WindowsServicePackMajorVersion = 3
WindowsServicePackMinorVersion = 0
WindowsProductType = 1
TotalVirtualMemory = 2480168
TotalCpus = 2
TotalMemory = 2046
KFlops = 605703
Mips = 3143
LastBenchmark = 1267195875
TotalLoadAvg = 0.200000
TotalCondorLoadAvg = 0.000000
ClockMin = 596
ClockDay = 5
TotalSlots = 2
TotalVirtualMachines = 2
HasFileTransfer = TRUE
HasPerFileEncryption = TRUE
HasReconnect = TRUE
HasMPI = TRUE
HasTDP = TRUE
HasJobDeferral = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
JavaVendor = "Sun Microsystems Inc."
JavaVersion = "1.6.0_05"
JavaSpecificationVersion = "1.6"
JavaMFlops = 303.123383
HasJava = TRUE
HasWindowsRunAsOwner = TRUE
StarterAbilityList = "HasFileTransfer,HasPerFileEncryption,HasReconnect,HasMPI,HasTDP,HasJobDeferral,HasJICLocalConfig,HasJICLocalStdin,HasJava,HasVM,HasWindowsRunAsOwner"
HasVM = FALSE
HibernationLevel = 0
HibernationState = "NONE"
HibernationSupportedStates = ""
CanHibernate = True
HardwareAddress = "00:14:22:51:BF:06"
SubnetMask = "255.255.255.0"
IsWakeOnLanSupported = TRUE
IsWakeOnLanEnabled = TRUE
IsWakeAble = TRUE
WakeOnLanSupportedFlags = "Magic Packet"
WakeOnLanEnabledFlags = "Magic Packet"
CpuBusyTime = 0
CpuIsBusy = FALSE
TimeToLive = 2147483647
State = "Unclaimed"
EnteredCurrentState = 1267037127
Activity = "Idle"
EnteredCurrentActivity = 1267037127
TotalTimeOwnerIdle = 10
TotalTimeUnclaimedIdle = 159052
Start = TRUE
Requirements = (START) && (IsValidCheckpointPlatform)
IsValidCheckpointPlatform = (((TARGET.JobUniverse == 1) == FALSE) || ((MY.CheckpointPlatform =!= UNDEFINED) && ((TARGET.LastCheckpointPlatform =?= MY.CheckpointPlatform) || (TARGET.NumCkpts == 0))))
MaxJobRetirementTime = 0
LastFetchWorkSpawned = 0
LastFetchWorkCompleted = 0
NextFetchWorkDelay = -1
CurrentRank = 0.000000
MonitorSelfTime = 1267196028
MonitorSelfCPUUsage = 2.159518
MonitorSelfImageSize = 46500.000000
MonitorSelfResidentSetSize = 11176
MonitorSelfAge = 158918
MonitorSelfRegisteredSocketCount = 1
DaemonStartTime = 1267107333
UpdateSequenceNumber = 296
LastHeardFrom = 1267196179
UpdatesTotal = 261
UpdatesSequenced = 260
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
 
Thank you
Tom
 
 


From: Dan Bradley <dan@xxxxxxxxxxxx>
To: Condor-Users Mail List <condor-users@xxxxxxxxxxx>
Sent: Tue, March 2, 2010 11:33:11 AM
Subject: Re: [Condor-users] condor_rooster

Tom,

General advice for getting condor_rooster to work:

1. The execute nodes and central manager must be running Condor 7.3.2 or later.

2. OFFLINE_LOG must be configured so that the ClassAds of offline machines are saved.  Verify that when a machine enters the hibernating state, you can still see a ClassAd for that machine.  The following command should list all hibernating machines:

condor_status -constraint 'Offline'

3. You must add ROOSTER to DAEMON_LIST on a machine from which condor_power can successfully wake up hibernating machines.  (Changes to DAEMON_LIST require a restart of Condor.)  I recommend testing with condor_power by hand to make sure it can wake up hibernating machines.
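
If condor_power itself fails, it may help to check separately whether a magic packet sent from the same machine can wake the target at all. The following is only a rough sketch of that check, not how condor_power is implemented. The function names are made up for illustration, the MAC address would come from the HardwareAddress attribute of the offline ClassAd, and the broadcast address and UDP port 9 are assumptions you would need to adapt to your subnet.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC like "00:14:22:51:BF:06".

    The payload is 6 bytes of 0xFF followed by the 6-byte MAC address
    repeated 16 times (102 bytes in total).
    """
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str, port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast.

    `broadcast` is the subnet broadcast address of the sleeping machine;
    it is an assumption of this sketch and is not taken from the ClassAd.
    """
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

To try it, call send_magic_packet() with the HardwareAddress value from the ad and the broadcast address of that machine's subnet. If the machine wakes up this way but not through condor_power, the problem is more likely in how condor_power is being invoked than in the network or BIOS settings.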

--Dan

Tom T wrote:
> All,
> Has anyone spent time getting a Condor pool to work with
> power-saving Windows hosts? I am very interested in
> using condor_rooster to wake up machines automatically according to
> demand, but I cannot for the life of me understand how it works.
> Is there any more documentation available on this? The version 7.5
> manual is a bit short on info.
> I am curious to see how other people are handling this.
> Thanks in advance,
> Tom

> ------------------------------------------------------------------------
>
> _______________________________________________
> Condor-users mailing list
> To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/condor-users/
