
[Condor-users] Condor & (parallel || MPI universe ) problem



Hi,
I have a problem running MPI jobs over a Condor pool.

I installed Condor 6.6.10 on two machines and checked that I could run jobs on both of them; plain Condor works without problems. Because I am running Fedora Core 4, I had to specify the memory manually in condor_config.local (this appears to be a known bug). I could run the C and Fortran examples. I also had to specify the ports Condor uses so that Condor and my firewall can work together.

I also installed MPI. I did not test it without Condor, since I don't want any application other than Condor to use it (so no ssh or rsh to carry the MPI communication). However, I read in the Condor documentation that the P4_RSHCOMMAND environment variable should point to Condor's own rsh, so I set it accordingly.

Now I am trying to check that Condor can work with MPI (or any other parallel job). I added some parameters for the dedicated scheduler to my condor_config.local. Then I tried to run some MPI jobs. For some reason, although Condor found one unclaimed resource whose requirements matched, it did not manage to select it (I turned on full debugging to check this).

So I tried several things to figure out what the problem is. First, I completely disabled the firewall: no change. Then I installed Condor 6.7.14 (it did not have the memory problem described above, so that may be a bug that 6.7.14 fixed); that did not change anything either. Then I thought that maybe I had not installed MPI properly, or that some variables were improperly defined, so I tried to run a simple job in the parallel universe. Same problem. So the current problem is not an MPI issue.

I am at a bit of a loss as to what I could try next to figure out what the problem is.

Here is condor_config.local:

DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxx"
START     = True
SUSPEND   = False
CONTINUE  = True
PREEMPT   = False
KILL      = False
WANT_SUSPEND   = False
WANT_VACATE    = False
RANK      = Scheduler =?= $(DedicatedScheduler)
MPI_CONDOR_RSH_PATH = $(LIBEXEC)
CONDOR_SSHD = /usr/sbin/sshd
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
ALL_DEBUG               = D_FULLDEBUG

Here is the parallel submit description file I ran:

######################################
## Parallel example submit description file
## without using a shared file system
######################################
universe = parallel
executable = /bin/cat
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 1
should_transfer_files = yes
when_to_transfer_output = on_exit
queue

Here are the log files produced (I truncated them so that only the entries written after the job was submitted appear).




Here is SchedLog







1/26 16:30:36 (pid:10568) DaemonCore: Command received via TCP from host <129.215.181.34:32970>
1/26 16:30:36 (pid:10568) DaemonCore: received command 1111 (QMGMT_CMD), calling handler (handle_q)
1/26 16:30:36 (pid:10568) OwnerCheck retval 1 (success),no ad
1/26 16:30:36 (pid:10568) OwnerCheck retval 1 (success),no ad
1/26 16:30:36 (pid:10568) get_file(): going to write to filename /home/condor/spool/cluster19.ickpt.subproc0
1/26 16:30:36 (pid:10568) get_file: Receiving 21104 bytes
1/26 16:30:36 (pid:10568) get_file: wrote 21104 bytes to file
1/26 16:30:36 (pid:10568) done with transfer, errno = 0
1/26 16:30:36 (pid:10568) condor_read(): Socket closed when trying to read buffer
1/26 16:30:36 (pid:10568) IO: Failed to read packet header
1/26 16:30:36 (pid:10568) QMGR Connection closed
1/26 16:30:36 (pid:10568) DaemonCore: Command received via TCP from host <129.215.181.34:32971>
1/26 16:30:36 (pid:10568) DaemonCore: received command 464 (ATTEMPT_ACCESS), calling handler (attempt_access_handler)
1/26 16:30:36 (pid:10568) ATTEMPT_ACCESS: Switching to user uid: 500 gid: 500.
1/26 16:30:36 (pid:10568) Checking file /home/jgrunche/condex/MPI/infile.0 for read permission.
1/26 16:30:36 (pid:10568) Switching back to old priv state.
1/26 16:30:36 (pid:10568) DaemonCore: Command received via TCP from host <129.215.181.34:32972>
1/26 16:30:36 (pid:10568) DaemonCore: received command 464 (ATTEMPT_ACCESS), calling handler (attempt_access_handler)
1/26 16:30:36 (pid:10568) ATTEMPT_ACCESS: Switching to user uid: 500 gid: 500.
1/26 16:30:36 (pid:10568) Checking file /home/jgrunche/condex/MPI/outfile.0 for write permission.
1/26 16:30:36 (pid:10568) Switching back to old priv state.
1/26 16:30:36 (pid:10568) DaemonCore: Command received via TCP from host <129.215.181.34:32973>
1/26 16:30:36 (pid:10568) DaemonCore: received command 464 (ATTEMPT_ACCESS), calling handler (attempt_access_handler)
1/26 16:30:36 (pid:10568) ATTEMPT_ACCESS: Switching to user uid: 500 gid: 500.
1/26 16:30:36 (pid:10568) Checking file /home/jgrunche/condex/MPI/errfile.0 for write permission.
1/26 16:30:36 (pid:10568) Switching back to old priv state.
1/26 16:30:36 (pid:10568) DaemonCore: Command received via UDP from host <129.215.181.34:33145>
1/26 16:30:36 (pid:10568) DaemonCore: received command 421 (RESCHEDULE), calling handler (reschedule_negotiator)
1/26 16:30:36 (pid:10568) Found idle MPI cluster 19
1/26 16:30:36 (pid:10568) Started timer (12) to call handleDedicatedJobs() in 2 secs
1/26 16:30:36 (pid:10568) JobsRunning = 0
1/26 16:30:36 (pid:10568) JobsIdle = 0
1/26 16:30:36 (pid:10568) JobsHeld = 0
1/26 16:30:36 (pid:10568) JobsRemoved = 0
1/26 16:30:36 (pid:10568) LocalUniverseJobsRunning = 0
1/26 16:30:36 (pid:10568) LocalUniverseJobsIdle = 0
1/26 16:30:36 (pid:10568) SchedUniverseJobsRunning = 0
1/26 16:30:36 (pid:10568) SchedUniverseJobsIdle = 0
1/26 16:30:36 (pid:10568) N_Owners = 1
1/26 16:30:36 (pid:10568) MaxJobsRunning = 200
1/26 16:30:36 (pid:10568) ENABLE_SOAP is undefined, using default value of False
1/26 16:30:36 (pid:10568) Trying to update collector <129.215.181.34:9618>
1/26 16:30:36 (pid:10568) Attempting to send update via UDP to collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:30:36 (pid:10568) SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:30:36 (pid:10568) Sent HEART BEAT ad to 1 collectors. Number of submittors=1
1/26 16:30:36 (pid:10568) Changed attribute: RunningJobs = 0
1/26 16:30:36 (pid:10568) Changed attribute: IdleJobs = 0
1/26 16:30:36 (pid:10568) Changed attribute: HeldJobs = 0
1/26 16:30:36 (pid:10568) Changed attribute: FlockedJobs = 0
1/26 16:30:36 (pid:10568) Changed attribute: Name = "jgrunche@xxxxxxxxxxxxxxx"
1/26 16:30:36 (pid:10568) Sent ad to central manager for jgrunche@xxxxxxxxxxxxxxx
1/26 16:30:36 (pid:10568) Trying to update collector <129.215.181.34:9618>
1/26 16:30:36 (pid:10568) Attempting to send update via UDP to collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:30:36 (pid:10568) SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:30:36 (pid:10568) Sent ad to 1 collectors for jgrunche@xxxxxxxxxxxxxxx
1/26 16:30:36 (pid:10568) ============ Begin clean_shadow_recs =============
1/26 16:30:36 (pid:10568) ============ End clean_shadow_recs =============
1/26 16:30:36 (pid:10568) SCHEDD_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:36 (pid:10568) Called reschedule_negotiator()
1/26 16:30:36 (pid:10568) Sending RESCHEDULE command to negotiator(s)
1/26 16:30:36 (pid:10568) SCHEDD_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:36 (pid:10568) SCHEDD_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:36 (pid:10568) Will use UDP to update collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:30:36 (pid:10568) Trying to query collector <129.215.181.34:9618>
1/26 16:30:36 (pid:10568) SCHEDD_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:36 (pid:10568) SEC_TCP_SESSION_TIMEOUT is undefined, using default value of 20
1/26 16:30:38 (pid:10568) Starting DedicatedScheduler::handleDedicatedJobs
1/26 16:30:38 (pid:10568) Found 1 idle dedicated job(s)
1/26 16:30:38 (pid:10568) DedicatedScheduler: Listing all dedicated jobs -
1/26 16:30:38 (pid:10568) Dedicated job: 19.0 jgrunche
1/26 16:30:38 (pid:10568) SCHEDD_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:38 (pid:10568) Will use UDP to update collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:30:38 (pid:10568) Trying to query collector <129.215.181.34:9618>
1/26 16:30:38 (pid:10568) SCHEDD_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:38 (pid:10568) SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:30:38 (pid:10568) Found 1 potential dedicated resources
1/26 16:30:38 (pid:10568) idle resource list
1/26 16:30:38 (pid:10568)  ************ empty ************
1/26 16:30:38 (pid:10568) limbo resource list
1/26 16:30:38 (pid:10568)  ************ empty ************
1/26 16:30:38 (pid:10568) unclaimed resource list
1/26 16:30:38 (pid:10568)    LINUX      INTEL   ys.cap.ed.ac.uk
1/26 16:30:38 (pid:10568) busy resource list
1/26 16:30:38 (pid:10568)  ************ empty ************
1/26 16:30:38 (pid:10568) Trying to find 1 resource(s) for dedicated job 19.0
1/26 16:30:38 (pid:10568) Satisfied job 19 with 1 unclaimed resources
1/26 16:30:38 (pid:10568) Generating 1 resource requests for job 19
1/26 16:30:38 (pid:10568) Waiting to negotiate for 1 dedicated resource request(s)
1/26 16:30:38 (pid:10568) In DedicatedScheduler::publishRequestAd()
1/26 16:30:38 (pid:10568) Trying to update collector <129.215.181.34:9618>
1/26 16:30:38 (pid:10568) Attempting to send update via UDP to collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:30:38 (pid:10568) SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:30:38 (pid:10568) Skipping RESCHEDULE as optimization: sent one 2 seconds ago and still haven't heard from negotiatator. Will not reschedule more frequently than every 30 seconds
1/26 16:30:38 (pid:10568) Entering DedicatedScheduler::checkSanity()
1/26 16:30:38 (pid:10568) Finished DedicatedScheduler::handleDedicatedJobs
1/26 16:31:16 (pid:10568) -------- Begin starting jobs --------
1/26 16:31:16 (pid:10568) -------- Done starting jobs --------

The previous log seems to hint that the resource was found and the job "somehow" started.
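One way to check what the dedicated scheduler actually saw is to pull the resource lists out of SchedLog with a small script. The following is only a sketch: the function name `parse_resource_lists` is made up for illustration, and it assumes that each entry under a list header is printed as exactly three fields (OpSys, Arch, hostname), as in the excerpt above.

```python
import re

# A SchedLog line looks like: "1/26 16:30:38 (pid:10568) unclaimed resource list"
LINE_RE = re.compile(r"^\d+/\d+ \d+:\d+:\d+ \(pid:\d+\)\s(.*)$")
HEADER_RE = re.compile(r"^(idle|limbo|unclaimed|busy) resource list$")


def parse_resource_lists(lines):
    """Collect the entries printed under each '<name> resource list' header.

    If the same list is printed several times, the last listing wins.
    """
    lists = {}
    current = None  # name of the list whose entries are being read, if any
    for raw in lines:
        match = LINE_RE.match(raw.rstrip("\n"))
        if match is None:
            current = None
            continue
        message = match.group(1).strip()
        header = HEADER_RE.match(message)
        if header:
            current = header.group(1)
            lists[current] = []
        elif current is not None:
            fields = message.split()
            # Assumption: an entry is exactly "OpSys Arch hostname"; the
            # "*** empty ***" marker or any other message ends the list.
            if len(fields) == 3 and message.strip("* ") != "empty":
                lists[current].append(tuple(fields))
            else:
                current = None
    return lists


# Lines copied from the SchedLog excerpt above.
sample = [
    "1/26 16:30:38 (pid:10568) Found 1 potential dedicated resources",
    "1/26 16:30:38 (pid:10568) idle resource list",
    "1/26 16:30:38 (pid:10568)  ************ empty ************",
    "1/26 16:30:38 (pid:10568) limbo resource list",
    "1/26 16:30:38 (pid:10568)  ************ empty ************",
    "1/26 16:30:38 (pid:10568) unclaimed resource list",
    "1/26 16:30:38 (pid:10568)    LINUX      INTEL   ys.cap.ed.ac.uk",
    "1/26 16:30:38 (pid:10568) busy resource list",
    "1/26 16:30:38 (pid:10568)  ************ empty ************",
    "1/26 16:30:38 (pid:10568) Trying to find 1 resource(s) for dedicated job 19.0",
]

for name, entries in parse_resource_lists(sample).items():
    print(name, entries)
```

On this excerpt the only non-empty list is the unclaimed one, which matches the "Satisfied job 19 with 1 unclaimed resources" line: the machine was seen as unclaimed but never moved to the idle or busy list.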



Here is NegotiatorLog






1/26 16:30:36 NEGOTIATOR_CYCLE_DELAY is undefined, using default value of 20
1/26 16:30:36 ---------- Started Negotiation Cycle ----------
1/26 16:30:36 Phase 1:  Obtaining ads from collector ...
1/26 16:30:36   Getting all public ads ...
1/26 16:30:36 Trying to query collector <129.215.181.34:9618>
1/26 16:30:36 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:36 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:30:36   Sorting 5 ads ...
1/26 16:30:36   Getting startd private ads ...
1/26 16:30:36 Trying to query collector <129.215.181.34:9618>
1/26 16:30:36 NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
1/26 16:30:36 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:30:36 Got ads: 5 public and 1 private
1/26 16:30:36 Public ads include 1 submitter, 1 startd
1/26 16:30:36 Phase 2:  Performing accounting ...
1/26 16:30:36 Phase 3:  Sorting submitter ads by priority ...
1/26 16:30:36 Phase 4.1:  Negotiating with schedds ...
1/26 16:30:36     NumStartdAds = 1
1/26 16:30:36     NormalFactor = 1.000000
1/26 16:30:36     MaxPrioValue = 0.500000
1/26 16:30:36     NumScheddAds = 1
1/26 16:30:36 NEGOTIATOR_IGNORE_USER_PRIORITIES is undefined, using default value of False
1/26 16:30:36 Negotiating with jgrunche@xxxxxxxxxxxxxxx skipped because no idle jobs
1/26 16:30:36   Schedd jgrunche@xxxxxxxxxxxxxxx got all it wants; removing it.
1/26 16:30:36 ---------- Finished Negotiation Cycle ----------
1/26 16:31:06 enter Matchmaker::updateCollector
1/26 16:31:06 Trying to update collector <129.215.181.34:9618>
1/26 16:31:06 Attempting to send update via UDP to collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:31:06 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:31:06 exit Matchmaker::UpdateCollector




Here is CollectorLog



1/26 16:30:36 Found ScheddIpAddr
1/26 16:30:36 Got IP = '<129.215.181.34:32959>'
1/26 16:30:36 ScheddAd : Updating ... "< ys.cap.ed.ac.uk , 129.215.181.34 >"
1/26 16:30:36 Found ScheddIpAddr
1/26 16:30:36 Got IP = '<129.215.181.34:32959>'
1/26 16:30:36 SubmittorAd : Inserting ** "< jgrunche@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx , 129.215.181.34 >"
1/26 16:30:36 stats: Inserting new hashent for 'Submittor':'jgrunche@xxxxxxxxxxxxxxx':'129.215.181.34'
1/26 16:30:36 Got QUERY_NEGOTIATOR_ADS
1/26 16:30:36 (Sending 1 ads in response to query)
1/26 16:30:36 Got QUERY_ANY_ADS
1/26 16:30:36 (Sending 5 ads in response to query)
1/26 16:30:36 Got QUERY_STARTD_PVT_ADS
1/26 16:30:36 (Sending 1 ads in response to query)
1/26 16:30:38 Got QUERY_STARTD_ADS
1/26 16:30:38 (Sending 1 ads in response to query)
1/26 16:30:38 Found ScheddIpAddr
1/26 16:30:38 Got IP = '<129.215.181.34:32959>'
1/26 16:30:38 SubmittorAd : Inserting ** "< DedicatedScheduler@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx , 129.215.181.34 >"
1/26 16:30:38 stats: Inserting new hashent for 'Submittor':'DedicatedScheduler@xxxxxxxxxxxxxxx':'129.215.181.34'
1/26 16:31:06 NegotiatorAd  : Inserting ** "< ys.cap.ed.ac.uk >"
1/26 16:31:11 MasterAd     : Updating ... "< ys.cap.ed.ac.uk >"
1/26 16:31:22 Found StartdIpAddr
1/26 16:31:22 Got IP = '<129.215.181.34:32958>'
1/26 16:31:22 StartdAd : Updating ... "< ys.cap.ed.ac.uk , 129.215.181.34 >"
1/26 16:31:22 StartdPvtAd : Updating ... "< ys.cap.ed.ac.uk , 129.215.181.34 >"
1/26 16:32:33 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569740:9, failing.
1/26 16:32:34 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569740:9, failing.
1/26 16:32:54 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569842:17, failing.
1/26 16:32:54 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569842:17, failing.
1/26 16:33:17 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569851:18, failing.



All those "DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569740:9, failing." messages are a bit bizarre. Maybe they are the problem.
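To see how many distinct sessions are involved and how often each one fails, the CollectorLog can be summarised with a few lines of Python. This is a rough sketch, not a Condor tool: `count_invalid_sessions` is a hypothetical helper name, and the regular expression simply assumes the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Captures the session id from messages such as:
# "DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569740:9, failing."
SESSION_RE = re.compile(
    r"DC_AUTHENTICATE: attempt to open invalid session (\S+), failing"
)


def count_invalid_sessions(log_text):
    """Return a Counter mapping each rejected session id to its failure count."""
    # findall() scans the whole text, so it also copes with several log
    # entries that ended up on one physical line.
    return Counter(SESSION_RE.findall(log_text))


# Entries copied from the CollectorLog excerpt above.
sample_log = """\
1/26 16:32:33 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569740:9, failing.
1/26 16:32:34 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569740:9, failing.
1/26 16:32:54 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569842:17, failing.
1/26 16:32:54 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569842:17, failing.
1/26 16:33:17 DC_AUTHENTICATE: attempt to open invalid session ys:19948:1136569851:18, failing.
"""

for session, failures in count_invalid_sessions(sample_log).items():
    print(session, failures)
```

In this excerpt there are three distinct session ids, all starting with ys:19948, which suggests a single client repeatedly presenting sessions the collector does not know about.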



Here is MasterLog





1/26 16:31:11 enter Daemons::UpdateCollector
1/26 16:31:11 Trying to update collector <129.215.181.34:9618>
1/26 16:31:11 Attempting to send update via UDP to collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:31:11 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:31:11 exit Daemons::UpdateCollector
1/26 16:31:11 enter Daemons::CheckForNewExecutable
1/26 16:31:11 Time stamp of running /usr/local/condor/sbin/condor_master: 1134521440
1/26 16:31:11 GetTimeStamp returned: 1134521440
1/26 16:31:11 Time stamp of running /usr/local/condor/sbin/condor_collector: 1134521437
1/26 16:31:11 GetTimeStamp returned: 1134521437
1/26 16:31:11 Time stamp of running /usr/local/condor/sbin/condor_negotiator: 1134521434
1/26 16:31:11 GetTimeStamp returned: 1134521434
1/26 16:31:11 Time stamp of running /usr/local/condor/sbin/condor_startd: 1134521418
1/26 16:31:11 GetTimeStamp returned: 1134521418
1/26 16:31:11 Time stamp of running /usr/local/condor/sbin/condor_schedd: 1134521421
1/26 16:31:11 GetTimeStamp returned: 1134521421
1/26 16:31:11 exit Daemons::CheckForNewExecutable
1/26 16:31:22 ProcAPI::buildFamily() Found daddypid on the system: 10565
1/26 16:31:22 ProcAPI::buildFamily() Found daddypid on the system: 10566
1/26 16:31:22 ProcAPI::buildFamily() Found daddypid on the system: 10567
1/26 16:31:22 ProcAPI::buildFamily() Found daddypid on the system: 10568
1/26 16:32:22 ProcAPI::buildFamily() Found daddypid on the system: 10565
1/26 16:32:22 ProcAPI::buildFamily() Found daddypid on the system: 10566
1/26 16:32:22 ProcAPI::buildFamily() Found daddypid on the system: 10567
1/26 16:32:22 ProcAPI::buildFamily() Found daddypid on the system: 10568
1/26 16:33:22 ProcAPI::buildFamily() Found daddypid on the system: 10565
1/26 16:33:22 ProcAPI::buildFamily() Found daddypid on the system: 10566
1/26 16:33:22 ProcAPI::buildFamily() Found daddypid on the system: 10567
1/26 16:33:22 ProcAPI::buildFamily() Found daddypid on the system: 10568
1/26 16:34:06 Getting monitoring info for pid 10564
1/26 16:34:22 ProcAPI::buildFamily() Found daddypid on the system: 10565
1/26 16:34:22 ProcAPI::buildFamily() Found daddypid on the system: 10566
1/26 16:34:22 ProcAPI::buildFamily() Found daddypid on the system: 10567
1/26 16:34:22 ProcAPI::buildFamily() Found daddypid on the system: 10568




Here is StartLog




1/26 16:31:18 Swap space: 2031608
1/26 16:31:18 37367864 kbytes available for "/home/condor/execute"
1/26 16:31:18 Looking up RESERVED_DISK parameter
1/26 16:31:18 Reserving 5120 kbytes for file system
1/26 16:31:18 Disk space: 37362744
1/26 16:31:18 Mouse IRQ: 12
1/26 16:31:18 Add 1985543 mouse interrupts.  Total: 1985543
1/26 16:31:22 Trying to update collector <129.215.181.34:9618>
1/26 16:31:22 Attempting to send update via UDP to collector ys.cap.ed.ac.uk <129.215.181.34:9618>
1/26 16:31:22 SEC_DEBUG_PRINT_KEYS is undefined, using default value of False
1/26 16:31:22 Sent update to 1 collector(s)
1/26 16:34:18 Getting monitoring info for pid 10567




Here are the results of condor_q -l






-- Submitter: ys.cap.ed.ac.uk : <129.215.181.34:32959> : ys.cap.ed.ac.uk
MyType = "Job"
TargetType = "Machine"
ClusterId = 19
QDate = 1138293036
CompletionDate = 0
Owner = "jgrunche"
RemoteWallClockTime = 0.000000
LocalUserCpu = 0.000000
LocalSysCpu = 0.000000
RemoteUserCpu = 0.000000
RemoteSysCpu = 0.000000
ExitStatus = 0
NumCkpts = 0
NumRestarts = 0
NumSystemHolds = 0
CommittedTime = 0
TotalSuspensions = 0
LastSuspensionTime = 0
CumulativeSuspensionTime = 0
ExitBySignal = FALSE
CondorVersion = "$CondorVersion: 6.7.14 Dec 13 2005 $"
CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
RootDir = "/"
Iwd = "/home/jgrunche/condex/MPI"
JobUniverse = 11
Cmd = "/bin/cat"
WantIOProxy = TRUE
CurrentHosts = 0
WantRemoteSyscalls = FALSE
WantCheckpoint = FALSE
RemoteSpoolDir = "/home/condor/spool/cluster19.proc0.subproc0"
MinHosts = 1
MaxHosts = 1
JobStatus = 1
EnteredCurrentStatus = 1138293036
JobPrio = 0
User = "jgrunche@xxxxxxxxxxxxxxx"
NiceUser = FALSE
Env = ""
JobNotification = 2
WantRemoteIO = TRUE
UserLog = "/home/jgrunche/condex/MPI/logfile"
CoreSize = 0
KillSig = "SIGTERM"
Rank = 0.000000
In = "infile.#pArAlLeLnOdE#"
StreamIn = FALSE
Out = "outfile.#pArAlLeLnOdE#"
StreamOut = FALSE
Err = "errfile.#pArAlLeLnOdE#"
StreamErr = FALSE
BufferSize = 524288
BufferBlockSize = 32768
ShouldTransferFiles = "YES"
WhenToTransferOutput = "ON_EXIT"
TransferFiles = "ONEXIT"
ImageSize = 21
ExecutableSize = 21
DiskUsage = 21
Requirements = (Arch == "INTEL") && (OpSys == "LINUX") && (Disk >= DiskUsage) && ((Memory * 1024) >= ImageSize) && (HasFileTransfer)
PeriodicHold = FALSE
PeriodicRelease = FALSE
PeriodicRemove = FALSE
OnExitHold = FALSE
OnExitRemove = TRUE
LeaveJobInQueue = FALSE
Args = ""
GlobalJobId = "ys.cap.ed.ac.uk#1138293036#19.0"
ProcId = 0
Scheduler = "DedicatedScheduler@xxxxxxxxxxxxxxx"
ServerTime = 1138293387






Here are the results of condor_status -l






MyType = "Machine"
TargetType = "Job"
Name = "ys.cap.ed.ac.uk"
Machine = "ys.cap.ed.ac.uk"
Rank = Scheduler =?= "DedicatedScheduler@xxxxxxxxxxxxxxx"
CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
COLLECTOR_HOST_STRING = "ys.cap.ed.ac.uk"
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxx"
CondorVersion = "$CondorVersion: 6.7.14 Dec 13 2005 $"
CondorPlatform = "$CondorPlatform: I386-LINUX_RH9 $"
VirtualMachineID = 1
VirtualMemory = 2031608
Disk = 37362724
CondorLoadAvg = 0.000000
LoadAvg = 0.280000
KeyboardIdle = 0
ConsoleIdle = 0
Memory = 2027
Cpus = 1
StartdIpAddr = "<129.215.181.34:32958>"
Arch = "INTEL"
OpSys = "LINUX"
UidDomain = "ys.cap.ed.ac.uk"
FileSystemDomain = "ys.cap.ed.ac.uk"
Subnet = "129.215.181"
HasIOProxy = TRUE
TotalVirtualMemory = 2031608
TotalDisk = 37362724
TotalCpus = 1
TotalMemory = 2027
KFlops = 730496
Mips = 2108
LastBenchmark = 1138292778
TotalLoadAvg = 0.280000
TotalCondorLoadAvg = 0.000000
ClockMin = 996
ClockDay = 4
TotalVirtualMachines = 1
HasFileTransfer = TRUE
HasPerFileEncryption = TRUE
HasReconnect = TRUE
HasMPI = TRUE
HasTDP = TRUE
HasJobDeferral = TRUE
HasJICLocalConfig = TRUE
HasJICLocalStdin = TRUE
JavaVendor = "Free Software Foundation, Inc."
JavaVersion = "1.4.2"
JavaMFlops = 3.146770
HasJava = TRUE
HasPVM = TRUE
HasRemoteSyscalls = TRUE
HasCheckpointing = TRUE
StarterAbilityList = "HasFileTransfer,HasPerFileEncryption,HasReconnect,HasMPI,HasTDP,HasJobDeferral,HasJICLocalConfig,HasJICLocalStdin,HasJava,HasPVM,HasRemoteSyscalls,HasCheckpointing"
CpuBusyTime = 0
CpuIsBusy = FALSE
TimeToLive = 2147483647
State = "Unclaimed"
EnteredCurrentState = 1138292778
Activity = "Idle"
EnteredCurrentActivity = 1138292778
Start = TRUE
Requirements = START
MaxJobRetirementTime = 0
CurrentRank = 0.000000
MonitorSelfTime = 1138293258
MonitorSelfCPUUsage = 0.000000
MonitorSelfImageSize = 7496.000000
MonitorSelfResidentSetSize = 3300
MonitorSelfAge = 493
DaemonStartTime = 1138292771
UpdateSequenceNumber = 2
MyAddress = "<129.215.181.34:32958>"
LastHeardFrom = 1138293382
UpdatesTotal = 3
UpdatesSequenced = 2
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"
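
As a sanity check, the job's Requirements expression can be evaluated by hand against the machine ad. The sketch below does this in plain Python with the values copied from the two ads above; it is not a ClassAd evaluator, only a term-by-term re-computation, and `check_requirements` is a name chosen for illustration.

```python
# Attributes copied from the condor_q -l output above.
job_ad = {"DiskUsage": 21, "ImageSize": 21}

# Attributes copied from the condor_status -l output above.
machine_ad = {
    "Arch": "INTEL",
    "OpSys": "LINUX",
    "Disk": 37362724,
    "Memory": 2027,
    "HasFileTransfer": True,
    "Start": True,  # the machine's Requirements is simply START
}


def check_requirements(job, machine):
    """Evaluate each term of the job's Requirements, plus the machine's START."""
    return {
        'Arch == "INTEL"': machine["Arch"] == "INTEL",
        'OpSys == "LINUX"': machine["OpSys"] == "LINUX",
        "Disk >= DiskUsage": machine["Disk"] >= job["DiskUsage"],
        "(Memory * 1024) >= ImageSize": machine["Memory"] * 1024 >= job["ImageSize"],
        "HasFileTransfer": machine["HasFileTransfer"],
        "machine START": machine["Start"],
    }


for term, ok in check_requirements(job_ad, machine_ad).items():
    print(f"{term}: {ok}")
```

Every term holds for these values, so the two ads match each other on paper; whatever prevents the claim is not in the Requirements expressions themselves.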



Thanks for your help!