Re: [Condor-users] Condor Java problem



On Fri, Jan 07, 2005 at 11:54:34AM -0600, David A. Kotz wrote:
> None of the Linux machines in our department will acknowledge having
> Java in their classads.  If I manually run condor_starter, it shows up.
> I've run the test as myself, root, and condor, and in all cases, Java is
> detected by condor_starter.  On the Suns, Java seems to behave
> correctly.  Any ideas would be appreciated.  (I believe I mentioned
> before that Condor hates me.)
> 

Almost certainly this means it's some sort of path problem - either in
the path to java or some sort of dynamic library path. The condor_starter
gets its environment from the condor_startd, which gets it from the
condor_master. If your master is being started up by init when the
machine boots, your path might be kind of sparse.
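
One rough way to check this without touching Condor is to look java up
under a stripped-down environment. This is only a sketch: SPARSE_PATH is
a guess at what a boot-time PATH might contain, not a value Condor sets,
and check_in_sparse_env is a made-up helper name.

```shell
#!/bin/sh
# Sketch: look up a program using only a minimal PATH, roughly what a
# daemon started at boot might see. SPARSE_PATH is an assumed value,
# not something Condor defines.
SPARSE_PATH=/sbin:/usr/sbin:/bin:/usr/bin

check_in_sparse_env() {
    # env -i clears the inherited environment; only PATH is set.
    env -i PATH="$SPARSE_PATH" /bin/sh -c 'command -v "$1"' sh "$1"
}

check_in_sparse_env java || echo "java is not visible with PATH=$SPARSE_PATH"
```

If java turns up from your login shell but not here, that points at the
PATH the master inherited.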

If you can't figure it out, the easiest way to track it down is to wrap
the condor_starter in a shell script and see what environment it's being
invoked with. Rename condor_starter to condor_starter.bin, and then
put this in a shell script named condor_starter:

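#!/bin/sh
# Append a record of the starter's environment to a dump file,
# then hand off to the real starter.

echo >>/tmp/StarterEnvDump.txt
echo "Starting the starter" >>/tmp/StarterEnvDump.txt
date >>/tmp/StarterEnvDump.txt
env >>/tmp/StarterEnvDump.txt
# java -version prints to stderr, so capture that as well
java -version >>/tmp/StarterEnvDump.txt 2>&1
exec condor_starter.bin "$@"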
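
Once the dump file has an entry, the recorded PATH can be compared with
the PATH of a shell where java does work. The following is a sketch
only; dump_path and missing_dirs are hypothetical helper names, and it
assumes the dump file written by the wrapper above.

```shell
#!/bin/sh
# Sketch: list directories on the current PATH that are missing from
# the PATH recorded in the most recent dump entry.

dump_path() {
    # Print the value of the last PATH= line in the dump file given as $1.
    grep '^PATH=' "$1" | tail -n 1 | cut -d= -f2-
}

missing_dirs() {
    # $1 = PATH recorded for the starter, $2 = PATH to compare against.
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        case ":$1:" in
            *":$dir:"*) ;;                # present in both
            *) printf '%s\n' "$dir" ;;    # only in the comparison PATH
        esac
    done
    IFS=$old_ifs
}

dump=/tmp/StarterEnvDump.txt
if [ -r "$dump" ]; then
    missing_dirs "$(dump_path "$dump")" "$PATH"
fi
```

Run it from the shell where condor_starter -classad reports HasJava.
Any directory it prints is one the starter does not have on its PATH.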

> Below is the output from my desktop.
> 
> - dave
> 
> _________________________________
> 
> keemun $ condor_starter -classad
> CondorVersion = "$CondorVersion: 6.6.6 Jul 26 2004 $"
> IsDaemonCore = True
> HasFileTransfer = True
> HasMPI = True
> HasJICLocalConfig = True
> HasJICLocalStdin = True
> JavaVendor = "Sun Microsystems Inc."
> JavaVersion = "1.4.2"
> JavaMFlops = 135.990906
> HasJava = True
> 
> _________________________________
> 
> keemun $ condor_status -l keemun
> MyType = "Machine"
> TargetType = "Job"
> Name = "vm1@xxxxxxxxxxxxxxxxxxxx"
> Machine = "keemun.cs.utexas.edu"
> Rank = ((TARGET.Group =?= "CARTEL") * 3) + ((TARGET.Group =?= "PROF") *
> 3) + ((TARGET.Group =?= "GRAD") * 3) + ((TARGET.Group =?= "UNDER") * 2)
> CpuBusy = ((LoadAvg - CondorLoadAvg) >= 0.500000)
> COLLECTOR_HOST_STRING = "ungoliant.cs.utexas.edu"
> CKPT_SERVER_HOST = ungoliant.cs.utexas.edu
> CondorVersion = "$CondorVersion: 6.6.6 Jul 26 2004 $"
> CondorPlatform = "$CondorPlatform: I386-LINUX_RH72 $"
> VirtualMachineID = 1
> VirtualMemory = 525622
> Disk = 412332
> CondorLoadAvg = 0.000000
> LoadAvg = 0.020000
> KeyboardIdle = 65
> ConsoleIdle = 65
> Memory = 250
> Cpus = 1
> StartdIpAddr = "<128.83.120.125:32774>"
> Arch = "INTEL"
> OpSys = "LINUX"
> UidDomain = "cs.utexas.edu"
> FileSystemDomain = "cs.utexas.edu"
> Subnet = "128.83.120"
> HasIOProxy = TRUE
> TotalVirtualMemory = 1051244
> TotalDisk = 824664
> KFlops = 618930
> Mips = 1368
> LastBenchmark = 1105102361
> TotalLoadAvg = 0.020000
> TotalCondorLoadAvg = 0.000000
> ClockMin = 692
> ClockDay = 5
> TotalVirtualMachines = 2
> HasFileTransfer = TRUE
> HasMPI = TRUE
> HasJICLocalConfig = TRUE
> HasJICLocalStdin = TRUE
> HasPVM = TRUE
> HasRemoteSyscalls = TRUE
> HasCheckpointing = TRUE
> StarterAbilityList =
> "HasFileTransfer,HasMPI,HasJICLocalConfig,HasJICLocalStdin,HasPVM,HasRemoteSyscalls,HasCheckpointing"
> CpuBusyTime = 0
> CpuIsBusy = FALSE
> State = "Owner"
> EnteredCurrentState = 1105111961
> Activity = "Idle"
> EnteredCurrentActivity = 1105111961
> Start = ((KeyboardIdle > 15 * 60) && (((LoadAvg - CondorLoadAvg) <=
> 0.300000) || (State != "Unclaimed" && State != "Owner")) &&
> ((TARGET.Project =?= "ARCHITECTURE") || (TARGET.Project =?=
> "FORMAL_METHODS") || (TARGET.Project =?= "AI_ROBOTICS") ||
> (TARGET.Project =?= "OPERATING_DISTRIBUTED_SYSTEMS") || (TARGET.Project
> =?= "NETWORKING_MULTIMEDIA") || (TARGET.Project =?=
> "PROGRAMMING_LANGUAGES") || (TARGET.Project =?= "THEORY") ||
> (TARGET.Project =?= "GRAPHICS_VISUALIZATION") || (TARGET.Project =?=
> "COMPONENT_BASED_SOFTWARE") || (TARGET.Project =?=
> "SCIENTIFIC_COMPUTING") || (TARGET.Project =?= "COMPUTATIONAL_BIOLOGY")
> || (TARGET.Project =?= "INSTRUCTIONAL") || (TARGET.Project =?= "UTGRID")
> || (TARGET.Project =?= "OTHER")) && (TARGET.ProjectDescription =!=
> UNDEFINED))
> Requirements = START
> CurrentRank = 0.000000
> DaemonStartTime = 1105049249
> UpdateSequenceNumber = 266
> MyAddress = "<128.83.120.125:32774>"
> LastHeardFrom = 1105119166
> UpdatesTotal = 1542
> UpdatesSequenced = 1539
> UpdatesLost = 0
> UpdatesHistory = "0x00000000000000000000000000000000"
> 
> 
> 
> -- 
> David A. Kotz <dkotz@xxxxxxxxxxxxx>