
[Condor-users] Malformed ClassAd




I am trying to build a ClassAd for Condor-G based matching.

Machine A is the machine that receives the ClassAds from N different
clusters, B1...BN.

The idea is that you submit a Condor-G job on machine A, which matches it
against those ClassAds and forwards the job accordingly.

I copied a shell script that someone else had written and modified
it to produce what I thought was a good Condor ClassAd:

Contents of the ClassAd for cluster B1:

[timm@fermigrid1 ~]$ condor_status -long fngp-osg.fnal.gov:2119/jobmanager-condor
MyType = "Machine"
TargetType = "Job"
Name = "fngp-osg.fnal.gov:2119/jobmanager-condor"
gatekeeper_url = "fngp-osg.fnal.gov:2119/jobmanager-condor"
Requirements = TRUE
Rank = 0.000000
CurrentRank = 0.000000
WantAdRevaluate = TRUE
CurMatches = 0
UpdateSequenceNumber = 1121277498
gluehostapplicationsoftwareruntimeenvironment = "VO-atlas-release-9.0.3 VO-atlas-lcg-release-0.0.2"
glueceinfohostname = "fnal.gov"
gluesubclustername = "fnal.gov"
gluecestatestatus = "Production"
gluecepolicymaxcputime = 2880
gluecepolicymaxwallclocktime = 2880
glueceaccesscontrolbaserule = "VO:*"
GlueCEStateTotalCPUs = 80
gluecestatefreecpus = 0
GlueCEStateRunningJobs = 26
GlueCEStateWaitingJobs = 0
gluecestateestimatedresponsetime = 0
MyAddress = "<131.225.167.42:0>"
LastHeardFrom = 1121277499
UpdatesTotal = 1
UpdatesSequenced = 0
UpdatesLost = 0
UpdatesHistory = "0x00000000000000000000000000000000"


The ClassAd is sent to machine A via condor_advertise
(the listing above is the output of condor_status -long).
Machine A sees the ClassAd but reports it as malformed.
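
For reference, the ad text itself is just one "Attr = value" line per attribute, as in the listing above. A minimal Python sketch of producing such text follows. It is not the shell script I actually used; the function names (format_value, render_ad) are made up for illustration, and the quoting rules are inferred only from the listing (strings in double quotes, booleans as TRUE/FALSE, numbers bare).

```python
def format_value(value):
    """Render one Python value the way it appears in the listing above."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    # Assumption: refuse awkward strings rather than guess at escaping rules.
    if '"' in text or "\n" in text:
        raise ValueError("string value needs escaping: %r" % text)
    return '"' + text + '"'


def render_ad(attrs):
    """Turn a dict of attributes into 'Attr = value' lines, one per attribute."""
    lines = ["%s = %s" % (name, format_value(value)) for name, value in attrs.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # A few attributes copied from the ad above; the rest are omitted for brevity.
    print(render_ad({
        "MyType": "Machine",
        "TargetType": "Job",
        "Name": "fngp-osg.fnal.gov:2119/jobmanager-condor",
        "Requirements": True,
        "GlueCEStateTotalCPUs": 80,
    }), end="")
```

The resulting text would be written to the file that is handed to condor_advertise.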


[timm@fermigrid1 ~]$ condor_status

Name          OpSys       Arch   State      Activity LoadAv Mem  ActvtyTime

fngp-osg.fnal [?????????] [????] [????????] [???]    [??]   [Unknown]
vm1@fermigrid LINUX       INTEL  Claimed    Busy     1.170  997  0+02:30:58
vm2@fermigrid LINUX       INTEL  Claimed    Busy     1.420  997  0+01:22:47
vm3@fermigrid LINUX       INTEL  Claimed    Busy     1.170  997  0+16:14:03
vm4@fermigrid LINUX       INTEL  Claimed    Busy     1.170  997  0+16:14:03


                     Machines Owner Claimed Unclaimed Matched Preempting

         INTEL/LINUX        4     0       4         0       0          0

               Total        4     0       4         0       0          0

                    (Omitted 1 malformed ads in computed attribute totals)
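
One way to narrow this down is to check which of the columns that condor_status prints have no corresponding attribute in the ad. The Python sketch below does that for ad text in the "Attr = value" form shown earlier. The attribute names in DISPLAY_ATTRS are my assumption about what backs each column, and whether a missing one is what actually triggers the "malformed" message is also only a guess. The function names are made up for illustration.

```python
# Assumption: attributes behind the default condor_status columns
# (Name, OpSys, Arch, State, Activity, LoadAv, Mem, ActvtyTime).
DISPLAY_ATTRS = [
    "Name", "OpSys", "Arch", "State", "Activity",
    "LoadAvg", "Memory", "EnteredCurrentActivity",
]


def parse_ad(text):
    """Parse 'Attr = value' lines into a dict keyed by lower-cased attribute name."""
    ad = {}
    for line in text.splitlines():
        # Split on the first '=' only, so values containing '=' stay intact.
        name, sep, value = line.partition("=")
        if sep and name.strip():
            ad[name.strip().lower()] = value.strip()
    return ad


def missing_display_attrs(text, wanted=DISPLAY_ATTRS):
    """Return the wanted attribute names that the ad does not define."""
    ad = parse_ad(text)
    # Attribute names are compared case-insensitively.
    return [name for name in wanted if name.lower() not in ad]


# The first few lines of the ad from this post; the full text could be
# captured from condor_status -long instead.
SAMPLE_AD = '''MyType = "Machine"
TargetType = "Job"
Name = "fngp-osg.fnal.gov:2119/jobmanager-condor"
Requirements = TRUE
Rank = 0.000000
'''

if __name__ == "__main__":
    print(missing_display_attrs(SAMPLE_AD))
```

Run against the full ad above, this should report every listed attribute except Name as missing, which at least lines up with the [????] placeholders in that row.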


So, three questions:

1) Is it legal and/or advisable to have both job execution slots from a
startd and a pool ad in the same Condor pool, as I have above? That is,
condor_status shows one remote cluster and four CPUs on this machine.

2) What is malformed about the ClassAd included above?

3) Is there a built-in Condor mechanism that has Condor itself create
the ClassAd for Condor-G matching?

Steve



--------------------------------------------------------------------
Steven C. Timm, Ph.D  (630) 840-8525  timm@xxxxxxxx  http://home.fnal.gov/~timm/
Fermilab Computing Div/Core Support Services Dept./Scientific Computing Section
Assistant Group Leader, Farms and Clustered Systems Group
Lead of Computing Farms Team