
[HTCondor-users] CE jobs staying idle after unsuccessful match



Hi all,

we occasionally get jobs coming in via our EL7/9.0.15 CEs from one of our supported VOs. They recently switched one of their submission nodes to EL9/Condor23, and now some of the jobs coming in from that submit node stay idle in our Condor cluster. So far only jobs from this submission node seem to be affected, but only a subset of its jobs is problematic, with the other jobs starting without problems. We have not yet found an obvious difference between the starting/running jobs and the jobs seemingly stuck in idle, so our first suspicion of an EL7/Condor9 vs. EL9/Condor23 issue did not hold.

All affected jobs seem to have
  'NumJobMatches == 1 && JobStatus == 1'
in their job ads, i.e., each got matched exactly once.
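
For reference, we pick these jobs out on the schedd with a constraint query roughly like the following (attribute names as in the job ad dump further below):

  condor_q -constraint 'NumJobMatches == 1 && JobStatus == 1' \
           -af:jh LastMatchTime LastRejMatchTime LastRejMatchReason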

We increased logging on the CE & Condor entry points and on the central managers to `ALL_DEBUG = D_FULLDEBUG`, but so far obvious hints as to why these jobs stay idle and are never re-matched are sparse.
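
(For completeness, a minimal sketch of what we set; pulling in D_MATCH on the negotiator in addition is just our assumption of what might be useful here:)

  # condor_config.local on the CE/schedd and the central managers, then condor_reconfig
  ALL_DEBUG = D_FULLDEBUG
  NEGOTIATOR_DEBUG = $(NEGOTIATOR_DEBUG) D_MATCH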

On the active central manager, such jobs have one matching attempt logged, as in [1], where the target execution point's startd (dynamic slots) appears to simply reject the job. Afterwards, there seem to be no further matching attempts. The rejecting worker's logs contain no trace of the affected cluster id, so I have no good idea why the worker did not accept the job (and I am a bit hesitant to increase logging on all execution points). In principle, the jobs are matchable and better-analyze looks good [2], with our execution points nominally willing to run them.
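
To poke at the single suspect execution point from [1] without raising logging everywhere, I would probably try something along these lines (host/slot names are just the example from [1]; the -machine restriction on the analysis is from memory and may be spelled differently in other versions):

  # slot ad as the collector sees it for the rejecting startd
  condor_status -l slot1@batch0558.desy.de
  # redo the match analysis for the stuck job against that one machine only
  condor_q -better-analyze -machine batch0558.desy.de 19458678.0
  # and, if needed, D_FULLDEBUG for the startd on that one EP only:
  #   STARTD_DEBUG = D_FULLDEBUG   (plus condor_reconfig on batch0558)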

Maybe someone has an idea why these once-matched-and-then-rejected jobs, i.e., NumJobMatches == 1, are never matched again?

Package versions are listed in [3a, 3b] for the CE and the worker (not in sync due to reasons...)

Cheers,
  Thomas

[1]
NegotiatorLog:02/09/24 08:52:52 Request 19458678.00000: autocluster 723303 (request count 1 of 2)
NegotiatorLog:02/09/24 08:52:52 Matched 19458678.0 group_ATLAS.atlasprd000@xxxxxxx <131.169.223.129:9620?addrs=131.169.223.129-9620+[2001-638-700-10df--1-81]-9620&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1587_20e3> preempting none <131.169.161.162:9620?addrs=131.169.161.162-9620+[2001-638-700-10a0--1-1a2]-9620&alias=batch0558.desy.de&noUDP&sock=startd_3590_0516> slot1@xxxxxxxxxxxxxxxxx
NegotiatorLog:02/09/24 08:52:52 Request 19458678.00000: autocluster 723303 (request count 2 of 2)
NegotiatorLog:02/09/24 08:52:52 Rejected 19458678.0 group_ATLAS.atlasprd000@xxxxxxx <131.169.223.129:9620?addrs=131.169.223.129-9620+[2001-638-700-10df--1-81]-9620&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1587_20e3>: no match found
MatchLog:02/09/24 08:52:52 Matched 19458678.0 group_ATLAS.atlasprd000@xxxxxxx <131.169.223.129:9620?addrs=131.169.223.129-9620+[2001-638-700-10df--1-81]-9620&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1587_20e3> preempting none <131.169.161.162:9620?addrs=131.169.161.162-9620+[2001-638-700-10a0--1-1a2]-9620&alias=batch0558.desy.de&noUDP&sock=startd_3590_0516> slot1@xxxxxxxxxxxxxxxxx
MatchLog:02/09/24 08:52:52 Rejected 19458678.0 group_ATLAS.atlasprd000@xxxxxxx <131.169.223.129:9620?addrs=131.169.223.129-9620+[2001-638-700-10df--1-81]-9620&alias=grid-htcondorce0.desy.de&noUDP&sock=schedd_1587_20e3>: no match found

[2]
-- Schedd: grid-htcondorce0.desy.de : <131.169.223.129:4792?...
The Requirements expression for job 19458678.000 is

NODE_IS_HEALTHY && ifThenElse(x509UserProxyVOName is "desy",TEST_RESOURCE == true,GRID_RESOURCE == true) && (OpSysAndVer == "CentOS7") && ifThenElse((x509UserProxyVOName isnt "desy") && (x509UserProxyVOName isnt "ops") && (x509UserProxyVOName isnt "calice") && (x509UserProxyVOName isnt "belle"),(OLD_RESOURCE == false),(OLD_RESOURCE == false) || (OLD_RESOURCE == true)) && ifThenElse((x509UserProxyVOName isnt "desy") && (x509UserProxyVOName isnt "ops") && (x509UserProxyVOName isnt "belle"),(BELLECALIBRATION_RESOURCE == false),(BELLECALIBRATION_RESOURCE is false) ||
      (BELLECALIBRATION_RESOURCE is true))

Job 19458678.000 defines the following attributes:

    x509UserProxyVOName = "atlas"

The Requirements expression for job 19458678.000 reduces to these conditions:

         Slots
Step    Matched  Condition
-----  --------  ---------
[0]        9634  NODE_IS_HEALTHY
[1]        9634  ifThenElse(x509UserProxyVOName is "desy",TEST_RESOURCE == true,GRID_RESOURCE == true)
[3]        9634  OpSysAndVer == "CentOS7"
[5]        9634  ifThenElse((x509UserProxyVOName isnt "desy") && (x509UserProxyVOName isnt "ops") && (x509UserProxyVOName isnt "calice") && (x509UserProxyVOName isnt "belle"),(OLD_RESOURCE == false),(OLD_RESOURCE == false) || (OLD_RESOURCE == true))
[7]        9634  ifThenElse((x509UserProxyVOName isnt "desy") && (x509UserProxyVOName isnt "ops") && (x509UserProxyVOName isnt "belle"),(BELLECALIBRATION_RESOURCE == false),(BELLECALIBRATION_RESOURCE is false) || (BELLECALIBRATION_RESOURCE is true))


19458678.000:  Job has been matched.

Last successful match: Fri Feb  9 08:52:52 2024


19458678.000: Run analysis summary ignoring user priority. Of 359 machines,
      0 are rejected by your job's requirements
     17 reject your job because of their own requirements
      0 match and are already running your jobs
      0 match but are serving other users
    342 are able to run your job


[3.a - CE Entry Point]
condor-9.0.15-1.el7.x86_64
condor-boinc-7.16.16-1.el7.x86_64
condor-classads-9.0.15-1.el7.x86_64
condor-externals-9.0.15-1.el7.x86_64
condor-procd-9.0.15-1.el7.x86_64
htcondor-ce-5.1.5-1.el7.noarch
htcondor-ce-apel-5.1.5-1.el7.noarch
htcondor-ce-bdii-5.1.3-1.el7.noarch
htcondor-ce-client-5.1.5-1.el7.noarch
htcondor-ce-condor-5.1.5-1.el7.noarch
htcondor-ce-view-5.1.5-1.el7.noarch
python2-condor-9.0.15-1.el7.x86_64
python3-condor-9.0.15-1.el7.x86_64

[3.b - Execution Point]
condor-9.0.8-1.el7.x86_64
condor-boinc-7.16.16-1.el7.x86_64
condor-classads-9.0.8-1.el7.x86_64
condor-externals-9.0.8-1.el7.x86_64
condor-procd-9.0.8-1.el7.x86_64
htcondor-ce-client-5.1.3-1.el7.noarch
python2-condor-9.0.8-1.el7.x86_64
python3-condor-9.0.8-1.el7.x86_64
Args = "-s DESY-HH -r DESY-HH -q DESY-HH -j unified -i PR --pythonversion 3 -w generic --pilot-user ATLAS --url https://pandaserver.cern.ch --harvester-submit-mode PULL --allow-same-user=False --job-type=unified --resource-type MCORE --pilotversion 3.7.0.36"
AuthTokenId = "2ba7c743-a5fe-4357-bb6f-b64870e775ce"
AuthTokenIssuer = "https://atlas-auth.web.cern.ch/"
AuthTokenScopes = "compute.cancel,compute.create,compute.modify,compute.read"
AuthTokenSubject = "7dee38a3-6ab8-4fe2-9e4c-58039c21d817"
ClusterId = 6599465
Cmd = "runpilot2-wrapper.sh"
CommittedSlotTime = 0
CommittedSuspensionTime = 0
CommittedTime = 0
CompletionDate = 0
CoreSize = 0
CumulativeRemoteSysCpu = 0.0
CumulativeRemoteUserCpu = 0.0
CumulativeSlotTime = 0
CumulativeSuspensionTime = 0
CurrentHosts = 0
DelegateJobGSICredentialsLifetime = 0
DiskUsage = 32
DiskUsage_RAW = 32
EncryptExecuteDirectory = false
EnteredCurrentStatus = 1707463747
Environment = "APFFID=CERN_central_B HARVESTER_WORKER_ID=510113260 APFMON=http://apfmon.lancs.ac.uk/api APFCID=2179005.15 PANDA_JSID=harvester-CERN_central_B HARVESTER_ID=CERN_central_B GTAG=https://aipanda158.cern.ch/condor_logs_2/24-02-09_07/grid.2179005.15.out";
Err = "_condor_stderr"
ExecutableSize = 32
ExecutableSize_RAW = 32
ExitBySignal = false
ExitStatus = 0
GlobalJobId = "grid-htcondorce0.desy.de#6599465.0#1707463737"
harvesterID = "CERN_central_B"
harvesterWorkerID = "510113260"
HoldReason = undefined
HoldReasonCode = undefined
ImageSize = 32
ImageSize_RAW = 31
In = "/dev/null"
ioIntensity = 0
Iwd = "/var/lib/condor-ce/spool/9465/0/cluster6599465.proc0.subproc0"
JobLeaseDuration = 2400
JobNotification = 0
JobPrio = 0
JobStatus = 1
JobSubmitMethod = 0
JobUniverse = 5
KillSig = "SIGTERM"
LastHoldReason = "Spooling input data files"
LastHoldReasonCode = 16
LastJobStatus = 5
LastSuspensionTime = 0
LeaveJobInQueue = JobStatus == 4
Managed = "External"
ManagedManager = "htcondor-ce"
MaxHosts = 1
maxMemory = 16000
maxWallTime = 3600
MinHosts = 1
MyType = "Job"
NumCkpts = 0
NumCkpts_RAW = 0
NumJobCompletions = 0
NumJobMatches = 1
NumJobStarts = 0
NumRestarts = 0
NumSystemHolds = 0
Out = "_condor_stdout"
Owner = "atlasprd000"
PeriodicRemove = (JobStatus == 5 && (CurrentTime - EnteredCurrentStatus) > 3600) || (JobStatus == 1 && globusstatus =!= 1 && (CurrentTime - EnteredCurrentStatus) > 86400)
ProcId = 0
QDate = 1707463734
queue = "atlas"
Rank = 0.0
ReleaseReason = "Data files spooled"
RemoteSysCpu = 0.0
RemoteUserCpu = 0.0
RemoteWallClockTime = 0.0
RequestCpus = 1
RequestDisk = DiskUsage
RequestMemory = ifthenelse(MemoryUsage =!= undefined,MemoryUsage,(ImageSize + 1023) / 1024)
Requirements = true
RootDir = "/"
RoutedToJobId = "19458678.0"
SciTokensFile = "/cephfs/atlpan/harvester/tokens/ce/prod/d9a2d3d608b788ef3cae8835973aec82"
sdfCopied = 0
sdfPath = "/cephfs/atlpan/harvester/harvester_wdirs/CERN_central_B/32/60/510113260/tmp8xwhzzsn_submit.sdf"
ServerTime = 1707475341
ShouldTransferFiles = "YES"
StageInFinish = 1707463746
StageInStart = 1707463745
StreamErr = false
StreamOut = false
SUBMIT_Cmd = "/cvmfs/atlas.cern.ch/repo/sw/PandaPilotWrapper/latest/runpilot2-wrapper.sh"
SUBMIT_Iwd = "/cephfs/atlpan/harvester/harvester_wdirs/CERN_central_B/32/60/510113260"
SUBMIT_TransferOutputRemaps = "_condor_stdout=/data2/atlpan/condor_logs/24-02-09_07/grid.2179005.15.out;_condor_stderr=/data2/atlpan/condor_logs/24-02-09_07/grid.2179005.15.err"
SUBMIT_x509userproxy = "/cephfs/atlpan/harvester/proxy/x509up_u25606_prod"
SubmitterGlobalJobId = "aipanda158.cern.ch#2179005.15#1707463731"
SubmitterId = "aipanda158.cern.ch"
SubmitterLeaveJobInQueue = false
TargetType = "Machine"
TotalSubmitProcs = 1
TotalSuspensions = 0
TransferIn = false
TransferInputSizeMB = 0
TransferOutput = ""
TransferOutputRemaps = undefined
User = "atlasprd000@xxxxxxxxxxxxxxxxxx"
WantClaiming = false
WhenToTransferOutput = "ON_EXIT_OR_EVICT"
x509userproxy = "x509up_u25606_prod"
x509UserProxyEmail = "atlas.pilot1@xxxxxxx"
x509UserProxyExpiration = 1707817570
x509UserProxyFirstFQAN = "/atlas/Role=production/Capability=NULL"
x509UserProxyFQAN = "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1,/atlas/Role=production/Capability=NULL,/atlas/Role=NULL/Capability=NULL,/atlas/lcg1/Role=NULL/Capability=NULL,/atlas/usatlas/Role=NULL/Capability=NULL"
x509userproxysubject = "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1"
x509UserProxyVOName = "atlas"
xcount = 8

AccountingGroup = "group_ATLAS.atlasprd000"
AcctGroup = "group_ATLAS"
AcctGroupUser = "atlasprd000"
Args = "-s DESY-HH -r DESY-HH -q DESY-HH -j unified -i PR --pythonversion 3 -w generic --pilot-user ATLAS --url https://pandaserver.cern.ch --harvester-submit-mode PULL --allow-same-user=False --job-type=unified --resource-type MCORE --pilotversion 3.7.0.36"
AutoClusterAttrs = "_cp_orig_RequestCpus,_cp_orig_RequestDisk,_cp_orig_RequestMemory,ClusterAvgCoreHS23,MachineLastMatchTime,PartitionableSlot,RequestCpus,RequestDisk,RequestMemory,ConcurrencyLimits,FlockTo,Rank,Requirements,RemoteOwner,TotalJobRuntime,NODE_IS_HEALTHY,BELLECALIBRATION_RESOURCE,DESYAcctGroup,DiskUsage,GlideinCpusIsGood,GRID_RESOURCE,JobCpus,JobIsRunning,JobMemory,JobStatus,MATCH_EXP_JOB_GLIDEIN_Cpus,MATCH_EXP_JOB_GLIDEIN_Memory,OLD_RESOURCE,OpSysAndVer,OriginalCpus,OriginalMemory,Owner,TEST_RESOURCE,TotalCpus,TotalMemory,WantWholeNode,x509UserProxyVOName"
AutoClusterId = 723303
BatchRuntime = 216000
CERequirements = "CondorCE"
ClusterId = 19458678
Cmd = "runpilot2-wrapper.sh"
CommittedSlotTime = 0
CommittedSuspensionTime = 0
CommittedTime = 0
CompletionDate = 0
ConcurrencyLimits = strcat(DESYAcctGroup,".",Owner)
CondorCE = 1
CumulativeRemoteSysCpu = 0.0
CumulativeRemoteUserCpu = 0.0
CumulativeSlotTime = 0
CumulativeSuspensionTime = 0
CurrentHosts = 1
cutAcctGroup = toLower(split(AcctGroup,"_")[1])
default_maxMemory = 2048
default_maxWallTime = 5760
default_xcount = 1
DelegateJobGSICredentialsLifetime = 0
DESYAcctGroup = "group_ATLAS"
DESYAcctGroupATLASMembers = { "atlas" }
DESYAcctGroupBelle2Members = { "belle","belle2" }
DESYAcctGroupBioMembers = { "enmr","biomed" }
DESYAcctGroupCMSMembers = { "cms" }
DESYAcctGroupDESYMembers = { "dech","desy","ghep" }
DESYAcctGroupILCMembers = { "calice","ilc" }
DESYAcctGroupLHCBMembers = { "LHCB" }
DESYAcctGroupOpsMembers = { "ops" }
DESYAcctMCoreSubGroup = "mcore"
DESYAcctSubGroup = ifThenElse(regexp("desyplt",Owner) || regexp("desytst",Owner),"desyplt",ifThenElse(regexp("desyprd",Owner),"desyprd",ifThenElse(regexp("desysgm",Owner),"desysgm",ifThenElse(regexp("desyusr",Owner),"desyusr",ifThenElse(regexp("sgmcms",Owner),"ops",ifThenElse(regexp("atlas",Owner) && RequestCpus > 1,"atlas_multicore",ifThenElse(regexp("cms",Owner) && RequestCpus > 1,"cms_multicore",ifThenElse(regexp("lhcb",Owner) && RequestCpus > 1,"lhcb_multicore","other"))))))))
DESYDEFAULTSSET = true
DESYGRIDSET = true
DiskUsage = 32
DiskUsage_RAW = 32
EncryptExecuteDirectory = false
EnteredCurrentStatus = 1707463758
Environment = "APFFID=CERN_central_B HARVESTER_WORKER_ID=510113260 APFMON=http://apfmon.lancs.ac.uk/api CONDORCE_COLLECTOR_HOST=grid-htcondorce0.desy.de:9619 APFCID=2179005.15 PANDA_JSID=harvester-CERN_central_B HARVESTER_ID=CERN_central_B GTAG=https://aipanda158.cern.ch/condor_logs_2/24-02-09_07/grid.2179005.15.out";
Err = "_condor_stderr"
ExecutableSize = 32
ExecutableSize_RAW = 32
ExitBySignal = false
ExitStatus = 0
GlideinCpusIsGood =  !isUndefined(MATCH_EXP_JOB_GLIDEIN_Cpus) && (int(MATCH_EXP_JOB_GLIDEIN_Cpus) =!= error)
GlideinGPUsIsGood =  !isUndefined(MATCH_EXP_JOB_GLIDEIN_GPUs) && (int(MATCH_EXP_JOB_GLIDEIN_GPUs) =!= error)
GlobalJobId = "grid-htcondorce0.desy.de#19458678.0#1707463758"
harvesterID = "CERN_central_B"
harvesterWorkerID = "510113260"
HoldReason = undefined
HoldReasonCode = undefined
ImageSize = 32
ImageSize_RAW = 31
In = "/dev/null"
ioIntensity = 0
isATLASMember = member(cutAcctGroup,DESYAcctGroupATLASMembers)
isBELLE2Member = member(cutAcctGroup,DESYAcctGroupBelle2Members)
isBIOMember = member(cutAcctGroup,DESYAcctGroupBioMembers)
isCMSMember = member(cutAcctGroup,DESYAcctGroupCMSMembers)
isDESYMember = member(cutAcctGroup,DESYAcctGroupDESYMembers)
isILCMember = member(cutAcctGroup,DESYAcctGroupILCMembers)
isLHCBMember = member(cutAcctGroup,DESYAcctGroupOpsMembers)
isOPSMember = member(cutAcctGroup,DESYAcctGroupOpsMembers)
Iwd = "/var/lib/condor-ce/spool/9465/0/cluster6599465.proc0.subproc0"
JOB_GLIDEIN_Cpus = "$$(ifThenElse(WantWholeNode is true, !isUndefined(TotalCpus) ? TotalCpus : JobCpus, OriginalCpus))"
JOB_GLIDEIN_GPUs = "$$(ifThenElse(WantWholeNode is true, !isUndefined(TotalGPUs) ? TotalGPUs : JobGPUs, OriginalGPUs))"
JOB_GLIDEIN_Memory = "$$(TotalMemory:0)"
JobCpus = JobIsRunning ? int(MATCH_EXP_JOB_GLIDEIN_Cpus) : OriginalCpus
JobGPUs = JobIsRunning ? int(MATCH_EXP_JOB_GLIDEIN_GPUs) : OriginalGPUs
JobIsRunning = (JobStatus =!= 1) && (JobStatus =!= 5) && GlideinCpusIsGood
JobLeaseDuration = 2400
JobMemory = JobIsRunning ? int(MATCH_EXP_JOB_GLIDEIN_Memory) * 95 / 100 : OriginalMemory
JobNotification = 0
JobPrio = 0
JobStatus = 1
JobSubmitMethod = 0
JobUniverse = 5
KillSig = "SIGTERM"
LastHoldReason = "Spooling input data files"
LastHoldReasonCode = 16
LastJobStatus = 5
LastMatchTime = 1707465172
LastRejMatchReason = "no match found "
LastRejMatchTime = 1707465172
LastSuspensionTime = 0
LeaveJobInQueue = JobStatus == 4
MachineAttrApelScaledPerSlot0 = 0.8226837060702875
MachineAttrApelScaling0 = 0.8226837060702875
MachineAttrArch0 = "X86_64"
MachineAttrClusterAvgCoreHS060 = 15.65
MachineAttrCondorVersion0 = "$CondorVersion: 9.0.8 Dec 02 2021 BuildID: 564626 PackageID: 9.0.8-1 $"
MachineAttrCpuModel0 = undefined
MachineAttrCpus0 = 8
MachineAttrHS060 = 515
MachineAttrHS06PerSlot0 = 12.875
MachineAttrHS06perWatt0 = 1.81
MachineAttrMachine0 = "batch0558.desy.de"
MachineAttrMemory0 = 56099
MachineAttrOpSysAndVer0 = "CentOS7"
MachineAttrSlotWeight0 = 8
MachineAttrTotalCpus0 = 40.0
MachineAttrTotalDisk0 = 779152364
MachineAttrTotalMemory0 = 128675
MachineAttrTotalSlotCpus0 = 40
MachineAttrTotalSlotMemory0 = 128675
MachineAttrTotalSlots0 = 12
Matched = true
MaxHosts = 1
maxMemory = 16000
maxWallTime = 3600
MinHosts = 1
MyType = "Job"
NumCkpts = 0
NumCkpts_RAW = 0
NumJobCompletions = 0
NumJobMatches = 1
NumJobStarts = 0
NumRestarts = 0
NumSystemHolds = 0
OnExitHold = ifThenElse(orig_OnExitHold =!= undefined,orig_OnExitHold,false) || ifThenElse(minWalltime =!= undefined && RemoteWallClockTime =!= undefined,RemoteWallClockTime < 60 * minWallTime,false)
OnExitHoldReason = ifThenElse((orig_OnExitHold =!= undefined) && orig_OnExitHold,ifThenElse(orig_OnExitHoldReason =!= undefined,orig_OnExitHoldReason,strcat("The on_exit_hold expression (",unparse(orig_OnExitHold),") evaluated to TRUE.")),ifThenElse(minWalltime =!= undefined && RemoteWallClockTime =!= undefined && (RemoteWallClockTime < 60 * minWallTime),strcat("The job's wall clock time, ",int(RemoteWallClockTime / 60),"min, is less than the minimum specified by the job (",minWalltime,")"),"Job held for unknown reason."))
OnExitHoldSubCode = ifThenElse((orig_OnExitHold =!= undefined) && orig_OnExitHold,ifThenElse(orig_OnExitHoldSubCode =!= undefined,orig_OnExitHoldSubCode,1),42)
orig_AuthTokenId = "2ba7c743-a5fe-4357-bb6f-b64870e775ce"
orig_AuthTokenIssuer = "https://atlas-auth.web.cern.ch/"
orig_AuthTokenScopes = "compute.cancel,compute.create,compute.modify,compute.read"
orig_AuthTokenSubject = "7dee38a3-6ab8-4fe2-9e4c-58039c21d817"
orig_environment = "APFFID=CERN_central_B HARVESTER_WORKER_ID=510113260 APFMON=http://apfmon.lancs.ac.uk/api APFCID=2179005.15 PANDA_JSID=harvester-CERN_central_B HARVESTER_ID=CERN_central_B GTAG=https://aipanda158.cern.ch/condor_logs_2/24-02-09_07/grid.2179005.15.out"
orig_RequestCpus = 1
OriginalCpus = 8
OriginalGPUs = undefined
OriginalMemory = 16000
osg_environment = ""
Out = "_condor_stdout"
Owner = "atlasprd000"
PostArguments = undefined
PostCmd = "/usr/local/bin/condor_bash_postCMD_wrapper.sh"
ProcId = 0
QDate = 1707463758
queue = "atlas"
Rank = 0.0
ReleaseReason = "Data files spooled"
Remote_JobUniverse = 5
remote_NodeNumber = 8
remote_OriginalMemory = 16000
remote_queue = "atlas"
remote_SMPGranularity = 8
RemoteSysCpu = 0.0
RemoteUserCpu = 0.0
RequestCpus = ifThenElse(WantWholeNode =?= true, !isUndefined(TotalCpus) ? TotalCpus : JobCpus,OriginalCpus)
RequestDisk = DiskUsage
RequestGPUs = ifThenElse((WantWholeNode =?= true && OriginalGPUs =!= undefined),( !isUndefined(TotalGPUs) && TotalGPUs > 0) ? TotalGPUs : JobGPUs,OriginalGPUs)
RequestMemory = ifThenElse(WantWholeNode =?= true, !isUndefined(TotalMemory) ? TotalMemory * 95 / 100 : JobMemory,OriginalMemory)
Requirements = NODE_IS_HEALTHY && ifThenElse(x509UserProxyVOName =?= "desy",TEST_RESOURCE == true,GRID_RESOURCE == true) && (OpSysAndVer == "CentOS7") && ifThenElse((x509UserProxyVOName =!= "desy") && (x509UserProxyVOName =!= "ops") && (x509UserProxyVOName =!= "calice") && (x509UserProxyVOName =!= "belle"),(OLD_RESOURCE == false),(OLD_RESOURCE == false) || (OLD_RESOURCE == true)) && ifThenElse((x509UserProxyVOName =!= "desy") && (x509UserProxyVOName =!= "ops") && (x509UserProxyVOName =!= "belle"),(BELLECALIBRATION_RESOURCE == false),(BELLECALIBRATION_RESOURCE =?= false) || (BELLECALIBRATION_RESOURCE =?= true))
RootDir = "/"
RoutedBy = "htcondor-ce"
RoutedFromJobId = "6599465.0"
RoutedJob = true
RouteName = "DESYGRID"
SciTokensFile = "/cephfs/atlpan/harvester/tokens/ce/prod/d9a2d3d608b788ef3cae8835973aec82"
sdfCopied = 0
sdfPath = "/cephfs/atlpan/harvester/harvester_wdirs/CERN_central_B/32/60/510113260/tmp8xwhzzsn_submit.sdf"
ServerTime = 1707475325
ShouldTransferFiles = "YES"
StreamErr = false
StreamOut = false
SUBMIT_Cmd = "/cvmfs/atlas.cern.ch/repo/sw/PandaPilotWrapper/latest/runpilot2-wrapper.sh"
SUBMIT_TransferOutputRemaps = "_condor_stdout=/data2/atlpan/condor_logs/24-02-09_07/grid.2179005.15.out;_condor_stderr=/data2/atlpan/condor_logs/24-02-09_07/grid.2179005.15.err"
SUBMIT_x509userproxy = "/cephfs/atlpan/harvester/proxy/x509up_u25606_prod"
SubmitterGlobalJobId = "aipanda158.cern.ch#2179005.15#1707463731"
SubmitterId = "aipanda158.cern.ch"
SubmitterLeaveJobInQueue = false
TargetType = "Machine"
TotalSubmitProcs = 1
TotalSuspensions = 0
TransferIn = false
TransferInputSizeMB = 0
TransferOutput = ""
TransferOutputRemaps = undefined
User = "atlasprd000@xxxxxxx"
WantClaiming = false
WhenToTransferOutput = "ON_EXIT_OR_EVICT"
x509userproxy = "x509up_u25606_prod"
x509UserProxyEmail = "atlas.pilot1@xxxxxxx"
x509UserProxyExpiration = 1707806404
x509UserProxyFirstFQAN = "/atlas/Role=production/Capability=NULL"
x509UserProxyFQAN = "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1,/atlas/Role=production/Capability=NULL,/atlas/Role=NULL/Capability=NULL,/atlas/lcg1/Role=NULL/Capability=NULL,/atlas/usatlas/Role=NULL/Capability=NULL"
x509userproxysubject = "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=atlpilo1/CN=614260/CN=Robot: ATLAS Pilot1"
x509UserProxyVOName = "atlas"
xcount = 8
