
Re: [HTCondor-users] HTCondor-CE not purging finished jobs



Hi Stefano,

I'm a little confused: your system periodic remove expression removes jobs that have been held for more than 8 hours, whereas your queries look for held jobs whose proxies expired more than 4 hours ago. I imagine there's some overlap, but they are fairly different queries.
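
To make the difference between the two conditions concrete, here is a minimal Python sketch that evaluates both against a few invented job records. The attribute names mirror the ClassAd attributes quoted in this thread; the helper names, the `now` value, and the sample jobs are hypothetical and only for illustration.

```python
HOUR = 3600
HELD = 5  # JobStatus value compared against in both expressions from the thread


def held_over_8h(job, now):
    """Mirrors SYSTEM_PERIODIC_REMOVE: held for more than 8 hours."""
    return job["JobStatus"] == HELD and now - job["EnteredCurrentStatus"] > 8 * HOUR


def proxy_expired_over_4h(job, now):
    """Mirrors the condor_ce_q constraint: held, proxy expired over 4 hours ago."""
    return job["JobStatus"] == HELD and now - job["x509UserProxyExpiration"] > 4 * HOUR


now = 1_000_000  # arbitrary reference time in seconds (invented)

# Invented sample jobs covering the three interesting combinations.
jobs = {
    "held 10h, proxy still valid": {
        "JobStatus": HELD,
        "EnteredCurrentStatus": now - 10 * HOUR,
        "x509UserProxyExpiration": now + 2 * HOUR,
    },
    "held 1h, proxy expired 6h ago": {
        "JobStatus": HELD,
        "EnteredCurrentStatus": now - 1 * HOUR,
        "x509UserProxyExpiration": now - 6 * HOUR,
    },
    "held 10h, proxy expired 6h ago": {
        "JobStatus": HELD,
        "EnteredCurrentStatus": now - 10 * HOUR,
        "x509UserProxyExpiration": now - 6 * HOUR,
    },
}

# For each job: (matches the remove expression, matches the query constraint)
results = {
    name: (held_over_8h(job, now), proxy_expired_over_4h(job, now))
    for name, job in jobs.items()
}

for name, (remove_rule, query) in results.items():
    print(f"{name}: remove rule={remove_rule}, query={query}")
```

Only the third job satisfies both conditions, so a job count taken from the query is not necessarily a count of jobs the remove expression should have matched.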

That said, having the RemoveReason set that way is pretty strange. If you set "SCHEDD_DEBUG = D_CAT D_ALWAYS:2", you may see some hints in the SchedLog as to why the Schedd is failing to remove these jobs.

Thanks,
Brian

On 5/16/20 11:05 AM, Stefano Dal Pra wrote:
Hello,
htcondor-ce-3.4.0-1.el7.noarch here.

We have a problem common to all of our CEs:

[root@ce02-htc ~]# condor_ce_q -cons '(JobStatus == 5 ) && (time() - x509UserProxyExpiration > 4 * 3600)' -af Owner | sort | uniq -c
   9592 user1
      4 user2
   1114 user3
    575 user4
     44 user5

I have set up a REMOVE and a REMOVE REASON rule:
SYSTEM_PERIODIC_REMOVE = (JobStatus == 5 && CurrentTime - EnteredCurrentStatus > 3600*8)
SYSTEM_PERIODIC_REMOVE_REASON = strcat("CE job removed by SYSTEM_PERIODIC_REMOVE due to ", ifThenElse((JobStatus == 5 && CurrentTime - EnteredCurrentStatus > 3600*8), "being in the hold state for 8 hours.", ifThenElse((JobStatus == 5 && isUndefined(RoutedToJobId)), "non-existent route or entry in JOB_ROUTER_ENTRIES.", "input files missing." ) ) )

Inspecting these "non purged jobs" shows that they have a RemoveReason set, yet they are still in the queue:

[root@ce02-htc ~]# condor_ce_q 1679707.0 -af JobStatus RemoveReason
5 CE job removed by SYSTEM_PERIODIC_REMOVE due to being in the hold state for 8 hours.

Until now I have found no better way than removing these jobs manually, using something like:
condor_ce_q -cons '(JobStatus == 5 ) && (time() - x509UserProxyExpiration > 4 * 3600)' -af 'strcat(ClusterId,".",ProcId)' | xargs condor_ce_rm

Am I missing something obvious?
Cheers,
Stefano