[Condor-users] Question about SYSTEM_PERIODIC_* expressions



I have a user who submitted jobs, and then deleted the job result
directories. The cluster was put on hold with every job getting:

HoldReasonCode = 14
HoldReasonSubCode = 2
LastHoldReason = "Cannot access initial working directory /blah/blah/blah: No such file or directory"

On my scheduler I have defined:

#SYSTEM_PERIODIC_HOLD = False
#SYSTEM_PERIODIC_RELEASE = False
SYSTEM_PERIODIC_REMOVE = ((JobStatus == 2) && \
    ((CurrentTime - EnteredCurrentStatus) > AlteraMaxJobRunTime)) || \
    ((JobStatus == 5) && (HoldReasonCode != 1))

I was expecting the periodic remove expression to wipe these jobs from
the system, but instead they're bouncing back and forth between held and
queued and causing some havoc in my negotiation cycles. How can I stop
them from returning to the queued state once they've been held? I
thought SYSTEM_PERIODIC_RELEASE = False would have ensured this, although
that line is currently commented out in the config shown.
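
To spell out why I expected removal, here is a rough Python sketch of how I read the remove expression. It is only an illustration and not how the schedd evaluates ClassAds: the function name periodic_remove, the dictionary layout, the timestamps and the max_run_time value are all made up for the example (the real AlteraMaxJobRunTime value isn't shown here), and ClassAd handling of undefined attributes is not modeled.

```python
RUNNING = 2  # JobStatus value for a running job
HELD = 5     # JobStatus value for a held job


def periodic_remove(job, now, max_run_time):
    """Plain-Python reading of the SYSTEM_PERIODIC_REMOVE expression.

    job          -- dict of job attributes (hypothetical stand-in for a job ad)
    now          -- current time in seconds (stands in for CurrentTime)
    max_run_time -- stands in for AlteraMaxJobRunTime (value assumed)
    """
    # First clause: running for longer than the allowed run time.
    ran_too_long = (
        job["JobStatus"] == RUNNING
        and (now - job["EnteredCurrentStatus"]) > max_run_time
    )
    # Second clause: held for any reason other than hold code 1.
    held_not_code_1 = job["JobStatus"] == HELD and job["HoldReasonCode"] != 1
    return ran_too_long or held_not_code_1


# A job shaped like the ones above: held with HoldReasonCode 14.
held_job = {"JobStatus": 5, "HoldReasonCode": 14, "EnteredCurrentStatus": 1000}
print(periodic_remove(held_job, now=1060, max_run_time=3600))
```

By this reading the second clause is true for a job that is held with HoldReasonCode 14, so I would have expected such a job to be removed the next time the expression is evaluated.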

- Ian

--
Ian R. Chesal <ichesal@xxxxxxxxxx>
Senior Software Engineer

Altera Corporation
Toronto Technology Center
Tel: (416) 926-8300