
Re: [HTCondor-users] Condor execute directory on Windows keeps pilling up

The dir_XXXX directories are not empty at all.  In fact, each directory still contains all of its files: .job.ad, .machine.ad, and condor_exec.bat are all still there.  As for antivirus, I don't believe any is installed on this machine.



From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
Sent: Tuesday, October 3, 2017 6:20:06 PM
To: Zhuang, Di; HTCondor-Users Mail List; Dan, Bowen
Cc: Louis, Prabha
Subject: Re: [HTCondor-users] Condor execute directory on Windows keeps pilling up
On 10/3/2017 5:53 PM, Zhuang, Di wrote:
> Looking into StarterLog.slot_1, I do see the following.  Could the
> buildup have something to do with the line "Got SIGQUIT. Performing fast
> shutdown."?  What could cause this?

The above is normal, expected behavior: it is just the condor_startd
telling the condor_starter to go away.

> If there is no immediate
> solution, as a temporary workaround, is there a way for me to safely
> identify which scratch directories are being worked on and remove the rest?

Couple thoughts:

Are the leaked dir_xxx subdirectories empty? I.e. they do not even
contain a ".job.ad" file?  In the one time I could reproduce the problem
on my machine, the subdirectory was indeed empty - in other words,
HTCondor successfully removed all the files but had an error removing
the (now empty) subdirectory.   If the leaked directories are also empty
for you, that would be an easy way to identify which ones you can
remove... if the subdirectory is older than a few seconds and is empty,
you could remove it.
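
As a rough illustration of that check, here is a minimal Python sketch. It is
not part of HTCondor; the function names, the "dir_" prefix test, and the
60-second age threshold (a conservative stand-in for "a few seconds") are all
assumptions made for the example. It only reports or removes subdirectories
that are both empty and old enough, and it uses os.rmdir, which refuses to
delete a directory that still has anything in it.

```python
import os
import time


def find_stale_empty_dirs(execute_dir, min_age_seconds=60.0, now=None):
    """Return dir_* subdirectories of execute_dir that are empty and old.

    Hypothetical helper: the name and the default threshold are assumptions,
    not anything HTCondor itself provides.
    """
    now = time.time() if now is None else now
    stale = []
    with os.scandir(execute_dir) as entries:
        for entry in entries:
            # Only consider real directories named like the scratch dirs.
            if not entry.is_dir(follow_symlinks=False):
                continue
            if not entry.name.startswith("dir_"):
                continue
            # Skip anything modified too recently; it may still be in use.
            if now - entry.stat().st_mtime < min_age_seconds:
                continue
            # Skip directories that still contain anything (e.g. .job.ad).
            with os.scandir(entry.path) as contents:
                if next(contents, None) is not None:
                    continue
            stale.append(entry.path)
    return sorted(stale)


def remove_dirs(paths, dry_run=True):
    """Remove the given directories; with dry_run=True, only report them."""
    removed = []
    for path in paths:
        if dry_run:
            print("would remove:", path)
            continue
        try:
            os.rmdir(path)  # fails if the directory is no longer empty
            removed.append(path)
        except OSError as err:
            print("could not remove", path, "-", err)
    return removed
```

You could call it as remove_dirs(find_stale_empty_dirs(r"C:\condor\execute"))
to see what it would delete, and pass dry_run=False only once the list looks
right. If the leaked directories still contain files, this check will
deliberately leave them alone.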

Are you using a real-time virus scanner like Microsoft Security
Essentials, Windows Defender, etc?  You could try adding the
C:\condor\execute folder to the list of folders excluded from being
scanned.  In another thread TJ guessed that HTCondor was unable to
remove the (empty) subdirectory because a virus scanner had it
temporarily open.