
Re: [HTCondor-users] Problem with condor-slot and temp folders on windows



Hey,

 

Thanks for your quick response. We do sometimes see error messages in the logs saying that the execute folder cannot be deleted, and we know that one particular type of job has this problem.

However, in many other cases (with a different job type) we do not see any log messages that would hint at a problem deleting the execute folder.

 

In either case, the execute folder does eventually get deleted somehow.

 

What persists are the user folders in the Windows user directory, and they contain nothing but empty folders. It is always the “AppData” folder with some further subfolders, mostly Local\Microsoft\something or Roaming\Microsoft\somethingElse. Furthermore, the “something” is different each time…
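
To make the symptom easier to measure, here is a minimal Python sketch that lists the subfolders of a given directory whose whole tree contains no files at all, i.e. only nested empty folders such as the AppData remnants described above. The function names and the idea of pointing it at the Windows user directory are illustrative assumptions, not part of HTCondor. It only reports candidates and deletes nothing.

```python
import os


def _raise(err):
    # os.walk swallows listing errors by default; re-raise them so an
    # unreadable folder is never mistaken for an empty one.
    raise err


def contains_no_files(root):
    """Return True if the tree under root holds only (nested) empty folders."""
    for _dirpath, _dirnames, filenames in os.walk(root, onerror=_raise):
        if filenames:
            return False
    return True


def leftover_folders(parent_dir):
    """Return the sorted subfolders of parent_dir that contain no files.

    parent_dir is a hypothetical argument, for example the directory
    that holds the per-user profile folders.
    """
    result = []
    for entry in os.scandir(parent_dir):
        if entry.is_dir(follow_symlinks=False) and contains_no_files(entry.path):
            result.append(entry.path)
    return sorted(result)
```

Running `leftover_folders` periodically and logging the count would show whether the number of leftovers grows with a particular job type.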

 

Thanks for any further ideas!

 

Best,

Michael

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of John M Knoeller
Sent: Wednesday, March 29, 2017 7:56 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Problem with condor-slot and temp folders on windows

 

We have seen this happen when the user’s job creates worker processes that are still alive when the job exits.  HTCondor tries to clean up, but detection of worker processes is imperfect on Windows because Windows doesn’t actually keep track of parent-child relationships between processes.

 

If there is a process that has one of the directories we are trying to delete as its current working directory, or has a file open in that directory, then it is simply not possible to delete the directory without first killing the process.

 

Is there anything in the logs on the execute node that indicates that we tried and failed to delete the execute directory? It’s likely that the problem is caused by a specific job.

 

 

You can use Process Explorer (one of the Sysinternals tools) to identify which processes are keeping the directories from being deleted.
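
For a scripted check along the same lines, the following Python sketch flags processes whose current working directory or open files lie inside a given directory. It is an illustration under assumptions, not something HTCondor ships: `ProcInfo`, `find_blockers`, and `snapshot_with_psutil` are hypothetical names, and the snapshot helper relies on the third-party `psutil` package, which may need elevated rights to inspect processes owned by other users.

```python
import os
from typing import Iterable, Iterator, List, NamedTuple


class ProcInfo(NamedTuple):
    pid: int
    name: str
    cwd: str
    open_files: List[str]


def _is_inside(path: str, directory: str) -> bool:
    """True if path is the directory itself or lies somewhere below it."""
    if not path:
        return False
    path = os.path.normcase(os.path.abspath(path))
    directory = os.path.normcase(os.path.abspath(directory))
    try:
        return os.path.commonpath([path, directory]) == directory
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def find_blockers(directory: str, processes: Iterable[ProcInfo]) -> List[ProcInfo]:
    """Return the processes that would keep directory from being deleted."""
    blockers = []
    for proc in processes:
        holds_cwd = _is_inside(proc.cwd, directory)
        holds_file = any(_is_inside(f, directory) for f in proc.open_files)
        if holds_cwd or holds_file:
            blockers.append(proc)
    return blockers


def snapshot_with_psutil() -> Iterator[ProcInfo]:
    """Yield a ProcInfo for every process that psutil is allowed to inspect."""
    import psutil  # third-party; imported lazily so the rest works without it

    for proc in psutil.process_iter():
        try:
            files = [f.path for f in proc.open_files()]
            yield ProcInfo(proc.pid, proc.name(), proc.cwd() or "", files)
        except psutil.Error:
            # Access denied or the process exited while being inspected.
            continue
```

Calling `find_blockers(path, snapshot_with_psutil())` on the execute node, with `path` set to a directory that refuses to go away, lists the candidate processes. Processes that cannot be inspected are skipped, so an empty result is not proof that nothing holds the directory.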

 

-tj

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Almansour Blanco
Sent: Wednesday, March 29, 2017 10:50 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Problem with condor-slot and temp folders on windows

 

Hello,

 

 

 

We are using Windows 7 and condor version 8.4.9.

When condor runs on the system, it creates condor-slot and TEMP directories in the user home directory.

However, in some cases the condor-slot* directories are not cleaned up when the condor job is done, even though they are empty. They keep accumulating until there are hundreds of them, and at some point condor jobs stop executing on that machine, perhaps because no more folders can be created.

Has anyone faced this problem before? Is there a way to solve this issue and prevent it from happening?

 

 

 

Kind regards

 

Almansour Belleh Blanco