
Re: [Condor-users] Condor jobs leave directories in hosts/*/execute



On Wed, Jul 21, 2004 at 02:03:33PM +0100, Richard Gillman wrote:
> I've just set up Condor 6.6.5 on a Linux cluster. When I run jobs, they 
> apparently complete OK, but when the jobs have completed, there are 
> directories left in the ~condor/hosts/hostname/execute directory.
> 
> #  find ./*/execute -mtime -1
> ./livlae/execute
> ./livlaf/execute
> ./livlaf/execute/dir_20562
> ./livlaf/execute/dir_20567
> ./livlah/execute
> ./livlah/execute/dir_4722
> ./livlai/execute
> #
> 
> The only items in the condor logs that look exceptional are, in 
> StartLog, DEACTIVATE_CLAIM_FORCIBLY and "Error: can't find resource with 
> capability", and in StarterLog.vm2, "ERROR: the submitting host claims 
> to be in our UidDomain (nerc-bidston.ac.uk), yet its hostname (bilag) 
> does not match". I have CONDOR_HOST set to livlae.nerc-bidston.ac.uk; 
> UID_DOMAIN and FILESYSTEM_DOMAIN are both set to nerc-bidston.ac.uk; 
> nslookup on bilag's address gives bilag.nerc-bidston.ac.uk.
> 
> How do I ensure jobs clean up after themselves? Are these messages 
> related? If not, should I worry about them?
> 
> I haven't seen the same problem in a Solaris installation.
> 
> Any suggestions appreciated.
> 

Upgrade to 6.6.6. Older 6.6.x releases have some known problems with
cleaning up the execute directories.
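
In the meantime, something like the following can show which job
directories have been left behind. This is only a rough sketch, not a
Condor tool: the function name list_leftover_dirs and its two arguments
are made up for illustration, and it assumes the hosts/<hostname>/execute
layout and the dir_* naming from your find output. It only lists
directories and never removes anything.

```shell
#!/bin/sh
# Sketch: list dir_* directories under each <hosts_root>/<host>/execute
# whose modification time is more than <min_age_days> days old.
# list_leftover_dirs, hosts_root and min_age_days are hypothetical names.
list_leftover_dirs() {
    hosts_root=$1
    min_age_days=${2:-1}

    for exec_dir in "$hosts_root"/*/execute; do
        # Skip the unexpanded glob if no host directories exist.
        [ -d "$exec_dir" ] || continue

        # -prune keeps find from descending into job directories;
        # only the top-level dir_* entries that are old enough are printed.
        find "$exec_dir" -type d -name 'dir_*' -prune \
            -mtime +"$min_age_days" -print
    done
}
```

For example, list_leftover_dirs ~condor/hosts 1 would print the dir_*
directories last modified more than a day ago. Check that no job is still
running on a machine before treating a listed directory as stale.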


-Erik