
Re: [HTCondor-users] Files in /var/lib/condor/execute/ filling up disk



Do you know when or why the HTCondor daemons stopped running?  

Hopefully that information is in the daemon logs. Run

condor_config_val STARTD_LOG STARTER_LOG

to find where the StartdLog and StarterLog.* files are.
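
If you want to script that lookup, here is a minimal Python sketch. It assumes condor_config_val is on the PATH and prints one value per line, in the same order as the names given on the command line. The helper names parse_config_values and query_log_paths are made up for illustration and are not part of HTCondor.

```python
import subprocess


def parse_config_values(names, output):
    """Pair each requested config name with the matching output line.

    Assumes one value per line, in the order the names were requested.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if len(lines) != len(names):
        raise ValueError(
            f"expected {len(names)} values, got {len(lines)}: {lines!r}"
        )
    return dict(zip(names, lines))


def query_log_paths(names=("STARTD_LOG", "STARTER_LOG")):
    """Run condor_config_val for the given names and return {name: path}."""
    result = subprocess.run(
        ["condor_config_val", *names],
        capture_output=True,
        text=True,
        check=True,  # raise if condor_config_val reports an error
    )
    return parse_config_values(names, result.stdout)
```

Calling query_log_paths() on the affected machine should give you the two paths to look at; the StarterLog path is a base name, with per-slot files next to it.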

The condor_startd daemon should clean up the execute directory when it starts up,
as well as when a job exits or crashes and leaves files behind in the execute directory.
condor_vacate should not be necessary in this situation. In fact, I would expect
condor_vacate to do nothing in this situation.

If the condor_startd was hard killed by some external process or user, it is plausible
that this would leave files behind in execute until the HTCondor daemons had a chance to run
again. But if the HTCondor daemons did an orderly shutdown, then it is a bug that
the execute directory was not cleaned up as part of shutdown. Please let us know.
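
To see whether anything is still sitting in the execute directory after the daemons are back up, something like the following Python sketch could help. It only reports sizes and deletes nothing. The function names are made up for illustration; the path /var/lib/condor/execute in the usage note is the one from the original message and may differ on other installs.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files under path, without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                pass
    return total


def leftover_entries(execute_dir):
    """Return (name, size_in_bytes) for each top-level entry, largest first."""
    report = []
    for entry in os.scandir(execute_dir):
        if entry.is_dir(follow_symlinks=False):
            size = dir_size_bytes(entry.path)
        else:
            size = entry.stat(follow_symlinks=False).st_size
        report.append((entry.name, size))
    report.sort(key=lambda item: item[1], reverse=True)
    return report
```

For example, looping over leftover_entries("/var/lib/condor/execute") and printing each pair shows which entries take the most space. Entries that remain long after the startd has restarted and no jobs are running would be worth reporting along with the logs.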

-tj

-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of John G Heim
Sent: Tuesday, June 11, 2019 11:03 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Files in /var/lib/condor/execute/ filling up disk

This morning I found that 2 of the machines in my condor cluster were 
essentially down. It turns out that it was because the disk was full and 
that was because there were several hundred gigabytes of files in 
/var/lib/condor/execute/. I guessed that condor_vacate would remove them 
but condor_vacate returned an error message indicating that the condor 
daemon was not running. I had to clear some space on the disk before I 
could restart the condor daemons. At that point condor_vacate worked and 
the function of the machines returned to normal.

I would like to keep this from happening again. Any ideas on how?

-- 
John G. Heim; jheim@xxxxxxxxxxxxx; 608-263-4189