[HTCondor-users] Jobs evicted after completion



Hi,

We have run into a weird case: jobs are submitted (through flocking), the input files are transferred, the job runs fine and completes, and the output files are transferred. Just as "Finished transferring output files" is reported in the log file, the job gets evicted with no output. I looked into resource usage, and the jobs use far less than what is requested, so it's not that. The shadow logs only tell me that the process "exited with status 102", which, as far as I know, means it was killed. The eviction messages are "RemoteResource::killStarter(): Could not send command to startd" and "logEvictEvent with unknown reason (108), not logging."
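
In case it helps anyone check their own job logs for the same pattern, here is a rough sketch of a scan for it. It assumes the plain-text job event log format, where each event starts with a three-digit code and a "(cluster.proc.subproc)" job ID, with 040 as the file transfer event and 004 as the eviction event. The function name and the sample lines are made up for illustration and are not taken from our actual logs.

```python
import re

# Event header, e.g. "040 (1234.000.000) 03/01 12:05:02 Finished transferring output files"
EVENT_HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.\d+\)")


def evicted_after_output_transfer(lines):
    """Return job IDs ("cluster.proc") whose eviction event directly follows
    a "Finished transferring output files" event for the same job."""
    last_was_output_done = {}  # job ID -> True if its latest event was the finished transfer
    hits = []
    for line in lines:
        m = EVENT_HEADER.match(line)
        if not m:
            continue  # event body or "..." separator line, not an event header
        code = m.group(1)
        job = f"{int(m.group(2))}.{int(m.group(3))}"
        if code == "004" and last_was_output_done.get(job):
            hits.append(job)
        last_was_output_done[job] = (
            code == "040" and "Finished transferring output files" in line
        )
    return hits


# Hypothetical sample lines, only to show the expected shape of the input.
sample = [
    "001 (1234.000.000) 03/01 12:00:00 Job executing on host: <10.0.0.1:9618>",
    "040 (1234.000.000) 03/01 12:05:00 Started transferring output files",
    "...",
    "040 (1234.000.000) 03/01 12:05:02 Finished transferring output files",
    "...",
    "004 (1234.000.000) 03/01 12:05:02 Job was evicted.",
    "...",
]
print(evicted_after_output_transfer(sample))
```

To run it on a real log, pass the open file object instead of the sample list, e.g. evicted_after_output_transfer(open("job.log")), where "job.log" is whatever the submit file names as its log.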

We recently moved the spool directory to a non-standard location due to disk space issues, but it's been working fine. Any ideas of what might be going on? Appreciate the help.

Best,

-Jacek

--
Jacek Kominek, PhD
University of Wisconsin-Madison
1552 University Avenue, Wisconsin Energy Institute 4154 Madison, WI
53726-4084, USA
jkominek@xxxxxxxx