Re: [HTCondor-users] Jobs evicted after completion



You haven't mentioned which version you are running. Your problem description matches one of the bugs fixed in the HTCondor 9.0.6 LTS release.

Take a look at the last bug in the list for version 9.0.6.

https://htcondor.readthedocs.io/en/v9_0/version-history/stable-release-series-90.html#version-9-0-6
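If it helps to check this across several machines, here is a minimal sketch that reads the version out of the banner printed by `condor_version` and tells you whether a 9.0.x install is at or past 9.0.6. The helper names and the sample banner strings are made up for illustration, and the banner layout (`$CondorVersion: X.Y.Z ... $`) is assumed from typical output, so adjust the pattern if yours looks different.

```python
import re


def parse_condor_version(banner):
    """Return (major, minor, patch) from condor_version output.

    Assumes the banner contains a line like
    '$CondorVersion: 9.0.5 Aug 18 2021 BuildID: ... $'.
    """
    match = re.search(r"\$CondorVersion:\s+(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no $CondorVersion line found in banner")
    return tuple(int(part) for part in match.groups())


def has_9_0_6_fix(banner):
    """True/False for the 9.0 LTS series, None for any other series."""
    major, minor, patch = parse_condor_version(banner)
    if (major, minor) != (9, 0):
        # Other release series: check that series' own version history.
        return None
    return patch >= 6


# Hypothetical banner text, for illustration only.
sample = "$CondorVersion: 9.0.5 Aug 18 2021 BuildID: 000000 $"
print(parse_condor_version(sample), has_9_0_6_fix(sample))
```

Paste the output of `condor_version` from the submit and execute hosts in place of `sample`; a `False` result means that host predates the 9.0.6 fixes.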

...Tim
--
Tim Theisen (he, him, his)
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Jacek Kominek via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Tuesday, May 10, 2022 12:42 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: JACEK KOMINEK <jkominek@xxxxxxxx>
Subject: [HTCondor-users] Jobs evicted after completion
 
Hi,

We ran across a weird case: jobs get submitted (through flocking),
input files get transferred, the job runs fine and completes, output
files are transferred, and just as "Finished transferring output
files" is reported in the log file, the job gets evicted with no output.
I looked into resource usage, and the jobs use far less than what is
requested, so it's not that. The shadow logs only tell me that the
process "exited with status 102", which means they got killed, from what
I know, and the eviction says "RemoteResource::killStarter(): Could not
send command to startd" and "logEvictEvent with unknown reason (108),
not logging."

We recently moved the spool directory to a non-standard location due to
disk space issues, but it's been working fine. Any ideas of what might
be going on? Appreciate the help.

Best,

-Jacek

--
Jacek Kominek, PhD
University of Wisconsin-Madison
1552 University Avenue, Wisconsin Energy Institute 4154 Madison, WI
53726-4084, USA
jkominek@xxxxxxxx

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/