Re: [Condor-users] Unable to re-submit dag rescue file



Michael Remijan wrote:
I've just run into a major problem. We're running this version of condor:

    $CondorVersion: 6.6.7 Oct 11 2004 $
    $CondorPlatform: I386-LINUX_RH9 $

My job failed and produced a rescue dag. When I try to submit the rescue dag [...] I get the following message:

Can't open command file DONE for reading: No such file or directory

Michael,


This is a known bug in the Condor 6.7.0 development release of condor_dagman, which is the version reported in the dagman debugging log you sent me privately:

11/13 17:07:19 ******************************************************
11/13 17:07:19 ** condor_scheduniv_exec.599.0 (CONDOR_DAGMAN) STARTING UP
11/13 17:07:19 ** $CondorVersion: 6.7.0 Apr 27 2004 $
11/13 17:07:19 ** $CondorPlatform: I386-LINUX-RH9 $

The problem is fixed in later 6.7 series releases, including the latest, 6.7.2.


Since you already appear to be running a hybrid install of 6.6.7 and 6.7.0, try upgrading just your condor_dagman binary to the one from the latest 6.7.2 release. That should resolve the problem, but let us know if it doesn't. :)
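
If it helps to confirm which condor_dagman version actually ran, before and after the upgrade, a rough sketch like the one below could scan the dagman debugging log for its startup banner. The function names are hypothetical, the banner format is assumed to match the log excerpt quoted above, and treating only 6.7.0 as affected is an assumption drawn from this thread rather than a complete list of affected releases.

```python
import re

# Matches the startup banner, e.g. "** $CondorVersion: 6.7.0 Apr 27 2004 $".
# Assumption: the banner always carries a three-part numeric version.
VERSION_RE = re.compile(r"\$CondorVersion:\s*(\d+)\.(\d+)\.(\d+)\b")


def dagman_versions(log_text):
    """Return each (major, minor, patch) version found in the log, in order."""
    return [tuple(int(part) for part in match.groups())
            for match in VERSION_RE.finditer(log_text)]


def is_affected(version):
    """Assumption: only the 6.7.0 development release has the rescue DAG bug."""
    return version == (6, 7, 0)


# Sample lines copied from the log excerpt in this message.
sample_log = (
    "11/13 17:07:19 ** condor_scheduniv_exec.599.0 (CONDOR_DAGMAN) STARTING UP\n"
    "11/13 17:07:19 ** $CondorVersion: 6.7.0 Apr 27 2004 $\n"
    "11/13 17:07:19 ** $CondorPlatform: I386-LINUX-RH9 $\n"
)

for version in dagman_versions(sample_log):
    status = "affected" if is_affected(version) else "not affected"
    print(".".join(str(n) for n in version), status)
```

Reading the version from the log rather than from the installed binaries matters here, because in a hybrid install the two can disagree.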

Thanks!

-Peter

--
Peter Couvares                        University of Wisconsin-Madison
Condor Project Research               Department of Computer Sciences
pfc@xxxxxxxxxxx                       1210 W. Dayton St. Rm #4241
(608) 265-8936                        Madison, WI 53706-1685