Re: [HTCondor-users] DAGs and job ID mismatch
- Date: Thu, 24 Jan 2013 11:37:54 -0600
- From: Nathan Panike <nwp@xxxxxxxxxxx>
On Thu, Jan 24, 2013 at 09:49:14AM -0700, Michael O'Donnell wrote:
> HTCondor 7.8.7 on submit machine (Windows)
> I have submitted 5 different DAGs on the same submit machine. Some of these
> DAGs completed with failures, and a rescue file was generated. I then
> resubmitted the DAGs that had failures, but no jobs ran. The DAG log reports
> that the job ID in the userlog does not match the previously reported ID:
> ERROR: node j806: job ID in userlog submit event (917.0.0) doesn't match ID
> reported earlier by submit command (1099.0.0)! Aborting DAG; set
> DAGMAN_ABORT_ON_SCARY_SUBMIT to false if you are *sure* this shouldn't
> cause an abort.
> I could combine all these into a single DAG and throttle the maximum number
> of jobs, but I did not do this. Is this behavior intended or is it possibly
> a bug?
Looks like a bug. Can you send the .dagman.out file to me and to wenger@xxxxxxxxxxx?
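
Before sending the file, it may help to pull out every node that hit this error. The following is only a rough sketch, not part of HTCondor: the function name find_id_mismatches is hypothetical, and the pattern assumes the message wording is exactly as quoted above, which may differ in other versions.

```python
import re

# Pattern built from the error text quoted in this thread (assumption:
# other HTCondor versions may word the message differently).
MISMATCH_RE = re.compile(
    r"ERROR: node (?P<node>\S+): job ID in userlog submit event "
    r"\((?P<logged>[\d.]+)\) doesn't match ID reported earlier by "
    r"submit command \((?P<submitted>[\d.]+)\)"
)


def find_id_mismatches(text):
    """Return (node, userlog_id, submit_id) tuples found in the given text."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(text.split())
    return [
        (m["node"], m["logged"], m["submitted"])
        for m in MISMATCH_RE.finditer(flat)
    ]
```

To use it, read the contents of the DAG's .dagman.out file into a string and pass that string to find_id_mismatches. Each tuple gives the node name, the job ID seen in the userlog, and the job ID the submit command reported.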