Re: [HTCondor-users] DAGs and job ID mismatch



On Thu, Jan 24, 2013 at 09:49:14AM -0700, Michael O'Donnell wrote:
> HTCondor 7.8.7 on submit machine (Windows)
> 
> I have submitted 5 different DAGs on the same submit machine. Some of these
> DAGs completed with failures, and a rescue file was generated for each. I
> then resubmitted the DAGs that had failures, but no jobs ran. The DAG log
> reports that the job ID in the userlog does not match the previously
> reported ID:
> 
> ERROR: node j806: job ID in userlog submit event (917.0.0) doesn't match ID
> reported earlier by submit command (1099.0.0)!  Aborting DAG; set
> DAGMAN_ABORT_ON_SCARY_SUBMIT to false if you are *sure* this shouldn't
> cause an abort.
> 
> I could have combined all of these into a single DAG and throttled the
> maximum number of jobs, but I did not. Is this behavior intended, or is it
> possibly a bug?
> 
Looks like a bug.  Can you send the .dagman.out file to me and to wenger@xxxxxxxxxxx?

Nathan Panike