
[HTCondor-users] DAGs and job ID mismatch

HTCondor 7.8.7 on submit machine (Windows)


I have submitted 5 different DAGs on the same submit machine. Some of these DAGs finished with failures, and rescue files were generated. I then resubmitted the DAGs that had failures, but no jobs run. The DAG log reports that the job ID in the userlog does not match the ID reported earlier by the submit command:


ERROR: node j806: job ID in userlog submit event (917.0.0) doesn't match ID reported earlier by submit command (1099.0.0)!  Aborting DAG; set DAGMAN_ABORT_ON_SCARY_SUBMIT to false if you are *sure* this shouldn't cause an abort.


I could combine all these into a single DAG and throttle the maximum number of jobs, but I did not do this. Is this behavior intended or is it possibly a bug?


Thank you for your help,