Ian Chesal wrote:
So the first question is:
Did you delete the $(SPOOL) directory for the scheduler or the
contents of that directory or the job_queue.log files? If so, you
reset the cluster ID counter and that's why you've got duplicates.
If you're certain you haven't wiped the job_queue.log file for the
scheduler, is it possible you have multiple schedulers writing to the
same history file? If so: that's bad.
Or perhaps you have multiple schedds writing to the same job_queue.log
file? That would also be really bad.
> Each scheduler should have its own
> history file.
I would state a superset of the above: each schedd should have its own
private log and spool subdirectory.
In any event, I think you can reset the next job id Condor assigns by
shutting down your schedd (condor_off -schedd) and appending the
following to the end of the spool/job_queue.log file:
105
103 0.0 NextClusterNum xxxxx
106
where xxxxx is the next job cluster id you want to be assigned. Then
turn your schedd back on (condor_on -schedd). Note that I haven't tried
this recipe, so buyer beware. And if you haven't fixed the
underlying reason the job ids got reused, it may happen again...
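
For anyone who would rather script the append step than edit the file by
hand, here is a minimal Python sketch. It only reproduces the three lines
quoted above; the function names are made up for illustration, and the
backup copy is an extra precaution that is not part of the recipe. Like
the recipe itself it is untested against a real schedd, so treat it as a
starting point and run it only while the schedd is shut down.

```python
from pathlib import Path


def next_cluster_records(next_cluster: int) -> str:
    """Build the three log lines from the message for the given cluster id."""
    if next_cluster < 1:
        raise ValueError("cluster id must be a positive integer")
    return f"105\n103 0.0 NextClusterNum {next_cluster}\n106\n"


def append_next_cluster(log_path: Path, next_cluster: int) -> None:
    """Append the records to log_path after saving a .bak copy beside it."""
    original = log_path.read_bytes()
    # Precaution added for this sketch: keep the untouched log for rollback.
    backup = log_path.with_name(log_path.name + ".bak")
    backup.write_bytes(original)
    with log_path.open("ab") as log:
        # Start on a fresh line if the existing log lacks a trailing newline.
        if original and not original.endswith(b"\n"):
            log.write(b"\n")
        log.write(next_cluster_records(next_cluster).encode("ascii"))
```

Usage would be something like
append_next_cluster(Path("/path/to/spool/job_queue.log"), 5000), where both
the path and the id are placeholders for your own values.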
Hope the above helps
Todd