
Re: [Condor-users] relation between schedd and dagman



Thanks.

Unfortunately, I removed the file. What you are saying makes sense.

Also, is it possible to force a rescue DAG?

On Mon, Aug 9, 2010 at 10:44 AM, R. Kent Wenger <wenger@xxxxxxxxxxx> wrote:
> On Mon, 9 Aug 2010, Mag Gam wrote:
>
>> Is there any relationship between dagman and schedd? I accidentally
>> stopped schedd on my central manager host then I quickly restarted it
>> after 45 seconds. All jobs are ok but it seems the DAGMan job started
>> from the beginning again. Just curious if there is a relation.
>
> Yes, DAGMan is run as a scheduler universe job, so it is closely tied to the
> schedd.  If you stop and restart the schedd, that will stop and restart
> DAGMan as well.  However, when DAGMan restarts, it should go into recovery
> mode, not restart from the beginning.  Can you send your dagman.out file?
>  I'd like to check what actually happened.
>
> Kent Wenger
> Condor Team