
Re: [HTCondor-users] [PATCH] Speeding up condor_dagman submission



On 10/08/2015 02:55, R. Kent Wenger wrote:

Ah, the fundamental thing is this: we want to avoid having two instances of DAGMan simultaneously running on the same DAG. This will goof things up because the two DAGMans will be using the same log for their node jobs, and the events will get mixed together.

So, to avoid this, each DAGMan writes a lock file (which contains its UniquePID information) when it starts up. Before writing it, DAGMan checks whether a lock file already exists. If it does, DAGMan tries to read the UniquePID info from the lock file. If that succeeds, and the corresponding process is still alive, DAGMan says, "Oops, there's another DAGMan already running on this DAG", and exits. If DAGMan can't read the UniquePID info, or that process no longer exists, DAGMan assumes that an earlier instance of DAGMan was running on that DAG but has since gone away, so the just-started DAGMan continues in recovery mode.
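
For illustration, here is a minimal sketch of that lock-file check in C++, assuming a POSIX system. The file name and the use of a bare numeric PID are assumptions made for the example; the real DAGMan lock file stores richer UniquePID information and the real code has more careful error handling. This is not DAGMan's actual implementation, just the shape of the logic described above.

    // Sketch of the "is another DAGMan already running?" check.
    // Assumptions: lock file holds only a numeric PID; POSIX kill() is available.
    #include <fstream>
    #include <iostream>
    #include <signal.h>
    #include <unistd.h>
    #include <sys/types.h>

    int main() {
        const char *lockPath = "example.dag.lock";  // hypothetical lock file name

        pid_t lockedPid = 0;
        std::ifstream lock(lockPath);
        if (lock >> lockedPid) {
            // kill(pid, 0) delivers no signal; it only reports whether the
            // process exists (returns 0 if it does).
            if (kill(lockedPid, 0) == 0) {
                std::cerr << "Another DAGMan (PID " << lockedPid
                          << ") is already running on this DAG -- exiting.\n";
                return 1;
            }
            std::cerr << "Stale lock file (PID " << lockedPid
                      << " no longer exists) -- continuing in recovery mode.\n";
        }
        lock.close();

        // Claim the lock by writing our own PID.
        std::ofstream out(lockPath, std::ios::trunc);
        out << getpid() << "\n";

        // ... run the DAG (in recovery mode if a stale lock was found) ...
        return 0;
    }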

Hopefully that all makes sense...

It does indeed. Thank you!