Re: [Condor-users] Condor Error No locks available



On Tue, 13 Jun 2006, lohit wrote:

> - Would it be possible for you to move the node job log files themselves
>   off of NFS?  That's probably the first thing to try.
> I tried this. I generated the sub file using the condor_submit_dag -no_submit
> command and then edited the file to point all log and lock files to local disk
> on the node. Yet I am seeing this error. Here is my edited
> run.dag.condor.sub file:
>
> # Filename: run.dag.condor.sub
> # Generated by condor_submit_dag run.dag
> universe        = vanilla
> executable      = /home/usr1/condor/bin/condor_dagman
> getenv          = True
> output          = run.dag.lib.out
> error           = run.dag.lib.out
> log             = run.dag.dagman.log
> remove_kill_sig = SIGUSR1
> arguments       = -f -l . -Debug 3 -Lockfile /scratch/usr1/run.dag.lock -Dag run.dag -Rescue /scratch/usr1/run.dag.rescue -Condorlog /scratch/usr1/run.dag.dummy_log
> environment    = _CONDOR_DAGMAN_LOG=run.dag.dagman.out;_CONDOR_MAX_DAGMAN_LOG=0
> queue

Actually, to move all of the log files off of NFS, you need to edit
the submit files for each individual node, not just the submit file
for DAGMan itself.

Given the error message you got, I think that the lock is failing when
DAGMan is trying to read a node job user log, so changing DAGMan's own
log file doesn't help.
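
As a rough sketch, a node job's submit file with its user log moved off
of NFS might look something like the following. The file name nodeA.sub,
the executable, and the log file name are hypothetical, and I'm assuming
/scratch/usr1 (taken from your edited DAGMan submit file) is a local
directory on the submit machine:

```
# Hypothetical node submit file: nodeA.sub
universe        = vanilla
# Placeholder for the node's real executable
executable      = nodeA.sh
# The "log" line is the one that matters here: it is the node job user
# log that DAGMan reads, so it should point at local disk, not NFS.
log             = /scratch/usr1/nodeA.log
queue
```

The same change to the "log" line would be needed in the submit file of
every node in the DAG.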

Kent Wenger
Condor Team