[Condor-users] SPOOL file clash with multiple submitters



Hello All,

I am trying to set up multiple schedulers on our SMP central manager/submit
host along the lines suggested by Cycle Computing
(see http://www.cyclecomputing.com/wiki/index.php?title=Running_Multiple_Condor_Schedds).

This seemed to be working well until I noticed a clash between the
checkpoint files of jobs from one schedd and those of another. As far as I
can see, job IDs are not unique across separate queues, so if a user of one
scheduler has a checkpointed job with, say, ID 3.1, its checkpoint files will be in

$(SPOOL_ROOT)/3/1/cluster...

But then another user on another schedd has a job with the same ID 3.1, and it
attempts to use the same directory, which fails because of file permissions.

I've configured Condor with 

SPOOL_ROOT      = /condor_scratch/spool

SCHEDD1               = $(SBIN)/condor_schedd1
SCHEDD1_ARGS          = -f -local-name Q1
SCHEDD1_LOG           = $(LOG)/ScheddLog.1
SCHEDD.Q1.SCHEDD_NAME = Q1@$(HOSTNAME)
SCHEDD.Q1.SPOOL       = $(SPOOL_ROOT)/schedd1
SCHEDD.Q1.SCHEDD_LOG  = $(SCHEDD1_LOG)

SCHEDD2               = $(SBIN)/condor_schedd2
SCHEDD2_ARGS          = -f -local-name Q2
SCHEDD2_LOG           = $(LOG)/ScheddLog.2
SCHEDD.Q2.SCHEDD_NAME = Q2@$(HOSTNAME)
SCHEDD.Q2.SPOOL       = $(SPOOL_ROOT)/schedd2
SCHEDD.Q2.SCHEDD_LOG  = $(SCHEDD2_LOG)

...etc

but the checkpoint files always seem to get written under the common $(SPOOL)
directory rather than the separate ones, causing the clash.
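
To make the clash concrete, here is a minimal Python sketch. It is not Condor
code: the helper name ckpt_dir is hypothetical, and the <spool>/<cluster>/<proc>
layout is only an assumption based on the path shown above. It shows that job
3.1 from two schedds maps to the same directory when both resolve the common
spool, and to different directories when each uses its own SCHEDD.Qn.SPOOL value.

```python
from pathlib import PurePosixPath


def ckpt_dir(spool: str, job_id: str) -> PurePosixPath:
    """Hypothetical helper: map a job ID such as '3.1' to <spool>/<cluster>/<proc>.

    The layout is assumed from the path quoted in this post, not taken from
    Condor's source.
    """
    cluster, proc = job_id.split(".")
    return PurePosixPath(spool) / cluster / proc


SPOOL_ROOT = "/condor_scratch/spool"

# Both schedds resolve the common spool: job 3.1 lands in one shared directory.
shared_q1 = ckpt_dir(SPOOL_ROOT, "3.1")
shared_q2 = ckpt_dir(SPOOL_ROOT, "3.1")

# Each schedd uses its own spool (the values given to SCHEDD.Q1.SPOOL and
# SCHEDD.Q2.SPOOL above): the directories no longer overlap.
separate_q1 = ckpt_dir(SPOOL_ROOT + "/schedd1", "3.1")
separate_q2 = ckpt_dir(SPOOL_ROOT + "/schedd2", "3.1")

print("shared spool clash:  ", shared_q1 == shared_q2)
print("separate spool clash:", separate_q1 == separate_q2)
```

The first comparison is the situation I am seeing; the second is what I expected
the per-schedd SPOOL settings to give.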

Interestingly, Condor does seem to put these files in the individual directories
(not the common spool area):

job_queue.log  job_queue.log.1  local_univ_execute  spool_version

so it seems to be aware of SCHEDD.Q1.SCHEDD_LOG if not SCHEDD.Q2.SPOOL

If I take out the default spool/ directory and remove the $(SPOOL) definition,
the negotiator fails on startup. Since there's only one negotiator, I would
expect it to use a common directory?

Any suggestions would be very useful.

thanks in advance,

-ian.

---------------------------------------
Dr Ian C. Smith,
Advanced Research Computing,
University of Liverpool.