
[HTCondor-users] submitted jobs and disconnect



Hi all,

Maybe this is a noob question...

I am trying to set up an HTCondor pool with several desktop PCs and notebooks in
one subnet. Submitting jobs should be possible from every host. The desktop
workstations should also act as execute hosts.

The problem is that the desktop machines may go into hibernation when not needed.
If this happens, every job submitted from these PCs is lost, because their
condor_shadow daemon also disappears. The same happens to the notebooks if they
disconnect from our LAN for a longer time.

OK, to avoid this I tried to configure every host so that our condor central
manager is the only schedd host in the pool:

CONDOR_HOST = <condor central manager>
SCHEDD_HOST = $(CONDOR_HOST)

This way all jobs are started from the condor central manager, even though they
are actually submitted from another host.
But with this setup I cannot transfer any locally stored files to jobs,
especially from notebooks. The condor central manager (schedd host) rightly
reports an error about not being able to find the remotely stored files.
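
As an illustration of the failure, consider a minimal submit description file
like the following sketch. The executable name and all paths are hypothetical
placeholders. The point is only that transfer_input_files names a path on the
submitting notebook, which the schedd on the central manager cannot see.

```
# Sketch of a submit description file (all names are placeholders)
executable              = my_job.sh
# This path exists only on the submitting notebook, not on the schedd host
transfer_input_files    = /home/user/data/input.dat
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = job.out
error                   = job.err
log                     = job.log
queue
```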

What can I do to satisfy my requirements?
Is there a simple solution for me?
Or must every user transfer his files to the condor central manager manually
and start his jobs from there?

Any hint will be appreciated.
Best,
Werner

-- 

-----------------------------------------------------------------------
Werner Hack              
Universität Ulm  
Institut für Nachrichtentechnik 
-----------------------------------------------------------------------

