
[HTCondor-users] Problems setting up condor on local node, jobs do not start



Hi condor-users,

I am trying to set up Condor 8.1.0 on a local Ubuntu 12.04 cluster and am running into quite a few problems, even on a single-node setup with fairly standard config files. Here is what I have found so far:

- 	ps -efwwww | grep condor_ gives:
condor   21958     1  0 13:05 ?        00:00:00 /usr/sbin/condor_master -pidfile /var/run/condor/condor.pid
root     21961 21958  0 13:05 ?        00:00:00 condor_procd -A /var/run/condor/procd_pipe -L /var/log/condor/ProcLog -R 10000000 -S 60 -C 124
condor   21962 21958  0 13:05 ?        00:00:00 condor_collector -f
condor   21963 21958  0 13:05 ?        00:00:00 condor_negotiator -f
condor   21964 21958  0 13:05 ?        00:00:00 condor_schedd -f
condor   21965 21958  0 13:05 ?        00:00:00 condor_startd -f

- 	condor_status returns nothing with the vanilla config files; I had to set ALLOW_WRITE = * to get any nodes to appear. Even setting the machine's own IP manually did not work. With ALLOW_WRITE = * I can continue, although this is not really satisfactory.
- 	Submitting test jobs does not work. The jobs are listed in condor_q as idle, although I have 8 available nodes to run them.
- 	Running condor_q -analyze shows me that they have not been considered by the matchmaker; checking NegotiatorLog gives me:
	condor_read() failed: recv(fd=8) returned -1, errno = 104 Connection reset by peer, reading 5 bytes from collector
-	If I change
	ALLOW_NEGOTIATOR = $(CONDOR_HOST), $(IP_ADDRESS) -> ALLOW_NEGOTIATOR = *
	jobs seem to start, but then I get:
 	
 	Error from slot3@mynodename: Failed to open 'myhomedir/testjob/first.job.10.2.out' as standard output: Permission denied (errno 13)
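
Written out as a config fragment, the two overrides described above would look roughly like this (a sketch only; which config file they live in is not stated here and is assumed to be the local one):

	# grant write-level access to any host; without this condor_status shows nothing
	ALLOW_WRITE = *
	# previous value: ALLOW_NEGOTIATOR = $(CONDOR_HOST), $(IP_ADDRESS)
	ALLOW_NEGOTIATOR = *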

Any ideas on how to fix this?
Thanks, alex

Remark:
I chose the 8.1.0 development channel because 8.0.2 still ships Python 2.6 bindings, which I could not provide on Ubuntu 12.04 without further hassle. I did eventually go through that hassle, however, and it does not change the behaviour described above.