First of all, a warning: I am quite new to HTCondor, so this might just be me not knowing where to look for information. Related links are most welcome!
I have successfully set up HTCondor on our workstation running Scientific Linux 6 (based on RHEL 6), using the yum repositories. This was a very easy procedure, so thanks to whoever maintains and builds this!
Now I am trying to attach my MacBook (for testing; once I know how, I want to attach a few Mac Pros we have in various offices), but I am not succeeding. I also find the explanations of how to join two computers into a cluster slightly confusing, perhaps because most of the time this works more or less automagically. For example, I did not quite get which machines need write access where, and I did not quite understand how they communicate (other than that port 9618 is used).
After failing for about half a day, I figured it is better to reach out to the community to see if someone else can suggest what I am doing wrong.
Second warning: I am also not very experienced with how OS X is structured; I am much more accustomed to a Linux environment. This might very well be the root cause of my problems.
The workstation is simply set up using yum; I did not configure anything specifically. This seems to work fine, at least for the one machine. I tried to set a few things in /etc/condor/condor_config.local without much success. What do I need to configure on the host machine in order to allow other machines to be added to the pool? Currently I have only set "ALLOW_WRITE" to allow everything from inside our domain.
On the MacBook, I installed from the tarball, using a separate user "condor". I first tried a local install with the command:
$ ./condor_configure --install
This seemed to work fine; I could see with "condor_status" that my MacBook had four open slots.
I then deleted all of that, and installed again with
$ ./condor_configure --install --central-manager=<server hostname> --type=execute --verbose
This did not really seem to work. With "condor_status" I see the slots available on the server machine, but the slots from my MacBook are never added.
My thought then was that there should be some errors written to the logfiles somewhere (on the mac):
- In MasterLog everything seemed fine (to my eyes). The last line states that condor_startd is started as a daemoncore process.
- I'm not sure what I am expected to see in ProcLog, but I don't see anything suspicious
- In StartLog it all looks good: according to this log the four slots are allocated, benchmarked, and currently idle.
- StarterLog complains about ~condor/local.$HOST/config not existing, but I think that is just an optional folder for extra configuration files?
My next thought was that the server should have some relevant information in its logfiles:
- I grep for my MacBook hostname in all files, nothing comes up. Same for my MacBook's IP address
- I grep for warning (ignorecase), nothing comes up. I grep for error, I only get:
$ grep -i error *
CollectorLog:11/24/14 13:44:40 Daemon Log is logging: D_ALWAYS D_ERROR
MasterLog:11/24/14 13:44:40 Daemon Log is logging: D_ALWAYS D_ERROR
NegotiatorLog:11/24/14 13:44:41 Daemon Log is logging: D_ALWAYS D_ERROR D_MATCH
SchedLog:11/24/14 13:44:41 (pid:19011) Daemon Log is logging: D_ALWAYS D_ERROR
StartLog:11/24/14 13:44:41 Daemon Log is logging: D_ALWAYS D_ERROR
StartLog:11/24/14 13:44:47 VM-gahp server reported an internal error
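The searches above can also be combined into a single pass over the log directory. This is only a rough sketch: `search_logs` is a helper name I made up, and the directory and patterns in the usage line are placeholders rather than my real values.

```shell
#!/usr/bin/env bash
# search_logs DIR PATTERN...
# Case-insensitive, fixed-string search of every file under DIR for any of
# the given patterns. Hypothetical helper, not part of HTCondor.
search_logs() {
    local dir="$1"; shift
    local args=() p
    for p in "$@"; do
        args+=(-e "$p")          # one -e per pattern, so any of them may match
    done
    # -r recurse, -i ignore case, -F literal strings, -H always show file name
    grep -r -i -F -H "${args[@]}" "$dir"
}

# Example call (placeholders, adjust to the actual log directory and names):
# search_logs /var/log/condor "<macbook hostname>" "<macbook ip>" warning
```

The function returns grep's exit status, so a non-zero status means none of the patterns appeared in any file.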
If I were to guess, I suspect the communication is not working properly (maybe the port is not correctly opened on my Mac). However, I am confused since I am not seeing any error messages anywhere. Further, "condor_status" correctly gets information about the server slots when executed from my laptop. I have tried various settings and configurations on both machines, but nothing has gotten me closer to a solution.
Any pointers to documentation on related issues, suggestions about what I might be doing wrong, or requests for more details you need in order to help are most welcome!
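To test the port suspicion directly, a plain TCP connection attempt could be made in each direction between the two machines. The sketch below assumes a bash built with /dev/tcp support; `port_open` is a made-up helper name, and it only shows whether a TCP connection can be opened, not whether HTCondor authorizes the peer or whether UDP traffic gets through.

```shell
#!/usr/bin/env bash
# port_open HOST PORT
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device (an assumption about the local bash).
# Note: there is no timeout here, so a silently filtered port may block for a
# while before the connection attempt gives up.
port_open() {
    local host="$1" port="$2"
    # Open the socket in a subshell so descriptor 3 is closed again on exit.
    ( exec 3<>"/dev/tcp/${host}/${port}" ) 2>/dev/null
}

# Example call (placeholder host name, 9618 is the port mentioned earlier):
# port_open "<server hostname>" 9618 && echo "port 9618 reachable"
```

Since condor_status on the laptop already reaches the server, the more informative run is probably from the server towards the MacBook.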
Thanks and cheers,
Some potentially useful information:
server$ uname -a
Linux <hostname> 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux
macbook% uname -a
Darwin <host> 14.0.0 Darwin Kernel Version 14.0.0: Fri Sep 19 00:26:44 PDT 2014; root:xnu-2782.1.97~2/RELEASE_X86_64 x86_64
The MacBook is running Yosemite. Both machines are running HTCondor version 8.2.4. The server has 16 cores, which gives a total of 32 slots. The laptop has two cores, so four slots.