
[HTCondor-users] Issues with Condor on Rocks cluster



Is there any documentation specific to Rocks Cluster for getting Condor working?

I am running RHEL 6.3 with Rocks 6.1 and the Condor roll (among other rolls). The Condor version is 7.8.5.

When I submit a job, it just sits idle. One error I get is "request has not yet been considered by the matchmaker".

All of the documentation I can find covers a regular install, so it is not Rocks-specific. Some older Rocks documentation had some information, but a lot of it was not correct for the version I'm running.

It says the job submission has been accepted, but nothing happens, and a little while later it tries to submit the job again. It's as if the head node is submitting the job and the compute node is not receiving it.

Also, the users SSH in from Windows 7 computers using PuTTY. The users are AD (Active Directory) users, and the head node is in AD, but the compute nodes are not. I was under the impression that the "condor" user would handle all of the work on the backend.

I have two separate networks, frontend and backend. The frontend connects the workstations, the DC, and the head node. The backend connects only the head node and the compute nodes. The storage is attached to the head node via iSCSI and an SMB share. I have verified that the AD user submitting the job can create files in the directory where the job is submitted. I am trying to run the hello.sub/hello.sh test job.

Also, my cluster is in a lab that is not connected to the internet, so I can't post log files. But I will gather any information I can and post it here for anyone willing to assist.