
Re: [HTCondor-users] Jobs only running on submit machine



What does your submit file look like?

A common problem is that the machines don't have a shared filesystem, and HTCondor's file transfer option isn't being requested in the submit file. In this case, HTCondor will only run the jobs on the submit machine.
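
For reference, a submit file that requests file transfer might look roughly like the sketch below. The names my_program, input.dat, and the job.* files are placeholders, not taken from your setup. Without these settings, HTCondor requires the execute machine to be in the same FileSystemDomain as the submit machine, which by default only the submit machine itself satisfies.

```text
# Minimal sketch of a submit description file (all file names are placeholders)
universe                = vanilla
executable              = my_program
transfer_input_files    = input.dat

# Ask HTCondor to copy files to the execute machine and bring output back
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

output                  = job.out
error                   = job.err
log                     = job.log
queue
```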

 -- Jaime

On Feb 26, 2013, at 9:09 AM, Cody Belcher <codytrey@xxxxxxxxxxxxxxxx> wrote:

I do see all of the machines in condor_status

"codytrey@metis:~$ condor_config_val DAEMON_LIST
MASTER, SCHEDD, STARTD"

This is from the submit machine; it is the same on an execute node I just tried.

-Cody

On 2013-02-26 08:47, Cotton, Benjamin J wrote:

Cody,

The first question is whether they're all in the same pool. To check
this, see whether they all show up in the output of condor_status.

My suspicion is that your submit/execute machine might be running its
own condor_collector and condor_negotiator processes. You can check this
with 

condor_config_val DAEMON_LIST

If that's the case, then your execute-only nodes might be as well.
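
As a rough sketch of that check, the shell function below reports whether a DAEMON_LIST value names a COLLECTOR or NEGOTIATOR. The function name has_central_manager_daemons is made up for illustration; it only inspects the string it is given and does not talk to HTCondor itself.

```shell
# Hypothetical helper: succeeds (exit status 0) if the DAEMON_LIST value
# passed as $1 names a COLLECTOR or NEGOTIATOR daemon, and fails otherwise.
has_central_manager_daemons() {
    # Turn commas into spaces and upper-case the names so that each
    # daemon becomes its own word and the match ignores case.
    for daemon in $(printf '%s' "$1" | tr ',' ' ' | tr '[:lower:]' '[:upper:]'); do
        case "$daemon" in
            COLLECTOR|NEGOTIATOR) return 0 ;;
        esac
    done
    return 1
}
```

On a machine in the pool it could be called as has_central_manager_daemons "$(condor_config_val DAEMON_LIST)". Normally only the central manager should report success; if the submit or execute nodes do too, each of them is likely acting as its own pool.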



Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project