
Re: [HTCondor-users] Specific Nodes don't run job!



On Apr 14, 2014, at 8:57 AM, Mostafa.B <bakhtvar@xxxxxxxxx> wrote:

Running tasks on two specific nodes in my cluster gives me the log below. As can be seen, the job doesn't actually do anything. This happens only on these two nodes; the same task runs fine on any other node in the cluster. Interestingly, even running the task manually on these two nodes works.
I could exclude these two nodes from the cluster, but I am looking for a proper solution to get them to work as well. Any suggestions?

Try the -interactive option to condor_submit. It will give you an ssh session on the execute machine with the same user and environment as your job. Then, you can try running your job manually. If it fails in the same way, then you can debug the cause.

You’ll want to set/modify the requirements expression in your submit file to force the job to run on one of the problematic machines, like so:

requirements = Machine == "problem.foo.org"
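
Putting the two suggestions together, a minimal submit file for the interactive session might look like the sketch below. The file name debug.sub is hypothetical, and problem.foo.org is only the placeholder host name used above; substitute the name of one of the two affected nodes. Any other settings your real job needs (for example its environment or resource requests) are not shown and would have to be carried over from your own submit file.

```
# debug.sub (hypothetical file name)
# Pin the job to one of the problematic execute machines.
# problem.foo.org is a placeholder; use the real host name.
requirements = Machine == "problem.foo.org"

# Queue a single job.
queue
```

You would then submit it with "condor_submit -interactive debug.sub" and, once the session opens on the execute machine, run your job's command by hand to see whether it fails the same way.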

Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project