Re: [HTCondor-users] Job Scheduling



On 02/06/2013 10:52, Usman Khan wrote:
On 06/02/2013 12:58 PM, Brian Candler wrote:
On 01/06/2013 18:32, Muak rules wrote:
I just did what you asked me to do.
Only one worker node is showing, but the queue was not showing on the worker node.
What do you mean by "not showing queue"?
I mean the queue was not showing on the worker node.
And what does that mean?? What command did you type, on which machine, and what response did you see?
How can I tell whether a job is running on a worker machine if I don't have any access to the master node?
The master node is where you run condor_submit to queue a job, condor_q to examine the queue, and condor_status to see what the worker machines are doing.
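
As a rough sketch of that workflow, here is a small Python helper that writes out a minimal vanilla-universe submit description and lists the three commands in the order you would type them on the master node. The file name job.sub, the /bin/sleep test job and the helper names are made up for illustration; the script only builds and prints the command lines, it does not run them.

```python
import shlex


def build_submit_description(executable, arguments="", universe="vanilla"):
    """Return the text of a minimal submit description file."""
    lines = [
        f"universe = {universe}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        # $(Cluster) and $(Process) are filled in by condor_submit.
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


def basic_workflow(submit_file):
    """Commands to run on the master node, in order."""
    return [
        ["condor_submit", submit_file],  # queue the job
        ["condor_q"],                    # examine the queue
        ["condor_status"],               # see what the worker machines are doing
    ]


if __name__ == "__main__":
    # Hypothetical test job: sleep for 60 seconds.
    print(build_submit_description("/bin/sleep", "60"))
    for argv in basic_workflow("job.sub"):
        print(shlex.join(argv))
```

Save the printed description as job.sub on the master node, then type the three commands there. None of them need to be run on a worker node.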

Are you saying you don't have access to the master node? Well, it is possible to run these tools on one machine and have them query a different master node, but this requires extra command-line options. You would also need to set up access permissions so that access from that other machine is allowed.
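
For illustration, the extra options would look roughly like this. This is a sketch in Python that only builds and prints the command lines. The host name master.example.org is a placeholder, and it assumes the master node runs both the collector (reached with -pool) and the job queue, i.e. the schedd (reached with -name). On the permissions side, the master's configuration would need to allow the other machine, typically through settings such as ALLOW_READ and ALLOW_WRITE; check the manual for your version.

```python
import shlex


def remote_query_commands(central_manager, schedd_name):
    """Build commands that query a master node from a different machine.

    central_manager: host running the collector (placeholder value below)
    schedd_name:     name of the schedd holding the job queue
    """
    return [
        # Ask the remote collector what the worker machines are doing.
        ["condor_status", "-pool", central_manager],
        # Ask the remote schedd for its job queue.
        ["condor_q", "-pool", central_manager, "-name", schedd_name],
    ]


if __name__ == "__main__":
    # Placeholder host name; substitute your own master node.
    for argv in remote_query_commands("master.example.org", "master.example.org"):
        print(shlex.join(argv))
```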

If I were you, I'd start simple. One master node, a number of execute nodes, everyone logs into the master node to submit jobs.

And what should I do if I want to migrate my job from one worker machine to another while using the standard universe?
Condor doesn't, as far as I know, support any form of "live" migration. If you're using the standard universe then you have checkpointing, so I suppose it's possible to terminate a job and have it restart on another node from that checkpoint, but I don't use the standard universe so I don't really know (I use the vanilla universe).
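
A sketch of how that might look, based on the manual's description of condor_compile and condor_vacate_job rather than on anything tested here: a standard universe program is relinked with condor_compile and submitted, and condor_vacate_job then asks Condor to checkpoint the running job and take it off its current machine, leaving it in the queue to be matched again. Where it restarts is up to the matchmaker, so this does not pick the target node. The file names and the job ID 42.0 are hypothetical, and the Python below only builds and prints the command lines.

```python
import shlex


def standard_universe_steps(source, program, submit_file):
    """Relink a program for the standard universe, then submit it.

    The submit file is assumed to contain "universe = standard".
    """
    return [
        ["condor_compile", "gcc", "-o", program, source],
        ["condor_submit", submit_file],
    ]


def vacate_job(cluster, proc=0):
    """Ask Condor to vacate one running job, identified as cluster.proc."""
    return ["condor_vacate_job", f"{cluster}.{proc}"]


if __name__ == "__main__":
    # Hypothetical file names and job ID, for illustration only.
    for argv in standard_universe_steps("prog.c", "prog", "prog.sub"):
        print(shlex.join(argv))
    print(shlex.join(vacate_job(42)))
```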

Will you please help me out with this? Thanks.
I've tried to explain this as simply as I can, but if I've failed then I'm sorry, I don't think I can put it any more simply than I already have.

Also, please remember to reply to the list, not just to me personally.

Regards,

Brian.