
[Condor-users] State transition for preempted jobs and its implications for Condor-G



Hi,

When a job is temporarily suspended by a higher-priority job, what state does it go into? My impression is that the job becomes idle and sits in the queue, waiting to be matched again. But does it pass through the 'hold' state before becoming 'idle', and if so, will this transition (R -> (H?) -> I) show up in condor_q?
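To make the transition I am asking about concrete, here is a minimal Python sketch. It assumes the usual JobStatus integer codes (1 = Idle, 2 = Running, 3 = Removed, 4 = Completed, 5 = Held) and the letters condor_q shows in its ST column. The helper names and the sample sequences are hypothetical, written only for illustration; they are not real observations from a pool.

```python
# JobStatus ClassAd codes mapped to the letters condor_q prints in the ST column.
JOB_STATUS = {1: "I", 2: "R", 3: "X", 4: "C", 5: "H"}

HELD = 5


def format_transitions(codes):
    """Render a sequence of JobStatus codes as e.g. 'R -> H -> I'."""
    return " -> ".join(JOB_STATUS.get(code, "?") for code in codes)


def passed_through_hold(codes):
    """Return True if the job was ever seen in the Held state."""
    return HELD in codes


# Hypothetical sequences, as they might be recorded by polling the queue.
with_hold = [2, 5, 1]      # Running -> Held -> Idle
without_hold = [2, 1]      # Running -> Idle

print(format_transitions(with_hold), passed_through_hold(with_hold))
print(format_transitions(without_hold), passed_through_hold(without_hold))
```

Which of the two sequences a preempted job actually follows is exactly what I would like to know.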
  
I guess one case in which a preempted job could go into the 'hold' state is when the job is being checkpointed (file staging is involved, hence the hold state). This reminds me of another question: when a job is submitted through Condor-G, the grid manager on the remote gatekeeper forwards the job to Condor (assuming the underlying batch system is Condor) and lets it schedule the job. In which universe will the site's native Condor run the job?

If the job ends up being scheduled as a Vanilla job, how would it receive checkpointing service? Is it the case that the jobmanager also somehow watches over the job while it executes on the worker node, so that checkpointing can still be achieved even though the job runs as a Vanilla job?

Of course, the thoughts above are based on my impression that Condor-G does support checkpointing, but I am not sure at which level it is achieved. Or do Condor-G jobs not support checkpointing at all?

Is there a possibility that the jobmanager on the gatekeeper could somehow "inform" its native Condor to schedule jobs in a universe other than Vanilla?


Thank you,

~Barnett

