
[Condor-users] Job Resubmission



Hi, I'm just getting started with Condor. We're looking at running jobs on a local network of Linux machines, with the potential of adding a connection to an Amazon VPC in the future. We have set up a couple of virtual Condor pools at two separate locations and have configured flocking between them.

My question right now is how Condor deals with resubmitting failed jobs. For example, should a machine die during execution of a job, it seems that the job terminates, as you would expect. However, we would like any failed jobs to be resubmitted to another available machine in the pool. Should we be looking at Condor-G (not started there yet) and condor_resubmit?

Thanks,
Giles

