
Re: [Condor-users] counting licenses

Ah, now I know why your name sounds familiar: we both used to work at a beachfront facility in Venice a few years ago.

Joshua Kolden wrote:

Alfred from Pixar, although not the best queue software, has a 'ping' system which allows one to run any command before a job is started, very easy to implement, and very effective for global management.

Don't know if this helps, but in Condor you can also run any command-line program before your actual render command and analyze the output: just use a Unix .sh file or a Windows .bat file as the executable in the Condor submit file. In my scripts I assign one or two frames per CPU.
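A rough sketch of that kind of wrapper, written in Python rather than .sh/.bat just to keep it portable. The function name, the LICENSE_BUSY exit code, and the idea of passing the check and render commands as lists are all assumptions for illustration, not anything Condor defines:

```python
import subprocess
import sys

# Hypothetical exit code meaning "no license available, try again later".
# Condor does not define this value; pick whatever your setup agrees on.
LICENSE_BUSY = 75

def run_with_precheck(check_cmd, render_cmd):
    """Run check_cmd first; run render_cmd only if the check exits with 0.

    Returns LICENSE_BUSY if the check fails, otherwise the render
    command's own exit code.
    """
    check = subprocess.run(check_cmd, capture_output=True, text=True)
    if check.returncode != 0:
        # The check's output is available in check.stdout if you need
        # to analyze it further before giving up.
        return LICENSE_BUSY
    return subprocess.run(render_cmd).returncode

# Example call (both commands are placeholders for your own tools):
#   code = run_with_precheck(["./check_license.sh"],
#                            ["./render.sh", "1", "2"])
#   sys.exit(code)
```

The script that Condor runs as the job executable would call this and exit with the returned code, so a missing license shows up as a job failure that can be retried.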

Some systems that don't offer global resource monitoring do allow you to return a failure from a job that is understood to mean "try again in a little bit", such as a license failure return code. Such a failure causes the job to not try to submit a new task for a set amount of time, or until an existing task finishes. Unlike a normal failure, a resource failure return code never causes the task to be marked failed; it just keeps trying until it gets the resource. If there is not such a system in Condor I would strongly encourage its addition.

There is one. You can set how many times to resubmit a frame; if it still fails after the max retry count, Condor makes a list of the failed frames. You gain this functionality if you do a DAGMan submit: if any portion of the DAG fails, it is retried. So in DAGMan your first task would check for a license. If one is available it submits the frame; if not, it fails the frame. Condor retries it 3 times; after 3 more failures it makes note of the failed frame/job and puts it in a "rescue" list which you can resubmit later.
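For what it's worth, here is a minimal sketch of generating such a DAG, assuming one node per frame, a PRE script doing the license check, and a retry count of 3. The file names render.sub and check_license.sh, the node naming, and the "frame" macro are made up for the example; JOB, VARS, SCRIPT PRE and RETRY are the DAGMan keywords:

```python
def make_dag(frames, retries=3, submit_file="render.sub",
             precheck="check_license.sh"):
    """Build the text of a DAGMan input file with one node per frame.

    Each node runs a PRE script (the license check) before its job is
    submitted. If the PRE script or the job fails, DAGMan retries the
    whole node up to `retries` times.
    """
    lines = []
    for frame in frames:
        node = f"frame{frame:04d}"  # hypothetical node naming scheme
        lines.append(f"JOB {node} {submit_file}")
        # Pass the frame number to the submit file as $(frame).
        lines.append(f'VARS {node} frame="{frame}"')
        lines.append(f"SCRIPT PRE {node} {precheck}")
        lines.append(f"RETRY {node} {retries}")
    return "\n".join(lines) + "\n"

# Example: write a DAG for frames 1-10, then submit it with
# condor_submit_dag render.dag
#   with open("render.dag", "w") as f:
#       f.write(make_dag(range(1, 11)))
```

Nodes that still fail after the retries are the ones that end up in the rescue DAG mentioned above.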

DAGMan on Windows needs more testing, but on Linux it is solid, so if you do your submits from a Linux machine it should be OK. That's what I used to do. It worked nicely; I rendered a few movie shots that way. I've got to go test Windows DAGMan some more.