Mailing List Archives Public Access	UW Madison Computer Sciences Department Computer Systems Lab


Re: [HTCondor-users] Application specific scheduler

Date: Mon, 30 Jun 2014 09:20:02 -0500 (CDT)
From: "R. Kent Wenger" <wenger@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Application specific scheduler

On Sat, 28 Jun 2014, Miha Ahronovitz wrote:

So Nick says, I want to migrate my home-grown distributed environment to
HTCondor. As a new user he considers 3 options. Miron says use DAGMan. Miha
asks why. Miron says because it manages job dependencies. Gabriel says
DAGMan is the way to go, but he wonders "why, in case of failure, one
has to restart the workflow rather than retry the failed jobs."
Kent Wenger from the CHTC team clarifies and says, yes, we know it is a problem,
gives the link and has a name for it: this is issue #2831.

Let me stop here. Nick seems an experienced sysadmin / engineer. But
HTCondor-list has 2,100 subscribers. How many of these subscribers know
about DAGMan? Maybe they search and read why, in case of failure, they have
to resubmit all jobs from the beginning?

Just to clarify, I was assuming (perhaps incorrectly) that Gabriel was referring to the case where the user has to take some kind of manual action to fix the problem with a job that failed, before retrying that job.

If a job fails, but it may succeed on being retried without any action from the user, the retry option in DAGMan can handle that case. The retry option for nodes in DAGMan has existed for a long time (10+ years, I think), so hopefully many people are aware of that...
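
For readers who have not used the retry option, here is a minimal sketch of what it looks like in a DAG input file. It assumes a two-node workflow; the node names (A, B), the submit file names (a.sub, b.sub), and the build_dag helper are hypothetical and only serve the illustration. The part that belongs to DAGMan is the line format: JOB to declare a node, PARENT ... CHILD ... to declare a dependency, and RETRY followed by a node name and a maximum number of retries.

```python
def build_dag(nodes, edges, retries):
    """Return the text of a DAG input file.

    nodes   -- list of (node_name, submit_file) pairs (hypothetical names)
    edges   -- list of (parent_name, child_name) pairs
    retries -- dict mapping node_name to a maximum retry count
    """
    lines = []
    # One JOB line per node.
    for name, submit_file in nodes:
        lines.append(f"JOB {name} {submit_file}")
    # One PARENT/CHILD line per dependency.
    for parent, child in edges:
        lines.append(f"PARENT {parent} CHILD {child}")
    # A RETRY line only for nodes that should be retried automatically.
    for name, _ in nodes:
        count = retries.get(name, 0)
        if count > 0:
            lines.append(f"RETRY {name} {count}")
    return "\n".join(lines) + "\n"


# Hypothetical workflow: B depends on A, and B may be retried up to 3 times.
dag_text = build_dag(
    nodes=[("A", "a.sub"), ("B", "b.sub")],
    edges=[("A", "B")],
    retries={"B": 3},
)
print(dag_text)
```

The generated text could be saved to a file and passed to condor_submit_dag. With a RETRY line in place, DAGMan would rerun node B on failure without the user resubmitting anything, which is the case described above where no manual fix is needed.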


Kent


