
Re: [HTCondor-users] A couple of questions regarding running jobs with docker containers (Q1)



Hi Greg, thanks a lot for the responses to both questions!!
Cheers,
Jose

On Fri, 26 Mar 2021 at 15:37, Greg Thain (<gthain@xxxxxxxxxxx>) wrote:
>
>
> On 3/26/21 5:36 AM, jcaballero.hep@xxxxxxxxx wrote:
> > Hi,
> >
> > I have a couple of questions regarding running jobs in Docker
> > containers. Here is the first one.
> >
> > I am testing condor 8.8.12, with docker 18.03.0
> >
> > Printing some custom logs from the wrapper set in the config variable
> > DOCKER, I have noticed that things do not always work.
> > Sometimes HTCondor decides to kill the container after a few seconds,
> > as can be seen here [*].
> > For example, the container started at "Thu Mar 25 22:55:57 2021" was
> > terminated at "Thu Mar 25 23:04:07 2021".
> >
> > Note that I am running one job at a time on that host.
>
> Jose:
>
> HTCondor will kill the container, just like it will kill a running job
> when requested to by a condor_rm or a preemption or similar reason.  My
> first guess is that's what's happening.  The StartdLog should have more
> details.
>
> In 8.9, we introduced Tickets of Execution in the job ad which have more
> details about why the job left the machine.
>
>
> -greg
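
Greg's suggestion to look in the StartdLog can be sketched as a small
filter script. This is only an illustration: the keyword list is a guess
(the thread does not show what the StartdLog messages actually look
like), and the function names find_eviction_hints and scan_log are made
up for this example. The log location is site-specific;
"condor_config_val STARTD_LOG" should print it.

```python
from typing import Iterable, List, Sequence

# Hypothetical keywords: the real wording of StartdLog messages is not
# shown in this thread, so adjust these to match what your log contains.
DEFAULT_KEYWORDS = ("preempt", "vacat", "evict", "kill", "condor_rm")


def find_eviction_hints(
    lines: Iterable[str], keywords: Sequence[str] = DEFAULT_KEYWORDS
) -> List[str]:
    """Return the lines containing any keyword (case-insensitive match)."""
    lowered = [k.lower() for k in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]


def scan_log(path: str, keywords: Sequence[str] = DEFAULT_KEYWORDS) -> List[str]:
    """Scan a log file on disk; `path` is wherever your StartdLog lives."""
    with open(path, errors="replace") as handle:
        return find_eviction_hints(handle, keywords)
```

Calling scan_log() with the path to the StartdLog and comparing the
timestamps of the matching lines with the time the container was
terminated may narrow down why the job left the machine.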