[HTCondor-users] return code of jupyter notebook jobs
- Date: Fri, 27 Mar 2020 11:25:09 +0100 (CET)
- From: "Beyer, Christoph" <christoph.beyer@xxxxxxx>
- Subject: [HTCondor-users] return code of jupyter notebook jobs
We have been running Jupyter notebooks in condor slots in production for a while now, and we need to get some monitoring around this.
One of the bigger obstacles to coming up with something decent is that JupyterHub uses condor_rm to end a notebook once it is no longer needed. This results in a condor_history entry with JobStatus == 3, which is considered a failed job (although in this case it is not). The other possibility is that the notebook job runs into the time limit and gets removed by the periodic_remove expression, which is presumably a bit more flexible to tweak.
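To illustrate the distinction monitoring would need to make, here is a rough sketch that splits removed jobs by their RemoveReason attribute instead of treating every JobStatus == 3 entry as a failure. The function name, the labels, and the substrings it matches are assumptions for illustration only; the exact wording of RemoveReason should be checked against real history entries on the pool.

```python
# Sketch only: classify history ads that were removed (JobStatus == 3).
# Assumption: each ad is a plain dict carrying a RemoveReason string, and
# the substrings matched below resemble what the schedd records. Verify
# against your own condor_history output before relying on this.

REMOVED = 3  # JobStatus value for removed jobs


def classify_removed(ad):
    """Return a coarse label for one job ad given as a dict."""
    if ad.get("JobStatus") != REMOVED:
        return "not_removed"
    reason = str(ad.get("RemoveReason", "")).lower()
    if "condor_rm" in reason:
        # Notebook ended on purpose, e.g. by the hub calling condor_rm.
        return "ended_by_condor_rm"
    if "periodicremove" in reason:
        # Job hit the time limit enforced by the periodic_remove policy.
        return "removed_by_policy"
    return "removed_other"


# Hypothetical example ads; the reason strings are illustrative.
example_ads = [
    {"JobStatus": 3, "RemoveReason": "via condor_rm (by user jhub)"},
    {"JobStatus": 3,
     "RemoveReason": "The job attribute PeriodicRemove expression evaluated to TRUE"},
    {"JobStatus": 4},
]
labels = [classify_removed(ad) for ad in example_ads]
```

The dicts could be filled from something like `condor_history -constraint 'JobStatus == 3' -json`, with only the "removed_other" group counted as a real problem.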
I would like to have an option for condor_rm that influences the resulting job state in the history. If that is too much of a hassle to implement, maybe there is something we could propose to the Jupyter notebook developers on how to end a notebook more cleverly?
Building 02b, Room 009