
[HTCondor-users] return code of jupyter notebook jobs



Hi,

We have been running Jupyter notebooks in condor slots in production for a while now, and we need to set up some monitoring around them.

One of the bigger obstacles to building something decent is that JupyterHub uses condor_rm to end a notebook once it is no longer needed. This leaves a condor_history entry with JobStatus == 3 (removed), which is treated as a failed job, although in this case it is not. The other way a notebook job ends is by hitting the time limit and being removed by the periodic_remove expression, which is presumably a bit easier to tweak.
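To illustrate the kind of distinction the monitoring would have to make, here is a minimal sketch that sorts history ads (given as plain dicts) into buckets. It assumes that the RemoveReason attribute of a removed job mentions "condor_rm" when the hub removed it and "PeriodicRemove" when the time limit fired; those substrings, the function name classify and the bucket labels are assumptions for illustration and would need to be checked against real condor_history output.

```python
# JobStatus codes used by HTCondor job ads: 3 = removed, 4 = completed.
REMOVED = 3
COMPLETED = 4


def classify(ad):
    """Return a monitoring label for one history ad given as a plain dict."""
    status = ad.get("JobStatus")
    reason = str(ad.get("RemoveReason", ""))
    if status == COMPLETED:
        return "completed"
    if status != REMOVED:
        return "other"
    # Assumption: a condor_rm removal leaves "condor_rm" in RemoveReason.
    if "condor_rm" in reason:
        return "ended_by_hub"
    # Assumption: a policy removal names the PeriodicRemove expression.
    if "PeriodicRemove" in reason:
        return "time_limit"
    # Removed, but for a reason this sketch does not recognise.
    return "removed_unknown"
```

With a check like this, only the "removed_unknown" bucket would need to be looked at as a possible fault, instead of every entry with JobStatus == 3.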

I would like to have an option for condor_rm that influences the job state recorded in the history. If that is too much of a hassle to implement, maybe there is something we could propose to the Jupyter notebook developers on how to end a notebook in a cleverer way?

Best
Christoph 

-- 
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx