Re: [HTCondor-users] have people seen scalability issues with condor submission using the python bindings?

On Aug 24, 2018, at 20:11, Brian Bockelman <bbockelm@xxxxxxxxxxx> wrote:

> FWIW -
> 
> This is the second time that I've seen someone report strange submission issues when calling schedd.submit from multiple python threads.  I think the prior reporter (maybe from a year or two ago?) also included a reproducer.
> 
> It's not the most efficient way to roll, but I smell a bug here.  Sounds like something inside the submit implementation is releasing the GIL when it should be hanging on to it.
> 
> Brian
> 

Hi Brian

In a sick way, it is a relief I am not the only one. 
That means it is not my fault.

As I mentioned, IIRC, I solved my issues with a double hack:

-- my high-level classes for the collector and the schedd are now Singletons, so only one object of each kind talks to the HTCondor resources, even in multi-threaded applications,

-- I serialize the calls to condor_q / condor_history / condor_submit using threading.Lock objects.
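
To make the two hacks concrete, here is a minimal sketch of the pattern. It is for illustration only and is not the code from the library: SerializedClient, FakeSchedd and run_demo are hypothetical names, and FakeSchedd is a stand-in so the example runs without HTCondor installed. With the real bindings one would presumably wrap an htcondor.Schedd object the same way.

```python
import threading
import time


class SerializedClient:
    """Hypothetical wrapper: one shared instance, one lock around every call."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self, backend):
        self._backend = backend
        self._call_lock = threading.Lock()

    @classmethod
    def get(cls, backend_factory):
        # Build the shared instance at most once, even if threads race here.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls(backend_factory())
            return cls._instance

    def call(self, method_name, *args, **kwargs):
        # Only one thread at a time may talk to the wrapped object.
        with self._call_lock:
            return getattr(self._backend, method_name)(*args, **kwargs)


class FakeSchedd:
    """Stand-in for a real schedd object (assumption, not the HTCondor API)."""

    def __init__(self):
        self.active = 0      # calls currently inside submit()
        self.max_active = 0  # highest concurrency ever observed
        self.submitted = 0

    def submit(self, job):
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        time.sleep(0.001)    # pretend to do some work
        self.submitted += 1
        self.active -= 1
        return self.submitted


def run_demo(n_threads=8):
    # Every thread goes through the same wrapper, so submits never overlap.
    client = SerializedClient.get(FakeSchedd)
    job = {"executable": "/bin/true"}
    threads = [
        threading.Thread(target=client.call, args=("submit", job))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return client
```

The price is that submissions run strictly one after another, but the underlying object never sees two threads at once.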

I wonder, at this point, whether my little library would also be useful to other people.
Or, if someone has written a better version of it, I would like to know and grab it :)

    https://github.com/bnl-sdcc/libfactory/htcondorlib.py

Cheers,
Jose