
Re: [HTCondor-users] Memory leak in python bindings?



Hi Scott,

Definitely looks like a leak to me.  Thanks for reporting it.

As a potential workaround: do you see the leak go away if you use schedd.xquery() instead?
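For comparing the two calls, here is a rough sketch rather than a tested recipe. The helper names peak_rss_growth and compare_query_calls are made up for illustration. It assumes xquery() hands back an iterator of ads that has to be drained, and that ru_maxrss is reported in KB as on Linux. Because ru_maxrss is a high-water mark, running each call in its own fresh process would give a cleaner comparison than running them back to back.

```python
import resource


def peak_rss_growth(query_fn, iterations=200):
    """Call query_fn repeatedly and return the growth in peak RSS.

    The value is in ru_maxrss units (KB on Linux). ru_maxrss is a
    high-water mark, so the result is never negative.
    """
    start = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        # Drain the result so that lazy iterators are fully consumed.
        for _ad in query_fn():
            pass
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss - start


def compare_query_calls(iterations=200):
    """Hypothetical helper: needs the htcondor module and a reachable schedd."""
    import htcondor  # imported here so the helper above works without it

    schedd = htcondor.Schedd()
    return {
        "query": peak_rss_growth(schedd.query, iterations),
        "xquery": peak_rss_growth(schedd.xquery, iterations),
    }
```

If the "xquery" figure stays near zero while the "query" figure keeps climbing with the iteration count, that would point at the query() code path specifically.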

Thanks,

Brian

On Jun 12, 2016, at 9:53 PM, Scott Leishman <scott.leishman@xxxxxxxxx> wrote:

Hi,

I have a long-running Python script that queries the scheduler periodically, and I noticed the process's memory use growing continually over time.

The following snippet is enough to illustrate the issue on our systems (running HTCondor v8.4.6):

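import resource
import htcondor

schedd = htcondor.Schedd()
while True:
    # Query the schedd and discard the result.
    schedd.query()
    # ru_maxrss is the peak resident set size (in KB on Linux).
    print('mem use: %s (kb)' % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)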

With no jobs running, I typically see the memory use increase by 128 KB roughly once every 45 iterations of the loop.

pympler didn't show any new Python objects being created, but when I hooked up valgrind's leak checker it reported the following lost blocks:

==2473759== 136,000 (92,000 direct, 44,000 indirect) bytes in 500 blocks are definitely lost in loss record 820 of 821
==2473759==    at 0x4C2B0E0: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2473759==    by 0x6B42B29: CondorQ::getFilterAndProcessAds(char const*, StringList&, int, bool (*)(void*, compat_classad::ClassAd*), void*, bool) (in /usr/lib/condor/libcondor_utils_8_4_6.so)
==2473759==    by 0x6B43EB6: CondorQ::fetchQueueFromHostAndProcess(char const*, StringList&, int, int, bool (*)(void*, compat_classad::ClassAd*), void*, int, CondorError*) (in /usr/lib/condor/libcondor_utils_8_4_6.so)
==2473759==    by 0x65750CA: Schedd::query(boost::python::api::object, boost::python::list, boost::python::api::object, int, CondorQ::QueryFetchOpts) (in /usr/lib/python2.7/dist-packages/htcondor.so)
==2473759==    by 0x6575C4E: query_overloads::non_void_return_type::gen<boost::mpl::vector7<boost::python::api::object, Schedd&, boost::python::api::object, boost::python::list, boost::python::api::object, int, CondorQ::QueryFetchOpts> >::func_0(Schedd&) (in /usr/lib/python2.7/dist-packages/htcondor.so)
==2473759==    by 0x656B0FA: boost::python::objects::caller_py_function_impl<boost::python::detail::caller<boost::python::api::object (*)(Schedd&), boost::python::default_call_policies, boost::mpl::vector2<boost::python::api::object, Schedd&> > >::operator()(_object*, _object*) (in /usr/lib/python2.7/dist-packages/htcondor.so)
==2473759==    by 0x67ED9E9: boost::python::objects::function::call(_object*, _object*) const (in /usr/lib/condor/libpyclassad2.7_8_4_6.so)
==2473759==    by 0x67EDD57: ??? (in /usr/lib/condor/libpyclassad2.7_8_4_6.so)
==2473759==    by 0x67E8852: boost::python::handle_exception_impl(boost::function0<void>) (in /usr/lib/condor/libpyclassad2.7_8_4_6.so)
==2473759==    by 0x67EC662: ??? (in /usr/lib/condor/libpyclassad2.7_8_4_6.so)
==2473759==    by 0x499BE4: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==2473759==    by 0x4A1633: ??? (in /usr/bin/python2.7)


I can work around this by restarting the process periodically, but it seems there is an allocation in getFilterAndProcessAds that is never freed.
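The restart workaround could also be narrowed to the query itself by running it in a short-lived child process, so that whatever the child leaks is returned to the OS when it exits. What follows is only a minimal sketch under several assumptions: Python 3 on a Unix-like system (it uses the "fork" start method), results that can be pickled (the hypothetical fetch_job_ids helper returns plain strings rather than ClassAd objects for that reason), and helper names (run_isolated, fetch_job_ids) invented for illustration.

```python
import multiprocessing


def _child(conn, fn, args):
    """Run fn(*args) in the child and send back (ok, payload)."""
    try:
        conn.send((True, fn(*args)))
    except Exception as exc:
        conn.send((False, repr(exc)))
    finally:
        conn.close()


def run_isolated(fn, *args):
    """Run fn(*args) in a forked child process and return its result.

    Memory leaked inside fn is released when the child exits.
    The result must be picklable to cross the pipe.
    """
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like OS
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(child_conn, fn, args))
    proc.start()
    child_conn.close()  # parent keeps only the receiving end
    ok, payload = parent_conn.recv()
    proc.join()
    if not ok:
        raise RuntimeError(payload)
    return payload


def fetch_job_ids():
    """Hypothetical helper: needs the htcondor module and a reachable schedd."""
    import htcondor

    ads = htcondor.Schedd().query("true", ["ClusterId", "ProcId"])
    return ["%d.%d" % (ad["ClusterId"], ad["ProcId"]) for ad in ads]
```

A long-running script would then call run_isolated(fetch_job_ids) inside its loop instead of calling schedd.query() directly, at the cost of one fork per query.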


-Scott
