
Re: [HTCondor-users] ImageSize value



Hi John,

Thanks for your answer.
Could you please explain the difference between ImageSize and
MemoryUsage (which is actually computed from ResidentSetSize from what I
can see)?

In the end, what I'm interested in is the peak memory usage of the job at
any point while it is running, not the memory currently in use.
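
To make "peak" concrete, here is a minimal sketch in Python. It rests on two
assumptions that may not hold: that MemoryUsage is ResidentSetSize (in KiB)
rounded up to whole MiB, and that the monitoring interface samples
ResidentSetSize periodically. The function names and sample values are
hypothetical and are not taken from HTCondor.

```python
def memory_usage_mb(resident_set_size_kb: int) -> int:
    """Round a ResidentSetSize in KiB up to whole MiB.

    Assumption: MemoryUsage is derived as (ResidentSetSize + 1023) / 1024
    with integer division.
    """
    return (resident_set_size_kb + 1023) // 1024


def peak_memory_mb(rss_samples_kb) -> int:
    """Return the largest MemoryUsage seen across periodic samples."""
    peak = 0
    for rss_kb in rss_samples_kb:
        peak = max(peak, memory_usage_mb(rss_kb))
    return peak


# Hypothetical samples (KiB) collected while one job was running.
samples_kb = [204800, 1048576, 524288]
print(peak_memory_mb(samples_kb))  # prints 1024
```

A peak computed this way is only as precise as the sampling interval, so a
short spike between two samples would be missed.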

Best,
Mathieu
On 30/07/19 19:00, htcondor-users-request@xxxxxxxxxxx wrote:
> Today's Topics:
>
>    1. ImageSize value (Mathieu Bahin)
>    2. Re: ImageSize value (John M Knoeller)
>    3. Re: HTCondor on Debian 10 (problem with renamed libboost
>       python libraries) (Tim Theisen)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 30 Jul 2019 16:53:09 +0200
> From: Mathieu Bahin <mathieu.bahin@xxxxxxxxxxxxxxx>
> To: htcondor-users@xxxxxxxxxxx
> Subject: [HTCondor-users] ImageSize value
> Message-ID: <6bca8561-034d-e3c9-0b98-88614c080df8@xxxxxxxxxxxxxxx>
> Content-Type: text/plain; charset=utf-8
>
> Hi all,
>
> I'm developing an interface to monitor cluster jobs in my Institute.
> We run version 8.6.3 of Condor with partitionable slots.
>
> I was surprised to discover that the ImageSize value differs depending
> on whether I get it from "condor_status -l" or "condor_q -l" for the same
> slot/job. Is there an explanation for that?
>
> For example, I have, for the same job:
>   - from condor_q: ImageSize = 931876
>   - from condor_status: ImageSize_RAW = 1161832
>
> Best,
> Mathieu
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 30 Jul 2019 16:04:10 +0000
> From: John M Knoeller <johnkn@xxxxxxxxxxx>
> To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] ImageSize value
> Message-ID:
> 	<DM5PR06MB356167FD61044D60188BED5E96DC0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
> 	
> Content-Type: text/plain; charset="utf-8"
>
> What you should be looking at is MemoryUsage rather than ImageSize.
>
> condor_status normally will show only ImageSize, not ImageSize_RAW.
>
> condor_q should have both an ImageSize and an ImageSize_RAW value. ImageSize_RAW is the unquantized value, while
> ImageSize is quantized from ImageSize_RAW. The quantization is meant to make matchmaking more efficient if the job
> exits and needs to be re-run.
>
> Detected values like MemoryUsage and ImageSize are updated periodically and are not passed between daemons instantaneously,
> so you should expect the values from condor_q and condor_status to differ slightly most of the time,
> because they represent the value of ImageSize at different times.
>
> -tj
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 30 Jul 2019 16:13:28 +0000
> From: Tim Theisen <tim@xxxxxxxxxxx>
> To: "htcondor-users@xxxxxxxxxxx" <htcondor-users@xxxxxxxxxxx>
> Subject: Re: [HTCondor-users] HTCondor on Debian 10 (problem with
> 	renamed libboost python libraries)
> Message-ID: <9fee1a2c-456a-36de-1281-7239bd1cd50c@xxxxxxxxxxx>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Steffen,
>
> I wasn't thinking about Debian policy when I created this repo. Of
> course, it will all be normalized when 8.8.5 comes out. This should work
> with all current Debians and Ubuntus.
>
> The CMakefiles really need to be cleaned up. I had to resort to a couple
> of hacks to get everything to work with our older stable releases. I'll
> address those hacks in the development release when we move to CMake 3.
>
> The warnings about HASHITER have been addressed in our 8.9 development
> series. I will back-port fixes if they become problematic. I'll have to
> look into the shlibdeps warnings.
>
> ...Tim
>
> On 7/30/19 4:35 AM, Steffen Grunewald wrote:
>> Answering myself...
>>
>> On Tue, 2019-07-30 at 09:47:18 +0200, Steffen Grunewald wrote:
>>> On Mon, 2019-07-29 at 21:53:26 +0000, Tim Theisen wrote:
>>>> I have put up a repository for Debian 10 (Buster).
>>>>
>>>> https://research.cs.wisc.edu/htcondor/instructions/debian/10/stable/
>>>>
>>>> I built this on a VM and lightly tested it. (Ran a job and tried out the python bindings).
>>> Hi Tim,
>>>
>>> so we now have at least two (non-identical) condor_8.8.4.orig.tar.gz files?
>>> I think this is in violation of Debian packaging rules, but what do I know...
>>> (I'd have expected the orig tarball to be left unchanged, and the mods
>>> supplied as a set of patches instead, with an updated build number. Will
>>> try to make that conversion myself...)
>> Since essentially only one file had changed, the patch would be
>>
>> --- 8< ---
>> --- condor-8.8.4.orig/src/python-bindings/CMakeLists.txt
>> +++ condor-8.8.4/src/python-bindings/CMakeLists.txt
>> @@ -220,12 +220,12 @@ else()
>>      if (DEFINED PYTHON_VERSION_STRING AND PYTHONLIBS_FOUND)
>>        set ( PYTHON_BOOST_LIB boost_python )
>>        if (DEFINED SYSTEM_NAME)
>> -        if (${SYSTEM_NAME} MATCHES "rhel7" OR ${SYSTEM_NAME} MATCHES "centos7" OR ${SYSTEM_NAME} MATCHES "sl7")
>> -            set ( PYTHON_BOOST_LIB "boost_python${PYTHON_VERSION_MAJOR}${PYTHON_VERSION_MINOR}" )
>> -        endif()
>>          if (${SYSTEM_NAME} MATCHES "Debian" OR ${SYSTEM_NAME} MATCHES "Ubuntu")
>>              set ( PYTHON_BOOST_LIB "boost_python-py${PYTHON_VERSION_MAJOR}${PYTHON_VERSION_MINOR}" )
>>          endif()
>> +        if (${SYSTEM_NAME} MATCHES "rhel7" OR ${SYSTEM_NAME} MATCHES "centos7" OR ${SYSTEM_NAME} MATCHES "sl7" OR ${SYSTEM_NAME} MATCHES "Debian.*10")
>> +            set ( PYTHON_BOOST_LIB "boost_python${PYTHON_VERSION_MAJOR}${PYTHON_VERSION_MINOR}" )
>> +        endif()
>>        endif()
>>        include_directories(${BOOST_INCLUDE})
>>        link_directories(${BOOST_LD})
>> @@ -300,12 +300,12 @@ else()
>>      endif()
>>  
>>      if (DEFINED PYTHON3_VERSION_STRING AND PYTHON3LIBS_FOUND AND NOT ${SYSTEM_NAME} MATCHES "fc27")
>> -      if (${SYSTEM_NAME} MATCHES "rhel7" OR ${SYSTEM_NAME} MATCHES "centos7" OR ${SYSTEM_NAME} MATCHES "sl7")
>> -        set ( PYTHON3_BOOST_LIB "boost_python${PYTHON3_VERSION_MAJOR}${PYTHON3_VERSION_MINOR}" )
>> -      endif()
>>        if (${SYSTEM_NAME} MATCHES "Debian" OR ${SYSTEM_NAME} MATCHES "Ubuntu")
>>          set ( PYTHON3_BOOST_LIB "boost_python-py${PYTHON3_VERSION_MAJOR}${PYTHON3_VERSION_MINOR}" )
>>        endif()
>> +      if (${SYSTEM_NAME} MATCHES "rhel7" OR ${SYSTEM_NAME} MATCHES "centos7" OR ${SYSTEM_NAME} MATCHES "sl7" OR ${SYSTEM_NAME} MATCHES "Debian.*10")
>> +        set ( PYTHON3_BOOST_LIB "boost_python${PYTHON3_VERSION_MAJOR}${PYTHON3_VERSION_MINOR}" )
>> +      endif()
>>        include_directories(${BOOST_INCLUDE})
>>        link_directories(${BOOST_LD})
>>  
>> --- 8< ---
>>
>> (at least that's what "dpkg-source --commit" provided me with, after replacing
>> the file in the old source tree with the new version)
>>
>> I put this into debian/patches/python3-buster.patch, and dpkg-source generated
>> a debian/patches/series file.
>> Then I bumped the build number in debian/changelog to -2 before appending
>> distro-specific tags.
>>
>> pbuilders are running right at this moment, for all archs and distros
>> accessible to me. Nothing suspicious has shown up yet, if one ignores (a) some
>> HASHITER-related compiler complaints (which have been there for ages but start
>> to look more dangerous with Buster - please check them!) and (b) a huge number
>> of dpkg-shlibdeps warnings.
>>
>> Actually, the Jessie build for amd64 has already finished - still with the
>> addition of python3-dev in place. The Stretch amd64 is finishing as I type,
>> and the Buster one (also amd64) is in the final stages. So this looks good
>> overall, although I'll wait for the arm64 (Stretch and Buster) builds before
>> I close this case.
>>
>>> Is this Buster-specific or would the package build on Stretch as well?
>> As shown above, the patch doesn't seem to harm other distros.
>> Something very similar may be required for the latest Ubuntu versions.
>>
>> Thank you for getting this fixed so quickly. 
>>
>> Cheers, Steffen