Re: [HTCondor-users] Tracking available memory on a compute host



Is there no way to have the HTCondor daemons monitor the memory that is
actually available on a host, and to match ClassAds against that value
so that jobs don't flock to a host without enough free RAM?
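
To make the question concrete, here is a minimal sketch of the kind of
probe I have in mind, assuming it could be run periodically on the
execute host (for example through the startd's cron / daemon ClassAd
hook mechanism) and that its output is merged into the machine ad.  The
attribute name FreeMemoryMB and the function names are made up for
illustration, and MemAvailable is only present in /proc/meminfo on
reasonably recent Linux kernels.

```python
#!/usr/bin/env python3
"""Print the host's available memory as a ClassAd-style attribute line.

Sketch only: FreeMemoryMB is a hypothetical attribute name, not one
that HTCondor defines.
"""


def parse_mem_available_mb(meminfo_text):
    """Return MemAvailable from /proc/meminfo text in whole MB, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Lines look like "MemAvailable:   2048000 kB"
            kb = int(line.split()[1])
            return kb // 1024
    return None  # field missing (e.g. an older kernel)


def classad_line(mb, attr="FreeMemoryMB"):
    """Format the value as 'Attribute = value'."""
    return f"{attr} = {mb}"


def main(path="/proc/meminfo"):
    with open(path) as f:
        mb = parse_mem_available_mb(f.read())
    if mb is not None:
        print(classad_line(mb))


if __name__ == "__main__":
    main()
```

If the attribute did end up in the machine ad, a job could then presumably
add something like "TARGET.FreeMemoryMB >= RequestMemory" to its
requirements expression; I have not tested this.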

On Mon, Jan 22, 2018 at 1:43 PM, Steve Huston
<huston@xxxxxxxxxxxxxxxxxxx> wrote:
> I found a couple of old mentions that "VirtualMemory" and/or
> "TotalVirtualMemory" are updated as a machine runs, and that one might
> be able to use them to make sure there's enough memory available on a
> host to run jobs.  However, in my experiments I found the value was
> not updated nearly often enough to be useful: I gobbled up half the
> memory on a machine and the number hadn't changed even 15 minutes
> later, though updated ClassAds had been received from it (and I was
> querying it directly anyway).
>
> This comes up because I had a user who had queued jobs that kept
> flocking to another user's machine where there were available cores,
> but no available memory (local usage, outside of HTCondor).  Those
> queued jobs kept getting killed by oom_killer shortly after starting,
> but then new jobs would flock there.  Thus, I'm looking for some way
> to add to the requirements test of a job that the host in question has
> enough free virtual memory to run the job.



-- 
Steve Huston - W2SRH - Unix Sysadmin, PICSciE/CSES & Astrophysical Sci
  Princeton University  |    ICBM Address: 40.346344   -74.652242
    345 Lewis Library   |"On my ship, the Rocinante, wheeling through
  Princeton, NJ   08544 | the galaxies; headed for the heart of Cygnus,
    (267) 793-0852      | headlong into mystery."  -Rush, 'Cygnus X-1'