
Re: [HTCondor-users] cgroups: monitoring network io via net_cls?



On 2/23/2016 9:24 AM, Thomas Hartmann wrote:
Hi all,

I just noticed that for jobs in cgroups no parameters are set for
the network, i.e., there is no /cgroup/net_cls/htcondor/...

My idea was to see whether one could monitor the network I/O for each job
(with tc?), for example the overall sent/received packets, or the totals after a job has
finished. (The same would probably be interesting for blkio as well.)

But as far as I can see, Condor uses cgroups only for CPU and memory, right?


HTCondor vanilla universe jobs use just the CPU, memory, and freezer controllers. I agree that it could be interesting to add blkio and net_cls.

Regarding monitoring the network activity of jobs, it is not clear that a net_cls cgroup is really what you want. Last I knew, net_cls only tags traffic in the kernel so that tc can do things like prioritize traffic by cgroup. Even if you could get traffic totals per cgroup (I am not sure how), that seems problematic; for instance, you probably don't want traffic on the loopback interface to count.

So what you likely want is to monitor traffic per network interface, and to give each job (slot) its own virtual network interface. Giving each job its own network identity also lets you shape, monitor, and control its traffic once it leaves your machine and goes onto the network. This is the approach we explored in the Lark Project, where we worked on adding network awareness to HTCondor. See
  https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=LarkProject
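
To illustrate the per-interface idea outside of HTCondor, here is a minimal Python sketch that computes byte totals per interface from two snapshots of the Linux /proc/net/dev table and leaves out loopback. This is not how Lark or HTCondor does it; the function names parse_net_dev and traffic_delta are made up for this example, and it assumes the usual /proc/net/dev layout (two header lines, then 16 counters per interface with received bytes first and transmitted bytes ninth).

```python
def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {interface: (rx_bytes, tx_bytes)}."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not a complete counter row, skip it
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def traffic_delta(before, after, skip=("lo",)):
    """Bytes received/sent per interface between two snapshots.

    Interfaces listed in `skip` (loopback by default) are not counted,
    and interfaces missing from either snapshot are ignored.
    """
    delta = {}
    for name, (rx_after, tx_after) in after.items():
        if name in skip or name not in before:
            continue
        rx_before, tx_before = before[name]
        delta[name] = (rx_after - rx_before, tx_after - tx_before)
    return delta
```

On a Linux execute node one would read /proc/net/dev when the job starts and again when it ends, then pass both snapshots to traffic_delta. This only attributes traffic to a job if the job really has an interface of its own, which is exactly why a per-job virtual interface is needed.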

Docker takes the same approach. Note that as of HTCondor v8.5.2, Docker universe jobs have their network input and output usage published as attributes in the job ClassAd. See
 https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=5456
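
As a rough sketch of how one might pick those values out of a job ad, the Python snippet below scans long-form ClassAd text (one "Attribute = value" pair per line, as printed by the -long option of the HTCondor query tools) for the network usage attributes. The attribute names NetworkInputMb and NetworkOutputMb are an assumption on my part; check the ticket above or the manual for the names your version actually publishes. The helper name network_usage_mb is made up for this example.

```python
def network_usage_mb(classad_text, attrs=("NetworkInputMb", "NetworkOutputMb")):
    """Extract numeric network usage attributes from long-form ClassAd text.

    The default attribute names are assumed, not confirmed by the ticket;
    pass different names via `attrs` if your HTCondor version differs.
    """
    usage = {}
    for line in classad_text.splitlines():
        # Split on the first "=" only, so expressions containing "==" survive.
        name, sep, value = line.partition("=")
        name = name.strip()
        if not sep or name not in attrs:
            continue
        try:
            usage[name] = float(value.strip())
        except ValueError:
            # Value is not a plain number (e.g. undefined); leave it out.
            pass
    return usage
```

Attributes that are missing from the ad, as they would be for a vanilla universe job today, are simply absent from the returned dictionary.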

Once we merge the code we wrote for the Lark Project into mainstream HTCondor, vanilla universe jobs should also be able to have network usage attributes. I cannot promise when this will happen, but at least we have been thinking about and working on mechanisms to handle network traffic in HTCondor.

regards
Todd