
RE: [Condor-users] Poor XP responsiveness for large number of shadow processes



-----Original Message-----
From: Christopher Mellen [mailto:Chris.Mellen@xxxxxxxxxx] 
Sent: Wednesday, June 15, 2005 5:07 AM
To: 'Condor-Users Mail List'
Subject: [Condor-users] Poor XP responsiveness for large number of shadow processes

Hi all,

We're noticing that our Windows XP submit machines cope poorly when
large numbers (say 40+) of shadow processes are running. The symptoms
are that the GUI becomes very sluggish and occasionally completely
unresponsive for periods exceeding 10 seconds. The jobs being run do
generate large volumes of network traffic, both at start-up and at
completion. I'm wondering:

1. Is this to be expected? Do Linux/Unix users see similar 'stresses'
in their systems, or is it that Windows multitasks poorly?
2. If this is atypical for Windows systems, is there anything we can do
to improve usability? E.g., increase the desktop heap size?

FYI: each submit machine has 2 GB RAM, dual processors, max swap space
set, and a 1 Gb Ethernet LAN connection.

Thanks,
Chris.




Hey Chris,

We have seen this happen on our pool also. Several "desktop" machines
became so sluggish that users threatened to kill Condor jobs. We got
around it, partially, with a 'desktop-user' policy, under which a
Condor job is accepted only when the non-Condor load is low. We also
now suspend and evict a job rapidly when the non-Condor load increases
on a machine running a Condor job.
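
For anyone wanting to try something similar, a policy along these lines
could be sketched in the execute machines' local Condor configuration
roughly as below. This is not our actual configuration: the macro names
NonCondorLoadAvg, BackgroundLoad, HighLoad and MaxSuspendTime are just
local definitions, and every threshold is an assumed placeholder to be
tuned per pool. It only uses the standard START, WANT_SUSPEND, SUSPEND,
CONTINUE and PREEMPT expressions and the LoadAvg, CondorLoadAvg,
KeyboardIdle, Activity, CurrentTime and EnteredCurrentActivity machine
attributes.

```
# Assumed placeholder values; tune for your own pool.
MINUTE           = 60
NonCondorLoadAvg = (LoadAvg - CondorLoadAvg)
BackgroundLoad   = 0.3
HighLoad         = 0.5
MaxSuspendTime   = 2 * $(MINUTE)

# Accept a job only when the owner's load is low and the console is idle.
START        = ($(NonCondorLoadAvg) <= $(BackgroundLoad)) && \
               (KeyboardIdle > 15 * $(MINUTE))

# Suspend as soon as non-Condor load rises or the owner returns.
WANT_SUSPEND = True
SUSPEND      = ($(NonCondorLoadAvg) >= $(HighLoad)) || \
               (KeyboardIdle < $(MINUTE))

# Resume only once the machine has been quiet again for a while.
CONTINUE     = ($(NonCondorLoadAvg) <= $(BackgroundLoad)) && \
               (KeyboardIdle > 5 * $(MINUTE))

# Evict a job that has stayed suspended longer than MaxSuspendTime.
PREEMPT      = (Activity == "Suspended") && \
               ((CurrentTime - EnteredCurrentActivity) > $(MaxSuspendTime))
```

Note that this governs the machines running the jobs; it does nothing
for the load that the shadow processes themselves put on a submit
machine.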

Still, these estimates of non-Condor load are not very reliable, and
sometimes the collector on our Windows master is slow to register that
the load has gone up. Because of this we have started planning a move
to a Linux master.

We would also be interested to hear whether the developers could
address this issue.

Thanks
Huzefa