
RE: [Condor-users] Poor XP responsiveness for large number of shadowprocesses



Hi

We regularly have users with 100 shadow processes running on their desktop submit node (WinXP SP1 or SP2, non-HT P4, 1 GB RAM, 100 Mbit NICs). We found that our AV software, McAfee VirusScan Enterprise 8, was chewing through CPU cycles scanning the Condor log files, so we added an exclusion in the AV software for d:\condor\logs. This helped tremendously.

	regards
		Patrick.

--On 15 June 2005 09:08 -0700 Huzefa Neemuchwala <hneemuchwala@xxxxxxxxxxxxxxxx> wrote:


-----Original Message-----
From: Christopher Mellen [mailto:Chris.Mellen@xxxxxxxxxx]
Sent: Wednesday, June 15, 2005 5:07 AM
To: 'Condor-Users Mail List'
Subject: [Condor-users] Poor XP responsiveness for large number of shadowprocesses

Hi all,

We're noticing that our Windows XP submit machines seem to cope only poorly when large numbers (say 40+) of shadow processes are in place. The symptoms are that the GUI becomes very sluggish and occasionally completely unresponsive for periods exceeding 10 seconds. The jobs being run do generate large volumes of network traffic, both at start-up and completion. I'm wondering:

1. Is this to be expected? Do Linux/Unix users see similar 'stresses' in their systems, or is it that Windows multi-tasks only poorly?
2. If this is atypical for Windows systems, is there anything we can do to improve usability? E.g., increase the desktop heap size?

FYI: each submit machine has 2 GB RAM, dual processors, max swap space set, and a 1 Gb Ethernet LAN connection.

Thanks,
Chris.


_______________________________________________ Condor-users mailing list Condor-users@xxxxxxxxxxx https://lists.cs.wisc.edu/mailman/listinfo/condor-users


Hey Chris,

We have seen this happen on our pool also. Several "desktop" machines
would become so sluggish that users threatened to kill condor jobs. We
got around it, partially, by having a 'desktop-user' policy, wherein a
condor job is accepted only when non-condor load is low. Also, we now
suspend and evict a job rapidly when the non-condor load increases on a
machine running a condor job.
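
A policy along these lines could be sketched in the startd configuration roughly as follows. This is a minimal sketch, not the configuration used on the pool described here: the macro names follow the stock condor_config examples, while the thresholds and timings are illustrative assumptions that would need tuning.

```
## Helper macros (names as in the stock example config; values are assumptions)
MINUTE            = 60
ActivityTimer     = (CurrentTime - EnteredCurrentActivity)
NonCondorLoadAvg  = (LoadAvg - CondorLoadAvg)
BackgroundLoad    = 0.3
HighLoad          = 0.5
StartIdleTime     = 15 * $(MINUTE)
ContinueIdleTime  = 5 * $(MINUTE)
## Kept short so that a suspended job is evicted quickly (assumed value)
MaxSuspendTime    = 2 * $(MINUTE)

CPUIdle           = ($(NonCondorLoadAvg) <= $(BackgroundLoad))
CPUBusy           = ($(NonCondorLoadAvg) >= $(HighLoad))
KeyboardBusy      = (KeyboardIdle < $(MINUTE))

## Accept a job only when non-Condor load is low and the desktop is idle
START        = $(CPUIdle) && (KeyboardIdle > $(StartIdleTime))

## Suspend as soon as the owner's load or keyboard activity returns
WANT_SUSPEND = True
SUSPEND      = $(CPUBusy) || $(KeyboardBusy)
CONTINUE     = $(CPUIdle) && (KeyboardIdle > $(ContinueIdleTime))

## Evict a job that has stayed suspended for too long
PREEMPT      = (Activity == "Suspended") && ($(ActivityTimer) > $(MaxSuspendTime))
```

Lowering MaxSuspendTime makes eviction faster at the cost of more lost work for jobs that cannot checkpoint.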

Still, these estimates of non-condor load are not very reliable, and
sometimes the collector on our Windows master is slow to register that
the load has gone up. Because of this we have started planning a move
to a Linux master.

It would be interesting to us too, if the developers could address this
issue.

Thanks
Huzefa






-------------------------------------
Patrick Townsend  -  Computer Support
Electrical  &  Electronic Engineering
MVB room 4.27   University of Bristol
www.bris.ac.uk     Tel: 0117 954 5288
email: patrick.townsend@xxxxxxxxxxxxx