Re: [Condor-users] limit network bandwidth to prevent shadow exception



On Apr 15, 2008, at 4:27 PM, Pasquale Tricarico wrote:

In our condor cluster (60 CPUs, 20 nodes):

$CondorVersion: 7.0.1 Feb 26 2008 BuildID: 76180 $
$CondorPlatform: X86_64-LINUX_RHEL3 $

we're facing the problem that when transferring big result files back
from the execution machine to the submission machine, the latter gets
a very high load and that triggers the shadow exception (FWIW, this is
our analysis of the problem). This of course happens at the very end
of a job, right after it's done running.

We're looking into ways to limit the network bandwidth when the
results are sent back to the submission machine. Is there any Condor
variable that can be set to do this? Otherwise, can you suggest some
other external tool to achieve this bandwidth limit?


Have you played with the MAX_CONCURRENT_DOWNLOADS and MAX_CONCURRENT_UPLOADS parameters in the Condor config file? They limit the number of simultaneous file transfers, though not the total bandwidth.
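
As a rough sketch of what that could look like in the Condor config file on the submission machine: the values below are illustrative assumptions, not tuned recommendations. My understanding is that MAX_CONCURRENT_DOWNLOADS covers output files coming back from the execution machines, which is the case described above, and that MAX_CONCURRENT_UPLOADS covers input files going out to them, with both limits applying per condor_schedd.

```
# Illustrative values only; pick numbers that suit your submission machine.

# At most 2 simultaneous transfers of output files from execution
# machines back to this submission machine.
MAX_CONCURRENT_DOWNLOADS = 2

# At most 5 simultaneous transfers of input files from this submission
# machine out to execution machines.
MAX_CONCURRENT_UPLOADS = 5
```

After editing the file, running condor_reconfig on the submission machine (or restarting the schedd) should make the new limits take effect. Jobs that finish while the limit is reached should wait their turn to transfer rather than all sending results at once.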

Thanks and regards,
Jaime Frey
UW-Madison Condor Team