[HTCondor-users] condor_transfer_data makes schedd unresponsive
- Date: Thu, 20 Feb 2014 11:06:10 -0800
- From: Frank Berghaus <frank@xxxxxxx>
- Subject: [HTCondor-users] condor_transfer_data makes schedd unresponsive
I'm running condor_version:
$CondorVersion: 8.0.3 Sep 19 2013 BuildID: 174914 $
$CondorPlatform: x86_64_RedHat6 $
Jobs are submitted remotely, so the remote machine has to retrieve the results via condor_transfer_data. When I try to retrieve completed jobs, Condor becomes unresponsive. On the remote machine:
Fetching data files...
DCSchedd::receiveJobSandbox:6004:Can't receive JobAdsArrayLen from the schedd (<18.104.22.168:8081>)
ERROR: Failed to spool job files.
There is a long (5 min) pause between the "Fetching data files..." line and the DCSchedd message. While this is going on, the scheduler seems unresponsive, even locally:
SECMAN:2007:Failed to end classad message
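To put a number on the pause described above, the retrieval could be wrapped in a small timing script. This is only a rough sketch: the helper names build_transfer_cmd and timed_run are made up for illustration, and it assumes condor_transfer_data is on the PATH and accepts the usual -pool and -name options followed by a job or cluster ID.

```python
import subprocess
import time


def build_transfer_cmd(job_id, schedd_name=None, pool=None):
    """Build a condor_transfer_data command line for one job or cluster.

    schedd_name and pool are optional; when given they are passed
    through as -name and -pool (hypothetical helper, not part of HTCondor).
    """
    cmd = ["condor_transfer_data"]
    if pool:
        cmd += ["-pool", pool]
    if schedd_name:
        cmd += ["-name", schedd_name]
    cmd.append(str(job_id))
    return cmd


def timed_run(cmd, timeout=600):
    """Run cmd and return (exit_code, seconds_elapsed, combined_output)."""
    start = time.monotonic()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error lines in order with stdout
        text=True,
        timeout=timeout,
    )
    return proc.returncode, time.monotonic() - start, proc.stdout
```

Calling timed_run(build_transfer_cmd("42.0", schedd_name="schedd.example.org")) would then return the exit code, the wall-clock duration and the captured output; the job ID and schedd name here are placeholders, not values from my setup.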
Since this is running on a Linux machine, I thought the condor_transfer_data request was supposed to be forked off from the scheduler. Is there a setting that could be preventing this?
University of Victoria
Physics & Astronomy
UVic Phone: +1 (250) 472-4085
UVic Office: Elliot 201