
Re: [HTCondor-users] condor_write(): Socket closed when trying to write

OK, I am now working with a simplified version of the condor configuration file, and I have set the debug variable to get more information.

The scenario is the same as before: I submit a task from the demo01.org domain and expect flocking to occur between the cluster in that domain and the one in the demo02.org domain.

When I run the task, the NegotiatorLog on the demo02 master shows this:

08/09/14 21:03:15 (fd:8) (pid:893) (D_ALWAYS) Phase 4.1:  Negotiating with schedds ...
08/09/14 21:03:15 (fd:8) (pid:893) (D_ALWAYS)   Negotiating with condor@xxxxxxxxxx at <>
08/09/14 21:03:15 (fd:8) (pid:893) (D_ALWAYS) 0 seconds so far
08/09/14 21:03:15 (fd:8) (pid:893) (D_CONFIG) TIMEOUT_MULTIPLIER is undefined, using default value of 0
08/09/14 21:03:15 (fd:8) (pid:893) (D_CONFIG) NEGOTIATOR_TIMEOUT_MULTIPLIER is undefined, using default value of 0
08/09/14 21:03:15 (fd:8) (pid:893) (D_DAEMONCORE) *** TIMEOUT_MULTIPLIER :: 0
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Found Name in ClassAd, using "condor@xxxxxxxxxx"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Daemon client (schedd) address determined: name: "condor@xxxxxxxxxx", pool: "NULL", alias: "NULL", addr: "<>"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Found SCHEDDIpAddr in ClassAd, using "<>"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Found CondorVersion in ClassAd, using "$CondorVersion: 8.0.6 Feb 01 2014 BuildID: 225363 $"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Found CondorPlatform in ClassAd, using "$CondorPlatform: x86_64_Ubuntu12 $"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Found Machine in ClassAd, using "master01.demo01.org"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) New Daemon obj (schedd) name: "condor@xxxxxxxxxx", pool: "NULL", addr: "<>"
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) Guess address string for host = <>, port = 0
08/09/14 21:03:15 (fd:8) (pid:893) (D_HOSTNAME) it was sinful string. ip =, port = 37603
08/09/14 21:03:15 (fd:11) (pid:893) (D_CONFIG) OUT_LOWPORT is undefined, using default value of 0
08/09/14 21:03:15 (fd:11) (pid:893) (D_CONFIG) LOWPORT is undefined, using default value of 0
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) CONNECT bound to <> fd=8 peer=<>
08/09/14 21:03:15 (fd:11) (pid:893) (D_SECURITY) SECMAN: command 416 NEGOTIATE to schedd condor@xxxxxxxxxx from TCP port 47126 (blocking).
08/09/14 21:03:15 (fd:11) (pid:893) (D_SECURITY) SECMAN: using session master01:897:1407617882:7 for {<>,<416>}.
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) condor_write(fd=8 schedd condor@xxxxxxxxxx,,size=617,timeout=30,flags=0)
08/09/14 21:03:15 (fd:11) (pid:893) (D_SECURITY) SECMAN: startCommand succeeded.
08/09/14 21:03:15 (fd:11) (pid:893) (D_HOSTNAME) Destroying Daemon object:
08/09/14 21:03:15 (fd:11) (pid:893) (D_HOSTNAME) Type: 3 (schedd), Name: condor@xxxxxxxxxx, Addr: <>
08/09/14 21:03:15 (fd:11) (pid:893) (D_HOSTNAME) FullHost: master01.demo01.org, Host: master01, Pool: (null), Port: -1
08/09/14 21:03:15 (fd:11) (pid:893) (D_HOSTNAME) IsLocal: N, IdStr: schedd condor@xxxxxxxxxx, Error: (null)
08/09/14 21:03:15 (fd:11) (pid:893) (D_HOSTNAME)  --- End of Daemon object info ---
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) condor_write(fd=8 schedd condor@xxxxxxxxxx,,size=200,timeout=30,flags=0)
08/09/14 21:03:15 (fd:11) (pid:893) (D_CONFIG) SERVICE_COMMAND_SOCKET_MAX_SOCKET_INDEX is undefined, using default value of 0
08/09/14 21:03:15 (fd:11) (pid:893) (D_CONFIG) NEG_SLEEP is undefined, using default value of 0
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) condor_write(fd=8 schedd condor@xxxxxxxxxx,,size=13,timeout=30,flags=0)
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) condor_read(fd=8 schedd condor@xxxxxxxxxx,,size=5,timeout=30,flags=0)
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) Stream::get(int) failed to read padding
08/09/14 21:03:15 (fd:11) (pid:893) (D_ALWAYS)     Failed to get reply from schedd
08/09/14 21:03:15 (fd:11) (pid:893) (D_NETWORK) CLOSE <> fd=8
08/09/14 21:03:15 (fd:8) (pid:893) (D_ALWAYS)   Error: Ignoring submitter for this cycle
08/09/14 21:03:15 (fd:8) (pid:893) (D_ALWAYS)  negotiateWithGroup resources used scheddAds length 0
08/09/14 21:03:15 (fd:8) (pid:893) (D_CONFIG) INSERT_NEGOTIATOR_CYCLE_TEST_DURATION is undefined, using default value of 0
08/09/14 21:03:15 (fd:8) (pid:893) (D_ALWAYS) ---------- Finished Negotiation Cycle ----------


On 7 August 2014 08:24, Zachary Miller <zmiller@xxxxxxxxxxx> wrote:
On Wed, Aug 06, 2014 at 09:49:36PM -0500, john alexander sanabria ordonez wrote:
> I saw that message (PERMISSION_DENIED) in the SchedLog file and I solved
> that issue defining the following macros

That is giving permission for the Schedd to appear in the Collector (i.e. when
you run "condor_status -schedd") but it is not authorizing the Negotiator to
talk to the Schedd (which is the daemon reporting the permission denied), and
why you see the subject of this email in the Negotiator log.

> Anyway, because I cannot figure out a solution to the flocking problem,
> right now the FLOCK_TO and FLOCK_FROM have '*' as their value then the
> condor_write() issue appeared.

FLOCK_TO needs to be set to a list of actual hostnames. I would suggest the
same for FLOCK_FROM. As you can see in the config entries above, FLOCK_FROM is
already included in the list. You should really only need to change the
FLOCK_TO and FLOCK_FROM settings if you are starting with a stock config file.
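
As a minimal sketch of that advice, assuming the demo02 central manager is
named master01.demo02.org (a hypothetical hostname; only master01.demo01.org
appears in the log above), the two settings might look like this. In a stock
config the related flocking and authorization macros are normally derived from
these two lists, so nothing else should need editing, though that is worth
checking against your own file.

```
# On the submitting pool (demo01.org): central managers to flock to.
# master01.demo02.org is an assumed name; substitute the real host.
FLOCK_TO = master01.demo02.org

# On the pool that runs the flocked jobs (demo02.org): submit hosts
# allowed to flock in. Use actual hostnames rather than '*'.
FLOCK_FROM = master01.demo01.org
```

After changing either file, running condor_reconfig on the affected machines
should make the daemons pick up the new values.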

