
Re: [Condor-users] Problem with match: sending small msg failed



John,

I think you mean, "are there any UDP messages from Submit <-> Execute?" The answer is yes. Those are the messages you can switch to TCP in the upcoming 6.9.5 release with SCHEDD_SEND_VACATE_VIA_TCP and STARTD_SENDS_ALIVES. However, I just noticed that these knobs are not documented in the 6.9.5 manual, so perhaps I have leaked something that is not intended for public consumption yet. I'd better check!

--Dan

Kewley, J (John) wrote:

Dan,

That message is from Neg->Startd. Are there any udp messages from
Execute->Startd or from Startd->Execute?

I thought there was some in at least one direction.

JK

-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx on behalf of Dan Bradley
Sent: Tue 27/11/2007 23:53
To: Condor-Users Mail List
Subject: Re: [Condor-users] Problem with match: sending small msg failed


In the latest version of Condor, about to be released as 6.9.5, it should be possible to run Condor completely without using UDP.

The specific problem you are having in the negotiator can be bypassed in 6.9.5 like this:

NEGOTIATOR_INFORM_STARTD = False

That disables the UDP message from the negotiator to the startd.

As John mentioned, in current versions of Condor, you can also have your daemons advertise themselves to the collector using TCP instead of UDP. You may not have a problem with that, since, presumably, you already have your daemons successfully advertising themselves to the collector.

Other places where UDP can be bypassed in 6.9.5:

SCHEDD_SEND_VACATE_VIA_TCP = true
STARTD_SENDS_ALIVES = true

--Dan

Kewley, J (John) wrote:
You can tell Condor to use tcp for its updates:
http://www.cs.wisc.edu/condor/manual/v6.9/3_7Networking.html#sec:tcp-collector-update
but that is between the nodes and the central manager. These comms are direct
between the scheduler and execute machines, so I don't think that would help.

Section 2.4 of the document
http://epubs.cclrc.ac.uk/bitstream/919/431.pdf
describes the use of udp and tcp in detail. Two important things to note: it shows a table where Submit initiates udp to Execute, but not vice versa, and it also mentions that not all the udp traffic can be turned into tcp.

Cheers

JK


   ------------------------------------------------------------------------
   *From:* condor-users-bounces@xxxxxxxxxxx
   [mailto:condor-users-bounces@xxxxxxxxxxx] *On Behalf Of* Enol Fernández
   *Sent:* Thursday, November 22, 2007 6:28 PM
   *To:* Condor-Users Mail List
   *Subject:* Re: [Condor-users] Problem with match: sending small msg failed

   OK, so maybe it's a firewall problem because the udp ports are not
   open, but I don't have enough permission to open them. Is it
   possible to use only tcp?

   Enol.

   2007/11/22, Kewley, J (John) <j.kewley@xxxxxxxx
   <mailto:j.kewley@xxxxxxxx>>:

               It seems it's within the range
               $ condor_config_val LOWPORT
               20000
               $ condor_config_val HIGHPORT
               25000

       OK, that is good, and since they presumably share the same
       files, being on the same machine, that should be OK.

               I have just 1 startd with two virtual machines, but I
               have a startd expression to make my job run on only one
               of them; that's why it does match vm2 but not vm1

       OK, good plan!

               I also have a firewall on the schedd machine, but it is
               also open on the 20000:25000 range.

       Can you check your firewall settings? You need to have it open
       for udp as well as tcp. It might be that, although I doubt this
       is causing this problem.

       You can do some simple checks on the ports in each direction
       as follows:

       [on startd machine]
       telnet <schedd machine> 20001
       and
       telnet <schedd machine> 19999

       and also:

       [on schedd machine]
       telnet <startd machine> 20001
       and
       telnet <startd machine> 19999

       but that will only test tcp, not udp ports being open
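
Since telnet only exercises tcp, a rough udp check could look like the sketch below. It is not part of Condor and is only an illustration: the function names `udp_echo_once` and `udp_probe` are made up for this example, and it assumes you can run Python on both machines and bind a free port in your LOWPORT:HIGHPORT range.

```python
import socket


def udp_echo_once(sock, timeout=5.0):
    """Wait for one datagram on an already-bound UDP socket and echo it back.

    Returns True if a datagram arrived and was echoed, False on timeout.
    """
    sock.settimeout(timeout)
    try:
        data, addr = sock.recvfrom(1024)
    except socket.timeout:
        return False
    sock.sendto(data, addr)
    return True


def udp_probe(host, port, timeout=2.0):
    """Send one datagram to host:port and report whether it is echoed back.

    False means no echo arrived in time. That can be a firewall dropping
    udp in either direction, or simply nothing listening on that port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as cli:
        cli.settimeout(timeout)
        try:
            cli.sendto(b"ping", (host, port))
            data, _ = cli.recvfrom(1024)
        except OSError:
            # Covers timeouts and ICMP "port unreachable" style errors.
            return False
        return data == b"ping"
```

To use it, bind a UDP socket on one machine (for example to port 20001, if no daemon is already holding it), pass it to `udp_echo_once`, and call `udp_probe("<that machine>", 20001)` from the other; then swap roles to test the opposite direction. A False result is only a hint, since udp gives no delivery guarantee.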

       Of course, it might not be anything at all to do with firewalls.
       But it usually is ... :)

       GL

       Cheers

       JK

       _______________________________________________
       Condor-users mailing list
       To unsubscribe, send a message to
       condor-users-request@xxxxxxxxxxx
       <mailto:condor-users-request@xxxxxxxxxxx> with a
       subject: Unsubscribe
       You can also unsubscribe by visiting
       https://lists.cs.wisc.edu/mailman/listinfo/condor-users

       The archives can be found at:
       https://lists.cs.wisc.edu/archive/condor-users/


------------------------------------------------------------------------
