
[Condor-users] Condor network connections when running a job



Hi All

 

We’re trying to diagnose a comms problem that has reared its head

in our Condor system in the last few weeks that we haven’t seen

before in the past few years.

 

We have multiple pools with flocking enabled (routers between

geographically separated sites). It appears to be some sort

of network/router related issue, but it would be useful if we had a few

more details about how Condor communicates.

 

I think that when doing “nothing” all of Condor’s comms are UDP,

i.e. nodes communicating with the central managers send UDP updates?

I also think that submit machines use UDP when there are queued

jobs, and that this info goes to the CM as well? I presume that TCP is used

when file transfer is needed from submit to execute nodes at job start

and also from execute to submit nodes at job finish. I suspect, though,

that during job execution UDP is used for comms from the

execute node to the CM and between the execute and submit nodes?

 

Thanks for any info that anyone has with regard to this.

 

Cheers

 

Greg

------------------------------------------------------------------------------------------------------
Greg Hitchen                                                                         greg.hitchen@xxxxxxxx
CSIRO IM&T eScience                                                       phone: +61 8 6436 8663
Australian Resources Research Centre (ARRC)             fax:       +61 8 6436 8555
Postal address:                                                                     mob:          0407 952 748
PO Box 1130, Bentley WA 6102, Australia
Street Address:
26 Dick Perry Avenue, Kensington WA 6151
-------------------------------------------------------------------------------------------------------