
Re: [Condor-users] Getting mad trying to flock in condor



According to the log, the machine that matched is located at
172.18.1.4, aka wn2.mycluster.org, and your schedd fails to connect to
it, likely due to a firewall issue.  I don't know what 178.12.100.2 is
in terms of condor, since it's not listed anywhere in the log.  I'm
assuming it's the collector/negotiator for the pool you are trying to
flock to.  However, punching its firewall is not enough; you also need
to punch the firewalls of all of the execute nodes in your remote pool
so that they let your schedds talk to them.  I do find it a bit
strange, though, that your wn2.mycluster.org seems to switch between
172.18.1.4 and 172.18.0.4, unless you're scrubbing the logs.
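
As a rough sketch (addresses taken from your mails; adjust as needed):
the startd listens on an ephemeral port by default, so single-port
rules won't cut it.  Something like this on each remote execute node
should let your two submit machines through:

sudo iptables -A INPUT -p tcp -s 172.18.0.2 -j ACCEPT
sudo iptables -A INPUT -p tcp -s 172.18.0.3 -j ACCEPT
sudo iptables -A INPUT -p udp -s 172.18.0.2 -j ACCEPT
sudo iptables -A INPUT -p udp -s 172.18.0.3 -j ACCEPT

If that's too permissive, you can instead pin the daemons to a fixed
port range with LOWPORT/HIGHPORT in condor_config and open only that
range.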

On Sun, Jul 8, 2012 at 10:58 PM, Michell Guzman Cancimance
<michellrad@xxxxxxxxx> wrote:
> Hi Ziliang,
>
> Well what I'm doing is :
>
> In the remote node (178.12.100.2), I'm running the following commands
> in order to accept jobs from the two submit nodes (172.18.0.3 and
> 172.18.0.2):
>
> sudo iptables -A INPUT -m state --state NEW -m tcp -p tcp -s 172.18.0.2 -d
> 178.12.100.2 -j ACCEPT
> sudo iptables -A INPUT -m state --state NEW -m tcp -p tcp -s 172.18.0.3 -d
> 178.12.100.2 -j ACCEPT
> sudo iptables -A INPUT -p udp -s 172.18.0.2 -d 178.12.100.2 -j ACCEPT
> sudo iptables -A INPUT -p udp -s 172.18.0.3 -d 178.12.100.2 -j ACCEPT
>
> And I do the same thing on the two submit nodes. Am I doing something
> wrong? Thanks in advance.
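>
> If it helps diagnose this, the rules as actually loaded can be listed
> on each node with:
>
> sudo iptables -L INPUT -n --line-numbers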
>
> Best regards
> Michell
>
>
> 2012/7/8 Ziliang Guo <ziliang@xxxxxxxxxxx>
>>
>> The last few errors look like you matched, but when your schedd tries
>> to connect to the startd on the execute node, it gets blocked.  Are
>> you sure the firewalls of your execute nodes in the remote pool are
>> set up to accept connections from your submit node?  And vice versa,
>> of course.
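>>
>> A quick way to check, run from the submit machine (the address and
>> port here are taken from the match record in your SchedLog; nc is
>> netcat, if you have it installed):
>>
>> nc -zv 172.18.1.4 50355
>> condor_status -pool cl-master.mycluster.org
>>
>> If the nc probe times out, the firewall is the likely culprit; the
>> condor_status query tells you whether you can at least reach the
>> remote collector.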
>>
>> On Sun, Jul 8, 2012 at 11:48 AM, Michell Guzman Cancimance
>> <michellrad@xxxxxxxxx> wrote:
>> > Hi Rob,
>> >
>> > Thanks very much for your reply. I have tried what you suggested,
>> > but the SchedLog shows almost the same, and the job still remains
>> > idle.
>> >
>> > Best regards
>> > Michell
>> >
>> > SchedLog
>> >
>> > 07/08/12 16:37:50 (pid:976) Match record (slot1@xxxxxxxxxxxxxxxxx
>> > <172.18.1.4:50355> for vagrant, 10.0) deleted
>> > 07/08/12 16:38:11 (pid:976) attempt to connect to <67.215.65.132:9618>
>> > failed: timed out after 20 seconds.
>> > 07/08/12 16:38:11 (pid:976) attempt to connect to <67.215.65.132:9618>
>> > failed: timed out after 20 seconds.
>> > 07/08/12 16:38:11 (pid:976) ERROR: SECMAN:2004:Failed to create security
>> > session to <67.215.65.132:9618> with TCP.
>> >
>> > |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> > 07/08/12 16:38:11 (pid:976) Failed to start non-blocking update to
>> > <67.215.65.132:9618>.
>> > 07/08/12 16:38:11 (pid:976) ERROR: SECMAN:2004:Failed to create security
>> > session to <67.215.65.132:9618> with TCP.
>> >
>> > |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> > 07/08/12 16:38:11 (pid:976) Failed to start non-blocking update to
>> > <67.215.65.132:9618>.
>> > 07/08/12 16:38:31 (pid:976) Activity on stashed negotiator socket:
>> > <172.18.0.2:46064>
>> > 07/08/12 16:38:31 (pid:976) Using negotiation protocol: NEGOTIATE
>> > 07/08/12 16:38:31 (pid:976) Negotiating for owner: vagrant@xxxxxxxxxxx
>> > 07/08/12 16:38:31 (pid:976) Finished negotiating for vagrant in local
>> > pool:
>> > 0 matched, 1 rejected
>> > 07/08/12 16:38:51 (pid:976) Activity on stashed negotiator socket:
>> > <172.18.1.2:50732>
>> > 07/08/12 16:38:51 (pid:976) Using negotiation protocol: NEGOTIATE
>> > 07/08/12 16:38:51 (pid:976) Negotiating for owner: vagrant@xxxxxxxxxxxxx
>> > (flock level 2, pool 172.18.1.2)
>> > 07/08/12 16:38:51 (pid:976) Finished negotiating for vagrant in pool
>> > 172.18.1.2: 1 matched, 0 rejected
>> > 07/08/12 16:38:51 (pid:976) TransferQueueManager stats: active up=0/10
>> > down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> > 07/08/12 16:38:51 (pid:976) Sent ad to central manager for
>> > vagrant@xxxxxxxxxxx
>> > 07/08/12 16:38:51 (pid:976) Sent ad to 1 collectors for
>> > vagrant@xxxxxxxxxxx
>> > 07/08/12 16:38:51 (pid:976) condor_read() failed: recv() returned -1,
>> > errno
>> > = 104 Connection reset by peer, reading 5 bytes from startd
>> > slot2@xxxxxxxxxxxxxxx <172.18.1.4:50355> for vagrant.
>> > 07/08/12 16:38:51 (pid:976) IO: Failed to read packet header
>> > 07/08/12 16:38:51 (pid:976) Response problem from startd when requesting
>> > claim slot2@xxxxxxxxxxxxxxxxx <172.18.0.4:50355> for vagrant 10.0.
>> > 07/08/12 16:38:51 (pid:976) Failed to send REQUEST_CLAIM to startd
>> > slot2@xxxxxxxxxxxxxxxxx <172.18.1.4:50355> for vagrant:
>> > CEDAR:6004:failed
>> > reading from socket
>> > 07/08/12 16:38:51 (pid:976) Match record (slot2@xxxxxxxxxxxxxxxxx
>> > <172.18.1.4:50355> for vagrant, 10.0) deleted
>> > 07/08/12 16:39:12 (pid:976) attempt to connect to <67.215.65.132:9618>
>> > failed: timed out after 20 seconds.
>> > 07/08/12 16:39:12 (pid:976) attempt to connect to <67.215.65.132:9618>
>> > failed: timed out after 20 seconds.
>> > 07/08/12 16:39:12 (pid:976) ERROR: SECMAN:2004:Failed to create security
>> > session to <67.215.65.132:9618> with TCP.
>> >
>> > |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> > 07/08/12 16:39:12 (pid:976) Failed to start non-blocking update to
>> > <67.215.65.132:9618>.
>> > 07/08/12 16:39:12 (pid:976) ERROR: SECMAN:2004:Failed to create security
>> > session to <67.215.65.132:9618> with TCP.
>> >
>> > |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> > 07/08/12 16:39:12 (pid:976) Failed to start non-blocking update to
>> > <67.215.65.132:9618>.
>> > 07/08/12 16:39:31 (pid:976) Activity on stashed negotiator socket:
>> > <172.18.0.2:46064>
>> > 07/08/12 16:39:31 (pid:976) Using negotiation protocol: NEGOTIATE
>> > 07/08/12 16:39:31 (pid:976) Negotiating for owner: vagrant@xxxxxxxxxxx
>> > 07/08/12 16:39:31 (pid:976) Finished negotiating for vagrant in local
>> > pool:
>> > 0 matched, 1 rejected
>> > 07/08/12 16:39:52 (pid:976) Activity on stashed negotiator socket:
>> > <172.18.1.2:50732>
>> > 07/08/12 16:39:52 (pid:976) Using negotiation protocol: NEGOTIATE
>> > 07/08/12 16:39:52 (pid:976) Negotiating for owner: vagrant@xxxxxxxxxxx
>> > (flock level 2, pool 172.18.1.2)
>> > 07/08/12 16:39:52 (pid:976) Finished negotiating for vagrant in pool
>> > 172.18.1.2: 1 matched, 0 rejected
>> > 07/08/12 16:39:52 (pid:976) TransferQueueManager stats: active up=0/10
>> > down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> > 07/08/12 16:39:52 (pid:976) Sent ad to central manager for
>> > vagrant@xxxxxxxxxxx
>> > 07/08/12 16:39:52 (pid:976) Sent ad to 1 collectors for
>> > vagrant@xxxxxxxxxxx
>> > 07/08/12 16:39:52 (pid:976) condor_read() failed: recv() returned -1,
>> > errno
>> > = 104 Connection reset by peer, reading 5 bytes from startd
>> > slot1@xxxxxxxxxxxxxxxxx <172.18.1.3:40072> for vagrant.
>> > 07/08/12 16:39:52 (pid:976) IO: Failed to read packet header
>> > 07/08/12 16:39:52 (pid:976) Response problem from startd when requesting
>> > claim slot1@xxxxxxxxxxxxxxxxx <172.18.1.3:40072> for vagrant 10.0.
>> > 07/08/12 16:39:52 (pid:976) Failed to send REQUEST_CLAIM to startd
>> > slot1@xxxxxxxxxxxxxxxxx <172.18.1.3:40072> for vagrant:
>> > CEDAR:6004:failed
>> > reading from socket
>> > 07/08/12 16:39:52 (pid:976) Match record (slot1@xxxxxxxxxxxxxxxxx
>> > <172.18.1.3:40072> for vagrant, 10.0) deleted
>> >
>> >
>> >
>> >
>> > 2012/7/7 Rob <spamrefuse@xxxxxxxxx>
>> >>
>> >> Hi,
>> >>
>> >> Could it be that a firewall prevents further communication
>> >> necessary for the flocking?  I had this problem and needed to add
>> >> exceptions to the iptables firewall rules on both machines, like
>> >> this:
>> >>
>> >> on machine X:
>> >>
>> >> -A INPUT -m state --state NEW -m tcp -p tcp -s xxx.xxx.xxx.xxx/16 -d
>> >> ip.addr.of.X -j ACCEPT
>> >> -A INPUT -p udp -s xxx.xxx.xxx.xxx/16 -d ip.addr.of.X -j ACCEPT
>> >>
>> >> where "xxx.xxx.xxx.xxx/16" includes all machines in my pool, including
>> >> the
>> >> two flocking machines.
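>> >>
>> >> To be sure the rules are actually in effect (and survive a reboot),
>> >> something like this helps; the save command below is the RHEL/CentOS
>> >> flavor, other distros do it differently:
>> >>
>> >> sudo iptables -L INPUT -n --line-numbers    # list the active rules
>> >> sudo service iptables save                  # persist across reboots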
>> >>
>> >> Rob.
>> >>
>> >> ________________________________
>> >> From: Michell Guzman Cancimance <michellrad@xxxxxxxxx>
>> >> To: condor-users@xxxxxxxxxxx
>> >> Sent: Wednesday, July 4, 2012 4:57 PM
>> >> Subject: [Condor-users] Getting mad trying to flock in condor
>> >>
>> >> Hi,
>> >>
>> >> I'm getting mad trying to flock a job from cluster A
>> >> (master.cluster.org, 172.18.0.2) to cluster B
>> >> (cl-master.mycluster.org, 178.12.100.2).  Each cluster has a master
>> >> and two worker nodes; the nodes of cluster A have arch X86_64, and
>> >> the nodes of cluster B have arch INTEL (32-bit).  I have configured
>> >> the flocking section of condor_config on the master node of each
>> >> cluster (master.cluster.org and cl-master.mycluster.org), following
>> >> the steps in
>> >> http://research.cs.wisc.edu/condor/manual/v6.8/5_2Connecting_Condor.html.
>> >> When I run a job in each cluster separately, it works fine, but when
>> >> I submit a job in cluster A (whose nodes have arch X86_64) with a
>> >> requirement of arch INTEL, trying to flock it to cluster B, it
>> >> doesn't work.  I have tried a lot of stuff but can't get any
>> >> success.  I would appreciate any help in solving this problem.
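>> >>
>> >> The flocking settings are essentially what that manual page
>> >> describes; a rough sketch with my host names (the exact security
>> >> settings may differ on your version):
>> >>
>> >> # on master.cluster.org (the submitting side):
>> >> FLOCK_TO = cl-master.mycluster.org
>> >>
>> >> # on cl-master.mycluster.org (the receiving side):
>> >> FLOCK_FROM = master.cluster.org
>> >> HOSTALLOW_WRITE = $(HOSTALLOW_WRITE), $(FLOCK_FROM)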
>> >>
>> >>
>> >> Best regards
>> >> Michell
>> >>
>> >>
>> >> This is my SchedLog file
>> >>
>> >> 07/03/12 07:05:38 (pid:813) Can't open directory "/config" as
>> >> PRIV_UNKNOWN, errno: 2 (No such file or directory)
>> >> 07/03/12 07:05:38 (pid:813) Can't open directory
>> >> "/opt/condor/tmp/condor/local.master/config" as PRIV_UNKNOWN, errno: 2
>> >> (No
>> >> such file or directory)
>> >> 07/03/12 07:05:38 (pid:813) passwd_cache::cache_uid():
>> >> getpwnam("condor")
>> >> failed: user not found
>> >> 07/03/12 07:05:38 (pid:813) passwd_cache::cache_uid():
>> >> getpwnam("condor")
>> >> failed: user not found
>> >> 07/03/12 07:05:38 (pid:813) Setting maximum accepts per cycle 4.
>> >> 07/03/12 07:05:38 (pid:813)
>> >> ******************************************************
>> >> 07/03/12 07:05:38 (pid:813) ** condor_schedd (CONDOR_SCHEDD) STARTING
>> >> UP
>> >> 07/03/12 07:05:38 (pid:813) **
>> >> /opt/condor/tmp/condor/sbin/condor_schedd
>> >> 07/03/12 07:05:38 (pid:813) ** SubsystemInfo: name=SCHEDD
>> >> type=SCHEDD(5)
>> >> class=DAEMON(1)
>> >> 07/03/12 07:05:38 (pid:813) ** Configuration: subsystem:SCHEDD
>> >> local:<NONE> class:DAEMON
>> >> 07/03/12 07:05:38 (pid:813) ** $CondorVersion: 7.6.4 Oct 20 2011
>> >> BuildID:
>> >> 379441 $
>> >> 07/03/12 07:05:38 (pid:813) ** $CondorPlatform: x86_64_deb_5.0 $
>> >> 07/03/12 07:05:38 (pid:813) ** PID = 813
>> >> 07/03/12 07:05:38 (pid:813) ** Log last touched 7/3 07:04:28
>> >> 07/03/12 07:05:38 (pid:813)
>> >> ******************************************************
>> >> 07/03/12 07:05:38 (pid:813) Using config source:
>> >> /opt/condor/tmp/condor/etc/condor_config
>> >> 07/03/12 07:05:38 (pid:813) Using local config sources:
>> >> 07/03/12 07:05:38 (pid:813)
>> >> /opt/condor/tmp/condor/local.master/condor_config.local
>> >> 07/03/12 07:05:38 (pid:813) DaemonCore: command socket at
>> >> <10.0.2.15:49711>
>> >> 07/03/12 07:05:38 (pid:813) DaemonCore: private command socket at
>> >> <10.0.2.15:49711>
>> >> 07/03/12 07:05:38 (pid:813) Setting maximum accepts per cycle 4.
>> >> 07/03/12 07:05:38 (pid:813) History file rotation is enabled.
>> >> 07/03/12 07:05:38 (pid:813)   Maximum history file size is: 20971520
>> >> bytes
>> >> 07/03/12 07:05:38 (pid:813)   Number of rotated history files is: 2
>> >> 07/03/12 07:05:40 (pid:813) About to rotate ClassAd log
>> >> /opt/condor/tmp/condor/local.master/spool/job_queue.log
>> >> 07/03/12 07:05:48 (pid:813) TransferQueueManager stats: active up=0/10
>> >> down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> >> 07/03/12 07:05:48 (pid:813) Sent ad to central manager for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:05:48 (pid:813) Sent ad to 1 collectors for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:06:38 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:06:38 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:06:38 (pid:813) AutoCluster:config() significant attributes
>> >> changed to
>> >> 07/03/12 07:06:38 (pid:813) Checking consistency running and runnable
>> >> jobs
>> >> 07/03/12 07:06:38 (pid:813) Tables are consistent
>> >> 07/03/12 07:06:38 (pid:813) Rebuilt prioritized runnable job list in
>> >> 0.001s.
>> >> 07/03/12 07:06:38 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:06:38 (pid:813) Increasing flock level for vagrant to 1.
>> >> 07/03/12 07:06:38 (pid:813) TransferQueueManager stats: active up=0/10
>> >> down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> >> 07/03/12 07:06:38 (pid:813) Sent ad to central manager for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:06:38 (pid:813) Sent ad to 1 collectors for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:06:59 (pid:813) attempt to connect to <67.215.65.132:9618>
>> >> failed: Connection timed out (connect errno = 110).
>> >> 07/03/12 07:06:59 (pid:813) attempt to connect to <67.215.65.132:9618>
>> >> failed: timed out after 20 seconds.
>> >> 07/03/12 07:06:59 (pid:813) ERROR: SECMAN:2004:Failed to create
>> >> security
>> >> session to <67.215.65.132:9618> with TCP.
>> >> |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> >> 07/03/12 07:06:59 (pid:813) Failed to start non-blocking update to
>> >> <67.215.65.132:9618>.
>> >> 07/03/12 07:06:59 (pid:813) ERROR: SECMAN:2004:Failed to create
>> >> security
>> >> session to <67.215.65.132:9618> with TCP.
>> >> |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> >> 07/03/12 07:06:59 (pid:813) Failed to start non-blocking update to
>> >> <67.215.65.132:9618>.
>> >> 07/03/12 07:07:38 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:07:38 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:07:38 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:07:38 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:08:38 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:08:38 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:08:38 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:08:38 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:09:38 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:09:38 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:09:38 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:09:38 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:09:47 (pid:813) TransferQueueManager stats: active up=0/10
>> >> down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> >> 07/03/12 07:09:47 (pid:813) Sent ad to central manager for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:09:47 (pid:813) Sent ad to 1 collectors for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:10:09 (pid:813) attempt to connect to <67.215.65.132:9618>
>> >> failed: Connection timed out (connect errno = 110).  Will keep trying
>> >> for 60
>> >> total seconds (39 to go).
>> >>
>> >> 07/03/12 07:10:48 (pid:813) attempt to connect to <67.215.65.132:9618>
>> >> failed: Connection timed out (connect errno = 110).
>> >> 07/03/12 07:10:48 (pid:813) Failed to send RESCHEDULE to negotiator
>> >> cl-master.mycluster.org:
>> >> 07/03/12 07:10:48 (pid:813) attempt to connect to <67.215.65.132:9618>
>> >> failed: Connection timed out (connect errno = 110).
>> >> 07/03/12 07:10:48 (pid:813) attempt to connect to <67.215.65.132:9618>
>> >> failed: Connection timed out (connect errno = 110).
>> >> 07/03/12 07:10:48 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:10:48 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:10:48 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:10:48 (pid:813) Checking consistency running and runnable
>> >> jobs
>> >> 07/03/12 07:10:48 (pid:813) Tables are consistent
>> >> 07/03/12 07:10:48 (pid:813) Rebuilt prioritized runnable job list in
>> >> 0.000s.
>> >> 07/03/12 07:10:48 (pid:813) ERROR: SECMAN:2004:Failed to create
>> >> security
>> >> session to <67.215.65.132:9618> with TCP.
>> >> |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> >> 07/03/12 07:10:48 (pid:813) Failed to start non-blocking update to
>> >> <67.215.65.132:9618>.
>> >> 07/03/12 07:10:48 (pid:813) ERROR: SECMAN:2004:Failed to create
>> >> security
>> >> session to <67.215.65.132:9618> with TCP.
>> >> |SECMAN:2003:TCP connection to <67.215.65.132:9618> failed.
>> >> 07/03/12 07:10:48 (pid:813) Failed to start non-blocking update to
>> >> <67.215.65.132:9618>.
>> >> 07/03/12 07:10:48 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:11:08 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:11:08 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:11:08 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:11:08 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:12:08 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:12:08 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:12:08 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:12:08 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 07:13:08 (pid:813) Activity on stashed negotiator socket:
>> >> <172.18.0.2:45498>
>> >> 07/03/12 07:13:08 (pid:813) Using negotiation protocol: NEGOTIATE
>> >> 07/03/12 07:13:08 (pid:813) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/03/12 07:13:08 (pid:813) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/03/12 16:54:51 (pid:836) Can't open directory "/config" as
>> >> PRIV_UNKNOWN, errno: 2 (No such file or directory)
>> >> 07/03/12 16:54:51 (pid:836) Can't open directory
>> >> "/opt/condor/tmp/condor/local.master/config" as PRIV_UNKNOWN, errno: 2
>> >> (No
>> >> such file or directory)
>> >> 07/03/12 16:54:51 (pid:836) passwd_cache::cache_uid():
>> >> getpwnam("condor")
>> >> failed: user not found
>> >> 07/03/12 16:54:51 (pid:836) passwd_cache::cache_uid():
>> >> getpwnam("condor")
>> >> failed: user not found
>> >> 07/03/12 16:54:51 (pid:836) Setting maximum accepts per cycle 4.
>> >>
>> >>
>> >> 07/04/12 08:13:01 (pid:1936) Can't open directory "/config" as
>> >> PRIV_UNKNOWN, errno: 2 (No such file or directory)
>> >> 07/04/12 08:13:01 (pid:1936) Can't open directory
>> >> "/opt/condor/tmp/condor/local.master/config" as PRIV_UNKNOWN, errno: 2
>> >> (No
>> >> such file or directory)
>> >> 07/04/12 08:13:01 (pid:1936) passwd_cache::cache_uid():
>> >> getpwnam("condor")
>> >> failed: user not found
>> >> 07/04/12 08:13:01 (pid:1936) passwd_cache::cache_uid():
>> >> getpwnam("condor")
>> >> failed: user not found
>> >> 07/04/12 08:13:01 (pid:1936) Setting maximum accepts per cycle 4.
>> >> 07/04/12 08:13:01 (pid:1936)
>> >> ******************************************************
>> >> 07/04/12 08:13:01 (pid:1936) ** condor_schedd (CONDOR_SCHEDD) STARTING
>> >> UP
>> >> 07/04/12 08:13:01 (pid:1936) **
>> >> /opt/condor/tmp/condor/sbin/condor_schedd
>> >> 07/04/12 08:13:01 (pid:1936) ** SubsystemInfo: name=SCHEDD
>> >> type=SCHEDD(5)
>> >> class=DAEMON(1)
>> >> 07/04/12 08:13:01 (pid:1936) ** Configuration: subsystem:SCHEDD
>> >> local:<NONE> class:DAEMON
>> >> 07/04/12 08:13:01 (pid:1936) ** $CondorVersion: 7.6.4 Oct 20 2011
>> >> BuildID:
>> >> 379441 $
>> >> 07/04/12 08:13:01 (pid:1936) ** $CondorPlatform: x86_64_deb_5.0 $
>> >> 07/04/12 08:13:01 (pid:1936) ** PID = 1936
>> >> 07/04/12 08:13:01 (pid:1936) ** Log last touched 7/4 08:13:01
>> >> 07/04/12 08:13:01 (pid:1936)
>> >> ******************************************************
>> >> 07/04/12 08:13:01 (pid:1936) Using config source:
>> >> /opt/condor/tmp/condor/etc/condor_config
>> >> 07/04/12 08:13:01 (pid:1936) Using local config sources:
>> >> 07/04/12 08:13:01 (pid:1936)
>> >> /opt/condor/tmp/condor/local.master/condor_config.local
>> >> 07/04/12 08:13:01 (pid:1936) DaemonCore: command socket at
>> >> <10.0.2.15:33007>
>> >> 07/04/12 08:13:01 (pid:1936) DaemonCore: private command socket at
>> >> <10.0.2.15:33007>
>> >> 07/04/12 08:13:01 (pid:1936) Setting maximum accepts per cycle 4.
>> >> 07/04/12 08:13:01 (pid:1936) History file rotation is enabled.
>> >> 07/04/12 08:13:01 (pid:1936)   Maximum history file size is: 20971520
>> >> bytes
>> >> 07/04/12 08:13:01 (pid:1936)   Number of rotated history files is: 2
>> >> 07/04/12 08:13:41 (pid:1936) TransferQueueManager stats: active up=0/10
>> >> down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> >> 07/04/12 08:13:41 (pid:1936) Sent ad to central manager for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:13:41 (pid:1936) Sent ad to 1 collectors for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:14:01 (pid:1936) IPVERIFY: unable to resolve IP address of
>> >> cl-master.mycluster.org
>> >> 07/04/12 08:14:21 (pid:1936) Using negotiation protocol: NEGOTIATE
>> >> 07/04/12 08:14:21 (pid:1936) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:14:21 (pid:1936) AutoCluster:config() significant
>> >> attributes
>> >> changed to
>> >> 07/04/12 08:14:21 (pid:1936) Checking consistency running and runnable
>> >> jobs
>> >> 07/04/12 08:14:21 (pid:1936) Tables are consistent
>> >> 07/04/12 08:14:21 (pid:1936) Rebuilt prioritized runnable job list in
>> >> 0.002s.
>> >> 07/04/12 08:14:21 (pid:1936) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/04/12 08:14:21 (pid:1936) Increasing flock level for vagrant to 1.
>> >> 07/04/12 08:14:21 (pid:1936) TransferQueueManager stats: active up=0/10
>> >> down=0/10; waiting up=0 down=0; wait time up=0s down=0s
>> >> 07/04/12 08:14:41 (pid:1936) Failed to start non-blocking update to
>> >> unknown.
>> >> 07/04/12 08:14:41 (pid:1936) Sent ad to central manager for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:14:41 (pid:1936) Sent ad to 1 collectors for
>> >> vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:15:01 (pid:1936) Failed to start non-blocking update to
>> >> unknown.
>> >> 07/04/12 08:15:26 (pid:1936) Using negotiation protocol: NEGOTIATE
>> >> 07/04/12 08:15:26 (pid:1936) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:15:26 (pid:1936) AutoCluster:config() significant
>> >> attributes
>> >> changed to JobUniverse,LastCheckpointPlatform,NumCkpts,Scheduler
>> >> 07/04/12 08:15:26 (pid:1936) Checking consistency running and runnable
>> >> jobs
>> >> 07/04/12 08:15:26 (pid:1936) Tables are consistent
>> >> 07/04/12 08:15:26 (pid:1936) Rebuilt prioritized runnable job list in
>> >> 0.002s.
>> >> 07/04/12 08:15:26 (pid:1936) Activity on stashed negotiator socket:
>> >> <172.18.0.2:38674>
>> >> 07/04/12 08:15:26 (pid:1936) Using negotiation protocol: NEGOTIATE
>> >> 07/04/12 08:15:26 (pid:1936) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:15:26 (pid:1936) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/04/12 08:15:26 (pid:1936) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/04/12 08:15:45 (pid:1936) Using negotiation protocol: NEGOTIATE
>> >> 07/04/12 08:15:45 (pid:1936) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:15:45 (pid:1936) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/04/12 08:16:27 (pid:1936) Activity on stashed negotiator socket:
>> >> <172.18.0.2:40374>
>> >> 07/04/12 08:16:27 (pid:1936) Using negotiation protocol: NEGOTIATE
>> >> 07/04/12 08:16:27 (pid:1936) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:16:27 (pid:1936) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >> 07/04/12 08:16:27 (pid:1936) Activity on stashed negotiator socket:
>> >> <172.18.0.2:38674>
>> >> 07/04/12 08:16:27 (pid:1936) Using negotiation protocol: NEGOTIATE
>> >> 07/04/12 08:16:27 (pid:1936) Negotiating for owner: vagrant@xxxxxxxxxxx
>> >> 07/04/12 08:16:27 (pid:1936) Finished negotiating for vagrant in local
>> >> pool: 0 matched, 2 rejected
>> >>
>> >>
>> >> --
>> >> "Nullius addictus jurare in verba magistri"
>> >>
>> >
>> >
>> >
>> > --
>> > "Nullius addictus jurare in verba magistri"
>> >
>>
>>
>>
>> --
>> Condor Project Windows Developer
>
>
>
>
> --
> "Nullius addictus jurare in verba magistri"
>



-- 
Condor Project Windows Developer