Re: [Condor-users] New to Condor, Need to RUN MPI
- Date: Tue, 3 Feb 2009 17:44:21 -0500
- From: Samir Khanal <skhanal@xxxxxxxx>
- Subject: Re: [Condor-users] New to Condor, Need to RUN MPI
Hi guys,
I have spent the whole day trying to get condor_status to list all the machines in the pool.
It only shows the slots on the frontend:
$ condor_status
Name               OpSys  Arch    State      Activity  LoadAv  Mem  ActvtyTime
slot1@xxxxxxxxxxxx LINUX  X86_64  Owner      Idle      0.000   990  0+00:45:11
slot2@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:30:05
slot3@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:30:06
slot4@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:30:07
slot5@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:30:08
slot6@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:30:09
slot7@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:25:10
slot8@xxxxxxxxxxxx LINUX  X86_64  Unclaimed  Idle      0.000   990  0+00:30:03
              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill
X86_64/LINUX      8      1        0          7        0           0         0
       Total      8      1        0          7        0           0         0
The daemons running on the frontend are:
$ ps -el | grep condor
5 S 407 3679 1 0 75 0 - 6702 - ? 00:00:01 condor_master
4 S 407 3694 3679 0 75 0 - 6933 - ? 00:00:00 condor_collecto
4 S 407 3864 3679 0 75 0 - 6705 - ? 00:00:00 condor_schedd
4 S 407 3866 3679 0 78 0 - 6604 - ? 00:00:06 condor_startd
4 S 407 3867 3679 0 75 0 - 6324 - ? 00:00:00 condor_negotiat
4 S 0 3874 3864 0 78 0 - 4981 - ? 00:00:00 condor_procd
And those on a compute node are:
$ ps -el | grep condor
5 S 407 2742 1 0 75 0 - 6568 - ? 00:00:00 condor_master
4 S 407 2792 2742 0 75 0 - 6658 - ? 00:00:00 condor_schedd
4 S 407 2795 2742 0 75 0 - 6671 - ? 00:00:05 condor_startd
4 S 0 2799 2792 0 78 0 - 4914 - ? 00:00:00 condor_procd
I looked at the CollectorLog on the frontend and found the following entries. The IPs are those of the compute nodes:
2/3 17:28:27 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.251:59011> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:28:31 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.254:52303> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:28:32 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.254:35362> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:28:33 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.254:51246> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:28:34 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.254:40732> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:01 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:52190> for command 2 (UPDATE_MASTER_AD), access level ADVERTISE_MASTER
2/3 17:29:06 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:50986> for command 1 (UPDATE_SCHEDD_AD), access level ADVERTISE_SCHEDD
2/3 17:29:07 NegotiatorAd : Inserting ** "< comet.cs.bgsu.edu >"
2/3 17:29:11 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:42950> for command 1 (UPDATE_SCHEDD_AD), access level ADVERTISE_SCHEDD
2/3 17:29:14 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:56511> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:14 Got QUERY_STARTD_ADS
2/3 17:29:14 (Sending 0 ads in response to query)
2/3 17:29:15 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:53686> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:16 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:59716> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:17 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:47375> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:19 Got QUERY_STARTD_ADS
2/3 17:29:19 (Sending 0 ads in response to query)
2/3 17:29:34 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.252:42804> for command 2 (UPDATE_MASTER_AD), access level ADVERTISE_MASTER
2/3 17:29:39 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.252:39568> for command 1 (UPDATE_SCHEDD_AD), access level ADVERTISE_SCHEDD
2/3 17:29:47 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.252:44629> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:48 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.252:57034> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:49 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.252:36419> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:29:50 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.252:50992> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:30:01 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.251:40005> for command 2 (UPDATE_MASTER_AD), access level ADVERTISE_MASTER
2/3 17:30:07 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.251:51484> for command 1 (UPDATE_SCHEDD_AD), access level ADVERTISE_SCHEDD
2/3 17:30:25 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.250:38544> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:30:26 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.250:41501> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:30:27 (Sending 13 ads in response to query)
2/3 17:30:27 Got QUERY_STARTD_PVT_ADS
2/3 17:30:27 (Sending 8 ads in response to query)
2/3 17:30:27 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.250:60420> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
2/3 17:30:28 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.250:54849> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD
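For anyone who wants to tally these denials rather than read them line by line, a minimal Python sketch along these lines could group them by host and access level. The regular expression is inferred only from the log lines shown above, and the function name summarize_denials and the sample list are illustrative, not part of Condor.

```python
import re
from collections import Counter

# Pattern inferred from the CollectorLog lines quoted above (an assumption,
# not an official description of the log format).
DENIED = re.compile(
    r"PERMISSION DENIED to .*? from host <(?P<ip>[\d.]+):\d+> "
    r"for command \d+ \((?P<cmd>\w+)\), access level (?P<level>\w+)"
)

def summarize_denials(lines):
    """Count denied updates per (host IP, access level)."""
    counts = Counter()
    for line in lines:
        match = DENIED.search(line)
        if match:
            counts[(match.group("ip"), match.group("level"))] += 1
    return counts

# A few lines copied from the log excerpt above.
sample = [
    "2/3 17:29:01 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:52190> for command 2 (UPDATE_MASTER_AD), access level ADVERTISE_MASTER",
    "2/3 17:29:06 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:50986> for command 1 (UPDATE_SCHEDD_AD), access level ADVERTISE_SCHEDD",
    "2/3 17:29:14 DaemonCore: PERMISSION DENIED to unknown user from host <10.1.255.253:56511> for command 0 (UPDATE_STARTD_AD), access level ADVERTISE_STARTD",
    "2/3 17:29:14 Got QUERY_STARTD_ADS",
]

for (ip, level), n in sorted(summarize_denials(sample).items()):
    print(f"{ip}  {level}  {n}")
```

The same function could be fed the lines of the real CollectorLog file; its location depends on the local Condor configuration.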
This should be a pretty easy fix for the experts, but I have been banging my head against it all day without a clue.
:-(
Samir