
Re: [Condor-users] BOINC running, all machine Owner (fwd)





Hi

I don't know whether it helps, but I added the D_ALL option to the STARTER_DEBUG variable.

Here is the result (very long).
I've deleted everything about timerB:


3/31 14:44:51 (fd:10) (pid:2853) Done setting resource limits
3/31 14:44:51 (fd:10) (pid:2853) Starter running a local job with no shadow
3/31 14:44:51 (fd:10) (pid:2853) Getting job ClassAd from config file with keyword: "boinc"
3/31 14:44:51 (fd:10) (pid:2853) "boinc_proc" not found in config file
3/31 14:44:51 (fd:10) (pid:2853) *** Job ClassAd ***
MyType = ""
TargetType = ""
JobUniverse = 5
Cmd = "/usr/local/BOINC/boinc"
Iwd = "/usr/local/BOINC"
Owner = "boinc"
Out = "/home/prof/boinc/boinc.out.liszt"
Err = "/home/prof/boinc/boinc.err.liszt"
3/31 14:44:51 (fd:10) (pid:2853) --- End of ClassAd ---
3/31 14:44:51 (fd:10) (pid:2853) Job's cluster ID not specified in ClassAd or on command-line, using '1'
3/31 14:44:51 (fd:10) (pid:2853) Job's proc ID not specified in ClassAd or on command-line, using '0'
3/31 14:44:51 (fd:10) (pid:2853) Initialized user_priv as "boinc"
3/31 14:44:51 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_USER at starter.C:403
3/31 14:44:51 (fd:10) (pid:2853) Done moving to directory "/home/prof/condor/hosts/liszt/execute/dir_2853"
3/31 14:44:51 (fd:10) (pid:2853) PRIV_USER --> PRIV_CONDOR at starter.C:506
3/31 14:44:51 (fd:10) (pid:2853) No StarterUserLog found in job ClassAd
3/31 14:44:51 (fd:10) (pid:2853) Starter will not write a local UserLog
.
3/31 14:44:51 (fd:10) (pid:2853) leaving DaemonCore NewTimer, id=6
3/31 14:44:51 (fd:10) (pid:2853) Job 1.0 set to execute immediately
3/31 14:44:51 (fd:10) (pid:2853) In DaemonCore Timeout()
.
3/31 14:44:51 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:44:51 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:44:51 (fd:10) (pid:2853) leaving DaemonCore NewTimer, id=2
3/31 14:44:51 (fd:10) (pid:2853) DaemonCore: Calling handler for Timer 3 (self_monitor)
3/31 14:44:51 (fd:10) (pid:2853) Getting monitoring info for pid 2853
3/31 14:44:51 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at procapi.C:718
3/31 14:44:51 (fd:10) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at procapi.C:831
3/31 14:44:51 (fd:10) (pid:2853) ProcAPI: new boottime = 1143725873; old_boottime = 0; /proc/stat boottime = 1143725873; /proc/uptime boottime = 4294884079
3/31 14:44:51 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at procapi.C:849
3/31 14:44:51 (fd:10) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at procapi.C:952
3/31 14:44:51 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:44:51 (fd:10) (pid:2853) leaving DaemonCore NewTimer, id=3
3/31 14:44:51 (fd:10) (pid:2853) DaemonCore Timeout() Complete, returning 0
3/31 14:44:51 (fd:10) (pid:2853) In DaemonCore Timeout()
.
3/31 14:44:51 (fd:10) (pid:2853) Starting a VANILLA universe job with ID: 1.0
3/31 14:44:51 (fd:10) (pid:2853) In OsProc::OsProc()
3/31 14:44:51 (fd:10) (pid:2853) Main job KillSignal: 15 (SIGTERM)
3/31 14:44:51 (fd:10) (pid:2853) Main job RmKillSignal: 15 (SIGTERM)
3/31 14:44:51 (fd:10) (pid:2853) Main job HoldKillSignal: 15 (SIGTERM)
3/31 14:44:51 (fd:10) (pid:2853) in VanillaProc::StartJob()
3/31 14:44:51 (fd:10) (pid:2853) in OsProc::StartJob()
3/31 14:44:51 (fd:10) (pid:2853) IWD: /usr/local/BOINC
3/31 14:44:51 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_USER at os_proc.C:214
3/31 14:44:51 (fd:11) (pid:2853) Input file: /dev/null
3/31 14:44:51 (fd:12) (pid:2853) Output file: /home/prof/boinc/boinc.out.liszt
3/31 14:44:51 (fd:13) (pid:2853) Error file: /home/prof/boinc/boinc.err.liszt
3/31 14:44:51 (fd:13) (pid:2853) About to exec /usr/local/BOINC/boinc condor_exec.exe
3/31 14:44:51 (fd:13) (pid:2853) Env = _CONDOR_SCRATCH_DIR=/home/prof/condor/hosts/liszt/execute/dir_2853
3/31 14:44:51 (fd:13) (pid:2853) PRIV_USER --> PRIV_CONDOR at os_proc.C:327
3/31 14:44:51 (fd:13) (pid:2853) In DaemonCore::Create_Process(/usr/local/BOINC/boinc,...)
3/31 14:44:51 (fd:13) (pid:2853) PRIV_CONDOR --> PRIV_USER at daemon_core.C:5350
3/31 14:44:51 (fd:13) (pid:2853) PRIV_USER --> PRIV_CONDOR at daemon_core.C:5385
3/31 14:44:51 (fd:13) (pid:2854) Create_Process: Arg: condor_exec.exe
3/31 14:44:51 (fd:13) (pid:2854) Re-mapping std(in|out|err) in child.
3/31 14:44:51 (fd:10) (pid:2854) calling nice(10)
3/31 14:44:51 (fd:10) (pid:2854) Printing fds to inherit:
3/31 14:44:51 (fd:10) (pid:2854) About to exec "/usr/local/BOINC/boinc"
3/31 14:44:51 (fd:13) (pid:2853) Child Process: pid 2854 at
3/31 14:44:51 (fd:10) (pid:2853) Create_Process succeeded, pid=2854
3/31 14:44:51 (fd:10) (pid:2853) Created new ProcFamily w/ pid 2854 as parent
3/31 14:44:51 (fd:10) (pid:2853) EXECUTE_LOGIN_IS_DEDICATED is undefined, using default value of False
3/31 14:44:51 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:44:52 (fd:10) (pid:2853) In DaemonCore Timeout()
.
3/31 14:44:52 (fd:10) (pid:2853) DaemonCore: attempting to connect to '<192.168.45.120:33091>'
3/31 14:44:52 (fd:11) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at sock.C:506
3/31 14:44:52 (fd:11) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at sock.C:512
3/31 14:44:52 (fd:11) (pid:2853) STARTER_TIMEOUT_MULTIPLIER is undefined, using default value of 0
3/31 14:44:52 (fd:11) (pid:2853) New Daemon obj (any) name: "NULL", pool: "NULL", addr: "<192.168.45.120:33091>"
3/31 14:44:52 (fd:11) (pid:2853) STARTCOMMAND: starting 60008 to <192.168.45.120:33091> on UDP port 32806.
3/31 14:44:52 (fd:11) (pid:2853) SECMAN: command 60008 to <192.168.45.120:33091> on UDP port 32806.
3/31 14:44:52 (fd:11) (pid:2853) SECMAN: no cached key for {<192.168.45.120:33091>,<60008>}.
3/31 14:44:52 (fd:11) (pid:2853) SECMAN: Security Policy:
MyType = ""
TargetType = ""
AuthMethods = "FS,KERBEROS,GSI"
CryptoMethods = "3DES,BLOWFISH"
OutgoingNegotiation = "PREFERRED"
Authentication = "OPTIONAL"
Encryption = "OPTIONAL"
Integrity = "OPTIONAL"
Enact = "NO"
Subsystem = "STARTER"
ParentUniqueID = "liszt:2660:1143798715"
ServerPid = 2853
SessionDuration = "8640000"
3/31 14:44:52 (fd:11) (pid:2853) SECMAN: negotiating security for command 60008.
3/31 14:44:52 (fd:11) (pid:2853) SECMAN: need to start a session via TCP
3/31 14:44:52 (fd:11) (pid:2853) SEC_TCP_SESSION_TIMEOUT is undefined, using default value of 20
3/31 14:44:52 (fd:11) (pid:2853) SECMAN: setting timeout to 20 seconds.
3/31 14:44:52 (fd:12) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at sock.C:506
3/31 14:44:52 (fd:12) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at sock.C:512
3/31 14:44:52 (fd:12) (pid:2853) CONNECT src=<192.168.45.120:33147> fd=11 dst=<192.168.45.120:33091>
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: command 60010 to <192.168.45.120:33091> on TCP port 33147.
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: no cached key for {<192.168.45.120:33091>,<60010>}.
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: Security Policy:
MyType = ""
TargetType = ""
AuthMethods = "FS,KERBEROS,GSI"
CryptoMethods = "3DES,BLOWFISH"
OutgoingNegotiation = "PREFERRED"
Authentication = "OPTIONAL"
Encryption = "OPTIONAL"
Integrity = "OPTIONAL"
Enact = "NO"
Subsystem = "STARTER"
ParentUniqueID = "liszt:2660:1143798715"
ServerPid = 2853
SessionDuration = "8640000"
NewSession = "YES"
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: negotiating security for command 60010.
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: sending DC_AUTHENTICATE command
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: sending following classad:
MyType = ""
TargetType = ""
AuthMethods = "FS,KERBEROS,GSI"
CryptoMethods = "3DES,BLOWFISH"
OutgoingNegotiation = "PREFERRED"
Authentication = "OPTIONAL"
Encryption = "OPTIONAL"
Integrity = "OPTIONAL"
Enact = "NO"
Subsystem = "STARTER"
ParentUniqueID = "liszt:2660:1143798715"
ServerPid = 2853
SessionDuration = "8640000"
NewSession = "YES"
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
ServerCommandSock = "<192.168.45.120:33145>"
Command = 60010
AuthCommand = 60008
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfds=12
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfound=1
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfds=12
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfound=1
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: server responded with:
MyType = "(unknown type)"
TargetType = "(unknown type)"
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
SessionDuration = "8640000"
Enact = "YES"
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfds=12
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfound=1
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfds=12
3/31 14:44:52 (fd:12) (pid:2853) condor_read(): nfound=1
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: received post-auth classad:
MyType = "(unknown type)"
TargetType = "(unknown type)"
Sid = "liszt:2660:1143809092:12"
ValidCommands = "60000,60001,60008"
3/31 14:44:52 (fd:12) (pid:2853) SECMAN: policy to be cached:
MyType = ""
TargetType = ""
OutgoingNegotiation = "PREFERRED"
Subsystem = "STARTER"
ParentUniqueID = "liszt:2660:1143798715"
ServerPid = 2853
SessionDuration = "8640000"
ServerCommandSock = "<192.168.45.120:33145>"
Command = 60010
AuthCommand = 60008
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
Enact = "YES"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
UseSession = "YES"
Sid = "liszt:2660:1143809092:12"
ValidCommands = "60000,60001,60008"
3/31 14:44:53 (fd:12) (pid:2853) SECMAN: added session liszt:2660:1143809092:12 to cache for 8640000 seconds.
3/31 14:44:53 (fd:12) (pid:2853) SECMAN: command {<192.168.45.120:33091>,<60000>} mapped to session liszt:2660:1143809092:12.
3/31 14:44:53 (fd:12) (pid:2853) SECMAN: command {<192.168.45.120:33091>,<60001>} mapped to session liszt:2660:1143809092:12.
3/31 14:44:53 (fd:12) (pid:2853) SECMAN: command {<192.168.45.120:33091>,<60008>} mapped to session liszt:2660:1143809092:12.
3/31 14:44:53 (fd:12) (pid:2853) SECMAN: startCommand succeeded.
3/31 14:44:53 (fd:12) (pid:2853) SECMAN: sending eom() and closing TCP sock.
3/31 14:44:53 (fd:12) (pid:2853) CLOSE <192.168.45.120:33147> fd=11
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: succesfully sent NOP via TCP!
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: using session liszt:2660:1143809092:12 for {<192.168.45.120:33091>,<60008>}.
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: SEC_UDP obtained key id liszt:2660:1143809092:12!
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: UDP, have_session == 1, can_neg == 1
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: UDP has session liszt:2660:1143809092:12.
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: sending DC_AUTHENTICATE command
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: sending following classad:
MyType = ""
TargetType = ""
OutgoingNegotiation = "PREFERRED"
Subsystem = "STARTER"
ParentUniqueID = "liszt:2660:1143798715"
ServerPid = 2853
SessionDuration = "8640000"
AuthCommand = 60008
Enact = "YES"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
UseSession = "YES"
Sid = "liszt:2660:1143809092:12"
ValidCommands = "60000,60001,60008"
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
ServerCommandSock = "<192.168.45.120:33145>"
Command = 60008
3/31 14:44:53 (fd:11) (pid:2853) SECMAN: startCommand succeeded.
3/31 14:44:53 (fd:11) (pid:2853) DaemonCore: Sending alive to <192.168.45.120:33091>
3/31 14:44:53 (fd:11) (pid:2853) SEND [586] <192.168.45.120:32806> <192.168.45.120:33091>
3/31 14:44:53 (fd:11) (pid:2853) Destroying Daemon object:
3/31 14:44:53 (fd:11) (pid:2853) Type: 1 (any), Name: (null), Addr: <192.168.45.120:33091>
3/31 14:44:53 (fd:11) (pid:2853) FullHost: (null), Host: (null), Pool: (null), Port: -1
3/31 14:44:53 (fd:11) (pid:2853) IsLocal: N, IdStr: (null), Error: (null)
3/31 14:44:53 (fd:11) (pid:2853)  --- End of Daemon object info ---
3/31 14:44:53 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:44:53 (fd:10) (pid:2853) In DaemonCore Timeout()
.
.
3/31 14:44:53 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at killfamily.C:274
3/31 14:44:53 (fd:10) (pid:2853) ProcAPI::buildFamily() called w/ parent: 2854
3/31 14:44:53 (fd:10) (pid:2853) ProcAPI::buildFamily() Found daddypid on the system: 2854
3/31 14:44:53 (fd:10) (pid:2853) Pid 2855 is in family of 2854
3/31 14:44:53 (fd:10) (pid:2853) Pid 2856 is predicted to be in family of 2854
3/31 14:44:53 (fd:10) (pid:2853) Pid 2857 is predicted to be in family of 2854
3/31 14:44:53 (fd:10) (pid:2853) Pid 2858 is predicted to be in family of 2854
3/31 14:44:53 (fd:10) (pid:2853) ProcFamily: parent: 2854 family: 2854 2855 2856 2857 2858
3/31 14:44:53 (fd:10) (pid:2853) ProcFamily: alive_cpu_user = 1, exited_cpu = 0, max_image = 16528k
3/31 14:44:53 (fd:10) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at killfamily.C:460
3/31 14:44:53 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:45:06 (fd:10) (pid:2853) In DaemonCore Timeout()
.
.
3/31 14:45:06 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at daemon_core.C:7446
3/31 14:45:06 (fd:10) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at daemon_core.C:7467
3/31 14:45:06 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:45:08 (fd:10) (pid:2853) In DaemonCore Timeout()
.
.
3/31 14:45:08 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at killfamily.C:274
3/31 14:45:08 (fd:10) (pid:2853) ProcAPI::buildFamily() called w/ parent: 2854
3/31 14:45:08 (fd:10) (pid:2853) ProcAPI::buildFamily() Found daddypid on the system: 2854
3/31 14:45:08 (fd:10) (pid:2853) Pid 2855 is in family of 2854
3/31 14:45:08 (fd:10) (pid:2853) Pid 2856 is predicted to be in family of 2854
3/31 14:45:08 (fd:10) (pid:2853) Pid 2857 is predicted to be in family of 2854
3/31 14:45:08 (fd:10) (pid:2853) Pid 2858 is predicted to be in family of 2854
3/31 14:45:08 (fd:10) (pid:2853) ProcFamily: parent: 2854 family: 2854 2855 2856 2857 2858
3/31 14:45:08 (fd:10) (pid:2853) ProcFamily: alive_cpu_user = 16, exited_cpu = 0, max_image = 17328k
3/31 14:45:08 (fd:10) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at killfamily.C:460
3/31 14:45:08 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:45:23 (fd:10) (pid:2853) In DaemonCore Timeout()
.
3/31 14:45:23 (fd:10) (pid:2853) PRIV_CONDOR --> PRIV_ROOT at killfamily.C:274
3/31 14:45:23 (fd:10) (pid:2853) ProcAPI::buildFamily() called w/ parent: 2854
3/31 14:45:23 (fd:10) (pid:2853) ProcAPI::buildFamily() Found daddypid on the system: 2854
3/31 14:45:23 (fd:10) (pid:2853) Pid 2855 is in family of 2854
3/31 14:45:23 (fd:10) (pid:2853) Pid 2856 is predicted to be in family of 2854
3/31 14:45:23 (fd:10) (pid:2853) Pid 2857 is predicted to be in family of 2854
3/31 14:45:23 (fd:10) (pid:2853) Pid 2858 is predicted to be in family of 2854
3/31 14:45:23 (fd:10) (pid:2853) ProcFamily: parent: 2854 family: 2854 2855 2856 2857 2858
3/31 14:45:23 (fd:10) (pid:2853) ProcFamily: alive_cpu_user = 31, exited_cpu = 0, max_image = 19380k
3/31 14:45:23 (fd:10) (pid:2853) PRIV_ROOT --> PRIV_CONDOR at killfamily.C:460
3/31 14:45:23 (fd:10) (pid:2853) in DaemonCore NewTimer()
.
3/31 14:45:23 (fd:10) (pid:2853) DaemonCore Timeout() Complete, returning 15
3/31 14:45:35 (fd:11) (pid:2853) ACCEPT from=<192.168.45.120:33148> newfd=10 to=<192.168.45.120:33145>
3/31 14:45:35 (fd:11) (pid:2853) condor_read(): nfds=11
3/31 14:45:35 (fd:11) (pid:2853) condor_read(): nfound=1
3/31 14:45:35 (fd:11) (pid:2853) condor_read(): nfds=11
3/31 14:45:35 (fd:11) (pid:2853) condor_read(): nfound=1
3/31 14:45:35 (fd:11) (pid:2853) condor_read(): nfds=11
3/31 14:45:35 (fd:11) (pid:2853) condor_read(): nfound=1
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: received DC_AUTHENTICATE from <192.168.45.120:33148>
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: received following ClassAd:
MyType = "(unknown type)"
TargetType = "(unknown type)"
AuthMethods = "FS,KERBEROS,GSI"
CryptoMethods = "3DES,BLOWFISH"
OutgoingNegotiation = "PREFERRED"
Authentication = "OPTIONAL"
Encryption = "OPTIONAL"
Integrity = "OPTIONAL"
Enact = "NO"
Subsystem = "STARTD"
ParentUniqueID = "liszt:2659:1143798714"
ServerPid = 2660
SessionDuration = "8640000"
NewSession = "YES"
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
ServerCommandSock = "<192.168.45.120:33091>"
Command = 60010
AuthCommand = 60000
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: our_policy:
MyType = ""
TargetType = ""
AuthMethods = "FS,KERBEROS,GSI"
CryptoMethods = "3DES,BLOWFISH"
OutgoingNegotiation = "PREFERRED"
Authentication = "OPTIONAL"
Encryption = "OPTIONAL"
Integrity = "OPTIONAL"
Enact = "NO"
Subsystem = "STARTER"
ParentUniqueID = "liszt:2660:1143798715"
ServerPid = 2853
SessionDuration = "8640000"
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: the_policy:
MyType = ""
TargetType = ""
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
SessionDuration = "8640000"
Enact = "YES"
3/31 14:45:35 (fd:11) (pid:2853) SECMAN: Sending following response ClassAd:
MyType = ""
TargetType = ""
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
SessionDuration = "8640000"
Enact = "YES"
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: not authenticating.
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: sending session ad:
MyType = ""
TargetType = ""
Sid = "liszt:2853:1143809135:0"
ValidCommands = "60000,60001,60008"
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: sent session liszt:2853:1143809135:0 info!
3/31 14:45:35 (fd:11) (pid:2853) DC_AUTHENTICATE: added session id liszt:2853:1143809135:0 to cache for 8640000 seconds!
MyType = ""
TargetType = ""
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
SessionDuration = "8640000"
Enact = "YES"
Subsystem = "STARTD"
ServerCommandSock = "<192.168.45.120:33091>"
ParentUniqueID = "liszt:2659:1143798714"
ServerPid = 2660
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
Sid = "liszt:2853:1143809135:0"
ValidCommands = "60000,60001,60008"
3/31 14:45:35 (fd:11) (pid:2853) CLOSE <192.168.45.120:33145> fd=10
3/31 14:45:35 (fd:10) (pid:2853) In DaemonCore Timeout()
.
3/31 14:45:35 (fd:10) (pid:2853) RECV 576 bytes at <192.168.45.120:33145> from <192.168.45.120:32806>
3/31 14:45:35 (fd:10) (pid:2853)        Full msg [576 bytes]
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: received UDP packet from <192.168.45.120:32806>.
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: received DC_AUTHENTICATE from <192.168.45.120:32806>
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: received following ClassAd:
MyType = "(unknown type)"
TargetType = "(unknown type)"
OutgoingNegotiation = "PREFERRED"
Subsystem = "STARTD"
ParentUniqueID = "liszt:2659:1143798714"
ServerPid = 2660
SessionDuration = "8640000"
AuthCommand = 60000
Enact = "YES"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
UseSession = "YES"
Sid = "liszt:2853:1143809135:0"
ValidCommands = "60000,60001,60008"
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
ServerCommandSock = "<192.168.45.120:33091>"
Command = 60000
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: resuming session id liszt:2853:1143809135:0 given to <192.168.45.120:33148>:
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: Cached Session:
MyType = ""
TargetType = ""
Authentication = "NO"
Encryption = "NO"
Integrity = "NO"
AuthMethodsList = "FS,KERBEROS,GSI"
AuthMethods = "FS"
CryptoMethods = "3DES,BLOWFISH"
SessionDuration = "8640000"
Enact = "YES"
Subsystem = "STARTD"
ServerCommandSock = "<192.168.45.120:33091>"
ParentUniqueID = "liszt:2659:1143798714"
ServerPid = 2660
RemoteVersion = "$CondorVersion: 6.7.17 Feb 18 2006 $"
Sid = "liszt:2853:1143809135:0"
ValidCommands = "60000,60001,60008"
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: setting sock->decode()
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: allowing an empty message for sock.
3/31 14:45:35 (fd:10) (pid:2853) DC_AUTHENTICATE: Success.
3/31 14:45:35 (fd:10) (pid:2853) DaemonCore: Command received via UDP from host <192.168.45.120:32806>
3/31 14:45:35 (fd:10) (pid:2853) DaemonCore: received command 60000 (DC_RAISESIGNAL), calling handler (HandleSigCommand())
3/31 14:45:35 (fd:10) (pid:2853) DaemonCore: received Signal 3 (SIGQUIT), raising event handle_dc_sigquit()
3/31 14:45:35 (fd:10) (pid:2853) Calling Handler <handle_dc_sigquit()> for Signal 3 <SIGQUIT>
3/31 14:45:35 (fd:10) (pid:2853) Got SIGQUIT.  Performing fast shutdown.
3/31 14:45:35 (fd:10) (pid:2853) ShutdownFast all jobs.
3/31 14:45:35 (fd:10) (pid:2853) in VanillaProc::ShutdownFast()
3/31 14:45:35 (fd:10) (pid:2853) Entering ProcFamily::hardkill
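
To see whether BOINC was actually doing work before the SIGQUIT arrived, the three ProcFamily samples in the log above can be pulled out and compared. The following is only a minimal sketch: the function name and regular expression are my own, the log excerpt is copied from above with wrapped lines rejoined, and I am assuming that alive_cpu_user is a count of CPU seconds (the log does not state the unit).

```python
import re

# ProcFamily lines copied from the starter log above (wrapped lines rejoined).
LOG = """\
3/31 14:44:53 (fd:10) (pid:2853) ProcFamily: alive_cpu_user = 1, exited_cpu = 0, max_image = 16528k
3/31 14:45:08 (fd:10) (pid:2853) ProcFamily: alive_cpu_user = 16, exited_cpu = 0, max_image = 17328k
3/31 14:45:23 (fd:10) (pid:2853) ProcFamily: alive_cpu_user = 31, exited_cpu = 0, max_image = 19380k
"""

# Hypothetical pattern written for this excerpt only, not taken from Condor.
PATTERN = re.compile(
    r"^\S+ (\d\d:\d\d:\d\d) .*ProcFamily: alive_cpu_user = (\d+), "
    r"exited_cpu = \d+\s*, max_image = (\d+)k"
)


def procfamily_samples(log):
    """Return (clock, alive_cpu_user, max_image_kb) for each ProcFamily line."""
    samples = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match:
            clock, cpu, image = match.groups()
            samples.append((clock, int(cpu), int(image)))
    return samples


samples = procfamily_samples(LOG)
# Growth of alive_cpu_user between consecutive samples.
cpu_deltas = [later[1] - earlier[1] for earlier, later in zip(samples, samples[1:])]
print(samples)
print(cpu_deltas)
```

If the unit assumption holds, the counter grows by 15 between samples taken 15 seconds apart, which would suggest that the BOINC process family was computing normally until the starter was told to shut down.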





On Thu, 30 Mar 2006, Emmanuel Le Guirriec wrote:



Hi

I now use the binary file, but that doesn't work either.
I get a "Got SIGQUIT" about 45 seconds after boinc has started (10:18:26 to 10:19:11 in the starter log below).

The boinc.out looks right (compared with the one I get when boinc is
started by hand):
2006-03-30 10:18:26 [---] Starting BOINC client version 5.2.13 for i686-pc-linux-gnu
2006-03-30 10:18:26 [---] libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
2006-03-30 10:18:26 [---] Data directory: /usr/local/BOINC
2006-03-30 10:18:26 [---] Processor: 1 AuthenticAMD AMD Athlon(tm) Processor
2006-03-30 10:18:26 [---] Memory: 250.39 MB physical, 619.66 MB virtual
2006-03-30 10:18:26 [---] Disk: 5.68 GB total, 1.13 GB free
2006-03-30 10:18:26 [Einstein@Home] Computer ID: 490504; location: ; project prefs: default
2006-03-30 10:18:26 [---] No general preferences found - using BOINC defaults
2006-03-30 10:18:26 [---] Remote control not allowed; using loopback address
2006-03-30 10:18:26 [Einstein@Home] Resuming computation for result r1_1436.0__1970_S4R2a_2 using albert version 440


I've set BOINC_Arguments to -dir $(HOME_BOINC), but the result is the same as without it.
The exact log for the boinc starter is:

3/30 10:18:26 Using config file: /home/prof/condor/condor_config
3/30 10:18:26 Using local config files: /home/prof/condor/hosts/strauss/condor_config.local
3/30 10:18:26 DaemonCore: Command Socket at <192.168.45.110:39205>
3/30 10:18:26 Done setting resource limits
3/30 10:18:26 Starter running a local job with no shadow
3/30 10:18:26 Getting job ClassAd from config file with keyword: "boinc"
3/30 10:18:26 "boinc_proc" not found in config file
3/30 10:18:26 Starting a VANILLA universe job with ID: 1.0
3/30 10:18:26 IWD: /usr/local/BOINC/
3/30 10:18:26 Output file: /usr/local/BOINC//boinc.out
3/30 10:18:26 Error file: /usr/local/BOINC//boinc.err
3/30 10:18:26 About to exec /usr/local/BOINC//boinc condor_exec.exe -dir /usr/local/BOINC/
3/30 10:18:26 Create_Process succeeded, pid=12997
3/30 10:19:11 Got SIGQUIT.  Performing fast shutdown.
3/30 10:19:11 ShutdownFast all jobs.
3/30 10:19:11 Process exited, pid=12997, signal=9
3/30 10:19:11 All jobs have exited... starter exiting
3/30 10:19:11 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
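
The delay between the job starting and the SIGQUIT can be read straight off the timestamps in this log. Here is a small sketch of that calculation; the helper names are hypothetical, the two log lines are copied from above, and the year 2006 is supplied only because the log lines themselves carry no year.

```python
from datetime import datetime

# Two lines copied from the StarterLog above.
LOG = """\
3/30 10:18:26 Create_Process succeeded, pid=12997
3/30 10:19:11 Got SIGQUIT.  Performing fast shutdown.
"""

ASSUMED_YEAR = 2006  # the log has no year; taken from the date of this thread


def timestamp(line):
    """Parse the leading 'M/D HH:MM:SS' stamp of a starter log line."""
    date, clock = line.split()[:2]
    return datetime.strptime(f"{ASSUMED_YEAR} {date} {clock}", "%Y %m/%d %H:%M:%S")


def seconds_between(log, start_marker, end_marker):
    """Seconds from the first start_marker line to the next end_marker line."""
    start = end = None
    for line in log.splitlines():
        if start is None and start_marker in line:
            start = timestamp(line)
        elif start is not None and end_marker in line:
            end = timestamp(line)
            break
    if start is None or end is None:
        raise ValueError("marker not found in log")
    return int((end - start).total_seconds())


elapsed = seconds_between(LOG, "Create_Process succeeded", "Got SIGQUIT")
print(elapsed)  # 45
```

The same helper applied to the D_ALL log from 3/31 (14:44:51 to 14:45:35) would give 44 seconds, so the shutdown arrives after a very similar delay on both machines.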


Manu


On Tue, 28 Mar 2006, Derek Wright wrote:


On Tue, 28 Mar 2006 21:22:28 +0200 (CEST)  Emmanuel Le Guirriec wrote:

StarterLog.boinc
...
3/28 18:02:56 Create_Process: child failed with errno 8 (Exec format error) before exec()

there's your problem.  /usr/local/BOINC/run_client isn't the right
kind of binary for this machine, or doesn't exist, or something.  you
need to install a working copy of the "boinc_client" program in that
directory (if it doesn't already exist), and set BOINC_Executable to
point to that.
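
One quick way to check what kind of file a path points to is to look at its first bytes. The sketch below is an illustration only: classify_executable is a made-up helper, and the sample file contents are invented for the demo rather than taken from any real run_client script. On Linux, errno 8 is ENOEXEC, which the kernel returns when it does not recognise the file format, for example a script without a "#!" line.

```python
import errno
import os
import tempfile

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary


def classify_executable(path):
    """Return 'elf', 'script' (has a #! line) or 'unknown' from the first bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head == ELF_MAGIC:
        return "elf"
    if head.startswith(b"#!"):
        return "script"
    return "unknown"


# Invented sample files, used only to exercise the helper.
SAMPLES = {
    "binary": ELF_MAGIC + b"\x01\x01\x01",
    "with_shebang": b"#!/bin/sh\nexec ./boinc_client\n",
    "no_shebang": b"cd /usr/local/BOINC\n./boinc_client\n",
}

results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, content in SAMPLES.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(content)
        results[name] = classify_executable(path)

print(errno.errorcode[errno.ENOEXEC])
print(results)
```

A result of 'unknown' is the case most likely to produce "Exec format error". An 'elf' result is necessary but not sufficient, since a binary built for a different architecture fails the same way.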

from the condor manual:
http://www.cs.wisc.edu/condor/manual/v6.7.18/3_13Setting_Up.html#SECTION004138500000000000000
----------
Required settings:

BOINC_Executable
   The full path to the boinc_client binary to use.
----------

notice, the docs say "... path to the boinc_client binary", not
"run_client" script. ;)


good luck,
-derek







--
Emmanuel Le Guirriec
Ingenieur de Recherche Calcul Scientifique CNRS
UMR6628-MAPMO
Federation Denis Poisson
Universite d'Orleans
BP 6759
45067 Orleans Cedex 2
tel	02.38.49.46.69 / 48.50