Re: [Condor-users] AMD Opteron Crashes



On Mon, Mar 07, 2005 at 11:21:15AM +0100, Steffen Prohaska wrote:
> Prashant,
> 
> On Mar 7, 2005, at 8:04 AM, Prashant Lal wrote:
> 
> > Yes, it is working fine on my cluster as well.
> >
> > May I know what error you are actually getting?
> 
> Using linux-glibc23 on x86_64, I don't get a real error message. SchedLog
> only reports that the Starter died with a segfault (signal 11).
> 
> 3/7 11:12:39 Starter pid 26885 died on signal 11 (signal 11)
> 
> The last lines in the StarterLog are
> 
> ...
> 3/7 11:12:39 (fd:11) Doing CONDOR_register_starter_info
> 3/7 11:12:39 (fd:11) ShouldTransferFiles is "NO", NOT transfering files
> 3/7 11:12:39 (fd:11) Submit UidDomain: "zib.de"
> 3/7 11:12:39 (fd:11)  Local UidDomain: "zib.de"
> 3/7 11:12:39 (fd:11) Initialized user_priv as "bzfproha"
> 
> No error is reported; the starter dies silently.

Have you tried increasing the debug verbosity (STARTER_DEBUG = D_ALL, or
even ALL_DEBUG = D_ALL)?
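
As a rough sketch, the settings could go into the Condor configuration
on the execute machine. Which file that is (global or local config) is
an assumption on my side and depends on your installation. The starter
is spawned per job, so the next job should pick up the change; running
condor_reconfig beforehand should not hurt.

```
# Maximum verbosity for the starter only
STARTER_DEBUG = D_ALL

# Alternatively, maximum verbosity for all daemons (very noisy)
# ALL_DEBUG = D_ALL
```

With this in place, the StarterLog should show more detail about the
last steps before the signal 11.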

Cheers,
 Steffen

-- 
Steffen Grunewald * * * Merlin cluster admin (http://pandora.aei.mpg.de)
Albert-Einstein-Institut (MPI Gravitationsphysik, http://www.aei.mpg.de)
       Science Park Golm, Am Mühlenberg 1, 14476 Potsdam, Germany
e-mail: steffen.grunewald(*)aei.mpg.de * +49-331-567-{fon:7233,fax:7298}