
Re: [Condor-users] Problem with HAD :Intercepting an unhandled exception.



Hi Tim,
  I got the following stack with Visual Studio.
What we are trying to achieve: we have a single submission point for the pool, and we want the scheduler to move to another machine if the primary scheduler goes down. I found this approach in the manual. If you know a better way, please let me know.

Thanks in Advance
//=====================================================
PID: 3564
Exception code: C0000005 ACCESS_VIOLATION
Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe

Registers:
EAX:0051C0F0
EBX:00000000
ECX:00000000
EDX:00001118
ESI:00E39F28
EDI:00000400
CS:EIP:0023:0043E361
SS:ESP:002B:0160F768  EBP:0160F798
DS:002B  ES:002B  FS:0053  GS:002B
Flags:00010206

Call stack:
Address   Frame
0043E361  0160F798  CondorLock::AcquireLock (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\condor_lock.cpp:74)
004041FB  0160F7AC  admin_command_handler (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\master.cpp:437)
004320B4  0160F7D0  DaemonCore::CallCommandHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3550)
00438C4A  0160F9C8  DaemonCore::HandleReq (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:4894)
00437057  0160F9DC  DaemonCore::HandleReqSocketHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3701)
0043900E  0160FA0C  DaemonCore::CallSocketHandler_worker (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3461)
00439364  0160FA2C  DaemonCore::CallSocketHandler_worker_demarshall (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3424)
00439639  0160FA54  DaemonCore::CallSocketHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3412)
0043B63A  0160FAF4  DaemonCore::Driver (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3325)
761114D1  0160FB08  HeapFree+14
0048EFC3  0160FB48  free (f:\dd\vctools\crt_bld\self_x86\crt\src\free.c:110)
00403A37  0160FB5C  MyString::`vector deleting destructor' (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\master.cpp:1527)
00465A94  0160FB64  SimpleList<MyString>::~SimpleList<MyString> (c:\condor\execute\dir_5284\userdir\src\condor_c++_util\simplelist.h:44)
00405E33  0160FB74  daemon::RealStart (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\masterdaemon.cpp:509)
00405E50  0160FBB0  daemon::RealStart (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\masterdaemon.cpp:779)
778A371E  0160FCC4  RtlImageNtHeader+73A
7789E20C  0160FD4C  RtlInitUnicodeString+164
7789E20C  0160FD50  RtlInitUnicodeString+164
7789DF72  0160FDC0  RtlAllocateHeap+AC
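
For reading a dump like the one above, a small script can pull out the frames that carry source locations. This is only a sketch in Python, assuming the "Address  Frame  Symbol (file:line)" layout shown in this dump; the name parse_frame and the two sample lines are illustrative and not part of Condor.

```python
import re

# Address, frame pointer, symbol, and an optional "(source file:line)" suffix.
FRAME_RE = re.compile(
    r"^([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s+(.+?)(?:\s+\((.+):(\d+)\))?$"
)


def parse_frame(line):
    """Split one call-stack line into a dict, or return None if it does not match."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    address, frame, symbol, source, lineno = match.groups()
    return {
        "address": address,
        "frame": frame,
        "symbol": symbol,
        "source": source,  # None when the frame has no source information
        "line": int(lineno) if lineno else None,
    }


# Two lines copied from the call stack above: one with source info, one without.
sample = [
    r"0043E361  0160F798  CondorLock::AcquireLock (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\condor_lock.cpp:74)",
    "761114D1  0160FB08  HeapFree+14",
]

for frame in map(parse_frame, sample):
    print(frame["symbol"], frame["line"])
```

Frames without a "(file:line)" suffix, such as the system DLL entries, come back with source and line set to None, so they are easy to filter out.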
Kuldeep Singh Meel
Sophomore
Department of Computer Science and Engineering
www.cse.iitb.ac.in/~kuldeepmeel
My Blog : http://sweetwithchilli.blogspot.com/


On Tue, Jul 20, 2010 at 1:25 AM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
If you have MS Visual Studio .NET, you will want to attach the
debugger; this will likely be the best way to track down the
error.

Just a question: usually, if you want to set up HA, the spool is in some
shared location, unless you intend to set it up locally for process
redundancy.  Either way, I'm curious what you are trying to accomplish
with this deployment configuration.
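
One quick way to rule out the shared-location question is to probe the spool directory from each machine that may run the schedd. Below is a minimal Python sketch using only the standard library; the function name spool_is_usable and the probe-file prefix are made up for illustration and are not part of Condor.

```python
import os
import tempfile


def spool_is_usable(spool_dir):
    """Return True if spool_dir exists and this process can create a file in it."""
    if not os.path.isdir(spool_dir):
        return False
    try:
        # Create and then remove a throwaway file to prove write access.
        fd, probe_path = tempfile.mkstemp(prefix="ha_probe_", dir=spool_dir)
    except OSError:
        return False
    os.close(fd)
    os.remove(probe_path)
    return True


if __name__ == "__main__":
    # /share/spool is the SPOOL value quoted later in this thread.
    print(spool_is_usable("/share/spool"))
```

Note that on Windows a path such as /share/spool is resolved against the current drive, so the result can differ depending on where the process is started. The check also runs as the invoking user, which may not be the account the master runs under.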

Cheers,
Tim

On Mon, 2010-07-19 at 17:38 +1200, kuldeep singh meel wrote:
> Hi Ben,
> The master log output is:
> 07/19 17:35:52 Using config source: C:\condor\condor_config
> 07/19 17:35:52 Using local config sources:
> 07/19 17:35:52    C:\condor/condor_config.local
> 07/19 17:35:52 DaemonCore: Command Socket at <10.1.1.108:50637>
> 07/19 17:35:52 Will use UDP to update collector <10.1.1.108:9618>
> 07/19 17:35:52 Will use UDP to update collector <10.1.1.104:9618>
> 07/19 17:35:52 Created HA lock for SCHEDD; URL=""> > poll=300s hold=3600s
> 07/19 17:35:52 Daemon NEGOTIATOR is controlled by HAD
> 07/19 17:35:52 Intercepting an unhandled exception.
> 07/19 17:35:52 Dropping a core file.
>
>
> and the core file says:
> PID: 2632
> Exception code: C0000005 ACCESS_VIOLATION
> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>
>
> Registers:
> EAX:0051C0F0
> EBX:00000000
> ECX:00000000
> EDX:00CBBCA8
> ESI:00E79F40
> EDI:00000400
> CS:EIP:0023:0043E361
> SS:ESP:002B:015DFE60  EBP:00000000
> DS:002B  ES:002B  FS:0053  GS:002B
> Flags:00010202
>
>
> Call stack:
> Address   Frame
>
>
>
>
> Kuldeep Singh Meel
> Sophomore
> Department of Computer Science and Engineering
> www.cse.iitb.ac.in/~kuldeepmeel
> My Blog : http://sweetwithchilli.blogspot.com/
>
>
> On Mon, Jul 19, 2010 at 5:29 PM, Burnett, Ben <ben.burnett@xxxxxxxx>
> wrote:
>         What do the master logs say when the debug level is turned
>         on full?  Also, check the core after you run it again; often
>         there is a stack trace included at the tail end of the file.
>
>         -B
>
>
>         On 2010-07-18, at 10:42 PM, kuldeep singh meel wrote:
>
>         > Hi everyone,
>         > We want to run SCHEDD with HAD, and we used the following
>         > configuration:
>         >
>         >
>         > MASTER_HA_LIST = SCHEDD
>         > SPOOL = /share/spool
>         > HA_LOCK_URL = file:/share/spool
>         > VALID_SPOOL_FILES = SCHEDD.lock
>         >
>         >
>         > But when we restart Condor on all machines, the master gives
>         > the following error:
>         >
>         > Intercepting an unhandled exception.
>         >
>         > with the following core file:
>         >
>         > //=====================================================
>         > PID: 3888
>         > Exception code: C0000005 ACCESS_VIOLATION
>         > Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>         >
>         > Registers:
>         > EAX:0051C0F0
>         > EBX:00000000
>         > ECX:00000000
>         > EDX:0109C0F0
>         > ESI:0109AE88
>         > EDI:00000400
>         > CS:EIP:0023:0043E361
>         > SS:ESP:002B:0157FE60  EBP:00000000
>         > DS:002B  ES:002B  FS:0053  GS:002B
>         > Flags:00010202
>         >
>         > Call stack:
>         > Address   Frame
>         >
>         > //=====================================================
>         >
>         >
>         > I am not able to figure anything out.
>         >
>         > Looking for help.
>         >
>         > Kuldeep Singh Meel
>         > Sophomore
>         > Department of Computer Science and Engineering
>         > www.cse.iitb.ac.in/~kuldeepmeel
>         > My Blog : http://sweetwithchilli.blogspot.com/
>
>
>         > _______________________________________________
>         > Condor-users mailing list
>         > To unsubscribe, send a message to
>         condor-users-request@xxxxxxxxxxx with a
>         > subject: Unsubscribe
>         > You can also unsubscribe by visiting
>         > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>         >
>         > The archives can be found at:
>         > https://lists.cs.wisc.edu/archive/condor-users/
>