
Re: [Condor-users] Problem with HAD :Intercepting an unhandled exception.



Oh, and the HA_LOCK_URL is not recognized/handled on Windows, since URL parsing is not supported.
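
As a rough illustration of why a file: URL needs extra handling before it can be used as a Windows path, here is a small Python sketch. It is not Condor code and makes no claim about what condor_master does internally; lock_dir_from_url is a hypothetical helper, and the two URLs are the ones that appear later in this thread.

```python
from urllib.parse import urlsplit


def lock_dir_from_url(url):
    """Return the path component of a file: URL (illustrative helper only)."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("only file: URLs are handled in this sketch")
    return parts.path


# Unix-style URL: the path component is directly usable as a directory.
print(lock_dir_from_url("file:/share/spool"))      # /share/spool

# Windows-style URL: the path keeps a leading slash before the drive letter,
# so it is not a usable Windows path without further translation.
print(lock_dir_from_url("file:/C:/condor/spool"))  # /C:/condor/spool
```

The second result shows the extra step a Windows port would need: stripping the leading slash and converting separators before the directory could be opened.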

-B

On 2010-07-19, at 4:38 PM, kuldeep singh meel wrote:

> Hi Tim,
>  I got the following stack with Visual Studio.
> What we are trying to achieve is a single pool submission point, and we
> want to transfer the scheduler if the primary scheduler goes down. I
> found this in the manual. If you know of a better approach, please let
> me know.
> 
> Thanks in Advance
> //=====================================================
> PID: 3564
> Exception code: C0000005 ACCESS_VIOLATION
> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
> 
> Registers:
> EAX:0051C0F0
> EBX:00000000
> ECX:00000000
> EDX:00001118
> ESI:00E39F28
> EDI:00000400
> CS:EIP:0023:0043E361
> SS:ESP:002B:0160F768  EBP:0160F798
> DS:002B  ES:002B  FS:0053  GS:002B
> Flags:00010206
> 
> Call stack:
> Address   Frame
> 0043E361  0160F798  CondorLock::AcquireLock (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\condor_lock.cpp:74)
> 004041FB  0160F7AC  admin_command_handler (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\master.cpp:437)
> 004320B4  0160F7D0  DaemonCore::CallCommandHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3550)
> 00438C4A  0160F9C8  DaemonCore::HandleReq (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:4894)
> 00437057  0160F9DC  DaemonCore::HandleReqSocketHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3701)
> 0043900E  0160FA0C  DaemonCore::CallSocketHandler_worker (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3461)
> 00439364  0160FA2C  DaemonCore::CallSocketHandler_worker_demarshall (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3424)
> 00439639  0160FA54  DaemonCore::CallSocketHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3412)
> 0043B63A  0160FAF4  DaemonCore::Driver (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3325)
> 761114D1  0160FB08  HeapFree+14
> 0048EFC3  0160FB48  free (f:\dd\vctools\crt_bld\self_x86\crt\src\free.c:110)
> 00403A37  0160FB5C  MyString::`vector deleting destructor' (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\master.cpp:1527)
> 00465A94  0160FB64  SimpleList<MyString>::~SimpleList<MyString> (c:\condor\execute\dir_5284\userdir\src\condor_c++_util\simplelist.h:44)
> 00405E33  0160FB74  daemon::RealStart (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\masterdaemon.cpp:509)
> 00405E50  0160FBB0  daemon::RealStart (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\masterdaemon.cpp:779)
> 778A371E  0160FCC4  RtlImageNtHeader+73A
> 7789E20C  0160FD4C  RtlInitUnicodeString+164
> 7789E20C  0160FD50  RtlInitUnicodeString+164
> 7789DF72  0160FDC0  RtlAllocateHeap+AC
> 
> Kuldeep Singh Meel
> Sophomore
> Department of Computer Science and Engineering
> www.cse.iitb.ac.in/~kuldeepmeel
> My Blog : http://sweetwithchilli.blogspot.com/
> 
> 
> On Tue, Jul 20, 2010 at 1:25 AM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
> 
>> If you have MS Visual Studio .NET, you will want to attach with the
>> debugger; this will likely be the best way to track down the
>> error.
>> 
>> Just as ?... usually if you want to set up HA, the spool is in some shared
>> location, unless you intend to set it up locally for process
>> redundancy.  Either way, I'm curious what you are trying to accomplish
>> with this deployment configuration?
>> 
>> Cheers,
>> Tim
>> 
>> On Mon, 2010-07-19 at 17:38 +1200, kuldeep singh meel wrote:
>>> hi Ben
>>> The master log output is :
>>> 07/19 17:35:52 Using config source: C:\condor\condor_config
>>> 07/19 17:35:52 Using local config sources:
>>> 07/19 17:35:52    C:\condor/condor_config.local
>>> 07/19 17:35:52 DaemonCore: Command Socket at <10.1.1.108:50637>
>>> 07/19 17:35:52 Will use UDP to update collector <10.1.1.108:9618>
>>> 07/19 17:35:52 Will use UDP to update collector <10.1.1.104:9618>
>>> 07/19 17:35:52 Created HA lock for SCHEDD; URL='file:/C:/condor/spool' poll=300s hold=3600s
>>> 07/19 17:35:52 Daemon NEGOTIATOR is controlled by HAD
>>> 07/19 17:35:52 Intercepting an unhandled exception.
>>> 07/19 17:35:52 Dropping a core file.
>>> 
>>> 
>>> and the core file says:
>>> PID: 2632
>>> Exception code: C0000005 ACCESS_VIOLATION
>>> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>>> 
>>> 
>>> Registers:
>>> EAX:0051C0F0
>>> EBX:00000000
>>> ECX:00000000
>>> EDX:00CBBCA8
>>> ESI:00E79F40
>>> EDI:00000400
>>> CS:EIP:0023:0043E361
>>> SS:ESP:002B:015DFE60  EBP:00000000
>>> DS:002B  ES:002B  FS:0053  GS:002B
>>> Flags:00010202
>>> 
>>> 
>>> Call stack:
>>> Address   Frame
>>> 
>>> 
>>> 
>>> 
>>> Kuldeep Singh Meel
>>> Sophomore
>>> Department of Computer Science and Engineering
>>> www.cse.iitb.ac.in/~kuldeepmeel
>>> My Blog : http://sweetwithchilli.blogspot.com/
>>> 
>>> 
>>> On Mon, Jul 19, 2010 at 5:29 PM, Burnett, Ben <ben.burnett@xxxxxxxx>
>>> wrote:
>>>        What do the master logs say when the debug level is turned
>>>        up to full?  Also, check the core file after you run it again;
>>>        often there is a stack trace included at the tail end of the file.
>>> 
>>>        -B
>>> 
>>> 
>>>        On 2010-07-18, at 10:42 PM, kuldeep singh meel wrote:
>>> 
>>>> Hi Everyone
>>>> We want to run the SCHEDD with HAD, and we used the following
>>>> configuration:
>>>> 
>>>> 
>>>> MASTER_HA_LIST = SCHEDD
>>>> SPOOL = /share/spool
>>>> HA_LOCK_URL = file:/share/spool
>>>> VALID_SPOOL_FILES = SCHEDD.lock
>>>> 
>>>> 
>>>> But when we restart Condor on all machines, the master gives
>>>> the following error:
>>>> 
>>>> Intercepting an unhandled exception.
>>>> 
>>>> with the following core file:
>>>> 
>>>> //=====================================================
>>>> PID: 3888
>>>> Exception code: C0000005 ACCESS_VIOLATION
>>>> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>>>> 
>>>> Registers:
>>>> EAX:0051C0F0
>>>> EBX:00000000
>>>> ECX:00000000
>>>> EDX:0109C0F0
>>>> ESI:0109AE88
>>>> EDI:00000400
>>>> CS:EIP:0023:0043E361
>>>> SS:ESP:002B:0157FE60  EBP:00000000
>>>> DS:002B  ES:002B  FS:0053  GS:002B
>>>> Flags:00010202
>>>> 
>>>> Call stack:
>>>> Address   Frame
>>>> 
>>>> //=====================================================
>>>> 
>>>> 
>>>> I am not able to figure out anything.
>>>> 
>>>> Looking for help.
>>>> 
>>>> Kuldeep Singh Meel
>>>> Sophomore
>>>> Department of Computer Science and Engineering
>>>> www.cse.iitb.ac.in/~kuldeepmeel
>>>> My Blog : http://sweetwithchilli.blogspot.com/
>>> 
>>> 
>>>> _______________________________________________
>>>> Condor-users mailing list
>>>> To unsubscribe, send a message to
>>>        condor-users-request@xxxxxxxxxxx with a
>>>> subject: Unsubscribe
>>>> You can also unsubscribe by visiting
>>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>> 
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/condor-users/
>>> 
>>> 
>>> 
>>> 
>> 
>> 