
Re: [Condor-users] Problem with HAD :Intercepting an unhandled exception.



Hi Ben
Yes, I made the changes but I am still getting the errors. How can I implement High Availability on Windows?

Eagerly waiting for your response
Kuldeep Singh Meel
Sophomore
Department of Computer Science and Engineering
www.cse.iitb.ac.in/~kuldeepmeel
My Blog : http://sweetwithchilli.blogspot.com/


On Tue, Jul 20, 2010 at 10:48 AM, Burnett, Ben <ben.burnett@xxxxxxxx> wrote:
Oh, and the HA_LOCK_URL is not recognized/handled on Windows, since URL parsing is not supported.

-B
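
For reference, the settings discussed in this thread follow the job-queue high-availability pattern from the Condor manual. A minimal sketch of the fuller form is below. Assumptions: the $(VALID_SPOOL_FILES) prefix and the SCHEDD_NAME line come from the manual's example rather than from this thread, /share/spool is the poster's path, and the two timing values are simply the ones the master log reports (poll=300s hold=3600s). Since HA_LOCK_URL is not handled on Windows, as noted above, this should be read as the Unix form only.

```
# Sketch only: shared-spool failover for the schedd (Unix form)
MASTER_HA_LIST    = SCHEDD
SPOOL             = /share/spool
HA_LOCK_URL       = file:/share/spool
VALID_SPOOL_FILES = $(VALID_SPOOL_FILES) SCHEDD.lock
SCHEDD_NAME       = had-schedd@

# Timing values as reported in the master log quoted in this thread
HA_POLL_PERIOD    = 300
HA_LOCK_HOLD_TIME = 3600
```

The spool directory has to be on storage that every candidate schedd machine can reach, since the lock file lives there.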

On 2010-07-19, at 4:38 PM, kuldeep singh meel wrote:

> Hi Tim,
> I got the following stack trace with Visual Studio.
> What we are trying to achieve is this: we have a single pool submission
> point, and we want to transfer the scheduler if the primary scheduler goes
> down. I found this approach in the manual. If you know of a better way,
> please let me know.
>
> Thanks in Advance
> //=====================================================
> PID: 3564
> Exception code: C0000005 ACCESS_VIOLATION
> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>
> Registers:
> EAX:0051C0F0
> EBX:00000000
> ECX:00000000
> EDX:00001118
> ESI:00E39F28
> EDI:00000400
> CS:EIP:0023:0043E361
> SS:ESP:002B:0160F768  EBP:0160F798
> DS:002B  ES:002B  FS:0053  GS:002B
> Flags:00010206
>
> Call stack:
> Address   Frame
> 0043E361  0160F798  CondorLock::AcquireLock (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\condor_lock.cpp:74)
> 004041FB  0160F7AC  admin_command_handler (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\master.cpp:437)
> 004320B4  0160F7D0  DaemonCore::CallCommandHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3550)
> 00438C4A  0160F9C8  DaemonCore::HandleReq (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:4894)
> 00437057  0160F9DC  DaemonCore::HandleReqSocketHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3701)
> 0043900E  0160FA0C  DaemonCore::CallSocketHandler_worker (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3461)
> 00439364  0160FA2C  DaemonCore::CallSocketHandler_worker_demarshall (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3424)
> 00439639  0160FA54  DaemonCore::CallSocketHandler (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3412)
> 0043B63A  0160FAF4  DaemonCore::Driver (c:\condor\execute\dir_5284\userdir\src\condor_daemon_core.v6\daemon_core.cpp:3325)
> 761114D1  0160FB08  HeapFree+14
> 0048EFC3  0160FB48  free (f:\dd\vctools\crt_bld\self_x86\crt\src\free.c:110)
> 00403A37  0160FB5C  MyString::`vector deleting destructor' (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\master.cpp:1527)
> 00465A94  0160FB64  SimpleList<MyString>::~SimpleList<MyString> (c:\condor\execute\dir_5284\userdir\src\condor_c++_util\simplelist.h:44)
> 00405E33  0160FB74  daemon::RealStart (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\masterdaemon.cpp:509)
> 00405E50  0160FBB0  daemon::RealStart (c:\condor\execute\dir_5284\userdir\src\condor_master.v6\masterdaemon.cpp:779)
> 778A371E  0160FCC4  RtlImageNtHeader+73A
> 7789E20C  0160FD4C  RtlInitUnicodeString+164
> 7789E20C  0160FD50  RtlInitUnicodeString+164
> 7789DF72  0160FDC0  RtlAllocateHeap+AC
> Kuldeep Singh Meel
> Sophomore
> Department of Computer Science and Engineering
> www.cse.iitb.ac.in/~kuldeepmeel
> My Blog : http://sweetwithchilli.blogspot.com/
>
>
> On Tue, Jul 20, 2010 at 1:25 AM, Timothy St. Clair <tstclair@xxxxxxxxxx> wrote:
>
>> If you have MS.Visual Studio.NET, you will want to attach with the
>> debugger; this will likely be the best way to track down the
>> error.
>>
>> Just as ?... usually if you want to set up HA the spool is in some shared
>> location, unless you are intending to set it up locally for process
>> redundancy.  Either way, I'm curious: what are you trying to accomplish
>> with this deployment configuration?
>>
>> Cheers,
>> Tim
>>
>> On Mon, 2010-07-19 at 17:38 +1200, kuldeep singh meel wrote:
>>> hi Ben
>>> The master log output is :
>>> 07/19 17:35:52 Using config source: C:\condor\condor_config
>>> 07/19 17:35:52 Using local config sources:
>>> 07/19 17:35:52    C:\condor/condor_config.local
>>> 07/19 17:35:52 DaemonCore: Command Socket at <10.1.1.108:50637>
>>> 07/19 17:35:52 Will use UDP to update collector <10.1.1.108:9618>
>>> 07/19 17:35:52 Will use UDP to update collector <10.1.1.104:9618>
>>> 07/19 17:35:52 Created HA lock for SCHEDD; URL=""> >>> poll=300s hold=3600s
>>> 07/19 17:35:52 Daemon NEGOTIATOR is controlled by HAD
>>> 07/19 17:35:52 Intercepting an unhandled exception.
>>> 07/19 17:35:52 Dropping a core file.
>>>
>>>
>>> and core file says:
>>> PID: 2632
>>> Exception code: C0000005 ACCESS_VIOLATION
>>> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>>>
>>>
>>> Registers:
>>> EAX:0051C0F0
>>> EBX:00000000
>>> ECX:00000000
>>> EDX:00CBBCA8
>>> ESI:00E79F40
>>> EDI:00000400
>>> CS:EIP:0023:0043E361
>>> SS:ESP:002B:015DFE60  EBP:00000000
>>> DS:002B  ES:002B  FS:0053  GS:002B
>>> Flags:00010202
>>>
>>>
>>> Call stack:
>>> Address   Frame
>>>
>>>
>>>
>>>
>>> Kuldeep Singh Meel
>>> Sophomore
>>> Department of Computer Science and Engineering
>>> www.cse.iitb.ac.in/~kuldeepmeel
>>> My Blog : http://sweetwithchilli.blogspot.com/
>>>
>>>
>>> On Mon, Jul 19, 2010 at 5:29 PM, Burnett, Ben <ben.burnett@xxxxxxxx>
>>> wrote:
>>>        What do the master logs say when the debug level is turned on
>>>        full?  Also, check the core file after you run it again; often
>>>        there is a stack trace included at the tail end of the file.
>>>
>>>        -B
>>>
>>>
>>>        On 2010-07-18, at 10:42 PM, kuldeep singh meel wrote:
>>>
>>>> Hi Everyone
>>>> We want to have SCHEDD with HAD, and we used the following configuration:
>>>>
>>>>
>>>> MASTER_HA_LIST = SCHEDD
>>>> SPOOL = /share/spool
>>>> HA_LOCK_URL = file:/share/spool
>>>> VALID_SPOOL_FILES = SCHEDD.lock
>>>>
>>>>
>>>> But when we restart Condor on all machines, the master gives
>>>> the following error:
>>>>
>>>> Intercepting an unhandled exception.
>>>>
>>>> with the following core file:
>>>>
>>>> //=====================================================
>>>> PID: 3888
>>>> Exception code: C0000005 ACCESS_VIOLATION
>>>> Fault address:  0043E361 01:0003D361 C:\condor\bin\condor_master.exe
>>>>
>>>> Registers:
>>>> EAX:0051C0F0
>>>> EBX:00000000
>>>> ECX:00000000
>>>> EDX:0109C0F0
>>>> ESI:0109AE88
>>>> EDI:00000400
>>>> CS:EIP:0023:0043E361
>>>> SS:ESP:002B:0157FE60  EBP:00000000
>>>> DS:002B  ES:002B  FS:0053  GS:002B
>>>> Flags:00010202
>>>>
>>>> Call stack:
>>>> Address   Frame
>>>>
>>>> //=====================================================
>>>>
>>>>
>>>> I am not able to figure out anything.
>>>>
>>>> looking for help
>>>>
>>>> Kuldeep Singh Meel
>>>> Sophomore
>>>> Department of Computer Science and Engineering
>>>> www.cse.iitb.ac.in/~kuldeepmeel
>>>> My Blog : http://sweetwithchilli.blogspot.com/
>>>
>>>
>>>> _______________________________________________
>>>> Condor-users mailing list
>>>> To unsubscribe, send a message to
>>>        condor-users-request@xxxxxxxxxxx with a
>>>> subject: Unsubscribe
>>>> You can also unsubscribe by visiting
>>>> https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>>>>
>>>> The archives can be found at:
>>>> https://lists.cs.wisc.edu/archive/condor-users/
>>>