Re: [Gems-users] An assertion error from LogTM and thread binding


Date: Sat, 21 Mar 2009 20:53:29 -0400
From: sjafri@xxxxxxxxxx
Subject: Re: [Gems-users] An assertion error from LogTM and thread binding
I think it's a context switch.

LogTM-SE uses a transaction level to check whether a processor is currently
executing a transaction. Suppose a processor is running a thread that is not
executing transactional code, so the transaction level is < 1. If a context
switch then happens, LogTM is unaware of it. Suppose further that the new
thread is in the middle of a transaction. It will call commit, and you get the
assertion violation because the transaction level is < 1.
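
To make the scenario concrete, here is a toy Python model of it. This is only a
sketch, not GEMS code: the class ToyTMTracker and the function
migrate_and_commit are made-up names, and the only thing taken from the real
simulator is the shape of the check (level >= 1 at commit). It assumes the
level is tracked per processor and is not carried along when a thread moves.

```python
class ToyTMTracker:
    """Toy per-processor transaction-level counter (hypothetical, not GEMS)."""

    def __init__(self, num_procs):
        # One nesting level per simulated processor, all starting outside a transaction.
        self.level = [0] * num_procs

    def begin(self, proc):
        # The begin is recorded on the processor where the thread currently runs.
        self.level[proc] += 1

    def commit(self, proc):
        # Same shape as the failing check: the level must be >= 1 at commit.
        if self.level[proc] < 1:
            raise AssertionError(
                "commit on proc %d with level %d" % (proc, self.level[proc]))
        self.level[proc] -= 1


def migrate_and_commit(tracker, begin_proc, commit_proc):
    """Begin on one processor, commit on another; return True if the commit passes."""
    tracker.begin(begin_proc)
    try:
        tracker.commit(commit_proc)
    except AssertionError:
        return False
    return True
```

With begin_proc equal to commit_proc the commit passes. If the thread is moved
between begin and commit, the commit lands on a processor whose level is still
0 and the check fails, which matches the "level = 0" in the error message.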





Quoting BYONG WU CHONG <bernard.chong@xxxxxxxx>:

> Hi,
> 
> I ran SPLASH2 barnes on the GEMS TM simulator configured for EE LogTM and got
> this error.
> 
> --------------------------- Error Msg Begin ---------------------------
> commitTransaction ERROR NOT IN XACT proc =3 logical_proc = 3 xid = 0 isOpen =
> 0 tid = -1 pc = [0x13db4, line 0x13d80] level = 0 time = 31050951
> simics-common: log_tm/TransactionInterfaceManager.C:243: void
> TransactionInterfaceManager::commitTransaction(int, int, bool): Assertion
> `m_transactionLevel[thread] >= 1' failed.
> Abort (SIGABRT) in main thread
> The simulation state has been corrupted. Simulation cannot continue.
> Please restart Simics.
> Starting command line. (May have skipped commands in script files.)
> [cpu3] v:0x0000000000013db4 p:0x000e0813db4  magic (sethi 0x800, %g0)
> Setting new inspection cpu: cpu3
> Traceback (most recent call last):
>   File "../../../gen-scripts/mfacet.py", line 308, in console_branch_internal
>     wait_for_string(get_console(), __prompt)
>   File "/uusoc/facility/res/arch/tools/corvus/simics-3.0.31/x86-linux/lib/python/text_console_common.py", line 10, in wait_for_string
>     wait_for_obj_hap("Xterm_Break_String", obj, break_id)
>   File "/uusoc/facility/res/arch/tools/corvus/simics-3.0.31/x86-linux/lib/python/cli_impl.py", line 3374, in wait_for_obj_hap
>     return wait_for_hap_common([hap_name, name, idx0])
>   File "/uusoc/facility/res/arch/tools/corvus/simics-3.0.31/x86-linux/lib/python/cli_impl.py", line 3352, in wait_for_hap_common
>     raise SimExc_Break, "Script branch interrupted"
> sim_core.SimExc_Break: Script branch interrupted
> Exception in python branch
> simics>
> --------------------------- Error Msg End ---------------------------
> 
> It sounds like commitTransaction() has been called without a prior call to
> beginTransaction(). But how?
> I checked that every UNLOCK macro is always called after a LOCK macro.
> 
> I searched Google and found this year-old thread, which had no answer.
> https://lists.cs.wisc.edu/archive/gems-users/2008-March/msg00124.shtml
> 
> Since I cannot bind a working cpu to a processor set, it seems that an unbound
> thread is moving around, beginning a transaction here and committing the same
> transaction there, which causes the trouble.
> 
> Information about thread binding is here
> https://lists.cs.wisc.edu/archive/gems-users/2007-October/msg00049.shtml
> According to this answer, I cannot bind the last thread to the last remaining
> cpu.
> 
> Should I bind the last thread to one of the N-1 processor sets? But as far as I
> know, binding two threads to one cpu isn't a good idea.
> 
> The Simics script I used for barnes simulation is this.
> 
> --------------------------- barnes.simics Begin ---------------------------
> @sys.path.append("../../../gen-scripts")
> @cwd = os.getcwd()
> @work_name = "barnes"
> @binary = "BARNES_local"
> @mb_dir = "benchmarks/SPLASH2/%s" % work_name
> @lib_path = "../../../libs/Solaris_SPARC"
> @import mfacet, tm_ee
> @from mfacet import *
> 
> @num_proc = SIM_number_processors()
> #@if num_proc > 1:
> #    num_proc = num_proc - 1
> 
> 
> # These commands are useful.
> # "isainfo -b\n",
> # "isalist\n",
> # "psrinfo\n",
> 
> magic-break-enable
> @console_commands(("ulimit -c 0\n",
>                    "psrset -c 0\n",
>                    "psrset -c 1\n",
>                    "psrset -c 2\n",
>                    "mount /host\n",
>                    "export LD_LIBRARY_PATH=%s\n" % lib_path,
>                    "cd /host/%s/../../../%s \n" % (cwd, mb_dir),
>                    "./%s < input%02d \n" % (binary, num_proc),
>                    ), "#")
> c
> # Note that this is the first magic breakpoint
> @tm_ee.start_TM()
> @conf.sim.cpu_switch_time = 1
> # This is the second magic breakpoint
> c
> @mfacet.run_sim_command("ruby0.dump-stats SPLASH2_%s_LLTM_%02d.stats" % (work_name, num_proc))
> --------------------------- barnes.simics End ---------------------------
> 
> Could someone help me understand why I am getting this assertion error?
> 
> - Bernard
> 

