[Gems-users] Ruby Segmentation Fault


Date: Thu, 05 Feb 2009 20:47:06 +0200
From: Konstantinos Nikas <knikas@xxxxxxxxxxxxxxxxx>
Subject: [Gems-users] Ruby Segmentation Fault
Hi all,

We have an 8-core CMP and a transactional workload that uses only 2 threads. We bind the 2 threads to 2 specific processors (always avoiding core 0). When we set XACT_LOG_BUFFER_SIZE=2048, everything works fine. For smaller values (0, 256, 1024), though, the simulation fails.

At first we were getting warning messages like the following:

45936462 2 [2,0] endEscapeAction WARNING escape depth < 1. Depth = 0

Searching the mailing list, we came across a post which suggested adding a beginEscapeAction() call to hardwareAbort(). We included this in our code and the warning messages went away. However, the simulations still fail with a segmentation fault. gdb reported the following:

#0 RegisterState::restoreCheckpoint (this=0x0, m_proc=1) at /home/users/anastop/gems/gems-2.1//common/Vector.h:92
#1 0x00002aaab066bc5d in TransactionVersionManager::restartTransaction (this=0xa341340, thread=0, xact_level=1) at /home/users/anastop/gems/gems-2.1//common/Vector.h:109
#2 0x00002aaab0656b89 in TransactionInterfaceManager::restartTransactionCallback (this=0xa341230, thread=0) at log_tm/TransactionInterfaceManager.C:751
#3 0x00002aaaad20fb70 in ?? () from /home/simics/academic/simics-3.0.31/amd64-linux/lib/sparc-u3.so
#4 0x00002aaaad1aed99 in ?? () from /home/simics/academic/simics-3.0.31/amd64-linux/lib/sparc-u3.so
#5 0x00002aaaad1aec9a in ?? () from /home/simics/academic/simics-3.0.31/amd64-linux/lib/sparc-u3.so
#6 0x00002b1b49bc2eaf in SIM_continue () from /home/simics/academic/simics-3.0.31/amd64-linux/bin/libsimics-common.so
#7 0x00002b1b49b83a9c in ?? () from /home/simics/academic/simics-3.0.31/amd64-linux/bin/libsimics-common.so
#8 0x00002b1b4aaf739c in PyCFunction_Call (func=0x2aaaaab26560, arg=0x2aaaac9f6a50, kw=0x0) at /home/packages/python-2.4.2 .......
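
Frame #0 shows this=0x0, so restoreCheckpoint seems to be called through a null RegisterState pointer, presumably one fetched from a Vector in restartTransaction. Below is a minimal sketch of the kind of check we are thinking of adding to locate the empty entry. It assumes (we have not confirmed this) that the checkpoints are kept as a vector of pointers; Checkpoint, firstNullSlot and requirePopulated are hypothetical names, not GEMS code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-level register checkpoint (not GEMS code).
struct Checkpoint {
    int saved_pc;
};

// Returns the index of the first null slot, or -1 if every slot is populated.
// Calling a member function through a null slot yields this=0x0, as in frame #0.
int firstNullSlot(const std::vector<Checkpoint*>& slots) {
    for (std::size_t i = 0; i < slots.size(); ++i) {
        if (slots[i] == nullptr) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Aborts with a readable assertion failure instead of a segfault later on.
void requirePopulated(const std::vector<Checkpoint*>& slots) {
    assert(firstNullSlot(slots) == -1 && "null checkpoint slot");
}
```

Calling something like requirePopulated() just before the restore would at least tell us whether the pointer is already null at that point or is lost earlier.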

Any ideas, or suggestions on how to debug this more efficiently?

Kind regards,

Kostis

PS: A similar situation occurs when we run the same 2 threads on a 4-core machine. It works fine for XACT_LOG_BUFFER_SIZE=0, 256, 1024 and 2048, but fails for 32!
