Hello,
Here are some critical bugs I found in GEMS EETM. I verified that the
fixes below make EETM more stable by running the GEMS simulation on the
STAMP benchmarks.
I thought HTM researchers might be interested in these.
Bug #1:
The thread id is not passed to the tm_trap_handler function call.
Related Files and Code:
// microbenchmarks/transactional/common/transaction.c
void tm_trap_handler(int threadID){
...
}
...
void transaction_manager_stub(int dummy){
...
BEGIN_ESCAPE
asm volatile( \
"call %1\n" \
"mov %%g2, %%o0\n" \
"mov %%g3, %0\n" \
:"=r"(restart)
:"r"(&tm_trap_handler)
:"%o0", "%o7"
);
...
}
Reason:
The GCC inline assembly in transaction_manager_stub() does not
correctly pass the thread id (the value GEMS places in register g2)
to tm_trap_handler().
Solution:
The second assembly line, "mov %%g2, %%o0\n", should be moved before
the "call %1\n" line.
Bug #2:
The software abort trap handler unrolls the log over its own stack
space.
Reason:
Eager VM logs writes to the thread's stack space, which causes trouble
when log unrolling touches the abort trap handler's own stack space.
The handler overwrites itself with log data and then behaves almost
randomly: it may write another log's data or treat a random memory
region as the log.
Steps leading to an error:
1. A transaction enters a function and writes some data to the heap
and the stack.
2. The data is written to the stack, the heap, and the log.
3. The transaction exits the function and the stack shrinks.
4. The transaction is aborted due to a conflict in the heap area.
5. The software abort trap handler starts log unrolling, running on
the stack from the shrunken stack pointer.
6. The addresses recorded in the undo log and the trap handler's
stack space collide.
7. The trap handler unrolls the log and overwrites itself.
8. Corruption in the trap handler's stack space causes random log
unrolling.
9. The system goes haywire.
Solution:
1. Record the stack pointer just before trap handling starts.
2. Skip an undo log entry when its target address lies in the trap
handler's stack area. This is the case when the target address is
between the current stack pointer and the old (recorded) stack pointer.
Related Files and Code:
// ruby/microbenchmarks/transactional/common/transaction.c
void tm_unroll_log_entry(unsigned int* entry){
int k;
unsigned int *address = (unsigned int *)(*(entry+16) & ADDRESS_MASK);
for (k = 0; k < 16; k++){
// NOTE: This should be the place for checking writing to itself.
unsigned int data = *(entry + k);
*address = data;
address++;
}
}
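A minimal sketch of the check described in the solution, not the actual
patch. The helper names (in_handler_stack, unroll_entry_guarded) and the
two stack pointer parameters are hypothetical, and the half-open range
assumes the handler's frames sit between the current stack pointer and the
one recorded before trap handling. Each logged word is restored unless its
target address falls inside that range.

```c
#include <stdint.h>

#define WORDS_PER_ENTRY 16 /* one log entry covers 16 data words */

/* Hypothetical helper: nonzero if addr lies in the half-open range
 * spanned by the two stack pointers, whichever one is lower. */
static int in_handler_stack(uintptr_t addr, uintptr_t sp_a, uintptr_t sp_b)
{
    uintptr_t lo = sp_a < sp_b ? sp_a : sp_b;
    uintptr_t hi = sp_a < sp_b ? sp_b : sp_a;
    return addr >= lo && addr < hi;
}

/* Hypothetical guarded unroll: copy the logged words back to target,
 * skipping any word that would land in the handler's own stack area.
 * Returns the number of words actually restored. */
static int unroll_entry_guarded(const unsigned int *data,
                                unsigned int *target,
                                uintptr_t cur_sp, uintptr_t old_sp)
{
    int k;
    int restored = 0;
    for (k = 0; k < WORDS_PER_ENTRY; k++) {
        if (in_handler_stack((uintptr_t)(target + k), cur_sp, old_sp))
            continue; /* never overwrite the running handler's stack */
        target[k] = data[k];
        restored++;
    }
    return restored;
}
```

In tm_unroll_log_entry() the same test would go where the NOTE comment
is, with the recorded stack pointer passed in from the trap handler.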
Bug #3:
A transaction is not aborted when it conflicts with non-TM code or an
escape action.
Reason:
Because of the 16-word cache line granularity, false conflicts between
a transaction and non-transactional code are possible. The current GEMS
version simply allows non-TM code to read the transactionally isolated
line.
Steps leading to an error:
Step 1: Thread 1's transaction trA writes x.
Step 2: Thread 2's non-transactional code reads y. Unfortunately, word
x and word y share the same cache line.
Step 3: Thread 1 allows thread 2's read request.
Step 4: Thread 2 starts a new transaction trB.
Step 5: trB reads the line holding x, but the read is not reported to
L2 because the line is in shared state. It is a simple line hit, so trA
is unaware of the conflict.
Step 6: trA aborts.
Step 7: trB writes z = x + y and commits. Word z is on a different
cache line.
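To make the false conflict in the steps above concrete, here is a small
sketch of how two different words end up on the same line. It assumes
4-byte words, so a 16-word line is 64 bytes; the helper names are
hypothetical and not part of GEMS.

```c
#include <stdint.h>

#define WORD_BYTES     4u  /* assumption: 4-byte words */
#define WORDS_PER_LINE 16u /* 16-word granularity, as described above */
#define LINE_BYTES     (WORD_BYTES * WORDS_PER_LINE)

/* Hypothetical helper: index of the cache line holding addr. */
static uintptr_t line_of(uintptr_t addr)
{
    return addr / LINE_BYTES;
}

/* Two addresses conflict at line granularity if they share a line,
 * even when they name different words (like x and y above). */
static int same_line(uintptr_t a, uintptr_t b)
{
    return line_of(a) == line_of(b);
}
```

With these assumptions, addresses 0x1000 and 0x103c are different words
on the same line, while 0x1040 starts the next line.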
Solution:
When a non-TM read/write request arrives and hits the write-set perfect
filter of a TM thread, the transaction searches its undo log from the
front for the old value and sends that value to the requestor instead of
the value from the L1 cache.
After sending the old value, the transaction removes the conflicting
address from the undo log and the write set, and then aborts.
Note: Sending the old value and removing the conflicting address from
the undo log and the write set is important because we don't want to
later overwrite what the non-TM code might have written at the time of
the conflict. Also note that sending the old log value is a practice
used in STM compilers, too.
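The log handling in this solution can be sketched in plain C as follows.
This is only an illustration under simplifying assumptions: the real
change belongs in the cache protocol, the log_entry_t layout (one word
per entry plus a valid flag) is hypothetical, and so are the function
names. Searching from the front returns the value from before the
transaction's first write, and dropping the entries keeps a later unroll
from overwriting what the non-TM code wrote.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified undo log entry: one word per entry. */
typedef struct {
    uintptr_t    addr;      /* address that was overwritten */
    unsigned int old_value; /* value before the transactional write */
    int          valid;     /* 0 once the entry has been dropped */
} log_entry_t;

/* Search from the front of the log: the first valid match holds the
 * value from before the transaction's first write to addr.
 * Returns 1 and stores the value in *out if found, else 0. */
static int find_old_value(const log_entry_t *log, size_t n,
                          uintptr_t addr, unsigned int *out)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (log[i].valid && log[i].addr == addr) {
            *out = log[i].old_value;
            return 1;
        }
    }
    return 0;
}

/* Drop every entry for addr so that unrolling on abort does not
 * overwrite a value the non-TM code has written in the meantime.
 * Returns the number of entries dropped. */
static size_t drop_entries(log_entry_t *log, size_t n, uintptr_t addr)
{
    size_t i;
    size_t dropped = 0;
    for (i = 0; i < n; i++) {
        if (log[i].valid && log[i].addr == addr) {
            log[i].valid = 0;
            dropped++;
        }
    }
    return dropped;
}
```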
Related Files and Code:
// ruby/protocols/MESI_CMP_filter_directory-L1cache.sm
transition(M, Fwd_GETX, I)
{
d_sendDataToRequestor;
l_popRequestQueue;
}
transition(M, Inv, I)
{
f_sendDataToL2;
l_popRequestQueue;
}
- Byong-Wu Chong