Hello,
Here are some critical bugs I found in GEMS EETM. I verified that the
fixes for these bugs make EETM more stable by running the GEMS simulation
on the STAMP benchmarks.
I thought HTM researchers might be interested in these.
Bug #1:
The thread ID is not passed to the tm_trap_handler function call.
Related Files and Code:
// microbenchmarks/transactional/common/transaction.c
void tm_trap_handler(int threadID){
...
}
...
void transaction_manager_stub(int dummy){
...
BEGIN_ESCAPE
asm volatile( \
"call %1\n" \
"mov %%g2, %%o0\n" \
"mov %%g3, %0\n" \
:"=r"(restart)
:"r"(&tm_trap_handler)
:"%o0", "%o7"
);
...
}
Reason:
The GCC inline assembly used in transaction_manager_stub() does not
correctly pass the thread ID (the value GEMS sends in register g2) to
tm_trap_handler().
Solution:
The second assembly line of "mov %%g2, %%o0\n" should be
moved before the "call %1\n" line.
Bug #2:
The software abort trap handler unrolls the undo log over its own stack
space.
Reason:
Eager VM logs the thread's stack space, which becomes a problem when log
unrolling touches the abort trap handler's own stack space. The handler
overwrites itself with log data and then behaves almost randomly: it may
write another log's data or treat an arbitrary memory region as the log.
Steps leading to an error:
1. A transaction enters a function and writes some data to the heap
and the stack.
2. The data gets written to the stack, the heap and the log.
3. The transaction exits the function and the stack shrinks.
4. The transaction gets aborted due to a conflict in the heap area.
5. The software abort trap handler starts log unrolling from the
shrunken stack pointer.
6. The undo log's stack targets and the trap handler's stack space
collide.
7. The trap handler unrolls the log and overwrites itself.
8. The corruption in the trap handler's stack space causes random log
unrolling.
9. The system goes haywire.
Solution:
1. Record the stack pointer just before trap handling starts.
2. Skip an undo log entry when its target address lies in the trap
handler's stack area, that is, when the target address is between the
current and the old stack pointer.
Related Files and Code:
// ruby/microbenchmarks/transactional/common/transaction.c
void tm_unroll_log_entry(unsigned int* entry){
int k;
// Word 16 of the entry holds the target address.
unsigned int *address = (unsigned int *) (*(entry+16) & ADDRESS_MASK);
for (k = 0; k < 16; k++){
// NOTE: This should be the place for checking writing to itself.
unsigned int data = *(entry + k);
*address = data;
address++;
}
}
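A minimal sketch of the check described in the solution above, not the
actual GEMS patch. It assumes a downward-growing stack, so the trap
handler's frames occupy the range [handler_sp, saved_sp). The names
in_handler_stack, unroll_entry_guarded, handler_sp and saved_sp are
hypothetical, and the logged data and target are passed as separate
pointers so the sketch also compiles on a 64-bit host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORDS_PER_ENTRY 16

/* Hypothetical helper: true if addr lies in [handler_sp, saved_sp),
 * i.e. in the stack region the trap handler itself is using.
 * Assumes the stack grows toward lower addresses. */
bool in_handler_stack(uintptr_t addr, uintptr_t handler_sp,
                      uintptr_t saved_sp) {
    return addr >= handler_sp && addr < saved_sp;
}

/* Hypothetical guarded unroll: restores the 16 logged words to target,
 * skipping every word that would land in the handler's own stack area.
 * saved_sp is the stack pointer recorded just before trap handling;
 * handler_sp is the handler's current stack pointer.
 * Returns the number of words actually restored. */
int unroll_entry_guarded(const uint32_t *data, uint32_t *target,
                         uintptr_t handler_sp, uintptr_t saved_sp) {
    assert(handler_sp <= saved_sp);
    int restored = 0;
    for (int k = 0; k < WORDS_PER_ENTRY; k++) {
        uintptr_t addr = (uintptr_t)(target + k);
        if (in_handler_stack(addr, handler_sp, saved_sp))
            continue; /* would overwrite the handler's own frame */
        target[k] = data[k];
        restored++;
    }
    return restored;
}
```

Skipping these words should be safe for the transaction because the stack
had already shrunk past them by the time of the abort.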
Bug #3:
The transaction is not aborted when it conflicts with non-TM code or an
escape action.
Reason:
Because of the 16-word cache line granularity, a false conflict is
possible between a transaction and non-transactional code. The current
GEMS version simply allows non-TM code to read the transactionally
isolated line.
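To illustrate how two unrelated words can end up on the same line, here
is a small sketch. It assumes 4-byte words, so a 16-word line is 64
bytes; the helper names line_base and same_line are hypothetical and not
part of GEMS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 16 words of 4 bytes each, i.e. a 64-byte cache line. */
#define WORDS_PER_LINE 16u
#define BYTES_PER_WORD 4u
#define LINE_BYTES (WORDS_PER_LINE * BYTES_PER_WORD)

static_assert((LINE_BYTES & (LINE_BYTES - 1u)) == 0,
              "line size must be a power of two");

/* Hypothetical helper: address of the first byte of the line
 * that contains addr. */
uintptr_t line_base(uintptr_t addr) {
    return addr & ~(uintptr_t)(LINE_BYTES - 1u);
}

/* Hypothetical helper: true when both addresses fall on the same
 * cache line, which is when a false conflict can occur. */
bool same_line(uintptr_t a, uintptr_t b) {
    return line_base(a) == line_base(b);
}
```

Under these assumptions, words at 0x1000 and 0x103C share a line even
though they are 60 bytes apart, while 0x103C and 0x1040 do not.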
Steps leading to an error:
Step 1: Thread 1's transaction trA writes x.
Step 2: Thread 2's non-transactional code reads y. Unfortunately, word x
and word y share the same cache line.
Step 3: Thread 1 allows thread 2's read request.
Step 4: Thread 2 starts a new transaction trB.
Step 5: trB reads the line containing x but does not report to L2,
because the line is in the shared state. It simply gets a cache hit, so
trA is unaware of the conflict.
Step 6: trA aborts.
Step 7: trB writes z = x + y and commits, using the value of x written
by the aborted trA. Word z is on a different cache line.
Solution:
When a non-TM read/write request arrives and hits the write-set perfect
filter of a TM thread, the transaction searches the undo log from the
front for the old value and sends it to the requestor instead of the
value from the L1 cache.
After sending the old value, the transaction removes the conflicting
address from the undo log and the write set, and then aborts.
Note: Sending the old value and removing the conflicting address from the
undo log and the write set is important because the abort must not later
overwrite whatever the non-TM code may have written at the time of the
conflict. Sending the old log value is also practiced in STM compilers.
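The lookup-and-remove step can be sketched in software as follows. This
is only an illustration under assumptions, not the GEMS implementation:
the LogEntry and UndoLog types and the take_old_value function are
hypothetical, the log is modeled as a small array whose first entry is
the oldest, and the write-set filter update is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORDS_PER_LINE 16
#define LOG_CAPACITY 8

/* Hypothetical undo log entry: one cache line of pre-write data. */
typedef struct {
    uintptr_t line_addr;
    uint32_t old_data[WORDS_PER_LINE];
} LogEntry;

/* Hypothetical undo log; entries[0] is the oldest entry. */
typedef struct {
    LogEntry entries[LOG_CAPACITY];
    size_t count;
} UndoLog;

/* Searches from the front of the log, copies the oldest logged
 * contents of line_addr into out, and removes every entry for that
 * line so a later abort cannot overwrite it again.
 * Returns false if the line was never logged. */
bool take_old_value(UndoLog *ulog, uintptr_t line_addr,
                    uint32_t out[WORDS_PER_LINE]) {
    assert(ulog->count <= LOG_CAPACITY);
    bool found = false;
    size_t kept = 0;
    for (size_t i = 0; i < ulog->count; i++) {
        if (ulog->entries[i].line_addr != line_addr) {
            ulog->entries[kept++] = ulog->entries[i]; /* keep entry */
            continue;
        }
        if (!found) { /* first match is the pre-transaction value */
            memcpy(out, ulog->entries[i].old_data,
                   sizeof ulog->entries[i].old_data);
            found = true;
        }
    }
    ulog->count = kept;
    return found;
}
```

The value copied into out is what would be sent to the non-TM requestor;
the remaining entries are still unrolled normally when the transaction
aborts.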
Related Files and Code:
// ruby/protocols/MESI_CMP_filter_directory-L1cache.sm
transition(M, Fwd_GETX, I)
{
d_sendDataToRequestor;
l_popRequestQueue;
}
transition(M, Inv, I)
{
f_sendDataToL2;
l_popRequestQueue;
}
- Byong-Wu Chong