Re: [Gems-users] logtm/tourmaline with ultrasparc II


Date: Thu, 7 Dec 2006 00:00:03 +0800
From: 郭锐 <timmyguo@xxxxxxxxxxxxxxxx>
Subject: Re: [Gems-users] logtm/tourmaline with ultrasparc II
Thanks very much for your reply.

> Solaris is now freely available from Sun for research purposes. You can
> make your own USIII+/Solaris checkpoints if you want from the ISOs of
> Solaris and the scripts that come with Simics. However... if you're
> determined to use your Suse7.3/USII checkpoints, read on.
I'm far more familiar with Linux than Solaris. In fact, I have never used
Solaris before. So my preferred solution is to make the Suse7.3/USII
checkpoints work, if I can.

> True. The general idea is to perform a "context save" on transaction
> begin and a "context restore" on transaction abort. Notably, you
If your transaction model executes the transaction speculatively, as in
TCC, I would say that a complete save/restore is needed. But that is of
course not the case in LogTM, where every modification is permanent
(it stands even after an abort).
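
To make the quoted "context save on begin, context restore on abort" idea
concrete, here is a minimal sketch. It is not the GEMS implementation:
RegisterCheckpoint, ReadReg and WriteReg are hypothetical names, and the two
callbacks stand in for whatever register accessors the simulator provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical accessors; a real module would wrap the simulator's own
// register read/write calls here.
using ReadReg  = std::function<uint64_t(int)>;
using WriteReg = std::function<void(int, uint64_t)>;

class RegisterCheckpoint {
public:
  explicit RegisterCheckpoint(std::vector<int> regs)
      : m_regs(std::move(regs)), m_valid(false) {}

  // Transaction begin: record the current value of every tracked register.
  void save(const ReadReg& read) {
    m_values.clear();
    for (int r : m_regs) m_values.push_back(read(r));
    m_valid = true;
  }

  // Transaction abort: write the recorded values back.
  // Returns false if save() was never called.
  bool restore(const WriteReg& write) const {
    if (!m_valid) return false;
    for (std::size_t i = 0; i < m_regs.size(); ++i)
      write(m_regs[i], m_values[i]);
    return true;
  }

private:
  std::vector<int> m_regs;         // register numbers to track
  std::vector<uint64_t> m_values;  // values captured by save()
  bool m_valid;                    // true once a checkpoint exists
};
```

Only the registers passed to the constructor are touched, so the whole
question in this thread reduces to choosing that list correctly.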

> shouldn't do anything with various control registers... but there is a
> tricky issue that arises during a context restore to a processor that is
> currently executing within the OS.
Do you mean that tricky things only occur when an abort happens in kernel
mode? But remember that you have turned interrupts off, so one can only
enter the kernel by making system calls. And I'm not sure I didn't make any
syscall in the transaction region. How did the kernel panic come about?
> 
> >Here's the original code in "RegisterStateWindowed.C" that hardcoded the
> >control register numbers:
> >  for(i=0; i < 126 ; i++){
> >    if((i <= 31) ||
> >       (i == 39 || (i == 43)) ||  // tick/stick
> >       (i == 45) ||               // pstate
> >       (i >= 53 && i <= 57) ||    // invalid
> >       (i >= 63 && i <= 67) ||    // invalid
> >       (i >= 73 && i <= 77) ||    // invalid
> >       (i >= 83 && i <= 87) ||    // invalid
> >       //(i >= 91 && i <= 95) ||    // window state
> >       (i >= 100 && i <= 110) || // interrupt status (just added)
> >       (i >= 111 && i <= 119))   // interrupt address
> >      {
> >        continue;
> >      }
> >    m_controlRegisterNumbers.insertAtBottom(i);
> >  }
> >
> >After comparing the register numbering for the two processors in Simics, I
> >modified it to this:
> >  for(i=0; i < 111 ; i++){
> >    if((i <= 31) ||
> >       (i == 39) ||               // tick
> >       (i == 43) ||               // pstate
> >       (i >= 51 && i <= 55) ||    // invalid
> >       (i >= 61 && i <= 65) ||    // invalid
> >       (i >= 71 && i <= 75) ||    // invalid
> >       (i >= 81 && i <= 85) ||    // invalid
> >       (i >= 97 && i <= 102) || // interrupt status (just added)
> >       (i >= 103 && i <= 106))   // interrupt address
> >      {
> >        continue;
> >      }
> >    m_controlRegisterNumbers.insertAtBottom(i);
> >  }
> >
> >I'm not quite sure of this modification, because I'm a stranger to SPARC.
> >It does work sometimes, but it can cause a kernel panic or deadlock in my
> >benchmark too. Can anybody check it for me?
> >
> >
> Hey Users! Anyone out there using USII's and LogTM or Tourmaline?
> 
> If you're unsure about which registers to save/restore on USII, you can
> always take an educated guess. That is, unfortunately, probably the best
> way to figure it out -- this isn't *exactly* the same as a context
> switch, after all.
I have no idea in what situations a control register is used. I made the
modification based on a comparison between the register numbering of the
two processors and the original code.
But I just realized that maybe even the original code is not reliable: the
register usage pattern is probably OS-dependent. What bad news!
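
One way to make the two variants easier to compare is to keep the skipped
ranges in a table instead of a long condition. The sketch below is only an
illustration: the ranges and labels are copied from my modified USII loop
above and are not verified against any manual, and SkipRange, kUsiiSkip,
isSkipped and controlRegisterNumbers are hypothetical names.

```cpp
#include <cassert>
#include <vector>

struct SkipRange {
  int first;
  int last;
  const char* reason;  // label taken from the comments in the quoted loop
};

// Ranges from the modified USII loop; unverified against the hardware docs.
static const SkipRange kUsiiSkip[] = {
  {0,   31,  "first 32 entries (no comment in the original)"},
  {39,  39,  "tick"},
  {43,  43,  "pstate"},
  {51,  55,  "invalid"},
  {61,  65,  "invalid"},
  {71,  75,  "invalid"},
  {81,  85,  "invalid"},
  {97,  102, "interrupt status"},
  {103, 106, "interrupt address"},
};

// True if register number i falls into any skipped range.
bool isSkipped(int i) {
  for (const SkipRange& r : kUsiiSkip)
    if (i >= r.first && i <= r.last) return true;
  return false;
}

// Register numbers in [0, limit) that would be saved and restored.
std::vector<int> controlRegisterNumbers(int limit) {
  std::vector<int> out;
  for (int i = 0; i < limit; ++i)
    if (!isSkipped(i)) out.push_back(i);
  return out;
}
```

Calling controlRegisterNumbers(111) selects the same numbers as the modified
loop, and a second table for USIII+ would make a side-by-side diff possible.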

Should I leave all the privileged registers alone, considering that
transactions are currently user-space only and interrupts are disabled?
How did you choose the set of control registers to save/restore? By guessing?
What's the relation between tick and tick_cmpr? The latter is undocumented
in the SPARC V9 manual.
And these are not documented either:
> >softint                       94
> >upa_config                    95
> >ecache_error_enable           96
> >asynchronous_fault_status     97
> >asynchronous_fault_address    98
> >out_intr_data0                99
> >out_intr_data1                100
> >out_intr_data2                101
> >intr_dispatch_status          102
> >in_intr_data0                 103
> >in_intr_data1                 104
> >in_intr_data2                 105
> >intr_receive                  106
> >serial_id                     107
> >pic                           108
> >pcr                           109
> >mid                           110
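
To see which of these registers my modified loop still saves and restores,
the list can be checked against the loop's skip condition. This is a small
sketch: the names and numbers are the ones listed above, the condition is
copied from the modified USII loop, and NamedReg, kUndocumented,
skippedByModifiedLoop and savedUndocumentedRegisters are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct NamedReg {
  const char* name;
  int number;
};

// Register names and numbers exactly as listed above.
static const NamedReg kUndocumented[] = {
  {"softint", 94}, {"upa_config", 95}, {"ecache_error_enable", 96},
  {"asynchronous_fault_status", 97}, {"asynchronous_fault_address", 98},
  {"out_intr_data0", 99}, {"out_intr_data1", 100}, {"out_intr_data2", 101},
  {"intr_dispatch_status", 102}, {"in_intr_data0", 103},
  {"in_intr_data1", 104}, {"in_intr_data2", 105}, {"intr_receive", 106},
  {"serial_id", 107}, {"pic", 108}, {"pcr", 109}, {"mid", 110},
};

// Same condition as the modified USII loop.
static bool skippedByModifiedLoop(int i) {
  return (i <= 31) || (i == 39) || (i == 43) ||
         (i >= 51 && i <= 55) || (i >= 61 && i <= 65) ||
         (i >= 71 && i <= 75) || (i >= 81 && i <= 85) ||
         (i >= 97 && i <= 102) || (i >= 103 && i <= 106);
}

// Names from the list that the modified loop (i < 111) does not skip.
std::vector<std::string> savedUndocumentedRegisters() {
  std::vector<std::string> kept;
  for (const NamedReg& r : kUndocumented)
    if (r.number < 111 && !skippedByModifiedLoop(r.number))
      kept.push_back(r.name);
  return kept;
}
```

Under these assumptions the result is softint, upa_config,
ecache_error_enable, serial_id, pic, pcr and mid, so those seven are still
saved and restored by the modified loop.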

> >
> >PS: I wonder why the original code could work: it tries to read
> >registers 124 and 125, which don't exist even in UltraSPARC III+.
> >
> That's a good question indeed. The fact that Simics doesn't kill
> execution with an error suggests there might be something in slots 124
> and 125 after all... though I cannot imagine what. What does
> SIM_register_name return for those values?
It should return '(NULL)' and raise exception number 6, based on my
experiment on USII.

One unrelated question: will Opal suffer from a similar problem? I don't
understand the retirement code clearly, especially the parts that handle
traps.
And one more: I thought the assertion failure bugs (the two in
SimicsProcessor::hitCallBack and the one in isReady(request)) had been
fixed, according to the release notes of GEMS 1.3. Why do I still run into
them? Sorry for asking before investigating it myself.
G.R.
