Thanks very much for your reply.
> Solaris is now freely available from Sun for research purposes. You can
> make your own USIII+/Solaris checkpoints if you want from the ISOs of
> Solaris and the scripts that come with Simics. However... if you're
> determined to use your Suse7.3/USII checkpoints, read on.
I'm far more familiar with Linux than Solaris. In fact, I have never used
Solaris before. So my preferred solution is to make the Suse7.3/USII
checkpoints work, if I can.
> True. The general idea is to perform a "context save" on transaction
> begin and a "context restore" on transaction abort. Notably, you
If your transaction model executes the transaction speculatively, as in
TCC, I would say that a complete save/restore is needed. But that is of
course not the case in LogTM, where every modification is steady (it stands
even with an abort).
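The "context save on begin, context restore on abort" idea quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: RegisterFile and Checkpoint are hypothetical names, not GEMS or Simics types, and which register numbers belong in the list is exactly the open question of this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a processor's register file (not a GEMS/Simics type).
struct RegisterFile {
    std::vector<std::uint64_t> regs;
    explicit RegisterFile(std::size_t n) : regs(n, 0) {}
};

// Snapshot of a chosen set of registers, taken at transaction begin.
class Checkpoint {
public:
    // "Context save": record the current value of every register in `which`.
    Checkpoint(const RegisterFile& rf, const std::vector<int>& which) {
        for (int r : which) {
            assert(r >= 0);  // register numbers are non-negative
            saved_.emplace_back(r, rf.regs.at(static_cast<std::size_t>(r)));
        }
    }

    // "Context restore": write the recorded values back on transaction abort.
    // Registers that were not in `which` are left untouched.
    void restore(RegisterFile& rf) const {
        for (const auto& p : saved_) {
            rf.regs.at(static_cast<std::size_t>(p.first)) = p.second;
        }
    }

private:
    std::vector<std::pair<int, std::uint64_t>> saved_;
};
```

Only the registers named at save time are rolled back, so a register left out of the list keeps whatever value it had at the moment of the abort.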
> shouldn't do anything with various control registers... but there is a
> tricky issue that arises during a context restore to a processor that is
> currently executing within the OS.
Do you mean that tricky things only occur when an abort happens in kernel
mode?
But remember that you have turned interrupts off, so one can only jump into
the kernel by making system calls. And I'm not sure I didn't make any syscall
in the transaction region. How did the kernel panic come about?
>
> >Here's the original code in "RegisterStateWindowed.C" that hardcoded the
> >control register numbers:
> > for(i=0; i < 126 ; i++){
> > if((i <= 31) ||
> > (i == 39 || (i == 43)) || // tick/stick
> > (i == 45) || // pstate
> > (i >= 53 && i <= 57) || // invalid
> > (i >= 63 && i <= 67) || // invalid
> > (i >= 73 && i <= 77) || // invalid
> > (i >= 83 && i <= 87) || // invalid
> > //(i >= 91 && i <= 95) || // window state
> > (i >= 100 && i <= 110) || // interrupt status (just added)
> > (i >= 111 && i <= 119)) // interrupt address
> > {
> > continue;
> > }
> > m_controlRegisterNumbers.insertAtBottom(i);
> > }
> >
> >After comparing the register numbering for the two processors in Simics, I
> >modified it to this:
> > for(i=0; i < 111 ; i++){
> > if((i <= 31) ||
> > (i == 39) || // tick
> > (i == 43) || // pstate
> > (i >= 51 && i <= 55) || // invalid
> > (i >= 61 && i <= 65) || // invalid
> > (i >= 71 && i <= 75) || // invalid
> > (i >= 81 && i <= 85) || // invalid
> > (i >= 97 && i <= 102) || // interrupt status (just added)
> > (i >= 103 && i <= 106)) // interrupt address
> > {
> > continue;
> > }
> > m_controlRegisterNumbers.insertAtBottom(i);
> > }
> >
> >I'm not quite sure of this modification, because I'm a stranger to SPARC. It
> >does work sometimes, but it can cause kernel panic or deadlock in my
> >benchmark too. Can anybody check it for me?
> >
> >
> Hey Users! Anyone out there using USII's and LogTM or Tourmaline?
>
> If you're unsure about which registers to save/restore on USII, you can
> always take an educated guess. That is, unfortunately, probably the best
> way to figure it out -- this isn't *exactly* the same as a context
> switch, after all.
I have no idea in what situations a control register is used. I made the
modification based on a comparison between the register numbering of the two
processors and the original code.
But I just realized that maybe even the original code is not reliable: the
register usage pattern should be OS-dependent. What bad news!
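To make it easier to check which registers the modified loop actually keeps, its skip test can be pulled out into a standalone predicate. This is a sketch only: the function names are made up for illustration, the ranges are copied unchanged from the modified loop, and the name table is just the 94..110 list quoted in this thread, not an authoritative register map.

```cpp
#include <string>
#include <utility>
#include <vector>

// Skip test from the modified USII loop, unchanged, as a standalone predicate.
bool isSkippedUSII(int i) {
    return (i <= 31) ||
           (i == 39) ||               // tick
           (i == 43) ||               // pstate
           (i >= 51 && i <= 55) ||    // invalid
           (i >= 61 && i <= 65) ||    // invalid
           (i >= 71 && i <= 75) ||    // invalid
           (i >= 81 && i <= 85) ||    // invalid
           (i >= 97 && i <= 102) ||   // interrupt status
           (i >= 103 && i <= 106);    // interrupt address
}

// Register numbers the modified loop would insert (0 <= i < 111).
std::vector<int> savedRegisterNumbers() {
    std::vector<int> out;
    for (int i = 0; i < 111; ++i) {
        if (!isSkippedUSII(i)) out.push_back(i);
    }
    return out;
}

// Names for registers 94..110, copied from the list quoted in this thread.
const std::vector<std::pair<int, std::string>>& namedUSIIRegisters() {
    static const std::vector<std::pair<int, std::string>> table = {
        {94, "softint"}, {95, "upa_config"}, {96, "ecache_error_enable"},
        {97, "asynchronous_fault_status"}, {98, "asynchronous_fault_address"},
        {99, "out_intr_data0"}, {100, "out_intr_data1"}, {101, "out_intr_data2"},
        {102, "intr_dispatch_status"}, {103, "in_intr_data0"},
        {104, "in_intr_data1"}, {105, "in_intr_data2"}, {106, "intr_receive"},
        {107, "serial_id"}, {108, "pic"}, {109, "pcr"}, {110, "mid"}};
    return table;
}

// Names of the listed registers that are NOT skipped, i.e. saved and restored.
std::vector<std::string> savedNamedRegisters() {
    std::vector<std::string> out;
    for (const auto& entry : namedUSIIRegisters()) {
        if (!isSkippedUSII(entry.first)) out.push_back(entry.second);
    }
    return out;
}
```

With the ranges as posted, numbers 94..96 and 107..110 are not skipped, so softint, upa_config, ecache_error_enable, serial_id, pic, pcr and mid would all be saved and restored. Whether that is desirable is a separate question.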
Should I leave all the privileged registers alone, considering that
transactions are currently user-space only and run with interrupts disabled?
How did you choose the set of control registers to save/restore? By guessing?
What's the relation between tick and tick_cmpr? The latter is undocumented
in the SPARC V9 manual.
These are not documented either:
> >softint 94
> >upa_config 95
> >ecache_error_enable 96
> >asynchronous_fault_status 97
> >asynchronous_fault_address 98
> >out_intr_data0 99
> >out_intr_data1 100
> >out_intr_data2 101
> >intr_dispatch_status 102
> >in_intr_data0 103
> >in_intr_data1 104
> >in_intr_data2 105
> >intr_receive 106
> >serial_id 107
> >pic 108
> >pcr 109
> >mid 110
> >
> >PS: I wonder why the original code could work, it tries to read the
> >registers 124 and 125, which doesn't exists even in UltraSPARC III+.
> >
> That's a good question indeed. The fact that Simics doesn't kill
> execution with an error suggests there might be something in slots 124
> and 125 after all... though I cannot imagine what. What does
> SIM_register_name return for those values?
It should return '(NULL)' and raise exception number 6, based on my experiment
on USII.
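One way to avoid touching slots that do not exist would be to build the list only from register numbers that have a name. The sketch below assumes a lookup that returns a null pointer for unimplemented numbers; NameLookup and existingRegisters are hypothetical names, and the lookup is passed in as a parameter so that no particular Simics call is assumed here. In Simics the failed lookup also leaves an exception pending, and handling that is not shown.

```cpp
#include <functional>
#include <vector>

// Hypothetical lookup type: returns the register's name, or nullptr when the
// number is not implemented on the target processor. In real code this would
// wrap the simulator's own name query.
using NameLookup = std::function<const char*(int)>;

// Collect every register number below `limit` for which the lookup yields a name.
std::vector<int> existingRegisters(int limit, const NameLookup& lookup) {
    std::vector<int> out;
    for (int i = 0; i < limit; ++i) {
        if (lookup(i) != nullptr) out.push_back(i);
    }
    return out;
}
```

The skip test for tick, pstate and the other excluded ranges would then be applied on top of this list, so a loop bound that is too large (such as 126 on USII) becomes harmless.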
One unrelated question: will Opal suffer from a similar problem? I don't
understand the retirement code clearly, especially the parts that handle traps.
And one more: I thought the assertion failure bug (the two in
SimicsProcessor::hitCallBack and one in isReady(request)) had been fixed,
according to the release notes of GEMS1.3.
Why would I still run into it? Sorry for asking before investigating it
myself.
G.R.