OK Dan, thank you for your reply.
At the moment, I'm simulating an 8-CPU SPARC CMP system
with a DNUCA L2 cache (loading Ruby with MOESI_CMP_NUCA). In this implementation,
the L1s are in SLICC, is that correct? If so, the "Fast Path" is disabled and I have to
count L1 misses in Profiler.C (hoping that for MOESI_CMP_NUCA this is the
correct way...). I'll develop my own protocol later.
As I said in previous posts, I need to count L1
misses for each CPU because I have to stop the simulation when this number
reaches a given threshold. Is the m_perProcTotalMisses field (of type
Vector<integer_t>) the one I have to check? (i.e., does this
vector contain the number of L1 misses?)
Perhaps it would be better to check the m_misses
integer field of class CacheProfiler (class Profiler contains 3 pointers to
CacheProfiler: L1-I, L1-D and L2).
Examining both Profiler.C and CacheProfiler.C, and
also looking at the Ruby output file for statistics, it seems that m_misses is the
number of L1 misses in ALL L1 caches in the system... is that correct? If not, can
I check m_misses in the addStatSample() method of the CacheProfiler class?
(i.e., when is this method invoked, and by whom?)
Finally, how many instances of Profiler are
allocated when Ruby is loaded?
Sorry for being so inexperienced, and thank you
again for your support.
Marco
----- Original Message -----
From:
To:
Sent: Tuesday, February 20, 2007 3:05 PM
Subject: Re: [Gems-users] Help in understanding class Sequencer
Hi Marco,
Marco Solinas wrote:
Hi all,
in the Sequencer.C file, what does the TSO flag
mean (it is checked in many functions)?
The TSO flag
is intended to emulate the timing of the TSO memory model with a store buffer.
I say "emulate" because Simics forces sequentially consistent executions, so
Ruby can only "fake" TSO timing. It is not a widely popular flag.
In the makeRequest function, if I
want to count the number of L1 misses, is it enough to check the return value
of the doRequest() call?
If Fast Path hits are
enabled, doRequest will return TRUE on L1 hits. If "Fast Path" is disabled
(aka the L1s are in SLICC), then doRequest won't return true, even on an L1
hit. If your protocol does not use the Fast Path flag, it might be easier to
count L1 misses in the Profiler instead of the sequencer.
How should I take the previous
if branch into account?
If by branch you mean
control-transfer instruction, the answer is no.
As it involves the TSO flag, I
can't understand how to manage this case.
If the TSO
flag is ON and the access is a store AND there is room in the store buffer
then makeRequest() will return early with a "fast path" hit, to emulate the
timing of a store buffer. If you haven't turned TSO on, there is no reason to
worry about this case.
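
The early return described above can be sketched roughly as follows. This is only a toy model written for illustration, not the Ruby Sequencer code: StoreBufferModel, AccessType, RequestOutcome and drainOne() are hypothetical names, and the buffer capacity is an assumed parameter.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical types for illustration only; they are not GEMS/Ruby classes.
enum class AccessType { Load, Store };
enum class RequestOutcome { FastPathHit, SentToCache };

class StoreBufferModel {
public:
    StoreBufferModel(bool tsoEnabled, std::size_t capacity)
        : m_tso(tsoEnabled), m_capacity(capacity) {}

    // Mirrors the rule in the text: with TSO on, a store that finds room
    // in the store buffer completes immediately as a "fast path" hit.
    // Every other access goes on to the cache hierarchy.
    RequestOutcome makeRequest(AccessType type, std::uint64_t addr) {
        if (m_tso && type == AccessType::Store && m_buffer.size() < m_capacity) {
            m_buffer.push_back(addr);
            return RequestOutcome::FastPathHit;
        }
        return RequestOutcome::SentToCache;
    }

    // Retire the oldest buffered store; returns false if the buffer is empty.
    bool drainOne() {
        if (m_buffer.empty()) return false;
        m_buffer.pop_front();
        return true;
    }

    std::size_t occupancy() const { return m_buffer.size(); }

private:
    bool m_tso;
    std::size_t m_capacity;
    std::deque<std::uint64_t> m_buffer;
};
```

With TSO off, makeRequest() in this sketch never takes the early return, which matches the remark that the case can be ignored when the flag is not set.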
Is there an instance of Sequencer for each CPU?
If not, what does the m_version field of the class Sequencer
mean?
There is an instance of the sequencer for
every CPU.
Is there a better way to count the number of L1
misses than checking the doRequest() return
value?
If your protocol uses fast path hits, then the
Sequencer is a great place. Otherwise, you might want to have a look at
Profiler.C. Some protocols actually don't make the right calls into the
profiler, but profiling calls are not difficult to add.
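
As a rough idea of what an added profiling call could feed, here is a minimal per-processor miss counter with a stop threshold. It is a standalone sketch under stated assumptions: PerProcMissCounter and its methods are hypothetical names and do not exist in Profiler.C; where the call goes depends on the protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration; not part of the GEMS Profiler.
class PerProcMissCounter {
public:
    PerProcMissCounter(std::size_t numProcs, std::uint64_t threshold)
        : m_misses(numProcs, 0), m_threshold(threshold) {}

    // Record one L1 miss for processor `proc`.
    // Returns true once that processor has reached the threshold.
    bool recordMiss(std::size_t proc) {
        assert(proc < m_misses.size());
        ++m_misses[proc];
        return m_misses[proc] >= m_threshold;
    }

    // Number of misses recorded so far for processor `proc`.
    std::uint64_t misses(std::size_t proc) const {
        assert(proc < m_misses.size());
        return m_misses[proc];
    }

    // True when every processor has reached the threshold.
    bool allReached() const {
        for (std::uint64_t m : m_misses) {
            if (m < m_threshold) return false;
        }
        return true;
    }

private:
    std::vector<std::uint64_t> m_misses;
    std::uint64_t m_threshold;
};
```

The idea would be to call recordMiss() with the processor id from wherever the protocol reports an L1 miss, and to stop the simulation when it (or allReached(), depending on the stop condition wanted) returns true.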
Thank you for your attention.
Marco
Regards,
Dan
_______________________________________________
Gems-users mailing list
Gems-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/gems-users
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.