Thanks Dan!
Thanks Min!
~Clay
Dan Gibson wrote:
Those results do seem unusual. The 1MB vs 2MB data seems to make sense,
though the 512KB L2 result is quite strange. What are the working-set
sizes for your applications? Very large or very small working sets can
be less sensitive to cache size. It could be that the working sets
overwhelm the L2 in all three of the configurations below. Try
using a very large (~2x working set size) L2 cache.
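
To illustrate why a working set that overwhelms the cache can hide the
effect of cache size, here is a minimal toy sketch. It assumes a
fully-associative LRU cache and a purely cyclic sweep over the working
set, which is not what a real set-associative L2 or a real workload
looks like. The function name and the block counts are hypothetical
and chosen only for illustration.

```python
from collections import OrderedDict

def lru_miss_rate(cache_blocks, working_set_blocks, passes=4):
    """Miss rate of a fully-associative LRU cache under a cyclic sweep.

    Toy model only: sizes are in abstract blocks, not bytes.
    """
    cache = OrderedDict()  # keys are block ids, ordered oldest to newest
    misses = 0
    accesses = 0
    for _ in range(passes):
        for block in range(working_set_blocks):
            accesses += 1
            if block in cache:
                cache.move_to_end(block)       # mark as most recently used
            else:
                misses += 1
                cache[block] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return misses / accesses

if __name__ == "__main__":
    working_set = 4096  # hypothetical working set, in blocks
    for size in (512, 1024, 2048, 2 * working_set):
        print(size, lru_miss_rate(size, working_set))
```

In this model the three smaller caches all miss on every access, so
they look identical, while the cache at 2x the working set takes only
the cold misses of the first pass.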
As for the instruction count, consider this:
In multithreaded applications, there is some interthread interaction,
through data sharing, cooperative caching, and synchronization. Changing
the caches changes these interactions. Suppose a processor is
spinning, waiting for a lock to be released. The length of the spin (and
the number of instructions executed as a result of the spin) is
influenced by the _other_ processors' cache performance (especially
that of the thread holding the lock!).
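
The spin-lock effect can be sketched with a small back-of-the-envelope
model. Everything here is an assumption made for illustration: the
latencies, the access count, the cost of one spin iteration, and the
function names are hypothetical, not measured values from any
simulator.

```python
def holder_cycles(accesses, miss_rate, hit_latency=10, miss_latency=200):
    """Cycles the lock holder spends in its critical section (toy model)."""
    misses = round(accesses * miss_rate)
    hits = accesses - misses
    return hits * hit_latency + misses * miss_latency

def spin_instructions(wait_cycles, instrs_per_iter=3, cycles_per_iter=3):
    """Instructions a waiting processor executes while spinning."""
    return (wait_cycles // cycles_per_iter) * instrs_per_iter

if __name__ == "__main__":
    # Same program and same lock; only the holder's L2 miss rate changes.
    for miss_rate in (0.05, 0.20):
        wait = holder_cycles(1000, miss_rate)
        print(miss_rate, wait, spin_instructions(wait))
```

With these made-up numbers, raising the holder's miss rate from 5% to
20% makes the waiting processor execute more than twice as many spin
instructions, even though the waiter's own code did not change.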
I was initially confused by this behavior as well. It is subtle.
Regards,
Dan