Since you're running 32 cores, I would expect that ocean fits entirely
in the L1D caches, unless you've made ocean's working set large or the
caches small. If ocean resides in L1s, then the L2 performance isn't
very critical.
Ocean's working set (assuming N = matrix dimension):
  (N+2) * (N+2) * sizeof(element) * 2 copies
For 66x66 grids and double-sized elements, this comes out to ~69kB. Divided 32
ways, that's ~2kB per core.
For 258x258 grids and double-sized elements, the total size is about 1MB.
Divided 32 ways, ~32kB per core.
The default L1D size in Ruby is 64kB.
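The arithmetic above can be sketched in a few lines (the helper name, and the assumption of exactly two grid copies of 8-byte doubles, follow the formula quoted above rather than anything in the Ocean source):

```python
def ocean_working_set(n, elem_bytes=8, copies=2):
    """Working set in bytes for an (N+2) x (N+2) grid of doubles,
    kept in two copies (e.g. current and previous iteration)."""
    return (n + 2) ** 2 * elem_bytes * copies

# 66x66 grid (N = 64): 69,696 bytes total, ~2 kB per core across 32 cores
print(ocean_working_set(64), ocean_working_set(64) // 32)

# 258x258 grid (N = 256): ~1 MB total, ~32 kB per core
print(ocean_working_set(256), ocean_working_set(256) // 32)
```

With a 64kB L1D, the per-core share fits comfortably in either case.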
Regards,
Dan
Filipe wrote:
Hi!
I have a question about banks with the MSI_MOSI_CMP_directory and the
MOESI_CMP_directory...
I ran some tests using 4, 8, and 16 banks with both protocols, with an L2
cache of 16MB and of 8MB. The CMP has 32 cores accessing the L2.
For SPLASH2-OCEAN, I got similar results for each protocol. These are the
execution times in cycles:
MSI:
110681587 - 16MB - 16 banks
110007560 - 16MB - 8 banks
110882613 - 16MB - 4 banks
112847680 - 8MB - 16 banks
115278293 - 8MB - 8 banks
114196480 - 8MB - 4 banks
MOESI:
91810800 - 16MB - 16 banks
90638080 - 16MB - 8 banks
91811853 - 16MB - 4 banks
91366667 - 8MB - 16 banks
90484880 - 8MB - 8 banks
92219093 - 8MB - 4 banks
The differences are less than 5% within each protocol... is that correct?
Why are they so small?
Is the number of banks too small for the number of cores?
I would appreciate any kind of help to understand this.
Thank you very much!
--
Filipe Montefusco Scoton
--
http://www.cs.wisc.edu/~gibson [esc]:wq!