Re: [Gems-users] The distribution of "total_misses" in ruby stats output


Date: Fri, 24 Aug 2007 14:52:09 -0500
From: Dan Gibson <degibson@xxxxxxxx>
Subject: Re: [Gems-users] The distribution of "total_misses" in ruby stats output
Which version of Simics are you using, and what is your target machine?

My initial thought is that there is more than one phys_mem* object, and Ruby is only attaching itself to phys_mem0 (in Simics 2.x with the Sarek target, there is _ONLY_ phys_mem0). With some other versions of Simics and/or different target machines, there are sometimes phys_mem objects for EACH cpu. Verify (via the Simics command line) that you have only one phys_mem* object, and that its name is phys_mem0.

Regards,
Dan

Lide Duan wrote:
I found something strange when looking at the ruby stats output files. I am simulating some 16p checkpoints. If I place all 16 processors on a single chip (g_PROCS_PER_CHIP 16), the results related to cache misses are shown as follows:

Total_misses: 2537939
total_misses: 2537939 [ 2537939 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
user_misses: 1763423 [ 1763423 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
supervisor_misses: 774516 [ 774516 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]

I suppose the 16 numbers in the brackets correspond to the misses on each processor, but as we can see, misses were observed only on the 1st processor. On the other hand, I also got the following:

instruction_executed: 1322569080 [ 82094180 83437607 81021513 81093394 82632809 122205588 82083483 80671650 81138678 80191395 79408095 79763896 81353502 81528474 62244475 81700341 ]
cycles_per_instruction: 4.63392 [ 4.66589 4.59077 4.72767 4.72348 4.63548 3.13441 4.6665 4.74817 4.72084 4.77661 4.82372 4.80221 4.70837 4.69827 6.15384 4.68839 ]
misses_per_thousand_instructions: 1.91895 [ 30.915 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]

So all the processors were definitely running something, but why were misses observed only on the 1st processor?
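
For anyone who wants to check the same thing on their own stats file, here is a minimal Python sketch that parses lines of the form "name: total [ v0 v1 ... ]" and recomputes the misses-per-thousand-instructions figures from the numbers quoted above. The regular expression and the helper name parse_stat are assumptions made for illustration; they are not part of GEMS or Ruby.

```python
import re

# Lines copied from the stats output quoted in this message.
TOTAL_MISSES = "total_misses: 2537939 [ 2537939 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]"
INSTRUCTIONS = (
    "instruction_executed: 1322569080 [ 82094180 83437607 81021513 81093394 "
    "82632809 122205588 82083483 80671650 81138678 80191395 79408095 79763896 "
    "81353502 81528474 62244475 81700341 ]"
)

# Hypothetical pattern for "name: total [ v0 v1 ... ]" lines (an assumption,
# based only on the lines shown in this thread).
STAT_RE = re.compile(r"^(\w+):\s+(\S+)\s+\[\s*(.*?)\s*\]$")


def parse_stat(line):
    """Return (name, total, list of bracketed values) for one stats line."""
    match = STAT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a distribution line: {line!r}")
    name, total, entries = match.groups()
    return name, float(total), [float(v) for v in entries.split()]


_, miss_total, miss_slots = parse_stat(TOTAL_MISSES)
_, instr_total, instr_slots = parse_stat(INSTRUCTIONS)

# Which bracket positions recorded any misses at all?
nonzero_miss_slots = [i for i, v in enumerate(miss_slots) if v != 0]

# Misses per thousand instructions, overall and for bracket position 0.
mpki_total = 1000.0 * miss_total / instr_total
mpki_slot0 = 1000.0 * miss_slots[0] / instr_slots[0]

print("slots with misses:", nonzero_miss_slots)
print("overall misses per 1000 instructions: %.5f" % mpki_total)
print("slot 0 misses per 1000 instructions: %.3f" % mpki_slot0)
```

The recomputed values should agree with the reported 1.91895 and 30.915. That suggests all misses are being credited to position 0 while instructions are still counted per processor.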

To investigate, I tried different configurations. If I place the processors on 4 chips, each containing 4 processors, the first 4 numbers in the total_misses brackets are nonzero. Also, with one processor per chip (16 chips in total), all 16 numbers are nonzero. Therefore, I guess the numbers indicate the misses on each CHIP, not on each processor. Am I right, or did I miss something here? I tried different workloads with different network topologies, but got similar results. Can anybody give me an explanation?
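
The three observations above can be tabulated against the per-chip guess with a short Python sketch. The function name expected_nonzero_slots is hypothetical, and the per-chip reading is only the hypothesis from this message, not confirmed behavior of Ruby.

```python
NUM_PROCS = 16  # the 16p checkpoints described above


def expected_nonzero_slots(num_procs, procs_per_chip):
    """Under the per-chip hypothesis, one bracket position per chip is nonzero."""
    if procs_per_chip <= 0 or num_procs % procs_per_chip != 0:
        raise ValueError("procs_per_chip must evenly divide num_procs")
    return num_procs // procs_per_chip


# Observations reported in this message: g_PROCS_PER_CHIP -> nonzero positions.
observed = {16: 1, 4: 4, 1: 16}

for procs_per_chip, seen in observed.items():
    predicted = expected_nonzero_slots(NUM_PROCS, procs_per_chip)
    status = "matches" if predicted == seen else "differs"
    print(f"g_PROCS_PER_CHIP={procs_per_chip}: predicted {predicted}, observed {seen} ({status})")
```

All three configurations are consistent with the guess. That does not rule out other explanations that also happen to scale with the number of chips.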

Thanks,
Lide
------------------------------------------------------------------------



--
http://www.cs.wisc.edu/~gibson [esc]:wq!
