Re: [Gems-users] Barrnes CPI question


Date: Wed, 16 Apr 2008 23:37:16 +0800
From: "KS Chow" <g0706256@xxxxxxxxxx>
Subject: Re: [Gems-users] Barrnes CPI question
The entire thing - but the results are really unexpected.

Thanks/Regards.

Chow Kian Sim
National University of Singapore

-----Original Message-----
From: gems-users-bounces@xxxxxxxxxxx [mailto:gems-users-bounces@xxxxxxxxxxx]
On Behalf Of Dan Gibson
Sent: Wednesday, 16 April, 2008 11:35 PM
To: Gems Users
Subject: Re: [Gems-users] Barrnes CPI question

Are you simulating the entire execution or just the parallel section? 
The entire execution includes thread creation and destruction, which 
could account for the slowdown.

Regards,
Dan
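
A minimal sketch of the comparison discussed in this thread, using only the ruby_cycles, instruction counts, and CPI values quoted in the messages below. The dictionary layout and the speedup() helper are illustrative assumptions, not part of GEMS or Ruby. It shows why CPI can appear to improve while the run actually gets slower: the 2-processor run executes far more instructions (plausibly spinning), which lowers CPI without reducing cycles.

```python
# Figures quoted in this thread (Ruby dump output for Barnes).
# The structure of this dict is hypothetical; only the numbers come from the thread.
runs = {
    1: {"ruby_cycles": 36_420_634, "instructions": 10_974_780, "cpi": 3.27789},
    2: {"ruby_cycles": 64_956_443, "instructions": 194_051_043, "cpi": 0.621735},
}


def speedup(base_cycles: int, new_cycles: int) -> float:
    """Wall-clock speedup; a value below 1.0 means the new run is slower."""
    return base_cycles / new_cycles


# Speedup judged by simulated wall-clock time (ruby_cycles).
cycle_speedup = speedup(runs[1]["ruby_cycles"], runs[2]["ruby_cycles"])

# How many times more instructions the 2-processor run executed.
instr_growth = runs[2]["instructions"] / runs[1]["instructions"]

# How much "better" the 2-processor run looks if judged by CPI alone.
cpi_ratio = runs[1]["cpi"] / runs[2]["cpi"]

print(f"speedup from ruby_cycles: {cycle_speedup:.2f}x")
print(f"instruction count growth: {instr_growth:.1f}x")
print(f"apparent CPI improvement: {cpi_ratio:.1f}x")
```

Judged by ruby_cycles the 2-processor run is slower, even though its CPI looks several times better; the instruction count growth is what makes the CPI figure misleading here.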

KS Chow wrote:
>
> I looked at ruby_cycles:
>
> 1P: 36,420,634
>
> 2P: 64,956,443
>
> Shouldn't it be going down? Or is there something wrong with my 
> simulation?
>
> Thanks/Regards.
>
> Chow Kian Sim
>
> National University of Singapore
>
> *From:* gems-users-bounces@xxxxxxxxxxx 
> [mailto:gems-users-bounces@xxxxxxxxxxx] *On Behalf Of *Dan Gibson
> *Sent:* Wednesday, 16 April, 2008 10:56 PM
> *To:* Gems Users
> *Subject:* Re: [Gems-users] Barrnes CPI question
>
> KS Chow wrote:
>
> I have a question regarding the CPI I got from a ruby dump file.
>
> I am comparing the CPI between 2 simulations: a single core machine 
> and a dual-core - all the necessary configurations done in Simics and 
> Ruby (not using Opal).
>
> For the 1-core I ran Barnes with the param indicating 1 processor; the 
> dual-core is run with the param indicating 2 processors.
>
> 1-core CPI = 3.27789
>
> 2-core CPI = 0.621735
>
> The CPI difference could be due to spinning... see below.
>
> Is it normal to have such a huge drop in CPI going from 1 to 2 cores? 
> Both have a 16 MB L2 cache and an equal amount of L1.
>
> I also noticed that in the 1-core run 10,974,780 instructions were 
> executed vs 194,051,043 in the 2-core run - why such a big difference 
> in the number of instructions executed?
>
> Is this caused by the number-of-processors param in Barnes?
>
> SPLASH-2 synchronization is hot-spin intensive. Both CPUs are probably 
> spending a lot of time spinning waiting for the other to set a flag or 
> hit a barrier. Generally we prefer wall clock time (i.e. RUBY_CYCLES) 
> rather than CPI as a performance metric for multiprocessor runs.
>
> Thanks/Regards.
>
> Chow Kian Sim
>
> National University of Singapore
>
> Regards,
> Dan
>
>  
> ------------------------------------------------------------------------
>
>
>   
>  
> _______________________________________________
> Gems-users mailing list
> Gems-users@xxxxxxxxxxx <mailto:Gems-users@xxxxxxxxxxx>
> https://lists.cs.wisc.edu/mailman/listinfo/gems-users
> Use Google to search the GEMS Users mailing list by adding
> "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
>  
>   
>
>
>
> -- 
> http://www.cs.wisc.edu/~gibson <http://www.cs.wisc.edu/%7Egibson>
[esc]:wq!

-- 
http://www.cs.wisc.edu/~gibson [esc]:wq!

