Re: [Gems-users] Different execution instruction numbers between simics and opal


Date: Fri, 28 May 2010 05:56:24 -0600
From: Dan Gibson <degibson@xxxxxxxx>
Subject: Re: [Gems-users] Different execution instruction numbers between simics and opal
1. To understand why you will see different instruction counts with different timings (e.g., different random seeds or different processor models), read Alameldeen and Wood, "Addressing Workload Variability in Architectural Simulations". PDF: http://www.cs.wisc.edu/multifacet/papers/ieeemicro03_variability.pdf

2. I don't think FFT binds itself at all by default, so disabling P2 and P3 might disable FFT threads.

3. Do not simply disable CPUs once the system is booted. The OS believes the disabled CPUs are still running and will still target them with inter-processor interrupts (IPIs), e.g., for TLB shootdowns. Because a disabled CPU never responds to an IPI, disabling even one CPU will fairly rapidly hang the system.

4. Not advancing a CPU in Opal is essentially the same as disabling it, so point #3 applies again.
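To make points #3 and #4 concrete, here is a minimal toy model of an IPI broadcast such as a TLB shootdown. It is only a sketch under stated assumptions: ToyCpu and shootdown are hypothetical names invented for illustration, not Simics, Opal, or OS interfaces, and a real acknowledgement protocol is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: hypothetical names, not Simics, Opal, or OS APIs.
struct ToyCpu {
    bool advanced;     // does the simulator still step this CPU?
    bool ipi_pending;  // IPI delivered but not yet acknowledged
};

// The initiator sends an IPI to every other CPU and spins until all of them
// acknowledge. Returns the number of cycles waited, or -1 if the cycle
// budget runs out first (the toy equivalent of a hang).
int shootdown(std::vector<ToyCpu>& cpus, std::size_t initiator, int max_cycles) {
    assert(initiator < cpus.size());

    // The OS still believes every CPU is online, so it targets all of them.
    for (std::size_t i = 0; i < cpus.size(); ++i) {
        if (i != initiator) cpus[i].ipi_pending = true;
    }

    for (int cycle = 1; cycle <= max_cycles; ++cycle) {
        bool all_acked = true;
        for (std::size_t i = 0; i < cpus.size(); ++i) {
            if (i == initiator) continue;
            // Only a CPU that is actually stepped can run its handler and ack.
            if (cpus[i].advanced) cpus[i].ipi_pending = false;
            if (cpus[i].ipi_pending) all_acked = false;
        }
        if (all_acked) return cycle;
    }
    return -1;  // a disabled or never-advanced CPU never acknowledged
}
```

In this model, with all four CPUs advanced, shootdown(cpus, 0, 1000) returns 1. With cpu2 and cpu3 frozen it returns -1 no matter how large the cycle budget is, which is the hang described above.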

Regards,
Dan

On Fri, May 28, 2010 at 4:24 AM, shanshuchang <shanshuchang@xxxxxxxxx> wrote:

Hi all,

I want to run the splash2/fft benchmark on a subset of the cores I have configured.

So I bind the two FFT threads to the first two cores of the 4-core CMP system like this:

./FFT -p 2 -a 0 -x 1

This initializes two threads and binds them to core0 and core1.

In Simics, I disabled the other two cores using:

cpu2.disable and cpu3.disable

I use magic instructions to guarantee that the Simics execution is limited to the parallel phase.

With Simics + Ruby, the execution finishes quickly and the Ruby statistics look like this:

instruction_executed: 5073779 [ 4616570 457207 1 1 ]

I also used Opal + Ruby to simulate a similar execution. I modified the Opal code like this (in opal/system/system.C):

for (int j = 0; j < m_numSMTProcs/2; j++ ) {
      ……
      m_seq[j]->advanceCycle();
      ……
}

As a result, cpu2 and cpu3 are never advanced.

However, when I loaded the same checkpoint file and ran for 10000000 cycles, the execution did not finish.

So, for the same benchmark, does Opal + Ruby execute a different number of instructions than Simics + Ruby?

Thanks in advance!

shuchang


--
http://www.cs.wisc.edu/~gibson [esc]:wq!