Sean,
This sounds strange to me. Can you provide more information, such as how you
merge and sort your traces? What does a long string look like?
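
For comparison, a minimal sketch of a time-ordered merge might look like the following. It assumes each per-sequencer file holds one access per line with four whitespace-separated fields (cycle, instruction address, store flag, accessed address); that layout and all function names are hypothetical, not Opal's actual output or API. It also assumes the m_local_cycles values are comparable across sequencers. If the records were instead ordered by processor first, each processor's accesses would form one long run regardless of how the threads really interleave.

```python
import heapq
from itertools import groupby

def parse_trace(lines, proc_id):
    """Yield (cycle, proc_id, pc, is_store, addr) from one sequencer's trace.

    The four-field line format is an assumption made for this sketch.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        cycle, pc, is_store, addr = fields
        yield (int(cycle), proc_id, int(pc, 16), is_store == "1", int(addr, 16))

def merge_by_cycle(traces):
    """Merge per-sequencer traces (each already in cycle order) by cycle."""
    streams = [parse_trace(lines, proc_id) for proc_id, lines in enumerate(traces)]
    # Tuples compare on cycle first, then proc_id, so the output is time-ordered.
    return heapq.merge(*streams)

def run_lengths(merged):
    """Collapse adjacent accesses by the same processor into (proc_id, length)."""
    return [(proc_id, sum(1 for _ in group))
            for proc_id, group in groupby(merged, key=lambda rec: rec[1])]

# Made-up example data for two sequencers.
trace0 = ["10 0x1000 0 0x8000", "12 0x1004 1 0x8008", "30 0x1008 0 0x8010"]
trace1 = ["11 0x2000 0 0x9000", "31 0x2004 0 0x9008"]
runs = run_lengths(merge_by_cycle([trace0, trace1]))
```

With a merge like this, runs of 100,000 accesses from a single processor would mean the other sequencers logged nothing during those cycles, which is the part worth checking.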
Thanks!
-Min
On Thu, 07 Jul 2005 Sean Ryan Leventhal wrote:
> I have modified opal to print out traces of all memory
> instructions. I call a function of the sequencer within the execute stage
> of both memop objects. This function prints out the following:
>
> m_local_cycles (which I am currently treating as the time)
> the address of the instruction
> whether it is a store
> and the address being accessed.
>
> Each sequencer has its own file.
>
> When I merge these files and sort them based on processor/sequencer, I
> observe that there are long strings in which only one processor accesses
> the cache. For instance, I start fmm -p4 (fast multipole method on four
> processors from splash2), and do
> c 1500000
> to try to jump past some of the OS stuff. I then load ruby and opal and
> initialize them and run
>
> opal0.sim-step 5000000
>
> This produces several very large traces. But sorting them and grouping
> all adjacent memory accesses of the same processor as a single "string"
> yields only 32 "strings", with an average length of 103,827 memory
> accesses. In other words, it appears that two threads are never executing
> at the same time. I get similar behavior from fft. Does anyone have any
> idea what I am doing wrong?
>
> - Sean
>
> _______________________________________________
> Gems-users mailing list
> Gems-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/gems-users