Date: | Thu, 7 Jul 2005 12:24:12 -0400 (EDT) |
---|---|
From: | Sean Ryan Leventhal <sleventh@xxxxxxxxxxxx> |
Subject: | [Gems-users] Getting Multiprocessor benchmarks running on multiple processors at once |
I have modified opal to print out traces of all memory
instructions. I call a function of the sequencer within the execute stage
of both memop objects. This function prints out the following: m_local_cycles (which I am currently treating as the time), the address of the instruction, whether it is a store, and the address being accessed. Each sequencer has its own file. When I merge these files and sort them based on processor/sequencer, I observe that there are long strings in which only one processor accesses the cache.
For instance, I start fmm -p4 (fast multipole method on four processors, from splash2) and do c 1500000 to try to jump past some of the OS stuff. I then load ruby and opal, initialize them, and run opal0.sim-step 5000000. This produces several very large traces. But sorting them and grouping all adjacent memory accesses of the same processor as a single "string" yields only 32 "strings", with an average length of 103,827 memory accesses. In other words, it appears that two threads are never executing at the same time. I get similar behavior from fft.
Does anyone have any idea what I am doing wrong?
- Sean |
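
For anyone who wants to reproduce the "string" count, the grouping described above can be sketched roughly as follows. This is only an illustration, not the actual script used: the record layout `(cycle, pc, is_store, addr)`, the function names `merge_traces` and `runs_by_cpu`, and the sample data are all assumptions. It also assumes that the m_local_cycles values are comparable across sequencers, which the post itself treats as a working assumption.

```python
import heapq
from itertools import groupby

def merge_traces(traces):
    """Merge per-processor traces into one list ordered by cycle.

    traces: dict mapping cpu id -> list of (cycle, pc, is_store, addr),
    each list already sorted by cycle (hypothetical record layout).
    """
    tagged = (
        [(cycle, cpu, pc, is_store, addr) for (cycle, pc, is_store, addr) in records]
        for cpu, records in traces.items()
    )
    # heapq.merge performs a streaming merge of already-sorted inputs.
    return list(heapq.merge(*tagged))

def runs_by_cpu(merged):
    """Collapse adjacent accesses from the same cpu into (cpu, length) runs."""
    return [(cpu, sum(1 for _ in grp))
            for cpu, grp in groupby(merged, key=lambda rec: rec[1])]

# Made-up sample data: two processors, six accesses in total.
traces = {
    0: [(1, 0x100, False, 0x2000), (2, 0x104, True, 0x2004), (7, 0x108, False, 0x2008)],
    1: [(3, 0x200, False, 0x3000), (4, 0x204, False, 0x3004), (5, 0x208, True, 0x3008)],
}
runs = runs_by_cpu(merge_traces(traces))
average_length = sum(length for _, length in runs) / len(runs)
print(runs)            # [(0, 2), (1, 3), (0, 1)]
print(average_length)  # 2.0
```

A small number of very long runs, as reported above, means the merged timeline almost never alternates between processors.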