On 1/29/11 8:52 PM, "Byn Choi" <bynchoi1@xxxxxxxxxxxx> wrote:
>
>On Jan 29, 2011, at 6:05 PM, junli gu wrote:
>
>> Hey all:
>>
>> I am simulating a 16-core CMP using Simics+Ruby. First, I know that
>> the latency values are all in ruby cycles, where 1 ruby cycle equals
>> 2 CPU cycles. I am using the default values, as follows:
>
>That depends on the SIMICS_RUBY_MULTIPLIER parameter, which is 4 by
>default. This means Simics is advanced 4 times for every ruby cycle.
>I personally use 1, since the processor modeled by Simics is a very
>simple single-issue, in-order, 5-stage processor.
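The conversion described above can be sketched as a one-line scaling. This is only an illustration: the function name is hypothetical, and it assumes SIMICS_RUBY_MULTIPLIER acts as a plain scale factor between ruby cycles and Simics (CPU) cycles, as stated in the reply.

```python
def ruby_to_cpu_cycles(ruby_cycles: int, multiplier: int = 4) -> int:
    """Convert a latency in ruby cycles to Simics (CPU) cycles.

    `multiplier` stands in for SIMICS_RUBY_MULTIPLIER (default 4 per the
    reply above). Hypothetical helper, not part of GEMS.
    """
    if ruby_cycles < 0 or multiplier < 1:
        raise ValueError("ruby_cycles must be >= 0 and multiplier >= 1")
    return ruby_cycles * multiplier


if __name__ == "__main__":
    memory_latency = 35  # MEMORY_LATENCY from the listing, in ruby cycles
    for multiplier in (1, 2, 4):
        cpu = ruby_to_cpu_cycles(memory_latency, multiplier)
        print(f"multiplier={multiplier}: {memory_latency} ruby -> {cpu} CPU cycles")
```

With a multiplier of 2 this reproduces the 70-cycle figure assumed in the question; with the default of 4 the same 35 ruby cycles correspond to 140 CPU cycles.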
>
>>
>> NULL_LATENCY: 1                     ; Shortest possible latency
>> ISSUE_LATENCY: 2                    ; Latency to send out a request to the interconnect
>> CACHE_LATENCY: 1                    ; Latency to source data from a cache to the interconnect
>> MEMORY_LATENCY: 35                  ; Latency to source data from a memory module to the interconnect
>> DIRECTORY_LATENCY: 1                ; Latency of directory lookup
>> NETWORK_LINK_LATENCY: 1             ; Latency for a single node-to-node hop in the interconnect
>> SEQUENCER_TO_CONTROLLER_LATENCY: 8  ; Latency added by sequencer to requests to cache controller
>> TRANSITIONS_PER_RUBY_CYCLE: 32      ; Maximum transitions per cycle for all SLICC state machines
>> SEQUENCER_OUTSTANDING_REQUESTS: 20  ; Number of outstanding requests per sequencer
>>
>> My questions are:
>>
>> A) I am positive about the L2 cache latency and memory latency. They
>> are supposed to be 10 and 35 ruby cycles, which means 20 and 70 CPU
>> cycles. Am I right?
>
>This depends on how far the L2 bank is located with respect to the
>requestor. The latency will vary depending on the number of hops and
>the number of routers that the request has to go through.
>
>>
>> B) Are these numbers realistic? I mean, do they match the ones in
>> real products?
>
>The following paper has detailed latency numbers from the Intel
>Nehalem and AMD Shanghai chips.
>
>Comparing Cache Architectures and Coherency Protocols on x86-64
>Multicore SMP Systems (MICRO'09)
>
>>
>> C) For large CMPs with 16 or even 32 cores, how should these
>> numbers change? I guess when we have more cores the interconnection
>> latency and memory latency will also increase? Also, I am not sure
>> whether NETWORK_LINK_LATENCY: 1 is too small.
>
>The per-hop interconnection latency and memory latency (memory lookup
>time) should remain unchanged here. Again, as mentioned in A), the
>overall (average) latency would increase due to the increased
>interconnection diameter.
Also, you should note that NETWORK_LINK_LATENCY is not used if you are
using FILE_SPECIFIED as your topology. In that case you need to check the
parameters inside the network configuration text file under
$GEMS/ruby/simple/Network_files.
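
To make the point from A) and C) concrete, here is a rough sketch of how the average hop count, and with it the network part of the latency, grows with core count while the per-hop latency stays fixed. Everything here is an assumption for illustration: a 2D mesh with one node per core, minimal (Manhattan-distance) routing, uniform random traffic, and no router or contention delay. The function names are hypothetical and none of this is taken from the GEMS sources.

```python
from itertools import product


def mesh_hops(src, dst):
    """Hop count between two (x, y) nodes under minimal routing (assumed)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def average_hops(cols, rows):
    """Average hop count over all ordered source/destination pairs."""
    nodes = list(product(range(cols), range(rows)))
    total = sum(mesh_hops(s, d) for s in nodes for d in nodes)
    return total / (len(nodes) ** 2)


def mesh_diameter(cols, rows):
    """Longest minimal path in the mesh (corner to opposite corner)."""
    return (cols - 1) + (rows - 1)


if __name__ == "__main__":
    link_latency = 1  # NETWORK_LINK_LATENCY from the listing, in ruby cycles
    for cols, rows in ((4, 4), (8, 4)):  # 16 and 32 nodes
        avg = average_hops(cols, rows)
        print(f"{cols}x{rows} mesh: diameter={mesh_diameter(cols, rows)} hops, "
              f"average={avg:.3f} hops, "
              f"average link latency={avg * link_latency:.3f} ruby cycles")
```

Under these assumptions the per-hop cost is the same for 16 and 32 nodes; only the hop count changes. With a FILE_SPECIFIED topology, the per-link latency would come from the network configuration file rather than from NETWORK_LINK_LATENCY.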
Cheers,
Abdullah