Thank you Greg,
That's very informative, and it tells me that GEMS can do what I want to do.
I already knew it was going to be very involved. I actually need to do a
fair bit more than I've mentioned here, so getting into the code was going
to be a given anyway.
Again thanks, I'm going to follow up on what you said.
Regards,
John
> What I meant by bandwidth-limiting behavior in the network: each network
> link is modeled as a physical channel. Only one packet can be transmitted
> across the link at a time (even though there are multiple virtual
> channels/networks using that physical channel). So all requests coming into
> a Directory (for example) will be serialized by that incoming link.
>
> Getting a timestamp for a memory request is simple. Call
> g_EventQueue_ptr->getTime() in Sequencer::makeRequest().
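A minimal sketch of what stamping a request in Sequencer::makeRequest() might look like. The EventQueue and MemoryRequest types below are simplified stand-ins for the Ruby classes (the real global queue is reached through g_EventQueue_ptr), not the actual GEMS signatures.

```cpp
#include <cassert>

// Simplified stand-in for Ruby's global event queue, which keeps the
// simulator's own cycle count and advances as events are processed.
class EventQueue {
public:
    explicit EventQueue(long long t = 0) : m_time(t) {}
    long long getTime() const { return m_time; }  // current Ruby cycle
    void tick(long long n = 1) { m_time += n; }
private:
    long long m_time;
};

// Hypothetical request record carrying the extra timestamp field the
// thread discusses adding to a memory request message.
struct MemoryRequest {
    unsigned long long address;
    long long issue_time;  // Ruby cycle at which the request was made
};

// Sketch of the relevant step in Sequencer::makeRequest(): stamp the
// request with the current Ruby time before injecting it.
MemoryRequest makeRequest(EventQueue& q, unsigned long long addr) {
    MemoryRequest req;
    req.address = addr;
    req.issue_time = q.getTime();  // g_EventQueue_ptr->getTime() in GEMS
    return req;
}
```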
>
> Not sure how this fixes your problem, but you can certainly add a
> timestamp to a memory request message. There are also timestamps
> associated with entries in a MessageBuffer, which tell the time at which
> the message entered the buffer.
>
> I'm not positive, but I think you'll find this is a little more involved
> than what you're proposing. The time at which a memory operation is
> initiated has little relation to when it arrives at a remote cache or
> directory. I'm not sure what you mean by "there's no timing relevant to
> synchronization within SLICC". Look at the generated code to see what the
> Wakeup method for a SLICC component actually does. (The timestamp of the
> message at the head of a MessageBuffer indicates whether it's "ready", and
> the message will not be removed from the queue until the global time is
> greater than or equal to that timestamp.)
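The gating described above can be sketched as follows. This is a stripped-down, illustrative stand-in for Ruby's MessageBuffer (not the real class or its generated Wakeup code): the head message's timestamp determines whether the buffer is "ready", and nothing is dequeued before the global time reaches that timestamp.

```cpp
#include <cassert>
#include <deque>

// Simplified message entry: a payload plus the cycle at which the
// message becomes visible (its "ready" timestamp).
struct Message {
    int payload;
    long long ready_time;
};

// Minimal buffer showing the readiness check: isReady() compares the
// head message's timestamp against the current global time, and the
// Wakeup logic only pops messages whose time has arrived.
class MessageBuffer {
public:
    void enqueue(int payload, long long ready_time) {
        m_queue.push_back({payload, ready_time});
    }
    bool isReady(long long now) const {
        return !m_queue.empty() && m_queue.front().ready_time <= now;
    }
    int dequeue(long long now) {
        assert(isReady(now));  // check readiness before popping
        int p = m_queue.front().payload;
        m_queue.pop_front();
        return p;
    }
private:
    std::deque<Message> m_queue;
};
```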
>
> Also, be aware that Ruby time and Simics time are not the same. Ruby keeps
> its own cycle count, accessible through the getTime() method mentioned
> above.
>
> ...Greg
>
>
> On Wed, Apr 27, 2011 at 1:21 PM, John Shield
> <john.shield@xxxxxxxxxxx> wrote:
>
>> Hi Greg,
>>
>> Thanks for confirming that what I'm seeing in the code is correct. Sorry,
>> I was stating the worst-case scenario for the behaviour and didn't specify
>> that. A fixed latency is added for each type of transaction between memory
>> components, so I assume that's what you mean by the bandwidth-limiting
>> behaviour.
>>
>> I think I can resolve this issue in SLICC, but it requires timestamps for
>> each memory access. There's no timing relevant to synchronisation within
>> SLICC, only a latency calculation for the memory access.
>>
>> My current plan for solving the issue is to check whether the interface
>> between SIMICS and Ruby contains timestamp information for when memory
>> accesses occur. With timestamp information it would be easy to calculate
>> the busy time for each port queue.
>>
>> Greg, can you confirm whether Ruby can obtain the timestamp for memory
>> accesses from the SIMICS interface? It would hearten me greatly if you
>> could tell me it's possible, or speed things up if it's not. I still
>> haven't digested all the documentation for the SIMICS model builder and
>> how Ruby interfaces to SIMICS.
>>
>> Regards,
>>
>> John Shield
>>
>>
>> On 04/27/2011 06:08 PM, Greg Byrd wrote:
>>
>> I agree with your description of the problem. I don't think it's quite as
>> severe as you describe, because there is bandwidth-limiting behavior in
>> the network. So you won't get a burst of requests arriving simultaneously
>> at the Directory (or wherever). But I agree that "busy time" of the
>> resource is not modeled, and multiple actions can overlap.
>>
>> The best way to fix this would be to change SLICC (or, more precisely,
>> the C++ code generated by SLICC) to include a notion of busy time for a
>> component. The components would not process any in_port actions until the
>> busy time has expired.
>>
>> You can probably model this without changing SLICC by adding a "delay"
>> function to MessageBuffer, to increase the timestamps of messages waiting
>> in the queue, so that they are not "seen" until the component is no
>> longer busy. (Be careful to avoid starvation on the incoming queues,
>> since they are checked in a fixed order by the Wakeup method.)
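The proposed "delay" extension might look like the sketch below. DelayableBuffer and its delay() method are hypothetical (the real MessageBuffer has no such function): the idea is to push every waiting message's ready timestamp past the component's busy period, so none is "seen" until the busy time expires.

```cpp
#include <cassert>
#include <deque>

// Queued message with the cycle at which it becomes visible to the
// controller's in_port logic.
struct Msg {
    int id;
    long long ready_time;
};

// Minimal buffer with the hypothetical delay() helper suggested above.
class DelayableBuffer {
public:
    void enqueue(int id, long long ready_time) {
        m_queue.push_back({id, ready_time});
    }
    // Proposed extension: make all queued messages invisible until
    // busy_until. Messages already timestamped later are untouched.
    void delay(long long busy_until) {
        for (Msg& m : m_queue)
            if (m.ready_time < busy_until)
                m.ready_time = busy_until;
    }
    bool isReady(long long now) const {
        return !m_queue.empty() && m_queue.front().ready_time <= now;
    }
    long long headReadyTime() const { return m_queue.front().ready_time; }
private:
    std::deque<Msg> m_queue;
};
```

A controller that just spent N cycles servicing a request would call delay(currentTime + N) on its incoming queues; as the thread warns, a fairness scheme is still needed so fixed-order polling of several delayed queues cannot starve one of them.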
>>
>> ...Greg
>>
>>
>>
>>
>> On Wed, Apr 27, 2011 at 11:58 AM, John Shield
>> <john.shield@xxxxxxxxxxx> wrote:
>>
>>> Dear all,
>>>
>>> I'm going to ask two things.
>>>
>>> Firstly, can anyone confirm that the SLICC cache coherency policies do
>>> not consider the wait time caused by other accesses sent to the
>>> Directory system (as modelled in the protocols)?
>>>
>>> When going through the protocols, it appears that cache messages do not
>>> compete for the resources of the Directory. There are queues in the
>>> SLICC description, but the wait time for earlier messages in the queue
>>> doesn't seem to make a difference to the latency of later messages. This
>>> would also be a problem for each individual cache, which can process an
>>> unlimited number of external requests in the time it takes to do one
>>> request.
>>>
>>> To make the problem clear: all the components have infinite parallel
>>> bandwidth, including the main memory.
>>>
>>> This behaviour means that the additional latency caused by multiple
>>> simultaneous messages is ignored. Ten messages arriving at the Directory
>>> are processed with the same latency as a single message, because the
>>> latency of ten messages competing for communication bandwidth doesn't
>>> occur.
>>>
>>> Secondly, if what I'm seeing is correct (a lack of bottleneck
>>> calculation), does anyone know a way this problem can be fixed? I think
>>> I could fix this problem myself: if I could access the relative timing
>>> of cache requests, I could add up the bottleneck latencies of the input
>>> queues as part of the coherency protocol.
>>>
>>> The communication bottleneck is the main problem that research needs to
>>> solve in the design of cache coherency architectures. Without some kind
>>> of bottleneck behaviour being modelled, the cache coherency results
>>> would be poor. The results would be for an infinite-bandwidth system,
>>> and would not model the performance losses of a badly designed cache
>>> coherency system.
>>>
>>>
>>> Background of my own work:
>>> I want to build some non-standard coherency policies to relieve
>>> communication bottleneck problems. However, to do this I need the
>>> simulation to model the bottleneck problems. Furthermore, I was planning
>>> on adding a "main memory" description to simulate the main memory
>>> bottleneck and to allow for writeback (currently not supported in the
>>> Ruby coherency protocols). Writeback is also necessary to fix the
>>> modelling problem of infinite cache size in the SLICC description.
>>>
>>>
>>> I would appreciate any assistance,
>>>
>>> John Shield
>>>
>>> _______________________________________________
>>> Gems-users mailing list
>>> Gems-users@xxxxxxxxxxx
>>> https://lists.cs.wisc.edu/mailman/listinfo/gems-users
>>> Use Google to search the GEMS Users mailing list by adding
>>> "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
>>>
>>>
>>
>>