Re: [Gems-users] Simulation of the Communication Bottleneck in Cache Coherency Memory Access


Date: Wed, 27 Apr 2011 12:08:52 -0400
From: Greg Byrd <gbyrd@xxxxxxxx>
Subject: Re: [Gems-users] Simulation of the Communication Bottleneck in Cache Coherency Memory Access
I agree with your description of the problem.  I don't think it's quite as severe as you describe, because there is bandwidth-limiting behavior in the network.  So you won't get a burst of requests arriving simultaneously at the Directory (or wherever).  But I agree that "busy time" of the resource is not modeled, and multiple actions can overlap.

The best way to fix this would be to change SLICC (or, more precisely, the C++ code generated by SLICC) to include a notion of busy time for a component.  The components would not process any in_port actions until the busy time has expired.
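As a rough illustration of that idea (the names below are hypothetical, not actual SLICC-generated code), the generated controller could keep a "busy until" cycle and skip its in_ports until that cycle is reached:

```cpp
#include <cstdint>

// Sketch of a per-component busy-time guard. Controller, wakeup, and
// charge_busy are illustrative names, not the real generated interface.
class Controller {
public:
    Controller() : m_busy_until(0) {}

    // Called by the event queue; returns false while the component is busy.
    bool wakeup(uint64_t current_cycle) {
        if (current_cycle < m_busy_until)
            return false;       // still busy: do not look at any in_port
        // ...check in_ports and fire transitions here...
        return true;
    }

    // A transition that occupies the component charges service time.
    void charge_busy(uint64_t current_cycle, uint64_t service_cycles) {
        m_busy_until = current_cycle + service_cycles;
    }

private:
    uint64_t m_busy_until;  // first cycle at which the component is free
};
```

With a guard like this, two requests that each need 5 cycles of service take 10 cycles back-to-back instead of overlapping for free.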

You can probably model this without changing SLICC by adding a "delay" function to MessageBuffer, to increase the timestamps of messages waiting in the queue, so that they are not "seen" until the component is not busy.  (Be careful to avoid starvation on the incoming queues, since they are checked in a fixed order by the Wakeup method.)
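A toy version of that delay() idea (illustrative only; the real Ruby MessageBuffer API differs) would push forward the ready timestamp of every queued message, so the consumer's in_port does not see them until the busy time has passed:

```cpp
#include <cstdint>
#include <vector>

// Sketch of a MessageBuffer with a delay() operation. Messages carry a
// ready_cycle; delay() postpones everything still waiting in the queue.
struct Message {
    uint64_t ready_cycle;   // earliest cycle the consumer may pop this
};

class MessageBuffer {
public:
    void enqueue(uint64_t ready_cycle) {
        m_queue.push_back(Message{ready_cycle});
    }

    // Postpone all queued messages, e.g. by the busy time of the
    // component that owns this buffer's in_port.
    void delay(uint64_t extra_cycles) {
        for (Message& m : m_queue)
            m.ready_cycle += extra_cycles;
    }

    bool is_ready(uint64_t current_cycle) const {
        return !m_queue.empty() &&
               m_queue.front().ready_cycle <= current_cycle;
    }

    void pop() { m_queue.erase(m_queue.begin()); }

private:
    std::vector<Message> m_queue;  // FIFO order, oldest at front
};
```

The starvation warning applies here: if Wakeup always checks buffers in the same fixed order, a buffer that keeps getting delayed may never be serviced, so the delays should be bounded or the check order rotated.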

...Greg
 



On Wed, Apr 27, 2011 at 11:58 AM, John Shield <john.shield@xxxxxxxxxxx> wrote:

Dear all,

I'm going to ask two things.

Firstly, can anyone confirm that the SLICC cache coherency policies do not account for the wait time caused by other accesses sent to the Directory (as modelled in the protocols)?

When going through the protocols, it appears that cache messages do not compete for the resources of the Directory. There are queues in the SLICC description, but the wait time of earlier messages in a queue does not seem to affect the latency of later messages. This would also be a problem for each individual cache, which can apparently process an unlimited number of external requests in the time it takes to do one request.

To make the problem clear: all the components have infinite parallel bandwidth, including the main memory.

This behaviour means that the additional latency caused by multiple messages arriving at the same time is ignored. Ten messages arriving at the Directory are processed with the same latency as a single message, because the latency of ten messages competing for communication bandwidth never occurs.

Secondly, if what I'm seeing is correct (a lack of bottleneck modelling), does anyone know how this problem can be fixed? I think I could fix it myself if I could access the relative timing of cache requests; I could then add up the bottleneck latencies of the input queues as part of the coherency protocol.
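The arithmetic of "adding up the bottleneck latencies" amounts to serializing overlapping requests at a shared resource. A back-of-the-envelope sketch (illustrative only, not GEMS code) of what contention should cost, given arrival cycles and a fixed per-request service time:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Given request arrival cycles at a single-ported resource (e.g. the
// Directory) and a fixed service time, compute when each request would
// actually complete if requests are serviced one at a time in order.
std::vector<uint64_t> completion_cycles(std::vector<uint64_t> arrivals,
                                        uint64_t service_time) {
    std::sort(arrivals.begin(), arrivals.end());
    std::vector<uint64_t> done;
    uint64_t free_at = 0;  // cycle at which the resource is next free
    for (uint64_t a : arrivals) {
        uint64_t start = std::max(a, free_at);  // wait if still busy
        free_at = start + service_time;
        done.push_back(free_at);
    }
    return done;
}
```

Under this model, ten simultaneous requests with a 10-cycle service time finish at cycles 10, 20, ..., 100, whereas the current infinite-bandwidth behaviour finishes all ten at cycle 10.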

The communication bottleneck is the main problem that research needs to solve in the design of cache coherency architectures. Without some kind of bottleneck behaviour being modelled, the cache coherency results would be unreliable: they would describe an infinite-bandwidth system and would not show the performance losses of a badly designed cache coherency scheme.


Background of my own work:
I want to build some non-standard coherency policies to relieve communication bottleneck problems. However, to do this I need the simulator to model those bottlenecks. Furthermore, I was planning on adding a "main memory" description to simulate the main memory bottleneck and to allow for writeback (currently not supported in the Ruby coherency protocols). Writeback is also necessary to fix the modelling problem of infinite cache size in the SLICC description.


I would appreciate any assistance,

John Shield

_______________________________________________
Gems-users mailing list
Gems-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/gems-users
Use Google to search the GEMS Users mailing list by adding "site:https://lists.cs.wisc.edu/archive/gems-users/" to your search.
