Re: [Gems-users] Change in ruby network code?


Date: Fri, 27 Jul 2007 12:35:51 -0500
From: "Lide Duan" <leaderduan@xxxxxxxxx>
Subject: Re: [Gems-users] Change in ruby network code?
Mike,

I think I have found the problem. In Topology::make2DTorus(), the number of torus switches is computed as:
int numberOfTorusSwitches = m_nodes/MachineType_base_level(MachineType_NUM);
which is effectively m_nodes/3. In my configuration, the simulated system has 16 processors, 32 L2 banks, and 16 memories, so m_nodes = 64 and numberOfTorusSwitches = 64/3 = 21 (integer division), which is not a perfect square. The first 16 switches form a 4x4 torus network. The assertion fails when the code that follows handles the last 5 switches, which are unused. After I explicitly set numberOfTorusSwitches to 16, the problem went away.
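
For readers who want to check the arithmetic, here is a minimal standalone sketch. It is not GEMS code: the helper names torusSwitchCount, isPerfectSquare, and usableTorusSide are hypothetical and only mirror the calculation described above, assuming the divisor is 3 as stated in the message.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of GEMS): mirrors the integer division
// described in the message, m_nodes / (number of machine types).
int torusSwitchCount(int nodes, int machineTypes) {
    assert(machineTypes > 0);
    return nodes / machineTypes;  // integer division truncates: 64 / 3 == 21
}

// Hypothetical helper: true if n is a perfect square (needed for an s x s torus).
bool isPerfectSquare(int n) {
    if (n < 0) return false;
    int root = static_cast<int>(std::lround(std::sqrt(static_cast<double>(n))));
    return root * root == n;
}

// Hypothetical helper: largest side s such that s * s <= switches,
// i.e. the biggest square torus that fits in the computed switch count.
int usableTorusSide(int switches) {
    int side = 0;
    while ((side + 1) * (side + 1) <= switches) {
        ++side;
    }
    return side;
}
```

With the numbers from this message, torusSwitchCount(64, 3) gives 21, isPerfectSquare(21) is false, and usableTorusSide(21) is 4, so a 4x4 torus uses 16 switches and leaves 5 unused.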

But I am still concerned about the protocol, as you mentioned. The protocol I am using is MOESI_CMP_directory, and the 16 processors are placed on 16 chips (one processor per chip). What I actually want is to observe the network behavior of a 16-processor CMP, but in order to use the auto-generated TORUS2D interconnect, I placed each processor on its own chip and reduced the latencies so that the 16 processors communicate as if they were on a single chip. Is this reasonable? Are there any restrictions on which protocols can be used with this setup?

Lide

On 7/27/07, Mike Marty <mikem@xxxxxxxxxxx> wrote:
Lide,

I'm not aware of any changes to the interconnect code between GEMS 1.3.1
and GEMS 1.4.  What protocol are you using with TORUS2D?

--Mike


Lide Duan wrote:
> Hello,
>
> Currently I am using two checkpoints with GEMS: one has 8 processors
> and the other has 16. I compiled Ruby with the TORUS2D topology and also
> modified the latencies. Everything works fine in GEMS 1.3.1 for both
> checkpoints. However, when I run the 16p checkpoint in GEMS 1.4 with
> TORUS2D, I get the following assertion failure (8p in GEMS 1.4 runs fine):
>
> failed assertion 'dest <= m_number_of_switches+m_nodes+m_nodes' at fn
> void Topology::addLink(SwitchID, SwitchID, int, int, int) in
> network/simple/Topology.C:686
> failed assertion 'dest <= m_number_of_switches+m_nodes+m_nodes' at fn
> void Topology::addLink(SwitchID, SwitchID, int, int, int) in
> network/simple/Topology.C:686
>
> I checked the source code: GEMS 1.3.1 also has these assertions, but it
> does not trigger them. That is what confuses me. Were there any changes to
> the network code between the two versions? Or what might the problem be?
>
> Thanks
> Lide
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Gems-users mailing list
> Gems-users@xxxxxxxxxxx
> https://lists.cs.wisc.edu/mailman/listinfo/gems-users
> Use Google to search the GEMS Users mailing list by adding "site: https://lists.cs.wisc.edu/archive/gems-users/" to your search.
>
>