Re: [Gems-users] Understand the protocol trace


Date: Tue, 26 Sep 2006 15:24:18 -0500 (CDT)
From: Mike Marty <mikem@xxxxxxxxxxx>
Subject: Re: [Gems-users] Understand the protocol trace
The cycle counts indicate when the transition _occurs_.  It starts and
finishes in the same cycle.  So yes, at cycle 5, the L1Cache has changed
its state.  The message arrives at the L2Cache at cycle 9.
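
To read per-hop delays out of a trace like the one quoted below, subtract the
cycle stamps of consecutive lines. Here is a minimal sketch, assuming the
column layout seen in this thread (cycle, two integer ids, component, event,
old>new state, address). parse_trace and gaps are hypothetical helper names
for illustration and are not part of GEMS.

```python
import re

# Column layout assumed from the trace lines in this thread:
# cycle, two integer ids, component, event, "old>new" states, address.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(-?\d+)\s+(-?\d+)\s+(\w+)\s+(\w+)\s+(\w*)>(\w*)\s+\[(0x[0-9a-fA-F]+),"
)

SAMPLE = """\
   1   0  -1        Seq               Begin       >       [0x400, line 0x400]
   5   0   0    L1Cache            Load     NP>L1_IS  [0x400, line 0x400]
   9   0   0    L2Cache            L1_GETS  L2_NP>L2_IS  [0x400, line 0x400]
  54   0   0  Directory             GETS     NP>S      [0x400, line 0x400]
 136   0   0    L2Cache      Data_ext_ack_0  L2_IS>L2_SS  [0x400, line 0x400]
 144   0  -1        Seq                Done       >       [0x400, line 0x400] 143 cycles NULL Yes
"""


def parse_trace(text):
    """Return (cycle, component, event, old_state, new_state) per trace line."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that is not a transition line
        cycle, _id_a, _id_b, component, event, old, new, _addr = m.groups()
        events.append((int(cycle), component, event, old, new))
    return events


def gaps(events):
    """Cycles elapsed between each transition and the one before it."""
    return [
        (cur[1], cur[2], cur[0] - prev[0])
        for prev, cur in zip(events, events[1:])
    ]


if __name__ == "__main__":
    evs = parse_trace(SAMPLE)
    for component, event, delta in gaps(evs):
        print(f"{component:>10} {event:<16} +{delta} cycles")
    print("total:", evs[-1][0] - evs[0][0], "cycles")
```

Since each transition starts and finishes in the cycle printed, the gap
between two lines is the time the message spent in flight or queued, and the
first-to-last difference (144 - 1 = 143) matches the "143 cycles" the
sequencer reports on its Done line.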

--Mike

> Another question on the protocol trace. In the trace, does the cycle count
> indicate when the state transition on that line starts or when it finishes?
> For example, in the following lines, has the L1Cache already changed its
> state to L1_IS at cycle 5 or not?
>
> Thanks!
>
> Lei
>
>    1   0  -1        Seq               Begin       >       [0x400, line 0x400]
>    5   0   0    L1Cache            Load     NP>L1_IS  [0x400, line 0x400]
>    9   0   0    L2Cache            L1_GETS  L2_NP>L2_IS  [0x400, line 0x400]
>   54   0   0  Directory             GETS     NP>S      [0x400, line 0x400]
>  136   0   0    L2Cache      Data_ext_ack_0  L2_IS>L2_SS  [0x400, line 0x400]
>  144   0  -1        Seq                Done       >       [0x400, line 0x400] 143 cycles NULL Yes
>
>
> ----- Original Message -----
> From: "Mike Marty" <mikem@xxxxxxxxxxx>
> To: "Lei Yang" <lya755@xxxxxxxxxxxxxxxxxxxx>
> Sent: Tuesday, September 26, 2006 2:43 PM
> Subject: Re: [Gems-users] Understand the protocol trace
>
>
> >
> > Your little.trace is probably specifying a chip/processor that doesn't
> > exist.
> >
> >> Thanks Mike. Any idea why it doesn't work for other numbers of processors?
> >>
> >> Lei
> >> ----- Original Message -----
> >> From: "Mike Marty" <mikem@xxxxxxxxxxx>
> >> To: "Lei Yang" <lya755@xxxxxxxxxxxxxxxxxxxx>
> >> Cc: "Gems Users" <gems-users@xxxxxxxxxxx>
> >> Sent: Tuesday, September 26, 2006 2:36 PM
> >> Subject: Re: [Gems-users] Understand the protocol trace
> >>
> >>
> >> > The second problem is because the number of L2 sets in
> >> > $GEMS/ruby/config/testconfig.defaults is set too low for the # of
> >> > processors.  Try increasing this to at least 1024 sets (10 bits)
> >> >
> >> > --Mike
> >> >
> >> >
> >> >> I changed the network topology to PT_TO_PT and the cycle counts dropped.
> >> >> However, I found a weird problem: when I set g_NUM_PROCESSORS to 16 and
> >> >> g_PROCS_PER_CHIP to 1, 2, 4, or 8, the tester works fine, but when I
> >> >> use other values it reports an assertion failure.
> >> >>
> >> >> For example, I tried the following combinations:
> >> >>
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 1 -a 1 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 4 -a 1 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 4 -a 2 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 4 -a 4 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 8 -a 1 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 8 -a 2 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 8 -a 4 -z little.trace
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 8 -a 8 -z little.trace
> >> >>
> >> >> They gave me this error:
> >> >>
> >> >> Testing clear stats...Done.
> >> >> Reading trace from file 'little.trace'...
> >> >> failed assertion 'index < m_size' at fn const TYPE& Vector<TYPE>::ref(int) const [with TYPE = AbstractChip*] in ../common/Vector.h:157
> >> >> failed assertion 'index < m_size' at fn const TYPE& Vector<TYPE>::ref(int) const [with TYPE = AbstractChip*] in ../common/Vector.h:157
> >> >> At this point you might want to attach a debug to the running and get to the crash site; otherwise press enter to continue
> >> >> PID: 3889
> >> >>
> >> >> I also tried this:
> >> >> x86-linux/generated/MSI_MOSI_CMP_directory/bin/tester.exec -p 16 -a 16 -z little.trace
> >> >>
> >> >> This gave me a seg fault:
> >> >>
> >> >> failed assertion 'L2_CACHE_NUM_SETS_BITS >= log_int(g_NUM_L2_BANKS_PER_CHIP)' at fn static void RubyConfig::init() in config/RubyConfig.C:123
> >> >> Segmentation fault
> >> >>
> >> >> Could somebody give me a hint on what is the problem? Thanks a lot!!
> >> >>
> >> >> Lei
> >> >> ----- Original Message -----
> >> >> From: "Lei Yang" <lya755@xxxxxxxxxxxxxxxxxxxx>
> >> >> To: "Mike Marty" <mikem@xxxxxxxxxxx>; "Gems Users"
> >> >> <gems-users@xxxxxxxxxxx>
> >> >> Sent: Monday, September 25, 2006 12:06 PM
> >> >> Subject: Re: [Gems-users] Understand the protocol trace
> >> >>
> >> >>
> >> >> > I just found in the documentation that PT_TO_PT and FILE_SPECIFIED are
> >> >> > the recommended network topologies for the CMP protocols.  Will give
> >> >> > that a try.
> >> >> >
> >> >> > Lei
> >> >> > ----- Original Message -----
> >> >> > From: "Mike Marty" <mikem@xxxxxxxxxxx>
> >> >> > To: "Lei Yang" <lya755@xxxxxxxxxxxxxxxxxxxx>; "Gems Users"
> >> >> > <gems-users@xxxxxxxxxxx>
> >> >> > Sent: Monday, September 25, 2006 11:59 AM
> >> >> > Subject: Re: [Gems-users] Understand the protocol trace
> >> >> >
> >> >> >
> >> >> >> 1)  Turn off RANDOMIZATION in $GEMS/ruby/config/testerconfig.defaults.
> >> >> >> This randomly adds 100+ cycle delays to generate race conditions.
> >> >> >>
> >> >> >> 2)  You are probably using a non-CMP topology and NETWORK_LINK_LATENCY
> >> >> >> is fairly high.
> >> >> >>
> >> >> >> --Mike
> >> >> >>
> >> >> >>
> >> >> >>> Dear list,
> >> >> >>>
> >> >> >>> I was experimenting with the MSI_MOSI_CMP_directory protocol using
> >> >> >>> the tester. With the little.trace from the GEMS online documentation
> >> >> >>> http://www.cs.wisc.edu/gems/doc/wiki/moin.cgi/How_do_I_understand_a_Protocol ,
> >> >> >>> below is the protocol trace I got:
> >> >> >>>
> >> >> >>> Testing clear stats...Done.
> >> >> >>> Reading trace from file 'little.trace'...
> >> >> >>>       1   7  -1        Seq               Begin       >         [0x400, line 0x400]
> >> >> >>>       4   1   3    L1Cache                Load     NP>L1_IS    [0x400, line 0x400]
> >> >> >>>     141   1   0    L2Cache             L1_GETS  L2_NP>L2_IS    [0x400, line 0x400]
> >> >> >>>     390   0   0  Directory                GETS     NP>S        [0x400, line 0x400]
> >> >> >>>     635   1   0    L2Cache      Data_ext_ack_0  L2_IS>L2_SS    [0x400, line 0x400]
> >> >> >>>    1097   7  -1        Seq                Done       >         [0x400, line 0x400] 1096 cycles NULL Yes
> >> >> >>>    1097   1   3    L1Cache             L1_Data  L1_IS>L1_S     [0x400, line 0x400]
> >> >> >>>    1101   1  -1        Seq               Begin       >         [0x400, line 0x400]
> >> >> >>>    1104   0   1    L1Cache                Load     NP>L1_IS    [0x400, line 0x400]
> >> >> >>>    1139   0   0    L2Cache             L1_GETS  L2_NP>L2_IS    [0x400, line 0x400]
> >> >> >>>    1176   0   0  Directory                GETS      S>S        [0x400, line 0x400]
> >> >> >>>    1309   0   0    L2Cache      Data_ext_ack_0  L2_IS>L2_SS    [0x400, line 0x400]
> >> >> >>>    1445   1  -1        Seq                Done       >         [0x400, line 0x400] 344 cycles NULL Yes
> >> >> >>>    1445   0   1    L1Cache             L1_Data  L1_IS>L1_S     [0x400, line 0x400]
> >> >> >>>
> >> >> >>> According to the documentation, the first column indicates the cycle.
> >> >> >>> I don't understand why the cycle counts are so large. In my
> >> >> >>> configuration,
> >> >> >>>
> >> >> >>> MEMORY_RESPONSE_LATENCY_MINUS_2: 78
> >> >> >>> DIRECTORY_LATENCY: 80
> >> >> >>> L2_RESPONSE_LATENCY: 6
> >> >> >>> L1_RESPONSE_LATENCY: 3
> >> >> >>> L1_REQUEST_LATENCY: 2
> >> >> >>> L2_REQUEST_LATENCY: 4
> >> >> >>> NETWORK_LINK_LATENCY: 40
> >> >> >>>
> >> >> >>> I don't understand why the cache operations would add up to 1096
> >> >> >>> cycles for the first LD, for example. Could someone explain this
> >> >> >>> please? By the way, in the ruby config file, there is a
> >> >> >>> TIMER_LATENCY: 10000. I wonder what this is.
> >> >> >>>
> >> >> >>> Thanks a lot! I appreciate your comments.
> >> >> >>>
> >> >> >>> Lei
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >
> >
>
>