Re: [Gems-users] L1 bypassing?


Date: Fri, 23 Sep 2005 15:55:42 -0400
From: Christian Bienia <cbienia@xxxxxxxxxxxxxxxx>
Subject: Re: [Gems-users] L1 bypassing?
Hi Brad,

That's exactly what I did. I need L1 caches with a latency of more than
1 cycle. In your post on June 21st, you recommended setting
REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH == true and setting
SEQUENCER_TO_CONTROLLER_LATENCY to the desired L1 latency.

Is there a way to get both an L1 latency greater than 1 and some L1
access statistics using only the parameters? If not, the following code
in Sequencer.C seems to be responsible for profiling:



void Sequencer::issueRequest(const CacheMsg& request) {
  bool found = insertRequest(request);

  if (!found) {
    CacheMsg msg = request;
    msg.getAddress() = line_address(request.getAddress()); // Make line address

    // Fast Path L1 misses are profiled here - all non-fast path misses are
    // profiled within the generated protocol code
    if (!REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH) {
      g_system_ptr->getProfiler()->addPrimaryStatSample(msg, m_chip_ptr->getID());
    }



If I make the call to g_system_ptr->getProfiler()->addPrimaryStatSample
unconditional, will it work?
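
To make the question concrete, here is a minimal sketch of the two
variants. StubProfiler and countSamples are hypothetical stand-ins that I
made up for illustration, not the real GEMS classes, and the flag is passed
as a plain bool instead of the global parameter. The sketch only shows how
the guard affects the number of samples taken. It says nothing about whether
the generated protocol code also profiles the same request, in which case
samples might be counted twice.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the GEMS profiler: it only counts samples.
struct StubProfiler {
  std::uint64_t primary_samples = 0;
  void addPrimaryStatSample() { ++primary_samples; }
};

// Simulates `requests` requests reaching the profiling point in
// issueRequest and returns how many of them were profiled.
//   unconditional == false: current guard, sample only if the fast path is kept
//   unconditional == true:  proposed change, sample every request
std::uint64_t countSamples(bool unconditional, bool remove_fast_path, int requests) {
  assert(requests >= 0);
  StubProfiler profiler;
  for (int i = 0; i < requests; ++i) {
    if (unconditional || !remove_fast_path) {
      profiler.addPrimaryStatSample();
    }
  }
  return profiler.primary_samples;
}
```

With the current guard and REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH == true,
countSamples(false, true, n) stays at zero, which would match the empty L1
stats I am seeing. With the unconditional variant, countSamples(true, true, n)
returns n.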

Thanks for your help, guys.

Chris



On Fri, 2005-09-23 at 15:05, Bradford Beckmann wrote:
> Did you set REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH == true?  There is a bad,
> implicit assumption that the SMP protocols will use the FAST_PATH for L1
> hits and therefore these hits will be profiled by the Sequencer.  When you
> set REMOVE_SINGLE_CYCLE_DCACHE_FAST_PATH == true, neither the Sequencer
> nor the L1Cache_Controller specified in the file MOSI_SMP_bcast-cache.sm
> will profile L1 hits.
> 
> Brad
> 
> 
> 
> On Fri, 23 Sep 2005, Mike Marty wrote:
> 
> > Hmmm..this protocol should show L1 and L2 stats just fine.
> >
> > The *_1level protocols would show zero L1 stats.  You didn't happen to
> > build one of these protocols, did you?
> >
> > --Mike
> >
> >
> > > Hi,
> > >
> > > I used MOSI_SMP_bcast.
> > >
> > > Chris
> > >
> > >
> > >
> > > On Thursday 22 September 2005 08:40 pm, Mike Marty wrote:
> > > > Chris,
> > > >
> > > > I suspect that the protocol you used doesn't have L1 cache statistics
> > > > hooked up correctly.  Which protocol was this?
> > > >
> > > > --Mike
> > > >
> > > > > Hi,
> > > > >
> > > > > after a simulation of a uniprocessor machine with warm caches, I got the
> > > > > following results:
> > > > >
> > > > >
> > > > >
> > > > > ----------------------------
> > > > > L1D_cache cache stats:
> > > > >   L1D_cache_total_misses: 0
> > > > >   L1D_cache_total_demand_misses: 0
> > > > >   L1D_cache_total_prefetches: 0
> > > > >   L1D_cache_total_sw_prefetches: 0
> > > > >   L1D_cache_total_hw_prefetches: 0
> > > > >   L1D_cache_misses_per_transaction: 0
> > > > >   L1D_cache_misses_per_instruction: 0
> > > > >   L1D_cache_instructions_per_misses: NaN
> > > > >
> > > > > >   L1D_cache_request_size: [binsize: log2 max: 0 count: 0 average: NaN |standard deviation: NaN | 0 ]
> > > > >
> > > > > L1I_cache cache stats:
> > > > >   L1I_cache_total_misses: 0
> > > > >   L1I_cache_total_demand_misses: 0
> > > > >   L1I_cache_total_prefetches: 0
> > > > >   L1I_cache_total_sw_prefetches: 0
> > > > >   L1I_cache_total_hw_prefetches: 0
> > > > >   L1I_cache_misses_per_transaction: 0
> > > > >   L1I_cache_misses_per_instruction: 0
> > > > >   L1I_cache_instructions_per_misses: NaN
> > > > >
> > > > > >   L1I_cache_request_size: [binsize: log2 max: 0 count: 0 average: NaN |standard deviation: NaN | 0 ]
> > > > >
> > > > > L2_cache cache stats:
> > > > >   L2_cache_total_misses: 74762
> > > > >   L2_cache_total_demand_misses: 42876
> > > > >   L2_cache_total_prefetches: 31886
> > > > >   L2_cache_total_sw_prefetches: 31886
> > > > >   L2_cache_total_hw_prefetches: 0
> > > > >   L2_cache_misses_per_transaction: 74762
> > > > >   L2_cache_misses_per_instruction: 7.4762e-05
> > > > >   L2_cache_instructions_per_misses: 13375.8
> > > > > ----------------------------
> > > > >
> > > > >
> > > > >
> > > > > What confuses me is that there are L2 misses, but no L1 misses, which
> > > > > seems to be a contradiction if inclusive caches are used. How can the
> > > > > result be explained? And from where can I get the total number of
> > > > > accesses to the individual levels to compute the miss rate?
> > > > >
> > > > > Cheers,
> > > > > Chris
> > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Gems-users mailing list
> > > > > Gems-users@xxxxxxxxxxx
> > > > > https://lists.cs.wisc.edu/mailman/listinfo/gems-users
> 
> -----------------------------------------------------------------
>  Department of Computer Science         Residence
>  University of Wisconsin
>  1210 W. Dayton St. #6366               608 Eagle Heights Apt. L
>  Madison, WI 53706                      Madison, WI 53705
>  (608)265-2702				(608)852-6133
> -----------------------------------------------------------------
