lty)
• Memory stall clock cycles = Memory accesses x Miss rate x Miss penalty
• Different measure: AMAT
  Average Memory Access Time (AMAT) = Hit Time + (Miss Rate x Miss Penalty)
• Note: memory hit time is included in execution cycles.

Performance [contd…]
• Suppose a processor executes at
  – Clock Rate = 200 MHz (5 ns per cycle)
  – Base CPI = [value missing]
  – 50% arith/logic, 30% ld/st, 20% control
• Suppose that 10% of data memory operations incur a 50-cycle miss penalty
• Suppose that 1% of instructions incur the same miss penalty
• CPI = Base CPI + average stalls per instruction
      = Base CPI (cycles/ins)
        + [0.30 (DataMops/ins) x 0.10 (miss/DataMop) x 50 (cycle/miss)]
        + [1 (InstMop/ins) x 0.01 (miss/InstMop) x 50 (cycle/miss)]
      = (Base CPI + 1.5 + .5) cycle/ins
• 58% of the time the processor is stalled waiting for memory!
• AMAT, weighting instruction fetches (1 of 1.3 accesses per instruction) and data accesses (0.3 of 1.3):
  AMAT = (1/1.3) x [1 + 0.01 x 50] + (0.3/1.3) x [1 + 0.1 x 50] = 2.54 cycles

Summary
• The Principle of Locality:
  – A program is likely to access a relatively small portion of the address space at any instant of time.
• Temporal Locality: Locality in Time
• Spatial Locality: Locality in Space
• Three (+1) Major Categories of Cache Misses:
  – Compulsory Misses: sad facts of life. Example: cold start misses.
  – Conflict Misses: increase cache size and/or associativity. Nightmare Scenario: ping pong effect!
  – Capacity Misses: increase cache size
  – Coherence Misses: caused by external processors or I/O devices
• Cache Design Space
  – total size, block size, associativity
  – replacement policy
  – write-hit policy (write-through, write-back)
  – write-miss policy

Summary [contd…]
• Several interacting dimensions
  – cache size
  – block size
  – associativity
  – replacement policy
  – write-through vs write-back
  – write allocation
• The optimal choice is a compromise
  – depends on access characteristics
    • workload
    • use (I-cache, D-cache, TLB)
  – depends on technology / cost
• Simplicity often wins

[Figure: Factor A vs. Factor B trade-off; axes Bad/Good and Less/More; applies to cache size, associativity, block size]
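
The stall and AMAT arithmetic in the performance example can be checked with a short script. This is a minimal sketch, not part of the original slides: the function names are hypothetical, and the 1-cycle hit time is an assumption read off the "1 +" term in the AMAT expression. The inputs are the figures stated in the example (30% ld/st, 10% data miss rate, 1% instruction miss rate, 50-cycle miss penalty). Base CPI is left out because only the stall component is computed.

```python
def stall_cycles_per_instruction(data_ops_per_instr, data_miss_rate,
                                 inst_miss_rate, miss_penalty):
    """Average memory stall cycles added to each instruction."""
    # Data accesses: only loads/stores touch the data cache.
    data_stalls = data_ops_per_instr * data_miss_rate * miss_penalty
    # Instruction accesses: every instruction is fetched exactly once.
    inst_stalls = 1.0 * inst_miss_rate * miss_penalty
    return data_stalls + inst_stalls


def amat(hit_time, miss_rate, miss_penalty):
    """AMAT = Hit Time + (Miss Rate x Miss Penalty)."""
    return hit_time + miss_rate * miss_penalty


def combined_amat(data_ops_per_instr, data_miss_rate, inst_miss_rate,
                  miss_penalty, hit_time=1.0):
    """AMAT over all accesses, weighted by access type.

    hit_time=1.0 cycle is an assumption taken from the '1 +' term
    in the slide's AMAT expression.
    """
    accesses_per_instr = 1.0 + data_ops_per_instr  # 1 fetch + data ops
    inst_share = 1.0 / accesses_per_instr
    data_share = data_ops_per_instr / accesses_per_instr
    return (inst_share * amat(hit_time, inst_miss_rate, miss_penalty)
            + data_share * amat(hit_time, data_miss_rate, miss_penalty))


# Figures from the example: 30% ld/st, 10% data misses,
# 1% instruction misses, 50-cycle miss penalty.
stalls = stall_cycles_per_instruction(0.30, 0.10, 0.01, 50)
avg_access = combined_amat(0.30, 0.10, 0.01, 50)

print(f"stall cycles per instruction: {stalls:.2f}")
print(f"AMAT: {avg_access:.2f} cycles")
```

With these inputs the stall component comes to 2.0 cycles per instruction (1.5 from data accesses plus 0.5 from instruction fetches), and the weighted AMAT is about 2.54 cycles. Adding a base CPI to `stalls` gives the overall CPI.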