Consider the following program fragment.  Assume that each instruction
    fetch takes 50 ns.  How long would it take for this program fragment to
    run?  You will need to read the code to know how many times the loop
    will be executed.  (Assume there is no cache.)
    
    
    
        112:         addi $t0, $zero, 0
        116:         addi $t1, $zero, 100
        120:  loop:  add $s0, $s0, $s1
        124:         addi $t0, $t0, 1
        128:         bne $t0, $t1, loop
    
    
    Direct-Mapped Caches
    
    Consider a (very small) direct-mapped cache that has 4 lines, each
    containing 1 4-byte word.
    Assume that the cache has a hit access time of 1 ns and a miss
    penalty of 50 ns (i.e., the full access time is 51 ns when there is a
    miss: the miss penalty + the usual access time).
    How long would it take for the program fragment above to run, assuming
    that none of the instructions are in the cache to begin with?  What
    is the average memory access time (overall time / number of
    instructions)?  (The program fragment is repeated below for convenience.)
    (You can fill in the table to the right 
    if that would be helpful.  I've indicated what it would look like after
    the first instruction missed and then was fetched.  Address 112 maps
    to line 00. The v column indicates whether the line
    contains valid (meaningful) data for this program, or just left-over
    garbage.)
    
    
    Filling in this table is optional
    
         |  | v | Tag | Instruction in Cache (4 bytes) | 
         | 00 | 1 | 0000 ... 0111 | addi $t0, $zero, 0 | 
         | 01 | 0 |  |  | 
         | 10 | 0 |  |  | 
         | 11 | 0 |  |  | 
    
    
     
    
    
        112:         addi $t0, $zero, 0
        116:         addi $t1, $zero, 100
        120:  loop:  add $s0, $s0, $s1
        124:         addi $t0, $t0, 1
        128:         bne $t0, $t1, loop
    
     
     
    
    
    Consider a direct-mapped cache that has 8 lines, each containing
    1 word.  For simplicity, assume that a word is 2 bytes long,
    which means that addresses are also 2 bytes.
    
        - How long (how many bits) is an address?
        
- Are any address bits used to determine the byte within the
        cache line? If so, how many?
        
- Which address bits indicate the line to which that address maps?
        
- How many bits are needed for the tag?
        
Consider the following set of memory address requests.
    How many of the requests would be hits and how many misses, assuming
    that the cache is empty beforehand?
    (I.e.,v is 0 for every line.)
    What would be the final contents of the cache?
    
 
    
    
    
         |  | v | Tag | Data in Cache (2 bytes) | 
         | 000 |  |  |  | 
         | 001 |  |  |  | 
         | 010 |  |  |  | 
         | 011 |  |  |  | 
         | 100 |  |  |  | 
         | 101 |  |  |  | 
         | 110 |  |  |  | 
         | 111 |  |  |  | 
    
    
     
    
    
         | Requested Address | Data at Address (in hex) | Hit/Miss? | 
         | 00000000 00000000 | 00 00 |  | 
         | 00000000 00101110 | 00 0F |  | 
         | 00000000 00101010 | C0 03 |  | 
         | 00000000 01011010 | F0 00 |  | 
         | 00000000 01011110 | FF FF |  | 
    
    
     
     
    
    
    Assume that these 5 data requests are in a loop, and so will be
    repeated many times in a row.  What will the hit rate converge to
    for this set of requests?
    
    Multi-Word Lines in Direct-Mapped Cache
    
    Now assume that we have a direct-mapped cache with 4 lines, each of
    which contains 4 two-byte words.
    In one iteration of the 5 address requests above,
    which requests would be hits and which would
    be misses, assuming that the cache is empty beforehand?
    What would be the final contents of the cache after that 1 iteration?
    (Put a "v" in any word in the cache that is valid but the
    information you've been given isn't enough to know what the value
    should be.)
    
 
    
    
        
         |  | v | Tag | Data in Cache | 
         | 00 |  |  |  |  |  |  | 
         | 01 |  |  |  |  |  |  | 
         | 10 |  |  |  |  |  |  | 
         | 11 |  |  |  |  |  |  | 
    
    
     
    
    
         | Requested Address | Data at Address | Hit/Miss? | 
         | 00000000 00000000 | 0000 |  | 
         | 00000000 00101110 | 000F |  | 
         | 00000000 00101010 | C003 |  | 
         | 00000000 01011010 | F000 |  | 
         | 00000000 01011110 | FFFF |  | 
    
    
     
     
    
    
    Assuming that the 5 data requests are in a loop, what does the hit
    rate converge to for this set of requests?
    
    Set-Associative Caches
    
    The tables below represent an 8-line, 2-way set associative cache,
    a 4-line, 2-way set associative cache, and
    a 4-line, fully associative cache.  (Fully associative means that
    all the lines in the cache are part of a single, large set.)
    Show the final contents of each cache after 1 iteration of the
    same 5 address requests used in the previous exercises,
    assuming an LRU replacement policy.
    What would the hit ratio converge to for each cache, assuming
    that the 5 address requests are in a loop, and so will
    be repeated many times in a row?
    
 
    
    
    
         |  | v | Tag | Data in Cache | 
         | 00 |  |  |  | 
         |  |  |  | 
         | 01 |  |  |  | 
         |  |  |  | 
         | 10 |  |  |  | 
         |  |  |  | 
         | 11 |  |  |  | 
         |  |  |  | 
    
    
     
    
     
    
    Reflection
    
    Which of the exercises above illustrate the effects of spatial
    locality?  Which exhibit the effects of temporal locality?