A processor with a frequency of 2GHz executes the following code:

for (i = 0; i < 4096; i++) {
    s = s + X[i] * Y[i];
}

Assume that s is held in a register (8 bytes) and that vectors X and Y are block-aligned in memory. All elements of X and Y are 8 bytes long and hold floating-point (FP) values.

The processor has an L1 data cache of 32 KBytes, two‐way set associative, with a block size of 64 bytes. The miss penalty is 128 cycles (the time to access main memory). On a miss, the CPU stalls instruction execution until the miss is served (the data is sent to the CPU). The instruction cache never misses. Assume vectors X and Y are NOT loaded in the L1 when execution starts.

The compiler translates the code to 8 instructions, two of which are memory accesses. The average CPI when we hit in the cache is 1.

Question 1: Compute the total number of misses

Question 2: Compute the misses per instruction

Question 3: Taking into account the misses, compute the CPI, MIPS and MFLOPS.
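Not an official solution, but here is a rough sketch of how Questions 1 to 3 could be worked out from the numbers in the statement. It rests on several assumptions of mine: the 8 instructions are per loop iteration, the two memory accesses are the loads of X[i] and Y[i], each vector is streamed exactly once so only compulsory misses occur (the two ways of a set can hold one block of X and one block of Y at the same time), and each iteration counts as 2 FLOPs (one multiply and one add). The function names are made up for illustration.

```c
#include <assert.h>

/* Blocks touched when a vector is streamed once from a block-aligned start.
 * Assumption: each touched block misses exactly once (compulsory miss). */
long blocks_touched(long n_elems, long elem_bytes, long block_bytes)
{
    assert(elem_bytes > 0 && block_bytes % elem_bytes == 0);
    long elems_per_block = block_bytes / elem_bytes;
    return (n_elems + elems_per_block - 1) / elems_per_block;
}

/* Effective CPI = hit CPI + misses per instruction * miss penalty. */
double cpi_with_misses(double base_cpi, double misses_per_instr,
                       double miss_penalty)
{
    return base_cpi + misses_per_instr * miss_penalty;
}

/* MIPS = clock rate in MHz / CPI. */
double mips(double clock_mhz, double cpi)
{
    return clock_mhz / cpi;
}

/* MFLOPS = MIPS scaled by the fraction of instructions that are FLOPs.
 * Assumption: flops_per_iter is 2 here (one multiply, one add). */
double mflops(double clock_mhz, double cpi, double instr_per_iter,
              double flops_per_iter)
{
    return flops_per_iter * clock_mhz / (instr_per_iter * cpi);
}
```

With the values from the statement, blocks_touched(4096, 8, 64) gives 512 blocks per vector, so 1024 misses for X and Y together. Over 4096 * 8 = 32768 instructions that is 0.03125 misses per instruction, so cpi_with_misses(1.0, 0.03125, 128.0) gives a CPI of 5, mips(2000.0, 5.0) gives 400 MIPS, and mflops(2000.0, 5.0, 8.0, 2.0) gives 100 MFLOPS under the 2-FLOPs-per-iteration assumption.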

Now assume we add a second level of cache (L2) and assume that vectors X and Y are in the L2 when the execution starts.

Question 4: Compute the miss penalty for the L1 to double the performance.
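One possible way to approach Question 4, again only as a sketch: I am assuming that "double the performance" means halving the effective CPI, that the number of L1 misses per instruction stays the same as without the L2, and that every L1 miss hits in the L2. The function name and the example inputs (1 hit CPI, 1024 misses over 32768 instructions, 128-cycle original penalty) are my own reading of the statement.

```c
#include <assert.h>

/* Solve base_cpi + mpi * new_penalty = old_cpi / speedup for new_penalty.
 * Assumption: the miss rate is unchanged and only the penalty shrinks. */
double required_miss_penalty(double base_cpi, double misses_per_instr,
                             double old_penalty, double speedup)
{
    assert(misses_per_instr > 0.0 && speedup > 0.0);
    double old_cpi = base_cpi + misses_per_instr * old_penalty;
    double target_cpi = old_cpi / speedup;
    return (target_cpi - base_cpi) / misses_per_instr;
}
```

Under those assumptions, required_miss_penalty(1.0, 1024.0 / 32768.0, 128.0, 2.0) returns 48, meaning the L1 miss penalty would have to drop from 128 to 48 cycles for the CPI to go from 5 to 2.5.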

I hope someone can help me. Thanks!
