Homework 5

Due Date: 23:59:59 Thursday, December 2
Total Points: 100 pts

You must work on this assignment with one other person. You should work together on all parts of the assignment, but submit only one set of solutions. If each of you works on only part of the assignment, then each of you learns only part of the material. Please be sure to write both names on the submitted solutions.

Note: Copying material from Wikipedia, other online sources, or any other source will not be tolerated. This form of plagiarism has occurred in the past, and penalties for violating the Duke Community Standard will be severe.

The Questions

  1. (15 pts) Assume you have a 2-wide superscalar core with dynamic instruction scheduling. Assume that the dynamic scheduler is optimal: it chooses the best possible schedule (instead of, say, scheduling the oldest available instructions first). Assume that an instruction must wait 2 cycles before consuming the data provided by a load instruction. For example, if a load writes to r6 on cycle 1, then an instruction cannot consume r6 until cycle 3. Assume that all instructions take one cycle and that data-dependent instructions cannot execute on the same cycle.
    (a) Show the execution of Thread 1 on this core; that is, show the cycle on which each instruction executes. What is the utilization of the core, measured in terms of IPC/(superscalar width)? Do the same thing for Thread 2.
    (b) Show the simultaneous execution of both Thread 1 and Thread 2 on an SMT core. Assume that the core can fetch instructions from both threads during the same cycle. The core is identical to the core for part (a), except that it has 2 thread contexts. What is the utilization of the core now?

    Thread 1
    add r1, r2, r3 // r1=r2+r3
    load r4, A // r4=Mem[A]
    sub r5, r1, r4
    mul r6, r5, r2

    Thread 2
    load r11, B
    load r12, C
    xor r13, r11, r12
    and r16, r13, r12

  2. (10 pts) H&P 4.3
  3. (30 pts) H&P 4.16
  4. (15 pts) Discuss the advantages and disadvantages of using a bus vs. a 2-dimensional mesh to interconnect processors in a cache-coherent shared-memory multiprocessor.
  5. (15 pts) Write a brief summary (2 paragraphs) of the RIFLE paper. Identify and explain the main ideas, and give your opinion on the strengths and weaknesses of the idea, given everything you've learned this semester.
  6. (15 pts) Write a brief summary (2 paragraphs) of how DIVA works. What are the advantages of using a small checker core?
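
For question 1, the timing rules and the utilization metric can be written down as a small checker for hand-drawn schedules. The Python sketch below is only an illustration and is not part of the assignment: the tuple format, the function names check_schedule and utilization, and the three-instruction example are all hypothetical. It assumes that only true (read-after-write) register dependences constrain the schedule, and that utilization is measured over the cycles up to the last instruction executed.

```python
from collections import Counter

WIDTH = 2       # superscalar width given in question 1
LOAD_DELAY = 2  # load executes on cycle 1 -> consumer no earlier than cycle 3


def check_schedule(instrs, cycles, width=WIDTH):
    """Return True if `cycles` is a legal schedule for `instrs`.

    instrs: list of (opcode, dest_reg, [source_regs]) in program order
            (hypothetical format chosen for this sketch).
    cycles: list of 1-based execution cycles, one per instruction.
    """
    # No more than `width` instructions may execute in any one cycle.
    if any(count > width for count in Counter(cycles).values()):
        return False

    last_writer = {}  # register name -> index of most recent producer
    for i, (_, dest, sources) in enumerate(instrs):
        for reg in sources:
            if reg in last_writer:
                p = last_writer[reg]
                # A consumer runs at least 1 cycle after its producer,
                # or LOAD_DELAY cycles after it if the producer is a load.
                gap = LOAD_DELAY if instrs[p][0] == "load" else 1
                if cycles[i] < cycles[p] + gap:
                    return False
        last_writer[dest] = i
    return True


def utilization(cycles, width=WIDTH):
    """IPC / width, where IPC = instructions / cycles used."""
    return len(cycles) / (max(cycles) * width)


# Hypothetical example, deliberately unrelated to Thread 1 and Thread 2.
example = [
    ("load", "r20", []),             # r20 = Mem[X]
    ("add",  "r21", ["r22", "r23"]), # independent of the load
    ("sub",  "r24", ["r20", "r21"]), # needs both results
]
```

With this example, check_schedule(example, [1, 1, 3]) accepts the schedule, while check_schedule(example, [1, 1, 2]) rejects it because the sub would consume the loaded value one cycle too early. For an SMT schedule, the same utilization function applies if the cycle lists of both threads are concatenated.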

