GATE | GATE-IT-2004 | Question 46
If we use internal data forwarding to speed up the performance of a CPU (R1, R2 and R3 are registers and M is a memory reference), then the sequence of operations
• Data forwarding: In figure (2), the ADD and SUB instructions have a data dependency on register R1. The 2nd and 3rd instructions read R1 in the ID stage, but the 1st instruction updates R1 only after the WB stage. Without forwarding, the 2nd instruction (SUB) is therefore stalled for the next two cycles until the updated value of R1 is available.
• Internal data forwarding is a mechanism that reduces stalls caused by data dependencies. It uses hardware to forward the result held in an interstage buffer register (IBR) to the next instruction's buffer register. As soon as the result of the ALU operation in the 1st instruction is available, it is fed back as an input to the ALU, so the updated value of R1 is available right after the ALU operation (otherwise it would be available only after the WB stage) and no stalls are needed.
• The ALU result from the EX/MEM register is always fed back to the ALU input latches. If the forwarding hardware detects that the previous ALU operation has written the register corresponding to the source for the current ALU operation, control logic selects the forwarded result as the ALU input rather than the value read from the register file.
• As given in the question, the value of register R1 is stored to memory location M, and then the value of M is loaded into registers R2 and R3. With internal data forwarding, that value can be taken directly from R1 instead of being read back from M. Options A, B and C are wrong because they do not produce the same result as the original sequence.
• Suppose registers R1, R2, R3 and memory reference M initially hold 10, 20, 30 and 40 respectively. After the given sequence of operations executes, R2, R3 and M hold 10, 10 and 10 respectively.
• With option A, after the operations execute, R2, R3 and M hold 20, 10 and 20 respectively. With option B they hold 10, 10 and 40, and with option C they hold 20, 20 and 10. Option D leaves R2, R3 and M all holding 10, as desired, so only option (D) is correct.
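The check with initial values 10, 20, 30 and 40 can be reproduced with a small simulation. This is only a sketch: the helper `run` and the `(source, destination)` tuple encoding are made up for illustration, and the `forwarded` sequence (R1 to R2, R1 to R3, R1 to M) is assumed to be what option D lists, since the option text is not reproduced in this solution.

```python
def run(transfers, state):
    """Apply register-transfer steps (src, dst) in order; return the final state."""
    state = dict(state)  # work on a copy so the initial values stay untouched
    for src, dst in transfers:
        state[dst] = state[src]  # copy the current value of src into dst
    return state


# Initial values used in the explanation above.
initial = {"R1": 10, "R2": 20, "R3": 30, "M": 40}

# Original sequence: store R1 to M, then load M into R2 and into R3.
original = [("R1", "M"), ("M", "R2"), ("M", "R3")]

# Assumed forwarded replacement (taken to be option D): every transfer reads R1 directly.
forwarded = [("R1", "R2"), ("R1", "R3"), ("R1", "M")]

after_original = run(original, initial)
after_forwarded = run(forwarded, initial)

print(after_original)
print(after_forwarded)
print(after_original == after_forwarded)
```

Both sequences end in the same state, but in the forwarded one no transfer has to wait for M to be written first, which is where the speed-up comes from.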
This solution is contributed by Nirmal Bharadwaj.