The pipeline must stall until the Ei stage of I4 completes, because the Ei stage determines whether the branch is taken. After that, I4(WO) and I9(Fi) can proceed in parallel, followed by the remaining instructions.
Until I4(Ei) completes: 7 cycles * (10 + 1) ns = 77 ns
From I4(WO) or I9(Fi) to I12(WO): 8 cycles * (10 + 1) ns = 88 ns
Total = 77 + 88 = 165 ns
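The arithmetic above can be checked with a minimal Python sketch. The cycle counts (7 and 8), the 10 ns stage delay and the 1 ns overhead are taken from the answer; the variable names are illustrative only.

```python
# Timing check for the stalled pipeline, using the figures from the answer.
STAGE_DELAY_NS = 10   # stage delay used in the answer
OVERHEAD_NS = 1       # per-stage overhead used in the answer
cycle_ns = STAGE_DELAY_NS + OVERHEAD_NS

cycles_until_branch_resolved = 7  # up to and including the Ei stage of the branch
cycles_after_branch = 8           # from the branch's WO / target's Fi to the last WO

stall_phase_ns = cycles_until_branch_resolved * cycle_ns
remaining_phase_ns = cycles_after_branch * cycle_ns
total_ns = stall_phase_ns + remaining_phase_ns

print(stall_phase_ns, remaining_phase_ns, total_ns)  # 77 88 165
```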
as an alternative to register allocation at compile time
for efficient access to function parameters and local variables
to handle certain kinds of hazards
as part of address translation
Register Indirect Scaled Addressing
Base Indexed Addressing
R1←c, R2←d, R2←R1+R2, R1←e, R2←R1-R2
To calculate the rest of the expression we must now load a and b into registers, but the content of R2 is still needed later, so we must use another register.
R1←a, R3←b, R1←R1-R3, R1←R1+R2
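As a sanity check, the two register-transfer sequences can be replayed in a small Python sketch. Reading the transfers as written, the final R1 holds (a - b) + (e - (c + d)). The sample values and the helper name `run` are assumptions made for illustration.

```python
# Replay the register transfers from the answer on sample values.
def run(values):
    a, b, c, d, e = (values[k] for k in "abcde")
    regs = {}
    # R1<-c, R2<-d, R2<-R1+R2, R1<-e, R2<-R1-R2
    regs["R1"] = c
    regs["R2"] = d
    regs["R2"] = regs["R1"] + regs["R2"]
    regs["R1"] = e
    regs["R2"] = regs["R1"] - regs["R2"]
    # R1<-a, R3<-b, R1<-R1-R3, R1<-R1+R2
    regs["R1"] = a
    regs["R3"] = b
    regs["R1"] = regs["R1"] - regs["R3"]
    regs["R1"] = regs["R1"] + regs["R2"]
    # Return the final result and how many distinct registers were touched.
    return regs["R1"], len(regs)

# Hypothetical sample values: (9 - 4) + (10 - (2 + 3)) = 10
result, registers_used = run({"a": 9, "b": 4, "c": 2, "d": 3, "e": 10})
print(result, registers_used)  # 10 3
```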
Pipeline register overhead is not counted for non-pipelined execution, so the total time without pipelining is 5 + 6 + 11 + 8 = 30 ns. With pipelining, every stage must be as long as the slowest stage, 11 ns, and adding the 1 ns register overhead gives a 12 ns cycle. In steady state one result is produced every pipeline cycle, that is, every 12 ns. Speedup = 30 / 12 = 2.5
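The same speedup calculation as a short Python sketch. The stage delays and the 1 ns overhead come from the answer; the variable names are illustrative.

```python
# Steady-state speedup of the pipelined design over the non-pipelined one.
stage_delays_ns = [5, 6, 11, 8]   # per-stage delays from the answer
register_overhead_ns = 1          # pipeline register overhead from the answer

non_pipelined_ns = sum(stage_delays_ns)                           # 30
pipelined_cycle_ns = max(stage_delays_ns) + register_overhead_ns  # 12
speedup = non_pipelined_ns / pipelined_cycle_ns

print(speedup)  # 2.5
```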
Instruction             Meaning of instruction
I0: MUL R2, R0, R1      R2 ← R0 * R1
I1: DIV R5, R3, R4      R5 ← R3 / R4
I2: ADD R2, R5, R2      R2 ← R5 + R2
I3: SUB R5, R2, R6      R5 ← R2 - R6
The program below uses six temporary variables a, b, c, d, e, f.
a = 1
b = 10
c = 20
d = a+b
e = c+d
f = c+e
b = c+e
e = b+f
d = 5+e
return d+f
Assuming that all operations take their operands from registers, what is the minimum number of registers needed to execute this program without spilling?
All of the given expressions use at most 3 variables, so we never need more than 3 registers.
It requires a minimum of 3 registers.
Principle of register allocation: when a variable needs a register, the allocator first looks for a free register and uses it if one exists. If no register is free, it looks for a register holding a dead variable (a variable whose value will not be used again) and reuses that register. Otherwise it resorts to spilling: it picks the register whose value is needed furthest in the future, saves that value to memory, and uses the register for the current allocation; when the old value is needed later, it is reloaded from memory into any available register.
Here, however, the question rules out spilling.
Let's allocate the registers for the variables.
a = 1 ( say register R1 is allocated to 'a' )
b = 10 ( R2 for 'b', because the value of 'a' is still needed, so 'b' cannot replace 'a' in R1 )
c = 20 ( R3 for 'c', because the values of 'a' and 'b' are still needed, so 'c' cannot replace 'a' in R1 or 'b' in R2 )
d = a+b ( 'd' can be assigned to R1, because R1 holds 'a', which is now dead: no subsequent expression uses the value of 'a' )
e = c+d ( 'e' can be assigned to R1, because R1 holds the value of 'd', which is not read again before 'd' is redefined )
Note: an already computed value is used only by a READ (not a WRITE), so to decide whether a variable is still needed we only look at the right-hand sides of the subsequent expressions.
f = c+e ( 'f' can be assigned to R2, because the value of 'b' in R2 is not read again before 'b' is redefined, so 'f' can replace 'b' )
b = c+e ( 'b' can be assigned to R3, because the value of 'c' in R3 is not used later )
e = b+f ( 'e' is already in R1, so no new allocation is needed, only an assignment )
d = 5+e ( 'd' can be assigned to either R1 or R3, because neither value is used afterwards; say R1 )
return d+f ( no allocation here; the contents of R1 and R2 are added and returned )
Hence we need only 3 registers: R1, R2 and R3.
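The dead-variable reasoning above can be cross-checked with a backward liveness pass, sketched here in Python. It counts the largest number of variables that are live at the same time, which is a lower bound on the registers needed without spilling; the walkthrough above shows that this bound is achievable. The tuple encoding of the program and the function name `max_live` are assumptions made for this sketch.

```python
# Each statement is (destination, [source variables]); constants are omitted.
program = [
    ("a", []),            # a = 1
    ("b", []),            # b = 10
    ("c", []),            # c = 20
    ("d", ["a", "b"]),    # d = a+b
    ("e", ["c", "d"]),    # e = c+d
    ("f", ["c", "e"]),    # f = c+e
    ("b", ["c", "e"]),    # b = c+e
    ("e", ["b", "f"]),    # e = b+f
    ("d", ["e"]),         # d = 5+e
    (None, ["d", "f"]),   # return d+f
]

def max_live(statements):
    """Walk the straight-line program backwards and track live variables."""
    live = set()
    peak = 0
    for dest, sources in reversed(statements):
        peak = max(peak, len(live))   # variables live after this statement
        live.discard(dest)            # the old value of dest is dead before it
        live.update(sources)          # operands must be live before it
        peak = max(peak, len(live))   # variables live before this statement
    return peak

min_registers = max_live(program)
print(min_registers)  # 3
```

The peak of 3 is reached, for example, just before d = a+b, where a, b and c are all live.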
I. It is useful in creating self-relocating code.
II. If it is included in an Instruction Set Architecture, then an additional ALU is required for effective address calculation.
III. The amount of increment depends on the size of the data item accessed.
II and III only
I. It must be a trap instruction.
II. It must be a privileged instruction.
III. An exception cannot be allowed to occur during execution of an RFE instruction.
I and II only
I, II and III only