Computer Organization and Architecture

Question 1
In a k-way set associative cache, the cache is divided into v sets, each of which consists of k lines. The lines of a set are placed in sequence one after another. The lines in set s are sequenced before the lines in set (s+1). The main memory blocks are numbered 0 onwards. The main memory block numbered j must be mapped to any one of the cache lines from.
Tick
(j mod v) * k to (j mod v) * k + (k-1)
Cross
(j mod v) to (j mod v) + (k-1)
Cross
(j mod k) to (j mod k) + (v-1)
Cross
(j mod k) * v to (j mod k) * v + (v-1)


Question 1-Explanation: 
The cache has v sets, so main memory block j maps to set (j mod v). Because each set occupies k consecutive lines and set s precedes set (s+1), set (j mod v) spans cache lines (j mod v) * k to (j mod v) * k + (k-1), and block j may occupy any one of them. Associativity does not affect which set is chosen; k-way associativity only means that each set offers k candidate lines, which lowers the chance of replacement.
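The mapping can be checked with a short sketch. The function name `candidate_lines` and the example values (4 sets, 2-way, block 13) are illustrative assumptions, not part of the question.

```python
def candidate_lines(j: int, v: int, k: int) -> range:
    """Cache lines that may hold main memory block j (v sets, k lines per set)."""
    s = j % v          # set index selected by the block number
    first = s * k      # sets are laid out back to back, k lines each
    return range(first, first + k)


# Hypothetical example: 4 sets, 2-way set associative, block 13
print(list(candidate_lines(13, v=4, k=2)))  # [2, 3]
```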
Question 2
Consider the following sequence of micro-operations.
     MBR ← PC 
     MAR ← X  
     PC ← Y  
     Memory ← MBR
Which one of the following is a possible operation performed by this sequence?
Cross
Instruction fetch
Cross
Operand fetch
Cross
Conditional branch
Tick
Initiation of interrupt service


Question 2-Explanation: 
MBR - Memory Buffer Register (holds the data being transferred to or from memory).
MAR - Memory Address Register (holds the address of the memory location to be accessed).
PC - Program Counter (holds the address of the next instruction to be fetched).

The 1st micro-operation places the value of PC into MBR.
The 2nd micro-operation places an address X into MAR.
The 3rd micro-operation places an address Y into PC.
The 4th micro-operation writes the value of MBR (the old PC value) into memory.

The 1st and 4th micro-operations show that the old PC value is saved in memory while control is transferred elsewhere, so that execution can later resume from where it left off. This behavior is seen in interrupt handling. Here X is the memory address at which the old PC value is saved, and Y is the starting address of the interrupt service routine.

In a conditional branch (option C), only the PC is updated with the target address; the old PC value does not need to be stored in memory. In instruction fetch and operand fetch (options A and B), the PC value is not stored anywhere. Hence option D.
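A minimal simulation of the four micro-operations makes the effect visible. The function `run_sequence` and the numeric addresses are hypothetical; memory is modeled as a plain dictionary.

```python
def run_sequence(pc: int, x: int, y: int) -> tuple[int, dict[int, int]]:
    """Apply the four micro-operations and return the new PC and memory."""
    memory: dict[int, int] = {}
    mbr = pc            # MBR <- PC      : capture the return address
    mar = x             # MAR <- X       : address where the old PC is saved
    pc = y              # PC <- Y        : transfer control to the routine
    memory[mar] = mbr   # Memory <- MBR  : store the old PC at address X
    return pc, memory


# Hypothetical values: PC = 100, save address X = 500, routine address Y = 900
new_pc, mem = run_sequence(pc=100, x=500, y=900)
print(new_pc, mem)  # 900 {500: 100}
```

The old PC survives in memory while the PC itself now points to the routine, which is the save-and-jump pattern of interrupt initiation.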
Question 3

Consider an instruction pipeline with five stages without any branch prediction: Fetch Instruction (FI), Decode Instruction (DI), Fetch Operand (FO), Execute Instruction (EI) and Write Operand (WO). The stage delays for FI, DI, FO, EI and WO are 5 ns, 7 ns, 10 ns, 8 ns and 6 ns, respectively. There are intermediate storage buffers after each stage and the delay of each buffer is 1 ns. A program consisting of 12 instructions I1, I2, I3, …, I12 is executed in this pipelined processor. Instruction I4 is the only branch instruction and its branch target is I9. If the branch is taken during the execution of this program, the time (in ns) needed to complete the program is

Cross

132

Tick

165

Cross

176

Cross

328



Question 3-Explanation: 
The clock period is set by the slowest stage plus the buffer delay: 10 + 1 = 11 ns.
The pipeline has to be stalled until the EI stage of I4 completes,
because EI determines whether the branch is taken.

After that, I4(WO) and I9(FI) proceed in parallel, followed by the
remaining instructions.
Up to the completion of I4(EI): 7 cycles * (10 + 1) ns = 77 ns
From I4(WO) / I9(FI) to I12(WO): 8 cycles * (10 + 1) ns = 88 ns
Total = 77 + 88 = 165 ns
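The cycle counting can be reproduced with a small sketch. It assumes, as the explanation does, that fetching stops until the branch finishes its EI stage; the function and parameter names are illustrative.

```python
def taken_branch_time_ns(stage_delays, buffer_delay, branch_index,
                         instrs_from_target, resolve_stage):
    """Total time when fetch stalls until the branch leaves `resolve_stage`.

    branch_index: 1-based position of the branch in program order (I4 -> 4)
    instrs_from_target: instructions executed from the target onward (I9..I12 -> 4)
    resolve_stage: 1-based stage in which the branch outcome is known (EI -> 4)
    """
    cycle = max(stage_delays) + buffer_delay            # slowest stage + buffer
    n_stages = len(stage_delays)
    resolve_cycle = branch_index + resolve_stage - 1    # branch finishes EI here
    last_fetch_cycle = resolve_cycle + instrs_from_target
    finish_cycle = last_fetch_cycle + n_stages - 1      # last instruction leaves WO
    return finish_cycle * cycle


print(taken_branch_time_ns([5, 7, 10, 8, 6], 1, branch_index=4,
                           instrs_from_target=4, resolve_stage=4))  # 165
```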
Question 4
A RAM chip has a capacity of 1024 words of 8 bits each (1K × 8). The number of 2 × 4 decoders with enable line needed to construct a 16K × 16 RAM from 1K × 8 RAM is
Cross
4
Tick
5
Cross
6
Cross
7


Question 4-Explanation: 
RAM chip size = 1K × 8 (1024 words of 8 bits each)
RAM to construct = 16K × 16
Number of chips required = (16K × 16) / (1K × 8)
                         = 16 × 2
[16 rows of chips, each row having 2 chips side by side]
To select one row out of the 16 rows, we need a 4 × 16 decoder.

Available decoder: 2 × 4 decoder with enable
To be constructed: 4 × 16 decoder

Four 2 × 4 decoders produce the 16 select lines, and one more 2 × 4 decoder drives their enable inputs.
Hence 4 + 1 = 5 decoders are required.
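The decoder count can be sketched as a tree of 2 × 4 decoders, where each level enables the level below it. The helper `decoders_needed` is a hypothetical name, and the tree construction is the usual textbook arrangement assumed here.

```python
def decoders_needed(select_lines: int, fan_out: int = 4) -> int:
    """Count fan_out-output decoders in a tree producing `select_lines` outputs."""
    count = 0
    lines = select_lines
    while lines > 1:
        level = -(-lines // fan_out)   # ceiling division: decoders at this level
        count += level
        lines = level                  # each decoder needs one enable from above
    return count


rows = (16 * 1024) // 1024             # 16 rows of 1K-word chips
print(decoders_needed(rows))           # 5
```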
Question 5
The following code segment is executed on a processor which allows only register operands in its instructions. Each instruction can have at most two source operands and one destination operand. Assume that all variables are dead after this code segment.
   c = a + b;
   d = c * a;
   e = c + a;
   x = c * c;
   if (x > a) {
      y = a * a;
   }
   else {
     d = d * d;
     e = e * e;
  }
Suppose the instruction set architecture of the processor has only two registers. The only allowed compiler optimization is code motion, which moves statements from one place to another while preserving correctness. What is the minimum number of spills to memory in the compiled code?
Cross
0
Tick
1
Cross
2
Cross
3


Question 5-Explanation: 
r1      r2      operation
a       b       c = a + b
a       c       x = c * c
a       x       c has to be stored in memory, as we don't know whether x > a or not
y       x       y = a * a
Choosing the best case of x > a, minimum spills = 1
Question 6

Consider the same data as above question. What is the minimum number of registers needed in the instruction set architecture of the processor to compile this code segment without any spill to memory? Do not apply any optimization other than optimizing register allocation.

Cross

3

Tick

4

Cross

5

Cross

6



Question 6-Explanation: 

Note that code motion is not allowed for this problem, so we analyze the code line by line and determine how many registers are needed to execute the snippet. Assume the registers are numbered R1, R2, R3 and R4.

Just before x = c * c executes, the values a, c, d and e are all still needed: c for x = c * c, a for the comparison x > a, and d and e for the else branch. Four values are therefore live at the same time, and no statement can be moved to shorten their lifetimes.

So, from the above analysis we can conclude that a minimum of 4 registers is needed to execute the code snippet. This explanation has been contributed by Namita Singh.
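One way to cross-check the count is a backward liveness pass over the code segment, taking the largest set of simultaneously live variables as the register requirement. This is a sketch under that assumption (a destination may reuse the register of an operand that dies in the same statement); the data layout and names are illustrative.

```python
# Each statement is (defined variable, set of variables it reads).
straight = [("c", {"a", "b"}), ("d", {"c", "a"}), ("e", {"c", "a"}), ("x", {"c"})]
cond_uses = {"x", "a"}                      # if (x > a)
then_branch = [("y", {"a"})]
else_branch = [("d", {"d"}), ("e", {"e"})]


def live_in_sets(stmts, live_out):
    """Backward liveness over straight-line code; live-in sets in program order."""
    sets = []
    live = set(live_out)
    for dest, uses in reversed(stmts):
        live = (live - {dest}) | uses       # kill the definition, add the uses
        sets.append(set(live))
    return sets[::-1]


then_sets = live_in_sets(then_branch, set())   # all variables dead at the end
else_sets = live_in_sets(else_branch, set())
at_branch = cond_uses | then_sets[0] | else_sets[0]
top_sets = live_in_sets(straight, at_branch)

pressure = max(len(s) for s in top_sets + [at_branch] + then_sets + else_sets)
print(pressure)  # 4
```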

Question 7
The amount of ROM needed to implement a 4 bit multiplier is
Cross
64 bits
Cross
128 bits
Cross
1 Kbits
Tick
2 Kbits


Question 7-Explanation: 
For a 4-bit multiplier, there are 2^4 * 2^4 input combinations, i.e., 2^8 combinations. The output of a 4-bit multiplier is 8 bits. Thus, the amount of ROM needed = 2^8 * 8 = 2^11 = 2048 bits = 2 Kbits
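The same arithmetic as a short sketch, generalized to an n-bit multiplier implemented as a lookup table; the function name is an illustrative assumption.

```python
def multiplier_rom_bits(n: int) -> int:
    """ROM size in bits for an n-bit by n-bit multiplier stored as a lookup table."""
    address_bits = 2 * n          # two n-bit operands form the ROM address
    word_bits = 2 * n             # the product of two n-bit numbers has 2n bits
    return (2 ** address_bits) * word_bits


print(multiplier_rom_bits(4))          # 2048
print(multiplier_rom_bits(4) // 1024)  # 2 (Kbits)
```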
Question 8

Register renaming is done in pipelined processors

Cross

as an alternative to register allocation at compile time

Cross

for efficient access to function parameters and local variables

Tick

to handle certain kinds of hazards

Cross

as part of address translation



Question 8-Explanation: 

Register renaming is done to eliminate WAR (Write after Read) and WAW (Write after Write) dependencies between instructions, which could otherwise cause pipeline stalls. Hence, (C) is the answer.

Example:

I1: Read A to B
I2: Write C to A

Here, there is a WAR dependency and the pipeline would need stalls. To avoid this, register renaming is applied and

Write  C to A
becomes
Write  C to A'

where A' is a different physical register. A WAR dependency is also called an anti-dependency: there is no real data dependency, only the fact that both instructions use the same register name. Register renaming removes it, and WAW dependencies are handled in the same way.
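A minimal sketch of the idea, assuming a simplified scheme in which every destination gets a fresh physical register (P0, P1, ...) and sources read the latest mapping. The instruction encoding and the `rename` function are hypothetical and do not model any real processor.

```python
def rename(instrs):
    """Rename destinations to fresh physical registers.

    instrs: list of (dest, [sources]) using architectural register names.
    """
    mapping = {}                 # architectural name -> current physical register
    renamed = []
    for counter, (dest, srcs) in enumerate(instrs):
        new_srcs = [mapping.get(s, s) for s in srcs]   # read the latest versions
        phys = f"P{counter}"                           # fresh destination register
        mapping[dest] = phys
        renamed.append((phys, new_srcs))
    return renamed


# I1: Read A to B (B <- A), I2: Write C to A (A <- C)
print(rename([("B", ["A"]), ("A", ["C"])]))
# [('P0', ['A']), ('P1', ['C'])]
```

After renaming, I2 writes P1 instead of A, so it no longer conflicts with I1's read of A.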
 

Question 9
A computer has a 256 KByte, 4-way set associative, write back data cache with block size of 32 Bytes. The processor sends 32 bit addresses to the cache controller. Each cache tag directory entry contains, in addition to address tag, 2 valid bits, 1 modified bit and 1 replacement bit. The number of bits in the tag field of an address is
Cross
11
Cross
14
Tick
16
Cross
27


Question 9-Explanation: 
A set-associative scheme is a hybrid between a fully associative cache and a direct-mapped cache. It's considered a reasonable compromise between the complex hardware needed for fully associative caches (which requires parallel searches of all slots), and the simplistic direct-mapped scheme, which may cause collisions of addresses to the same slot (similar to collisions in a hash table). (source: http://www.cs.umd.edu/class/spring2003/cmsc311/Notes/Memory/set.html). Also see http://csillustrated.berkeley.edu/PDFs/handouts/cache-3-associativity-handout.pdf

Number of blocks = Cache size / Block size = 256 KB / 32 Bytes = 2^13
Number of sets = 2^13 / 4 = 2^11
Tag + Set index + Byte offset = 32
Tag + 11 + 5 = 32
Tag = 16
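The address breakdown can be verified with a short sketch; it assumes all sizes are powers of two, and the function name is illustrative.

```python
def tag_bits(cache_bytes: int, block_bytes: int, ways: int, address_bits: int) -> int:
    """Tag width for a set-associative cache (all sizes assumed powers of two)."""
    blocks = cache_bytes // block_bytes
    sets = blocks // ways
    set_bits = sets.bit_length() - 1            # log2 of the number of sets
    offset_bits = block_bytes.bit_length() - 1  # log2 of the block size
    return address_bits - set_bits - offset_bits


print(tag_bits(256 * 1024, 32, ways=4, address_bits=32))  # 16
```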
Question 10

Consider the data given in previous question. The size of the cache tag directory is

Tick

160 Kbits

Cross

136 bits

Cross

40 Kbits

Cross

32 bits



Question 10-Explanation: 

Each tag directory entry holds 16 tag bits + 2 valid bits + 1 modified bit + 1 replacement bit = 20 bits.
Size of the tag directory = 20 × number of blocks = 20 × 2^13 bits = 160 Kbits.
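As a quick check of the multiplication, here is a sketch with one directory entry per cache block; the function and parameter names are illustrative.

```python
def tag_directory_bits(cache_bytes: int, block_bytes: int, tag: int,
                       valid: int, modified: int, replacement: int) -> int:
    """Total tag directory size in bits, one entry per cache block."""
    entries = cache_bytes // block_bytes
    return entries * (tag + valid + modified + replacement)


size = tag_directory_bits(256 * 1024, 32, tag=16, valid=2, modified=1, replacement=1)
print(size // 1024)  # 160 (Kbits)
```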

There are 241 questions to complete.

  • Last Updated : 19 Nov, 2018
