GATE | GATE-CS-2014-(Set-2) | Question 65

Consider a join (relation algebra) between relations r(R)and s(S) using the nested loop method. There are 3 buffers each of size equal to disk block size, out of which one buffer is reserved for intermediate results. Assuming size(r(R)) < size(s(S)), the join will have fewer number of disk block accesses if

(A) relation r(R) is in the outer loop.
(B) relation s(S) is in the outer loop.
(C) join selection factor between r(R) and s(S) is more than 0.5.

(D) join selection factor between r(R) and s(S) is less than 0.5.


Answer: (A)

Explanation:

Nested loop join is one of the methods to implement database in memory. A nested loop join is an  algorithm that joins two sets by using two nested loops.

According to nested join,given relation R and S



  For each tuple r in R do
For each tuple s in S do
If r and s satisfy the join condition
Then output the tuple <r,s>

Cost estimations for the above loop:
– b(R) and  b(S) number of blocks in R and in S
– Each block of outer relation is read once
– Inner relation is read once for each block of outer relation

Summing up : IO= b(R)+b(R)*b(S) total  IO operations

Lets assume |R|>|S|  i.e b(R) =10  and b(s) =3
Now,   if R is outer relation then, IO= 10+10*3=40
if S is outer relation then IO=3+10*3=33

As it can be observed , that total IO is lesser if the value of outer variable is less and as it is already given that |R|<|S|.Therefore,  Relation r(R) should be in the outer loop to have fewer number of disk block accesses.
References:

  1. https://www.informatik.huberlin.de/de/forschung/gebiete/wbi/teaching/archive/sose05/dbs2/slides/09_joins.pdf
  2. https://en.wikipedia.org/wiki/Nested_loop_join


Quiz of this Question



My Personal Notes arrow_drop_up