Open In App

Primary Indexing in Databases

Indexing is a technique used to reduce access cost or I/O cost, now the question arrives what is access cost? Access cost is defined as the number of secondary memory blocks which is transferred from secondary memory to main memory in order to access required data. In this article, we are going to discuss every point about primary indexing.

What is Primary Indexing?

A Primary Index is an ordered file whose records are of fixed length with two fields. The first field of the index is the primary key of the data file in an ordered manner, and the second field of the ordered file contains a block pointer that points to the data block where a record containing the key is available.



Primary Indexing

Working of Primary Indexing

Primary Indexing

Types of Primary Indexing

Advantages of Primary Indexing

Disadvantages of Primary Indexing

Question for Practice

Question: Consider a database of fixed-length records, stored as an ordered file. The database has 25,000 records, with each record being 100 bytes, of which the primary key occupies 15 bytes. The data file is block-aligned in that each data record is fully contained within a block. The database is indexed by a primary index file, which is also stored as a block-aligned ordered file. The figure below depicts this indexing scheme.

Primary Indexing

Suppose the block size of the file system is 1024 bytes, and a pointer to a block occupies 5 bytes. The system uses binary search on the index file to search for a record with a given key. You may assume that a binary search on an index file of b blocks takes ⌈log2⁡b⌉ block accesses in the worst case. Given a key, the number of block accesses required to identify the block in the data file that may contain a record with the key, in the worst case, is _______. [ GATE CS/IT 2023 ]



Solution:

As it is given it is fixes length record,
So one block should fully contain within block so here it means
that they are saying that you should use un-spanned file organization.

=> No. of Records(Rn) = 25,000
=> Size of Record(Rs) = 100 Bytes
=> Size of Key(Ks) = 15 Bytes
=> Block Pointer Size(Ps) = 5 Bytes

So , Index Size(Is) = Size of Key + Block Pointer Size = 15 + 5 = 20 Bytes
Disk Block Size(Bs) = 1024 Bytes

=> No. of Record per Block = ⌊(Disk Block Size/Record Size)⌋
= ⌊1024⌋ = ⌊10.24⌋ = 10
=> No. of Block Required for record = ⌈( no. of record / no of record per block)⌉
= ⌈ 25000/10 ⌉ = 2500

=> The database is indexed by a primary index file it means,
that here we have to use sparse indexing.
=> No. of Index per Block =⌊(Disk Block Size/Index Size)⌋ 
= ⌊ (1024/20) ⌋
= 51 index per blocks
=> No. of Block Required for indexes = ⌈ ( No. of index /No. of index per block) ⌉
= ⌈ 2500/51 ⌉
= 50

=> It is given that Binary Search is used to search the desired block
and we know that it takes (log n) time to search in worst case
where n is the no. of comparison required

=> Here No. of comparison = No. of Index Block

=> Hence the number of block accesses required to identify the block in
the data file that may contain a record with the key,
in the worst case, is (log 50) = 6 Blocks

Frequently Asked Question on Primary Indexing – FAQs

Can we build two primary indexes on a file?

No we can’t build two primary index on a file, there is only one primary index on a file.

What is the difference between clustered and non-clustered primary index?

The records in the data file in clustered index are ordered physically based on primary key while in non-clustered primary index the records in the data file are not ordered physically.

How does primary indexing improve database performance?

Primary indexing improve the performance of the database by decreasing the time required to retrieve record from table.


Article Tags :