Sparse files are a type of computer file that allows efficient storage allocation for large data. A file is considered sparse when much of its data is zero (empty data).
Support for creating such files is generally provided by the file system. Sparse files are widely used in areas of computer science such as DBMS (Database Management Systems), digital image processing, etc.
Sparse files are created differently from normal (non-sparse) files. Whenever a sparse file is created, metadata representing the empty blocks is written to the disk instead of the actual bytes that make up those blocks, which uses less disk space. Empty bytes don’t need to be saved, so they can be represented by metadata.
Actual data blocks are written only when non-empty (non-zero) data is written to the file. When a sparse file is read, the file system transparently converts the metadata representing empty blocks into “real” blocks filled with null bytes at runtime. The application is unaware of this conversion because it happens at the file system level. A sparse file need not consist entirely of null data; individual empty sections of a file can also be flagged as sparse. Those sections follow the same mechanism, but on a smaller scale.
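The mechanism above can be sketched in Python. This is a minimal illustration, assuming a file system that supports sparse files (on one that does not, the code still runs, but the skipped region is simply stored as zeros). The function names `create_sparse_file` and `read_range` are hypothetical helpers introduced for this example.

```python
import os

def create_sparse_file(path, hole_size, payload):
    """Write `payload` after skipping `hole_size` bytes; return the logical size."""
    with open(path, "wb") as f:
        f.seek(hole_size)   # move past the end without writing anything
        f.write(payload)    # only this region needs real data blocks
    return os.path.getsize(path)

def read_range(path, offset, length):
    """Read `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Reading from the skipped region returns null bytes, just as the article describes: the application sees ordinary zero-filled data and cannot tell whether it came from disk blocks or from metadata.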
Advantages of sparse files:
- A large amount of storage space can be allocated without physically writing any sectors, which allows for faster file creation.
- Allocation occurs only when non-empty data is written, so disk space is saved.
- Since the logical size of a sparse file is larger than its allocated space, more data can be read from the file than is physically stored on disk.
- If initialization requires writing all zeros to the file, no actual allocation occurs, which prevents unnecessary disk reads and writes.
- On files which aren’t completely sparse, the time of the first write is reduced, as the system doesn’t have to allocate blocks for “skipped” space.
- In certain scenarios, sparse files are a better choice than file compression.
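The difference between logical size and allocated space mentioned above can be inspected from Python. The sketch below assumes a Linux or other Unix-like system where `os.stat` exposes `st_blocks` (counted in 512-byte units); it is not available on Windows. `size_report` is a hypothetical helper name.

```python
import os

def size_report(path):
    """Return (logical_size, allocated_bytes) for `path`."""
    st = os.stat(path)
    logical = st.st_size            # size the application sees
    allocated = st.st_blocks * 512  # st_blocks counts 512-byte units
    return logical, allocated
```

On a file system that supports sparse files, a mostly empty file would typically report an allocated figure far smaller than its logical size. The exact numbers depend on the file system and its block size.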
Disadvantages of sparse files:
- Most file copy operations destroy the sparse properties of the file: the sparse regions are explicitly allocated on disk in the copy.
- Since the logical size of a file can be greater than its allocated size, file system free space reports may not be accurate.
- Several applications do not work efficiently with sparse files.
- Sparse files may become fragmented over time as valid data is written.
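The first disadvantage can be avoided by a copy routine that skips the empty regions instead of writing them out. The following is one possible sketch, not a standard tool: it assumes a Linux-like system where `os.SEEK_DATA` and `os.SEEK_HOLE` are available, and `copy_sparse` is a hypothetical function name.

```python
import errno
import os

def copy_sparse(src_path, dst_path, chunk_size=1 << 20):
    """Copy src to dst, writing only the data regions so holes can stay sparse."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                # Find the next region that actually holds data.
                start = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # nothing but a hole remains
                    break
                raise
            # Find where that data region ends (the next hole or end of file).
            end = os.lseek(src, start, os.SEEK_HOLE)
            pos = start
            while pos < end:
                chunk = os.pread(src, min(chunk_size, end - pos), pos)
                if not chunk:
                    break
                pos += os.pwrite(dst, chunk, pos)
            offset = end
        # Restore the logical length in case the file ends with a hole.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)
```

The copy has the same logical contents as the source. Whether the skipped regions actually remain unallocated depends on the destination file system.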