File Models in Distributed Systems
This article covers the concept of file models in distributed systems. In a Distributed File System (DFS), multiple machines cooperate to provide the file system’s facilities. Different file systems often employ different conceptual models of a file. The two criteria most commonly used to model files are structure and modifiability.
Each criterion yields two file models:
- Unstructured and Structured Files
- Mutable and Immutable Files
Based on the structure criteria, file models are of two types:
1. Unstructured Files: This is the simplest and most commonly used model. In the unstructured model, a file is an unstructured sequence of data with no substructure known to the file system. The file system treats the contents of every file as an uninterpreted sequence of bytes, and the meaning of those bytes is left entirely to the applications that use the file. Operating systems such as UNIX and DOS use this model. Most modern operating systems prefer the unstructured model to the structured one because it makes files easy to share between applications: since the file system imposes no structure, different applications can interpret the contents of the same file in different ways.
2. Structured Files: The structured file model is rarely used now. Here, the file system sees a file as an ordered sequence of records. Records of different files in the same file system may differ in size, so files can have different properties even though they belong to the same file system. A record is the smallest unit of data that can be retrieved, and read and write operations are performed on a set of records. Files also carry “File Attributes” that describe them. Each attribute consists of a name and a value, and the set of attributes depends on the file system used. Typical attributes include file size, file owner, access permissions, and the dates of creation, last modification, and last access. File attributes are normally maintained by the directory service, because they are subject to different access controls than the file contents they describe.
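The difference between the two models can be sketched by looking at the same stored bytes in both ways. The snippet below is only an illustration: the record layout (a 4-byte id followed by an 8-byte name field) is a hypothetical format chosen for the example, not one used by any particular file system.

```python
import struct

# Hypothetical record layout, assumed for illustration only:
# a 4-byte little-endian unsigned id followed by an 8-byte name field.
RECORD = struct.Struct("<I8s")

# What is actually stored is just this sequence of bytes.
raw = RECORD.pack(1, b"alice") + RECORD.pack(2, b"bob")

# Unstructured view: the application may read any byte range it likes
# and decides for itself what the bytes mean.
first_six_bytes = raw[:6]

# Structured view: the same bytes are read one whole record at a time.
records = [
    (rec_id, name.rstrip(b"\x00").decode())
    for rec_id, name in RECORD.iter_unpack(raw)
]

print(len(raw), records)
```

In the unstructured model the knowledge of the record layout lives in the application, which is why two applications can read the same file differently.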
Structured files are further divided into two types:
- Files with Non-Indexed records: Records are retrieved by specifying their position in the file, for example the third record from the beginning or the third record from the end.
- Files with Indexed records: Each record has one or more key fields, and a record can be addressed by specifying the value of a key. To locate records quickly, the file is maintained as a B-tree, a hash table, or another suitable data structure.
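The two access styles can be sketched as follows. This is a minimal in-memory illustration, assuming records are held in a Python list and the index is a dictionary (a hash table); the function names and sample records are hypothetical.

```python
# Records of a hypothetical structured file, in stored order.
records = [
    {"id": 17, "name": "alice"},
    {"id": 42, "name": "bob"},
    {"id": 8, "name": "carol"},
    {"id": 23, "name": "dave"},
]


def read_by_position(recs, n, from_end=False):
    """Non-indexed access: return the n-th record (1-based) from either end."""
    return recs[-n] if from_end else recs[n - 1]


def build_index(recs, key_field):
    """Indexed access: map each key value to the position of its record."""
    return {rec[key_field]: pos for pos, rec in enumerate(recs)}


def read_by_key(recs, index, key):
    """Look up a record through the index instead of scanning the file."""
    return recs[index[key]]


third_from_start = read_by_position(records, 3)
third_from_end = read_by_position(records, 3, from_end=True)

id_index = build_index(records, "id")
bob = read_by_key(records, id_index, 42)
```

A real file system would keep the index on disk, typically as a B-tree so that range scans stay efficient, but the lookup idea is the same.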
Based on the modifiability criteria, file models are of two types:
3. Mutable Files: Most existing operating systems use the mutable file model. Updating a file overwrites its existing contents with the new contents. Because every update is applied to the same file, a file is represented as a single stored sequence that each update alters.
4. Immutable Files: The Cedar File System (CFS) uses the immutable file model. In this model, a file cannot be changed once it has been created; it can only be deleted. File updates are implemented by creating multiple versions of the same file: every update creates a new version rather than modifying the old one. Because only immutable files are shared, sharing is consistent. This also makes caching and replication easier to support in a distributed system, since it removes the problem of keeping multiple copies of a file consistent. The drawbacks of the immutable file model are increased space utilization and increased disk allocation activity. CFS uses a “Keep” parameter to record the number of most recent versions of a file to retain. When the value of the parameter is 1, creating a new version causes the existing version to be deleted and its disk space reused for the new one. When the value is greater than 1, multiple versions of a file exist. A specific version can be accessed by giving its full name, which includes the version number. If the version number is omitted, CFS uses the lowest version number for some operations, such as “delete”, and the highest version number for others, such as “open”.
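The versioning behavior described above can be sketched with a toy model. This is not the CFS interface: the class `VersionedFile`, its method names, and the way versions are numbered are assumptions made for illustration, following only the rules stated in the text (a keep count, lowest version for delete, highest version for open).

```python
class VersionedFile:
    """Toy model of an immutable file with numbered versions (not the real CFS API)."""

    def __init__(self, keep=1):
        if keep < 1:
            raise ValueError("keep must be at least 1")
        self.keep = keep
        self.versions = {}       # version number -> immutable bytes
        self.next_version = 1

    def write(self, data):
        """An update never modifies old contents; it adds a new version."""
        version = self.next_version
        self.versions[version] = bytes(data)
        self.next_version += 1
        # Retain only the `keep` most recent versions; older ones are reclaimed.
        while len(self.versions) > self.keep:
            del self.versions[min(self.versions)]
        return version

    def open(self, version=None):
        """Without an explicit version, read the highest one."""
        if version is None:
            version = max(self.versions)
        return self.versions[version]

    def delete(self, version=None):
        """Without an explicit version, delete the lowest one."""
        if version is None:
            version = min(self.versions)
        del self.versions[version]


f = VersionedFile(keep=2)
f.write(b"draft 1")
f.write(b"draft 2")
f.write(b"draft 3")           # version 1 is dropped because keep=2
remaining = sorted(f.versions)
latest = f.open()
f.delete()                    # removes version 2, the lowest remaining
```

With `keep=1` the same class behaves like a file that is replaced on every update, at the cost of allocating fresh storage each time, which is the extra disk allocation activity mentioned above.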