File Organization in COBOL

A record-based COBOL file is a collection of records, and file organization deals with how those records are stored on the backing storage device. The way records are organized on the device matters because it determines how records can be accessed and how long each access takes.

COBOL provides three types of file organization:

  • Sequential File Organization
  • Indexed Sequential File Organization
  • Relative File Organization

Sequential File Organization:

Sequential organization is the simplest method of organizing records on a disk. Records are stored one after another in a serial manner. Irrespective of the type of storage device, these files are processed serially, so accessing any record requires accessing all the records before it. Reading starts from the very first record and proceeds sequentially through each record until the required record or the end of file (EOF) is reached.

  • Every record is a collection of fields. A sequential file can be ordered on a particular field called the “key” field, or it can be unordered. For example, a file storing student records can be ordered on the “StudentID” field so that “StudentID” values are always sorted, or it can be kept without any order.
  • Inserting records into an unordered file is simple: the record is appended to the end of the file. It is not that trivial for an ordered file, because the order must be maintained after insertion. In that case a new file has to be created that contains the original records and the new records in the correct order.
  • Deleting a record from a sequential file, be it ordered or unordered, requires rewriting the whole file into a new file without the record being deleted. Records to be deleted first need to be identified in the file. Record matching is done using key fields – values of the key fields for the records to be deleted are provided, which are matched against key fields of records in the file.
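  • Updating a record means modifying the values of its fields; changing the length of a field or adding a new field is not allowed. Updating a record in a sequential file, be it ordered or unordered, is traditionally done by rewriting the whole file into a new file with the record values updated. For files on disk, COBOL also allows the record that was just read to be rewritten in place (REWRITE with the file opened in I-O mode), provided the record length does not change.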
  • Updating or deleting records in sequential files is quite expensive both in terms of computation, as it requires data copy, and storage, as it requires extra space for the new files.
  • Sequential organization is a good fit when the file is always going to be processed sequentially. If records need to be accessed directly (randomly), sequential organization should be avoided.

Syntax:

INPUT-OUTPUT SECTION.
FILE-CONTROL.
  SELECT file-name ASSIGN TO dd-name-jcl
  ORGANIZATION IS SEQUENTIAL.
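
As a minimal sketch of how this SELECT entry is used, the fixed-format program below creates a small sequential file, appends one more record with OPEN EXTEND, and then reads every record serially until the end of file. The program name SEQREAD, the file name "students.dat", and the record layout are hypothetical and chosen only for illustration. The sketch also assumes a compiler that accepts a literal file name in ASSIGN TO (GnuCOBOL does); on a mainframe the ASSIGN name would instead refer to a DD statement in the JCL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEQREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "students.dat" is a hypothetical file name for this sketch.
           SELECT STUDENT-FILE ASSIGN TO "students.dat"
               ORGANIZATION IS SEQUENTIAL
               ACCESS MODE IS SEQUENTIAL
               FILE STATUS IS WS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  STUDENT-FILE.
       01  STUDENT-REC.
           05  STUDENT-ID      PIC 9(3).
           05  STUDENT-NAME    PIC X(10).
       WORKING-STORAGE SECTION.
       01  WS-STATUS           PIC XX.
       01  WS-EOF              PIC X VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Create the file with two records.
           OPEN OUTPUT STUDENT-FILE
           MOVE 101 TO STUDENT-ID
           MOVE "ASHA" TO STUDENT-NAME
           WRITE STUDENT-REC
           MOVE 102 TO STUDENT-ID
           MOVE "RAVI" TO STUDENT-NAME
           WRITE STUDENT-REC
           CLOSE STUDENT-FILE
      * EXTEND positions after the last record, so this appends.
           OPEN EXTEND STUDENT-FILE
           MOVE 103 TO STUDENT-ID
           MOVE "MEENA" TO STUDENT-NAME
           WRITE STUDENT-REC
           CLOSE STUDENT-FILE
      * Read serially from the first record until end of file.
           OPEN INPUT STUDENT-FILE
           PERFORM UNTIL WS-EOF = "Y"
               READ STUDENT-FILE
                   AT END
                       MOVE "Y" TO WS-EOF
                   NOT AT END
                       DISPLAY STUDENT-ID " " STUDENT-NAME
               END-READ
           END-PERFORM
           CLOSE STUDENT-FILE
           STOP RUN.
```

There is no way to jump straight to StudentID 103 here: the READ statement always delivers the next record in the file, which is exactly the serial behaviour described above.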

Pros and cons of Sequential file organization:

  • These files are slow when the majority of records in the file are unaffected by an operation, as reading almost all the records to operate on a few records is not worth the resources required.
  • Update/deletion of records takes up double storage temporarily.
  • When most of the records in the file are going to be affected by an operation, operating on these files is faster compared to other types of organization, as there are no indexes to be traversed or record location to be calculated. Moreover, since all records are stored in a contiguous manner, the system works with records in batches, reducing the number of disk accesses required.
  • Though this file organization uses extra space while deleting/updating records, that usage is temporary, whereas the other file organizations take up more storage on a permanent basis. This organization does not use indexes, and it actually recovers the space of deleted records.
  • Apart from disks, sequential files can also be stored on cheaper storage devices like magnetic tapes.
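
The temporary doubling of storage comes from the copy-based delete: every record except the one being removed is copied into a second file. The sketch below illustrates the idea under some assumptions: the program name SEQDEL, the file names "students.dat" and "students.new", the record layout, and the key value 102 are all hypothetical, and a compiler that accepts literal file names in ASSIGN TO (such as GnuCOBOL) is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SEQDEL.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * Both file names are hypothetical.
           SELECT OLD-FILE ASSIGN TO "students.dat"
               ORGANIZATION IS SEQUENTIAL.
           SELECT NEW-FILE ASSIGN TO "students.new"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  OLD-FILE.
       01  OLD-REC.
           05  OLD-ID          PIC 9(3).
           05  OLD-NAME        PIC X(10).
       FD  NEW-FILE.
       01  NEW-REC             PIC X(13).
       WORKING-STORAGE SECTION.
       01  WS-EOF              PIC X VALUE "N".
       01  WS-DELETE-ID        PIC 9(3) VALUE 102.
       01  WS-KEPT             PIC 9(3) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM CREATE-OLD-FILE
           OPEN INPUT OLD-FILE
           OPEN OUTPUT NEW-FILE
      * Copy every record whose key does not match the one to delete.
           PERFORM UNTIL WS-EOF = "Y"
               READ OLD-FILE
                   AT END
                       MOVE "Y" TO WS-EOF
                   NOT AT END
                       IF OLD-ID NOT = WS-DELETE-ID
                           WRITE NEW-REC FROM OLD-REC
                           ADD 1 TO WS-KEPT
                       END-IF
               END-READ
           END-PERFORM
           CLOSE OLD-FILE
           CLOSE NEW-FILE
           DISPLAY "RECORDS KEPT: " WS-KEPT
           STOP RUN.
       CREATE-OLD-FILE.
      * Sample data so that the sketch is self-contained.
           OPEN OUTPUT OLD-FILE
           MOVE 101 TO OLD-ID
           MOVE "ASHA" TO OLD-NAME
           WRITE OLD-REC
           MOVE 102 TO OLD-ID
           MOVE "RAVI" TO OLD-NAME
           WRITE OLD-REC
           MOVE 103 TO OLD-ID
           MOVE "MEENA" TO OLD-NAME
           WRITE OLD-REC
           CLOSE OLD-FILE.
```

Until the old file is removed and the new one takes its place (a step left to the operating system or the job's JCL), both copies occupy storage at the same time.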

Indexed Sequential File Organization:

Direct access to a record is not possible with sequential organization; indexed sequential file organization overcomes this limitation. An indexed file consists of two files: a data file and an index file. The data file is created just like a sequential file, but it can also be accessed randomly. The index file holds the value of each key field together with the address of the corresponding record on the backing storage device. Reading a record from an indexed file does not require reading all the previous records in the file. Instead, the given key value is searched for in the index file, and once it is found, the stored address is used to read the corresponding record directly.

  • The file system builds an index from data records based on the primary key. An index can also be built using other fields called “alternate keys”. An indexed file lets you access the records directly or sequentially using any of its keys.
  • Just like Sequential organization, files can be read sequentially with Indexed organization too, and the sequence is that of the key values.
  • Deleting or updating a record in an indexed file does not require creating a new file and copying data. However, indexed files cannot be stored on sequential storage devices like magnetic tapes; they can only be stored on disks.

Syntax:

INPUT-OUTPUT SECTION.
FILE-CONTROL.
  SELECT file-name ASSIGN TO dd-name-jcl
  ORGANIZATION IS INDEXED
  RECORD KEY IS primary-key
  ALTERNATE RECORD KEY IS rec-key.
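
The following sketch shows one way these clauses might be exercised: records are written out of key order, one record is fetched directly by its primary key and deleted in place, and the remaining records are then read sequentially in key order. The program name IDXDEMO, the file name "students.idx", the field names, and the sample data are hypothetical. It assumes a runtime with indexed-file support that can create the file on OPEN OUTPUT (for example, GnuCOBOL built with an ISAM handler); on a mainframe the ASSIGN name would instead be a DD name, and the VSAM data set would typically be defined before the program runs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "students.idx" is a hypothetical file name.
           SELECT STUDENT-FILE ASSIGN TO "students.idx"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS STUDENT-ID
               ALTERNATE RECORD KEY IS STUDENT-CITY
                   WITH DUPLICATES
               FILE STATUS IS WS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  STUDENT-FILE.
       01  STUDENT-REC.
           05  STUDENT-ID      PIC 9(3).
           05  STUDENT-NAME    PIC X(10).
           05  STUDENT-CITY    PIC X(10).
       WORKING-STORAGE SECTION.
       01  WS-STATUS           PIC XX.
       01  WS-EOF              PIC X VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Records are written out of key order on purpose.
           OPEN OUTPUT STUDENT-FILE
           MOVE 458 TO STUDENT-ID
           MOVE "RAVI" TO STUDENT-NAME
           MOVE "PUNE" TO STUDENT-CITY
           PERFORM ADD-STUDENT
           MOVE 101 TO STUDENT-ID
           MOVE "ASHA" TO STUDENT-NAME
           MOVE "PUNE" TO STUDENT-CITY
           PERFORM ADD-STUDENT
           MOVE 230 TO STUDENT-ID
           MOVE "MEENA" TO STUDENT-NAME
           MOVE "DELHI" TO STUDENT-CITY
           PERFORM ADD-STUDENT
           CLOSE STUDENT-FILE
           OPEN I-O STUDENT-FILE
      * Direct access: supply the primary key, then READ.
           MOVE 458 TO STUDENT-ID
           READ STUDENT-FILE KEY IS STUDENT-ID
               INVALID KEY DISPLAY "458 NOT FOUND"
               NOT INVALID KEY DISPLAY "FOUND " STUDENT-NAME
           END-READ
      * Delete in place; no second file is needed.
           DELETE STUDENT-FILE RECORD
               INVALID KEY DISPLAY "DELETE FAILED " WS-STATUS
           END-DELETE
      * Position at the lowest key, then read in key order.
           MOVE 0 TO STUDENT-ID
           START STUDENT-FILE KEY IS NOT LESS THAN STUDENT-ID
               INVALID KEY MOVE "Y" TO WS-EOF
           END-START
           PERFORM UNTIL WS-EOF = "Y"
               READ STUDENT-FILE NEXT RECORD
                   AT END MOVE "Y" TO WS-EOF
                   NOT AT END DISPLAY STUDENT-ID " " STUDENT-NAME
               END-READ
           END-PERFORM
           CLOSE STUDENT-FILE
           STOP RUN.
       ADD-STUDENT.
           WRITE STUDENT-REC
               INVALID KEY DISPLAY "WRITE FAILED " WS-STATUS
           END-WRITE.
```

ACCESS MODE IS DYNAMIC is used so that the same program can mix random reads (READ with a key) and sequential reads (READ NEXT). The WITH DUPLICATES phrase is what permits two students to share the same STUDENT-CITY value.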

Pros and cons of Indexed file organization:

  • Amongst the file organizations allowing direct access, these are the slowest, as accessing a record goes through a number of levels of the index and each of these requires disk access.
  • Updating and deleting records may require rebuilding indexes.
  • Indexes take up space on storage devices.
  • When a record is deleted, its space is not recovered fully until the indexes are rebuilt.
  • Indexed files can only be stored on devices supporting direct access, hence cannot be stored on magnetic tapes.
  • An indexed file can have more than one key, and keys can also be alphanumeric.
  • Alternate keys may contain duplicate values (when declared with the WITH DUPLICATES phrase); only the primary key must be unique.

Relative File Organization:

Relative file organization also allows direct/random access to stored records, but it does not use an index. Instead, the key value itself is converted to an actual disk address, so no search is needed to find the record. The value of such a key is called the “relative record number”. Records in a relative file are organized in ascending order of relative record number. Unlike indexed files, a relative file can have only one key, and it has to be numeric. Also, space is allocated for every record position from 1 up to the highest relative record number in use, even if not all of those positions have been populated yet.

A simple relative file organization may have a one-to-one correspondence between the key value and the record’s location on disk. For example, suppose a Student Info file is created with StudentID as the record key, with values ranging from 1 to 999. The record with StudentID=1 can then be placed at the first location on disk, StudentID=2 at the next location, and so on. To access the record with StudentID=458, the system can go directly to location 458 and read it. A different scheme might add a base value to the relative record number: with a base value of 2000, the first key value would be 2001 and the last 2999, and given such a key value we subtract 2000 from it to get the relative record number.

  • Insertion of a new record requires a relative record number, so that the system can find the corresponding disk location and write the record to it.
  • Updating a record also requires a relative record number, so that the system can find the corresponding disk location and rewrite the record.
  • Deletion of a record requires a relative record number, so that the system can find the corresponding disk location and mark the record as deleted. In relative file organization, the system does not really delete the record but just marks its slot as free. Hence a relative file is usually sparsely populated, and its space requirement is larger than that of the other types of organization.

Syntax:

INPUT-OUTPUT SECTION.
FILE-CONTROL.
  SELECT file-name ASSIGN TO dd-name-jcl
  ORGANIZATION IS RELATIVE
  RELATIVE KEY IS rec-key.
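
As a rough illustration, the sketch below stores two students using the base-value scheme described earlier (student number minus 2000 gives the relative record number), reads one of them directly, deletes it, and then shows that the slot is reported as empty. The program name RELDEMO, the file name "students.rel", the data names, and the sample values are hypothetical, and a compiler that accepts a literal file name in ASSIGN TO (such as GnuCOBOL) is assumed. Note that the relative key is declared in WORKING-STORAGE, not inside the record itself.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RELDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "students.rel" is a hypothetical file name.
           SELECT STUDENT-FILE ASSIGN TO "students.rel"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS WS-REL-KEY
               FILE STATUS IS WS-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  STUDENT-FILE.
       01  STUDENT-REC.
           05  STUDENT-ID      PIC 9(4).
           05  STUDENT-NAME    PIC X(10).
       WORKING-STORAGE SECTION.
       01  WS-STATUS           PIC XX.
       01  WS-REL-KEY          PIC 9(4).
       01  WS-BASE             PIC 9(4) VALUE 2000.
       01  WS-LOOKUP-ID        PIC 9(4).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Only slots 1 and 458 are filled; the slots between stay empty.
           OPEN OUTPUT STUDENT-FILE
           MOVE 2001 TO WS-LOOKUP-ID
           MOVE "ASHA" TO STUDENT-NAME
           PERFORM STORE-STUDENT
           MOVE 2458 TO WS-LOOKUP-ID
           MOVE "RAVI" TO STUDENT-NAME
           PERFORM STORE-STUDENT
           CLOSE STUDENT-FILE
           OPEN I-O STUDENT-FILE
      * Direct read of slot 458, with no search and no index.
           MOVE 2458 TO WS-LOOKUP-ID
           PERFORM FETCH-STUDENT
      * DELETE frees the slot named by the relative key.
           DELETE STUDENT-FILE RECORD
               INVALID KEY DISPLAY "DELETE FAILED " WS-STATUS
           END-DELETE
      * The same read now takes the INVALID KEY branch.
           PERFORM FETCH-STUDENT
           CLOSE STUDENT-FILE
           STOP RUN.
       STORE-STUDENT.
           MOVE WS-LOOKUP-ID TO STUDENT-ID
           SUBTRACT WS-BASE FROM WS-LOOKUP-ID GIVING WS-REL-KEY
           WRITE STUDENT-REC
               INVALID KEY DISPLAY "WRITE FAILED " WS-STATUS
           END-WRITE.
       FETCH-STUDENT.
           SUBTRACT WS-BASE FROM WS-LOOKUP-ID GIVING WS-REL-KEY
           READ STUDENT-FILE
               INVALID KEY
                   DISPLAY "SLOT " WS-REL-KEY " IS EMPTY"
               NOT INVALID KEY
                   DISPLAY "SLOT " WS-REL-KEY " HOLDS " STUDENT-NAME
           END-READ.
```

Writing to slot 458 right after slot 1 is what makes the file sparse: the record positions in between exist in the file but hold no data.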

Pros and cons of Relative file organization:

  • High storage space requirement, as empty record slots also take up space.
  • Space cannot be recovered from deleted records.
  • No more than one key can be used and the key must be numeric.
  • Relative files can only be stored on devices supporting direct access, hence cannot be stored on magnetic tapes.
  • Amongst the file organizations allowing direct access, these are the fastest.


Last Updated : 04 Mar, 2022