Digital Data Storage in DNA

We all have storage devices such as SSDs, flash drives and SD cards of various capacities, and sooner or later they all fill up. That may become less of a worry: scientists at the New York Genome Center have come up with a new way to encode digital data in DNA, creating the highest-density large-scale data storage scheme devised so far. It is capable of storing 215 petabytes (215 million gigabytes) in a single gram of DNA, which means that, in principle, all the data ever recorded by humans could fit in a container about the size and weight of a couple of pickup trucks. Data stored this way could potentially last for hundreds of thousands of years. Isn’t that cool?

How did they do it?
The DNA in our cells contains the instructions for building all the proteins that keep us running. DNA is made up of sequences of four nucleobases, adenine, guanine, cytosine and thymine (A, G, C and T), often simply called bases. On the double helix the bases pair up (A with T, G with C), which is where the term "base pair" comes from. Each sequence of three bases, called a codon, translates to an amino acid, and amino acids are the building blocks of proteins. It’s data storage just like what we do with hard drives, but with a much higher potential density.
The four-letter nucleobase alphabet of DNA (A, C, G and T) can be mapped to binary code, for example 00 for A, 01 for C, 10 for G and 11 for T. The scientists examined the algorithms used to encode and decode the data. They first converted the files into binary strings of 1s and 0s, compressed them into one master file, and then split the data into short strings of binary code. They devised an algorithm called DNA Fountain, which randomly packaged the strings into so-called droplets and added extra tags to each droplet so that the file could be put back together.
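
The two-bits-per-base mapping mentioned above can be sketched in a few lines of Python. This is only an illustration of the example mapping (00 for A, 01 for C, 10 for G, 11 for T); the function names `bytes_to_dna` and `dna_to_bytes` are made up for this sketch and are not part of the researchers' software.

```python
# Example mapping from the text: 00 -> A, 01 -> C, 10 -> G, 11 -> T
BASE_FOR_BITS = {"00": "A", "01": "C", "10": "G", "11": "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}


def bytes_to_dna(data: bytes) -> str:
    """Turn each byte into 8 bits, then each pair of bits into one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BASE_FOR_BITS[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: each base becomes 2 bits, each 8 bits one byte."""
    bits = "".join(BITS_FOR_BASE[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(bytes_to_dna(b"Hi"))        # CAGACGGC
print(dna_to_bytes("CAGACGGC"))   # b'Hi'
```

With this mapping every byte needs four bases, so a 200-base strand could hold at most 50 bytes before any tags are added.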

They started with six files, including a full computer operating system and a computer virus. In all, the researchers generated a digital list of 72,000 DNA strands, each 200 bases long. They sent these sequences as text files to be synthesized into DNA. Later, the DNA was sequenced and the sequences were fed into a computer, which translated the genetic code back into binary and used the tags to reassemble the six original files. The approach worked so well that the recovered files contained no errors, and the researchers were also able to make a virtually unlimited number of error-free copies of their files.
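
To give a feel for the droplets and tags described above, here is a toy fountain-code sketch in Python. It is an assumption-laden simplification, not the researchers' DNA Fountain implementation: the degree distribution, the chunk size and all function names (`split_chunks`, `make_droplet`, `decode`, `roundtrip`) are invented for illustration. Each droplet is the XOR of a few randomly chosen chunks, and its tag is the seed that tells the decoder which chunks were combined.

```python
import random


def split_chunks(data: bytes, size: int) -> list:
    """Split data into fixed-size chunks, zero-padding the last one."""
    padded = data + b"\x00" * (-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def chunk_indices(seed: int, k: int) -> list:
    """Recompute which chunks a droplet covers from its seed (the tag)."""
    rng = random.Random(seed)
    degree = rng.choice([1, 1, 2, 2, 3, 4])  # toy degree distribution (assumption)
    return rng.sample(range(k), min(degree, k))


def make_droplet(seed: int, chunks: list) -> tuple:
    """A droplet is (seed, XOR of the chunks selected by that seed)."""
    payload = bytes(len(chunks[0]))
    for i in chunk_indices(seed, len(chunks)):
        payload = xor_bytes(payload, chunks[i])
    return seed, payload


def decode(droplets: list, k: int):
    """Peeling decoder: resolve droplets that cover exactly one unknown chunk.

    Returns the list of k chunks, or None if more droplets are needed.
    """
    pending = [(chunk_indices(seed, k), payload) for seed, payload in droplets]
    known = {}
    progress = True
    while progress and len(known) < k:
        progress = False
        for idxs, payload in pending:
            unknown = [i for i in idxs if i not in known]
            if len(unknown) == 1:
                for i in idxs:
                    if i in known:
                        payload = xor_bytes(payload, known[i])
                known[unknown[0]] = payload
                progress = True
    return [known[i] for i in range(k)] if len(known) == k else None


def roundtrip(data: bytes, size: int = 4, max_droplets: int = 10_000) -> bytes:
    """Keep producing droplets until the decoder can rebuild the data."""
    chunks = split_chunks(data, size)
    droplets = []
    for seed in range(max_droplets):
        droplets.append(make_droplet(seed, chunks))
        decoded = decode(droplets, len(chunks))
        if decoded is not None:
            return b"".join(decoded)[:len(data)]  # drop the zero padding
    raise RuntimeError("not enough droplets to decode")


print(roundtrip(b"hello DNA storage!"))
```

The point of the design is that no single droplet is essential: the decoder only needs enough droplets, in any order, to peel out every chunk. In a real system the original length would also have to be stored rather than passed in, as it is here.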



Advantages:

Storage Limits

Estimates based on bacterial genetics suggest that digital DNA could one day rival or exceed today’s storage technology.

                                    Hard Disk         Flash Memory     Bacterial DNA
Read-write speed (µs per bit)       ~3,000 – 5,000    ~100             <100
Data retention (years)              >10               >10              >100
Power usage (watts per gigabyte)    ~0.04             ~0.01 – 0.04     <10^-10
Data density (bits per cm^3)        ~10^13            ~10^16           ~10^19

DNA has many advantages for storing digital data.

  • It is ultracompact.
  • It can last hundreds of thousands of years if kept in a cool, dry place.
  • As long as human societies are reading and writing DNA, they will be able to decode it.
  • DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete.

Disadvantages:

  • High cost.
  • DNA is significantly harder and slower to read and write than conventional electronic storage, i.e., in terms of access speed it is far less RAM-like than an average SSD or spinning magnetic hard drive.

This article is contributed by Aakash Pal.
