It’s no secret that we are generating far more information than we can possibly store. When you click an aging link and get that “404 file not found” message, you have likely tried to access information that has been vaporized to make room for something newer. And “aging” may mean just a few months old. Even with the insanely cheap cost of modern data storage, we just can’t keep everything. When you think about storing all this information on the scale of a decade or century, the problem is staggering.
The appeal of DNA as a storage medium is its extremely high density. Estimates of potential information density are on the order of an exabyte (a billion GB) per cubic millimeter, which is about a billion times denser than the most advanced technologies available today. In other words, a hectare of storage tapes stacked two meters high (20,000 m^3) could be reduced to 20 cubic centimeters, a volume smaller than an iPhone.
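The volume comparison is simple unit arithmetic, and a short sketch makes it easy to check. The numbers below are the estimates quoted in the text; the function and constant names are illustrative only, and the capacity figure assumes the full one exabyte per cubic millimeter could actually be reached.

```python
# Figures quoted in the text; names are illustrative, not from any library.
DENSITY_RATIO = 10**9         # DNA vs. current media ("a billion times")
BYTES_PER_MM3_DNA = 10**18    # one exabyte per cubic millimeter (estimate)

CM3_PER_M3 = 100**3           # 1 m^3 = 1,000,000 cm^3
MM3_PER_CM3 = 10**3           # 1 cm^3 = 1,000 mm^3


def dna_volume_cm3(tape_volume_m3: float, ratio: float = DENSITY_RATIO) -> float:
    """Volume of DNA (cm^3) holding the same data as the given tape volume."""
    return tape_volume_m3 * CM3_PER_M3 / ratio


def capacity_bytes(volume_cm3: float) -> float:
    """Bytes that fit in a DNA volume at the assumed estimated density."""
    return volume_cm3 * MM3_PER_CM3 * BYTES_PER_MM3_DNA


if __name__ == "__main__":
    volume = dna_volume_cm3(20_000)
    print(f"DNA volume: {volume} cm^3")
    print(f"Capacity at the estimated density: {capacity_bytes(volume):.1e} bytes")
```

Under these assumptions, 20,000 m^3 of tape shrinks to 20 cm^3 of DNA, which at the quoted density would hold on the order of 20,000 exabytes.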
Sounds great. But of course we are not there yet. The principal obstacles to using DNA memory devices are the accuracy and speed of information storage and retrieval. Cost is an issue now, but it will undoubtedly fall dramatically as technologies mature and scale. In biological systems, the error rates of replication range from 1 in 10,000 to 1 in 1,000,000,000[1]. The more accurate rates are achieved only through the use of elaborate and energetically expensive proofreading machinery. Readout is painfully slow, about 90 nucleotides per second in RNA transcription, corresponding to some 20 bytes per second[2]. There is also the problem of stability. DNA stored in an aqueous medium is subject to hydrolysis, depurination and oxidative deamination reactions that introduce errors or make it unreadable.
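The conversion from nucleotides to bytes can be sketched as follows. It assumes an idealized encoding in which each of the four bases carries log2(4) = 2 bits, with no redundancy for error correction, and a single molecule read serially. Both are simplifying assumptions, and the function names are hypothetical.

```python
import math

# Idealized assumption: four possible bases, so 2 bits per nucleotide.
BITS_PER_NUCLEOTIDE = math.log2(4)


def readout_bytes_per_second(nt_per_second: float) -> float:
    """Convert a readout rate in nucleotides/s to bytes/s."""
    return nt_per_second * BITS_PER_NUCLEOTIDE / 8


def hours_to_read(n_bytes: float, nt_per_second: float = 90.0) -> float:
    """Hours needed to read n_bytes serially at the given rate."""
    return n_bytes / readout_bytes_per_second(nt_per_second) / 3600


if __name__ == "__main__":
    rate = readout_bytes_per_second(90)
    print(f"Readout rate: {rate} bytes/s")
    print(f"Time to read 1 MB: {hours_to_read(1e6):.1f} hours")
```

At 90 nucleotides per second this gives 22.5 bytes per second, in line with the "some 20 bytes per second" figure, and a single megabyte would take roughly half a day to read at that rate. A practical scheme would likely store fewer than 2 bits per base, so this is an upper bound under the stated assumptions.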
Significant progress has been made in addressing these issues. A UK group has demonstrated the reading and writing of MB-scale files with 100% accuracy[3] in a scalable format. DNA stored in lyophilized form should be stable for decades, in glassy media perhaps for centuries.
Despite these advances, the principal limitation for storing information in DNA is likely to be read-write speed. Any method that relies on biological or chemical processes, such as hybridization or enzymatic sequencing, is going to be subject to diffusion rates as the limiting factor in information storage and retrieval. Diffusion is an incredibly slow process – about 15 orders of magnitude slower than the propagation of electrical signals. Perhaps advanced imaging methods, such as atomic force microscopy, could circumvent the need for solution-based information retrieval methods and speed this process up substantially – but probably not by anything like 15 logs.
Given this limitation, the economic impact of DNA storage devices is likely to be small in the short term. We will not be wearing DNA-enabled devices on our wrists or using them to control our driverless cars. But they could well have a significant role in storing huge sets of data for long periods of time. Information from gigantic particle physics projects has been mentioned as an example. 3D imaging of specimens in museums could be another. This type of storage, combined with advanced 3D printing technologies, could allow us to perfectly replicate any artifact or specimen in any museum.
Maybe some of the information stored this way will be retrieved in the future to solve some very big problems – how to resurrect a keystone species or enable teleportation technologies. The impact of DNA storage devices is not likely to be in preserving cat videos, but in retaining a full and complete description of the natural world at a given point in time. That could have a very big impact indeed.
Footnotes
[1] DNA Replication Fidelity
[2] RNA transcription rate
[3] Toward practical high-capacity low-maintenance storage of digital information in synthesised DNA