Storing Data In DNA
For the most part, we go about our daily lives unaware that we are information storage devices. We store all sorts of information in our brains. Some of this information is quite useful, but most of it could probably be deemed trivial in the big picture. But no matter what specific information we individually store in our brains, each and every one of us carries inside us the key information for creating life.
Inside each of us is DNA (deoxyribonucleic acid), molecules that contain genetic information used for the development and ongoing functioning of all life. To greatly oversimplify the miracle of DNA, it both stores information and reads genetic code. It’s kind of like a tiny computer in that way as it has both storage and processing capability.
Within DNA, genetic information is encoded as a sequence of four types of nucleotides, the chemical building blocks commonly abbreviated as A, C, G, and T. Through a complex process that utilizes these nucleotides as well as RNA (ribonucleic acid), DNA instructs our cells what to do. Cells without DNA would be as useless as computers without software.
We have an estimated 10 trillion cells in our bodies, each of which contains DNA. If you were to line up all of the DNA packed into your body’s cells, it would stretch from the Earth to the Sun and back 100 times, according to the National Human Genome Research Institute.
DNA is well-suited for biological information storage. But it also turns out that it may be useful for digital information storage. That’s what two scientists, Ewan Birney and Nick Goldman, at the European Bioinformatics Institute (EBI) demonstrated when they encoded and stored Shakespeare’s sonnets, an audio clip of Martin Luther King Jr.’s “I Have a Dream” speech, and a picture of their office on a strand of lab-synthesized DNA.
Birney and Goldman were not the first scientists to successfully store data in DNA. In 2012, a couple of scientists at Harvard’s Wyss Institute successfully encoded digital data in DNA at a density of about 700 terabytes per gram.
Today, we still tend to think of data storage in terms of gigabytes. For example, the MacBook Pro laptop I’m writing this column on has a 500GB hard-drive, which is half a terabyte. So, I’d need approximately 1,400 hard-drives like the one in my laptop to store that same amount of data.
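The arithmetic behind that drive count can be checked in a few lines of Python. This is only a sketch, and it assumes decimal units (1 terabyte = 1,000 gigabytes), which is how drive makers usually label capacity; the variable names are made up for illustration.

```python
# Assumption: decimal storage units, as typically printed on drive labels.
GB = 10**9
TB = 10**12

dna_capacity_bytes = 700 * TB   # the 700-terabyte figure cited for one gram of DNA
laptop_drive_bytes = 500 * GB   # the 500GB laptop drive

# How many laptop drives hold the same amount of data?
drives_needed = dna_capacity_bytes // laptop_drive_bytes
print(drives_needed)  # 1400
```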
Of course, there are hard-drives with capacity far greater than the one in my laptop. The problem is that we are creating so much digital data, at such a staggeringly exponential rate, that today’s hard-drives are not a sustainable medium for storing more and more data.
According to a recent study done by the University of Southern California, we’ve created and stored 295 billion gigabytes of data since 1986. The volume of digital data created is growing at a rate of about 60 percent a year. The most recent estimates conclude that there is a total of 1 trillion gigabytes of digital data in existence in the world today.
The European Bioinformatics Institute is contributing to that data creation and storage too. EBI maintains the world’s largest database of genetic information.
“The data we’re being asked to be guardians of is growing exponentially,” Goldman said in an NPR interview. “But our budgets are not growing exponentially.”
DNA is a promising solution to big data storage because it packs so much information into so little space. The trick is going to be perfecting the process of encoding, synthesizing, sequencing, and decoding data in and out of DNA storage. Also, the cost of lab-synthesized DNA needs to come down.
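To give a feel for what "encoding" and "decoding" mean here, the sketch below turns bytes into a string of nucleotide letters and back again. It is not the scheme the EBI team used. The two-bits-per-letter mapping and the function names are assumptions chosen for simplicity, and practical schemes add safeguards such as error correction and avoiding long runs of the same letter.

```python
# Hypothetical mapping: every pair of bits becomes one nucleotide letter.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T letters, four letters per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): turn A/C/G/T letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"DNA"
strand = encode(message)
print(strand)            # CACACATGCAAC
print(decode(strand))    # b'DNA'
```

In a lab, the letter string would be handed to a synthesizer to be built as real DNA, and a sequencer would later read the letters back so they could be decoded.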
According to Goldman, the estimated cost of synthesizing the DNA they used for their storage of data was $12,400 per megabyte. The lab that they worked with, Agilent Technologies, waived those costs for the experiment.
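To put that per-megabyte figure in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes decimal units (1 gigabyte = 1,000 megabytes) and that the cost scales linearly with the amount of data, which is a simplification.

```python
COST_PER_MB = 12_400   # dollars per megabyte, the estimate quoted by Goldman
MB_PER_GB = 1_000      # assumption: decimal units

cost_per_gb = COST_PER_MB * MB_PER_GB
print(f"${cost_per_gb:,} per gigabyte")        # $12,400,000 per gigabyte

# Filling the equivalent of a 500GB laptop drive at that price:
laptop_drive_cost = cost_per_gb * 500
print(f"${laptop_drive_cost:,} for 500GB")     # $6,200,000,000 for 500GB
```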
But the cost of lab-synthesized DNA has been dropping each year. Goldman and Birney estimate that in about a decade it may be more cost-effective to archive large volumes of data in DNA than in a warehouse full of computer hard-drives.
In addition to being able to store large volumes of data in a small amount of DNA, the other advantage of DNA is non-volatility. DNA lasts a long, long time: tens of thousands of years. For example, we’ve recovered DNA from Neanderthals and woolly mammoths.
“And that’s not even a carefully controlled sample,” Goldman points out. “That’s just a mammoth that laid down and died somewhere cold.” DNA that was stored in a carefully controlled environment would potentially last much longer, said Goldman.
One day, we could advance this technique of encoding, sequencing, and storing information in DNA to the point of being able to encode and store every bit—literally every “bit”—of information that we’ve amassed about life, the universe, and everything.
We could then send that information off into our galaxy, out into deep space and the infinite universe. Maybe we’ll have a target planet out there, one that we’ve determined can support life, and we create a delivery mechanism for creating life on that target planet.
Or maybe this was already done long ago by some ancient civilization that evolved and advanced their technology to this point long before life suddenly and miraculously appeared on planet Earth.
Scott Dewing is a technologist, teacher, and writer. He lives with his family on a low-tech farm in the State of Jefferson. Archives of his columns and other postings can be found on his blog at: blog.insidethebox.org