Pentagon wants to store data in ‘DNA’ to safeguard the information it holds on US citizens

Pentagon wants to store its sensitive data on DNA by 2028 in a bid to safeguard ‘limitless’ amounts of private information

DNA storage has the potential to hold ‘limitless’ amounts of data for millions of years.

Now, the Pentagon says it wants to use the technology to safeguard the vast quantities of sensitive information it holds on citizens.

Intelligence officials have announced they are looking for a research team to help them develop a new storage system based around the structure of our genes.

The group would have to develop an archive capable of processing information ‘at the exabyte scale’ within 10 years, according to the announcement.

One exabyte of data is around four million times the storage capacity of the top iPhone X model.

The Pentagon plans to store data on DNA in a bid to safeguard the vast quantities of information it holds on US citizens. Officials have announced they are looking for a research team to help them develop a new system based around the structure of our genes (stock)

DNA has a number of advantages over traditional methods for storing digital data.

‘DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete,’ Yaniv Erlich, a computer scientist at Columbia University, told Science.

New technologies can write and read large amounts of DNA at a time, allowing it to be scaled up – something that the Pentagon is hoping to exploit.

Experimental US research agency the Intelligence Advanced Research Projects Activity (Iarpa), based in Washington, D.C., said it will fund a four-year scheme.

During this period, a research team is expected to build a tabletop device that writes data onto ‘sequence-controlled polymers’ – synthetic molecules structured like DNA.

The group will also develop a device that reads this data once it’s stored, as well as an operating system that allows officials to search and access the information.

The announcement, posted to the Federal Business Opportunities website, said the system must write one terabyte and read ten terabytes per day by the project’s end.

A single terabyte is equal to 1,000 gigabytes, and can store around 250 million pages of plain text.

The system must also ‘present a clear and commercially viable path to future deployment at the exabyte scale’ within ten years, according to Iarpa.

The group would have to develop an archive capable of processing information ‘at the exabyte scale’ within 10 years, according to the announcement. One exabyte of data is around four million times larger than the memory capacity of the top iPhone X model (stock image)

One exabyte is equal to one billion gigabytes, and could hold all of the printed material at the Library of Congress a hundred thousand times.
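The size comparisons above can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes, as the article states) and assumes the ‘top iPhone X model’ is the 256-gigabyte version; the constant and function names are illustrative only.

```python
# Decimal (SI) byte units, matching the article's "1 terabyte = 1,000 gigabytes".
GIGABYTE = 10**9
TERABYTE = 10**12
EXABYTE = 10**18

# Assumption: the "top iPhone X model" is the 256 GB version.
IPHONE_X_TOP = 256 * GIGABYTE

# Figure from the article: about 250 million pages of plain text per terabyte.
PAGES_PER_TERABYTE = 250_000_000


def gigabytes_per_exabyte() -> int:
    """How many gigabytes fit in one exabyte."""
    return EXABYTE // GIGABYTE


def iphones_per_exabyte() -> float:
    """How many top-end iPhone X handsets one exabyte would fill."""
    return EXABYTE / IPHONE_X_TOP


def bytes_per_page() -> float:
    """Page size implied by the article's pages-per-terabyte figure."""
    return TERABYTE / PAGES_PER_TERABYTE


if __name__ == "__main__":
    print(f"{gigabytes_per_exabyte():,} gigabytes in an exabyte")
    print(f"{iphones_per_exabyte():,.0f} top-end iPhone X handsets per exabyte")
    print(f"{bytes_per_page():,.0f} bytes per page of plain text")
```

Under these assumptions an exabyte works out at roughly 3.9 million handsets, in line with the ‘around four million’ figure, and the terabyte comparison implies about 4,000 bytes per page of text.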

Allegedly, five exabytes of data could store ‘all words ever spoken by human beings’, according to Caltech researcher Roy Williams.

Humans generate a glut of data every day, and US intelligence officials are looking for more efficient ways to store what they hold on members of the public.

Computer software firm Domo estimates we will each generate 140 gigabytes of data a day by 2020, a figure that is only expected to grow in the years that follow.

Modern exabyte-scale data archives cost billions to operate each year and take up large areas of land, which Iarpa says must improve over the next decade.

HOW DO RESEARCHERS STORE DATA ON SYNTHETIC DNA?

A number of research teams have developed methods to store huge reams of data on small, synthetic DNA molecules.

It is hoped that this technique could make giant data-storage facilities a thing of the past, as DNA and similar polymers can store information more than 100,000 times more efficiently than traditional technology.

Researchers begin with a method that converts the long strings of ones and zeros in digital data into the four basic blocks of DNA sequences – adenine, guanine, cytosine and thymine.

The digital data is then chopped into pieces and stored by synthesising a massive number of synthetic DNA molecules.

These can be dehydrated and preserved for a long period of time. 

To retrieve the files, they use modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary. 

‘This resource intensive model does not offer a tractable path to scaling beyond the exabyte regime in the future,’ the agency wrote of today’s exabyte-scale archives.

‘Faced with exponential data growth, large data consumers may soon face a choice between investing exponentially more resources in storage or discarding an exponentially increasing fraction of data.’

Iarpa said it aims to develop a new system ‘with reduced physical footprint, power and cost requirements relative to conventional storage technologies.’

In February, the agency outlined plans for an exabyte-scale storage facility that fits into a single room and costs less than £0.75 million ($1 million) per year to operate.

While scientists are yet to build a system that comes even close to this scale, research into DNA data storage has accelerated over the past five years.

The archive systems take inspiration from our genes, storing information in ‘blocks’ on synthetic DNA molecules that are synthesised in a lab.

Researchers convert the reams of ones and zeros in digital data into the four basic blocks of DNA sequences – adenine, guanine, cytosine and thymine.

The digital data is then chopped into pieces and stored by synthesising a massive number of synthetic DNA molecules.

These can be dehydrated and preserved for a long period of time at relatively low cost compared to the energy-intensive data centres of today.

To retrieve the files, scientists use modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary. 
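The write-and-read cycle described above can be illustrated with a toy example. This is a minimal sketch, not the method used by any particular research team: it assumes a simple mapping of two bits to each base (00 to A, 01 to C, 10 to G, 11 to T) and keeps each piece's position as a plain number alongside it. Real schemes add error correction and store the index inside the strand itself. All function names here are hypothetical.

```python
# Assumed mapping: each pair of bits becomes one base (00=A, 01=C, 10=G, 11=T).
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Convert binary data into a string of bases, four bases per byte."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)  # most significant bit pair first
    )


def dna_to_bytes(strand: str) -> bytes:
    """Translate a string of bases back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of four bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def chop(strand: str, size: int) -> list[tuple[int, str]]:
    """Split a long strand into short, numbered pieces for synthesis."""
    return [(i // size, strand[i:i + size]) for i in range(0, len(strand), size)]


def reassemble(pieces: list[tuple[int, str]]) -> str:
    """Put sequenced pieces back in order, whatever order they were read in."""
    return "".join(fragment for _, fragment in sorted(pieces))


if __name__ == "__main__":
    strand = bytes_to_dna(b"Hi")
    print(strand)  # CAGACGGC
    pieces = chop(strand, 4)
    pieces.reverse()  # sequencing does not return pieces in order
    print(dna_to_bytes(reassemble(pieces)))  # b'Hi'
```

The numbering matters because the pieces are stored as a jumble of separate molecules, so the reader needs some way to restore the original order before translating back to binary.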
