Researchers in Waterford have devised a new way of archiving data in DNA and then retrieving it again.

It is hoped the development could lead to new methods of storing huge amounts of data in genetic material over long periods.

Experts say the need for this is becoming increasingly great as more and more devices become internet connected.

We are currently producing 16 zettabytes (a zettabyte is one billion terabytes) of data a year globally.

That trend is only set to continue and it is predicted that by 2025 that figure will have risen to 160 zettabytes annually.

This quantity of data will require vast numbers of energy sapping and land occupying data centres, unless we can find higher capacity and more efficient methods of storage.

Scientists see DNA as one potential solution to this possible future crisis because the genetic material is capable of storing huge volumes of information - up to one zettabyte in one gram.

Director of Research at the Telecommunications Software and Systems Group (TSSG) at Waterford Institute of Technology Dr Sasitharan Balasubramaniam said: "When we look at DNA, that is like the software of the cell itself and actually holds the code of what the functionality of the cell will be."

"So when we take that we can then think of that as a storage medium for our own data as well. In that case, what we do is convert information into digital data, convert that into nucleotides and then use that then to store information."

Previous studies have successfully encoded data in DNA and then decoded it again.

Now, however, scientists at Waterford Institute of Technology have developed a method of using different types of bacteria to archive information longer term in DNA and then retrieve it.

Outlined in a paper published on the Cornell University Library's arXIV system, the storage and retrieval technique uses plasmids, or tiny circular rings of double-stranded DNA.

The researchers encode a simple message - in this case "Hello World" - into the plasmids and store them in a strain of the E. Coli bacteria called Novablue that is trapped in a specific location, which becomes the archive storage location.

Another form of E. Coli bacteria, HB101, which is mobile, is then released and travels to the Novablue.

Once it meets it, the plasmids containing the data are transferred from the Novablue to the HB101 through a connection process known as conjugation.

The HB101 then swims to a device capable of extracting the plasmids and reading the data they store.

The movement of the bacteria and the conjugation is controlled and made possible by the use and placement of two different antibiotics, tetracycline and streptomycin, within the archive storage and retrieval area.

Novablue is resistant to tetracycline, while HB101 is able to resist streptomycin.

In order to complete its swim across the archive storage area, the HB101 must therefore first conjugate with the Novablue in order to pick up its resistance to tetracycline.

In the process of doing this, the HB101 picks up the data storing plasmids from the Novablue.

There are a number of challenges to be overcome before this proof of principle could be used routinely and securely.


First, the process would have to be speeded up, as currently the transfer takes up to three days, although the scientists think this should be possible.

"Potentially DNA can be written down to seconds, there are instruments around a long-time that can do this," said Dr Lee Coffey, Lecturer in Molecular Biology at WIT.

"In terms of transfer of DNA between different bacteria and cells, that's happening with millions of billions of cells at one time per second in a reaction. So if we are talking about data transfer - getting cells to take DNA for us from A to B - at the moment it is a matter of how fast can they swim.

"But if we are talking about getting this down to microfluidic scale then we are talking about milliseconds for the transfer to happen."

The cost will also have to come down because in its current guise the technique is expensive.

Researchers will also have to find a way of stripping out errors that may arise during the encoding and decoding.

The also must find a stable and secure way of storing the bacteria, so that their integrity is not undermined and the security of the data compromised.

Work will also have to be focused on scaling up the amount of data that can be archived and retrieved quite considerably.

However, the scientists say if these issues can be overcome, the method opens the door to a new era of data storage and many other applications.

"Health is an obvious area we would go into," said Kevin Doolin, Director of Innovation at TSSG.

"You could imagine in years to come when a child is born you could start encoding information in them about their health records and it is there for their entire life. They would go to the doctor and it is all there at the prick of your finger.

"Things like food security and food traceability, you could encode specific information in DNA on a steak."