Data Storage Media and Digital Preservation: Today and The Future

It is now easier than ever to create digital content, but preserving that content remains a challenge. By 2025, it is estimated that the amount of data being stored globally from all sectors will reach a whopping 175 zettabytes. Mass migrations from obsolete storage formats will only add to the data storage tsunami, as there simply isn't enough storage media to manage the load, nor does the planet have enough raw materials to manufacture our current media ad infinitum.

Future technologies, and the actions that preservationists, archivists, and other stewards of digital content take today, will be crucial to stemming the tide going forward. Andrea Kalas, SVP of Asset Management at Paramount Pictures, sits down with Digital Bedrock founder and CEO Linda Tadic to discuss potential solutions, from R&D into alternative technologies still in their infancy, such as DNA storage, to hierarchical storage principles and appraisal policies that will help determine what content should be retained. The sooner we discuss and tackle these concerns, the better equipped we will be to face them head-on.

Join the Discussion

  • rob.hummel

    Big progress came in December of 2020, when Microsoft successfully recorded 5 bytes to DNA in just over 21 hours.

  • cinemaculture

    Linda Tadic’s article discussed the general need for denser and more durable storage media for archival data, and she reported on two particular technologies: “glass” storage (data permanently marked in silica crystal by femtosecond lasers) and DNA storage. Some very interesting progress on storing data in DNA has been reported recently. In this work, the researchers were able to add “letters” to DNA sequences (that is, to extend the AGTC alphabet), significantly increasing the ability to store data at a molecular level. The novel version of DNA is a variant of what we normally find in nature, and appears to be an important step toward a stable technology for massive data storage, and another advance toward replacing the more transient storage devices that currently support cloud storage. This nano storage model offers data density and longevity that exceed all present containers by orders of magnitude.

    Note that the primary article at the end of this list provides a rich bibliography of this research and cognate technologies germane to nano-storage of data.

    It seems like scientific breakthroughs take roughly fifty years to evolve into viable technologies such as those we adopt in the media industry. So where are we on the clock for molecular data storage?

  • cinemaculture

    A recent article from the BBC (9 October 2022) details some of the latest research on the use of DNA for mass data storage. It reviews the spectrum of developmental challenges to the adoption of DNA storage, including encoding methodologies; capacity; speed of reading and writing; accuracy of writing and error correction schemes; cost of writing, storage and reading; and storage and basic materials. The article reports on two previously under-reported issues. The first of these is the vulnerability of DNA data. While previous press releases have pointed to the extreme longevity of DNA in certain contexts, this article references degradation, gives a lower-end duration of 1,000 years, and also points to research on the specialized container strategies that will be necessary to protect DNA used for storage. Similarly, most of the earlier discussion has focused on the ubiquity of molecular DNA as a raw material, whereas the current BBC article uncovers some of the issues with obtaining a suitable DNA material substrate for mass data storage. The article provides a number of references for further reading, including a U.S. government data storage initiative, an organization devoted to mass data storage on DNA, a substantial 1200+ page bibliography of current research, and a few non-comprehensive references to competing storage technologies.

    The article:

    URLs in the article:

    The U.S. government project on mass data storage:

    The organization devoted to collating information regarding developments of DNA storage:

    The bibliography:

    Silica nano-structures:

    Non-DNA molecular data storage:

    https://pubs.acs.org/doi/abs/10.1021/acs.macromol.0c00666

  • cinemaculture

    Continuing the bibliography on data storage using DNA:

    https://www.nature.com/articles/s41586-022-05218-7

    https://www.nature.com/articles/s41563-021-01021-3

    https://www.nature.com/articles/s41467-021-23669-w

    https://www.nature.com/articles/s41467-020-16797-2