Technical Standards and Conventions
A Work in Progress
This is not a definitive list, and probably never will be. As digital preservation evolves, so too will the tools associated with the work. We encourage others to suggest standards, processes, technologies, approaches, organizations, and other resources that support industry adoption of established, shared approaches to digital preservation.
This document lists some of the technical standards and conventions that are of particular importance to moving-image archivists working with digital assets, and specifically to those archivists working in the entertainment industry. Photochemical film preservation relied on the standards created for motion picture film production to migrate images stored on nitrate stock onto acetate or polyester stock. Tape formats were also used when migrating images and audio from older formats. This approach to photochemical preservation can help inform digital preservation.
As we move from preserving a “carrier” such as film stock or videotape to preserving files, deciding how to approach preservation involves reviewing standards and technologies to evaluate how they can support digital preservation initiatives. Many of the standards and conventions listed in this document can be used as guidelines for naming files, grouping files, correctly identifying files, and recognizing the long-term preservation benefits of different file formats. There is no standard per se for how to preserve digitally. The closest is the trusted digital repository standard, which certifies an organization and its approaches to digital preservation. This document lists some of these guidelines but is not comprehensive.
Motion picture film, analog recording tape, paper or digital images, digital sound, and digital metadata files: whatever the choice of “essence carriers and containers,” they are all built using some form of technology, and technologies only become widely and sustainably adopted when they are sufficiently specified, standardized, and, of course, useful and practical.
Technical sustainability is especially important for long-term digital preservation to minimize — and counter — the threat of technological obsolescence.
Technical conventions and specifications that achieve wide acceptance may be considered standards. A benefit of standards is interoperability and interchangeability across products, systems, and manufacturers. Some standards are developed, or adopted, by formal, independent standards bodies operating under so-called due-process procedures; these include organizations oriented toward specific fields or industries, such as the Society of Motion Picture and Television Engineers (SMPTE), and national standards institutes, such as the Deutsches Institut für Normung (DIN). Several standards-development bodies, such as the International Organization for Standardization (ISO), operate internationally and publish standards adopted globally.
The following links and descriptions catalog work that has been done, and is still being done, in service of finding sustainable technical practices relevant to long-term digital preservation of motion pictures. Academy Digital Preservation Forum members are invited to add to or comment on this list.
Background on storage
Provided here is some background and historical perspective on the archival storage of image and sound, including notes on cloud storage.
Wrapper formats are used to combine more granular elements of audiovisual content, such as images and audio. A Digital Cinema Package (DCP) is essentially a wrapper format and depends in part on the Material Exchange Format (MXF). These wrapper formats are more often used for distribution or delivery but may be useful for archives when granular elements need to be kept together in a package for conformance or ease of use. Archivists may preserve the elements individually and also maintain assets in wrapper formats. MXF created a structure based on OAIS models to contain both assets and metadata, among other components.
File Formats and Codecs
Each granular element of audiovisual content is saved digitally in its own audio or image file format, determined by the chosen codec (short for coder/decoder). A codec may apply compression (either lossy or lossless) or leave the data uncompressed. For proper preservation and long-term archiving, audio and image are typically saved in industry-standard uncompressed or losslessly compressed formats in order to preserve all original information in the elements. With a lossy codec, some information is lost through compression (although a good lossy image codec may be effectively “visually lossless”); lossy codecs are typically used for distribution masters because their files are much smaller than those produced by lossless or uncompressed codecs. The file format wraps the coded data into a file, carries additional metadata about the element, and provides the file extension for identification.
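The difference between lossless and lossy coding can be illustrated with a toy sketch. This is not an audiovisual codec: it assumes 8-bit samples, uses Python's general-purpose zlib compression as a stand-in for a lossless codec, and uses a crude bit-depth reduction as a stand-in for a lossy one. The function names are hypothetical and chosen only for illustration.

```python
import zlib


def lossless_round_trip(samples: bytes) -> bytes:
    """Compress and decompress with zlib (a stand-in for a lossless codec)."""
    return zlib.decompress(zlib.compress(samples))


def lossy_round_trip(samples: bytes, kept_bits: int = 4) -> bytes:
    """Keep only the top `kept_bits` of each 8-bit sample.

    This crude quantization is an assumption made for illustration; real
    lossy codecs are far more sophisticated, but they share the property
    that the discarded information cannot be recovered.
    """
    mask = (0xFF << (8 - kept_bits)) & 0xFF
    return bytes(value & mask for value in samples)


original = bytes(range(256))

# The lossless round trip returns every sample unchanged.
assert lossless_round_trip(original) == original

# The lossy round trip returns different data; the low-order bits are gone.
assert lossy_round_trip(original) != original
```

Only the first round trip would let an archive reconstruct the original element bit for bit, which is why preservation masters tend to favor uncompressed or lossless coding.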
Our industry has several standardized guidelines on how to approach the preservation and archiving of our digital assets.
Unique identifiers for objects, titles, or other asset or metadata entities are critical for search and discovery and for system interoperability. The following are registered unique numbers. Most entertainment companies maintain their own numbering systems for their intellectual properties, which usually uniquely identify the title and version of a feature film. Unique IDs are important to preservation for identification and disambiguation, for automation, and for collecting metrics to ensure all assets are preserved.
A digest hashing algorithm, or “hash function” (often loosely called a “checksum”), takes an input (e.g. the contents of a file) and generates an output value, referred to as a digest or hash. A well-designed hash function has properties that can be valuable to digital preservation by helping to show that no data has been lost and to establish file uniqueness.
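As a minimal sketch of how a digest might be used for a fixity check, the following Python example computes a file's digest with the standard library's hashlib module and compares it against a previously recorded value. The choice of SHA-256, the chunk size, and the function names are assumptions made for illustration, not a recommendation from any particular standard.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading in chunks to limit memory use."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until handle.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_fixity(path, expected_digest, algorithm="sha256"):
    """Return True if the file's current digest matches the recorded one."""
    return file_digest(path, algorithm) == expected_digest.strip().lower()
```

In practice, an archive might record the digest when a file is ingested and call something like verify_fixity on a schedule or after each transfer; a mismatch signals that the file's contents have changed. Two files with the same digest are, for practical purposes, identical in content, which is how digests can also help detect duplicates.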